Google Chrome - GET request with Python requests returning different page than seen in browser
I have been trying to make a request to a YouTube video page in order to read some simple information off of the page. I have done this many times before, and it is usually quite easy to reverse engineer the request with the help of Google Chrome's developer tools.
To demonstrate, here is a screenshot of the request when I reload a YouTube video in a fresh incognito window (to prevent cookies from being sent), as seen in the developer menu: [Chrome screenshot]
Every time I close the window and reload the page, I receive nearly identical HTML (apart from authorization keys and the like), the bottom of which can be seen here: [another Chrome screenshot]
I first tried recreating the request using header-less requests in Python:
    import requests

    # Plain GET with the default headers that requests sends
    sesh = requests.Session()
    print(sesh.get("https://www.youtube.com/watch?v=5ea8ivrqwn8").content)
This returns a different page that still contains some of the data present on the page in Chrome, but not all of it. Next I tried including all the headers I saw in the Chrome request, using the following code:
    import requests

    sesh = requests.Session()
    # Headers copied from the request shown in Chrome's developer tools
    headers = {
        "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
        "accept-encoding": "gzip, deflate, br",
        "accept-language": "en-US,en;q=0.8",
        "upgrade-insecure-requests": "1",
        "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36",
    }
    print(sesh.get("https://www.youtube.com/watch?v=5ea8ivrqwn8", headers=headers).content)
However, this strangely returns a seemingly random, short paragraph of Unicode characters of varying length, sometimes around 10 characters long, sometimes closer to 50. I couldn't think of any other ways to make this closer to the request I was seeing in Chrome. I tried fiddling with it for a couple of hours, doing things like running the request multiple times in the same session and messing with the headers a bit, but to no avail.
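One thing that may be worth checking here, offered as an assumption rather than a confirmed diagnosis: the second request advertises `br` (Brotli) in `accept-encoding`. If the server answers with a Brotli-compressed body and the installed requests/urllib3 stack cannot decode it, `.content` would hold the still-compressed bytes, which look like random characters. The sketch below uses two hypothetical helpers (`undecodable_encodings` and `strip_encoding` are not part of requests), and the default `decodable` tuple is an assumption about what the client can handle.

```python
def undecodable_encodings(content_encoding, decodable=("gzip", "deflate", "identity")):
    """Return the codings listed in a Content-Encoding header value that the
    client is assumed NOT to be able to decode (assumption: only `decodable`)."""
    if not content_encoding:
        return []
    tokens = [token.strip().lower() for token in content_encoding.split(",")]
    return [token for token in tokens if token and token not in decodable]


def strip_encoding(accept_encoding, unwanted="br"):
    """Remove one coding (Brotli by default) from an Accept-Encoding value."""
    kept = []
    for item in accept_encoding.split(","):
        # Ignore any ";q=..." weight when comparing the coding name
        name = item.split(";")[0].strip().lower()
        if name and name != unwanted:
            kept.append(item.strip())
    return ", ".join(kept)
```

With the session from the question, `undecodable_encodings(response.headers.get("Content-Encoding"))` returning `["br"]` would point at the encoding, and retrying with `headers["accept-encoding"] = strip_encoding(headers["accept-encoding"])` would test that guess.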
Finally, out of desperation, I tried dropping everything except the user agent, using the following code:
    import requests

    sesh = requests.Session()
    # Only the user agent is kept from the Chrome request
    headers = {"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36"}
    print(sesh.get("https://www.youtube.com/watch?v=5ea8ivrqwn8", headers=headers).content)
And this got me the page I wanted.
However, I am left unsatisfied with the knowledge that somehow replicating what I was seeing in Chrome didn't work. What am I missing in my second attempt?
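One way to narrow down which header makes the difference, sketched here as a general debugging approach rather than a known fix, is to resend the request with each header left out in turn and compare the responses. `leave_one_out` is a hypothetical helper, and the `user-agent` value in the example is a placeholder, not the string from the question.

```python
def leave_one_out(headers):
    """Yield (dropped_name, remaining_headers) for each header in turn."""
    for dropped in headers:
        remaining = {name: value for name, value in headers.items() if name != dropped}
        yield dropped, remaining


# Illustrative header set; the user-agent value is a placeholder
example_headers = {
    "accept-encoding": "gzip, deflate, br",
    "accept-language": "en-US,en;q=0.8",
    "user-agent": "example-agent/1.0",
}

variants = list(leave_one_out(example_headers))
for dropped, remaining in variants:
    print("without", dropped, "->", sorted(remaining))
```

Each `remaining` dictionary can be passed as `headers=` to `sesh.get(...)`. The variant whose response matches the page from the user-agent-only request identifies the header worth investigating.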