Why is the HTML from a website different from the HTML that Python's requests library gives?
I'm trying to familiarize myself with Requests and BeautifulSoup, so I gave myself a mini-project. I'm trying to make a program that displays the shoes on Footlocker's release calendar (https://www.footlocker.com/release-dates/) like so:
- shoename#1 date#1
- shoename#2 date#2
- shoename#3 date#3
- shoename#4 date#4
So far I have this:
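import requests as req
from bs4 import BeautifulSoup

def main():
    url = "https://www.footlocker.com/release-dates/"
    # Fetch the raw HTML exactly as the server sends it (no JavaScript is run)
    resp = req.get(url)
    soup = BeautifulSoup(resp.content, "html.parser")
    print(soup)

if __name__ == "__main__":
    main()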
However, when I load the HTML and parse it with BeautifulSoup, the markup containing the dates and names of the shoes does not appear, even though it does when I use Inspect Element directly on the website. I assume this is because the HTML for the shoes' information is generated by JavaScript. If so, how can I load it with Requests?
Thank you.
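The assumption in the question is the usual explanation: Requests returns the HTML as the server sent it, while Inspect Element shows the page after the browser has run its scripts. The sketch below illustrates that difference with two hypothetical HTML strings (RAW_HTML and RENDERED_HTML are made up for illustration and are not Footlocker's real markup). It uses only the standard library, and the helper name count_class is also an assumption.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Counts start tags whose class attribute contains a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def count_class(html, class_name):
    """Return how many elements in html carry class_name."""
    parser = ClassCounter(class_name)
    parser.feed(html)
    parser.close()
    return parser.count


# Hypothetical: what a server might send before any script has run
RAW_HTML = """
<html><body>
<div id="calendar"></div>
<script>/* the browser runs this and fills the calendar with entries */</script>
</body></html>
"""

# Hypothetical: the same page after the browser has executed the script
RENDERED_HTML = """
<html><body>
<div id="calendar">
<div class="day"><span class="productname">shoename#1</span><span class="date">date#1</span></div>
<div class="day"><span class="productname">shoename#2</span><span class="date">date#2</span></div>
</div>
</body></html>
"""

raw_count = count_class(RAW_HTML, "day")
rendered_count = count_class(RENDERED_HTML, "day")
print(raw_count, rendered_count)
```

If the element you see in Inspect Element is missing from resp.content in the same way, the data is most likely added by JavaScript, and Requests alone will not produce it.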
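If you already have Selenium installed on your machine, that's fine; otherwise, install it first. Selenium drives a real browser, so the page's JavaScript runs before you read the HTML. Here is how you could go about it: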
from selenium import webdriver
from bs4 import BeautifulSoup

# Let a real browser load the page so its JavaScript runs
driver = webdriver.Chrome()
driver.get("https://www.footlocker.com/release-dates/")
soup = BeautifulSoup(driver.page_source, "lxml")
driver.quit()

# Each calendar entry holds a product name and a release date
for item in soup.select(".day"):
    shoe = item.select_one(".productname").get_text()
    date = item.select_one(".date").get_text()
    print(shoe, date)
Partial results:
jordan retro 1 hi og 1aug
kids' jordan retro 1 hi og 1aug
jordan retro 1 hi og 1aug
kids' jordan retro 1 hi og 1aug
nike kobe a.d. nxt 1aug
nike dualtone racer 1aug
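One thing to watch for in the loop above: select_one returns None when an entry lacks a matching element, and calling get_text() on None raises an AttributeError. A minimal sketch of a more defensive version follows. The function name extract_releases and the SAMPLE markup are hypothetical; only the selectors .day, .productname and .date come from the answer. It uses the built-in "html.parser" so lxml is not required.

```python
from bs4 import BeautifulSoup


def extract_releases(page_source):
    """Return (shoe, date) pairs from rendered HTML, skipping incomplete entries."""
    soup = BeautifulSoup(page_source, "html.parser")
    releases = []
    for item in soup.select(".day"):
        name = item.select_one(".productname")
        date = item.select_one(".date")
        # Skip calendar cells that lack either piece of information
        if name is None or date is None:
            continue
        releases.append((name.get_text(strip=True), date.get_text(strip=True)))
    return releases


# Hypothetical markup standing in for driver.page_source
SAMPLE = """
<div class="day"><span class="date">1aug</span><span class="productname">jordan retro 1 hi og</span></div>
<div class="day"><span class="date">2aug</span></div>
"""

releases = extract_releases(SAMPLE)
for shoe, date in releases:
    print(shoe, date)
```

In the Selenium script you would pass driver.page_source to extract_releases before calling driver.quit(). Keeping the parsing in its own function also lets you test it on saved HTML without opening a browser.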