html - Not able to select links from a module on a website using BeautifulSoup


I have built a scraper to extract all the links from a company's website (I have permission). But when I try to add the URL where the jobs are posted, I'm not able to retrieve any of the job links. It seems the jobs are stored in some kind of module that I can't access with my scraper.

HTML: "parbase section" is the HTML name of the module I can't seem to access.

Question

Why is my scraper not able to pull the URLs of the job posts from the link I have provided below?

The link to the job postings is here: https://www.pwc.dk/da/karriere/ledige-stillinger.html

Code for the scraper:

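    import requests
    from bs4 import BeautifulSoup

    url = "http://www.pwc.dk/da/karriere/ledige-stillinger.html"
    r = requests.get(url)

    # Name the parser explicitly so bs4 does not have to guess one.
    soup = BeautifulSoup(r.content, "html.parser")

    # Print every anchor found in the HTML the server returned.
    links = soup.find_all("a")
    for link in links:
        print("<a href='%s'>%s</a>" % (link.get("href"), link.text))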

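As the webpage is a JavaScript-heavy one, you need to use Selenium to gatecrash it: requests only downloads the initial HTML and never runs the scripts that insert the job links. Install Selenium and give this a try: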

    from selenium import webdriver
    from bs4 import BeautifulSoup

    # Let a real browser load the page and run its JavaScript.
    driver = webdriver.Chrome()
    driver.get("https://www.pwc.dk/da/karriere/ledige-stillinger.html")
    soup = BeautifulSoup(driver.page_source, "lxml")
    driver.quit()

    # Each job title is a link inside an element with class "vbtitle".
    for item in soup.select(".vbtitle a"):
        print(item.get("href"))
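
One caveat with reading driver.page_source right after driver.get() is that the job list may not have been inserted yet. The following is a minimal sketch of the same idea with an explicit wait, not a tested solution for this particular site. It assumes the ".vbtitle a" selector from the snippet above still matches the job links. The function names absolute_links and fetch_job_links and the 15-second timeout are hypothetical choices made for illustration.

```python
from urllib.parse import urljoin

PAGE_URL = "https://www.pwc.dk/da/karriere/ledige-stillinger.html"


def absolute_links(hrefs, base_url=PAGE_URL):
    """Resolve hrefs against base_url, skipping empty values and duplicates."""
    seen = []
    for href in hrefs:
        if not href:
            continue
        full = urljoin(base_url, href)
        if full not in seen:
            seen.append(full)
    return seen


def fetch_job_links(url=PAGE_URL, selector=".vbtitle a", timeout=15):
    """Open the page in Chrome, wait for the job anchors, return their URLs.

    The selector and timeout are assumptions, not values confirmed for the site.
    """
    # Imported here so absolute_links() works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Block until at least one matching anchor is in the DOM;
        # raises TimeoutException if none appears within `timeout` seconds.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, selector))
        )
        anchors = driver.find_elements(By.CSS_SELECTOR, selector)
        hrefs = [a.get_attribute("href") for a in anchors]
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()
    return absolute_links(hrefs, url)
```

To try it, call fetch_job_links() and print each returned URL. If the wait times out, the selector probably no longer matches the page and needs to be checked in the browser's developer tools.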
