proxy - Scrapy and Tor/Privoxy unable to crawl [Connection refused 61]
I have the following code, taken from here, in middlewares.py. I'm trying to change the Tor IP on every request:
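    import random

    from stem import Signal
    from stem.control import Controller

    # `settings` is the project's Scrapy settings object; its import is not
    # shown in the original post.


    def _set_new_ip():
        # Ask Tor for a new circuit through its control port.
        with Controller.from_port(port=9051) as controller:
            controller.authenticate(password='tor_password')
            controller.signal(Signal.NEWNYM)


    class RandomUserAgentMiddleware(object):
        def process_request(self, request, spider):
            ua = random.choice(settings.get('USER_AGENT_LIST'))
            if ua:
                request.headers.setdefault('User-Agent', ua)


    class ProxyMiddleware(object):
        def process_request(self, request, spider):
            _set_new_ip()
            # Route the request through Privoxy, which forwards to Tor.
            request.meta['proxy'] = 'http://127.0.0.1:8118'
            spider.log('Proxy : %s' % request.meta['proxy'])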
But when I try to start crawling, Scrapy keeps returning the following:
    2017-09-10 22:36:44 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
    2017-09-10 22:36:44 [stem] DEBUG: GETCONF __owningcontrollerprocess (runtime: 0.0004)
    2017-09-10 22:36:44 [stem] INFO: Error while receiving a control message (SocketClosed): empty socket content
    2017-09-10 22:36:44 [it] DEBUG: Proxy : http://127.0.0.1:8118
    2017-09-10 22:36:44 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.jobstreet.com.sg/en/job-search/job-vacancy.php?ojs=10&key=information%20technology> (failed 1 times): Connection was refused by other side: 61: Connection refused.
    2017-09-10 22:36:44 [stem] DEBUG: GETCONF __owningcontrollerprocess (runtime: 0.0003)
    2017-09-10 22:36:44 [stem] INFO: Error while receiving a control message (SocketClosed): empty socket content
    2017-09-10 22:36:44 [it] DEBUG: Proxy : http://127.0.0.1:8118
    2017-09-10 22:36:52 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.jobstreet.com.sg/en/job-search/job-vacancy.php?ojs=10&key=information%20technology> (failed 2 times): Connection was refused by other side: 61: Connection refused.
    2017-09-10 22:36:52 [stem] DEBUG: GETCONF __owningcontrollerprocess (runtime: 0.0004)
    2017-09-10 22:36:52 [stem] INFO: Error while receiving a control message (SocketClosed): empty socket content
    2017-09-10 22:36:52 [it] DEBUG: Proxy : http://127.0.0.1:8118
    2017-09-10 22:36:56 [scrapy.downloadermiddlewares.retry] DEBUG: Gave up retrying <GET https://www.jobstreet.com.sg/en/job-search/job-vacancy.php?ojs=10&key=information%20technology> (failed 3 times): Connection was refused by other side: 61: Connection refused.
    2017-09-10 22:36:56 [scrapy.core.scraper] ERROR: Error downloading <GET https://www.jobstreet.com.sg/en/job-search/job-vacancy.php?ojs=10&key=information%20technology>: Connection was refused by other side: 61: Connection refused.
    2017-09-10 22:36:56 [scrapy.core.engine] INFO: Closing spider (finished)
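Error 61 is ECONNREFUSED on macOS/BSD, which usually means nothing is listening on the port being contacted. One way to narrow this down outside of Scrapy is to check whether the two local ports used in the code accept TCP connections at all. The sketch below is only a diagnostic aid, not a fix: the helper name `port_is_open` is hypothetical, and the labels "Privoxy" and "Tor control port" are assumptions based on the ports in the question (8118 from the proxy URL, 9051 from `from_port`).

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError (errno 61 on macOS) and timeouts.
        return False


def main():
    # Ports taken from the question; the service labels are assumptions.
    checks = (("Privoxy HTTP proxy", 8118), ("Tor control port", 9051))
    for name, port in checks:
        if port_is_open("127.0.0.1", port):
            state = "accepting connections"
        else:
            state = "refusing connections"
        print("%s on 127.0.0.1:%d is %s" % (name, port, state))


if __name__ == "__main__":
    main()
```

If 8118 is reported as refusing connections, the proxy the middleware points at is probably not running or is bound to a different address or port. An open port only shows that something is listening there, not that it is the expected service.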