urllib2 - Unable to download file, Python xlsx file download is zero bytes

March 15, 2012

After running the code, the downloaded file is 0 bytes. I tried writing the response too, and also tried using a buffer.

What am I doing wrong, and what else can I try? Please help.

import urllib2
from bs4 import BeautifulSoup
import os
import pandas as pd

storepath = '/home/vinaysawant/bankifsccodes/'

def downloadfiles():
    # removed the trailing / it had, since that gives a 404 page
    url = 'https://rbi.org.in/scripts/bs_viewcontent.aspx?id=2009'

    conn = urllib2.urlopen(url)
    html = conn.read().decode('utf-8')
    soup = BeautifulSoup(html, "html.parser")

    # select all <a> elements whose href attribute contains a URL starting with http://
    for link in soup.select('a[href^="http://"]'):
        href = link.get('href')

        # make sure it has one of the correct extensions
        if not any(href.endswith(x) for x in ['.csv', '.xls', '.xlsx']):
            continue
        filename = href.rsplit('/', 1)[-1]
        print href
        print("downloading %s %s..." % (href, filename))
        #urlretrieve(href, filename)
        u = urllib2.urlopen(href)
        f = open(storepath + filename, 'wb')
        meta = u.info()
        file_size = int(meta.getheaders("content-length")[0])
        print "downloading: %s bytes: %s" % (filename, file_size)
        print("done.")
        file_size_dl = 0
        block_sz = 8192
        while True:
            buffer = u.read(block_sz)
            if not buffer:
                break

            file_size_dl += len(buffer)
            f.write(buffer)
            status = r"%10d  [%3.2f%%]" % (file_size_dl, file_size_dl * 100. / file_size)
            status = status + chr(8) * (len(status) + 1)
            print status,
        f.close()

        exit(1)

downloadfiles()

I tried:

import urllib
urllib.urlretrieve(url)

I tried using urllib2 and urllib3 as well.

I am not an expert in pandas or urllib2, but since there is no answer to this question yet: I think the problem is that you are trying to download the first URL.

url='https://rbi.org.in/scripts/bs_viewcontent.aspx?id=2009'

You define it here, and it doesn't change.

u = urllib2.urlopen(url)

After that, you try to download whatever is associated with url:

buffer = u.read(block_sz)

Instead of that, I guess you should try to download href. Try changing this:

u = urllib2.urlopen(url)

to this:

 u = urllib2.urlopen(href)
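
Putting the suggestion together, the per-link download step might look roughly like the sketch below. It is written for Python 3, where `urllib2.urlopen` lives in `urllib.request`; that is an assumption, since the question uses Python 2. The helper names `save_stream` and `download` are hypothetical and not from the original code. The copy loop accepts any file-like object, so it can be checked without the network, and it returns the number of bytes written so a zero-byte result is easy to spot.

```python
import io
import os
import tempfile
from urllib.request import urlopen  # Python 3 home of urllib2.urlopen


def save_stream(src, dest_path, block_size=8192):
    """Copy a file-like object to dest_path in blocks; return bytes written."""
    written = 0
    # The with-block guarantees the file is flushed and closed.
    with open(dest_path, 'wb') as f:
        while True:
            block = src.read(block_size)
            if not block:
                break
            f.write(block)
            written += len(block)
    return written


def download(href, dest_path):
    """Hypothetical helper: fetch href (the file link, not the listing page)."""
    response = urlopen(href)
    try:
        return save_stream(response, dest_path)
    finally:
        response.close()


# Offline check: an in-memory stream stands in for the HTTP response.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.xlsx')
demo_bytes = save_stream(io.BytesIO(b'x' * 20000), demo_path)
print(demo_bytes, os.path.getsize(demo_path))
```

If `download(href, path)` returns 0 for a real link, the server sent an empty body for that request, so the next thing to inspect would be the response itself (status, headers, redirects) rather than the write loop.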
