pagination - XPath for a crawler in Python

April 15, 2010

I'm working on a crawler using Scrapy in Python. It is almost done, but I have one small problem. The website uses pagination like this:

<div class="pagination toolbarbloc">
    <ul>
        <li class="active"><span>1</span></li>
        <li><a href="...">2</a></li>
        <li><a href="...">3</a></li>
        <li><a href="...">4</a></li>
        <li><a href="...">5</a></li>
        <li><a class="end" href="...">>></a></li>
    </ul>
</div>

So I am trying to get the "href" of the li tag that comes right after the li with class "active".

I tried this:

next_page_url_xpath = '//div[@class="pagination toolbarbloc"]/ul/following-sibling::li[@class="active"]/a/@href'

But it didn't work: IndexError: list index out of range

I'm a beginner with XPath and I know this is simple, but even after reading a lot of documentation I can't get it to work.

Thanks a lot for your help!

Try the expression below:

//div[@class="pagination toolbarbloc"]/ul/li[@class="active"]/following-sibling::li/a/@href

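Note that you missed the @ in [class="pagination toolbarbloc"], and that li is a child of ul, not a sibling of it. Because ul has no following li siblings, your expression matches nothing, and indexing the empty result list is what raises the IndexError. The following-sibling axis has to start from the active li instead.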
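To check the two expressions outside a spider, here is a minimal sketch using lxml (the library Scrapy's selectors are built on). The sample markup, its href values, and the helper name next_page_href are assumptions made for illustration; the real site's URLs are not known. The expression itself should carry over to Scrapy's selectors unchanged.

```python
from lxml import html

# Hypothetical sample page: the href values are placeholders, not the site's real URLs.
SAMPLE = """
<div class="pagination toolbarbloc">
  <ul>
    <li class="active"><span>1</span></li>
    <li><a href="/list?page=2">2</a></li>
    <li><a href="/list?page=3">3</a></li>
    <li><a class="end" href="/list?page=5">&gt;&gt;</a></li>
  </ul>
</div>
"""

# Expression from the question: looks for li elements that are siblings of ul.
BROKEN = ('//div[@class="pagination toolbarbloc"]/ul'
          '/following-sibling::li[@class="active"]/a/@href')

# Expression from the answer: starts at the active li, then walks its later siblings.
FIXED = ('//div[@class="pagination toolbarbloc"]/ul'
         '/li[@class="active"]/following-sibling::li/a/@href')


def next_page_href(page_source):
    """Return the href of the first link after the active li, or None if there is none."""
    tree = html.document_fromstring(page_source)
    hrefs = tree.xpath(FIXED)  # every link after the active li, in document order
    # Guard against the last page, where the list is empty and [0] would raise IndexError.
    return hrefs[0] if hrefs else None


if __name__ == "__main__":
    print(html.document_fromstring(SAMPLE).xpath(BROKEN))
    print(next_page_href(SAMPLE))
```

The answer's expression returns the hrefs of all li elements after the active one, so the next page is the first item of the result. Checking for an empty list before indexing avoids the same IndexError on the last page, where no li follows the active one.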
