PHP DOM: extract links from a specific table


In my code I want to extract links and their text from an old website. It works, but there is a problem: in some places I used ol>li tags and in others ul>li tags inside the table. I have 400 different pages, and to extract the links I have to change ol to ul every time. What is the easiest, most time-saving way to extract the links and text from these pages? I want to target the specific <table> that contains the links, without also extracting links from other tables that I don't want.

The structure of the target table, which contains ol>li or ul>li tags:

<table style="width:850px;" cellspacing="0" cellpadding="1" border="3">
    <tbody>
        <tr>
            <td style="text-align: center; background-color: rgb(51, 51, 204);">
                <h1>my links</h1>
            </td>
        </tr>
        <tr>
            <td>
                <ol>
                    <li><a href="http://websitelink.com/page1.php">page 1</a></li>
                    <li><a href="http://websitelink.com/page2.php">page 2</a></li>
                    <li><a href="http://websitelink.com/page3.php">page 3</a></li>
                    <li><a href="http://websitelink.com/page4.php">page 4</a></li>
                </ol>
                ...
                <ul>
                    <li><a href="http://websitelink.com/a.php">page a</a></li>
                    <li><a href="http://websitelink.com/b.php">page b</a></li>
                    <li><a href="http://websitelink.com/c.php">page c</a></li>
                    <li><a href="http://websitelink.com/d.php">page d</a></li>
                </ul>
            </td>
        </tr>
    </tbody>
</table>

My current PHP code:

$html = file_get_contents('http://mywebsitelink.com/pagename.html');
$dom = new DOMDocument;
@$dom->loadHTML($html);
// I have to switch this between 'ul' and 'ol'; instead, I would like to define the table.
$olTags = $dom->getElementsByTagName('ol');

foreach ($olTags as $list) {
    $links = $list->getElementsByTagName('a');
    foreach ($links as $link) {
        $text = $link->nodeValue;
        $href = $link->getAttribute('href');
        if (!empty($text) && !empty($href)) {
            echo "link title:     " . $text . "       location:     " . $href . "<br />";
        }
    }
}

$html = file_get_contents('http://mywebsitelink.com/pagename.html');
$dom = new DOMDocument;
@$dom->loadHTML($html);

$xpath = new DOMXPath($dom);

// Match links in either list type with a single query.
$theTags = $xpath->query('//table/tbody/tr/td/ol/li/a|//table/tbody/tr/td/ul/li/a');

// The query already returns the <a> elements themselves, so read each one directly;
// calling getElementsByTagName('a') on an <a> would find nothing.
foreach ($theTags as $oneLink) {
    $text = $oneLink->nodeValue;
    $href = $oneLink->getAttribute('href');
    if (!empty($text) && !empty($href)) {
        echo "link title:     " . $text . "       location:     " . $href . "<br />";
    }
}
[...]
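
The XPath query above still matches lists in every table on the page. One way to restrict it to a single table is to select that table by something unique to it, such as its heading text. The following is only a sketch under a few assumptions: the target table is the one whose <h1> reads "my links" (as in the sample markup), tables are not nested, and the heading contains no double quotes. The helper name extractTableLinks, the second "other links" table, and the skip.php URL are made up for illustration.

```php
<?php
// Hypothetical helper (not from the original post): returns the links found
// inside the first table whose <h1> text equals $heading.
function extractTableLinks(string $html, string $heading): array
{
    $dom = new DOMDocument();
    // Collect parser warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // Select the table by its heading, then every <a href> nested in an <li>,
    // regardless of whether the list is an <ol> or a <ul>.
    // Assumption: $heading contains no double quotes.
    $query = sprintf(
        '(//table[.//h1[normalize-space() = "%s"]])[1]//li//a[@href]',
        $heading
    );

    $links = [];
    foreach ($xpath->query($query) as $a) {
        $text = trim($a->nodeValue);
        $href = $a->getAttribute('href');
        if ($text !== '' && $href !== '') {
            $links[] = ['text' => $text, 'href' => $href];
        }
    }
    return $links;
}

// Sample input: an unwanted table followed by a shortened version of the target table.
$html = <<<HTML
<table><tr><td><h1>other links</h1></td></tr>
<tr><td><ul><li><a href="http://websitelink.com/skip.php">skip me</a></li></ul></td></tr></table>
<table><tbody><tr><td><h1>my links</h1></td></tr>
<tr><td>
<ol><li><a href="http://websitelink.com/page1.php">page 1</a></li></ol>
<ul><li><a href="http://websitelink.com/a.php">page a</a></li></ul>
</td></tr></tbody></table>
HTML;

foreach (extractTableLinks($html, 'my links') as $link) {
    echo "link title: " . $link['text'] . " location: " . $link['href'] . "\n";
}
```

Because the query uses descendant steps (//li//a) rather than a fixed table/tbody/tr/td path, it should also work on pages where the markup omits <tbody>. If the tables have no usable heading, a positional selector such as (//table)[2] or an attribute test like //table[@border="3"] could be substituted, depending on what actually distinguishes the target table across the 400 pages.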
