Copyright © 2002 - MCS Investments, Inc.


List All Elements in HTML Document by Tag Name Using XPATH and PHP

This script lists all elements in an HTML document by tag name using XPath and PHP. Our sample file is html-test-table.html. Here is everything in it:

<html>
<body>

<table border="1">
  <caption>Monthly savings</caption>
  <tr>
    <th>Month</th>
    <th>Savings</th>
  </tr>
  <tr>
    <td>January</td>
    <td>$100</td>
  </tr>
  <tr>
    <td>February</td>
    <td>$50</td>
  </tr>
</table>

</body>
</html>


The script uses the PHP DOM extension, which requires PHP 5 or later. The DOM extension is enabled by default in most PHP installations, so the following should work fine; it does for us. The DOM extension lets you operate on XML and HTML documents through the DOM API. It supports XPath 1.0, which this script uses extensively. XPath has been around a while. What is it? XPath is a syntax for addressing parts of an XML document (or an HTML or XHTML one). It uses path expressions to navigate through documents, and it includes a library of standard functions.
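To make "path expressions" and "standard functions" concrete, here is a minimal sketch. It assumes an inline copy of the sample table (so no file is needed), and the variable names are ours, chosen for illustration.

```php
<?php
// Hypothetical inline copy of the sample table, so the sketch needs no file.
$html = '<html><body><table border="1">'
      . '<caption>Monthly savings</caption>'
      . '<tr><th>Month</th><th>Savings</th></tr>'
      . '<tr><td>January</td><td>$100</td></tr>'
      . '<tr><td>February</td><td>$50</td></tr>'
      . '</table></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// A path expression: every td element anywhere in the document.
$cells = $xpath->query('//td');

// A standard function: count() yields a number, returned by evaluate().
$cellCount = $xpath->evaluate('count(//td)');

// A path with a position predicate: the second td of each row.
$amounts = array();
foreach ($xpath->query('//tr/td[2]') as $node) {
    $amounts[] = $node->nodeValue;
}

echo $cells->length . "<BR>";
echo $cellCount . "<BR>";
echo implode(", ", $amounts) . "<BR>";
```

The string is single-quoted so that PHP does not try to treat $100 and $50 as variables.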

The DOMXPath class has a document property (the DOMDocument it was built from) and several very useful methods: DOMXPath::__construct; DOMXPath::evaluate, which evaluates the given XPath expression and returns a typed result if possible, or a DOMNodeList containing all nodes matching the expression; DOMXPath::query, which evaluates the given XPath expression and returns a DOMNodeList containing all matching nodes; DOMXPath::registerNamespace, which is necessary when you use XPath on documents that declare a default namespace in an xmlns attribute (in a sitemap, that declaration is in the urlset tag); and DOMXPath::registerPhpFunctions. Many of the XML files we have seen have no xmlns declaration (e.g., PAD files), and those need no namespace registration.
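The difference between query and evaluate is easiest to see side by side. This is a rough sketch on a cut-down, inline version of the sample table; the variable names are assumptions made for the example.

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML('<html><body><table><caption>Monthly savings</caption>'
    . '<tr><th>Month</th><th>Savings</th></tr></table></body></html>');
$xpath = new DOMXPath($dom);

// query() hands back a DOMNodeList (or false if the expression is malformed).
$list = $xpath->query('//th');
$isList = $list instanceof DOMNodeList;

// evaluate() returns a typed value when the expression yields one...
$caption = $xpath->evaluate('string(//caption)'); // PHP string
$hasTh = $xpath->evaluate('boolean(//th)');       // PHP bool

// ...and a DOMNodeList when the expression yields a node set.
$alsoList = $xpath->evaluate('//th');

echo $caption . "<BR>";
echo $alsoList->length . "<BR>";
```

So evaluate is the one to reach for when you want a count, a string, or a true/false answer instead of a list of nodes.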

A new DOMDocument object is created. We load the HTML file with the loadHTMLFile method: $dom->loadHTMLFile($html) parses the file's contents into the $dom object. The first task is to list every element in the file with getElementsByTagName("*"), where * means "all". We do this with the DOMDocument method alone first, with no XPath, and bring in XPath afterward. Just to be cute, we take the node value of the first element, which holds the text of the whole document, as one string, $all. Then we use the PHP explode() function to split that string on spaces into the $pieces array. Next, in a foreach loop, we check for empty array values and unset them. There are quite a few, because the file's indentation contains runs of spaces and explode() returns an empty string for each pair of adjacent spaces. Finally, we echo the pieces array to the screen, getting a column containing the text of all nodes in the test HTML file.
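As an aside, the explode-then-unset step can be done in one call. The sketch below is an alternative we are suggesting, not what the script on this page does; the $all string here is a hypothetical stand-in for the real node value.

```php
<?php
// Hypothetical stand-in for $all: text with indentation and line breaks.
$all = "\n  Monthly savings\n  \n    Month\n    Savings\n";

// explode() on one space leaves an empty string between adjacent spaces.
$pieces = explode(" ", $all);

// preg_split() on runs of whitespace, discarding empty pieces in one call.
$words = preg_split('/\s+/', $all, -1, PREG_SPLIT_NO_EMPTY);

foreach ($words as $word) {
    echo $word . "<BR>";
}
```

Splitting on \s+ also removes the line breaks that explode(" ") leaves attached to the words.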

Now we use XPath. We create a new object with $xpath = new DOMXPath($dom), then perform XPath queries on the HTML file. First, we get one node at a time: the caption, then each of the two th nodes. When we get to the td tags, we define the $td array (it is not strictly needed, but it's a convenient place to store HTML document info if you need it), then loop through the node values, echoing them to the screen. The result of the XPath query method is a DOMNodeList, so we use its length property in a for loop and read each node's text with ->item($i)->nodeValue, storing it in the $td array. We need nodeValue because echo only outputs strings, and a raw DOM node object cannot be echoed until you get its value as a string.
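If you find yourself collecting node values into arrays often, the loop can be wrapped up. The nodeValues() function below is a hypothetical helper of our own, not part of PHP or of this page's script, and it assumes the expression passed in is valid.

```php
<?php
// Hypothetical helper: run an XPath query and return the matches as strings.
function nodeValues(DOMXPath $xpath, $expression)
{
    $values = array();
    // A DOMNodeList can be walked with foreach as well as with item($i).
    foreach ($xpath->query($expression) as $node) {
        $values[] = trim($node->nodeValue);
    }
    return $values;
}

$dom = new DOMDocument();
$dom->loadHTML('<html><body><table>'
    . '<tr><td>January</td><td>$100</td></tr>'
    . '<tr><td>February</td><td>$50</td></tr>'
    . '</table></body></html>');
$xpath = new DOMXPath($dom);

$td = nodeValues($xpath, '//td');
echo implode("<BR>", $td) . "<BR>";
```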

Last, we use the non-XPath getElementsByTagName() method with the * parameter again, but this time we merely loop through the DOMNodeList and display the value of every node we find. Interestingly enough, the first three nodes each contain all the node values. That is because an element's nodeValue is the combined text of everything inside it: the first three elements are the html, body, and table tags, and each of them encloses the entire table.
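A small sketch shows why the first three lines of output repeat everything. It uses a compact inline table with no indentation (an assumption for the example), so the three values come out exactly equal.

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML('<html><body><table><caption>Monthly savings</caption>'
    . '<tr><td>January</td><td>$100</td></tr></table></body></html>');

$names = array();
$values = array();
foreach ($dom->getElementsByTagName('*') as $node) {
    $names[] = $node->nodeName;   // tag name of the element
    $values[] = $node->nodeValue; // text of the element and all descendants
}

// The outer three elements all wrap the same text.
for ($i = 0; $i < 3; $i++) {
    echo $names[$i] . ": " . $values[$i] . "<BR>";
}
```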

In the non-XPath version, the getElementsByTagName() method lists the caption, th, and td nodes along with everything else. It needs no XPath syntax at all, and we felt it would be good to show both the DOM-only and XPath methods. We use the XPath query method to illustrate how it's done with XPath expressions, even though getElementsByTagName() without XPath would do as well here. Keep in mind that XPath can express selections that getElementsByTagName() alone cannot, such as choosing nodes by position, attribute, or text content. A non-XPath version of XML file node selection is at List Specified Elements in XML Document by Tag Name Using XPATH and PHP. In the script below, the getElementsByTagName() code without XPath comes first, then the XPath version using XPath query, followed by a non-XPath node lister loop, just for kicks.
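Here is a short, hedged illustration of a selection that a tag name alone cannot make: finding the savings figure for one particular month. The inline table and variable names are assumptions for the example.

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML('<html><body><table>'
    . '<tr><td>January</td><td>$100</td></tr>'
    . '<tr><td>February</td><td>$50</td></tr>'
    . '</table></body></html>');
$xpath = new DOMXPath($dom);

// Second cell of the row whose first cell reads "February".
$february = $xpath->evaluate('string(//tr[td[1]="February"]/td[2])');

// Every cell whose text starts with a dollar sign.
$money = array();
foreach ($xpath->query('//td[starts-with(., "$")]') as $node) {
    $money[] = $node->nodeValue;
}

echo $february . "<BR>";
echo implode(", ", $money) . "<BR>";
```

With getElementsByTagName("td") you would get all four cells and have to sort them out yourself in PHP.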

As you will see in List Specified Elements in XML Document by Tag Name Using XPATH and PHP, you can get tag nodes one at a time using getElementsByTagName(), but this is useful only if there are a lot of unique tags with few or no children. In the script on this page, there are three different tags of interest in the HTML file (caption, th, and td), each with only a few nodes, so we can get them either one at a time or in a loop. We chose one at a time at first in the XPath version, then switched to looping through the DOMNodeList. In the List Urls in XML Sitemap by Tag Name Using XPATH and PHP script, we loop through the results of the getElementsByTagName() method, since this method returns a new instance of class DOMNodeList containing the elements with a given tag name. These are easy to loop through, as this page's script also demonstrates.

For DOM-only versions using the getElementsByTagName() method, there's no need for $xpath = new DOMXPath($doc), because getElementsByTagName() is a DOMDocument method and does not use an XPath object. But for $xpath->query() calls, the DOMXPath object is essential. Note that with getElementsByTagName() we did not need to deal with namespace registration either, because no XPath is involved. Namespace registration comes up only in the XPath versions of scripts that read documents with a default namespace, such as sitemaps.
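The namespace point can be sketched with a tiny sitemap-style document. The example.com URLs and the sm prefix are placeholders we picked; any prefix works as long as it is registered against the namespace URI in the urlset tag.

```php
<?php
// Hypothetical two-entry sitemap with the standard default namespace.
$xml = '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
     . '<url><loc>http://example.com/</loc></url>'
     . '<url><loc>http://example.com/about.html</loc></url>'
     . '</urlset>';

$doc = new DOMDocument();
$doc->loadXML($xml);

// DOM only: the tag name is enough, no registration needed.
$byTag = $doc->getElementsByTagName('loc')->length;

// XPath: an unprefixed name does not match elements in a default namespace.
$xpath = new DOMXPath($doc);
$unprefixed = $xpath->query('//loc')->length;

// After registering a prefix for the namespace, the query finds them.
$xpath->registerNamespace('sm', 'http://www.sitemaps.org/schemas/sitemap/0.9');
$prefixed = $xpath->query('//sm:loc')->length;

echo $byTag . " " . $unprefixed . " " . $prefixed . "<BR>";
```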

If an XPath query or a getElementsByTagName() call returns a node set, you get a DOMNodeList that can be looped through to get values. In the non-XPath version in List Specified Elements in XML Document by Tag Name Using XPATH and PHP, we simply skip the loop and get the node values of four different tags found in the file. This is fine if no two tags share a tag name, or if there are few child tags under any one parent tag. But it is essential to loop through nodes when there are many elements with the same tag name, as in List Urls in XML Sitemap by Tag Name Using XPATH and PHP.

The script in List Elements in HTML Document by Tag Name Using XPATH and PHP does the same kind of task as this page's script, but goes about it a bit differently.

In XPath, there are seven kinds of nodes: element, attribute, text, namespace, processing-instruction, comment, and document nodes. You can get more information on the syntax to use in XPath expressions on the W3Schools XPath expression page.
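Several of these node kinds can be pulled out of an HTML file directly. The sketch below assumes a small inline document with a comment added for the purpose (the sample file on this page has none) and checks each result against PHP's node type constants.

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML('<html><body><!-- savings table --><table border="1">'
    . '<caption>Monthly savings</caption></table></body></html>');
$xpath = new DOMXPath($dom);

// Element node: the caption tag itself.
$element = $xpath->query('//caption')->item(0);

// Attribute node: the border attribute of the table.
$attr = $xpath->query('//table/@border')->item(0);

// Text node: the text inside the caption.
$text = $xpath->query('//caption/text()')->item(0);

// Comment node: any comment in the document.
$comment = $xpath->query('//comment()')->item(0);

echo $element->nodeName . "<BR>";
echo $attr->nodeValue . "<BR>";
echo $text->nodeValue . "<BR>";
echo trim($comment->nodeValue) . "<BR>";
```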

<?php

$dom = new DOMDocument(); // Initialize the DOM [Document Object Model]
$html = "html-test-table.html";
$dom->loadHTMLFile($html); // Load and parse the HTML file

// DOM only: "*" matches every element. item(0) is the html element,
// whose nodeValue is the text of the whole document as one string.
$nodes = $dom->getElementsByTagName("*");
$all = $nodes->item(0)->nodeValue;
$pieces = explode(" ", $all);
// Unset the empty strings explode() returns between adjacent spaces.
foreach ($pieces as $k => $v) {
    if ($v === "") { unset($pieces[$k]); }
}
// foreach skips the unset keys, so no count or index check is needed.
foreach ($pieces as $piece) {
    echo $piece."<BR>";
}
echo "<BR>";

// XPath: get the caption and th nodes one at a time, then loop over td.
$xpath = new DOMXPath($dom);
$appNodes = $xpath->query('//caption');
echo $appNodes->item(0)->nodeValue."<BR>";
$appNodes = $xpath->query('//th');
echo $appNodes->item(0)->nodeValue."<BR>";
echo $appNodes->item(1)->nodeValue."<BR>";
$td = array();
$appNodes = $xpath->query('//td');
for ($i = 0; $i < $appNodes->length; $i++) {
    $td[$i] = $appNodes->item($i)->nodeValue;
    echo $td[$i]."<BR>";
}
echo "<BR>";

// DOM only again: echo the nodeValue of every element in document order.
$nodes = $dom->getElementsByTagName("*");
for ($i = 0; $i < $nodes->length; $i++) {
    echo $nodes->item($i)->nodeValue."<BR>";
}
echo "<BR>";

?>