XPATH Tutorial — What is XPATH?
- List Elements in HTML Document by Tag Name Using XPATH and PHP
- List All Elements in HTML Document by Tag Name Using XPATH and PHP
- List Specified Elements in XML Document by Tag Name Using XPATH and PHP
- List Urls in XML Sitemap by Tag Name Using XPATH and PHP
- List Elements in XML Document by Tag Name Using XPATH Query and PHP
- List Child Nodes of Element in XML Document Using XPATH Query and PHP
- List Urls in XML Sitemap Using XPATH Query and registerNamespace and PHP
- List a Website's Audio Links Alphabetically Using XPATH and PHP
- List a Website's Video Links Alphabetically Using XPATH and PHP
- Grab Web Page Links and Video Links and Audio Links from Web Page
- List a Website's Images Alphabetically Using XPATH and PHP
- List a Website's External Links Alphabetically Using XPATH and PHP
- List a Website's Page Urls Alphabetically Using XPATH and PHP
- List a Website's Page Descriptions Alphabetically Using XPATH and PHP
- List a Website's Page Titles Alphabetically Using XPATH and PHP
- Get Links from Web Page Using XPATH and PHP
- Count and Alphabetize Words on a Web Page
- Search Website without Indexing Using XPATH and PHP
- Free Website Indexing Script Using XPATH and PHP
- Free Website Search Script Using PHP
- Free Website Search Script and Tutorial
Most of our XPATH scripts use PHP 5 and the PHP DOM extension. The DOM extension is enabled by default in most PHP installations, so the scripts should run without extra setup; they do for us. The extension lets you operate on XML documents through the DOM API, and it supports XPATH 1.0, which our scripts use extensively. XPATH has been around a while. What is it? XPath is a syntax for addressing parts of an XML document (or an HTML or XHTML one). It uses path expressions to navigate documents, and it includes a library of standard functions.
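To make "path expressions" and "standard functions" concrete, here is a minimal sketch. The library/book document is made-up sample data, not one of our script files; only DOMDocument and DOMXPath come from the DOM extension.

```php
<?php
// Hypothetical sample data: a tiny XML document with no namespace.
$xml = <<<XML
<library>
  <book year="2005"><title>PHP Basics</title></book>
  <book year="2010"><title>XPath in Practice</title></book>
</library>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// Path expression: every <title> that is a child of a <book>.
$titles = [];
foreach ($xpath->query('/library/book/title') as $node) {
    $titles[] = $node->textContent;
}

// Standard function: count() yields a number, which PHP receives as a float.
$bookCount = (int) $xpath->evaluate('count(/library/book)');

// Predicate: keep only books whose year attribute is greater than 2008.
$recent = $xpath->query('/library/book[@year > 2008]/title')->item(0)->textContent;

print_r($titles);
echo $bookCount, "\n", $recent, "\n";
```

The same three ideas (a path, a function, a predicate) are what the scripts linked above build on.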
The DOMXPath class has a document property (the DOMDocument it operates on) and several very useful methods: DOMXPath::__construct; DOMXPath::evaluate, which evaluates the given XPath expression and returns a typed result (such as a number or a string) if possible, or otherwise a DOMNodeList containing all nodes that match the expression; DOMXPath::query, which evaluates the given XPath expression and returns a DOMNodeList containing all nodes that match it; DOMXPath::registerNamespace, which is needed before XPath can address elements in a document that declares a default namespace with xmlns (in a sitemap, that declaration is on the urlset tag); and DOMXPath::registerPhpFunctions. Many XML files (e.g., PAD files) have no xmlns declaration and therefore need no namespace registration.
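The difference between query, evaluate, and registerNamespace is easiest to see on a sitemap. The sketch below assumes a two-URL sitemap with placeholder example.com addresses, and the prefix "sm" is an arbitrary name chosen for illustration; any prefix works as long as it is bound to the sitemap namespace URI.

```php
<?php
// Hypothetical sitemap; the URLs are placeholders.
$sitemap = <<<XML
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/about.html</loc></url>
</urlset>
XML;

$doc = new DOMDocument();
$doc->loadXML($sitemap);
$xpath = new DOMXPath($doc);

// Without a registered prefix, an unprefixed name means "no namespace"
// in XPath 1.0, so it matches none of the namespaced <loc> elements.
$unregistered = $xpath->query('//loc')->length;

// Bind our chosen prefix to the default namespace declared on <urlset>.
$xpath->registerNamespace('sm', 'http://www.sitemaps.org/schemas/sitemap/0.9');

// query() always returns a DOMNodeList.
$urls = [];
foreach ($xpath->query('//sm:url/sm:loc') as $loc) {
    $urls[] = trim($loc->textContent);
}

// evaluate() can return a scalar instead; count() comes back as a float.
$count = $xpath->evaluate('count(//sm:loc)');

print_r($urls);
```

If a query against a sitemap returns nothing, a missing registerNamespace call is the first thing to check.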
Who cares about XML files? XML was designed to transport and store data, while HTML was designed to display data. XML is one of the most common formats for transmitting data between all sorts of applications. Computer systems and databases often hold data in incompatible formats, which makes sharing hard. Because XML data is stored as plain text, it provides a way of storing data that does not depend on particular software or hardware, and that makes it much easier to create data that different applications can share. XML also makes transporting data easier. People often need to exchange data between incompatible systems over the Internet, and exchanging it as XML greatly reduces the complexity, since each of those applications can read it.
The script List Urls in XML Sitemap by Tag Name Using XPATH and PHP uses a website sitemap file as its sample file; we generate it with XML Sitemaps. A sitemap is an XML file that lists a site's pages so that search engines can be sure to find all of them. The script List Specified Elements in XML Document by Tag Name Using XPATH and PHP uses a website PAD file as its sample file; we generate it with PADGen.
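As a rough illustration of listing specified elements by tag name, the sketch below runs against a heavily trimmed, made-up PAD-style document. The element values are invented and the $wanted list is our own choice; this is not the linked script itself, and a real PADGen file contains many more fields.

```php
<?php
// Hypothetical, trimmed-down PAD-style document (no xmlns declaration).
$pad = <<<XML
<XML_DIZ_INFO>
  <Company_Info>
    <Company_Name>Example Software</Company_Name>
  </Company_Info>
  <Program_Info>
    <Program_Name>Example App</Program_Name>
    <Program_Version>1.0</Program_Version>
  </Program_Info>
</XML_DIZ_INFO>
XML;

$doc = new DOMDocument();
$doc->loadXML($pad);
$xpath = new DOMXPath($doc);

// Tag names we want to list; "//" finds them at any depth.
$wanted = ['Company_Name', 'Program_Name', 'Program_Version'];

$found = [];
foreach ($wanted as $tag) {
    foreach ($xpath->query('//' . $tag) as $node) {
        $found[$tag] = trim($node->textContent);
    }
}

foreach ($found as $tag => $value) {
    echo $tag, ': ', $value, "\n";
}
```

Because the document declares no namespace, no registerNamespace call is needed here.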
So the first few links above cover the XML side of XPATH. The remaining apps are about HTML files and website files. Websites can be built out of several file types; our scripts concentrate on the HTML, HTM, and PHP file extensions. You can quickly see just how powerful XPATH is (check out the above links) when you look at what it makes possible: site indexing and searching, and listing all the internal or external (offsite) URLs, videos, audio files, images, page titles, or page descriptions across the hundreds of pages on a website. There's even a Search Website without Indexing Using XPATH and PHP script that searches site X while running on site Y, without indexing the site first. Indexing means grabbing the content of a site's pages, stripping out the HTML tags, and storing the text in a database, usually MySQL. Like we say: this XPATH stuff is powerful magic!
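For the HTML side, here is a minimal sketch of the kind of extraction those scripts perform on a single page. The markup and URLs are invented sample data, and matching audio links by ".mp3" is a simplifying assumption; a real script would fetch each page and check more extensions.

```php
<?php
// Hypothetical page; every URL here is a placeholder.
$html = <<<HTML
<html><head><title>Sample Page</title>
<meta name="description" content="A sample page."></head>
<body>
<a href="http://www.example.org/zebra.html">Zebra</a>
<a href="/about.html">About</a>
<a href="http://www.example.com/media/song.mp3">Song</a>
<img src="/images/logo.png" alt="Logo">
</body></html>
HTML;

$doc = new DOMDocument();
// Real-world HTML is rarely well formed, so collect parser warnings quietly.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// All link targets, sorted alphabetically.
$links = [];
foreach ($xpath->query('//a/@href') as $attr) {
    $links[] = $attr->nodeValue;
}
sort($links);

// Audio links only (assumption: anything whose href contains ".mp3").
$audio = [];
foreach ($xpath->query('//a[contains(@href, ".mp3")]/@href') as $attr) {
    $audio[] = $attr->nodeValue;
}

// Page title and description via evaluate() with the string() function.
$title = $xpath->evaluate('string(//title)');
$description = $xpath->evaluate('string(//meta[@name="description"]/@content)');

print_r($links);
```

HTML documents loaded with loadHTML need no namespace registration, so the expressions use plain tag names.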