HTMLElement.php

The PHP DOM extension is very useful but painful to use.

Traversing a DOMNodeList takes many lines of code, often duplicated, just to access a node and its properties.

Traversing a DOMNodeList

To get to a DOMElement, you have to work through this chain of classes:

  • DOMDocument => DOMXPath => DOMNodeList => DOMNode => DOMElement
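The last step of that chain is inheritance rather than a conversion: DOMNodeList::item() is declared to return a DOMNode, and for an element match the object is already a DOMElement, which extends DOMNode. A small sketch to check this, assuming the DOM extension is enabled (the sample markup is made up for illustration):

```php
<?php
// Sketch only: print the class of the object obtained at each step.
$doc = new DOMDocument();
$doc->loadHTML('<p>Hello</p>', LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($doc);
$nodeList = $xpath->query('//p');
$node = $nodeList->item(0); // declared return type is DOMNode (or NULL)

echo get_class($doc), ' => ', get_class($xpath), ' => ',
    get_class($nodeList), ' => ', get_class($node), PHP_EOL;
// prints: DOMDocument => DOMXPath => DOMNodeList => DOMElement

var_dump($node instanceof DOMNode); // bool(true)
```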

Using DOMXPath to traverse a DOMNodeList:

$html = '<p>First paragraph <a href="#">Link</a></p><p>Second paragraph</p>';
$doc = new DOMDocument();
libxml_use_internal_errors(TRUE);
$doc->loadHTML($html, LIBXML_HTML_NODEFDTD);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

$nodeList = $xpath->query('//p');
if ($nodeList->length) {
    foreach ($nodeList as $node) {
        if ($node->nodeType === XML_ELEMENT_NODE) {
            echo $node->nodeValue, PHP_EOL;
            echo $node->ownerDocument->saveHTML($node), PHP_EOL; // like outerHTML
        }
    }
}

$nodeList = $xpath->query('//a');
if ($nodeList->length) {
    foreach ($nodeList as $node) {
        if ($node->nodeType === XML_ELEMENT_NODE) {
            echo $node->nodeValue, PHP_EOL;
            echo $node->ownerDocument->saveHTML($node), PHP_EOL; // like outerHTML
        }
    }
}

To access two different sets of nodes, I have to run a query and loop through a DOMNodeList each time: a lot of repeated code.
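The repetition can be factored into a single function with nothing but the standard DOM extension. The following is a minimal sketch of that idea, not the HTMLElement implementation; the helper name outer_html_list() is hypothetical and chosen only for this illustration:

```php
<?php
/**
 * Hypothetical helper (not part of HTMLElement): returns the outer HTML
 * of every element matched by an XPath expression.
 */
function outer_html_list(string $html, string $expression): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of emitting them,
    // then restore the previous setting.
    $previous = libxml_use_internal_errors(TRUE);
    $doc->loadHTML($html, LIBXML_HTML_NODEFDTD);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $results = [];
    $nodeList = (new DOMXPath($doc))->query($expression);
    if ($nodeList === FALSE) {
        return $results; // malformed XPath expression
    }
    foreach ($nodeList as $node) {
        if ($node instanceof DOMElement) {
            $results[] = $doc->saveHTML($node); // like outerHTML
        }
    }
    return $results;
}

$html = '<p>First paragraph <a href="#">Link</a></p><p>Second paragraph</p>';
print_r(outer_html_list($html, '//p'));
print_r(outer_html_list($html, '//a'));
```

Each query is now one call instead of a copied loop, which is the same reduction the class below aims for.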

HTMLElement class

The HTMLElement class inherits the methods and properties of the DOMElement class.

https://github.com/stemar/html-element

  • Use HTMLElement::elements() if you want to use the methods and properties inherited from DOMElement.
  • Use HTMLElement::xpath() if you want the matching HTML nodes returned as strings right away.

HTMLElement reduces the repetitive lines of code needed to query a DOMElement with an XPath expression.

$html = '<p>First paragraph <a href="#">Link</a></p><p>Second paragraph</p>';
echo var_export(HTMLElement::new($html)->xpath('//p'), TRUE), PHP_EOL;
echo HTMLElement::new($html)->elements('//a')[0]->nodeValue, PHP_EOL;

Or, reusing one instance:

$html = '<p>First paragraph <a href="#">Link</a></p><p>Second paragraph</p>';
$html_element = new HTMLElement($html);
echo var_export($html_element->xpath('//p'), TRUE), PHP_EOL;
echo $html_element->elements('//a')[0]->nodeValue, PHP_EOL;

Results:

array (
  0 => '<p>First paragraph <a href="#">Link</a></p>',
  1 => '<p>Second paragraph</p>',
)
Link