HTMLElement.php
The PHP DOM library is very useful but painful to use.
Traversing a DOMNodeList takes many lines of code, and the code that accesses a node and its properties is often duplicated.
Traversing a DOMNodeList
The chain of classes you have to go through to get a DOMElement is:
- DOMDocument => DOMXPath => DOMNodeList => DOMNode => DOMElement
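As a minimal sketch of that chain, using only the built-in DOM classes, the snippet below prints the class of the object obtained at each step. The sample markup is an assumption chosen for illustration.

```php
<?php
// Walk the chain once and report the class of each object along the way.
$doc = new DOMDocument();
$doc->loadHTML('<p>Hello</p>', LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($doc);          // DOMDocument => DOMXPath
$nodeList = $xpath->query('//p');     // DOMXPath => DOMNodeList
$node = $nodeList->item(0);           // DOMNodeList => DOMNode (here a DOMElement)

echo get_class($doc), PHP_EOL;        // DOMDocument
echo get_class($xpath), PHP_EOL;      // DOMXPath
echo get_class($nodeList), PHP_EOL;   // DOMNodeList
echo get_class($node), PHP_EOL;       // DOMElement
var_dump($node instanceof DOMNode);   // bool(true): DOMElement extends DOMNode
```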
Using DOMXPath to traverse a DOMNodeList:
$html = '<p>First paragraph <a href="#">Link</a></p><p>Second paragraph</p>';
$doc = new DOMDocument();
libxml_use_internal_errors(TRUE);
$doc->loadHTML($html, LIBXML_HTML_NODEFDTD);
libxml_clear_errors();
$xpath = new DOMXPath($doc);
$nodeList = $xpath->query('//p');
if ($nodeList->length) {
    foreach ($nodeList as $i => $node) {
        if ($nodeList->item($i)->nodeType == XML_ELEMENT_NODE) {
            echo $node->nodeValue, PHP_EOL;
            echo $node->ownerDocument->saveHTML($node), PHP_EOL; // like outerHTML
        }
    }
}
$nodeList = $xpath->query('//a');
if ($nodeList->length) {
    foreach ($nodeList as $i => $node) {
        if ($nodeList->item($i)->nodeType == XML_ELEMENT_NODE) {
            echo $node->nodeValue, PHP_EOL;
            echo $node->ownerDocument->saveHTML($node), PHP_EOL; // like outerHTML
        }
    }
}
To access two sets of nodes, I have to query and loop through a DOMNodeList once for each: lots of duplicated code.
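One way to see how that duplication can be factored out is a small helper built only on the standard DOM classes. This is a rough sketch, not the HTMLElement implementation; the function name `outer_html_list` is hypothetical and exists only for this illustration.

```php
<?php
/**
 * Hypothetical helper (not part of HTMLElement): return the outer HTML of
 * every element node matched by an XPath expression.
 */
function outer_html_list(string $html, string $expression): array
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(TRUE); // collect parser warnings silently
    $doc->loadHTML($html, LIBXML_HTML_NODEFDTD);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);        // restore the previous setting

    $result = [];
    $nodeList = (new DOMXPath($doc))->query($expression);
    if ($nodeList === FALSE) {
        return $result; // malformed XPath expression
    }
    foreach ($nodeList as $node) {
        if ($node->nodeType === XML_ELEMENT_NODE) {
            $result[] = $doc->saveHTML($node); // like outerHTML
        }
    }
    return $result;
}

$html = '<p>First paragraph <a href="#">Link</a></p><p>Second paragraph</p>';
print_r(outer_html_list($html, '//p'));
print_r(outer_html_list($html, '//a'));
```

With the loop written once, each additional query costs a single call instead of another copy of the loop.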
HTMLElement class
The HTMLElement class inherits the methods and properties of the DOMElement class.
https://github.com/stemar/html-element
- Use HTMLElement::elements() if you want to use DOMElement inherited methods and properties.
- Use HTMLElement::xpath() if you want to output the HTML node as a string straight away.
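The difference between working with an element object and working with its HTML string can be illustrated with the built-in DOM classes alone. The sketch below does not use HTMLElement; it only shows the two forms of the same node, with sample markup assumed for illustration.

```php
<?php
$doc = new DOMDocument();
$doc->loadHTML('<p>First paragraph <a href="#">Link</a></p>', LIBXML_HTML_NODEFDTD);
$link = (new DOMXPath($doc))->query('//a')->item(0);

// Object form: DOMElement methods and properties are available.
echo $link->tagName, PHP_EOL;              // a
echo $link->getAttribute('href'), PHP_EOL; // #
echo $link->nodeValue, PHP_EOL;            // Link

// String form: the node serialized as HTML, ready to output.
echo $doc->saveHTML($link), PHP_EOL;       // <a href="#">Link</a>
```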
HTMLElement reduces the repetitive lines of code needed to query a DOMElement with an XPath expression.
$html = '<p>First paragraph <a href="#">Link</a></p><p>Second paragraph</p>';
echo var_export(HTMLElement::new($html)->xpath('//p'), TRUE), PHP_EOL;
echo HTMLElement::new($html)->elements('//a')[0]->nodeValue, PHP_EOL;
OR
$html = '<p>First paragraph <a href="#">Link</a></p><p>Second paragraph</p>';
$html_element = new HTMLElement($html);
echo var_export($html_element->xpath('//p'), TRUE), PHP_EOL;
echo $html_element->elements('//a')[0]->nodeValue, PHP_EOL;
Results:
array (
  0 => '<p>First paragraph <a href="#">Link</a></p>',
  1 => '<p>Second paragraph</p>',
)
Link