Document Object Model

User Contributed Notes 38 notes

up
11
tobiasz.cudnik[at]gmail.com
16 years ago
If you need a simple interface to the DOM, check out phpQuery, a jQuery port to PHP:
http://code.google.com/p/phpquery/

It uses CSS selectors to fetch nodes.
Here's an example of how it works:
<?php
// just one file to include
require('phpQuery/phpQuery.php');

$html = '
<div>
mydiv
<ul>
<li>1</li>
<li>2</li>
<li>3</li>
</ul>
</div>'
;

// initialize new DOM from markup
phpQuery::newDocument($html)
    ->find('ul > li')
    ->addClass('my-new-class')
    ->filter(':last')
    ->addClass('last-li');

// query all unordered lists in last used DOM
pq('ul')->insertAfter('div');

// iterate all LIs from last used DOM
foreach(pq('li') as $li) {
// iteration returns plain DOM nodes, not phpQuery objects
pq($li)->addClass('my-second-new-class');
}

// same as pq('anything')->htmlOuter()
// but on document root (returns doctype etc)
print phpQuery::getDocument();
?>

It uses the DOM extension and XPath, so it works only on PHP 5.
up
5
Yanik <clonyara(at)ahoo(dot)com>
17 years ago
I hate the DOM model!
So I wrote a simple dom2array function (simple to use):

function dom2array($node) {
    $res = array();
    // Debug output; uncomment to trace the node types being visited.
    // print $node->nodeType.'<br/>';
    if ($node->nodeType == XML_TEXT_NODE) {
        $res = $node->nodeValue;
    } else {
        if ($node->hasAttributes()) {
            $attributes = $node->attributes;
            if (!is_null($attributes)) {
                $res['@attributes'] = array();
                foreach ($attributes as $index => $attr) {
                    $res['@attributes'][$attr->name] = $attr->value;
                }
            }
        }
        if ($node->hasChildNodes()) {
            $children = $node->childNodes;
            for ($i = 0; $i < $children->length; $i++) {
                $child = $children->item($i);
                // Children that share a node name overwrite each other here.
                $res[$child->nodeName] = dom2array($child);
            }
        }
    }
    return $res;
}
up
3
super dot puma at gmail dot com
10 years ago
If you want to print the content of a DOM XML file, you can use the following code:

$doc = new DOMDocument();
$doc->load($xmlFileName);
echo "<br>" . $doc->documentURI;
$x = $doc->documentElement;
getNodeContent($x->childNodes, 0);

function getNodeContent($nodes, $level){
foreach ($nodes AS $item) {
// print "<br><br>TIPO: " . $item->nodeType ;
printValues($item, $level);
if ($item->nodeType == 1) { //DOMElement
foreach ($item->attributes AS $itemAtt) {
printValues($itemAtt, $level+3);
}
if ($item->hasChildNodes()) {
getNodeContent($item->childNodes, $level+5);
}
}
}
}

function printValues($item, $level){
if ($item->nodeType == 1) { //DOMElement
printLevel($level);
print $item->nodeName . " = " . $item->nodeValue;
}
if ($item->nodeType == 2) { //DOMAttr
printLevel($level);
print $item->name . " = " . $item->value ;
}
if ($item->nodeType == 3) { //DOMText
if ($item->isWhitespaceInElementContent() == false){
printLevel($level);
print $item->wholeText ;
}
}
}

function printLevel($level)
{
print "<br>";
if ($level == 0) {
print "<br>";
}
for($i=0; $i < $level; $i++) {
print "-";
}
}
up
2
pes_cz
19 years ago
When I tried to parse my XHTML Strict files with the DOM extension, it couldn't understand XHTML entities (like &copy;). I found a post about it here (14-Jul-2005 09:05) which advised setting resolveExternals = true, but that was very slow. There was a brief mention of XML catalogs, but without any details. Here they are:

An XML catalog works something like a cache. Download all the needed DTDs to /etc/xml, edit the file /etc/xml/catalog, and add this line: <public publicId="-//W3C//DTD XHTML 1.0 Strict//EN" uri="file:///etc/xml/xhtml1-strict.dtd" />

That's all. Thanks to http://www.whump.com/moreLikeThis/link/03815
up
2
simlee at indiana dot edu
18 years ago
The project I'm currently working on uses XPaths to dynamically navigate through chunks of an XML file. I couldn't find any PHP code on the net that would build the XPath to a node for me, so I wrote my own function. Turns out it wasn't as hard as I thought it might be (yay recursion), though it does entail using some PHP shenanigans...

Hopefully it'll save someone else the trouble of reinventing this wheel.

<?php
function getNodeXPath( $node ) {
    // REMEMBER THAT XPATHS USE BASE-1 INSTEAD OF BASE-0!!!

    // Get the index for the current node by looping through the siblings.
    $parentNode = $node->parentNode;
    if( $parentNode != null ) {
        $nodeIndex = 0;
        do {
            // Failsafe return value: the node is not among its parent's children.
            if( $nodeIndex >= $parentNode->childNodes->length ) return( "/" );

            $testNode = $parentNode->childNodes->item( $nodeIndex );
            $nodeName = $testNode->nodeName;
            $nodeIndex++;

            // PHP trickery! Here we create a counter based on the node
            // name of the test node to use in the XPath. The "count_" prefix
            // keeps the counter from clobbering locals such as $node.
            $counter = 'count_' . $nodeName;
            if( !isset( $$counter ) ) $$counter = 1;
            else $$counter++;
        } while( !$node->isSameNode( $testNode ) );

        // Recursively get the XPath for the parent.
        return( getNodeXPath( $parentNode ) . "/{$node->nodeName}[{$$counter}]" );
    } else {
        // Hit the root node! Note that the slash is added when
        // building the XPath, so we return just an empty string.
        return( "" );
    }
}
?>
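
For comparison, the DOM extension also has a built-in DOMNode::getNodePath() method that returns an XPath location for a node. A minimal sketch, assuming a small made-up document (the element names are illustrative only). The exact string is decided by libxml, so rather than relying on a particular format the example evaluates the path again with DOMXPath and checks that it leads back to the same node:

```php
<?php
// Hypothetical sample document, for illustration only.
$doc = new DOMDocument();
$doc->loadXML('<root><item>a</item><item>b</item></root>');

// Pick the second <item> element.
$second = $doc->getElementsByTagName('item')->item(1);

// Ask libxml for an XPath expression that locates this node.
$path = $second->getNodePath();

// Evaluate the path again and check that it leads back to the same node.
$xpath = new DOMXPath($doc);
$found = $xpath->query($path)->item(0);

echo $path, "\n";
echo $found->isSameNode($second) ? "same node\n" : "different node\n";
```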
up
2
johanwthijs-at-hotmail-dot-com
19 years ago
Being an experienced ASP developer, I was wondering how to replace the textual content of a node (with msxml this is simply achieved by setting the 'text' property of a node). Out of frustration I started to play around with SimpleXml, but I could not get it to work in combination with XPath.

It took me a lot of time to find out, so I hope this helps others:

function replaceNodeText($objXml, $objNode, $strNewContent){
    /*
    This function replaces a node's string content with strNewContent
    */
    // Collect the text nodes first: childNodes is a live list, so removing
    // nodes while iterating over it can skip some of them.
    $arrTextNodes = array();
    foreach ( $objNode->childNodes as $objNodeNested ){
        if ($objNodeNested->nodeType == XML_TEXT_NODE) $arrTextNodes[] = $objNodeNested;
    }
    foreach ( $arrTextNodes as $objTextNode ){
        $objNode->removeChild($objTextNode);
    }

    $objNode->appendChild($objXml->createTextNode($strNewContent));
}

$objXml = new DOMDocument();
$objXml->loadXML('<root><node id="1">bla</node></root>');
$objXpath = new DOMXPath($objXml);

// example value; in practice this comes from elsewhere in your application
$strImportedValue = 'new text';

$strXpath = "/root/node[@id='1']";
$objNodeList = $objXpath->query($strXpath);
foreach ($objNodeList as $objNode){
    // objects are passed by handle, so no explicit reference (&) is needed
    replaceNodeText($objXml, $objNode, $strImportedValue);
}
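
A shorter alternative is to assign to the element's nodeValue property, which replaces its content with a single text node. A minimal sketch, assuming the new content is plain text without markup or ampersands (the sample document and value are made up). Unlike replaceNodeText() above, this also discards any child elements of the node:

```php
<?php
// Hypothetical sample document, for illustration only.
$doc = new DOMDocument();
$doc->loadXML('<root><node id="1">bla</node></root>');

$xpath = new DOMXPath($doc);
foreach ($xpath->query("/root/node[@id='1']") as $node) {
    // Replaces everything inside <node> with one text node.
    $node->nodeValue = 'new text';
}

echo $doc->saveXML($doc->documentElement), "\n";
```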
up
2
cooper at asu dot ntu-kpi dot kiev dot ua
18 years ago
If you are using the non-object-oriented functions and it would take too much time to change them all (or you plan to replace them later), these modules can be used as a temporary solution:

For DOM XML:
http://alexandre.alapetite.net/doc-alex/domxml-php4-php5/

For XSLT:
http://alexandre.alapetite.net/doc-alex/xslt-php4-php5/
up
2
aidan at php dot net
19 years ago
When dealing with validation or loading, the output errors can be quite annoying.

PHP 5.1 introduces libxml_get_errors().

http://php.net/libxml_get_errors
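
A minimal sketch of how this can be used, assuming you want to collect the parser messages yourself instead of having them emitted as PHP warnings. The malformed input string is made up for illustration:

```php
<?php
// Collect libxml errors internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
// Deliberately malformed XML (mismatched closing tag), for illustration only.
$ok = $doc->loadXML('<root><node>bla</note></root>');

$messages = array();
foreach (libxml_get_errors() as $error) {
    // Each entry is a LibXMLError object (level, code, message, line, ...).
    $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
}

// Empty the error buffer and restore the previous setting.
libxml_clear_errors();
libxml_use_internal_errors($previous);

echo $ok ? "loaded\n" : "failed to load\n";
echo implode("\n", $messages), "\n";
```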
up
2
aidan at php dot net
19 years ago
As of PHP 5.1, libxml options may be set using constants rather than the proprietary DomDocument properties.

DomDocument->resolveExternals is equivalent to setting
LIBXML_DTDLOAD
LIBXML_DTDATTR

DomDocument->validateOnParse is equivalent to setting
LIBXML_DTDLOAD
LIBXML_DTDVALID

PHP 5.1 users are encouraged to use the new constants.

Example:

DomDocument->load($file, LIBXML_DTDLOAD|LIBXML_DTDATTR);

DomDocument->load($file, LIBXML_DTDLOAD|LIBXML_DTDVALID);
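
A rough, self-contained sketch of the second form, assuming a document that carries its own inline DTD so that nothing has to be fetched. The DTD, the element names and the dtd_errors() helper are all made up for illustration:

```php
<?php
// Hypothetical helper: parse $xml with DTD validation switched on and
// return the messages libxml collected (an empty array when valid).
function dtd_errors($xml) {
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $doc = new DOMDocument();
    $doc->loadXML($xml, LIBXML_DTDLOAD | LIBXML_DTDVALID);

    $messages = array();
    foreach (libxml_get_errors() as $error) {
        $messages[] = trim($error->message);
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $messages;
}

// Made-up DTD: <root> must contain exactly one <item>.
$dtd = '<!DOCTYPE root [<!ELEMENT root (item)><!ELEMENT item (#PCDATA)>]>';

$validErrors   = dtd_errors($dtd . '<root><item>x</item></root>');
$invalidErrors = dtd_errors($dtd . '<root><other/></root>');

echo count($validErrors), " message(s) for the valid document\n";
echo count($invalidErrors), " message(s) for the invalid document\n";
```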
up
1
toby at tobiasly dot com
19 years ago
This module is also not included by default in the CentOS 4 "centosplus" repository. For those using PHP5 on CentOS 4, a simple "yum --enablerepo=centosplus install php-xml" will do the trick (this will install both the XML and DOM modules).
up
2
sweisman at pobox dot com
15 years ago
I had problems with the dom2array_full function by "nospam at ya dot ru". Here's my function, which works correctly for my project, and might work for yours:

<?php
function dom_to_array($root)
{
    $result = array();

    if ($root->hasAttributes())
    {
        $attrs = $root->attributes;

        foreach ($attrs as $i => $attr)
            $result[$attr->name] = $attr->value;
    }

    $children = $root->childNodes;

    if ($children->length == 1)
    {
        $child = $children->item(0);

        if ($child->nodeType == XML_TEXT_NODE)
        {
            $result['_value'] = $child->nodeValue;

            if (count($result) == 1)
                return $result['_value'];
            else
                return $result;
        }
    }

    $group = array();

    for ($i = 0; $i < $children->length; $i++)
    {
        $child = $children->item($i);

        if (!isset($result[$child->nodeName]))
            $result[$child->nodeName] = dom_to_array($child);
        else
        {
            if (!isset($group[$child->nodeName]))
            {
                $tmp = $result[$child->nodeName];
                $result[$child->nodeName] = array($tmp);
                $group[$child->nodeName] = 1;
            }

            $result[$child->nodeName][] = dom_to_array($child);
        }
    }

    return $result;
}
?>
up
1
PHPdeveloper
17 years ago
Yanik's dom2array() function (added on 14-Mar-2007 08:40) does not handle multiple nodes with the same name, e.g.:

<foo>
<name>aa</name>
<name>bb</name>
</foo>

It will overwrite the earlier entries, and your array will contain just the last one ("bb").
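
One way around this is to append to a list per node name instead of assigning. A minimal sketch of the idea, not a drop-in replacement for dom2array(): children_by_name() is a hypothetical helper that only looks at direct child elements and their text.

```php
<?php
// Hypothetical helper: maps each child element name to a LIST of text
// values, so repeated names are kept instead of overwritten.
function children_by_name($element) {
    $result = array();
    foreach ($element->childNodes as $child) {
        if ($child->nodeType == XML_ELEMENT_NODE) {
            // [] appends, creating the inner array on first use.
            $result[$child->nodeName][] = $child->textContent;
        }
    }
    return $result;
}

$doc = new DOMDocument();
$doc->loadXML('<foo><name>aa</name><name>bb</name></foo>');

$names = children_by_name($doc->documentElement);
print_r($names);
```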
up
1
amir.laherATcomplinet.com
19 years ago
This particular W3C page provides invaluable documentation for the DOM classes implemented in php5 (via libxml2). It fills in plenty of php.net's gaps:

http://www.w3.org/TR/DOM-Level-2-Core/core.html

Some key examples:
* concise summary of the class hierarchy (1.1.1)
* clarification that DOM level 2 doesn't allow for population of internal DTDs
* explanation of DOMNode->normalize()
* explanation of the DOMImplementation class

The interfaces are described in OMG's Interface Definition Language.
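
As a small illustration of the DOMNode->normalize() point, here is a minimal sketch that builds an element with two adjacent text nodes and then normalizes it. The element name and strings are made up, and the count before normalizing is printed rather than assumed:

```php
<?php
$doc = new DOMDocument();
$p = $doc->createElement('p');
$doc->appendChild($p);

// Append two separate text nodes, as can happen after programmatic edits.
$p->appendChild($doc->createTextNode('Hello, '));
$p->appendChild($doc->createTextNode('world'));
$before = $p->childNodes->length;

// Merge adjacent text nodes into a single one.
$p->normalize();
$after = $p->childNodes->length;

echo "child nodes before: $before, after: $after\n";
echo $p->textContent, "\n";
```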
up
1
ohcc at 163 dot com
9 years ago
<?php
// this note is about how to get a DOMNode's outerHTML and innerHTML.
$dom = new DOMDocument('1.0','UTF-8');
$dom->loadHTML('<html><body><div><p>p1</p><p>p2</p></div></body></html>');
$node = $dom->getElementsByTagName('div')->item(0);
$outerHTML = $node->ownerDocument->saveHTML($node);
$innerHTML = '';
foreach ($node->childNodes as $childNode){
    $innerHTML .= $childNode->ownerDocument->saveHTML($childNode);
}
echo '<h2>outerHTML: </h2>';
echo htmlspecialchars($outerHTML);
echo '<h2>innerHTML: </h2>';
echo htmlspecialchars($innerHTML);
?>