Document Object Model

User Contributed Notes 38 notes

up
11
tobiasz.cudnik[at]gmail.com
16 years ago
If you need a simple interface to the DOM, check out phpQuery, a jQuery port to PHP:
http://code.google.com/p/phpquery/

It uses CSS selectors to fetch nodes.
Here's an example of how it works:
<?php
// just one file to include
require('phpQuery/phpQuery.php');

$html = '
<div>
mydiv
<ul>
<li>1</li>
<li>2</li>
<li>3</li>
</ul>
</div>'
;

// initialize a new DOM from the markup defined above
phpQuery::newDocument($html)
    ->find('ul > li')
    ->addClass('my-new-class')
    ->filter(':last')
    ->addClass('last-li');

// query all unordered lists in last used DOM
pq('ul')->insertAfter('div');

// iterate all LIs from last used DOM
foreach(pq('li') as $li) {
// iteration returns plain DOM nodes, not phpQuery objects
pq($li)->addClass('my-second-new-class');
}

// same as pq('anything')->htmlOuter()
// but on document root (returns doctype etc)
print phpQuery::getDocument();
?>

It uses the DOM extension and XPath, so it requires PHP 5.
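
For comparison, here is a minimal sketch of the same "select the list items and add classes" step using only the built-in DOM extension. It is not part of phpQuery; it assumes that the XPath expression `//ul/li` is an acceptable stand-in for the CSS selector `ul > li`, and the variable names are chosen for illustration only.

```php
<?php
// Sketch only: plain DOM + XPath instead of phpQuery.
$html = '<div>mydiv<ul><li>1</li><li>2</li><li>3</li></ul></div>';

$doc = new DOMDocument();
$doc->loadHTML($html);

$xpath = new DOMXPath($doc);
$items = $xpath->query('//ul/li'); // XPath counterpart of "ul > li"

for ($i = 0; $i < $items->length; $i++) {
    $classes = 'my-new-class';
    if ($i === $items->length - 1) {
        $classes .= ' last-li'; // mark the last item, like filter(':last')
    }
    $items->item($i)->setAttribute('class', $classes);
}

$firstClass = $items->item(0)->getAttribute('class');
$lastClass  = $items->item($items->length - 1)->getAttribute('class');
echo $firstClass, "\n", $lastClass, "\n";
```

The chained phpQuery calls are shorter, but this version has no external dependency.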
up
5
Yanik <clonyara(at)ahoo(dot)com>
17 years ago
I hate the DOM model!
So I wrote a simple-to-use dom2array function:

function dom2array($node) {
    $res = array();
    // print $node->nodeType.'<br/>'; // debug output, disabled
    if($node->nodeType == XML_TEXT_NODE){
        // A text node is returned as its string value.
        $res = $node->nodeValue;
    }
    else{
        if($node->hasAttributes()){
            $attributes = $node->attributes;
            if(!is_null($attributes)){
                $res['@attributes'] = array();
                foreach ($attributes as $index=>$attr) {
                    $res['@attributes'][$attr->name] = $attr->value;
                }
            }
        }
        if($node->hasChildNodes()){
            $children = $node->childNodes;
            for($i=0;$i<$children->length;$i++){
                $child = $children->item($i);
                // Children are keyed by node name.
                $res[$child->nodeName] = dom2array($child);
            }
        }
    }
    return $res;
}
up
3
super dot puma at gmail dot com
10 years ago
If you want to print the content of a DOM XML file, you can use the following code:

$doc = new DOMDocument();
$doc->load($xmlFileName);
echo "<br>" . $doc->documentURI;
$x = $doc->documentElement;
getNodeContent($x->childNodes, 0);

function getNodeContent($nodes, $level){
foreach ($nodes AS $item) {
// print "<br><br>TIPO: " . $item->nodeType ;
printValues($item, $level);
if ($item->nodeType == 1) { //DOMElement
foreach ($item->attributes AS $itemAtt) {
printValues($itemAtt, $level+3);
}
if ($item->hasChildNodes()) {
getNodeContent($item->childNodes, $level+5);
}
}
}
}

function printValues($item, $level){
if ($item->nodeType == 1) { //DOMElement
printLevel($level);
print $item->nodeName . " = " . $item->nodeValue;
}
if ($item->nodeType == 2) { //DOMAttr
printLevel($level);
print $item->name . " = " . $item->value ;
}
if ($item->nodeType == 3) { //DOMText
if ($item->isWhitespaceInElementContent() == false){
printLevel($level);
print $item->wholeText ;
}
}
}

function printLevel($level)
{
print "<br>";
if ($level == 0) {
print "<br>";
}
for($i=0; $i < $level; $i++) {
print "-";
}
}
up
2
pes_cz
19 years ago
When I tried to parse my XHTML Strict files with the DOM extension, it couldn't understand XHTML entities (like &copy;). I found a post about it here (14-Jul-2005 09:05) which advised setting resolveExternals = true, but that was very slow. There was a small note about XML catalogs, but without any clue how to use them. Here it is:

An XML catalog works something like a cache. Download all the needed DTDs to /etc/xml, edit the file /etc/xml/catalog, and add this line: <public publicId="-//W3C//DTD XHTML 1.0 Strict//EN" uri="file:///etc/xml/xhtml1-strict.dtd" />

That's all. Thanks to http://www.whump.com/moreLikeThis/link/03815
up
2
simlee at indiana dot edu
18 years ago
The project I'm currently working on uses XPaths to dynamically navigate through chunks of an XML file. I couldn't find any PHP code on the net that would build the XPath to a node for me, so I wrote my own function. Turns out it wasn't as hard as I thought it might be (yay recursion), though it does entail using some PHP shenanigans...

Hopefully it'll save someone else the trouble of reinventing this wheel.

<?php
function getNodeXPath( $node ) {
    // REMEMBER THAT XPATH POSITIONS ARE 1-BASED, NOT 0-BASED!!!

    // Get the index for the current node by looping through the siblings.
    $parentNode = $node->parentNode;
    if( $parentNode != null ) {
        $nodeIndex = 0;
        do {
            // Failsafe return value: stop before reading past the last child
            // (happens when $node is not among its parent's child nodes).
            if( $nodeIndex >= $parentNode->childNodes->length ) return( "/" );

            $testNode = $parentNode->childNodes->item( $nodeIndex );
            $nodeName = $testNode->nodeName;
            $nodeIndex++;

            // PHP trickery! Here we create a counter based on the node
            // name of the test node to use in the XPath.
            // Caution: a node named like one of this function's local
            // variables (e.g. "node") would clobber that variable.
            if( !isset( $$nodeName ) ) $$nodeName = 1;
            else $$nodeName++;
        } while( !$node->isSameNode( $testNode ) );

        // Recursively get the XPath for the parent.
        return( getNodeXPath( $parentNode ) . "/{$node->nodeName}[{$$nodeName}]" );
    } else {
        // Hit the root node! Note that the slash is added when
        // building the XPath, so we return just an empty string.
        return( "" );
    }
}
?>
up
2
johanwthijs-at-hotmail-dot-com
18 years ago
As an experienced ASP developer, I was wondering how to replace the textual content of a node (with msxml this is simply achieved by setting the 'text' property of a node). Out of frustration I started to play around with SimpleXML, but I could not get it to work in combination with XPath.

It took me a lot of time to find out, so I hope this helps others:

function replaceNodeText($objXml, $objNode, $strNewContent){
    /*
    This function replaces a node's string content with strNewContent
    */
    // Collect the text nodes first: childNodes is a live list, so removing
    // nodes while iterating over it would skip some of them.
    $arrTextNodes = array();
    foreach ( $objNode->childNodes as $objNodeNested ){
        if ($objNodeNested->nodeType == XML_TEXT_NODE) $arrTextNodes[] = $objNodeNested;
    }
    foreach ( $arrTextNodes as $objTextNode ){
        $objNode->removeChild($objTextNode);
    }

    $objNode->appendChild($objXml->createTextNode($strNewContent));
}

$objXml = new DOMDocument();
$objXml->loadXML('<root><node id="1">bla</node></root>');
$objXpath = new DOMXPath($objXml);

$strImportedValue = 'new text'; // example value, assumed for illustration

$strXpath="/root/node[@id='1']";
$objNodeList = $objXpath->query($strXpath);
foreach ($objNodeList as $objNode){
    // objects are passed by handle, so no explicit reference (&) is needed
    replaceNodeText($objXml, $objNode, $strImportedValue);
}
up
2
cooper at asu dot ntu-kpi dot kiev dot ua
17 years ago
If you are still using the non-object-oriented functions and changing them all would take too much time (or you plan to replace them later), these modules can serve as a temporary solution:

For DOM XML:
http://alexandre.alapetite.net/doc-alex/domxml-php4-php5/

For XSLT:
http://alexandre.alapetite.net/doc-alex/xslt-php4-php5/
up
2
aidan at php dot net
19 years ago
When dealing with validation or loading, the output errors can be quite annoying.

PHP 5.1 introduces libxml_get_errors().

http://php.net/libxml_get_errors
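
A minimal sketch of how this can be used, assuming you want to collect the parser errors yourself instead of having them emitted as PHP warnings. The malformed XML string is made up for the example; `libxml_use_internal_errors()`, `libxml_get_errors()` and `libxml_clear_errors()` are the standard libxml functions.

```php
<?php
// Collect libxml errors internally instead of emitting warnings.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
// Deliberately malformed: <item> is never closed.
$ok = $doc->loadXML('<root><item>unclosed</root>');

$errors = libxml_get_errors();
foreach ($errors as $error) {
    // Each entry is a LibXMLError object (level, code, message, line, column).
    printf("line %d: %s\n", $error->line, trim($error->message));
}

// Empty the error buffer so later operations start clean.
libxml_clear_errors();
```

Remember to clear the buffer after reading it, otherwise errors from earlier documents accumulate.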
up
2
aidan at php dot net
19 years ago
As of PHP 5.1, libxml options may be set using constants rather than through the proprietary DomDocument properties.

DomDocument->resolveExternals is equivalent to setting
LIBXML_DTDLOAD
LIBXML_DTDATTR

DomDocument->validateOnParse is equivalent to setting
LIBXML_DTDLOAD
LIBXML_DTDVALID

PHP 5.1 users are encouraged to use the new constants.

Example:

$doc = new DOMDocument();
$doc->load($file, LIBXML_DTDLOAD|LIBXML_DTDATTR);

$doc->load($file, LIBXML_DTDLOAD|LIBXML_DTDVALID);
up
1
toby at tobiasly dot com
18 years ago
This module is not included by default in the CentOS 4 "centosplus" repository either. For those using PHP5 on CentOS 4, a simple "yum --enablerepo=centosplus install php-xml" will do the trick (this will install both the XML and DOM modules).
up
2
sweisman at pobox dot com
15 years ago
I had problems with the dom2array_full function by "nospam at ya dot ru". Here's my function, which works correctly for my project, and might work for yours:

<?php
function dom_to_array($root)
{
    $result = array();

    if ($root->hasAttributes())
    {
        $attrs = $root->attributes;

        foreach ($attrs as $i => $attr)
            $result[$attr->name] = $attr->value;
    }

    $children = $root->childNodes;

    if ($children->length == 1)
    {
        $child = $children->item(0);

        if ($child->nodeType == XML_TEXT_NODE)
        {
            $result['_value'] = $child->nodeValue;

            if (count($result) == 1)
                return $result['_value'];
            else
                return $result;
        }
    }

    $group = array();

    for ($i = 0; $i < $children->length; $i++)
    {
        $child = $children->item($i);

        if (!isset($result[$child->nodeName]))
            $result[$child->nodeName] = dom_to_array($child);
        else
        {
            if (!isset($group[$child->nodeName]))
            {
                $tmp = $result[$child->nodeName];
                $result[$child->nodeName] = array($tmp);
                $group[$child->nodeName] = 1;
            }

            $result[$child->nodeName][] = dom_to_array($child);
        }
    }

    return $result;
}
?>
up
1
PHPdeveloper
17 years ago
Yanik's dom2array() function (added on 14-Mar-2007 08:40) does not handle multiple nodes with the same name, e.g.:

<foo>
<name>aa</name>
<name>bb</name>
</foo>

It will overwrite the earlier entries, and your array will contain just the last one ("bb").
up
1
amir.laherATcomplinet.com
19 years ago
This particular W3C page provides invaluable documentation for the DOM classes implemented in php5 (via libxml2). It fills in plenty of php.net's gaps:

http://www.w3.org/TR/DOM-Level-2-Core/core.html

Some key examples:
* concise summary of the class hierarchy (1.1.1)
* clarification that DOM level 2 doesn't allow for population of internal DTDs
* explanation of DOMNode->normalize()
* explanation of the DOMImplementation class

The interfaces are described in OMG's Interface Definition Language
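
As a small illustration of one of the points above, here is a sketch of what DOMNode->normalize() does: adjacent text nodes under an element are merged into one. The element and text used here are arbitrary examples.

```php
<?php
$doc = new DOMDocument();
$p = $doc->appendChild($doc->createElement('p'));

// Two separate, adjacent text nodes under <p>.
$p->appendChild($doc->createTextNode('Hello, '));
$p->appendChild($doc->createTextNode('world'));
$before = $p->childNodes->length;

// normalize() merges adjacent text nodes into a single node.
$p->normalize();
$after = $p->childNodes->length;

echo "before: $before, after: $after\n";
echo $p->firstChild->nodeValue, "\n";
```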
up
1
ohcc at 163 dot com
8 years ago
<?php
// this note is about how to get a DOMNode's outerHTML and innerHTML.
$dom = new DOMDocument('1.0','UTF-8');
$dom->loadHTML('<html><body><div><p>p1</p><p>p2</p></div></body></html>');
$node = $dom->getElementsByTagName('div')->item(0);
$outerHTML = $node->ownerDocument->saveHTML($node);
$innerHTML = '';
foreach ($node->childNodes as $childNode){
    $innerHTML .= $childNode->ownerDocument->saveHTML($childNode);
}
echo '<h2>outerHTML: </h2>';
echo htmlspecialchars($outerHTML);
echo '<h2>innerHTML: </h2>';
echo htmlspecialchars($innerHTML);
?>
up
1
miguelangelhdz at NOSPAM dot com
16 years ago
After searching how to extend the DOMDocument and DOMElement I found a way in the bug: http://bugs.php.net/bug.php?id=35104. The following code shows how:

<?php
class extDOMDocument extends DOMDocument {
    public function createElement($name, $value=null) {
        $orphan = new extDOMElement($name, $value); // new sub-class object
        $docFragment = $this->createDocumentFragment(); // lightweight container maintains "ownerDocument"
        $docFragment->appendChild($orphan); // attach
        $ret = $docFragment->removeChild($orphan); // remove
        return $ret; // ownerDocument set; won't be destroyed on method exit
    }
    // .. more class definition
}

class extDOMElement extends DOMElement {
    function __construct($name, $value='', $namespaceURI=null) {
        parent::__construct($name, $value, $namespaceURI);
    }
    // ... more class definition here
}

$doc = new extDOMDocument('test');
$el = $doc->createElement('tagname');
$el->setAttribute("attr", "val");
$doc->appendChild($el);

// append discards the DOMDocumentFragment and just adds its child nodes, but ownerDocument is maintained.
echo get_class($el)."<br/>";
echo get_class($doc->documentElement)."<br/>";
echo "<xmp>".$doc->saveXML()."</xmp>";
?>
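
An alternative worth knowing about is DOMDocument::registerNodeClass() (available since PHP 5.2), which tells the document to instantiate your subclass wherever it would normally create the base class. The sketch below is only an illustration: the class name `MyElement` and its `hello()` method are hypothetical and not taken from the note above.

```php
<?php
// Hypothetical subclass used only for this example.
class MyElement extends DOMElement {
    public function hello() {
        return 'Hello from <' . $this->nodeName . '>';
    }
}

$doc = new DOMDocument();
// From now on, elements created by this document are MyElement instances.
$doc->registerNodeClass('DOMElement', 'MyElement');

$el = $doc->createElement('tagname');
$el->setAttribute('attr', 'val');
$doc->appendChild($el);

echo get_class($el), "\n";
echo get_class($doc->documentElement), "\n";
echo $doc->documentElement->hello(), "\n";
```

With this approach there is no need to override createElement() or to go through a document fragment.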
up
0
emmanuellutula at hotmail dot com
6 years ago
If you want to use DOMDocument in your PHPUnit tests for a Symfony controller (for example, to test a form), you can do it like this:

namespace Tests\YourBundle\Controller;

use Symfony\Bundle\FrameworkBundle\Test\WebTestCase;
use YourBundle\Controller\TextController;

class DefaultControllerTest extends WebTestCase
{
public function testIndex()
{
$client = static::createClient(array(), array());

$crawler = $client->request('GET', '/text/add');
$this->assertTrue($crawler->filter("form")->count() > 0, "Text form exist !");

$form = $crawler->filter("form")->form();

$domDocument = new \DOMDocument;

$domInput = $domDocument->createElement('input');
$dom = $domDocument->appendChild($domInput);
$dom->setAttribute('slug', 'bloc');


$formInput = new \Symfony\Component\DomCrawler\Field\InputFormField($domInput);
$form->set($formInput);

$crawler = $client->submit($form);

if ($client->getResponse()->isRedirect())
{
$crawler = $client->followRedirect(false);
}

// $this->assertTrue($client->getResponse()->isSuccessful());
//$this->assertEquals(200, $client->getResponse()->getStatusCode(),
// "Unexpected HTTP status code for GET /backoffice/login");

}
}
up
0
Drupella
13 years ago
Here is a fast innerHTML function that returns the result without iterating over child nodes.

<?php
function innerHTML($el) {
$doc = new DOMDocument();
$doc->appendChild($doc->importNode($el, TRUE));
$html = trim($doc->saveHTML());
$tag = $el->nodeName;
return preg_replace('@^<' . $tag . '[^>]*>|</' . $tag . '>$@', '', $html);
}
?>

Example
<?php
$doc = new DOMDocument();
// A corrupt HTML string
$doc->loadHTML('<HTML><A HREF="ss">asd</A>');
$body = $doc->getElementsByTagName('body')->item(0);
print htmlspecialchars(innerHTML($body));
// Prints <a href="ss">asd</a>
?>
up
0
odessa131 at aol dot nospam dot com
15 years ago
I had the hardest time updating a complex XML document. Here's a quick example on how to do it.

<?php

// Load the XML from a file.
$xml = "a2062.xml"; // This is an XFDL form previously unencoded and ungzipped.
$dom = new DOMDocument();
$dom->preserveWhiteSpace = false;
$dom->load($xml);

// Create an XPath query.
// Note: you must define the namespace if the XML document has defined namespaces.
$xpath = new DOMXPath($dom);
$xpath->registerNamespace('xfdl', "http://www.PureEdge.com/XFDL/6.5");

// Locate the value for the first Item Description field.
$query = "//xfdl:page/xfdl:field[@sid='ITEMDESA']/xfdl:value";
$nodeList = $xpath->query($query);

$nodeList->item(0)->nodeValue = "This is the text in the value node of the first Item Description field inside the DA 2062 PureEdge form.";

$dom->save($xml);

?>

I hope this helps someone.
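
Here is a self-contained sketch of the same technique that runs without the form file. The inline XML is a made-up, heavily simplified stand-in for an XFDL document (only the namespace URI and the sid value come from the note above), so treat the structure as an assumption.

```php
<?php
// Assumed, simplified document using a default namespace.
$source = '<XFDL xmlns="http://www.PureEdge.com/XFDL/6.5">'
        . '<page><field sid="ITEMDESA"><value>old</value></field></page>'
        . '</XFDL>';

$dom = new DOMDocument();
$dom->loadXML($source);

// Elements in a default namespace are only reachable through a prefix
// that has been registered for that namespace URI.
$xpath = new DOMXPath($dom);
$xpath->registerNamespace('xfdl', 'http://www.PureEdge.com/XFDL/6.5');

$nodeList = $xpath->query("//xfdl:page/xfdl:field[@sid='ITEMDESA']/xfdl:value");

// Guard against an empty result before touching item(0).
if ($nodeList->length > 0) {
    $nodeList->item(0)->nodeValue = 'new description';
}

$result = $dom->saveXML();
echo $result;
```

Without the registerNamespace() call, the same query written without prefixes would match nothing, because the elements live in the default namespace.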
up
0
Junior
15 years ago
innerHTML in PHP DOM

<?php
function DOMinnerHTML($element)
{
    $innerHTML = "";
    $children = $element->childNodes;
    foreach ($children as $child)
    {
        $tmp_dom = new DOMDocument();
        $tmp_dom->appendChild($tmp_dom->importNode($child, true));
        $innerHTML.=trim($tmp_dom->saveHTML());
    }
    return $innerHTML;
}
?>

Example:

<?php
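$dom = new DOMDocument();
$dom->preserveWhiteSpace = false; // set before loading; it has no effect afterwards
$dom->loadHTML($html_string); // loadHTML() parses a string; load() expects a file name

$domTable = $dom->getElementsByTagName("table");

foreach ($domTable as $tables)
{
    echo DOMinnerHTML($tables);
}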
?>
up
0
fantasyman3000 at gmail dot com
15 years ago
In response to "simlee at indiana dot edu",
- First of all, thanks for sharing your function.
- It didn't work for me, so I rewrote it from scratch using a different method.

Here is the new version; I hope it helps someone:

<?php
/**
* result sample : /html[1]/body[1]/span[1]/fieldset[1]/div[1]
* @return string
*/
function getNodeXPath( $node ) {
    $result='';
    while ($parentNode = $node->parentNode) {
        $nodeIndex=-1;
        $nodeTagIndex=0;
        do {
            $nodeIndex++;
            $testNode = $parentNode->childNodes->item( $nodeIndex );

            if ($testNode->nodeName==$node->nodeName and $testNode->parentNode->isSameNode($node->parentNode) and $testNode->childNodes->length>0) {
                //echo "{$testNode->parentNode->nodeName}-{$testNode->nodeName}-{}<br/>";
                $nodeTagIndex++;
            }

        } while (!$node->isSameNode($testNode));

        $result="/{$node->nodeName}[{$nodeTagIndex}]".$result;
        $node=$parentNode;
    }
    return $result;
}
?>

By Sina.Salek.ws
up
-1
ben_demott at hotmail dot com
14 years ago
A function among several others to parse a google results page, I wrote this some time ago - google has probably changed their site since then, but I thought this might be helpful to someone.

I'm moving servers, but I will probably throw this up on my blog when I get it back up.

<?php

function googleResult($listItem) {
// given a LIST ITEM element, this will validate, and return an array for that LI entry as an inline result from google.
/*
* <li class='g w0'>
* <h3 class='r'>
* <a href='the URL' class='l'>
* Description <em>description</em>
* </a>
* </h3>
* </li>
*
UPDATE:
This function will now look for any subcontainer that has an href, it doesn't have to be an H3
this will make it work with a few more formatted search results.
*/

    $listItem = $listItem->childNodes;
    // Yes I don't use instanceof - I guess you'll have to deal.
    foreach($listItem as $element) {
        if(is_object($element) && get_class($element) == 'DOMElement' && $element->hasChildNodes()) {
            $hrefContainer = $element->childNodes;
            foreach($hrefContainer as $element2) {
                if(is_object($element2) && get_class($element2) == 'DOMElement' && $element2->nodeName == 'a' && $element2->hasAttribute('href')) {
                    $anchor = $element2;
                    unset($h3);
                    unset($element2);
                    break;
                } else {
                    //print __LINE__ ." :: Breaking out of loop (normal result) element is not an anchor Element='".$element2->nodeName."'\n";
                }
            }
            unset($element);
            unset($listItem);
            break;
        }
    }
    if(empty($anchor) || !is_object($anchor) || get_class($anchor) != 'DOMElement') {
        //print __LINE__ ." :: Returning false, did not locate anchor through iteration...";
        return false;
    }
    $href = $anchor->getAttribute('href');
    if(empty($href)) {
        //print __LINE__ ." :: Found anchor object, could not read href attribute / href is empty? href='$href'\n";
        return false;
    }
    // Build the description from the anchor's text, <em> and <b> children.
    $description = $anchor->childNodes;
    $urlDescription = '';
    foreach($description as $words) {
        $name = trim($words->nodeName);
        if($name == 'em' || $name == '#text' || $name == 'b') {
            if(!empty($words->nodeValue)) {
                $text = trim($words->nodeValue);
                $urlDescription = $urlDescription . $text . ' ';
            }
        }
    }
    $urlDescription = htmlspecialchars_decode($urlDescription, ENT_QUOTES);
    $urlDescription = trim($urlDescription);
    return array('description' => $urlDescription, 'href' => $href);
}
?>
up
-1
philipwaynerollins at gmail dot com
15 years ago
You can get an element's text content through nodeValue (this is the text only, without markup, so it matches "innerHTML" only when the element contains plain text):

<?php
$doc = new DOMDocument( );
$ele = $doc->createElement( 'p', 'Sensei Ninja' );
print $ele->nodeValue;
?>

You can even set it if you want

<?php
$doc = new DOMDocument( );
$ele = $doc->createElement( 'p' );
$ele->nodeValue = 'Sensei Ninja';
$doc->appendChild( $ele );
print $doc->saveHTML( );
?>
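
To make the limits of nodeValue concrete, here is a short sketch comparing it with serialising the child nodes through saveHTML() when the element contains nested markup. The sample paragraph is made up for the example.

```php
<?php
$doc = new DOMDocument();
$doc->loadHTML('<p>Sensei <b>Ninja</b></p>');
$p = $doc->getElementsByTagName('p')->item(0);

// nodeValue concatenates the text of all descendants and drops the tags.
$text = $p->nodeValue;

// Serialising each child keeps the markup, which is closer to innerHTML.
$inner = '';
foreach ($p->childNodes as $child) {
    $inner .= $doc->saveHTML($child);
}

echo $text, "\n";
echo $inner, "\n";
```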
up
-1
sean at lookin3d dot com
18 years ago
$xmlDoc=<<<XML
<?xml version="1.0"?>
<methodCall>
<methodName>examples.getStateName</methodName>
<params>
<param>
<value><i4>41</i4></value>
</param>
</params>
</methodCall>
XML;

$xml= new DOMDocument();
$xml->preserveWhiteSpace=false;
$xml->loadXML($xmlDoc);
print_r(xml2array($xml));

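function xml2array($n)
{
    $return=array();
    foreach($n->childNodes as $nc) {
        if ($nc->hasChildNodes()) {
            // When the first and last child share a name, treat the
            // children as a list and append each one.
            if ($n->firstChild->nodeName == $n->lastChild->nodeName && $n->childNodes->length > 1) {
                $return[$nc->nodeName][] = xml2array($nc);
            } else {
                $return[$nc->nodeName] = xml2array($nc);
            }
        } else {
            // Leaf node: return its value directly.
            $return = $nc->nodeValue;
        }
    }
    return $return;
}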
up
-1
nospam at ya dot ru
15 years ago
<?php
function dom2array_full($node){
    $result = array();
    if($node->nodeType == XML_TEXT_NODE) {
        $result = $node->nodeValue;
    }
    else {
        if($node->hasAttributes()) {
            $attributes = $node->attributes;
            if(!is_null($attributes)) {
                foreach ($attributes as $index=>$attr) {
                    $result[$attr->name] = $attr->value;
                }
            }
        }
        if($node->hasChildNodes()){
            $children = $node->childNodes;
            for($i=0;$i<$children->length;$i++) {
                $child = $children->item($i);
                if($child->nodeName != '#text') {
                    if(!isset($result[$child->nodeName])) {
                        // Recurse with this function itself.
                        $result[$child->nodeName] = dom2array_full($child);
                    }
                    else {
                        $aux = $result[$child->nodeName];
                        $result[$child->nodeName] = array( $aux );
                        $result[$child->nodeName][] = dom2array_full($child);
                    }
                }
            }
        }
    }
    return $result;
}
?>
up
-1
danf dot 1979 at []gmail[] dot com
16 years ago
Here are a couple of classes for working with the Yahoo YUI menu.

/*
$menubar = new MenuBar();

$file = new Menu("File");
$file->setAttribute("href", "http://file.com");

$quit = new Menu("Quit");
$quit->setAttribute("href", "http://quit.com");

$file->appendChild($quit);
$menubar->appendChild($file);

echo $menubar->grab();
*/

//
// Author: Daniel Queirolo.
// LGPL
//

/** ---------------------------------
/** Class MenuBar()
/** Creates the menubar and appends
/** yuimenubaritems to it.
/** ---------------------------------*/

class MenuBar extends DOMDocument
{

public $menuID = "nav_menu"; // holds the css id that javascript yui menu code should have to recognize
private $UL; // This node holds every menu, This is THE node.

/** ---------------------------------
/** Constructor
/** Generates a menubar skeleton and the UL node
/** ---------------------------------*/

public function __construct() {

parent::__construct();

$rootdiv = parent::createElement("div");
$rootdiv->setAttribute("class", "yui-skin-sam");

parent::appendChild($rootdiv);

$yui_menubar = parent::createElement("div");
$yui_menubar->setAttribute("id", $this->menuID);
$yui_menubar->setAttribute("class", "yuimenubar");

$rootdiv->appendChild($yui_menubar);

$bd = parent::createElement("div");
$bd->setAttribute("class", "bd");

$yui_menubar->appendChild($bd);

$ul = parent::createElement("ul");
$ul->setAttribute("class", "first-of-type");

// ALL Menu() instances occur inside an <ul> tag.

$this->UL = $bd->appendChild($ul);

}

/** ---------------------------------
/** appendChild()
/** Appends a new yuimenubaritem to the menubar UL node.
/** This function changes <li> and <a> classes to yuiMENUBARsomething
/** ---------------------------------*/

public function appendChild($child) {

$li = parent::importNode($child->LI, true);

$li->setAttribute("class", "yuimenubaritem");

$li->getElementsByTagName("a")->item(0)->setAttribute("class", "yuimenubaritemlabel");

$this->UL->appendChild($li);

}

public function grab() {

return parent::saveHTML();

}

}

/** ---------------------------------
/** Class Menu()
/** Creates a yuimenuitem li node
/** ---------------------------------*/

class Menu extends DOMDocument {

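public $LI; // stores the <li> node (THE link) that will be exported to MenuBar() or used on appendChild()
private $li; // the same <li> node, used by setAttribute()
private $a; // the <a> label node, used by setAttribute()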

/** ---------------------------------
/** Constructor
/** Generates a yuimenuitem li node
/** No yuimenubar items are created here. MenuBar handles that.
/** ---------------------------------*/

public function __construct($link_name) {

parent::__construct();

$li = parent::createElement("li");
$li->setAttribute("class", "yuimenuitem");

// LI node stores THE link.
// if appendChild is used, the new (sub) Menu() would be LI node child.

$this->LI = parent::appendChild($li);

$a = parent::createElement("a", $link_name);
$a->setAttribute("class", "yuimenuitemlabel");

$li->appendChild($a);

$this->li = $li;
$this->a = $a;

}

/** ---------------------------------
/** appendChild
/** Appends a (sub) Menu() to current Menu() in LI
/** ---------------------------------*/

public function appendChild($child) {

$yuimenu = parent::createElement("div");
$yuimenu->setAttribute("class", "yuimenu");

$this->LI->appendChild($yuimenu);

$bd = parent::createElement("div");
$bd->setAttribute("class", "bd");

$yuimenu->appendChild($bd);

$ul = parent::createElement("ul");

$bd->appendChild($ul);

// child->NODE holds THE link from the new child (from child's __construct())

$ul->appendChild(parent::importNode($child->LI, true));

}

public function setAttribute($name, $value, $node="a") {

if ($node == "a") {
$this->a->setAttribute($name, $value);
}

else {
$this->li->setAttribute($name, $value);
}
}

}