PHP - Manual: DOMDocument::saveHTML

2025-12-06

DOMDocument::saveHTML

(PHP 5, PHP 7, PHP 8)

DOMDocument::saveHTML — Dumps the internal document into a string using HTML formatting

Description

public DOMDocument::saveHTML(?DOMNode $node = null): string|false

Creates an HTML document from the DOM representation. This function is usually called after building a new DOM document from scratch, as in the example below.

Parameters

node: Optional parameter to output a subset of the document.
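
As a minimal sketch of the node parameter (the markup and variable names are illustrative assumptions, not part of the manual), passing a node serializes only that node and its descendants rather than the whole document:

```php
<?php
// Illustrative markup; any loaded or hand-built document would do.
$doc = new DOMDocument();
$doc->loadHTML('<html><body><div><p>Hello</p></div><p>Other</p></body></html>');

// Pick the first <div> and serialize only that subtree.
$div = $doc->getElementsByTagName('div')->item(0);
$fragment = $doc->saveHTML($div);

// Only the <div> subtree is printed; the sibling paragraph is left out.
echo $fragment;
```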

Return Values

Returns the HTML, or false if an error occurred.

Examples

Example #1 Saving an HTML tree into a string

<?php

$doc = new DOMDocument('1.0');

$root = $doc->createElement('html');
$root = $doc->appendChild($root);

$head = $doc->createElement('head');
$head = $root->appendChild($head);

$title = $doc->createElement('title');
$title = $head->appendChild($title);

$text = $doc->createTextNode('This is the title');
$text = $title->appendChild($text);

echo $doc->saveHTML();

?>

See Also

DOMDocument::saveHTMLFile() - Dumps the internal document into a file using HTML formatting
DOMDocument::loadHTML() - Load HTML from a string
DOMDocument::loadHTMLFile() - Load HTML from a file


User Contributed Notes (12 notes)

tomas dot strejcek at ghn dot cz ¶

8 years ago

As of PHP 5.4 with Libxml 2.7.8 or later, there is a simpler approach.

When you load the HTML like this:

$html->loadHTML($content, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

the output will contain no doctype, html or body tags.
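
A small sketch of these two flags in use, assuming a reasonably recent libxml build and an illustrative fragment that has a single root element (`$content` and `$output` are names chosen for this example):

```php
<?php
// Illustrative fragment with exactly one root element.
$content = '<div><p>Hello</p></div>';

$html = new DOMDocument();
// NOIMPLIED: do not add implied <html>/<body>; NODEFDTD: do not add a default doctype.
$html->loadHTML($content, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

$output = $html->saveHTML();
echo $output;
```

With the flags omitted, the same input would normally come back wrapped in a doctype, html and body.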

sasha @ goldnet dot ca ¶

7 years ago

When you save an HTML fragment that was loaded with the LIBXML_HTML_NOIMPLIED option, it can end up "broken", because libxml requires a root element. libxml tries to fix the fragment by treating the first tag it encounters as the root and moving that tag's closing tag to the end of the string.

For example:

<h1>Foo</h1><p>bar</p>

will end up as:

<h1>Foo<p>bar</p></h1>

The easiest workaround is to add a root tag yourself and strip it afterwards:

$html->loadHTML('<html>' . $content .'</html>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

$content = str_replace(array('<html>','</html>') , '' , $html->saveHTML());
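
A variation on this workaround, sketched under the assumption that a temporary wrapper element is acceptable: instead of stripping the wrapper with str_replace(), serialize each child of the wrapper individually. The helper name `saveFragment` and the choice of `<div>` as wrapper are hypothetical and not from the note above.

```php
<?php
// Hypothetical helper: parse a fragment inside a temporary <div> root,
// then serialize only the children of that root.
function saveFragment(string $content): string
{
    $doc = new DOMDocument();
    $doc->loadHTML(
        '<div>' . $content . '</div>',
        LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
    );

    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        // saveHTML($node) dumps a single node and its descendants.
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$result = saveFragment('<h1>Foo</h1><p>bar</p>');
echo $result;
```

This avoids accidentally removing a literal `<html>` string that happens to occur inside the content itself. Depending on the libxml version, line breaks may appear between block-level elements in the result.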

contact at cathexis dot de ¶

8 years ago

If you load HTML from a string, ensure the charset is set.

<?php
...
$html_src = '<html><head><meta content="text/html; charset=utf-8" http-equiv="Content-Type"></head><body>';
$html_src .= '...';
...
?> 

Otherwise the charset will be ISO-8859-1!
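
A rough way to check the effect, assuming the input string is UTF-8 (the sample text and the variable names `$text` and `$roundTripped` are illustrative): load a non-ASCII string with the meta tag in place and compare what the DOM hands back.

```php
<?php
// "café" written with an escape so the example does not depend on the file's encoding.
$text = "caf\u{00E9}";

$html_src = '<html><head><meta content="text/html; charset=utf-8" http-equiv="Content-Type"></head><body>';
$html_src .= '<p>' . $text . '</p></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html_src);

// DOM strings are exposed as UTF-8, so the text should survive the round trip.
$roundTripped = $doc->getElementsByTagName('p')->item(0)->textContent;
var_dump($roundTripped === $text);
```

Removing the meta tag from `$html_src` is an easy way to see the mis-decoding described in this note.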

Anonymous ¶

9 years ago

To solve the script tag problem just add an empty text node to the script node and DOMDocument will render <script src="your.js"></script> nicely.

Anonymous ¶

15 years ago

If you want a simpler way to get around the <script> tag problem try:

<?php

$script = $doc->createElement('script');
// Creating an empty text node forces <script></script>
$script->appendChild($doc->createTextNode(''));
$head->appendChild($script);

?>
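
A self-contained sketch of the same trick, assuming the "script tag problem" refers to XML serialization, where an element without children is written in self-closing form. The file name `your.js` is taken from the earlier note; everything else is illustrative.

```php
<?php
$doc = new DOMDocument();

$script = $doc->createElement('script');
$script->setAttribute('src', 'your.js');
// The empty text node gives the element a child, so the serializer
// writes an explicit closing tag instead of a self-closing one.
$script->appendChild($doc->createTextNode(''));
$doc->appendChild($script);

$xml = $doc->saveXML($script);
echo $xml;
```

Passing the LIBXML_NOEMPTYTAG option to saveXML() is another way to get expanded empty tags, though it applies to every empty element in the output rather than just this one.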

tyson at clugg dot net ¶

19 years ago

<?php
// Using DOM to fix sloppy HTML.
// An example by Tyson Clugg <tyson@clugg.net>
//
// vim: syntax=php expandtab tabstop=2

function tidyHTML($buffer)
{
  // load our document into a DOM object; loadHTML() must be called on an
  // instance (calling it statically is an error as of PHP 8)
  $dom = new DOMDocument();
  @$dom->loadHTML($buffer);
  // we want nice output
  $dom->formatOutput = true;
  return $dom->saveHTML();
}

// start output buffering, using our nice
// callback function to format the output.
ob_start("tidyHTML");

?>
<html>
<p>It's like comparing apples to oranges.
</html>
<?php

// this will be called implicitly, but we'll
// call it manually to illustrate the point.
ob_end_flush();

?>

The above code takes in sloppy HTML:
 <html>
 <p>It's like comparing apples to oranges.
 </html>

And cleans it up to the following:
 <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
 <html><body><p>It's like comparing apples to oranges.
 </p></body></html>
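
The example above silences parser warnings with the @ operator. As an alternative sketch, libxml's internal error handling can collect them instead; the function name `tidyHtmlWithWarnings` and its return shape are assumptions made for this illustration.

```php
<?php
// Hypothetical variant: returns the cleaned HTML together with any
// parser warnings, instead of hiding them with @.
function tidyHtmlWithWarnings(string $buffer): array
{
    // Collect libxml errors internally rather than emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($buffer);
    $dom->formatOutput = true;

    $warnings = libxml_get_errors();
    libxml_clear_errors();
    // Restore whatever error mode was active before.
    libxml_use_internal_errors($previous);

    return [$dom->saveHTML(), $warnings];
}

[$tidied, $warnings] = tidyHtmlWithWarnings("<html>\n<p>It's like comparing apples to oranges.\n</html>");
echo $tidied;
```

Each entry in `$warnings` is a LibXMLError object, so its message and line properties can be logged if needed.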

jeboy ¶

7 years ago

LIBXML_HTML_NOIMPLIED doesn't work on PHP 7.1.9 with libxml2-2.7.8

Anonymous ¶

15 years ago

To prevent script tags from being output as <script />, you can use the DOMDocumentFragment class:

<?php

$doc = new DOMDocument();
$doc->loadXML($xmlstring);
$fragment = $doc->createDocumentFragment();
/* Append the script element to the fragment using a raw XML string (which is preserved in its raw form) and, if successful, insert it into the DOM tree */
if ($fragment->appendXML("<script type='text/javascript' src='$source'></script>")) {
  $xpath = new DOMXPath($doc);
  /* namespace-safe way to find all head elements that are children of the html element; should return only 1 match */
  $resultlist = $xpath->query("//*[local-name() = 'html']/*[local-name() = 'head']");
  foreach ($resultlist as $headnode) { // insert the script tag
    $headnode->appendChild($fragment);
  }
}
$doc->saveXML(); /* and our script tags will still be <script></script> */

?>

archanglmr at yahoo dot com ¶

17 years ago

If you created your DOMDocument object using loadHTML() (where the source is from another site) and want to pass your changes back to the browser, make sure the HTTP Content-Type header matches the value of your meta content-type tag, because modern browsers seem to ignore the meta tag and trust only the HTTP header. For example, if you're reading an ISO-8859-1 document and your web server is claiming UTF-8, you need to correct it using the header() function.

<?php
header('Content-Type: text/html; charset=iso-8859-1');
?>

xoplqox ¶

17 years ago

XHTML:

If the output is XHTML, use the function saveXML().

Output example for saveHTML:

<select name="pet" size="3" multiple>
    <option selected>mouse</option>
    <option>bird</option>
    <option>cat</option>
</select>

XHTML-conformant output using saveXML:

<select name="pet" size="3" multiple="multiple">
    <option selected="selected">mouse</option>
    <option>bird</option>
    <option>cat</option>
</select>
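
The difference can be reproduced with a short sketch (the markup is a trimmed version of the example above; `$asHtml` and `$asXml` are illustrative names):

```php
<?php
$doc = new DOMDocument();
$doc->loadHTML('<select name="pet" multiple><option selected>mouse</option></select>');

$select = $doc->getElementsByTagName('select')->item(0);

// HTML serialization keeps boolean attributes in minimized form.
$asHtml = $doc->saveHTML($select);
// XML serialization writes every attribute with an explicit value.
$asXml = $doc->saveXML($select);

echo $asHtml, "\n", $asXml, "\n";
```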

Anonymous ¶

16 years ago

<?php

function getDOMString($retNode) {
  if (!$retNode) return null;

  // Serialize as XML, then collapse void elements and rewrite named entities.
  $retval = strtr($retNode->ownerDocument->saveXML($retNode),
    array(
      '></area>' => ' />',
      '></base>' => ' />',
      '></basefont>' => ' />',
      '></br>' => ' />',
      '></col>' => ' />',
      '></frame>' => ' />',
      '></hr>' => ' />',
      '></img>' => ' />',
      '></input>' => ' />',
      '></isindex>' => ' />',
      '></link>' => ' />',
      '></meta>' => ' />',
      '></param>' => ' />',
      'default:' => '',
      // sometimes, you have to decode entities too...
      '&quot;' => '&#34;',
      '&amp;' => '&#38;',
      '&apos;' => '&#39;',
      '&lt;' => '&#60;',
      '&gt;' => '&#62;',
      '&nbsp;' => '&#160;',
      '&copy;' => '&#169;',
      '&laquo;' => '&#171;',
      '&reg;' => '&#174;',
      '&raquo;' => '&#187;',
      '&trade;' => '&#8482;'
    ));

  return $retval;
}

?>

qrworld.net ¶

10 years ago

In this post http://softontherocks.blogspot.com/2014/11/descargar-el-contenido-de-una-url_11.html I found a simple way to get the content of a URL with DOMDocument, loadHTMLFile and saveHTML().

function getURLContent($url){
    $doc = new DOMDocument;
    $doc->preserveWhiteSpace = FALSE;
    @$doc->loadHTMLFile($url);
    return $doc->saveHTML();
}

Official page: https://www.php.net/manual/en/domdocument.savehtml.php
