DOMDocument::saveHTMLFile

(PHP 5, PHP 7, PHP 8)

DOMDocument::saveHTMLFile Dumps the internal document into a file using HTML formatting

说明

public DOMDocument::saveHTMLFile(string $filename): int|false

Creates an HTML document from the DOM representation. This function is usually called after building a new dom document from scratch as in the example below.

参数

filename

The path to the saved HTML document.

返回值

Returns the number of bytes written or false if an error occurred.

示例

示例 #1 Saving a HTML tree into a file

<?php

$doc
= new DOMDocument('1.0');
// we want a nice output
$doc->formatOutput = true;

$root = $doc->createElement('html');
$root = $doc->appendChild($root);

$head = $doc->createElement('head');
$head = $root->appendChild($head);

$title = $doc->createElement('title');
$title = $head->appendChild($title);

$text = $doc->createTextNode('This is the title');
$text = $title->appendChild($text);

echo
'Wrote: ' . $doc->saveHTMLFile("/tmp/test.html") . ' bytes'; // Wrote: 129 bytes

?>

参见

添加备注

用户贡献的备注 3 notes

up
6
RiKdnUA at mail dot ru
11 years ago
saveHTMLFile() always saves the file in UTF-8. Even if the DOMDocument->encoding explicitly prescribe different from UTF-8 encoding. All "non-Latin" characters will be converted to HTML-entities. Tested in PHP 5.2.9-2 and PHP 5.2.17. Example:

<?php
$document
=new domDocument('1.0', 'WINDOWS-1251');
$document->loadHTML('<html><head><title>Russian language</title></head><body>Русский язык</body></html>');
$document->formatOutput=true;
$document->encoding='WINDOWS-1251';
echo
"Записано байт. Recorded bytes: ".$document->saveHTMLFile('html.html');
?>

Method recorded file in UTF-8 encoding. The contents of the file html.html:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Russian language</title>
</head>
<body>Ðóññêèé ÿçûê</body>
</html>
up
-1
naebeth at hotmail dot NOSPAM dot com
12 years ago
Not mentioned in the documentation is the fact that using DOMDocument::saveHTMLFile() will automatically overwrite the contents if an existing file is used - with no notice, warning or error thrown.

Make sure you check the filename before using this function so that you don't accidentally overwrite important files.

Example:
<?php

$file
= fopen('test.html', 'w');
fwrite($file, 'this is some text');
fclose($file);

$doc = new DOMDocument();
$doc->formatOutput = true;
$doc->loadHTML('<html><head><title>Test</title></head><body></body></html>');
$doc->saveHTMLFile('test.html');

// test.html
/*
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Test</title>
</head>
<body></body>
</html>
*/

?>

If you're dynamically generating a series of pages using DOMDocument objects, make sure you are also dynamically generating the file or directory names using something that can't easily be confused for an existing file/folder, or check if the desired path already exists before saving so that you don't accidentally delete previous files.
up
-1
deep42thouSPAMght42 at y_a_h_o_o dot com
13 years ago
I foolishly assumed that this function was equivalent to
<?php
file_put_contents
($filename, $document->saveHTML());
?>
but there are differences in the generated HTML:
<?php
$doc
= new DOMDocument();
$doc->loadHTML(
'<html><head><title>Test</title></head><body></body></html>'
);
$doc->encoding = 'iso-8859-1';

echo
$doc->saveHTML();
#<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
#<html>
#<head><title>Test</title></head>
#<body></body>
#</html>

$doc->saveHTMLFile('output.html');
#<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
#<html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><title>Test</title></head><body></body></html>

?>
Note that saveHTMLFile() adds a UTF-8 meta tag despite the ISO-8859-1 document encoding.
To Top