quoted_printable_decode

(PHP 4, PHP 5, PHP 7, PHP 8)

quoted_printable_decodeConvert a quoted-printable string to an 8 bit string

Description

quoted_printable_decode(string $string): string

This function returns an 8-bit binary string corresponding to the decoded quoted printable string (according to » RFC2045, section 6.7, not » RFC2821, section 4.5.2, so additional periods are not stripped from the beginning of line).

This function is similar to imap_qprint(), except this one does not require the IMAP module to work.

Parameters

string

The input string.

Return Values

Returns the 8-bit binary string.

Examples

Example #1 quoted_printable_decode() example

<?php

$encoded
= quoted_printable_encode('Möchten Sie ein paar Äpfel?');

var_dump($encoded);
var_dump(quoted_printable_decode($encoded));
?>

The above example will output:

string(37) "M=C3=B6chten Sie ein paar =C3=84pfel?"
string(29) "Möchten Sie ein paar Äpfel?"

See Also

add a note

User Contributed Notes 21 notes

up
9
jab_creations at yahoo dot com
3 years ago
If you're getting black diamonds or weird characters that seemingly block an echo but still encounter strlen($string) > 0 you're probably encountering an encoding issue. Unlike the people writing ENCODE functions on a DECODE page I will actually talk about DECODE on a DECODE page.

The specific problem I encountered was that an email was encoded using a Russian encoding (KOI8-R) though I output everything as UTF-8 because: compatibility.

If you try to do this with a Russian encoding:

<?php
echo quoted_printable_decode('=81');
?>

You'll get that corrupted data.

I did a couple of tests and it turns out the following is how you nest the mb_convert_encoding function:

<?php
echo '<p>Test: "'.mb_convert_encoding(quoted_printable_decode('=81'), 'UTF-8', 'KOI8-R').'".</p>';
?>

Unfortunately I could not find a character mapping table or anything listed under RFC 2045 Section 6.7. However I came across the website https://dencode.com/en/string/quoted-printable which allows you to manually choose the encoding (it's an open source site, they have a GIT repository for the morbidly curious).

As it turns out the start of the range is relative to the encoding. So Latin (ISO-8859-1) and Russian (KOI8-R) will likely (not tested this) encode to different characters **relative to the string encoding**.
up
3
Karora
16 years ago
Taking a bunch of the earlier comments together, you can synthesize a nice short and reasonably efficient quoted_printable_encode function like this:

Note that I put this in my standard library file, so I wrap it in a !function_exists in order that if there is a pre-existing PHP one it will just work and this will evaluate to a noop.

<?php
if ( !function_exists("quoted_printable_encode") ) {
/**
* Process a string to fit the requirements of RFC2045 section 6.7. Note that
* this works, but replaces more characters than the minimum set. For readability
* the spaces aren't encoded as =20 though.
*/
function quoted_printable_encode($string) {
return
preg_replace('/[^\r\n]{73}[^=\r\n]{2}/', "$0=\r\n", str_replace("%","=",str_replace("%20"," ",rawurlencode($string))));
}
}
?>

Regards,
Andrew McMillan.
up
4
madmax at express dot ru
24 years ago
Some browser (netscape, for example)
send 8-bit quoted printable text like this:
=C5=DD=A3=D2=C1= =DA

"= =" means continuos word.
php function not detect this situations and translate in string like:
abcde=f
up
2
soletan at toxa dot de
17 years ago
Be warned! The method below for encoding text does not work as requested by RFC1521!

Consider a line consisting of 75 'A' and a single é (or similar non-ASCII character) ... the method below would encode and return a line of 78 octets, breaking with RFC 1521, 5.1 Rule #5: "The Quoted-Printable encoding REQUIRES that encoded lines be no more than 76 characters long."

Good QP-encoding takes a bit more than this.
up
2
zg
16 years ago
<?php

function quoted_printable_encode( $str, $chunkLen = 72 )
{
$offset = 0;

$str = strtr(rawurlencode($str), array('%' => '='));
$len = strlen($str);
$enc = '';

while (
$offset < $len )
{
if (
$str{ $offset + $chunkLen - 1 } === '=' )
{
$line = substr($str, $offset, $chunkLen - 1);
$offset += $chunkLen - 1;
}
elseif (
$str{ $offset + $chunkLen - 2 } === '=' )
{
$line = substr($str, $offset, $chunkLen - 2);
$offset += $chunkLen - 2;
}
else
{
$line = substr($str, $offset, $chunkLen);
$offset += $chunkLen;
}

if (
$offset + $chunkLen < $len )
$enc .= $line ."=\n";
else
$enc .= $line;
}

return
$enc;
}

?>
up
2
pob at medienrecht dot NOSPAM dot org
23 years ago
If you do not have access to imap_* and do not want to use
?$message = chunk_split( base64_encode($message) );?
because you want to be able to read the ?source? of your mails, you might want to try this:
(any suggestions very welcome!)


function qp_enc($input = "quoted-printable encoding test string", $line_max = 76) {

$hex = array('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F');
$lines = preg_split("/(?:\r\n|\r|\n)/", $input);
$eol = "\r\n";
$escape = "=";
$output = "";

while( list(, $line) = each($lines) ) {
//$line = rtrim($line); // remove trailing white space -> no =20\r\n necessary
$linlen = strlen($line);
$newline = "";
for($i = 0; $i < $linlen; $i++) {
$c = substr($line, $i, 1);
$dec = ord($c);
if ( ($dec == 32) && ($i == ($linlen - 1)) ) { // convert space at eol only
$c = "=20";
} elseif ( ($dec == 61) || ($dec < 32 ) || ($dec > 126) ) { // always encode "\t", which is *not* required
$h2 = floor($dec/16); $h1 = floor($dec%16);
$c = $escape.$hex["$h2"].$hex["$h1"];
}
if ( (strlen($newline) + strlen($c)) >= $line_max ) { // CRLF is not counted
$output .= $newline.$escape.$eol; // soft line break; " =\r\n" is okay
$newline = "";
}
$newline .= $c;
} // end of for
$output .= $newline.$eol;
}
return trim($output);

}

$eight_bit = "\xA7 \xC4 \xD6 \xDC \xE4 \xF6 \xFC \xDF = xxx yyy zzz \r\n"
." \xA7 \r \xC4 \n \xD6 \x09 ";
print $eight_bit."\r\n---------------\r\n";
$encoded = qp_enc($eight_bit);
print $encoded;
up
2
legolas558
17 years ago
Please note that in the below encode function there is a bug!

<?php
if (($c==0x3d) || ($c>=0x80) || ($c<0x20))
?>

$c should be checked against less or equal to encode spaces!

so the correct code is

<?php
if (($c==0x3d) || ($c>=0x80) || ($c<=0x20))
?>

Fix the code or post this note, please
up
1
Christian Albrecht
16 years ago
In Addition to david lionhead's function:

<?php
function quoted_printable_encode($txt) {
/* Make sure there are no %20 or similar */
$txt = rawurldecode($txt);
$tmp="";
$line="";
for (
$i=0;$i<strlen($txt);$i++) {
if ((
$txt[$i]>='a' && $txt[$i]<='z') || ($txt[$i]>='A' && $txt[$i]<='Z') || ($txt[$i]>='0' && $txt[$i]<='9')) {
$line.=$txt[$i];
if (
strlen($line)>=75) {
$tmp.="$line=\n";
$line="";
}
}
else {
/* Important to differentiate this case from the above */
if (strlen($line)>=72) {
$tmp.="$line=\n";
$line="";
}
$line.="=".sprintf("%02X",ord($txt[$i]));
}
}
$tmp.="$line\n";
return
$tmp;
}
?>
up
1
vita dot plachy at seznam dot cz
16 years ago
<?php
$text
= <<<EOF
This function enables you to convert text to a quoted-printable string as well as to create encoded-words used in email headers (see http://www.faqs.org/rfcs/rfc2047.html).

No line of returned text will be longer than specified. Encoded-words will not contain a newline character. Special characters are removed.
EOF;

define('QP_LINE_LENGTH', 75);
define('QP_LINE_SEPARATOR', "\r\n");

function
quoted_printable_encode($string, $encodedWord = false)
{
if(!
preg_match('//u', $string)) {
throw new
Exception('Input string is not valid UTF-8');
}

static
$wordStart = '=?UTF-8?Q?';
static
$wordEnd = '?=';
static
$endl = QP_LINE_SEPARATOR;

$lineLength = $encodedWord
? QP_LINE_LENGTH - strlen($wordStart) - strlen($wordEnd)
:
QP_LINE_LENGTH;

$string = $encodedWord
? preg_replace('~[\r\n]+~', ' ', $string) // we need encoded word to be single line
: preg_replace('~\r\n?~', "\n", $string); // normalize line endings
$string = preg_replace('~[\x00-\x08\x0B-\x1F]+~', '', $string); // remove control characters

$output = $encodedWord ? $wordStart : '';
$charsLeft = $lineLength;

$chr = isset($string{0}) ? $string{0} : null;
$ord = ord($chr);

for (
$i = 0; isset($chr); $i++) {
$nextChr = isset($string{$i + 1}) ? $string{$i + 1} : null;
$nextOrd = ord($nextChr);

if (
$ord > 127 or // high byte value
$ord === 95 or // underscore "_"
$ord === 63 && $encodedWord or // "?" in encoded word
$ord === 61 or // equal sign "="
// space or tab in encoded word or at line end
$ord === 32 || $ord === 9 and $encodedWord || !isset($nextOrd) || $nextOrd === 10
) {
$chr = sprintf('=%02X', $ord);
}

if (
$ord === 10) { // line feed
$output .= $endl;
$charsLeft = $lineLength;
} elseif (
strlen($chr) < $charsLeft or
strlen($chr) === $charsLeft and $nextOrd === 10 || $encodedWord
) { // add character
$output .= $chr;
$charsLeft-=strlen($chr);
} elseif (isset(
$nextOrd)) { // another line needed
$output .= $encodedWord
? $wordEnd . $endl . "\t" . $wordStart . $chr
: '=' . $endl . $chr;
$charsLeft = $lineLength - strlen($chr);
}

$chr = $nextChr;
$ord = $nextOrd;
}

return
$output . ($encodedWord ? $wordEnd : '');
}

echo
quoted_printable_encode($text/*, true*/);
up
2
andre at luyer dot nl
16 years ago
A small update for Andrew's code below. This one leaves the original CRLF pairs intact (and allowing the preg_replace to work as intended):

<?php
if (!function_exists("quoted_printable_encode")) {
/**
* Process a string to fit the requirements of RFC2045 section 6.7. Note that
* this works, but replaces more characters than the minimum set. For readability
* the spaces and CRLF pairs aren't encoded though.
*/
function quoted_printable_encode($string) {
return
preg_replace('/[^\r\n]{73}[^=\r\n]{2}/', "$0=\r\n",
str_replace("%", "=", str_replace("%0D%0A", "\r\n",
str_replace("%20"," ",rawurlencode($string)))));
}
}
?>

Regards, André
up
1
Thomas Pequet / Memotoo.com
18 years ago
If you want a function to do the reverse of "quoted_printable_decode()", follow the link you will find the "quoted_printable_encode()" function:
http://www.memotoo.com/softs/public/PHP/quoted printable_encode.inc.php

Compatible "ENCODING=QUOTED-PRINTABLE"
Example:
quoted_printable_encode(ut8_encode("c'est quand l'été ?"))
-> "c'est quand l'=C3=A9t=C3=A9 ?"
up
2
sven at e7o dot de
4 years ago
If you're really lazy and producing HTML anyways and the end, just convert it to HTML entities and move the Unicode/ISO struggling to the document's encoding:

<?php
function qpd($e)
{
return
preg_replace_callback(
'/=([a-z0-9]{2})/i',
function (
$m) {
return
'&#x00' . $m[1] . ';';
},
$e
);
}
?>
up
1
yual at inbox dot ru
11 years ago
I use a hack for this bug:

$str = str_replace("=\r\n", '', quoted_printable_encode($str));
if (strlen($str) > 73) {$str = substr($str,0,74)."=\n".substr($str,74);}
up
1
steffen dot weber at computerbase dot de
19 years ago
As the two digit hexadecimal representation SHOULD be in uppercase you want to use "=%02X" (uppercase X) instead of "=%02x" as the first argument to sprintf().
up
1
feedr
15 years ago
Another (improved) version of quoted_printable_encode(). Please note the order of the array elements in str_replace().
I've just rewritten the previous function for better readability.

<?php
if (!function_exists("quoted_printable_encode")) {
/**
* Process a string to fit the requirements of RFC2045 section 6.7. Note that
* this works, but replaces more characters than the minimum set. For readability
* the spaces and CRLF pairs aren't encoded though.
*/
function quoted_printable_encode($string) {
$string = str_replace(array('%20', '%0D%0A', '%'), array(' ', "\r\n", '='), rawurlencode($string));
$string = preg_replace('/[^\r\n]{73}[^=\r\n]{2}/', "$0=\r\n", $string);

return
$string;
}
}
?>
up
1
david at lionhead dot nl
16 years ago
My version of quoted_printable encode, as the convert.quoted-printable-encode filter breaks on outlook express. This one seems to work on express/outlook/thunderbird/gmail.

function quoted_printable_encode($txt) {
$tmp="";
$line="";
for ($i=0;$i<strlen($txt);$i++) {
if (($txt[$i]>='a' && $txt[$i]<='z') || ($txt[$i]>='A' && $txt[$i]<='Z') || ($txt[$i]>='0' && $txt[$i]<='9'))
$line.=$txt[$i];
else
$line.="=".sprintf("%02X",ord($txt[$i]));
if (strlen($line)>=75) {
$tmp.="$line=\n";
$line="";
}
}
$tmp.="$line\n";
return $tmp;
}
up
1
ludwig at gramberg-webdesign dot de
17 years ago
my approach for quoted printable encode using the stream converting abilities

<?php
/**
* @param string $str
* @return string
* */
function quoted_printable_encode($str) {
$fp = fopen('php://temp', 'w+');
stream_filter_append($fp, 'convert.quoted-printable-encode');
fwrite($fp, $str);
fseek($fp, 0);
$result = '';
while(!
feof($fp))
$result .= fread($fp, 1024);
fclose($fp);
return
$result;
}
?>
up
1
roelof
17 years ago
I modified the below version of legolas558 at users dot sausafe dot net and added a wrapping option.

<?php
/**
* Codeer een String naar zogenaamde 'quoted printable'. Dit type van coderen wordt
* gebruikt om de content van 8 bit e-mail berichten als 7 bits te versturen.
*
* @access public
* @param string $str De String die we coderen
* @param bool $wrap Voeg linebreaks toe na 74 tekens?
* @return string
*/

function quoted_printable_encode($str, $wrap=true)
{
$return = '';
$iL = strlen($str);
for(
$i=0; $i<$iL; $i++)
{
$char = $str[$i];
if(
ctype_print($char) && !ctype_punct($char)) $return .= $char;
else
$return .= sprintf('=%02X', ord($char));
}
return (
$wrap === true)
?
wordwrap($return, 74, " =\n")
:
$return;
}

?>
up
1
legolas558 at users dot sausafe dot net
17 years ago
As soletan at toxa dot de reported, that function is very bad and does not provide valid enquoted printable strings. While using it I saw spam agents marking the emails as QP_EXCESS and sometimes the email client did not recognize the header at all; I really lost time :(. This is the new version (we use it in the Drake CMS core) that works seamlessly:

<?php

//L: note $encoding that is uppercase
//L: also your PHP installation must have ctype_alpha, otherwise write it yourself
function quoted_printable_encode($string, $encoding='UTF-8') {
// use this function with headers, not with the email body as it misses word wrapping
$len = strlen($string);
$result = '';
$enc = false;
for(
$i=0;$i<$len;++$i) {
$c = $string[$i];
if (
ctype_alpha($c))
$result.=$c;
else if (
$c==' ') {
$result.='_';
$enc = true;
} else {
$result.=sprintf("=%02X", ord($c));
$enc = true;
}
}
//L: so spam agents won't mark your email with QP_EXCESS
if (!$enc) return $string;
return
'=?'.$encoding.'?q?'.$result.'?=';
}

I hope it helps ;)

?>
up
-1
dmitry at koterov dot ru
19 years ago
Previous comment has a bug: encoding of short test does not work because of incorrect usage of preg_match_all(). Have somebody read it at all? :-)

Correct version (seems), with additional imap_8bit() function emulation:

if (!function_exists('imap_8bit')) {
function imap_8bit($text) {
return quoted_printable_encode($text);
}
}

function quoted_printable_encode_character ( $matches ) {
$character = $matches[0];
return sprintf ( '=%02x', ord ( $character ) );
}

// based on http://www.freesoft.org/CIE/RFC/1521/6.htm
function quoted_printable_encode ( $string ) {
// rule #2, #3 (leaves space and tab characters in tact)
$string = preg_replace_callback (
'/[^\x21-\x3C\x3E-\x7E\x09\x20]/',
'quoted_printable_encode_character',
$string
);
$newline = "=\r\n"; // '=' + CRLF (rule #4)
// make sure the splitting of lines does not interfere with escaped characters
// (chunk_split fails here)
$string = preg_replace ( '/(.{73}[^=]{0,3})/', '$1'.$newline, $string);
return $string;
}
up
-1
naitsirch at e dot mail dot de
4 years ago
If there is a NULL byte in the string that is passed, quoted_printable_decode will crop everything after the NULL byte and the NULL byte itself.

<?php
$result
= quoted_printable_decode("This is a\0 test.");
// $result === 'This is a'
?>

This is not a bug, but the intended behaviour and defined by RFC 2045 (see https://www.ietf.org/rfc/rfc2045.txt) in paragraph 2.7 and 2.8.
To Top