
mb_ord

(PHP 7 >= 7.2.0, PHP 8)

mb_ord - Get Unicode code point of character

Description

mb_ord ( string $string , string|null $encoding = null ) : int|false

Returns the Unicode code point value of the given character.

This function complements mb_chr().
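The relationship with mb_chr() can be illustrated with a round trip. This is a minimal sketch rather than part of the official examples; the variable names are illustrative, and the character is written with a \u{...} escape so the result does not depend on the encoding of the source file.

```php
<?php
// Round trip: character -> code point -> character.
$char = "\u{20AC}";                      // EURO SIGN, encoded as UTF-8

$codePoint = mb_ord($char, "UTF-8");     // code point of the first character
$back      = mb_chr($codePoint, "UTF-8"); // rebuild the character from it

var_dump($codePoint);      // int(8364)
var_dump($back === $char); // bool(true)
```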

Parameters

string

A string

encoding

The encoding parameter is the character encoding. If it is omitted or null, the internal character encoding is used.

Return Values

The Unicode code point for the first character of string, or false on failure.
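One way to check for failure is sketched below. codePointOrNull() is a hypothetical helper, not part of PHP. The sketch assumes PHP 8 behaviour, where some invalid arguments (an empty string, for example) raise a ValueError instead of returning false, so it handles both cases.

```php
<?php
// Hypothetical helper (not part of PHP): returns the code point of the
// first character, or null when mb_ord() cannot produce one.
function codePointOrNull(string $string, string $encoding = "UTF-8"): ?int
{
    if ($string === "") {
        return null; // nothing to inspect
    }
    try {
        $codePoint = mb_ord($string, $encoding);
    } catch (\ValueError $e) {
        // Assumption: PHP 8 raises ValueError for some invalid arguments.
        return null;
    }
    // Compare strictly with false: a valid code point of 0 is falsy too.
    return $codePoint === false ? null : $codePoint;
}

var_dump(codePointOrNull("A")); // int(65)
var_dump(codePointOrNull(""));  // NULL
```

The strict `=== false` comparison matters because a NUL character yields the valid code point 0.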

Changelog

Version  Description
8.0.0    encoding is nullable now.

Examples

<?php
var_dump(mb_ord("A", "UTF-8"));
var_dump(mb_ord("🐘", "UTF-8"));
var_dump(mb_ord("\x80", "ISO-8859-1"));
var_dump(mb_ord("\x80", "Windows-1252"));
?>

The above example will output:


int(65)
int(128024)
int(128)
int(8364)
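Because mb_ord() only examines the first character, listing the code points of a whole string means splitting it first. The sketch below does this with mb_str_split(), which requires PHP 7.4 or later. codePointsOf() is a hypothetical helper, not part of PHP, and it assumes UTF-8 input.

```php
<?php
// Hypothetical helper (not part of PHP): formats every character of a
// UTF-8 string as U+XXXX by calling mb_ord() once per character.
function codePointsOf(string $text): array
{
    $result = [];
    // mb_str_split() requires PHP 7.4 or later.
    foreach (mb_str_split($text, 1, "UTF-8") as $char) {
        $result[] = sprintf("U+%04X", mb_ord($char, "UTF-8"));
    }
    return $result;
}

echo implode(" ", codePointsOf("PHP\u{1F418}")), "\n";
// prints: U+0050 U+0048 U+0050 U+1F418
```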

See Also

mb_internal_encoding()
mb_chr()
ord()


User Contributed Notes 1 note

Andrew
1 year ago
You can forget about DIY uniord()
https://www.php.net/manual/en/function.ord.php#42778

$array['Б'] = uniord('Б');
$array['🚷'] = uniord('🚷');
$array['mb_ord Б'] = mb_ord('Б');
$array['mb_ord 🚷'] = mb_ord('🚷');

// DIY equivalent of mb_ord() for a single UTF-8 character.
function uniord($charUTF8)
{
    // UCS-4BE stores the code point as four big-endian bytes.
    $charUCS4 = mb_convert_encoding($charUTF8, 'UCS-4BE', 'UTF-8');
    $byte1 = ord(substr($charUCS4, 0, 1));
    $byte2 = ord(substr($charUCS4, 1, 1));
    $byte3 = ord(substr($charUCS4, 2, 1));
    $byte4 = ord(substr($charUCS4, 3, 1));
    // Combine the four bytes, most significant first.
    return ($byte1 << 24) + ($byte2 << 16) + ($byte3 << 8) + $byte4;
}

var_export($array);

Shows:

array ( 'Б' => 1041, '🚷' => 128695, 'mb_ord Б' => 1041, 'mb_ord 🚷' => 128695, )

https://unicode-table.com/en/0411/
Б
Encoding  hex          dec (bytes)  dec        binary
UTF-8     D0 91        208 145      53393      11010000 10010001
UTF-16BE  04 11        4 17         1041       00000100 00010001
UTF-16LE  11 04        17 4         4356       00010001 00000100
UTF-32BE  00 00 04 11  0 0 4 17     1041       00000000 00000000 00000100 00010001
UTF-32LE  11 04 00 00  17 4 0 0     285474816  00010001 00000100 00000000 00000000

https://unicode-table.com/en/1F6B7/
🚷
Encoding  hex          dec (bytes)      dec         binary
UTF-8     F0 9F 9A B7  240 159 154 183  4036991671  11110000 10011111 10011010 10110111
UTF-16BE  D8 3D DE B7  216 61 222 183   3627933367  11011000 00111101 11011110 10110111
UTF-16LE  3D D8 B7 DE  61 216 183 222   1037613022  00111101 11011000 10110111 11011110
UTF-32BE  00 01 F6 B7  0 1 246 183      128695      00000000 00000001 11110110 10110111
UTF-32LE  B7 F6 01 00  183 246 1 0      3086352640  10110111 11110110 00000001 00000000
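The hex column of the tables above can be reproduced with mb_convert_encoding() and bin2hex(). The sketch below is illustrative only: hexIn() is a hypothetical helper, not part of PHP, and the character is written with a \u{...} escape so the result does not depend on the encoding of the source file.

```php
<?php
// Hypothetical helper (not part of PHP): hex dump of one UTF-8 character
// after re-encoding it into the given encoding.
function hexIn(string $char, string $encoding): string
{
    return strtoupper(bin2hex(mb_convert_encoding($char, $encoding, "UTF-8")));
}

$char = "\u{0411}"; // CYRILLIC CAPITAL LETTER BE
foreach (["UTF-8", "UTF-16BE", "UTF-16LE", "UTF-32BE", "UTF-32LE"] as $enc) {
    printf("%-8s %s\n", $enc, hexIn($char, $enc));
}

// The UTF-32BE bytes, read as one big-endian 32-bit integer, equal the
// code point that mb_ord() returns.
$asInt = unpack("N", mb_convert_encoding($char, "UTF-32BE", "UTF-8"))[1];
var_dump($asInt === mb_ord($char, "UTF-8")); // bool(true)
```

This is the same idea the uniord() function relies on: UCS-4BE (UTF-32BE) stores the code point directly, so combining its four bytes yields the value of mb_ord().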