Here is an example demonstrating the difference between the `substr` and `mb_substr` functions:
1- When working with single-byte (ASCII) characters, both functions behave the same and give the same output:
$str = 'abcdef';
echo substr($str, 0, 3); // abc
echo mb_substr($str, 0, 3); // abc
2- When working with multibyte UTF-8 characters, the two functions behave differently and give different results:
2.A- The `substr` function works at the byte level: it treats every byte as one character and is not aware of multibyte encodings.
For example:
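$str_utf8 = "déjà_vu"; // assumes this file is saved as UTF-8 (utf8_encode is deprecated since PHP 8.2 and would double-encode a UTF-8 string)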
If we do this:
echo substr($str_utf8, 0, 3); // dé
echo substr($str_utf8, 0, 2); // d�
=> That's because the special character "é" (and "à") is internally coded with two bytes:
With a length of 2, PHP reads the byte at index 0, which represents `d`, then the byte at index 1, which is only the first half of the two-byte encoding of `é`, and stops there. The second byte of `é` is never read, so the result ends in an incomplete sequence that is displayed as � instead of é.
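To make the byte-level behaviour visible, here is a minimal sketch that prints the byte and character counts and the raw bytes returned by `substr`. It assumes the mbstring extension is enabled and PHP 7.0+ (for the `\u{...}` escapes, which are used here only so the example does not depend on how the file is saved):

```php
<?php
// Build "déjà_vu" as UTF-8 with explicit code points (é = U+00E9, à = U+00E0).
$str_utf8 = "d\u{E9}j\u{E0}_vu";

echo strlen($str_utf8), "\n";             // 9 (bytes)
echo mb_strlen($str_utf8, "UTF-8"), "\n"; // 7 (characters)

// "é" is stored as the two bytes 0xC3 0xA9.
echo bin2hex("\u{E9}"), "\n";             // c3a9

// A length of 2 returns "d" plus only the first byte of "é".
$cut = substr($str_utf8, 0, 2);
echo bin2hex($cut), "\n";                 // 64c3

// The result is no longer valid UTF-8, which is why � is displayed.
var_dump(mb_check_encoding($cut, "UTF-8")); // bool(false)
```

The 9 versus 7 mismatch is the whole story: `substr` indexes into the 9 bytes, so any offset or length that lands inside `é` or `à` splits the character.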
2.B- The `mb_substr` function works at the character level and supports multibyte encodings. PHP counts characters only, regardless of how many bytes each one occupies. For example:
$str_utf8 = "déjà_vu"; // assumes this file is saved as UTF-8
echo mb_substr($str_utf8, 0, 4, "UTF-8"); // déjà
echo mb_substr($str_utf8, 1, 4, "UTF-8"); // éjà_
echo mb_substr($str_utf8, 6, 4, "UTF-8"); // u
echo mb_substr($str_utf8, 7, 4, "UTF-8"); // ''
echo mb_substr($str_utf8, -2, null, "UTF-8"); // vu (the third argument is the length; null means "to the end of the string")
echo mb_substr($str_utf8, -2, 1, "UTF-8"); // v
echo mb_substr($str_utf8, -2, 3, "UTF-8"); // vu
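As a side-by-side check, the sketch below calls both functions with the same offset and length. `compare_substr` is a hypothetical helper written for this illustration, not a PHP built-in; it assumes mbstring is enabled and PHP 7.0+:

```php
<?php
// "déjà_vu" built with explicit code points so the file encoding does not matter.
$str_utf8 = "d\u{E9}j\u{E0}_vu";

// Hypothetical helper: run substr and mb_substr with identical arguments.
function compare_substr(string $s, int $start, int $length): array
{
    $bytes = substr($s, $start, $length);             // counts bytes
    $chars = mb_substr($s, $start, $length, "UTF-8"); // counts characters

    return [
        "substr_hex"   => bin2hex($bytes),                   // raw bytes, safe to print
        "substr_valid" => mb_check_encoding($bytes, "UTF-8"), // still valid UTF-8?
        "mb_substr"    => $chars,
    ];
}

var_dump(compare_substr($str_utf8, 1, 4));
var_dump(compare_substr($str_utf8, 0, 3));
```

With start 1 and length 4, `substr` returns the bytes `c3a96ac3` (all of `é`, then `j`, then only half of `à`), which is not valid UTF-8, while `mb_substr` returns `éjà_`. With start 0 and length 3, `substr` happens to end on a character boundary and returns `dé`, whereas `mb_substr` returns `déj`.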