Presenting several UTF-8 / Multibyte-aware escape functions.

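These functions are alternatives to mysqli::real_escape_string. As long as your DB connection and the Multibyte extension use the same character set (UTF-8), they escape the same set of characters as mysqli::real_escape_string. The output is not byte-for-byte identical, though: these functions put a backslash in front of the raw character, whereas mysqli::real_escape_string rewrites NUL, newline, carriage return and Ctrl-Z as \0, \n, \r and \Z.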

This is based on research I did for my SQL Query Builder class:
https://github.com/twister-php/sql

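<?php
if (function_exists('mb_ereg_replace'))
{
    function mb_escape(string $string)
    {
        // '\\\\0' is the 3-character replacement \\0: a backslash, then the whole match (\0)
        return mb_ereg_replace('[\x00\x0A\x0D\x1A\x22\x27\x5C]', '\\\\0', $string);
    }
} else {
    function mb_escape(string $string)
    {
        // '\\\\$0' is the 4-character replacement \\$0: a literal backslash, then the whole match ($0)
        return preg_replace('~[\x00\x0A\x0D\x1A\x22\x27\x5C]~u', '\\\\$0', $string);
    }
}
?>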

Characters escaped are (the same as mysqli::real_escape_string):

00 = \0 (NUL)
0A = \n
0D = \r
1A = ctl-Z
22 = "
27 = '
5C = \

Note: preg_replace() is in PCRE_UTF8 (UTF-8) mode (`u`), so it returns NULL if the input is not valid UTF-8.
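As a quick sanity check, here is a minimal sketch that runs the preg_replace() variant on a sample string. The name `demo_escape` and the sample input are made up for illustration (so nothing clashes with mb_escape); the pattern and replacement are the ones from this note.

```php
<?php
// Hypothetical demo name, so it does not clash with mb_escape().
function demo_escape(string $string): ?string
{
    // '\\\\$0' is the replacement \\$0: a literal backslash, then the matched character.
    return preg_replace('~[\x00\x0A\x0D\x1A\x22\x27\x5C]~u', '\\\\$0', $string);
}

$input = "O'Reilly \"quoted\" back\\slash";
echo demo_escape($input), "\n";
// prints: O\'Reilly \"quoted\" back\\slash
```

Every matched character simply gets one backslash in front of it; everything else, including multibyte UTF-8 characters, passes through unchanged.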

Enhanced version:

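When escaping strings for `LIKE` syntax, remember that you also need to escape the special characters _ and %.

So this is a more fail-safe version for `LIKE` patterns (even when compared to mysqli::real_escape_string, which leaves % and _ untouched, so wildcard characters in user input can cause unexpected results and even security violations via SQL injection in LIKE statements). Use it only for values that end up inside a `LIKE` pattern: outside pattern-matching contexts MySQL keeps \% and \_ as literal two-character sequences, so the backslashes would end up in your data.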

<?php
if (function_exists('mb_ereg_replace'))
{
    function mb_escape(string $string)
    {
        // Same as before, plus \x25 (%) and \x5F (_)
        return mb_ereg_replace('[\x00\x0A\x0D\x1A\x22\x25\x27\x5C\x5F]', '\\\\0', $string);
    }
} else {
    function mb_escape(string $string)
    {
        // Same as before, plus \x25 (%) and \x5F (_)
        return preg_replace('~[\x00\x0A\x0D\x1A\x22\x25\x27\x5C\x5F]~u', '\\\\$0', $string);
    }
}
?>

Additional characters escaped:

25 = %
5F = _
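
To show where the wildcards go, here is a rough sketch of building a LIKE pattern: escape the user's text first, then add the % wildcards you actually want. The helper name `demo_like_escape`, the `products` table and the `name` column are all hypothetical.

```php
<?php
// Hypothetical helper name; same character class as the enhanced version.
function demo_like_escape(string $string): ?string
{
    return preg_replace('~[\x00\x0A\x0D\x1A\x22\x25\x27\x5C\x5F]~u', '\\\\$0', $string);
}

// User input containing both LIKE wildcards.
$term = '50%_off';

// Escape first, then wrap in the wildcards that are meant to act as wildcards.
$pattern = '%' . demo_like_escape($term) . '%';
$sql     = "SELECT * FROM products WHERE name LIKE '" . $pattern . "'";

echo $sql, "\n";
// prints: SELECT * FROM products WHERE name LIKE '%50\%\_off%'
```

Only the outer % characters act as wildcards here; the % and _ typed by the user are matched literally.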

Bonus function:

The original MySQL `utf8` character-set (for tables and fields) only supports 3-byte sequences.
4-byte characters are not common, but I’ve had queries fail to execute on 4-byte UTF-8 characters, so you should be using `utf8mb4` wherever possible.

However, if you still want to use `utf8`, you can use the following function to replace all 4-byte sequences.

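<?php
function mysql_utf8_sanitizer(string $str)
{
    // Replace every code point above U+FFFF (4 bytes in UTF-8) with U+FFFD (the replacement character)
    return preg_replace('/[\x{10000}-\x{10FFFF}]/u', "\xEF\xBF\xBD", $str);
}
?>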
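
A small sketch of what the sanitizer does to a string that mixes 2-byte and 4-byte characters. The name `demo_utf8_sanitizer` and the sample input are made up for illustration; the pattern and replacement follow the function above.

```php
<?php
// Hypothetical demo name; same pattern and replacement as mysql_utf8_sanitizer().
function demo_utf8_sanitizer(string $str): ?string
{
    // U+10000..U+10FFFF take 4 bytes in UTF-8; "\xEF\xBF\xBD" is U+FFFD encoded as UTF-8.
    return preg_replace('/[\x{10000}-\x{10FFFF}]/u', "\xEF\xBF\xBD", $str);
}

// "\u{E9}" (e with acute accent) is 2 bytes, "\u{1F600}" (an emoji) is 4 bytes.
$input  = "caf\u{E9} \u{1F600}";
$output = demo_utf8_sanitizer($input);

var_dump($output === "caf\u{E9} \u{FFFD}"); // bool(true)
```

The 2-byte character survives, and only the 4-byte emoji is swapped for the replacement character, so the result fits in a `utf8` (3-byte) column.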

Pick your poison and use at your own risk!