PHP: visual charset detection of a string

Sometimes in mb_convert_encoding($str, 'UTF-8', 'auto') that latest auto argument won't work.

To find what charset the string is using you can get all possible charsets, convert your $str into that charset and print it:

foreach(mb_list_encodings() as $charset) {
    echo mb_convert_encoding($str, 'UTF-8', $charset) . ' ----- ' . $chr . "\n";                                                                                                2              }
}

This code may not work for some charsets returned by mb_list_encodings(). You can ignore some of them like this:

foreach(mb_list_encodings() as $charset) {
    if (in_array($chr, ['pass', 'wchar', 'byte2be', 'byte2le', 'byte4be', 'byte4le', 'BASE64', 'UUENCODE', 'HTML-ENTITIES', 'Quoted-Printable', '7bit', '8bit'])) {
        continue;
    }
    echo mb_convert_encoding($str, 'UTF-8', $charset) . ' ----- ' . $chr . "\n";                                                                                                2              }
}

Of course it's only applicable if you know that $str will always be in the charset you found.

Proxy Information
Original URL
gemini://g.codelearn.me/2020-11-25-visual-detection-of-charset-in-php.gmi
Status Code
Success (20)
Meta
text/gemini
Capsule Response Time
513.057595 milliseconds
Gemini-to-HTML Time
0.219154 milliseconds

This content has been proxied by September (ba2dc).