Understanding Character Sets and Encodings
Having recently been bitten by the character encoding issue again, we thought it would be a good time to bring it up on the podcast. Starting from the beginning with ASCII, we move on to discuss how 8-bit machines paved the way for the ISO-8859-* standards. This leads us on to Unicode, whose goal is a single character-set standard that can support all of the world's scripts. Finally, we discuss UTF-8, the de facto character encoding used on the web today, and the reasons why this is the case.
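To make the progression concrete, here is a minimal Python sketch (not something from the episode) that encodes the same word under ASCII, ISO-8859-1 and UTF-8. The sample word and the helper name `encoded_bytes` are our own choices for illustration.

```python
from typing import Optional

# Sample text chosen for illustration: 'é' is outside 7-bit ASCII.
text = "café"


def encoded_bytes(s: str, encoding: str) -> Optional[bytes]:
    """Return s encoded with the given encoding, or None if it cannot be represented."""
    try:
        return s.encode(encoding)
    except UnicodeEncodeError:
        return None


ascii_bytes = encoded_bytes(text, "ascii")        # None: ASCII has no 'é'
latin1_bytes = encoded_bytes(text, "iso-8859-1")  # one byte per character
utf8_bytes = encoded_bytes(text, "utf-8")         # 'é' becomes two bytes

# Decoding UTF-8 bytes with the wrong character set produces garbled text.
mojibake = text.encode("utf-8").decode("iso-8859-1")

print(ascii_bytes, latin1_bytes, utf8_bytes, mojibake)
```

The last line shows the kind of bug that tends to bite in practice: bytes written as UTF-8 but read back as ISO-8859-1 turn `café` into `cafÃ©`.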
Show Links
- PhalconPHP
- Team Pacific Rowers
- Computerphile
- phpwtf
- wtfjs
- Twitter - fabpot: php -r ‘echo in_array(“foo”, …
- 3v4l - EvAluate your code in our online PHP shell (100+ PHP versions)
- Reversing a String in PHP
- Reversing a Unicode String in PHP using UTF-16BE/LE
- Portable UTF-8 in PHP
- Lazy Load Enabled With AJAX Content
- Foundation Version Control for Web Developers
- Detecting UTF BOM - byte order mark
- Unicode Character Table
- Unicode - Wikipedia
- Unicode <3 JavaScript - YouTube
- Characters, Symbols and the Unicode Miracle - Computerphile - YouTube
- Decode Unicode - Johannes Bergerhausen at TEDxVienna - YouTube
- Pragmatic Unicode - YouTube
- Punycode - Wikipedia
- Understanding Unicode
- Encool Tool - Generate Text with Symbols