UTF-8 encoding
On php UI, both fields (location, zone) support utf-8 characters. The user could type and save. What surprised me is that in mysql db, zone is defined as a varchar (latin1) column, and location is defined as a varchar (utf8). How can this work?
This is because the UI page uses utf-8 characters. It could be store blindly into a latin1 column.
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
[Explanation for location]
The data is in utf8, the connection settings are in latin1. The table is in utf8. So, MySQL converts each of C3 and A9 (which it was told was latin1) to utf8, leading to 4 bytes in the table: C3 83 C2 A9. I call this double-encoding. The problem was caused by lying (usually accidentally) to MySQL about what encoding was in the application's data.
[Explanation for zone]
The data was loaded in an old system in the following manner. The application got utf8 bytes and blindly stored them in a latin1 column.
Reference: MySQL Charset/Collate
No comments:
Post a Comment