Monday, February 6, 2012

UTF-8 encoding

On php UI, both fields (location, zone) support utf-8 characters. The user could type and save. What surprised me is that in mysql db, zone is defined as a varchar (latin1) column, and location is defined as a varchar (utf8). How can this work?

I set both zone and location to ‘测试’ and save. Below is the values stored in the DB. It is interesting that the zone field actually store the encoded utf8 string (each character takes 3 bytes) in a latin-1 column, whereas location field is a double encoded utf8 string which is unnecessary (it is because the connection setting is latin-1).

This is because the UI page uses utf-8 characters. It could be store blindly into a latin1 column.
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>

[Explanation for location]
The data is in utf8, the connection settings are in latin1. The table is in utf8. So, MySQL converts each of C3 and A9 (which it was told was latin1) to utf8, leading to 4 bytes in the table: C3 83 C2 A9. I call this double-encoding. The problem was caused by lying (usually accidentally) to MySQL about what encoding was in the application's data.

[Explanation for zone]
The data was loaded in an old system in the following manner. The application got utf8 bytes and blindly stored them in a latin1 column.

Reference: MySQL Charset/Collate

Tuesday, November 29, 2011

Signed Jars and Certificate

1. Public/Private key Pair
It is used in two scenarios:
1) Encryption: encrypt secure content by public key of the opposite side, e.g., SSL handshake, client (browser) encrypts a random symmetric encryption key with web server's public key, web server will decrypt it with its own private key and get the symmetirc key, thereafter, they send/receive message en/deccrypted by using the symmetic key;
2) Digital Signing: encrypt the message by your private key, to certify the message (*unsecure*) coming from you. The receiver will decrypt it with your public key.

2. Certificate
Naturally, the question arises. How to broadcast your public key? The answer is: certificate. The certificate contains the reference to the issuer, the public key of the owner of this certificate, the dates of validity of this certificate and the signature of the certificate to ensure this certificate hasen't been tampered with. Usually your browser or application has already loaded the root certificate of well known Certification Authorities (CA) or root CA Certificates. The CA maintains a list of all signed certificates as well as a list of revoked certificates. A certificate is insecure until it is signed (**signed by a CA certificate, in the other words, encrypted by a CA authority's private key), as only a signed certificate cannot be modified. You can sign a certificate using itself, it is called a self signed certificate. All root CA certificates are self signed.

The certificate has all the elements to send an encrypted message to the owner (using the public key) or to verify a message signed by the author of this certificate.

3) Signed Jars
When jarsigner is used to sign a JAR file, the output signed JAR file is exactly the same as the input JAR file, except that it has two additional files placed in the META-INF directory:

- a signature file, with a .SF extension, (**digest value for the three lines in the manifest file for the source file)
- a signature block file, with a .DSA, .RSA, or .EC extension. (** signature of .SF, we actually sign the signature(digest) of signature)

The .SF file is signed and the signature is placed in the signature block file. This file also contains, encoded inside it, the certificate or certificate chain from the keystore which authenticates the public key corresponding to the private key used for signing.

SSL Certificates HOWTO
jarsigner - JAR Signing and Verification Tool

Tuesday, November 15, 2011

Rethrown exception disregarded

public void badmethod() throws Exception {
  boolean error = false;
  try {
   throw new Exception();
  } catch (Exception  e) {
   error = true;
   throw e;
  } finally {
   if (error) return;
  }
 }
 
 public void testbadmethod() {
  try {
   badmethod();
   System.out.println("WRONG");
  } catch (Exception ae) {
   System.out.println("RIGHT");
  }
 }
It prints out "WRONG". i.e., with return statement in finally caluse, the rethrown exception is swallowed.

Wednesday, August 31, 2011

MySQL backup & binlog

- Full backup (dump tables to sql file)

shell> mysqldump --user=xxx --password=xxx --single-transaction --flush-logs --master-data --delete-master-logs --all-databases -r backup_sunday_1_PM.sql
- Differential backup (take binlog file)
shell> mysqladmin flush-logs
MySQL binary logs provide point-in-time recovery. But it may take huge disk space.

flush-logs: close the current binary log file and begin a new one
delete-master-logs: delete binary logs
master-data: write binary log information to dump file, so we know which is the next position in binlog. E.g., in dump file, we see
--
-- Position to start replication or point-in-time recovery from
--
CHANGE MASTER TO MASTER_LOG_FILE='hwm_binlog.000002', MASTER_LOG_POS=98;
And mysqlbinlog can be used to display the content of binlog.
C:\HWM\mysqldb\bin>mysqlbinlog ..\data\hwm_binlog.000002
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
# at 4
#110830 23:16:33 server id 1  end_log_pos 98    Start: binlog v 4, server v 5.0.
24-community-nt-log created 110830 23:16:33
# Warning: this binlog was not closed properly. Most probably mysqld crashed wri
ting it.
# at 98
#110830 23:16:33 server id 1  end_log_pos 198   Query   thread_id=108   exec_tim
e=0     error_code=0
use webnmsdb;
SET TIMESTAMP=1314760593;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=1, @@session.uniq
ue_checks=1;
SET @@session.sql_mode=0;
/*!\C utf8 */;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@sessio
n.collation_server=8;
DELETE FROM `webnmsdb`.`extstats`;
# at 198