Tuesday, November 29, 2011

Signed Jars and Certificate

1. Public/Private key Pair
It is used in two scenarios:
1) Encryption: encrypt content with the other side's public key. For example, in the SSL handshake the client (browser) encrypts a random symmetric key with the web server's public key; the web server decrypts it with its own private key to recover the symmetric key, and thereafter they encrypt/decrypt messages using that symmetric key;
2) Digital signing: encrypt (sign) a digest of the message with your private key, to certify that the message (which itself is not encrypted) comes from you. The receiver verifies it with your public key.
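A minimal sketch of the signing scenario using the standard java.security API (the algorithm name and the in-memory key pair are just for illustration; real keys would come from a keystore):

import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

public class SignDemo {
  public static void main(String[] args) throws Exception {
    // an in-memory key pair just for the demo
    KeyPair pair = KeyPairGenerator.getInstance("RSA").generateKeyPair();
    byte[] message = "hello".getBytes("UTF-8");

    // the sender signs with the private key
    Signature signer = Signature.getInstance("SHA256withRSA");
    signer.initSign(pair.getPrivate());
    signer.update(message);
    byte[] signature = signer.sign();

    // the receiver verifies with the public key
    Signature verifier = Signature.getInstance("SHA256withRSA");
    verifier.initVerify(pair.getPublic());
    verifier.update(message);
    System.out.println("signature valid? " + verifier.verify(signature));
  }
}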

2. Certificate
Naturally, the question arises: how do you distribute your public key? The answer is: a certificate. A certificate contains a reference to the issuer, the public key of the certificate's owner, the validity dates of the certificate, and a signature to ensure the certificate hasn't been tampered with. Usually your browser or application has already loaded the root certificates of well-known Certification Authorities (CAs). A CA maintains a list of all signed certificates as well as a list of revoked certificates. A certificate cannot be trusted until it is signed (** signed by a CA certificate, in other words, encrypted with the CA's private key), because only a signed certificate can be checked for tampering. You can also sign a certificate with its own key; this is called a self-signed certificate. All root CA certificates are self-signed.

The certificate has all the elements to send an encrypted message to the owner (using the public key) or to verify a message signed by the author of this certificate.

3. Signed Jars
When jarsigner is used to sign a JAR file, the output signed JAR file is exactly the same as the input JAR file, except that it has two additional files placed in the META-INF directory:

- a signature file, with a .SF extension (** it contains the digest of the three manifest lines for each source file)
- a signature block file, with a .DSA, .RSA, or .EC extension (** it contains the signature of the .SF file, so we actually sign the digests of the digests)

The .SF file is signed and the signature is placed in the signature block file. This file also contains, encoded inside it, the certificate or certificate chain from the keystore which authenticates the public key corresponding to the private key used for signing.
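For reference, a rough signing round trip on the command line might look like this (the keystore, alias and jar names are made-up placeholders):

keytool -genkey -alias mykey -keystore mykeystore.jks
jarsigner -keystore mykeystore.jks myapp.jar mykey
jarsigner -verify -verbose myapp.jar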

SSL Certificates HOWTO
jarsigner - JAR Signing and Verification Tool

Tuesday, November 15, 2011

Rethrown exception disregarded

public void badmethod() throws Exception {
  boolean error = false;
  try {
    throw new Exception();
  } catch (Exception e) {
    error = true;
    throw e;            // rethrow the exception to the caller
  } finally {
    if (error) return;  // the return in finally discards the rethrown exception
  }
}

public void testbadmethod() {
  try {
    badmethod();
    System.out.println("WRONG");  // reached: badmethod() returned normally
  } catch (Exception ae) {
    System.out.println("RIGHT");
  }
}
It prints out "WRONG", i.e., with a return statement in the finally clause, the rethrown exception is swallowed.

Wednesday, August 31, 2011

MySQL backup & binlog

- Full backup (dump tables to sql file)

shell> mysqldump --user=xxx --password=xxx --single-transaction --flush-logs --master-data --delete-master-logs --all-databases -r backup_sunday_1_PM.sql
- Differential backup (take binlog file)
shell> mysqladmin flush-logs
MySQL binary logs provide point-in-time recovery, but they may take up a lot of disk space.

--flush-logs: close the current binary log file and begin a new one
--delete-master-logs: delete the binary logs on the master after performing the dump
--master-data: write binary log position information into the dump file, so we know the next position in the binlog. E.g., in the dump file we see
--
-- Position to start replication or point-in-time recovery from
--
CHANGE MASTER TO MASTER_LOG_FILE='hwm_binlog.000002', MASTER_LOG_POS=98;
And mysqlbinlog can be used to display the content of the binlog.
C:\HWM\mysqldb\bin>mysqlbinlog ..\data\hwm_binlog.000002
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
# at 4
#110830 23:16:33 server id 1  end_log_pos 98    Start: binlog v 4, server v 5.0.24-community-nt-log created 110830 23:16:33
# Warning: this binlog was not closed properly. Most probably mysqld crashed writing it.
# at 98
#110830 23:16:33 server id 1  end_log_pos 198   Query   thread_id=108   exec_time=0     error_code=0
use webnmsdb;
SET TIMESTAMP=1314760593;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=1, @@session.unique_checks=1;
SET @@session.sql_mode=0;
/*!\C utf8 */;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8;
DELETE FROM `webnmsdb`.`extstats`;
# at 198
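To replay the binlog for point-in-time recovery, the mysqlbinlog output can be piped into the mysql client. A sketch, using the start position 98 reported in the dump file above (credentials are placeholders):

shell> mysqlbinlog --start-position=98 ..\data\hwm_binlog.000002 | mysql --user=xxx --password=xxx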

Tuesday, August 30, 2011

The recovery of crashed InnoDB

If the database crashes, InnoDB will attempt to fix everything by replaying the log files from the last checkpoint. Unfortunately, if InnoDB fails to repair itself, the -entire- database will not start. MySQL will exit with an error message and your entire database will be offline.

Once you have corrupt InnoDB tables that are preventing your database from starting, you should follow this six-step process:

Step 1: Add this line to your /etc/my.cnf configuration file:

[mysqld]
innodb_force_recovery = 4

Step 2: Restart MySQL. Your database will now start, but with innodb_force_recovery, all INSERTs and UPDATEs will be ignored.

Step 3: Dump all tables

Step 4: Shut down the database and delete the data directory. Run mysql_install_db to create the MySQL default tables

Step 5: Remove the innodb_force_recovery line from your /etc/my.cnf file and restart the database. (It should start normally now)

Step 6: Restore everything from your backup

*** This is copied from How to Fix MySQL Database (MyISAM / InnoDB). The command mysql_install_db is only available since MySQL 5. The steps I took are slightly different since we are using MySQL 4.
1. Set innodb_force_recovery=6
2. Start MySQL, drop databases
3. Stop MySQL, delete ibdata and ib_logfiles, and unset innodb_force_recovery
4. Start MySQL again, the tablespace and log files will be recreated
5. Restore the data with the backup file
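A rough sketch of the dump and restore commands ("Dump all tables" and the final restore step; credentials and the file name are placeholders):

shell> mysqldump --user=xxx --password=xxx --all-databases -r all_databases.sql
shell> mysql --user=xxx --password=xxx < all_databases.sql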

Friday, August 26, 2011

Discover runtime type of Generic class

A common misconception about generics in Java 5 is that you can't access them at runtime.

What you can't find out at runtime is which generic type is associated with an instance of an object. However, you can use reflection to look at which types have been statically associated with a member of a class.

import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.util.HashMap;
import java.util.Map;

public Map<String, GlobalSettings> perGlobalSettings = new HashMap<String, GlobalSettings>(); // key type assumed String for illustration

public void test() throws Exception {
  Field field = Test.class.getField("perGlobalSettings");
  // the declared (static) type of the field keeps its type arguments
  ParameterizedType type0 = (ParameterizedType) field.getGenericType();
  System.out.println(type0);  // java.util.Map<java.lang.String, GlobalSettings>
  Class clazz = (Class) type0.getActualTypeArguments()[1]; // GlobalSettings

  // the runtime class of the instance only knows the generic superclass as HashMap declares it
  ParameterizedType t = (ParameterizedType) perGlobalSettings.getClass().getGenericSuperclass();
  System.out.println(t);  // java.util.AbstractMap<K,V>
}
  
References:
Accessing generic type information at runtime
Some notes on discovering your type parameter using the Reflection API

Thursday, July 28, 2011

MBean operation name


@ManagedOperation(description="Ap Cache")
public String getApCache() { ...}

When it was invoked, I got the error "Spring javax.management.MBeanException: Operation getApCache not in ModelMBeanInfo".

This is because a getXXX method name can only be used for a managed attribute, not for an operation. After changing it to showApCache() it worked.
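A small sketch of the distinction, assuming Spring's annotation-based JMX export (the class and method names here are made up):

@ManagedResource(objectName = "myapp:name=ApCache")
public class ApCacheInfo {

  // getter-style names are exported as attributes
  @ManagedAttribute(description = "Ap Cache size")
  public int getApCacheSize() { return 0; }      // placeholder body

  // operation names should not follow the get/set/is pattern
  @ManagedOperation(description = "Ap Cache")
  public String showApCache() { return "..."; }  // placeholder body
}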

Wednesday, July 27, 2011

Java analysis tool

1) dump threads with jstack
In Windows 2008, I got the error "Not enough storage is available to process this command"; the solution is to run it via the Sysinternals psexec tool as follows:

psexec -s "%JAVA_HOME%\bin\jstack.exe" pid >stack.txt 
2) dump heap with jmap
jmap -dump:format=b,file=<heap.bin> pid
In Windows 2008,
psexec -s "%JAVA_HOME%\bin\jmap.exe" -dump:format=b,file=heap.bin pid
3) analyse dumped heap
jhat [options] <heap_dump_file>
jhat -J-mx768m heap.bin 
which sets the memory available to jhat to a value just above the size of the heap to be analysed.
4) check ports status
netstat -anp | grep 8090  #linux
netstat -a -o | find "8090" #windows
5) TCP dump
tcpdump -i any -n -v -s 0 -c 10 port 40189
tcpdump -i any -n -v -s 0 -c 10 not port 22   
# 22 is the SSH port; -i interface, -n no name resolving, -s 0 full packet, -c number of packets
6) vi
ESC toggle between command mode and insert mode
Command mode: x delete, /xyz search (n - next), :wq save & exit, :q! exit without saving
Insert mode: i - insert before, a - insert after, o - new line

Thursday, July 14, 2011

Slow InnoDB insert?

InnoDB has characteristics that MyISAM doesn't have, and some of them take extra time: tablespace management, transaction logging, writing to the rollback segment, and so on.

Generally, InnoDB's insertion speed is slower than MyISAM's.

1) Row size
Run "show table status", you will see for the same table, the avg_row_length for are different. MyISAM is 84 bytes, but InnoDB is 116, and could be more depending on how PK is defined. MyISAM uses heap storage, the new inserted records are basically appended to the data file without sorting; whereas InnoDB uses a cluster indexed storeage, i.e., the whole table is structured as a sorted BTree, the data records on the leaf nodes. The inernal nodes is comprised of PK. So the table size is not calculated as rowSize*numOfRows. There is a overhead.

2) Primary Key
Since InnoDB stores data at the leaves of the primary index, the primary key should be kept as short as possible. And if the primary key is not an integer that increments linearly, each insert cannot simply be made at the end of the table but lands somewhere in the middle, which forces data shuffling.

I tried table partitioning, and MySQL requires the partition columns to be part of the PK. So I chose a composite PK (mac, partId, recordTime); partId is an auxiliary column, a hash code (0..9) generated from mac, because the hash type partition doesn't support a UDF (user defined function). Since the PK values arrive in arbitrary order while inserting, this causes data shuffling. Insertion into the partitioned table was much slower (5-317 seconds for 50k records) than into the table without partitioning and with an auto-increment long PK (5-30 seconds). And MyISAM is the fastest, 5-10 seconds.
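A rough sketch of what such a partitioned table definition might look like (the column types, the extra value column and the partition count are assumptions, not the actual schema):

CREATE TABLE stats (
  mac        VARCHAR(17) NOT NULL,
  partId     TINYINT     NOT NULL,  -- hash code 0..9 computed from mac by the application
  recordTime DATETIME    NOT NULL,
  value      INT,
  PRIMARY KEY (mac, partId, recordTime)
) ENGINE=InnoDB
PARTITION BY HASH (partId) PARTITIONS 10;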

3) InnoDB configuration
The article Configure InnoDB explains this subject very well. These settings affect InnoDB performance a lot. Among them, innodb_log_file_size and innodb_flush_log_at_trx_commit are important for insertion, to reduce disk I/O. Note that when innodb_log_file_size is changed, we need to remove the old log files and restart the mysql server.

innodb_data_file_path = ibdata1:15M:autoextend

# You can set .._buffer_pool_size up to 50 - 80 %
# of RAM but beware of setting memory usage too high
innodb_buffer_pool_size = 1024M
innodb_additional_mem_pool_size = 16M

# Set .._log_file_size to 25 % of buffer pool size
innodb_log_file_size = 512M
innodb_log_buffer_size = 256M
innodb_flush_log_at_trx_commit = 0
innodb_lock_wait_timeout = 50
innodb_file_per_table

Run "show innodb status" to know the innodb runtime status as explained in SHOW INNODB STATUS walk through.
By monitoring log sequence number and value up to which logs have been flushed you can check if your innodb_log_buffer_size is optimal - if you see more than 30% of log buffer size being unflushed you may want to increase it.

From these values you can learn if your buffer pool is sized well - if you have constantly a lot of pages free, it probably means your active database size is smaller than allocated buffer pool size so you can tune down innodb_buffer_pool_size.

4) JDBC batch size
InnoDB is transaction safe. Turning off autoCommit and using JDBC batch inserts should be faster. But surprisingly, there was no performance improvement when I tried the batch insert. This is because MySQL in general doesn't have a "batch" form of prepared statement parameter bindings(?). MySQL supports batching in a different way: by adding rewriteBatchedStatements to the URL (e.g., jdbc:mysql://localhost:4589/mydb?rewriteBatchedStatements=true), the driver will collapse multiple inserts into one based on the batch size, e.g., "INSERT INTO foo VALUES (...), (...), (...)". The insertion performance is further improved from 10 seconds to 5 seconds, i.e., 5000 records per second.
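A minimal sketch of the batched insert through JDBC (the columns and credentials are placeholders; only the foo table name is taken from the rewritten statement above):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.Timestamp;

public class BatchInsertDemo {
  public static void main(String[] args) throws Exception {
    String url = "jdbc:mysql://localhost:4589/mydb?rewriteBatchedStatements=true";
    Connection con = DriverManager.getConnection(url, "xxx", "xxx");
    con.setAutoCommit(false);  // commit once per batch instead of per row
    PreparedStatement ps = con.prepareStatement(
        "INSERT INTO foo (mac, recordTime, value) VALUES (?, ?, ?)");
    for (int i = 0; i < 50000; i++) {
      ps.setString(1, "00:11:22:33:44:55");
      ps.setTimestamp(2, new Timestamp(System.currentTimeMillis()));
      ps.setInt(3, i);
      ps.addBatch();
      if (i % 1000 == 999) {   // the driver rewrites each batch into one multi-row INSERT
        ps.executeBatch();
        con.commit();
      }
    }
    ps.executeBatch();  // flush the remainder
    con.commit();
    ps.close();
    con.close();
  }
}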

Hibernate's StatelessSession doesn't respect the hibernate.jdbc.batch_size setting, so I couldn't do a batch insert with it. One trick to view the actual running SQL statement is to run "show innodb status".

Wednesday, June 29, 2011

Query with large result set

When caching is not required, it may be better to use a StatelessSession.

ScrollableResults results = session.createQuery("SELECT person FROM Person person").setReadOnly(true).setFetchSize(1000).setCacheable(false).scroll(ScrollMode.FORWARD_ONLY);

- The fetch size is the number of rows physically retrieved from the database at one time by the JDBC driver as you scroll through a query ResultSet with next(). As a general rule, setting the query fetch size is only effective for large result sets. I typically set this value to 100 for large result sets (the default is 10).
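A hedged sketch of iterating such a result set with a StatelessSession (the Person entity and the sessionFactory variable are assumed to exist elsewhere):

StatelessSession session = sessionFactory.openStatelessSession();
ScrollableResults results = session.createQuery("SELECT person FROM Person person")
    .setReadOnly(true).setFetchSize(1000).setCacheable(false)
    .scroll(ScrollMode.FORWARD_ONLY);
while (results.next()) {
  Person person = (Person) results.get(0);  // one entity per row, nothing is cached
  // process person here
}
results.close();
session.close();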

Tuesday, February 15, 2011

Synchronized over listeners?

Iterating over a synchronized collection is actually not thread safe. As discussed here, there are three approaches to deal with this:
- CopyOnWriteArrayList: good for seldom-write cases, such as listeners
- Synchronize on the collection during iteration
- Clone the collection before iteration: If you're doing a lot of work and don't want to block other threads working at the same time, the hit of cloning the collection may well be acceptable. Also, it is recommended to use toArray() for the collection cloning, which is safe during the copy.

There was a deadlock occurring between two classes A and B; they both have a bunch of synchronized methods and need each other at some point.

My solution is to remove 'synchronized' from the methods in class A and use the thread-safe CopyOnWriteArrayList class for the collection of listeners. CopyOnWriteArrayList provides lock-free list traversal, so the lock contention between class A and class B is relieved.
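A minimal sketch of such a listener registry (the listener interface and event type are made up for illustration):

import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class EventSource {
  private final List<EventListener> listeners = new CopyOnWriteArrayList<EventListener>();

  public void addListener(EventListener l)    { listeners.add(l); }     // mutation copies the array
  public void removeListener(EventListener l) { listeners.remove(l); }  // mutation copies the array

  public void fireEvent(String event) {
    // traversal works on a snapshot: no lock held, no ConcurrentModificationException
    for (EventListener l : listeners) {
      l.onEvent(event);
    }
  }

  public interface EventListener {
    void onEvent(String event);
  }
}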

The CopyOnWriteArrayList API doc states: "This class implements a variant of java.util.ArrayList in which all mutative operations (add, set, and so on) are implemented by making a fresh copy of the underlying array. This is ordinarily too costly, but it becomes attractive when traversal operations vastly overwhelm mutations, and, especially, when you cannot or don't want to synchronize traversals, yet need to preclude interference among concurrent threads...".

The synchronized collections wrappers, synchronizedMap and synchronizedList, are sometimes called conditionally thread-safe.

Map map = Collections.synchronizedMap(new HashMap());
List list = Collections.synchronizedList(new ArrayList());

// put-if-absent idiom -- contains a race condition
// may require external synchronization
if (!map.containsKey(key))
    map.put(key, value);

// ad-hoc iteration -- contains race conditions
// may require external synchronization
for (int i = 0; i < list.size(); i++) {
    doSomething(list.get(i));
}

// normal iteration -- can throw ConcurrentModificationException
// may require external synchronization
for (Iterator i = list.iterator(); i.hasNext(); ) {
    doSomething(i.next());
}

As discussed in this article, the java.util.concurrent package provides other classes such as ConcurrentHashMap, which aim to reduce lock granularity and perform better than the old JDK collections.
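For example, ConcurrentHashMap makes the put-if-absent idiom above atomic without external synchronization (the key and value here are just placeholders):

ConcurrentMap<String, Integer> map = new ConcurrentHashMap<String, Integer>();
// atomic put-if-absent: no check-then-act race condition
Integer previous = map.putIfAbsent("key", 1);
if (previous != null) {
  // another thread inserted the key first; previous holds its value
}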

Also, in java.util.concurrent.atomic package, there are a bunch of classes such as A AtomicBoolean, provide lock-free alternate. It is a wrapper for 'volatile boolean'. The difference between volatile and synchronized are well discussed in this blog. Basically it is about thread copy of variable, main memory and the synchronization between them.