Wednesday, July 27, 2011

Java analysis tools

1) dump threads with jstack
On Windows Server 2008, jstack can fail with the error "Not enough storage is available to process this command"; the solution is to run it under the SYSTEM account via the Sysinternals psexec tool:

psexec -s "%JAVA_HOME%\bin\jstack.exe" pid >stack.txt 
2) dump heap with jmap
jmap -dump:format=b,file=<heap.bin> pid
On Windows Server 2008:
psexec -s "%JAVA_HOME%\bin\jmap.exe" -dump:format=b,file=heap.bin pid
3) analyse dumped heap
jhat [options] <heap_dump_file>
jhat -J-mx768m heap.bin 
This sets the memory available to jhat to a value just above the size of the heap to be analysed.
4) check port status
netstat -anp | grep 8090    # linux
netstat -a -o | find "8090"    # windows
5) TCP dump
tcpdump -i any -n -v -s 0 -c 10 port 40189
tcpdump -i any -n -v -s 0 -c 10 not port 22   
# port 22 is SSH; -i interface, -n no name resolution, -s 0 capture full packets, -c number of packets
6) vi
ESC - return from insert mode to command mode
Command mode: x - delete character, /xyz - search (n - next match), :wq - save & exit, :q! - exit without saving
Insert mode (entered from command mode): i - insert before cursor, a - append after cursor, o - open a new line below

Thursday, July 14, 2011

Slow InnoDB insert?

InnoDB has characteristics that MyISAM doesn't have, and some of them cost extra time: tablespace management, transaction logging, writing to the rollback segment, and so on.

Generally, the insertion speed of InnoDB is slower than that of MyISAM.

1) Row size
Run "show table status", you will see for the same table, the avg_row_length for are different. MyISAM is 84 bytes, but InnoDB is 116, and could be more depending on how PK is defined. MyISAM uses heap storage, the new inserted records are basically appended to the data file without sorting; whereas InnoDB uses a cluster indexed storeage, i.e., the whole table is structured as a sorted BTree, the data records on the leaf nodes. The inernal nodes is comprised of PK. So the table size is not calculated as rowSize*numOfRows. There is a overhead.

2) Primary Key
Since InnoDB stores data at the leaves of the primary index, the primary key should be kept as short as possible. And if the primary key is not an integer that increments linearly, each insert cannot simply be appended at the end of the table but lands somewhere in the middle, which forces data shuffling.

I tried table partitioning, and MySQL requires that the partition columns be part of the PK. So I chose a composite PK (mac, partId, recordTime), where partId is an auxiliary column holding a hash code (0..9) generated from mac; this is because HASH partitioning doesn't support UDFs (user-defined functions). Since such PK values arrive in arbitrary order during insertion, they cause data shuffling. Insertion into the partitioned table was much slower (5-317 seconds per 50k records) than into the table without partitioning and with an auto-increment long PK (5-30 seconds). And MyISAM was the fastest, at 5-10 seconds.
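For illustration, the auxiliary partId could be computed on the application side like this (a hedged sketch; the method name is hypothetical):

// Compute a stable 0..9 bucket from the mac string in the application,
// since MySQL HASH partitioning cannot call a user-defined function.
static int partId(String mac) {
    // mask the sign bit so the result is always in 0..9
    return (mac.hashCode() & 0x7fffffff) % 10;
}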

3) InnoDB configuration
The article Configure InnoDB explains this subject very well. These settings affect InnoDB performance a lot. Among them, innodb_log_file_size and innodb_flush_log_at_trx_commit are important for insertion because they reduce disk I/O. Note that when innodb_log_file_size is changed, we need to remove the old log files and restart the MySQL server.

innodb_data_file_path = ibdata1:15M:autoextend

# You can set .._buffer_pool_size up to 50 - 80 %
# of RAM but beware of setting memory usage too high
innodb_buffer_pool_size = 1024M
innodb_additional_mem_pool_size = 16M

# Set .._log_file_size to 25 % of buffer pool size
innodb_log_file_size = 512M
innodb_log_buffer_size = 256M
innodb_flush_log_at_trx_commit = 0
innodb_lock_wait_timeout = 50
innodb_file_per_table

Run "show innodb status" to know the innodb runtime status as explained in SHOW INNODB STATUS walk through.
By monitoring log sequence number and value up to which logs have been flushed you can check if your innodb_log_buffer_size is optimal - if you see more than 30% of log buffer size being unflushed you may want to increase it.

From these values you can also learn whether your buffer pool is sized well: if you constantly have a lot of free pages, it probably means your active database is smaller than the allocated buffer pool, so you can tune down innodb_buffer_pool_size.

4) JDBC batch size
InnoDB is transaction safe. Turning off autoCommit and using JDBC batch inserts should be faster. But surprisingly, there was no performance improvement when I tried batch inserts. That is because MySQL in general doesn't have a "batch" form of prepared statement parameter bindings. However, MySQL supports batching in a different way: by adding rewriteBatchedStatements to the URL (e.g., jdbc:mysql://localhost:4589/mydb?rewriteBatchedStatements=true), the driver collapses multiple inserts into one statement based on the batch size, e.g., "INSERT INTO foo VALUES (...), (...), (...)". Insertion performance is further improved from 10 seconds to 5 seconds, i.e., 5000 records per second.
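A minimal sketch of such a batch insert, assuming the (mac, partId, recordTime) table from above (URL, credentials, and table/column names are placeholders):

import java.sql.*;

public class BatchInsert {
    public static void main(String[] args) throws Exception {
        // rewriteBatchedStatements=true lets Connector/J collapse the batch
        // into multi-row statements: INSERT INTO ... VALUES (...), (...), ...
        String url = "jdbc:mysql://localhost:3306/mydb?rewriteBatchedStatements=true";
        try (Connection con = DriverManager.getConnection(url, "user", "pass")) {
            con.setAutoCommit(false); // commit once per run, not per row
            try (PreparedStatement ps = con.prepareStatement(
                    "INSERT INTO record (mac, partId, recordTime) VALUES (?, ?, ?)")) {
                for (int i = 0; i < 50000; i++) {
                    String mac = "00:11:22:33:44:" + String.format("%02d", i % 100);
                    ps.setString(1, mac);
                    ps.setInt(2, (mac.hashCode() & 0x7fffffff) % 10);
                    ps.setTimestamp(3, new Timestamp(System.currentTimeMillis()));
                    ps.addBatch();
                    if ((i + 1) % 1000 == 0) {
                        ps.executeBatch(); // flush every 1000 rows
                    }
                }
                ps.executeBatch(); // flush the remainder
                con.commit();
            }
        }
    }
}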

Hibernate's StatelessSession doesn't respect the hibernate.jdbc.batch_size setting, so I couldn't do a batch insert with it. One trick to view the actual running SQL statements is to run "show innodb status".

Wednesday, June 29, 2011

Query with large result set

When caching is not required, it may be better to use a StatelessSession.

ScrollableResults results = session.createQuery("SELECT person FROM Person person")
    .setReadOnly(true)
    .setFetchSize(1000)
    .setCacheable(false)
    .scroll(ScrollMode.FORWARD_ONLY);

- The fetch size is the number of rows physically retrieved from the database at one time by the JDBC driver as you scroll through a query ResultSet with next(). As a general rule, setting the query fetch size is only effective for large result sets. I typically set this value to 100 for large result sets (the default is 10).
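For completeness, iterating the cursor with a StatelessSession might look like the sketch below (Person, sessionFactory, and process() are placeholders):

StatelessSession session = sessionFactory.openStatelessSession();
try {
    ScrollableResults results = session.createQuery("SELECT person FROM Person person")
        .setReadOnly(true)
        .setFetchSize(1000)
        .scroll(ScrollMode.FORWARD_ONLY);
    while (results.next()) {
        Person person = (Person) results.get(0);
        process(person); // hypothetical per-row handler; nothing is cached
    }
    results.close();
} finally {
    session.close();
}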

Tuesday, February 15, 2011

Synchronized over listeners?

Iterating over a synchronized collection is actually not thread-safe by itself. As discussed here, there are three approaches to deal with this:
- CopyOnWriteArrayList: good for the seldom-write case, such as listeners
- Synchronize on the collection during iteration
- Clone the collection before iteration: if you're doing a lot of work and don't want to block other threads working at the same time, the hit of cloning the collection may well be acceptable. It is recommended to use toArray() for the cloning, which is safe during the copy (see the sketch below).
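A hedged sketch of the clone-then-iterate approach; the listener interface and names are made up for illustration:

import java.util.ArrayList;
import java.util.List;

class EventSource {
    interface Listener { void onEvent(String event); } // hypothetical listener type

    private final List<Listener> listeners = new ArrayList<Listener>();

    void addListener(Listener l) {
        synchronized (listeners) { listeners.add(l); }
    }

    void fireEvent(String event) {
        Listener[] snapshot;
        synchronized (listeners) {
            // toArray copies while holding the lock, so the snapshot is consistent
            snapshot = listeners.toArray(new Listener[0]);
        }
        for (Listener l : snapshot) {
            l.onEvent(event); // no lock held here; slow listeners don't block others
        }
    }
}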

A deadlock occurred between two classes, A and B: both have a bunch of synchronized methods and call each other at some point.

My solution was to remove synchronized from the methods in class A and use the thread-safe CopyOnWriteArrayList for the collection of listeners. CopyOnWriteArrayList provides lock-free list traversal, so the lock contention between class A and class B is relieved.
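A sketch of the resulting notification path; with CopyOnWriteArrayList no explicit locking is needed (Listener is the same hypothetical type as in the sketch above):

private final List<Listener> listeners = new CopyOnWriteArrayList<Listener>();

void fireEvent(String event) {
    // each write copies the array, so traversal sees an immutable snapshot
    // and never throws ConcurrentModificationException
    for (Listener l : listeners) {
        l.onEvent(event);
    }
}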

In the CopyOnWriteArrayList API doc, it states: "This class implements a variant of java.util.ArrayList in which all mutative operations (add, set, and so on) are implemented by making a fresh copy of the underlying array. This is ordinarily too costly, but it becomes attractive when traversal operations vastly overwhelm mutations, and, especially, when you cannot or don't want to synchronize traversals, yet need to preclude interference among concurrent threads...".

The synchronized collections wrappers, synchronizedMap and synchronizedList, are sometimes called conditionally thread-safe.

Map map = Collections.synchronizedMap(new HashMap());
List list = Collections.synchronizedList(new ArrayList());

// put-if-absent idiom -- contains a race condition
// may require external synchronization
if (!map.containsKey(key))
    map.put(key, value);

// ad-hoc iteration -- contains race conditions
// may require external synchronization
for (int i=0; i<list.size(); i++) {
doSomething(list.get(i));
}

// normal iteration -- can throw ConcurrentModificationException
// may require external synchronization
for (Iterator i=list.iterator(); i.hasNext(); ) {
doSomething(i.next());
}
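
The external synchronization those comments call for is to hold the wrapper's own lock for the whole traversal, as the synchronizedList javadoc requires:

// safe iteration -- hold the wrapper's lock for the entire traversal
synchronized (list) {
    for (Iterator i = list.iterator(); i.hasNext(); ) {
        doSomething(i.next());
    }
}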

As discussed in this article, the java.util.concurrent package provides other collection classes such as ConcurrentHashMap; they use finer-grained locking and perform better than the old JDK wrappers.
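For example, ConcurrentHashMap turns the put-if-absent idiom above into a single atomic call, and its iterators are weakly consistent rather than fail-fast:

ConcurrentMap map = new ConcurrentHashMap();

// atomic put-if-absent -- no check-then-act race, no external lock
map.putIfAbsent(key, value);

// weakly consistent iteration -- no lock, no ConcurrentModificationException
for (Object k : map.keySet()) {
    doSomething(map.get(k));
}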

Also, the java.util.concurrent.atomic package has a bunch of classes, such as AtomicBoolean, that provide lock-free alternatives; AtomicBoolean is essentially a wrapper around a volatile boolean plus atomic read-modify-write operations. The difference between volatile and synchronized is well discussed in this blog; basically it is about each thread's working copy of a variable, main memory, and the synchronization between them.
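A small sketch of the lock-free style; init() is a hypothetical one-time task:

final AtomicBoolean started = new AtomicBoolean(false);

// compareAndSet is an atomic test-and-set; a plain volatile boolean
// cannot do the check and the write as one atomic step
if (started.compareAndSet(false, true)) {
    init(); // exactly one thread wins and runs the initialization
}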