JAX: Java HPC by Norman Maurer

2013-05-19 20:31
For slides see also: Speakerdeck: High performance networking on the JVM


Norman started his talk by clarifying what he means by high scale: anything above
1000 concurrent connections is considered high scale in this talk, while anything
below 100 concurrent connections is fine to handle with threads and blocking
IO. Before tuning anything, measure whether you have a problem at
all: readability should always come before optimisation.


He gave a few pointers as to where to look for optimisations: get started by
studying the socket options - TCP_NODELAY as well as the send and receive
buffer sizes are the most interesting ones. When under GC pressure (check the GC
logs to figure out if you are), minimise allocation and deallocation of
objects; to that end, consider making objects static and final where
possible. Use CMS or G1 for garbage collection in order to
maximise throughput, and size the areas of the JVM heap according to your access
patterns. The goal should always be to minimise the chance of running into a
stop-the-world garbage collection.
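The socket options mentioned above are exposed directly on java.net.Socket. A minimal sketch (the class name SocketTuning is just for illustration); note that the requested buffer sizes are only hints that the OS may round or cap:

```java
import java.net.Socket;

public class SocketTuning {
    public static void main(String[] args) throws Exception {
        // Options can be set on a socket before it is connected.
        Socket socket = new Socket();

        // TCP_NODELAY: disable Nagle's algorithm so small writes go out
        // immediately instead of being coalesced by the kernel.
        socket.setTcpNoDelay(true);

        // Ask for larger kernel send/receive buffers; the OS decides the
        // actual size, so read the value back if it matters.
        socket.setSendBufferSize(64 * 1024);
        socket.setReceiveBufferSize(64 * 1024);

        System.out.println("TCP_NODELAY: " + socket.getTcpNoDelay());
        System.out.println("SO_SNDBUF:   " + socket.getSendBufferSize());
        System.out.println("SO_RCVBUF:   " + socket.getReceiveBufferSize());
        socket.close();
    }
}
```

In Netty the same options are set via the bootstrap/channel configuration rather than on the raw socket.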


When it comes to buffers you have the choice between direct and heap
buffers. While the former are expensive to create, the latter come with the
cost of being zeroed out. Often people resort to buffer pooling, potentially
initialising the pool in a lazy manner. In order to avoid memory fragmentation
in the Java heap, it can be a good idea to create the buffers at startup time
and re-use them later on.
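A minimal sketch of such a pool, assuming direct buffers and eager allocation at startup (the class name BufferPool and its sizing are hypothetical):

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BufferPool {
    private final BlockingQueue<ByteBuffer> pool;

    // Allocate all direct buffers up front so the expensive allocation
    // cost is paid once, at startup, instead of on the hot path.
    BufferPool(int buffers, int capacity) {
        pool = new ArrayBlockingQueue<>(buffers);
        for (int i = 0; i < buffers; i++) {
            pool.add(ByteBuffer.allocateDirect(capacity));
        }
    }

    // Blocks until a buffer is available, which also bounds memory use.
    ByteBuffer acquire() throws InterruptedException {
        return pool.take();
    }

    void release(ByteBuffer buf) {
        buf.clear(); // reset position/limit before the next user sees it
        pool.offer(buf);
    }

    public static void main(String[] args) throws Exception {
        BufferPool pool = new BufferPool(4, 8192);
        ByteBuffer buf = pool.acquire();
        System.out.println("direct: " + buf.isDirect() + ", capacity: " + buf.capacity());
        pool.release(buf);
    }
}
```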


In particular when parsing structured messages, as are common in
protocols, it usually makes sense to use gathering writes and scattering reads
to minimise the number of system calls for reading and writing. Also try to
buffer more data if you want to cut down on system calls further. Use slice and
duplicate to create views on your buffers and avoid memory copies. Use a file
channel when copying files without modification.
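The slice/duplicate views mentioned above are plain java.nio.ByteBuffer methods; both share the underlying bytes, so no copy is made. A small sketch (the HEADER/payload framing is made up for illustration):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class BufferViews {
    public static void main(String[] args) {
        ByteBuffer message =
                ByteBuffer.wrap("HEADERpayload".getBytes(StandardCharsets.US_ASCII));

        // slice() creates a view starting at the current position that
        // shares the underlying bytes -- no memory copy.
        message.position(6); // skip the 6-byte header
        ByteBuffer payload = message.slice();

        // duplicate() creates a view over the whole buffer with its own
        // independent position and limit, again without copying.
        ByteBuffer whole = message.duplicate();

        byte[] out = new byte[payload.remaining()];
        payload.get(out);
        System.out.println(new String(out, StandardCharsets.US_ASCII)); // payload
        System.out.println("view capacity: " + whole.capacity());      // 13
    }
}
```

Gathering writes work the same way at the channel level: GatheringByteChannel.write takes a ByteBuffer[], so header and payload buffers can go out in one system call.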


Make sure you do not block - think of DNS servers being unavailable or slow as
an example.
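One hedged way to keep a slow DNS server from stalling an I/O thread is to offload the blocking lookup to a separate pool and bound it with a timeout. A sketch under that assumption (class name, pool size and timeout are all illustrative):

```java
import java.net.InetAddress;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedDnsLookup {
    private static final ExecutorService RESOLVER = Executors.newFixedThreadPool(2);

    // Run the potentially blocking lookup on a dedicated pool and give up
    // after a timeout instead of stalling the calling (I/O) thread.
    static InetAddress resolve(String host, long timeoutMillis) throws Exception {
        Future<InetAddress> future =
                RESOLVER.submit(() -> InetAddress.getByName(host));
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);
            throw new RuntimeException("DNS lookup for " + host + " timed out", e);
        }
    }

    public static void main(String[] args) throws Exception {
        // "localhost" resolves without leaving the machine.
        System.out.println(resolve("localhost", 1000));
        RESOLVER.shutdown();
    }
}
```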


As a parting note, make sure to define and document your threading model. It
may ease development to know that some objects will only ever be used in a
single-threaded context. It usually helps to reduce context switches, and
keeping data in the same thread avoids having to use synchronisation and
volatile.
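This kind of thread confinement can be sketched with a single-threaded executor standing in for an event loop (the ConnectionState class and the 42-byte increments are made up for illustration):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ConnectionState {
    // Mutable per-connection state with no synchronized blocks and no
    // volatile: by convention it is only touched from one thread.
    long bytesRead;

    public static void main(String[] args) throws Exception {
        ConnectionState state = new ConnectionState();
        ExecutorService eventLoop = Executors.newSingleThreadExecutor();

        // All updates for this connection go through the same thread,
        // so they are naturally serialised without locks.
        for (int i = 0; i < 1000; i++) {
            eventLoop.execute(() -> state.bytesRead += 42);
        }

        // shutdown + awaitTermination establishes the happens-before edge
        // that makes the final value safely visible to the main thread.
        eventLoop.shutdown();
        eventLoop.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println(state.bytesRead); // 42000
    }
}
```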


Also make a conscious decision about which protocol you would like to use for
transport - in addition to TCP there are also UDP, UDT and SCTP. Use pipelining
in order to parallelise.
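Pipelining means sending several requests before waiting for any response, so the connection's latency is paid once rather than per request. A self-contained sketch against a trivial line-echo server on loopback (the "req"/"reply:" framing is invented for the demo):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class PipeliningDemo {
    public static void main(String[] args) throws Exception {
        // A trivial line-echo "server" on the loopback interface.
        ServerSocket server = new ServerSocket(0);
        Thread echo = new Thread(() -> {
            try (Socket s = server.accept();
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(s.getInputStream()));
                 PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
                String line;
                while ((line = in.readLine()) != null) {
                    out.println("reply:" + line);
                }
            } catch (IOException ignored) {
            }
        });
        echo.start();

        try (Socket client = new Socket("127.0.0.1", server.getLocalPort());
             PrintWriter out = new PrintWriter(client.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(client.getInputStream()))) {
            // Pipelining: send all three requests up front ...
            out.println("req1");
            out.println("req2");
            out.println("req3");
            // ... then read the three responses, instead of paying one
            // full round trip per request.
            for (int i = 0; i < 3; i++) {
                System.out.println(in.readLine());
            }
        }
        server.close();
        echo.join();
    }
}
```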