JAX: Java HPC by Norman Maurer
For slides see also Speakerdeck: High performance networking on the JVM
Norman started his talk by clarifying what he means by high scale: anything above 1000 concurrent connections is considered high scale in his talk, while anything below 100 concurrent connections is fine to handle with threads and blocking IO. Before tuning anything, measure whether you have a problem at all: readability should always come before optimisation.
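For the low end of that range, a minimal sketch of the thread-per-connection model with blocking IO might look like this (the port and the response are made-up values for illustration):

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Thread-per-connection with blocking IO: fine below roughly 100 concurrent connections.
public class ThreadPerConnectionServer {
    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(8080)) { // port chosen for the example
            while (true) {
                Socket socket = server.accept();           // blocks until a client connects
                new Thread(() -> handle(socket)).start();  // one thread per connection
            }
        }
    }

    private static void handle(Socket socket) {
        try (Socket s = socket) {
            s.getOutputStream().write("hello\n".getBytes(StandardCharsets.UTF_8));
        } catch (IOException ignored) {
            // a real handler would log this
        }
    }
}
```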
He gave a few pointers as to where to look for optimisations. Get started by studying the socket options: TCP_NODELAY as well as the send and receive buffer sizes are the most interesting ones. When under GC pressure (check the GC logs to figure out if you are), make sure to minimise allocation and deallocation of objects; to that end, consider making objects static and final where possible. Use CMS or G1 for garbage collection in order to maximise throughput, and size the areas of the JVM heap according to your access patterns. The goal should always be to minimise the chance of running into a stop-the-world garbage collection.
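As a sketch of those socket options set on a plain NIO channel (the 64 KiB buffer sizes are illustrative values, not recommendations from the talk):

```java
import java.io.IOException;
import java.net.StandardSocketOptions;
import java.nio.channels.SocketChannel;

public class SocketTuning {
    // Opens a channel with the options discussed above.
    public static SocketChannel openTunedChannel() throws IOException {
        SocketChannel channel = SocketChannel.open();
        channel.setOption(StandardSocketOptions.TCP_NODELAY, true);    // disable Nagle's algorithm
        channel.setOption(StandardSocketOptions.SO_SNDBUF, 64 * 1024); // send buffer size
        channel.setOption(StandardSocketOptions.SO_RCVBUF, 64 * 1024); // receive buffer size
        return channel;
    }
}
```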
When it comes to buffers you have the choice between direct and heap buffers. While the former are expensive to create, the latter come with the cost of being zeroed out. Often people start pooling buffers, potentially initialising the pool lazily. To avoid memory fragmentation in the Java heap, it can be a good idea to create the buffer at startup time and re-use it later on.
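A minimal sketch of the two buffer flavours and of the pre-allocate-and-reuse idea (sizes are illustrative, and the shared buffer assumes single-threaded access):

```java
import java.nio.ByteBuffer;

public class Buffers {
    // Direct buffer: expensive to create, so allocate it once at startup and re-use it.
    private static final ByteBuffer DIRECT = ByteBuffer.allocateDirect(64 * 1024);

    // Heap buffer: cheap to create, but zeroed out on every allocation.
    static ByteBuffer newHeapBuffer() {
        return ByteBuffer.allocate(64 * 1024);
    }

    // Re-use the pre-allocated direct buffer; clear() resets position and limit.
    static ByteBuffer reusedDirectBuffer() {
        DIRECT.clear();
        return DIRECT;
    }
}
```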
In particular when parsing structured messages, as they are common in protocols, it usually makes sense to use gathering writes and scattering reads to minimise the number of system calls for reading and writing. Also try to buffer more if you want to minimise system calls. Use slice and duplicate to create views on your buffers and avoid memory copies. Use a file channel when copying files without modifications.
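A sketch of a gathering write (header and body sent in one system call) and of a file copy via FileChannel.transferTo, assuming an already connected SocketChannel:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ScatterGather {
    // Gathering write: header and body go out in a single system call.
    static void writeMessage(SocketChannel channel, ByteBuffer header, ByteBuffer body) throws IOException {
        channel.write(new ByteBuffer[] { header, body });
    }

    // Copy a file to the socket without modification; transferTo can avoid copying
    // the data through the Java heap where the OS supports it.
    static void sendFile(Path file, SocketChannel channel) throws IOException {
        try (FileChannel fc = FileChannel.open(file, StandardOpenOption.READ)) {
            long position = 0;
            long size = fc.size();
            while (position < size) {
                position += fc.transferTo(position, size - position, channel);
            }
        }
    }
}
```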
Make sure you do not block - think of DNS servers being unavailable or slow as
an
example.
As a parting note, make sure to define and document your threading model. It may ease development to know that some objects will only ever be used in a single-threaded context. It usually helps to reduce context switches, and keeping data in the same thread avoids having to use synchronisation and volatile.
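One possible way to express such a model in code, sketched here with a hypothetical handler class: all connection state is confined to a single executor thread, so neither synchronisation nor volatile is needed.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ConnectionHandler {
    // All state below is only ever touched from this single event-loop thread.
    private final ExecutorService eventLoop = Executors.newSingleThreadExecutor();
    private long bytesRead; // no volatile, no locks: single-threaded access by convention

    public void onDataReceived(int length) {
        eventLoop.execute(() -> bytesRead += length);
    }
}
```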
Also make a conscious decision about which protocol you would like to use for transport - in addition to TCP there are also UDP, UDT and SCTP. Use pipelining in order to parallelise.
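As a loose sketch of pipelining over a hypothetical line-based protocol: all requests are written up front and the responses are read back afterwards, instead of paying one round trip per request.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class Pipelining {
    static List<String> send(Socket socket, List<String> requests) throws IOException {
        OutputStream out = socket.getOutputStream();
        for (String request : requests) {
            out.write((request + "\n").getBytes(StandardCharsets.UTF_8)); // queue up all requests
        }
        out.flush(); // one flush for the whole batch

        BufferedReader in = new BufferedReader(
                new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
        List<String> responses = new ArrayList<>();
        for (int i = 0; i < requests.size(); i++) {
            responses.add(in.readLine()); // responses come back in request order
        }
        return responses;
    }
}
```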