Simplicity is everything

by saoj

Inter-socket communication with less than 2 microseconds latency

Non-blocking I/O through selectors is the part of networking that I like the most. The Java NIO API is not easy, but once you understand the reactor pattern and abstract away its complexities you end up with a powerful and reusable network multiplexer. The classic one-thread-per-socket approach does not scale, has a lot of overhead, and almost always leads to complex code. Continue Reading →
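To make the selector idea concrete, here is a minimal sketch of a single thread multiplexing an echo server and its own client through one `Selector`. It is not the code from the article: the class name `SelectorEchoSketch` and the method `echoOnce` are made up for illustration, the initial connect is blocking to keep things short, and partial writes are ignored because the payload is assumed to be small.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Hypothetical sketch: one thread, one Selector, an echo server and its client.
final class SelectorEchoSketch {

    // Sends the message to a loopback echo server and returns the echoed text.
    static String echoOnce(String message) throws IOException {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        ByteBuffer reply = ByteBuffer.allocate(payload.length);
        SocketChannel accepted = null;

        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            // Port 0 asks the OS for any free port.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);

            // Blocking connect for brevity; everything after it is non-blocking.
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                client.configureBlocking(false);
                client.register(selector, SelectionKey.OP_READ);
                client.write(ByteBuffer.wrap(payload)); // assumes the payload fits the send buffer

                while (reply.hasRemaining()) {
                    selector.select(); // waits until at least one channel is ready
                    Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                    while (keys.hasNext()) {
                        SelectionKey key = keys.next();
                        keys.remove();
                        if (key.isAcceptable()) {
                            accepted = server.accept();
                            if (accepted != null) {
                                accepted.configureBlocking(false);
                                accepted.register(selector, SelectionKey.OP_READ);
                            }
                        } else if (key.isReadable() && key.channel() == client) {
                            client.read(reply); // client side: collect the echo
                        } else if (key.isReadable()) {
                            SocketChannel channel = (SocketChannel) key.channel();
                            ByteBuffer buffer = ByteBuffer.allocate(256);
                            if (channel.read(buffer) > 0) {
                                buffer.flip();
                                channel.write(buffer); // server side: echo back (partial writes ignored)
                            }
                        }
                    }
                }
            } finally {
                if (accepted != null) accepted.close();
            }
        }
        return new String(reply.array(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(echoOnce("ping"));
    }
}
```

The point of the sketch is that accept and read events for every socket arrive on the same loop, so no thread is ever parked on a single connection.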

by saoj

Intro to Parallel Processing with MapReduce

When I first tried to learn about MapReduce, I found it difficult to grasp the basic concepts, so I decided to write a simple example that demonstrates its benefits in practice, the most important one being distributed computing or parallel processing. In this article I describe a simple problem and proceed to solve it with and without MapReduce. Finally, I show how MapReduce makes it straightforward to distribute the work across a cluster. Continue Reading →
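The teaser does not say which problem the article solves, so the sketch below uses word counting as a stand-in. It shows the three classic phases (map, shuffle, reduce) in plain Java on a single machine; the class `WordCountMapReduce` and its method names are illustrative assumptions, and the code needs Java 9 or later for `Map.entry`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical single-machine illustration of the map, shuffle and reduce phases.
final class WordCountMapReduce {

    // Map phase: one input line becomes a list of (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        return Arrays.stream(line.toLowerCase().split("\\W+"))
                .filter(word -> !word.isEmpty())
                .map(word -> Map.entry(word, 1))
                .collect(Collectors.toList());
    }

    // Reduce phase: all the counts emitted for one word become a single total.
    static int reduce(String word, List<Integer> counts) {
        return counts.stream().mapToInt(Integer::intValue).sum();
    }

    static Map<String, Integer> run(List<String> lines) {
        // Shuffle phase: group the emitted pairs by key. Each line is mapped
        // independently, which is why the map calls can run in parallel.
        Map<String, List<Integer>> grouped = lines.parallelStream()
                .flatMap(line -> map(line).stream())
                .collect(Collectors.groupingBy(
                        Map.Entry::getKey,
                        Collectors.mapping(Map.Entry::getValue, Collectors.toList())));

        Map<String, Integer> totals = new TreeMap<>();
        grouped.forEach((word, counts) -> totals.put(word, reduce(word, counts)));
        return totals;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("to be or not to be", "to map and to reduce")));
    }
}
```

Because `map` looks at one line at a time and `reduce` looks at one key at a time, a framework is free to run them on different machines; here a parallel stream plays that role.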

by saoj

Asynchronous logging versus Memory Mapped Files

One of the challenges of HFT systems is to minimize I/O latency. Whenever you want to write something to the disk or send something to the network, you risk introducing latency in your critical thread. One solution for this problem is NIO (non-blocking I/O). For disk I/O you can use a memory-mapped file and for network I/O you can use a non-blocking channel. NIO is supposed to provide asynchronous output (i.e. writes), meaning you pass the bytes to the OS (i.e. copy them to a direct byte buffer) and hope that the OS will make the best effort to do its job. Continue Reading →
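For the disk side, a memory-mapped write could look roughly like the sketch below. The class `MappedFileWriteSketch`, the 4096-byte region size and the `force()` call are assumptions made for illustration; a latency-sensitive logger would keep the mapping open, avoid allocating byte arrays per message, and probably leave flushing to the OS.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch of writing through a memory-mapped file.
final class MappedFileWriteSketch {

    static final int REGION_SIZE = 4096; // arbitrary size chosen for the example

    // Copies the line into the mapped region and returns the number of bytes written.
    // Throws BufferOverflowException if the line does not fit in the region.
    static int write(Path file, String line) throws IOException {
        byte[] bytes = line.getBytes(StandardCharsets.UTF_8);
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer region = channel.map(FileChannel.MapMode.READ_WRITE, 0, REGION_SIZE);
            region.put(bytes); // a memory copy; the OS decides when it reaches the disk
            region.force();    // explicit flush, only so the example is easy to verify
        }
        return bytes.length;
    }

    // Reads the first 'length' bytes of the file back as text.
    static String readBack(Path file, int length) throws IOException {
        byte[] all = Files.readAllBytes(file);
        return new String(all, 0, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mapped", ".log");
        int written = write(file, "order sent");
        System.out.println(readBack(file, written));
    }
}
```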

by saoj

Which one is faster: Java heap or native memory?

One of the advantages of the Java language is that you do not need to deal with memory allocation and deallocation. Whenever you instantiate an object with the new keyword, the necessary memory is allocated in the JVM heap. The heap is then managed by the garbage collector, which reclaims the memory after the object goes out of scope. However, there is a backdoor to reach the off-heap native memory from the JVM. In this article I am going to show how an object can be stored in memory as a sequence of bytes and how you can choose between storing these bytes in heap memory or in direct (i.e. native) memory. Then I will try to conclude which one is faster to access from the JVM: heap memory or direct memory. Continue Reading →
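As a small illustration of the idea, the sketch below stores the two fields of an object as raw bytes in a `ByteBuffer`, which can be allocated either on the heap or in direct memory. The class `BufferBackedPoint` and its field layout are invented for this example, and the sketch makes no claim about which variant is faster.

```java
import java.nio.ByteBuffer;

// Hypothetical object whose state lives in a ByteBuffer instead of in fields.
final class BufferBackedPoint {

    private static final int X_OFFSET = 0;          // int, 4 bytes
    private static final int Y_OFFSET = 4;          // long, 8 bytes
    private static final int SIZE_IN_BYTES = 12;

    private final ByteBuffer bytes;

    private BufferBackedPoint(ByteBuffer bytes) {
        this.bytes = bytes;
    }

    // Bytes live inside the JVM heap, backed by a byte[].
    static BufferBackedPoint onHeap() {
        return new BufferBackedPoint(ByteBuffer.allocate(SIZE_IN_BYTES));
    }

    // Bytes live in native memory outside the heap.
    static BufferBackedPoint offHeap() {
        return new BufferBackedPoint(ByteBuffer.allocateDirect(SIZE_IN_BYTES));
    }

    void set(int x, long y) {
        bytes.putInt(X_OFFSET, x);
        bytes.putLong(Y_OFFSET, y);
    }

    int x() {
        return bytes.getInt(X_OFFSET);
    }

    long y() {
        return bytes.getLong(Y_OFFSET);
    }

    boolean isDirect() {
        return bytes.isDirect();
    }

    public static void main(String[] args) {
        BufferBackedPoint point = offHeap();
        point.set(7, 42L);
        System.out.println(point.x() + ", " + point.y() + ", direct=" + point.isDirect());
    }
}
```

The calling code is identical for both variants; only the factory method decides where the bytes live.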

by saoj

Inter-thread communication with 2-digit nanosecond latency

With the proliferation of multi-core processors and the high cost of colocation, it is tempting to run more than one critical application on the same machine, using a thread affinity library to pin each application to its own isolated core. However, multithreading can be a big source of latency due to locking. The solution is to make your threads communicate by exchanging messages through a lock-free queue. Continue Reading →
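One common shape for such a queue is a single-producer, single-consumer ring buffer. The sketch below is a generic illustration, not the queue from the article: the class `SpscRingBuffer` is made up, it is only safe with exactly one producer thread and one consumer thread, and it uses plain volatile writes (`AtomicLong.set`) where tuned implementations often use cheaper ordered writes.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical lock-free queue for exactly one producer thread and one consumer thread.
final class SpscRingBuffer<E> {

    private final Object[] slots;
    private final int mask;
    private final AtomicLong head = new AtomicLong(); // next index to read, advanced by the consumer
    private final AtomicLong tail = new AtomicLong(); // next index to write, advanced by the producer

    SpscRingBuffer(int capacity) {
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a positive power of two");
        }
        slots = new Object[capacity];
        mask = capacity - 1;
    }

    // Producer thread only. Returns false when the queue is full instead of blocking.
    boolean offer(E element) {
        long t = tail.get();
        if (t - head.get() == slots.length) {
            return false;
        }
        slots[(int) (t & mask)] = element;
        tail.set(t + 1); // volatile write publishes the slot to the consumer
        return true;
    }

    // Consumer thread only. Returns null when the queue is empty instead of blocking.
    @SuppressWarnings("unchecked")
    E poll() {
        long h = head.get();
        if (h == tail.get()) {
            return null;
        }
        int index = (int) (h & mask);
        E element = (E) slots[index];
        slots[index] = null; // let the element be garbage collected
        head.set(h + 1);     // volatile write frees the slot for the producer
        return element;
    }
}
```

Neither method takes a lock: each index has a single writer, so the two threads coordinate only through the volatile reads and writes of `head` and `tail`.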

by saoj

Real-time Java programming without GC

If you are writing latency-sensitive applications in Java it is paramount that you gain control over the garbage collector. Although you cannot turn off the GC, you can and should adopt some coding techniques that will delay the garbage collector indefinitely. But before we examine these techniques, we need a reliable way to profile our programs for allocated memory and collected memory, so we can know accurately the amount of garbage (i.e. de-referenced objects) we are leaving behind. Meet GCUtils, a simple tool I wrote to (really) force the GC and measure these values. Continue Reading →

by saoj

Refactoring Bean Property Names Through Proxies

I recently found myself wishing for the following in MentaBean’s programmatic mapping approach. Instead of having this:

userConfig.field("age", DBTypes.STRING);

Wouldn’t it be nice to have something like this, with full support for property name refactoring?

userConfig.field(user.getAge(), DBTypes.STRING);

Continue Reading →
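One way a call such as user.getAge() can be turned back into the property name "age" is to hand out a proxy that records which getter was invoked. The sketch below only illustrates that general trick and is not MentaBean's implementation: `User` is a hypothetical interface, `PropertyNameRecorder` and `nameOf` are invented names, and `java.lang.reflect.Proxy` only works for interfaces, so proxying a concrete bean class would need a bytecode library instead.

```java
import java.lang.reflect.Proxy;

// Hypothetical bean interface, used only for this illustration.
interface User {
    int getAge();
    String getName();
}

// Hypothetical helper that remembers the last getter called on its proxy.
final class PropertyNameRecorder {

    private String lastProperty;

    // Creates a proxy whose getters record their property name and return a dummy value.
    <T> T proxyOf(Class<T> beanInterface) {
        Object proxy = Proxy.newProxyInstance(
                beanInterface.getClassLoader(),
                new Class<?>[] { beanInterface },
                (self, method, args) -> {
                    String name = method.getName();
                    if (name.startsWith("get") && name.length() > 3) {
                        lastProperty = Character.toLowerCase(name.charAt(3)) + name.substring(4);
                    }
                    return defaultValue(method.getReturnType());
                });
        return beanInterface.cast(proxy);
    }

    // The argument is evaluated first, which triggers the getter on the proxy;
    // its value is ignored and the recorded property name is returned.
    String nameOf(Object ignoredGetterResult) {
        return lastProperty;
    }

    // A proxy must return the exact wrapper type for a primitive return type.
    private static Object defaultValue(Class<?> type) {
        if (type == boolean.class) return false;
        if (type == char.class) return '\0';
        if (type == byte.class) return (byte) 0;
        if (type == short.class) return (short) 0;
        if (type == int.class) return 0;
        if (type == long.class) return 0L;
        if (type == float.class) return 0f;
        if (type == double.class) return 0d;
        return null;
    }
}
```

With this in place, `recorder.nameOf(user.getAge())` yields the string "age", and renaming `getAge` with an IDE refactoring updates every such call site automatically.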

by saoj

Hibernate is more complex than the problem it tries to solve

Relational databases have been around for a long time and there is nothing too complex about them. Any average programmer should be able to learn how to write SQL in less than a day. Indexes, joins, transactions, caching, and lazy loading are not complex topics either. Despite all that, Hibernate has become the de facto standard for writing database access code in Java. In this article I try to explain the drawbacks of Hibernate and how things can be done differently. Continue Reading →