Java memory optimization example

"""Here is an example. Suppose you have to create a map from int to 20 character long strings. The size of this map is equal to one million and all mappings are static and predefined (saved in some dictionary, for example).

The first approach is to use a Map<Integer, String> from the standard JDK. Let’s roughly estimate the memory consumption of this structure. Each Integer occupies 16 bytes plus 4 bytes for an Integer reference from a map. Each 20 character long String occupies 36 + 20*2 = 76 bytes (see String description above), which are aligned to 80 bytes. Plus 4 bytes for a reference. The total memory consumption will be roughly (16 + 4 + 80 + 4) * 1M = 104M.

The better approach will be replacing String with a byte[] in UTF-8 encoding as described in String packing part 1: converting characters to bytes article. Our map will be Map<Integer, byte[]>. Assume that all string characters belong to ASCII set (0-127), which is true in most of English-speaking countries. A byte[20] occupies 12 (header) + 20*1 = 32 bytes, which conveniently fit into 8 bytes alignment. The whole map will now occupy (16 + 4 + 32 + 4) * 1M = 56M, which is 2 times less than the previous example.

Now let’s use Trove TIntObjectMap<byte[]>. It stores keys as normal int[] compared to wrapper types in JDK collections. Each key will now occupy exactly 4 bytes. The total memory consumption will go down to (4 + 32 + 4) * 1M = 40M.

The final structure will be more complicated. All String values will be stored in the single byte[] one after another with (byte)0 as a separator (we still assume we have a text-based ASCII strings). The whole byte[] will occupy (20 + 1) * 1M = 21M. Our map will store an offset of the string in the large byte[] instead of the string itself. We will use Trove TIntIntMap for this purpose. It will consume (4 + 4) * 1M = 8M. The total memory consumption in this example will be 8M + 21M = 29M. By the way, this is the first example relying on the immutability of this dataset.

Can we achieve the better result? Yes, we can, but at a cost of CPU consumption. The obvious ‘optimization’ is to sort the whole dataset by keys before storing values into a large byte[]. Now we may store the keys in the int[] and use a binary search to look for a key. If a key is found, its index multiplied by 21 (remember, all strings have the same length) will give us the offset of a value in the byte[]. This structure occupies ‘only’ 21M + 4M (for int[]) = 25M at a price of O(log N) lookups compared to O(1) lookups in case of a hash map.

Is this the best we can do? No! We have forgotten that all values are 20 characters long, so we don’t actually need the separators in the byte[]. It means that we can store our ‘map’ using 24M memory if we agree on O( log N ) lookups. No overhead at all compared to theoretical data size and nearly 4.5 times less than required by the original solution ( Map<Integer, String> )! Who told you that Java programs are memory hungry?

Summary

  • Prefer primitive types to their Object wrappers. The main cause of wrapper types usage are JDK collections, so consider using one of primitive type collection frameworks like Trove.

Try to minimize number of Objects you have. For example, prefer array-based structures like ArrayList/ArrayDeque to pointer based structures like LinkedList."""

Reference: http://java-performance.info/overview-of-memory-saving-techniques-java/


To view or add a comment, sign in

More articles by Sajith Nandasena

Explore content categories