[278] Free Memory
Author: Dr Heinz M. Kabutz | Date: 2020-04-30 | Category: Performance | Java Version: 15
Abstract:
How much memory was wasted when an additional boolean field was added to java.lang.String in Java 13? None at all. This article explains why.
Welcome to the 278th edition of The Java(tm) Specialists' Newsletter, sent to you from the stunning Island of Crete. During the lockdown period, we are fortunately still allowed to go out for exercise. Thus my daily runs are continuing. I regularly share the lovely views on @heinzkabutz.
My book "Dynamic Proxies in Java" has now been published and you can get your free copy of the e-book from InfoQ.
javaspecialists.teachable.com: Please visit our new self-study course catalog to see how you can upskill your Java knowledge.
Free Memory
Last month, in newsletter 277, I wrote about a change in Java 13 that prevented having to recalculate the hash code of a String in the unlikely case that it was 0. I saw several objections to the change, asking why Oracle had added another field to String, thus increasing its memory consumption.
Object size in Java is somewhat hard to determine. We do not have a sizeof operator, and the size also varies by JVM configuration. For example, in a 64-bit JVM with compressed OOPs, we use 4 bytes for a reference and 12 bytes for the object header. If our JVM is configured with a maximum heap of 32 GB or more, then a reference is 8 bytes and the object header is 16 bytes.
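To see which of the two configurations a given JVM is running in, we can ask it directly. The following is a minimal sketch that assumes a HotSpot-based JVM exposing the UseCompressedOops flag through the com.sun.management API; the class and method names are made up for illustration, and on other JVMs the method simply reports "unknown".

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class, not part of the original newsletter.
final class CompressedOopsCheck {
    /** Returns "true" or "false", or "unknown" if the JVM does not expose the flag. */
    static String useCompressedOops() {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            // Assumption: we are on HotSpot, where this flag exists on 64-bit builds
            return bean.getVMOption("UseCompressedOops").getValue();
        } catch (RuntimeException | LinkageError e) {
            return "unknown";
        }
    }

    public static void main(String[] args) {
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Max heap (MB): " + maxHeapMb);
        System.out.println("UseCompressedOops: " + useCompressedOops());
    }
}
```

Running it once with the default heap and once with -Xmx32g should show the flag changing on a typical 64-bit HotSpot JVM.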
One thing that is consistent across all the JVM systems I have looked at is that objects are aligned on 8-byte boundaries. This means that the actual memory usage of an object will always be a multiple of 8. Thus a java.lang.Boolean object needs 12 bytes for the object header and one byte for the boolean, totalling 13 bytes. However, it will use 16 bytes, wasting 3 bytes due to object alignment.
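The rounding itself is simple arithmetic. Here is a small sketch of it, assuming the 8-byte object alignment described above; the class and method names are invented for this example.

```java
// Hypothetical helper class, for illustration only.
final class AlignmentSketch {
    static final int OBJECT_ALIGNMENT = 8; // assumption: 8-byte object alignment

    /** Rounds size up to the next multiple of alignment. */
    static long alignUp(long size, int alignment) {
        return ((size + alignment - 1) / alignment) * alignment;
    }

    public static void main(String[] args) {
        long needed = 12 + 1; // object header + one boolean field
        long used = alignUp(needed, OBJECT_ALIGNMENT);
        System.out.println("needed=" + needed + " used=" + used
            + " wasted=" + (used - needed));
    }
}
```

An object whose header and fields already add up to a multiple of 8 is left unchanged, so nothing is wasted in that case.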
In the past, I used all sorts of trickery for guessing the object size. Nowadays I use JOL (Java Object Layout). For example, here is the output when we look at the internals of java.lang.Boolean:
java.lang.Boolean object internals:
 OFFSET  SIZE      TYPE DESCRIPTION
      0     4           (object header)
      4     4           (object header)
      8     4           (object header)
     12     1   boolean Boolean.value
     13     3           (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 3 bytes external = 3 bytes total
As we see, the instance size is 16 bytes and we have three bytes that are unused space.
If we create a JVM with a 32 GB heap (-Xmx32g), then the object header uses 16 bytes, so the header and field add up to 17 bytes. However, the actual size is 24 bytes, due to object alignment:
java.lang.Boolean object internals:
 OFFSET  SIZE      TYPE DESCRIPTION
      0     4           (object header)
      4     4           (object header)
      8     4           (object header)
     12     4           (object header)
     16     1   boolean Boolean.value
     17     7           (loss due to the next object alignment)
Instance size: 24 bytes
Space losses: 0 bytes internal + 7 bytes external = 7 bytes total
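If JOL is not at hand, the same numbers can be approximated with plain reflection. The sketch below is deliberately naive: it adds up the header and the instance field sizes and rounds up to 8 bytes, ignoring any padding the JVM may insert between fields, so it is an estimate and not a replacement for JOL. The header and reference sizes are passed in as assumptions, and all names are hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Hypothetical estimator, for illustration only.
final class NaiveSizeEstimate {
    /** Size in bytes of a single field of the given type. */
    static int sizeOf(Class<?> fieldType, int referenceSize) {
        if (!fieldType.isPrimitive()) return referenceSize;
        if (fieldType == boolean.class || fieldType == byte.class) return 1;
        if (fieldType == char.class || fieldType == short.class) return 2;
        if (fieldType == int.class || fieldType == float.class) return 4;
        return 8; // long and double
    }

    /** Header plus all instance fields (including inherited ones), rounded up to 8 bytes. */
    static int estimate(Class<?> type, int headerSize, int referenceSize) {
        int size = headerSize;
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (!Modifier.isStatic(f.getModifiers())) {
                    size += sizeOf(f.getType(), referenceSize);
                }
            }
        }
        return (size + 7) / 8 * 8;
    }

    public static void main(String[] args) {
        // Compressed OOPs: 12-byte header, 4-byte references
        System.out.println(estimate(Boolean.class, 12, 4));
        // Large heap: 16-byte header, 8-byte references
        System.out.println(estimate(Boolean.class, 16, 8));
    }
}
```

For java.lang.Boolean, which has a single boolean instance field, the estimate agrees with the two JOL listings above.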
Let's get back to String and consider the object sizes over the versions of Java. We are ignoring the size of the char[] or byte[] that contains the actual text.
Java 6 used 32 bytes, since they were storing the offset and count:
# java version "1.6.0_65"
 OFFSET  SIZE      TYPE DESCRIPTION
      0     4           (object header)
      4     4           (object header)
      8     4           (object header)
     12     4    char[] String.value
     16     4       int String.offset
     20     4       int String.count
     24     4       int String.hash
     28     4           (loss due to the next object alignment)
Instance size: 32 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total
(Incidentally, when the cached hash was added to String in Java 1.3, most JVMs were 32-bit and the object header was just 8 bytes. In those days, the extra hash field fitted into the wasted space. Another interesting factoid from 2001: in those days every field took at least 4 bytes, even boolean and byte. That changed in Java 1.4. Enough ancient history!)
Java 7 decreased this to 24 bytes by dropping the offset and count fields. The hash32 field was an optimization to mitigate DoS attacks on hash maps. It was "free" in terms of memory usage, since without it we would have had 4 unused bytes anyway.
# openjdk version "1.7.0_252" (Zulu 7.36.0.5-CA-macosx)
java.lang.String object internals:
 OFFSET  SIZE      TYPE DESCRIPTION
      0     4           (object header)
      4     4           (object header)
      8     4           (object header)
     12     4    char[] String.value
     16     4       int String.hash
     20     4       int String.hash32
Instance size: 24 bytes
Space losses: 0 bytes internal + 0 bytes external = 0 bytes total
Java 8 got rid of the hash32 field, which was replaced with a generalized solution inside java.util.HashMap. This did not save any memory in String, since those 4 bytes are now "wasted" due to the next object alignment.
# openjdk version "1.8.0_242" (Zulu 8.44.0.11-CA-macosx)
java.lang.String object internals:
 OFFSET  SIZE      TYPE DESCRIPTION
      0     4           (object header)
      4     4           (object header)
      8     4           (object header)
     12     4    char[] String.value
     16     4       int String.hash
     20     4           (loss due to the next object alignment)
Instance size: 24 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total
Java 9 changed the array type to byte[] and added a coder field. However, the String object still uses 24 bytes, with 3 lost due to object alignment.
# java version "9.0.4" build 9.0.4+11
java.lang.String object internals:
 OFFSET  SIZE      TYPE DESCRIPTION
      0     4           (object header)
      4     4           (object header)
      8     4           (object header)
     12     4    byte[] String.value
     16     4       int String.hash
     20     1      byte String.coder
     21     3           (loss due to the next object alignment)
Instance size: 24 bytes
Space losses: 0 bytes internal + 3 bytes external = 3 bytes total
Java 13 added the hashIsZero boolean field, which in Java uses 1 byte. It fits into the 3 bytes that were previously lost to object alignment, so a String still occupies 24 bytes. Thus, as stated in the abstract, adding this new field did not cost any additional memory.
# openjdk version "13.0.2" 2020-01-14 build 13.0.2+8
java.lang.String object internals:
 OFFSET  SIZE      TYPE DESCRIPTION
      0     4           (object header)
      4     4           (object header)
      8     4           (object header)
     12     4    byte[] String.value
     16     4       int String.hash
     20     1      byte String.coder
     21     1   boolean String.hashIsZero
     22     2           (loss due to the next object alignment)
Instance size: 24 bytes
Space losses: 0 bytes internal + 2 bytes external = 2 bytes total
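To put the String history side by side, the sketch below redoes the arithmetic from the listings above. The field sizes are copied from those JOL outputs and assume a 12-byte header with compressed OOPs; the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical summary class, for illustration only.
final class StringSizeHistory {
    static final int HEADER = 12;   // assumption: 64-bit JVM with compressed OOPs
    static final int ALIGNMENT = 8; // assumption: 8-byte object alignment

    /** Field sizes in bytes per Java version, taken from the JOL listings. */
    static Map<String, int[]> fieldSizes() {
        Map<String, int[]> sizes = new LinkedHashMap<>();
        sizes.put("Java 6", new int[] {4, 4, 4, 4});  // value, offset, count, hash
        sizes.put("Java 7", new int[] {4, 4, 4});     // value, hash, hash32
        sizes.put("Java 8", new int[] {4, 4});        // value, hash
        sizes.put("Java 9", new int[] {4, 4, 1});     // value, hash, coder
        sizes.put("Java 13", new int[] {4, 4, 1, 1}); // value, hash, coder, hashIsZero
        return sizes;
    }

    /** Header plus fields, before alignment. */
    static int usedBytes(int[] fields) {
        int used = HEADER;
        for (int f : fields) used += f;
        return used;
    }

    /** Header plus fields, rounded up to the object alignment. */
    static int instanceSize(int[] fields) {
        return (usedBytes(fields) + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    public static void main(String[] args) {
        fieldSizes().forEach((version, fields) ->
            System.out.println(version + ": used=" + usedBytes(fields)
                + " instance=" + instanceSize(fields)
                + " lost=" + (instanceSize(fields) - usedBytes(fields))));
    }
}
```

From Java 7 through Java 13 the used bytes vary between 20 and 24, but the instance size stays at 24, which is why the extra boolean was free.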
When I ran the test in Java 15, I noticed a slight change in the object layout:
# openjdk version "15-ea" 2020-09-15 - build 15-ea+20-899
java.lang.String object internals:
 OFFSET  SIZE      TYPE DESCRIPTION
      0     4           (object header)
      4     4           (object header)
      8     4           (object header)
     12     4       int String.hash
     16     1      byte String.coder
     17     1   boolean String.hashIsZero
     18     2           (alignment/padding gap)
     20     4    byte[] String.value
Instance size: 24 bytes
Space losses: 2 bytes internal + 0 bytes external = 2 bytes total
After some searching, I found Shipilev's "Java Objects Inside Out" article that includes a link to an enhancement added to Java 15. Since Java 15, the field layout is a bit different and they can pack fields across class hierarchies. This has a whole bunch of implications for high performance Java. I would encourage you to read Shipilev's article.
Kind regards from Crete
Heinz
Our entire Java Specialists Training in One Huge Bundle