I can see that the Java community is growing a lot, but many Java developers focus only on Spring, microservices, and other fancy tools, forgetting to understand how the incredible Java Virtual Machine works.
I can understand that: with Spring Boot in a microservices architecture, it will be rare for us to need to tune an application's performance based on a JVM investigation (since each microservice is a small application and Spring Boot gives us development by convention).
But I’m the type of person who really wants to know how things work under the hood, and I think this knowledge is important for writing better, more performant code.
And as my mother always told me: it is better to know and not need than to need and not know.
So, let’s start by talking about how the JVM runs your Java code.
The Java compiler (javac) compiles your .java file into a .class file. This .class file is your bytecode.
This bytecode (.class files) can be packaged into a JAR or WAR file. When we run our app using the java command, the Java Virtual Machine (JVM) executes our bytecode, that is, our .class files.
This capability to interpret the bytecode at runtime is how Java achieves one of its benefits, WORA (Write Once, Run Anywhere). But the key part is not just running anywhere but running anywhere with consistent results!
So, that’s why Java became so famous: with this WORA concept enabled by the JVM, we can run any Java app on Mac, Windows, Linux, or any other Operating System (OS) for which a JVM exists.
So, the flow is: Java code is compiled into bytecode (.class files), and the JVM interprets this bytecode.
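As a minimal sketch of that flow (the file and class name Hello are just for illustration):

```java
// Hello.java
// Compile with `javac Hello.java` to produce Hello.class (the bytecode),
// then run with `java Hello` so the JVM can interpret that bytecode.
public class Hello {

    static String greeting() {
        return "Hello, JVM!";
    }

    public static void main(String[] args) {
        System.out.println(greeting());
    }
}
```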
But you may be wondering right now: why do we still need a JVM if nowadays we have Docker?
And the answer is simple: because portability is only one of the huge number of features the JVM provides, and because the JVM has sophisticated algorithms that make it more efficient than a traditional code interpreter would be.
For example, if you are writing code in a non-compiled (that is, runtime-interpreted) language like PHP* using an interpreter, each line of PHP code is only looked at, analyzed, and its execution strategy determined as it is needed. In a JVM it’s a much more sophisticated process.
*PHP 8 will also have a JIT Compiler (https://wiki.php.net/rfc/jit), but for now, this is not the default behavior.
So, the first benefit of using the Java Virtual Machine (JVM) is that the JVM does not run Java code directly, but rather Java bytecode. In fact, any language that can be compiled to JVM-compatible bytecode can run on the JVM: Scala, Groovy, Kotlin, Clojure, etc.
But Julio, you said that the JVM uses a much more complicated process. What does that process look like?
Initially, the JVM acts like any other interpreter, running each line of code as it is needed (like the PHP interpreter we mentioned before). By default, though, this makes code execution a bit slow, certainly compared to code written in a language like C, which is compiled to native machine code. With such a language, the code is compiled into a runnable format that the OS can understand directly, meaning the OS doesn’t need any additional software to interpret or run it.
This makes it quick to run compared to interpreted languages, but compiling natively means we lose the WORA property.
So, how does the JVM help us with this problem, the slower execution of interpreted languages compared to compiled ones?
That’s why the Java HotSpot VM has a feature called JIT (Just In Time) compilation!
The JVM monitors which blocks of code run most often: which methods, or parts of methods (loops in particular), are executed most frequently. The JVM can then decide, for example, that a particular method is used so much that execution would be sped up if that method were compiled to native machine code, and that is exactly what the JVM does.
At some point, part of our application is running in interpreted mode, as bytecode, while some of it is no longer bytecode but runs as compiled native machine code.
With this, the parts of the code that were compiled to native machine code run faster than the interpreted bytecode parts.
Just to be clear: when we talk about “native machine code” we mean executable code that can be understood directly by your Operating System. That means if you are running your app on Windows, the JVM will generate code that the Windows OS can understand natively, but if you are running on Linux, the generated native machine code will be different.
So, the Windows JVM can create native Windows code and the Linux JVM can create native Linux code, for example.
This process of native compilation is completely transparent to us, and that’s why people sometimes think they don’t need to know about it, but it has an important implication.
Your code will run faster the longer it is left to run.
That’s because the JVM can profile your code and work out which bits of it could be optimized by compiling them to native machine code.
So, a method that runs multiple times every minute is very likely to be Just In Time compiled quickly, but a method that runs only once a day might never be JIT-compiled.
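You can watch this happen with the HotSpot flag -XX:+PrintCompilation. Here is a small demo (the class and method names are just for illustration, and the exact number of calls needed before compilation kicks in depends on your JVM and its settings):

```java
// HotMethodDemo.java - run with: java -XX:+PrintCompilation HotMethodDemo
// After enough calls, hotSquare should appear in the compilation log,
// meaning the JVM decided it was hot and compiled it to native code.
public class HotMethodDemo {

    // A trivial method that we call often enough to become "hot".
    static long hotSquare(long n) {
        return n * n;
    }

    public static void main(String[] args) {
        long total = 0;
        for (int i = 0; i < 1_000_000; i++) {
            total += hotSquare(i % 100);
        }
        System.out.println("total = " + total);
    }
}
```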
Important note: the process of compiling bytecode to native machine code runs on a separate thread. The JVM is, of course, a multi-threaded application itself, so the threads responsible for interpreting and executing the bytecode aren’t affected by the thread doing the JIT compiling. In other words, the JIT compilation process doesn’t stop the application from running.
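We can actually observe this background compilation work through the standard java.lang.management API. A small sketch (the class name JitStats is just for illustration):

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

public class JitStats {
    public static void main(String[] args) {
        // The CompilationMXBean exposes information about the JIT compiler
        // that runs alongside our application threads.
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        System.out.println("JIT compiler: " + jit.getName());
        if (jit.isCompilationTimeMonitoringSupported()) {
            // Accumulated time the JVM has spent JIT-compiling, in milliseconds.
            System.out.println("Total compilation time: "
                    + jit.getTotalCompilationTime() + " ms");
        }
    }
}
```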
While the compilation is taking place, the JVM will continue to use the interpreted version, but once that compilation is complete and the native machine code version is available, the JVM will seamlessly switch to the JIT-compiled version instead of the bytecode. When this switch happens while a method is still executing (a long-running loop, for example), it is called on-stack replacement (OSR), and I’ll explain it in more detail later in this series.
But there is a case where you might need to look closer to check whether the JIT compilation process is decreasing your application’s performance.
If your application is using all of the available CPU resources, you could see a temporary reduction in performance while JIT compilation is running. You would only notice this in the most critical, processing-intensive applications, and even then it is worth taking a slight drop in processing power to get the benefit of the native code version of your methods.
Important note: if you want to measure the performance of two different methods (with different implementations) and determine how long they take to run, you will definitely get different results when your application first starts versus after it has been running for a short while. You need to think about whether you’re assessing the performance of the code before it has been natively compiled or after. To handle this, you can use JMH (Java Microbenchmark Harness) and its warmup configuration. Here you can find a great example of how to use JMH.
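As a rough illustration of the warmup effect (not a substitute for JMH, which handles this properly), a naive hand-rolled measurement like the one below usually shows later rounds running faster than the first, once the method has been JIT-compiled. The class name and iteration counts here are arbitrary:

```java
// WarmupDemo.java - naively times the same method several times.
// On a typical HotSpot JVM the later rounds are usually faster than the
// first, because by then sumOfSquares has been JIT-compiled. For real
// measurements, use JMH instead of hand-rolled timing like this.
public class WarmupDemo {

    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    public static void main(String[] args) {
        for (int round = 1; round <= 5; round++) {
            long start = System.nanoTime();
            long result = sumOfSquares(5_000_000);
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println("Round " + round + ": " + micros
                    + " us (result=" + result + ")");
        }
    }
}
```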
So, any method or code block can be compiled into native machine code.
To summarize, the JDK has two different compilers: javac and the JIT compiler (which is part of the JVM).
The javac compiler is responsible for compiling Java code into bytecode.
The JIT (Just In Time) compiler is composed of two “sub-compilers”: C1 (the Client Compiler) and C2 (the Server Compiler).
C1 takes the bytecode (already compiled by javac) and optimizes it across three tiers, profiling as it goes (basically enriching the code with metadata about how often it is executed). When this profiling information shows that a piece of code is heavily used, C2 compiles it to native code and places it in the CodeCache.
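You can get a glimpse of the CodeCache through the standard memory-pool API. A small sketch (the class name is arbitrary; note that on Java 9+ the code cache is segmented, so it shows up as several “CodeHeap” pools rather than the single “Code Cache” pool of Java 8):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

public class CodeCachePeek {
    public static void main(String[] args) {
        // The JIT-compiled native code lives in the code cache, which is
        // exposed as one or more non-heap memory pools.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // "Code Cache" on Java 8, "CodeHeap '...'" on Java 9+.
            if (name.contains("Code")) {
                System.out.println(name + ": used "
                        + pool.getUsage().getUsed() / 1024 + " KB");
            }
        }
    }
}
```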
Now that you understand how the JVM executes your Java code, and the benefits and drawbacks of this process, in the next parts of this series I’ll explain how we can track and observe JIT compilation.
Hope you liked it, see you!