Are We There Yet? When do we move to GraalVM?

With Spring Native coming to the fore, this is the right time to raise the question: is it time to move to GraalVM? Spoiler: it depends. Yes if you're building serverless; probably not if you're building almost anything else, with a few exceptions for some microservices.

Before I start, I want to qualify that I am talking about Native Image (SubstrateVM), which is what most people mean when they say GraalVM. That one distinctive feature has come to stand for a much larger and more ambitious project that includes amazing capabilities such as polyglot programming. GraalVM Native Image lets us compile our Java projects into native code. The build analyzes the application, removes unreachable code, and can significantly reduce the size and startup time of the resulting binary. I've seen 10-20x improvements in startup time, which is a lot. RAM usage is also sometimes lower, though usually not by the same margin.
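As a rough sketch of what that workflow looks like (assuming a GraalVM distribution with the native-image tool installed; myapp.jar is a placeholder for your own runnable jar):

    # Compile a runnable jar ahead of time into a standalone native executable.
    native-image -jar myapp.jar myapp

    # Run the resulting binary directly, with no JVM startup involved.
    ./myapp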

Perhaps ironically, GraalVM can be even more secure than a typical JVM because it lacks dynamic features. An object serialization attack, for example, would probably be much harder to pull off against a native image. Hopefully, I didn't just inadvertently challenge every security researcher to prove me wrong…

Downsides

I am a performance geek. With numbers like these, my instinct is to compile my apps to native code and be on my way to faster performance. But the situation is not so clear-cut. For long-running processes, a JVM may still perform better in production. Traditional deployments aren't particularly sensitive to startup time or even RAM. There are some microservices where both have an impact, but in most cases they aren't the only consideration.

First, the performance story for GraalVM is more nuanced than just startup time. Startup time and memory differences matter less for large applications that run for a long time. Peak runtime performance is a less clear-cut story and can often favor a regular JVM. It's a nuanced picture, though, because Native Image supports profile-guided optimization, where profiling data collected at runtime generates optimization hints for the ahead-of-time compiler, along with other interesting tools.
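For illustration, a profile-guided optimization pass looks roughly like this (a sketch; the --pgo flags have historically been limited to the Oracle/Enterprise GraalVM distribution, and myapp.jar is a placeholder):

    # 1. Build an instrumented image that records a runtime profile.
    native-image --pgo-instrument -jar myapp.jar myapp-instrumented

    # 2. Run representative workloads; the profile is written to default.iprof.
    ./myapp-instrumented

    # 3. Rebuild, feeding the collected profile back to the AOT compiler.
    native-image --pgo=default.iprof -jar myapp.jar myapp-optimized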

Some libraries are challenging to adapt to a native image (e.g., Freemarker). There are workarounds: you can run a tracing agent on a regular JVM to record the dynamic code paths, then package the native app using the configuration produced by that agent run. But it's a more involved process than just adding dependencies to Maven.
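A sketch of that tracing-agent workflow, assuming a Maven-style project layout (paths and jar names are placeholders):

    # Run the app on a regular JVM with the tracing agent attached and exercise
    # the dynamic code paths (reflection, resources, proxies, serialization).
    java -agentlib:native-image-agent=config-output-dir=src/main/resources/META-INF/native-image \
         -jar target/myapp.jar

    # The generated reflect-config.json, resource-config.json, etc. are picked up
    # from META-INF/native-image on the classpath during the next native build.
    native-image -jar target/myapp.jar myapp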

GraalVM compilation speed has improved significantly over the years, but it is still much slower than a typical Maven build. That's not the worst part, though. The worst part is the relatively lean observability story.

To be clear, tools like JFR and other Java capabilities are supported, and JMX is coming too. This means you can use jconsole and other wonderful JVM tools against native executables, which is superb. You can also debug native executables; IntelliJ IDEA has added support for debugging them directly. Another great improvement!
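As a sketch of how JFR looks with recent GraalVM releases (flags may vary by version; myapp.jar is a placeholder):

    # Build the native executable with JFR support baked in.
    native-image --enable-monitoring=jfr -jar myapp.jar myapp

    # Start a flight recording at launch and dump it to a file you can open
    # in JDK Mission Control.
    ./myapp -XX:StartFlightRecording="filename=recording.jfr"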

But some things are still not supported. The agent API is where many JVM-level extensions live. There is obviously work underway to bring support for these features, though perhaps not everything will make it into the tool. Still, it would be a huge boost.

So should we use it?

The last time I started a Spring Boot project was long before 3.0, and I picked the early preview of Spring Native as an option in the Initializr creation wizard. So I'm very much in favor of experimenting with GraalVM; I think it's a wonderful alternative that is becoming more compelling over time. In fact, it's probably already the best option for many CLI tools.

Whether we should use it in production is a different question. In some cases it just won't be practical: if you build GUI applications or rely on dynamic class loading, it's a non-starter. But again, Spring Boot 3 is very exciting, and I look forward to moving projects to it (and to JDK 17). When we migrate those projects, should I aim for "native first"?
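For reference, a Spring Boot 3 project generated with GraalVM native support can typically be built either as a native executable or as a container image; a sketch using the Native Build Tools Maven integration (profile and goal names as documented by Spring Boot):

    # Build a native executable with the GraalVM Native Build Tools plugin.
    mvn -Pnative native:compile

    # Or build an OCI container image containing the native executable, via buildpacks.
    mvn -Pnative spring-boot:build-image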

As it stands now, I would like to have a GraalVM native image build target in CI. However, we will probably still deploy to the cloud on a standard JVM. The main reasons are all of the above, but mostly observability and familiarity. When we build for scale, the performance of an individual node matters, but the bigger picture and the ability to diagnose and fix issues at scale matter even more.

Imagine following a trace across multiple servers, looking at the timing, and tracking a problem to its root cause. This is at the core of debugging production at scale. Traces help us understand the root cause of a performance issue or failure. The nice thing about traces is that they are essentially free: we don't need to write much code for them, because our code is instrumented automatically to include that functionality.

The rising star in the world of tracing is OpenTelemetry, and its Java instrumentation relies on the agent API. It isn't unique in that respect; agents are prevalent in the industry. Without the agent API, many of the features higher-level systems rely on (tracing, developer observability, error handling, APM, etc.) are effectively gone.
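This is what makes the agent approach so convenient on a regular JVM; attaching the OpenTelemetry Java agent is roughly a one-liner (a sketch; the jar path, service name, and endpoint are placeholder values):

    # Auto-instrument an application on a standard JVM with the OpenTelemetry agent.
    # No code changes needed: the agent rewrites bytecode at class-load time.
    OTEL_SERVICE_NAME=myapp \
    OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 \
    java -javaagent:opentelemetry-javaagent.jar -jar myapp.jar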

When is it OK?

Serverless is the ideal case. While serverless would also benefit from developer observability tooling, such agent-based extensions are already problematic there; e.g., Lambda fails with some agent configurations. Note that there are tracing solutions for AWS, so that aspect can be addressed. Using a native image for serverless saves money and speeds up results, with practically nothing to lose. It's a straightforward decision.

In other cases, I try to keep my finger on the pulse, as these things tend to change overnight. That's why I recommend experimenting with GraalVM now. It will put you in a good position for a future where switching VMs makes sense.

The reason we're still deploying on the standard JVM is that our deployment doesn't see any significant benefit from GraalVM at this time. We don't have enough scale, or enough spin-up/spin-down churn, to make the transition worthwhile.

Eventually

In one Twitter discussion, I predicted that it would take 10 years for 50% of Java developers to move to GraalVM, unless Project Leyden suddenly changes the dynamics and makes the standard JVM more efficient. Java developers tend to move slowly; I think that's a feature, not a bug.

At version 22, GraalVM is in a completely different state than it was only three years ago. Its tooling and third-party support are both finally picking up, and it's ready to bridge the gap. I think it already has for CLI tools. Even if you don't adopt it now, you should give it a try, as there is a lot to learn from working with it.

One of the biggest benefits it brings is the renewed focus on the reflective code we have in all of our applications. Reducing that code will improve the quality of our applications, replacing it with imperative logic that is easier to debug. It will make failures clearer and probably even improve performance on regular JVMs. The work that vendors need to do to support GraalVM is great for all of us.

I've also barely touched on the polyglot aspects of GraalVM, which are some of its most exciting features. Integrating Java and Python code into a single native binary is a powerful proposition.
