2020-12-21

Apache Camel 3.7 (LTS) Released - The fastest Camel ever

Apache Camel 3.7 was released a few days ago.

This is an LTS release, which means we will provide patch releases for one year. The next planned LTS release is 3.10, scheduled for around summer 2021.



So what's in this release

This release introduces a set of new features and noticeable improvements that we will cover in this blog post.


Pre compiled languages

We continued our effort to make Camel faster and smaller. This time we focused on the built-in Simple scripting language.

First we added the jOOR language. jOOR is a small Java tool for performing runtime compilation of Java source code in-memory. It has some limitations but generally works well for small scripting code (requires Java 11 onwards).
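To give a feel for what runtime compilation of Java source code in-memory involves, here is a minimal sketch. Note that it uses the JDK's own javax.tools compiler API rather than jOOR, and that the InMemoryCompiler class is a made-up name for this illustration only; jOOR hides this kind of plumbing behind a much smaller API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.URI;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import javax.tools.FileObject;
import javax.tools.ForwardingJavaFileManager;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileManager;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.StandardJavaFileManager;
import javax.tools.ToolProvider;

// Hypothetical helper: compiles one class from a String and loads it, all in memory.
final class InMemoryCompiler {

    static Class<?> compile(final String className, final String source) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No Java compiler available (running on a JRE?)");
        }
        final Map<String, ByteArrayOutputStream> classBytes = new HashMap<>();

        // The source lives in a String instead of a file on disk.
        JavaFileObject sourceFile = new SimpleJavaFileObject(
                URI.create("string:///" + className.replace('.', '/') + ".java"),
                JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return source;
            }
        };

        try (JavaFileManager files = new ForwardingJavaFileManager<StandardJavaFileManager>(
                compiler.getStandardFileManager(null, null, null)) {
            @Override
            public JavaFileObject getJavaFileForOutput(JavaFileManager.Location location,
                    final String name, JavaFileObject.Kind kind, FileObject sibling) {
                // Capture the generated bytecode in memory instead of writing .class files.
                return new SimpleJavaFileObject(
                        URI.create("mem:///" + name.replace('.', '/') + kind.extension), kind) {
                    @Override
                    public OutputStream openOutputStream() {
                        ByteArrayOutputStream out = new ByteArrayOutputStream();
                        classBytes.put(name, out);
                        return out;
                    }
                };
            }
        }) {
            boolean ok = compiler.getTask(null, files, null, null, null,
                    Collections.singletonList(sourceFile)).call();
            if (!ok) {
                throw new IllegalStateException("Compilation failed for " + className);
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }

        // Define the compiled classes from the captured bytecode.
        ClassLoader loader = new ClassLoader(InMemoryCompiler.class.getClassLoader()) {
            @Override
            protected Class<?> findClass(String name) throws ClassNotFoundException {
                ByteArrayOutputStream out = classBytes.get(name);
                if (out == null) {
                    throw new ClassNotFoundException(name);
                }
                byte[] bytes = out.toByteArray();
                return defineClass(name, bytes, 0, bytes.length);
            }
        };
        try {
            return loader.loadClass(className);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling InMemoryCompiler.compile with a class name and its source returns a loaded Class that can be instantiated via reflection. The compilation step is the expensive part, which is why it is done only once.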

Then we worked on compiled simple.


Compiled Simple

The csimple language is parsed into regular Java source code and compiled together with all the other source code, or compiled once during bootstrap via jOOR.

In a nutshell, compiled simple language excels over simple language when using dynamic Object-Graph Navigation Language (OGNL) method calls.

For example, profiling the following simple expression:

    <simple>${exchangeProperty.user.getName} != null && ${exchangeProperty.user.getAge} > 11</simple>

against the equivalent csimple expression:

    <csimple>${exchangeProperty.user} != null &&
             ${exchangeProperty.user.getName()} != null &&
             ${exchangeProperty.user.getAge()} > 11</csimple>

yields a dramatic 100-fold reduction in CPU usage, as shown in the screenshot:


For more information about the compiled simple language and a further breakdown of the performance improvements, read my recent blog post introducing the csimple language.

We have provided two small examples that demonstrate csimple as pre compiled and as runtime compiled during bootstrap.

You can find these two examples in the official Apache Camel examples repository at:


Optimized core

We have continued the effort to optimize camel-core. This time we made a number of smaller improvements in various areas, such as replacing regular expressions with plain Java code where regular expressions were overkill (regular expressions take up sizeable heap memory).
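As an illustration of this kind of change, the sketch below checks whether a string consists only of digits, first with a regular expression and then with a plain loop. It is a made-up example (the DigitCheck class is not taken from camel-core) and only shows the general idea.

```java
import java.util.regex.Pattern;

// Hypothetical example, not actual camel-core code.
final class DigitCheck {

    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Regular expression version: concise, but it keeps a compiled Pattern
    // on the heap and allocates a Matcher on every call.
    static boolean isDigitsRegex(String text) {
        return DIGITS.matcher(text).matches();
    }

    // Plain Java version: a simple loop with no Pattern or Matcher objects.
    static boolean isDigitsPlain(String text) {
        if (text.isEmpty()) {
            return false;
        }
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (ch < '0' || ch > '9') {
                return false;
            }
        }
        return true;
    }
}
```

Both methods give the same answer for the same input; the plain version simply avoids the regular expression machinery for a check this small.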

The direct component has been enhanced to avoid synchronisation when the producer calls the consumer.

We also enhanced the internals of the event notifier separating startup/stop events from routing events, gaining a small performance improvement during routing.

We also reduced the number of objects used during routing which reduced the memory usage.

Another significant win was to bulk together all the type converters from the core into two classes (source generated). This avoids registering each type converter individually in the type converter registry, which saves 20kb of heap memory.
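The difference between the two approaches can be sketched as follows. This is a simplified illustration under our own assumptions: PerConverterRegistry and BulkConverter are hypothetical names, and the real generated classes in camel-core cover far more types.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Style 1: every converter is registered individually, which costs one map
// entry and one converter object per (from, to) pair.
final class PerConverterRegistry {

    private final Map<String, Function<Object, Object>> converters = new HashMap<>();

    void register(Class<?> from, Class<?> to, Function<Object, Object> converter) {
        converters.put(from.getName() + "->" + to.getName(), converter);
    }

    Object convert(Object value, Class<?> to) {
        Function<Object, Object> converter =
                converters.get(value.getClass().getName() + "->" + to.getName());
        return converter == null ? null : converter.apply(value);
    }
}

// Style 2: one (ideally source-generated) class handles many conversions
// with plain branching, so nothing is registered per converter.
final class BulkConverter {

    static Object convert(Object value, Class<?> to) {
        if (to == Integer.class) {
            if (value instanceof String) {
                return Integer.valueOf((String) value);
            }
            if (value instanceof Long) {
                return ((Long) value).intValue();
            }
        } else if (to == Long.class) {
            if (value instanceof String) {
                return Long.valueOf((String) value);
            }
            if (value instanceof Integer) {
                return ((Integer) value).longValue();
            }
        } else if (to == String.class) {
            if (value instanceof Integer || value instanceof Long) {
                return value.toString();
            }
        }
        return null; // no suitable conversion known
    }
}
```

With the bulk style the registry only needs to know about a couple of classes, instead of holding an entry for every single conversion.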

If you are curious about how we did these optimisations, and want to see some performance numbers, read another of my recent blog posts.


Optimized components startup

The Camel core has been optimized in Camel 3 to be small, slim, and fast on startup. This benefits Camel Quarkus, which can do build time optimizations that take advantage of the optimized Camel core.

We have continued this effort in the Camel components: wherever possible, initialization is moved ahead to an earlier phase during startup, which allows enhanced build time optimizations. As there are a lot of Camel components, this work will progress over the next couple of Camel releases.


Separating Model and EIP processors

In this release we untangled model, reifier and processors.

This is a great achievement which allows us to take this even further with design time vs runtime.

    Model    ->    Reifier   ->   Processor
    (startup)      (startup)      (runtime)

The model is the structure of the DSL, which you can think of as the _design time_ specification of your Camel routes. The model is processed once during startup, and the reifier (a factory) creates the runtime EIP processors from it. Once this is done, the model is essentially no longer needed.

By separating these into different JARs (camel-core-model, camel-core-reifier, camel-core-processor), we ensure they stay separated, which allows us to better apply build time optimizations and dead code elimination via Quarkus and/or GraalVM.
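The separation can be pictured with a small sketch. All names here (StepDefinition, StepReifier, Pipeline and so on) are hypothetical and much simpler than the real Camel classes; the point is only that the runtime processors keep no reference back to the model that described them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

// Design-time model: plain descriptions of what each step should do.
interface StepDefinition { }

final class AppendDefinition implements StepDefinition {
    final String suffix;

    AppendDefinition(String suffix) {
        this.suffix = suffix;
    }
}

final class UpperCaseDefinition implements StepDefinition { }

// Reifier: a factory that turns each model object into a runtime processor.
final class StepReifier {

    static UnaryOperator<String> reify(StepDefinition definition) {
        if (definition instanceof AppendDefinition) {
            // Copy what is needed so the processor holds no reference to the model.
            final String suffix = ((AppendDefinition) definition).suffix;
            return body -> body + suffix;
        }
        if (definition instanceof UpperCaseDefinition) {
            return body -> body.toUpperCase(Locale.ROOT);
        }
        throw new IllegalArgumentException("Unknown definition: " + definition);
    }

    static List<UnaryOperator<String>> reifyAll(List<StepDefinition> model) {
        List<UnaryOperator<String>> processors = new ArrayList<>();
        for (StepDefinition definition : model) {
            processors.add(reify(definition));
        }
        return processors;
    }
}

// Runtime: only the processors are needed to handle a message body.
final class Pipeline {

    static String process(List<UnaryOperator<String>> processors, String body) {
        String result = body;
        for (UnaryOperator<String> processor : processors) {
            result = processor.apply(result);
        }
        return result;
    }
}
```

After reifyAll has run, the list of definitions can be cleared or discarded and Pipeline.process still works, since it only depends on the processors.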

This brings us to lightweight mode.


Lightweight mode

We started experimenting with a lightweight mode earlier. Separating the model from the processors was a great step forward, and it allowed us to make lightweight mode available for end users to turn on.

In lightweight mode, Camel removes all references to the model after startup, which allows the JVM to garbage collect all model objects and unload classes, freeing up memory.

After this it's no longer possible to dynamically add new Camel routes. The lightweight mode is intended for microservice/serverless architectures, with a closed world assumption.
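The trade-off can be sketched with a toy context. MiniContext and its step names are hypothetical and not the CamelContext API; the sketch only shows the idea that dropping the model after startup frees it for garbage collection, at the price of rejecting later additions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

// Hypothetical mini "context" that can drop its design-time model after startup.
final class MiniContext {

    private List<String> model = new ArrayList<>(); // design-time step names
    private final List<UnaryOperator<String>> processors = new ArrayList<>();
    private final boolean lightweight;
    private boolean started;

    MiniContext(boolean lightweight) {
        this.lightweight = lightweight;
    }

    void addStep(String stepName) {
        if (model == null) {
            throw new IllegalStateException("Lightweight mode: the model was cleared after startup");
        }
        model.add(stepName);
        if (started) {
            // In normal mode steps can still be added dynamically after startup.
            processors.add(toProcessor(stepName));
        }
    }

    void start() {
        for (String stepName : model) {
            processors.add(toProcessor(stepName));
        }
        started = true;
        if (lightweight) {
            model = null; // closed world: nothing refers to the model anymore
        }
    }

    String process(String body) {
        if (!started) {
            throw new IllegalStateException("Context not started");
        }
        String result = body;
        for (UnaryOperator<String> processor : processors) {
            result = processor.apply(result);
        }
        return result;
    }

    private static UnaryOperator<String> toProcessor(String stepName) {
        switch (stepName) {
            case "trim":
                return String::trim;
            case "upper":
                return text -> text.toUpperCase(Locale.ROOT);
            default:
                throw new IllegalArgumentException("Unknown step: " + stepName);
        }
    }
}
```

A context created with new MiniContext(true) keeps processing messages after start(), but any further addStep call fails, which mirrors the closed world assumption described above.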


Autowiring components

Camel components are now capable of autowiring by type. For example, the AWS SQS component can automatically look up whether the registry contains a single instance of SqsClient and, if so, pre-configure itself with it.

We have marked in the Camel documentation which component options support this, by showing Autowired in bold in the description.
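The lookup rule (use the bean only when exactly one instance of the type exists) can be sketched like this. MiniRegistry, FakeSqsClient and FakeSqsComponent are hypothetical stand-ins for illustration and are not the Camel or AWS SDK classes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a client class such as SqsClient.
final class FakeSqsClient { }

// Hypothetical, minimal registry of named beans.
final class MiniRegistry {

    private final Map<String, Object> beans = new LinkedHashMap<>();

    void bind(String name, Object bean) {
        beans.put(name, bean);
    }

    // Returns the bean only when exactly one instance of the type is registered.
    <T> T findSingleByType(Class<T> type) {
        List<T> found = new ArrayList<>();
        for (Object bean : beans.values()) {
            if (type.isInstance(bean)) {
                found.add(type.cast(bean));
            }
        }
        return found.size() == 1 ? found.get(0) : null;
    }
}

// Hypothetical component with one autowired option.
final class FakeSqsComponent {

    FakeSqsClient client;

    void autowire(MiniRegistry registry) {
        if (client == null) { // an explicitly configured client wins
            client = registry.findSingleByType(FakeSqsClient.class);
        }
    }
}
```

If two clients are bound in the registry the lookup is ambiguous, so in this sketch the option is left unset and has to be configured explicitly.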


Salesforce fixes

Our recent Camel committer Jeremy Ross did great work to improve and fix bugs in the camel-salesforce component. We expect more to come from him.


VertX Kafka Component

A new Kafka component has been developed that uses the Vert.X Kafka Java Client which allows us to use all of its features, and also its robustness and stability.

The camel-vertx-kafka component is intended to be (more) feature complete with the existing camel-kafka component. We will continue this work for the next couple of Camel releases.


DataSonnet

The new camel-datasonnet component is used for data transformation using DataSonnet.

DataSonnet is an open source JSON-centric, template-based data transformation standard built to rival proprietary options available in the market.


Spring Boot

We have upgraded to Spring Boot 2.4.


New components

This release has 7 new components, data formats or languages:

  • AtlasMap: Transforms the message using an [AtlasMap](https://www.atlasmap.io/) transformation
  • Kubernetes Custom Resources: Perform operations on Kubernetes Custom Resources and get notified on Deployment changes
  • Vert.X Kafka: Send and receive messages to/from an Apache Kafka broker using vert.x Kafka client
  • JSON JSON-B: Marshal POJOs to JSON and back using JSON-B
  • CSimple: Evaluate a compiled simple expression
  • DataSonnet: To use DataSonnet scripts in Camel expressions or predicates
  • jOOR: Evaluate a jOOR (Java compiled once at runtime) expression language


Upgrading

Make sure to read the upgrade guide if you are upgrading to this release from a previous Camel version.


More details

The previous LTS release was Camel 3.4. We also have blog posts on what's new in Camel 3.5 and Camel 3.6, which you may want to read to cover everything new between the two LTS releases.


Release Notes

You can find more information about this release in the release notes, with a list of JIRA tickets resolved in the release.



2020-12-09

Apache Camel 3.7 - Compiled Simple Language (Part 6)

I have previously blogged about the optimizations we are doing in the Apache Camel core. The first 3 blogs (part1, part2, part3) were a while back leading up to the 3.4 LTS release.

We have since done more work (part4, part5, and this part 6), which will be included in the upcoming Camel 3.7 LTS release (to be released this month).

This time we worked on a new variation of the Camel simple language, called csimple.


Compiled Simple (csimple)

The csimple language is parsed into regular Java source code and compiled together with all the other source code, or compiled once during bootstrap via the camel-csimple-joor module.

To better understand why we created csimple, you can read about the difference between simple and csimple in the section further below. But first, let me show you some numbers.

I profiled a Camel application that processes 1 million messages, which are triggered in-memory via a timer, and calls a bean to select a random User object that contains user information. The message is then multicast and processed concurrently by 10 threads, which do some content based routing based on information on the User object.

The Camel route is defined in a Spring XML file, together with a few Java beans that represent the User object and select a random user.

The application is profiled running with the simple and csimple languages until all messages have been processed.

The main focus is the difference between the following simple and csimple expressions (XML DSL):

<simple>${exchangeProperty.user.getName} != null &&
        ${exchangeProperty.user.getAge} > 11
</simple>

<csimple>${exchangeProperty.user} != null &&      
         ${exchangeProperty.user.getName()} != null &&
         ${exchangeProperty.user.getAge()} > 11
</csimple>

At first glance they may look identical, but the csimple expression has an additional not-null check on whether the user object exists. You may think that the csimple expression contains type information, but it actually does not. We have "cheated" by using an alias (a feature in csimple), which can be configured in the camel-csimple.properties file as shown:

# import our user so csimple language can use the shorthand classname
import org.example.User;

# alias to make it shorter to type this
exchangeProperty.user = exchangePropertyAs('user', User.class)

Here we can see that the alias refers to the exchangePropertyAs function, which takes the property name as its first input and the class name as its second. And because we have a Java import statement at the top of the properties file, we can type the local class name User.class instead of org.example.User.
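One way to picture what the alias does is as a plain text substitution applied to the script before it is turned into Java source. The sketch below is an illustration under that assumption only; AliasExpander is a made-up name, and the real csimple parser may handle aliases differently.

```java
import java.util.Map;

// Hypothetical helper that expands aliases by plain text substitution.
final class AliasExpander {

    static String expand(String script, Map<String, String> aliases) {
        String result = script;
        for (Map.Entry<String, String> alias : aliases.entrySet()) {
            // Replace every occurrence of the alias key with its longer form.
            result = result.replace(alias.getKey(), alias.getValue());
        }
        return result;
    }
}
```

With the alias from the properties file above, the script ${exchangeProperty.user.getAge()} > 11 would expand so that the call becomes exchangePropertyAs('user', User.class).getAge(), which carries the type the Java compiler needs.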

The csimple script gets parsed into the following Java source code, which is then compiled by the regular Java compiler together with the rest of the application source code:

    @Override
    public Object evaluate(CamelContext context, Exchange exchange, Message message, Object body) throws Exception {
        return isNotEqualTo(exchange, exchangePropertyAs(exchange, "user", User.class), null)
                && isNotEqualTo(exchange, exchangePropertyAs(exchange, "user", User.class).getName(), null)
                && isGreaterThan(exchange, exchangePropertyAs(exchange, "user", User.class).getAge(), 11);
    }

Performance numbers

Okay, let's get back to the performance numbers. The raw data is presented below as a screenshot and a table.




CPU usage

simple            814815 millis
csimple             7854 millis


Memory usage

simple       123 objects      5328 bytes
bean        3171 objects    177680 bytes

csimple        3 objects       792 bytes


As we can see, the CPU usage is dramatically reduced, by a factor of about 100 (814815 / 7854 ≈ 104).

The memory usage is also reduced. The simple language uses OGNL expressions with the bean language, so we should calculate their combined usage, which is roughly 3294 objects taking up about 183kb of heap memory (the bean language has an introspection cache and other things). The csimple language is tiny, with just 3 objects taking up 792 bytes of heap memory. The memory usage is dramatically reduced, by a factor of 231.

The memory screenshot includes the simple language for both runs. The reason is that there are some basic simple expressions in the route that were not changed to csimple; only the script with the most complex expression, using OGNL on the User object, was changed.

So altogether this is a very dramatic reduction in both CPU and memory usage. How can this be?

Very low footprint, why?

The low footprint is mainly due to two reasons:

1) The script is compiled as Java code by the Java compiler, either at build time or during bootstrap.

2) The script does not use the bean language / bean introspection with reflection for OGNL paths. However, this requires the script to include type information, so the Java compiler knows the types and can compile the OGNL paths as regular Java method calls. This is the main driver of the reduced footprint on both memory and CPU.

Basic scripts such as ${header.zipCode} != null would have a similar footprint in simple and csimple. However, pre-compiled csimple would have a lower footprint, because the script is parsed ahead of time; otherwise the parsing has to happen during bootstrap to generate the Java source code for the Java compiler to compile in-memory, which impacts startup performance.

Are there any limitations?

Yes, the csimple language is not a 100% replacement for simple (we will continue to improve feature parity). In the Camel 3.7 release csimple is in preview mode and has the following limitations:

- nested functions are currently not supported
- the null safe operator is not supported

And for OGNL paths, as previously mentioned, csimple requires type safety: the script must include the types of the objects.

Difference between simple and csimple

The simple language is a dynamic expression language which is runtime parsed into a set of Camel Expressions or Predicates.

The csimple language is parsed into regular Java source code and compiled together with all the other source code, or compiled once during bootstrap via the camel-csimple-joor module.

The simple language is generally very lightweight and fast. However, for some use cases with dynamic method calls via OGNL paths, the simple language performs runtime introspection and reflection calls. This has a performance overhead, and was one of the reasons why csimple was created.

The csimple language must be type safe, and method calls via OGNL paths require the type to be known during parsing. This means that for csimple expressions you need to provide the class type in the script, whereas simple introspects this at runtime.

In other words, the simple language uses duck typing (if it looks like a duck and quacks like a duck, then it is a duck), and csimple uses Java types (type safety). If there is a type error, simple reports it at runtime, whereas with csimple there is a Java compilation error.
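The two styles can be contrasted in plain Java. The sketch below is a simplified illustration, not the actual Camel implementation: callByName stands in for the reflection-based lookup that simple performs for OGNL paths, and matchesTyped stands in for the kind of code csimple compiles to. The User and OgnlStyles classes are made up for this example.

```java
import java.lang.reflect.Method;

// Example bean, similar in spirit to the User object used in the profiling.
final class User {
    private final String name;
    private final int age;

    User(String name, int age) {
        this.name = name;
        this.age = age;
    }

    public String getName() {
        return name;
    }

    public int getAge() {
        return age;
    }
}

final class OgnlStyles {

    // Duck typing: the method is looked up by name at runtime via reflection.
    static Object callByName(Object target, String methodName) {
        try {
            Method method = target.getClass().getMethod(methodName);
            return method.invoke(target);
        } catch (ReflectiveOperationException e) {
            // A misspelled method name only shows up here, at runtime.
            throw new IllegalStateException("Cannot call " + methodName, e);
        }
    }

    static boolean matchesDynamic(Object user) {
        return callByName(user, "getName") != null
                && ((Integer) callByName(user, "getAge")) > 11;
    }

    // Type safe: regular method calls that the Java compiler checks.
    // A misspelled method name would be a compilation error.
    static boolean matchesTyped(Object property) {
        User user = (User) property;
        return user != null && user.getName() != null && user.getAge() > 11;
    }
}
```

Both methods return the same result for the same User, but the typed version needs no method lookup or reflective invocation per call.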


Any examples for me to try?

We have provided two small examples that demonstrate csimple as pre compiled and as runtime compiled during bootstrap. You can find these two examples in the official Apache Camel examples repository at:


What's Next

We want to implement the missing features: nested functions and the null safe operator. We are also working on camel-quarkus to make csimple optimized for Quarkus and GraalVM. This effort has already started, and Camel 3.7 will come with the first work in this area.

We also want to speed up the runtime compilation by being able to do batch compilation. Currently each csimple script is compiled sequentially.

And we want to take a look at whether we can make runtime compilation work better with Spring Boot in its fat jar mode.

But first, enjoy csimple in the upcoming Camel 3.7 LTS release. As always, we want your feedback and love contributions.