2020-12-09

Apache Camel 3.7 - Compiled Simple Language (Part 6)

I have previously blogged about the optimizations we are doing in the Apache Camel core. The first 3 blogs (part1, part2, part3) were a while back leading up to the 3.4 LTS release.

We have since done more work (part4, part5), and together with this post (part 6) it will be included in the next Camel 3.7 LTS release (to be released this month).

This time we worked on a new variation of the Camel simple language, called csimple.


Compiled Simple (csimple)

The csimple language is parsed into regular Java source code and compiled together with all the other source code, or compiled once during bootstrap via the camel-csimple-joor module.

To better understand why we created csimple, you can read about the difference between simple and csimple in the section further below. But first let me show you some numbers.

I profiled a Camel application that processes 1 million messages, which are triggered in-memory via a timer, and calls a bean to select a random User object that contains user information. The message is then multicast and processed concurrently by 10 threads, which do some content-based routing based on information in the User object.

The Camel route is defined in a Spring XML file, along with a few Java beans that represent the User object and the bean that selects a random user.

The application is profiled running with the simple and csimple languages until all messages have been processed.

The main focus is the difference between the following simple and csimple expressions (XML DSL):

<simple>${exchangeProperty.user.getName} != null &&
        ${exchangeProperty.user.getAge} > 11
</simple>

<csimple>${exchangeProperty.user} != null &&      
         ${exchangeProperty.user.getName()} != null &&
         ${exchangeProperty.user.getAge()} > 11
</csimple>

At first glance they may look identical, but the csimple expression has an additional null check on whether the user object exists. You may think that the csimple language contains type information, but it actually does not. We have "cheated" by using an alias (a feature in csimple), which can be configured in the camel-csimple.properties file as shown:

# import our user so csimple language can use the shorthand classname
import org.example.User;

# alias to make it shorter to type this
exchangeProperty.user = exchangePropertyAs('user', User.class)

Here we can see that the alias refers to the exchangePropertyAs function, which takes the property name as its first input and the class name as its second. And because we have a Java import statement at the top of the properties file, we can type the local class name User.class instead of org.example.User.
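
To illustrate why a typed lookup matters, here is a minimal sketch in plain Java. It does not use the Camel API: the Map-based exchangePropertyAs helper and the nested User class are hypothetical stand-ins invented for this illustration. The point is only that a lookup which takes a Class argument returns a typed value, so the Java compiler can bind getName() and getAge() as regular method calls.

```java
import java.util.HashMap;
import java.util.Map;

class TypedPropertyLookup {

    // Hypothetical stand-in for the User bean of the example application.
    static class User {
        private final String name;
        private final int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() {
            return name;
        }

        int getAge() {
            return age;
        }
    }

    // Simplified stand-in for a typed property lookup (not the Camel implementation):
    // fetch the value by name and cast it to the requested type.
    static <T> T exchangePropertyAs(Map<String, Object> properties, String name, Class<T> type) {
        return type.cast(properties.get(name));
    }

    public static void main(String[] args) {
        Map<String, Object> properties = new HashMap<>();
        properties.put("user", new User("Alice", 42));

        // The result is a User, not an Object, so the calls below are checked at compile time.
        User user = exchangePropertyAs(properties, "user", User.class);
        boolean matches = user != null && user.getName() != null && user.getAge() > 11;
        System.out.println(user.getName() + " matches: " + matches);
    }
}
```

Without the Class argument the lookup could only return Object, and the method calls would have to be resolved by reflection at runtime.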

The csimple script gets parsed into the following Java source code, which is then compiled by the regular Java compiler together with the rest of the application source code:

    @Override
    public Object evaluate(CamelContext context, Exchange exchange, Message message, Object body) throws Exception {
        return isNotEqualTo(exchange, exchangePropertyAs(exchange, "user", User.class), null)
                && isNotEqualTo(exchange, exchangePropertyAs(exchange, "user", User.class).getName(), null)
                && isGreaterThan(exchange, exchangePropertyAs(exchange, "user", User.class).getAge(), 11);
    }

Performance numbers

Okay, let's get back to the performance numbers. The raw data is presented below as a screenshot and a table.




CPU usage

simple            814815 millis
csimple             7854 millis


Memory usage

simple               123 objects         5328 bytes
bean                3171 objects       177680 bytes

csimple               3 objects           792 bytes


As we can see, the CPU usage is dramatically reduced, by a factor of about 100 (one hundred).

The memory usage is also reduced. The simple language uses OGNL expressions with the bean language, and hence we should calculate the combined usage, which is then 3294 objects taking up about 183 kB of heap memory (the bean language has an introspection cache and other things). The csimple language is very tiny, with just 3 objects taking up 792 bytes of heap memory. The memory usage is dramatically reduced, by a factor of 231.
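
The two factors can be checked directly from the raw numbers in the tables above. The small program below is only a sanity check of that arithmetic; the class, constant, and method names are made up for this illustration.

```java
class FootprintRatios {

    // Raw numbers copied from the CPU and memory tables in this post.
    static final long SIMPLE_CPU_MILLIS = 814_815;
    static final long CSIMPLE_CPU_MILLIS = 7_854;
    static final long SIMPLE_BYTES = 5_328;
    static final long BEAN_BYTES = 177_680;
    static final long CSIMPLE_BYTES = 792;

    // How many times more CPU time the simple run used.
    static double cpuFactor() {
        return (double) SIMPLE_CPU_MILLIS / CSIMPLE_CPU_MILLIS;
    }

    // simple relies on the bean language for OGNL, so both are counted together.
    static double memoryFactor() {
        return (double) (SIMPLE_BYTES + BEAN_BYTES) / CSIMPLE_BYTES;
    }

    public static void main(String[] args) {
        System.out.println("CPU factor: " + cpuFactor());
        System.out.println("Combined simple + bean bytes: " + (SIMPLE_BYTES + BEAN_BYTES));
        System.out.println("Memory factor: " + memoryFactor());
    }
}
```

The CPU ratio works out to about 104 and the memory ratio to about 231, which matches the rounded factors quoted above.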

The memory screenshot includes the simple language for both runs. The reason is that there are some basic simple expressions in the route which were not changed to csimple; only the script that performed the most complex expression, with OGNL on the User object, was changed.

So all together there is a very dramatic reduction in both CPU and memory usage. How can this be?

Very low footprint, why?

The low footprint is mainly due to two reasons:

1)
The script is compiled as Java code by the Java compiler, either at build time or during bootstrap.

2)
The script does not use the bean language / bean introspection with reflection for OGNL paths. However, this requires the script to include type information, so that the Java compiler knows the types and can compile the OGNL paths as regular Java method calls. This is the main driver of the reduced footprint in both memory and CPU.

Basic scripts such as ${header.zipCode} != null would have a similar footprint with either language. However, pre-compiled csimple would have a lower footprint, because the script is parsed ahead of time. Otherwise the parsing would have to happen during bootstrap, to generate the Java source code for the Java compiler to compile in memory, which impacts startup performance.

Are there any limitations?

Yes, the csimple language is not a 100% replacement for simple (we will continue to improve the feature parity). In the Camel 3.7 release csimple is in preview mode and has the following limitations:

- nested functions are currently not supported
- the null safe operator is not supported

And for OGNL paths, as previously mentioned, csimple requires the script to be type safe, by including the types of the objects.

Difference between simple and csimple

The simple language is a dynamic expression language which is runtime parsed into a set of Camel Expressions or Predicates.

The csimple language is parsed into regular Java source code and compiled together with all the other source code, or compiled once during bootstrap via the camel-csimple-joor module.

The simple language is generally very lightweight and fast. However, for some use cases with dynamic method calls via OGNL paths, the simple language does runtime introspection and reflection calls. This has a performance overhead, and was one of the reasons why csimple was created.

The csimple language is required to be type safe, and method calls via OGNL paths need to know the type during parsing. This means that for csimple expressions you need to provide the class type in the script, whereas simple introspects this at runtime.

In other words, the simple language uses duck typing (if it looks like a duck, and quacks like a duck, then it is a duck) and csimple uses Java types (type safety). If there is a type error, then simple will report it at runtime, whereas with csimple there will be a Java compilation error.
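
The contrast can be sketched in plain Java, without any Camel classes. This is only an analogy and not how either language is implemented: invokeByName, nameOf, and the nested User class are hypothetical names for this illustration. The reflective lookup stands in for the runtime introspection that simple performs, and the direct call stands in for what a compiled csimple script ends up as.

```java
import java.lang.reflect.Method;

class DuckTypingVersusStaticTyping {

    // Hypothetical stand-in for the User bean of the example application.
    static class User {
        private final String name;

        User(String name) {
            this.name = name;
        }

        public String getName() {
            return name;
        }
    }

    // Runtime style: the method is looked up by name on whatever object arrives,
    // so a wrong method name is only discovered when the code runs.
    static Object invokeByName(Object target, String methodName) throws ReflectiveOperationException {
        Method method = target.getClass().getMethod(methodName);
        return method.invoke(target);
    }

    // Compiled style: the type is known, so this is a regular method call
    // that the Java compiler verifies.
    static String nameOf(User user) {
        return user.getName();
    }

    public static void main(String[] args) throws Exception {
        User user = new User("Alice");
        System.out.println(invokeByName(user, "getName"));
        System.out.println(nameOf(user));

        try {
            // Misspelled method name: fails only at runtime.
            invokeByName(user, "getNam");
        } catch (NoSuchMethodException e) {
            System.out.println("Runtime failure: " + e);
        }
        // The same typo in the typed variant, user.getNam(), would not compile at all.
    }
}
```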


Any examples for me to try?

We have provided two small examples that demonstrate csimple as pre-compiled and as runtime compiled during bootstrap. You can find these two examples in the official Apache Camel examples repository.


What's Next

We want to implement the missing features: nested functions and the null safe operator. We are also working on camel-quarkus to make csimple optimized for Quarkus and GraalVM. This effort has already started, and Camel 3.7 will come with the first work in this area.

We also want to speed up the runtime compilation by doing batch compilation. Currently each csimple script is compiled sequentially.

And we want to take a look at whether we can make runtime compilation work better with Spring Boot in its fat jar mode.

But first, enjoy csimple in the upcoming Camel 3.7 LTS release. As always, we want your feedback and love contributions.

