2020-10-09

Apache Camel 3.6 - More camel-core optimizations coming (Part 4)

I have previously blogged about the optimizations we are doing in the Apache Camel core. The first 3 blogs (part1, part2, part3) were a while back leading up to the 3.4 LTS release.

Now we have done some more work that is coming in Camel 3.6 leading up to the next 3.7 LTS release.

To speed up startup we switched to a new UUID generator. The old (classic) generator was inherited from Apache ActiveMQ, which needed to ensure its ids were unique in a network of brokers and therefore used the hostname as a prefix in the id. Obtaining the hostname requires network access on startup, which costs a little time, and on more restrictive networks it can delay the startup further. The new generator is a fast, pure in-memory generator that was already used by Camel K and Camel Quarkus.
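As a rough illustration of the idea (this is not the actual Camel generator; the class name and id format are assumptions made up for this sketch), an in-memory generator only needs a prefix and an atomic counter, so nothing has to be looked up on the network:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: ids are built from a fixed prefix and an in-memory counter,
// so no hostname lookup (and no network access) is needed at startup.
class SimpleIdGenerator {
    private final String prefix;
    private final AtomicLong counter = new AtomicLong();

    SimpleIdGenerator(String prefix) {
        this.prefix = prefix;
    }

    // Thread safe: every call returns the prefix plus the next sequence number.
    String generateId() {
        return prefix + "-" + counter.incrementAndGet();
    }

    public static void main(String[] args) {
        SimpleIdGenerator generator = new SimpleIdGenerator("ID");
        System.out.println(generator.generateId());
        System.out.println(generator.generateId());
    }
}
```

Such ids are only unique within one JVM, which is the trade-off compared to a hostname-prefixed id.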

We also identified a few other spots during route initialization. For example one small change was to avoid doing some regular expression masking on route endpoints which wasn't necessary anymore.

Now the bigger improvements are in the following areas

Avoid throwing exceptions

We identified that on Spring runtimes Camel would query the Spring bean registry for known beans by id, and the Spring framework would throw a NoSuchBeanDefinitionException if the bean was not present. As Camel does a bit of optional bean discovery during bootstrap, we found a way to do these lookups without triggering the exception.
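The general pattern can be sketched as follows. This is a plain Java stand-in, not the Spring or Camel registry API: the class and method names are hypothetical, and the point is only that an optional lookup checks for presence first instead of relying on an exception.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical registry used only to illustrate "check first, then get".
class BeanLookupSketch {
    private final Map<String, Object> beans = new HashMap<>();

    void bind(String id, Object bean) {
        beans.put(id, bean);
    }

    // Strict lookup: a missing bean is reported with an exception.
    Object getBean(String id) {
        Object bean = beans.get(id);
        if (bean == null) {
            throw new NoSuchElementException("No bean with id: " + id);
        }
        return bean;
    }

    // Optional lookup: checks for presence first, so a missing bean
    // never causes an exception to be created and thrown.
    Object lookupOrNull(String id) {
        return beans.containsKey(id) ? getBean(id) : null;
    }
}
```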

Singleton languages

Another related problem is that in Camel 3, due to the modularization, some of the languages (bean, simple, and others) were changed from singleton to prototype scope. This is in fact one of the biggest problems: a Camel user reported thread contention in a highly concurrent use case where threads would race to resolve languages (because they are prototype scoped). On top of that, because the language resolver queries the registry first, Spring would throw the no-such-bean exception, and only then would Camel resolve the language via its own classpath resolver. All together this cost performance. We can see this in the following screenshots from the profiler.
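A minimal sketch of resolving each language only once and then sharing it, assuming a hypothetical resolver function (this is not the actual Camel resolver code):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical cache: the expensive resolver runs once per language name,
// and later calls return the shared instance from the map.
class LanguageCacheSketch {
    private final Map<String, Object> cache = new ConcurrentHashMap<>();
    private final Function<String, Object> resolver;
    final AtomicInteger resolveCount = new AtomicInteger();

    LanguageCacheSketch(Function<String, Object> resolver) {
        this.resolver = resolver;
    }

    Object resolveLanguage(String name) {
        // computeIfAbsent invokes the resolver at most once per key
        return cache.computeIfAbsent(name, n -> {
            resolveCount.incrementAndGet();
            return resolver.apply(n);
        });
    }
}
```

Sharing one instance is only safe when the resolved language object is itself thread safe.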




The top screenshot is using Camel 3.5 and the bottom 3.6. In the top we can see the threads are blocked in Camel's resolveLanguage method. In 3.6 it is actually the log4j logger that is blocking while writing to the log file. Both runs use the same Camel application and have been running for about 8 minutes.

Reduce object allocations

The next screenshots are showing a sample of the object allocations.



With Camel 3.5 we average about 1000 obj/sec, and with 3.6 we are down to about one third of that.

One of the improvements that helps reduce object allocations is that parameters to languages were changed from a Map to a plain object array. The Map takes up more memory and causes more object allocations than a single fixed object array.
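To make the difference concrete, here is a small sketch with two made-up parameters (the names `resultType` and `trim` and their positions are assumptions for illustration only). A HashMap allocates the map, its internal table and one entry per parameter, while the array is a single allocation with fixed positions:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical parameter passing in two styles.
class LanguageParamsSketch {
    // Fixed positions agreed between the caller and the language (assumption).
    static final int RESULT_TYPE = 0;
    static final int TRIM = 1;

    // Map style: allocates the map, its table and an entry per parameter.
    static Map<String, Object> asMap(Class<?> resultType, boolean trim) {
        Map<String, Object> params = new HashMap<>();
        params.put("resultType", resultType);
        params.put("trim", trim);
        return params;
    }

    // Array style: one fixed-size array, parameters are read by index.
    static Object[] asArray(Class<?> resultType, boolean trim) {
        return new Object[] {resultType, trim};
    }
}
```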

Do as much init as possible

Another performance improvement that helps at runtime is that we moved as much as we could from the evaluation phase to the initialization phase in the Camel languages (simple, bean, etc.). We did this by introducing the init phase and ensuring CamelContext is carried around in the internals, so we can use the context during the init phase, where it's really needed. This ensures the runtime evaluation is as fast as possible.
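The split between an init phase and an evaluation phase can be sketched like this. The class is hypothetical and far simpler than a real Camel language: all parsing happens once in `init()`, and the hot path in `matches()` only does a lookup and a comparison.

```java
import java.util.Map;

// Hypothetical predicate for expressions of the form "<header> > <number>".
class HeaderGreaterThan {
    private final String expression;
    private String headerName;
    private int threshold;

    HeaderGreaterThan(String expression) {
        this.expression = expression;
    }

    // Init phase: parse the expression once, up front.
    void init() {
        String[] parts = expression.split(">");
        headerName = parts[0].trim();
        threshold = Integer.parseInt(parts[1].trim());
    }

    // Evaluation phase: no parsing, only a map lookup and a comparison.
    boolean matches(Map<String, ?> headers) {
        Object value = headers.get(headerName);
        return value instanceof Integer && (Integer) value > threshold;
    }
}
```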

Other smaller optimizations

We also improved the simple language to be a bit smarter in its binary operators (such as header.foo > 100). Now the simple language has stronger types for numeric and boolean types during its parsing, which allows us to better know, from the left and right hand sides of the binary operator, how to do type coercion so the types are comparable by the JVM. Before, we could end up falling back to converting both sides to string types. And there is more to come; I have some ideas for how to work on a compiled simple language.
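A small sketch of why knowing the type of one side helps. This is not the simple language implementation; the string branch is included only to show what a string comparison does, not as a claim about how Camel behaved before.

```java
// Hypothetical helper for a "left > right" binary operator.
class BinaryCompareSketch {
    static boolean greaterThan(Object left, Object right) {
        if (right instanceof Number) {
            // The right side is known to be numeric, so coerce the left side to a number too.
            double r = ((Number) right).doubleValue();
            double l = left instanceof Number
                    ? ((Number) left).doubleValue()
                    : Double.parseDouble(left.toString());
            return l > r;
        }
        // No type information: compare both sides as strings (lexicographically).
        return left.toString().compareTo(right.toString()) > 0;
    }
}
```

With a numeric right hand side, `greaterThan("99", 100)` compares 99.0 with 100.0, whereas `greaterThan("99", "100")` compares the texts character by character.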

The screenshot below shows a chart with the CPU, object allocations and thrown exceptions.



As we can see, this summarises the optimizations mentioned above. The number of exceptions has been reduced to 0 at runtime. About 3500 are thrown during bootstrap (that is Java JAXB, which is used for loading the Spring XML file with the Camel routes used for the sample application). We do have a fast XML loader in Camel that does not use JAXB.

Another improvement we did was to build a source code generator for a new UriFactory which allows each component to quickly build dynamic endpoint URIs from a Map of parameters. The previous solution was to use RuntimeCamelCatalog, which was more generic and required loading component metadata from JSON descriptor files. A few components (such as the http components) use this to optimize toD. With this change we avoid the runtime catalog as a dependency (reducing JAR size), and the source code generated URI factory is much faster (it's speedy plain Java). However, the sample application used for this blog did not use toD nor the UriFactory.
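To give a feel for what building an endpoint URI from a Map of parameters means, here is a deliberately naive sketch. It is not the generated UriFactory code; the class name and the `scheme:path?key=value` layout are assumptions, and it does no URL encoding.

```java
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeMap;

// Hypothetical URI builder in plain Java, with no metadata files involved.
class SimpleUriBuilder {
    static String buildUri(String scheme, String path, Map<String, Object> params) {
        StringJoiner query = new StringJoiner("&");
        // Sort the keys so the resulting URI is deterministic.
        for (Map.Entry<String, Object> entry : new TreeMap<>(params).entrySet()) {
            query.add(entry.getKey() + "=" + entry.getValue());
        }
        String base = scheme + ":" + path;
        return query.length() == 0 ? base : base + "?" + query;
    }
}
```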

Apache Camel 3.6 is scheduled for release later this month of October. It's going to be the fastest Camel ever ;)


2020-07-27

BarcelonaJUG talk on Tuesday July 28th about Camel 3 in the era of Kubernetes and Serverless

Just back from PTO, and Andrea Cosentino and I are on the spot tomorrow: we have been invited by Barcelona JUG to give a talk about Camel 3 and its latest innovations around Kubernetes and Serverless (and Kafka).


It's a free and online event, in English, tomorrow Tuesday 28th at 19.00 CEST
https://www.meetup.com/es-ES/BarcelonaJUG/events/271746564/

The session runs for 45 min with Q and A at the end.

PS: Just updated my platform to use latest Camel K 1.1.0 release, so hope the demo gods are on our side tomorrow.

2020-06-10

Apache Camel K 1.0 is here - Why should you care

Yesterday we released Apache Camel K 1.0 and it was announced on social media and on the Camel website.


So what is Camel K and why should you care? That is a great question and I want to help answer this by referring to great minds.

Hugo Guerrero posted the following tweet  


That is a powerful statement from Hugo, where he highlights the groundbreaking innovation of Camel K, which gives developers the tools and means to build Java based services that are both serverless and functional and that run using standard Kubernetes building blocks.

Camel K is the biggest innovation in Apache Camel for the last 10 years. So fill your cup with coffee or tea, and sit back and enjoy the next 10 minutes read.

I give the floor to Nicola Ferraro (co-creator of Camel K) who has allowed me to re-post his blog post from the announcement yesterday.


Apache Camel K has made a lot of progress since its inception and we're now proud to announce the 1.0 release. We've been working hard in the past months to add more awesome features to Camel K, but also to improve stability and performance. This post contains a list of cool stuff that you'll find in the 1.0 GA release.

First of all, if you're living under a rock and it's the first time you hear about Camel K,
you can read some introductory blog posts here (1 - introducing Camel K) (2 - camel k on knative)
or look at the Apache Camel website that contains a Camel K section
with a lot of material that is automatically generated from the GitHub repository.

User experience

Camel K development style is minimalistic: you need just to write a single file with your integration routes and you can immediately run them on any Kubernetes cluster. This way of defining things is common to many FaaS platforms (although Camel K is not a proper FaaS platform, but a lightweight integration platform) and it's technically difficult to provide IDE support, such as code completion and other utilities, to developers.

But now we have it. The integration tooling team has created some cool extensions for VS Code that make the development experience with Camel K even more exciting.
You don't need to remember the Camel DSL syntax, the IDE will give you suggestions and error highlighting.



Code completion works with Java code, but it's not only limited to it: you also have suggestions and documentation out of the box when writing the Camel URIs and property files.
And you also have many options to run integrations and interact with them, all integrated in the IDE.

Just install the VS Code Extension Pack for Apache Camel to have all these features available.


Getting started tutorials

Good tools are fundamental to have a great development experience with Camel K, but then you need to learn what you can do with such a great power.
We've created a new repository in the Apache organization that hosts getting started examples: the camel-k-examples repository.

So far we've added guides that drive you through:

- 01 Basic: Learn the basics of Camel K and some interesting use cases
- 02 Serverless APIs: How to design a serverless (i.e. auto-scaling, scaling to zero) API and run it in a few minutes

The basic quickstart is also available online, so you can have a look at how Camel K works without installing anything on your laptop.

More tutorials are expected to come in the following months. You are also welcome if you want to help us by contributing your own. They are based on the VSCode Didact project, that provides an
awesome user experience.

If you are looking for Camel K code samples that you can just pick and run using the CLI, the examples directory of the Camel K main repository contains a lot of them. You can also run them directly from Github:

kamel run https://raw.githubusercontent.com/apache/camel-k/master/examples/Sample.java

You can find ready-to-use examples written in different languages (e.g. XML, JavaScript and others).

Serverless

Serverless is the most important area where we're focusing the new developments in Apache Camel K, although, you should remember, you can have a wonderful Camel K experience even without serverless features. To enable the serverless profile in Camel K, you just need to have Knative installed.

In recent releases, we have added support for the most recent advancements in Knative, for example, Camel K is very well integrated with the Knative event broker and you can easily produce or consume events from it.

With 2 lines of code you can transfer events (e.g. generated by IoT devices) from your MQTT broker to the mesh:

bridge.groovy

from('paho:mytopic?brokerUrl=tcp://broker-address:1883&clientId=knative-bridge')
  .to('knative:event/device-event')


No kidding, you just need to write those two lines of code in a file and run it with kamel run bridge.groovy to push data into the Knative broker.

And you can also scale the Integration out (Integration is a Kubernetes custom resource; run kubectl get integrations to see all of them)
to have a higher throughput. Scaling here is manual because the source of events is an MQTT broker (but we have plans to add auto-scaling in this scenario as well).

The Camel K embedded auto-scaling feature works really well when you want to react to some Knative events:

listener.groovy

from('knative:event/device-event')
  .to('http://myhost/webhook/random-id')

This integration is configured to receive all events with `type=device-event` and scales automatically with the load because it is materialized into a Knative Serving Service and automatically subscribed to the Eventing Broker via a Trigger.

It then receives a CloudEvent when your IoT devices produce something and scales down to zero if there's no data coming. You just need to create it (as before, just kamel run listener.groovy), all the remaining configuration is done automatically by the Camel K operator.

We've added many more features for better integration with the Knative ecosystem and we've also fixed some compatibility and performance issues that were present in previous versions. The user experience is now much smoother.

If you are a Knative YAML developer (!), instead of using Camel K directly, you also have the option to use Knative Camel Sources which are part of the Knative release. They are wrappers for Camel K integrations that are compatible with all the tools used by Knative developers (such as the kn CLI or the OpenShift serverless console).
Sources in Knative can only push data into the various Knative endpoints, but not the other way around (i.e. they cannot be used to publish data from Knative to the outside).
In Camel K you don't have this limitation: the Route is the fundamental building block of a Camel integration and you can do whatever you want with it.

Fast startup and low memory

We cannot say we're serverless without mentioning the work that we've been doing in improving the performance of Camel K integrations.

Starting from Camel 3.3.0, which is the default version used by Camel K 1.0.0, you can benefit from all the improvements that have been made directly in the Camel core to make it much more lightweight. More in-depth details of the Camel core improvements can be found in the following blog series, which highlights what has been changed in the 3.x Camel timeline to reduce memory footprint and speed up the startup time, which is fundamental when running integrations in a serverless environment: part 1, part 2, part 3, part 4.

But improvements are not only limited to the Camel core: we're doing much more. Several months ago we started a new subproject of Apache Camel named Camel Quarkus with the goal of seamlessly running integrations on top of the Quarkus framework. As you probably know, Quarkus is able to reduce the memory footprint of Java applications and improve the startup time, because it moves much of the startup logic to the build phase. And Quarkus applications can also be compiled to a native binary, allowing a dramatic improvement in startup performance and a very low memory footprint.

In Camel K 1.0.0 we support Camel Quarkus in JVM mode. A goal is to also have in-cluster native compilation soon (for some DSL languages, such as YAML), in one of the next releases!

To use Quarkus as underlying runtime, you just need to enable the Quarkus trait when running an integration:

kamel run myintegration.groovy -t quarkus.enabled=true

Quarkus is expected to be the default underlying runtime in the next release, and support for Standalone mode (via camel-main) will be deprecated and removed. This means that you won't need to enable Quarkus manually in the next releases, but you still need to do it in 1.0.

Fast build time

Every application running on Kubernetes needs to be packaged in a container image, but in Camel K you only provide the integration DSL and the operator does what it takes to run it, including building images directly in the cluster.

The operator manages a pool of reusable container images and if you redeploy your integration code, it does try to reuse existing images from the pool rather than building a new one at each change, because it takes some time to build a new one. It was 1 minute at the beginning...

But Kubernetes is moving so fast that you cannot solve a problem once and forget about it; you need to take care of it continuously. It happened that some of the third party dependencies that we used for doing builds in "vanilla Kube" have slowly degraded in performance up to a point where the Camel K user experience was highly affected.

We decided to work harder on the build system in order to dramatically improve (again!) the build phase of Camel K integrations.

Build time can now be measured in seconds in dev environments such as Minikube. A handful of seconds, most of the time. This is more than a simple improvement!


Better CLI

The 'kamel' CLI is the main tool we provide to developers to run integrations. It's not a mandatory requirement: in the end, an Integration is a Kubernetes custom resource and you can manage it with any standard Kubernetes tool (e.g. kubectl). But the kamel CLI adds a lot of value for integration developers.

For example, if you're a Camel Java developer it's not super easy to remember the boilerplate that you have to write in order to instantiate a Camel route builder. Now you don't have to remember that:

kamel init Handler.java

You get a Java file with all the boilerplate written for you and you just have to write your integration routes.

It works also with all other languages: Groovy, XML, YAML, Kotlin and JavaScript.
For example you can write:

kamel init foo.js


This way you get a simple route written in JavaScript.

It's not just that. Often Camel K developers need to add a lot of command line options to configure the final behavior of their integration. For example, you may want to add a custom library with the `-d` option or configure a trait with `-t`. E.g.:

kamel run -d mvn:org.my:lib:1.0.0 -d mvn:org.my:otherlib:2.0.0 -t quarkus.enabled=true Handler.java


Sometimes the number of command line parameters you have to add can become too large. For this reason we've added the possibility to specify them as modeline options in the integration file (done by adding a comment line with `camel-k:` as prefix).

Handler.java

// camel-k: dependency=mvn:org.my:lib:1.0.0 dependency=mvn:org.my:otherlib:2.0.0 trait=quarkus.enabled=true

// ...
// your routes here


Once the options are written in the file, you can run the routes with just:

# simply this, additional args are read from the file
kamel run Handler.java


The other options are taken automatically from the file modeline. The CLI also displays the full command to let you know what's running.
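Conceptually, the modeline is just a comment line that is turned back into command line options. The sketch below shows that idea in plain Java. It is not how the kamel CLI implements it; the class is hypothetical and only maps the two option names shown above (`dependency` and `trait`) to the `-d` and `-t` flags.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical modeline parser, for illustration only.
class ModelineSketch {
    static final String PREFIX = "// camel-k:";

    // Turns "// camel-k: dependency=X trait=Y" into ["-d", "X", "-t", "Y"].
    static List<String> toArgs(String line) {
        List<String> args = new ArrayList<>();
        String trimmed = line.trim();
        if (!trimmed.startsWith(PREFIX)) {
            return args; // not a modeline
        }
        for (String option : trimmed.substring(PREFIX.length()).trim().split("\\s+")) {
            int eq = option.indexOf('=');
            if (eq < 0) {
                continue; // skip anything that is not name=value
            }
            String name = option.substring(0, eq);
            String value = option.substring(eq + 1); // value may itself contain '='
            if (name.equals("dependency")) {
                args.add("-d");
                args.add(value);
            } else if (name.equals("trait")) {
                args.add("-t");
                args.add(value);
            }
        }
        return args;
    }
}
```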

This kind of configuration is extremely useful in CI/CD scenarios because it allows you to have self-contained integration files and you don't need to change the pipeline to setup additional options. If you're curious about the CI/CD configurations, you can follow the tutorial about Tekton pipelines to have more information.

Monitoring and Tracing

Ok, you've finished level 1 of Camel K development and you want to do serious things. You're in a very good position because Camel K provides a lot of useful tools to add visibility on what your integration routes are doing.

Let's suppose you have a Prometheus instance in your namespace and you want to publish your integration metrics:

kamel run Routes.java -t prometheus.enabled=true

That's it. No need to set up services and labels to enable scraping. A default Prometheus configuration file is also provided for the integration, with sensible defaults. Of course you also have the option to provide your own configuration for advanced use cases.

Now, let's suppose you want to see what your routes are doing and trace the execution flow of an integration. What you need to do is to install an opentracing compatible application in the namespace, such as Jaeger, and run the integration as:

kamel run Routes.java -t prometheus.enabled=true -t tracing.enabled=true

That's it again. The Camel K operator will add the camel-opentracing library and connect it to the Jaeger collector that is available in the namespace. Here again, advanced use cases are supported.

Master routes

Good old Camel users know why and when master routes are useful, but for those who are not familiar with the term, I'm going to provide a brief explanation.

Whenever you have an integration route that must be running, at any point in time, in at most one single Camel instance, you need to use a master route. Master routes can be declared by simply prefixing the consumer endpoint by the 'master' keyword and a name that will be used to create a named lock, e.g.

from('master:mylock:telegram:bots')
  .to('log:info')


It can be used to print all messages that are sent to your Telegram bot. Since the Telegram API supports only a single consumer, you can guard the route with a master prefix to guarantee that there will be at most one consumer at any given time.

If you're wondering how there can be two instances running if you deploy one, well, just think of when you change your code and need to do a rolling update: for some time there'll be two pods running in parallel. In some cases, you may decide to scale your service out but keep only one instance of a particular route among all the pods of your service. Or you may want to embed a master route in a Knative autoscaling service: in this case, the service can scale autonomously based on the load, but there'll be only one telegram consumer at any time.

Master routes work out of the box in Camel K, you just need to put a prefix in your endpoint uri. A leader election protocol based on Kubernetes APIs resource locks will be automatically configured for you!
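The guarantee behind a named lock can be illustrated with an in-process stand-in. This is only a sketch of the "at most one holder per lock name" idea; the class is hypothetical, and the real mechanism is a leader election based on Kubernetes resource locks, not a map in one JVM.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory stand-in for a named lock shared by several instances.
class NamedLockSketch {
    private final ConcurrentMap<String, String> holders = new ConcurrentHashMap<>();

    // Returns true if the instance holds the lock after the call.
    boolean tryAcquire(String lockName, String instanceId) {
        String current = holders.putIfAbsent(lockName, instanceId);
        return current == null || current.equals(instanceId);
    }

    // Releases the lock only if it is held by the given instance.
    void release(String lockName, String instanceId) {
        holders.remove(lockName, instanceId);
    }
}
```

During a rolling update the old pod would hold `mylock` until it stops, and only then could the new pod acquire it and start consuming.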

CronJobs

All complex enough systems contain several scheduled jobs. This is especially true for that part of the system that handles integration with the outside.

Ideally, if you need to execute a quick periodic task, say, every two seconds, you would start up an integration with a route based on a timer to execute the periodic task. E.g.

from("timer:task?period=2000")
  .to(this, "businessLogic")

But what if the period between two executions, instead of 2 seconds ("2000" in the Camel URI, which is measured in milliseconds), is 2 minutes ("120000") or 2 hours ("7200000")?
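The millisecond values quoted above can be double checked with `java.time.Duration`. The helper below is a hypothetical convenience for this post, reusing the `timer:task` endpoint from the example route:

```java
import java.time.Duration;

// Hypothetical helper that expresses a timer period in milliseconds.
class TimerPeriods {
    static String timerUri(Duration period) {
        return "timer:task?period=" + period.toMillis();
    }

    public static void main(String[] args) {
        System.out.println(timerUri(Duration.ofSeconds(2)));
        System.out.println(timerUri(Duration.ofMinutes(2)));
        System.out.println(timerUri(Duration.ofHours(2)));
    }
}
```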

You can see that keeping a container with a JVM running for a task that should be executed once every two minutes may be overkill (it is overkill for sure when the period is 2 hours). We live in a time where resources such as memory and CPU are really valuable.

So the Camel K operator automatically handles this situation by deploying your integration not as a Kubernetes deployment, but as a Kubernetes CronJob. This saves a lot of resources, especially when the period between executions is high. When it's time to run your integration code, a container starts, triggers the execution and then gracefully terminates. Everything is handled automatically by Camel K and Kubernetes.

There are cases when you don't want this feature to be enabled, for example, when your code makes use of in-memory caches that are better kept between executions. In these cases, you can safely turn off the feature by passing the flag `-t cron.enabled=false` to the `kamel run` command.

The Cron feature does not only work with the `timer` component. We've also added a cron component since Camel 3.1 that works really well in combination with the cron trait.

So you can also write the cron expression in the route directly:

from("cron:job?schedule=0/5+*+*+*+?")
  .to(this, "businessLogic")

In this case, a new pod with a JVM is started every 5 minutes to execute your scheduled task. For the remaining 4+ minutes you don't use any resource.
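To read the schedule in the URI, it helps to know that the `+` characters stand in for spaces. The sketch below is a reading aid only, not a cron implementation: it splits the expression into its fields and expands a `start/step` field, assuming (as the "every 5 minutes" description suggests) that the first of the five fields is the minute field.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for reading a cron schedule taken from an endpoint URI.
class CronScheduleSketch {
    // Replaces the '+' placeholders with spaces and splits the expression into fields.
    static String[] fields(String uriSchedule) {
        return uriSchedule.replace('+', ' ').trim().split("\\s+");
    }

    // Expands a "start/step" field (such as "0/5") into the matching values up to max (inclusive).
    static List<Integer> expandStep(String field, int max) {
        String[] parts = field.split("/");
        int start = Integer.parseInt(parts[0]);
        int step = Integer.parseInt(parts[1]);
        List<Integer> values = new ArrayList<>();
        for (int v = start; v <= max; v += step) {
            values.add(v);
        }
        return values;
    }

    public static void main(String[] args) {
        String[] f = fields("0/5+*+*+*+?");
        System.out.println(f.length + " fields, first field matches: " + expandStep(f[0], 59));
    }
}
```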

Transparency

Camel K does a lot of work for you when you run your integration code in the cluster and it's possible that you put some errors in the code that can block the deployment process. We've added a lot of visibility on the deployment process that now communicates with the users via Kubernetes events that are printed to the console when you use the CLI.

This way you're always notified of problems in the code and you can better understand what to fix to make your integration run.

How to try Camel K 1.0

The first step is to go to the Apache Camel K release page, download the kamel CLI for your OS and put it in your system path.

Installation is usually done using the `kamel install` command, but, depending on the kind of Kubernetes cluster you're using, you may need to execute additional configuration steps.
The Camel K documentation contains a section about installing it on various types of Kubernetes clusters.

If you have trouble or you need to install it on a particular cluster that is not listed, just reach out in the Gitter chat and we'll do our best to help you.

Future

We've reached version 1.0.0 and this is a great milestone for us. But we are not going to stop now: we have big plans for the future and we'll continue to develop awesome new features.

We need your help to improve Camel K and we love contributions!

Join us on:

- Gitter: https://gitter.im/apache/camel-k
- GitHub: https://github.com/apache/camel-k


2020-06-04

Tech Talk tomorrow (friday) - What's new with Apache Camel 3?

This blog post comes at short notice. Tomorrow, Friday June 5th, there will be a DevNation Tech Talk with Andrea Cosentino and myself presenting Apache Camel 3.



Date: June 5, 2020
Time: 13:00 UTC / 15:00 CET / 9:00 AM EDT

Apache Camel is a leading open source integration framework that has been around for more than a decade.

With the release of Apache Camel 3, the Camel family has been extended to include a full range of projects that are tailored to popular platforms including Spring Boot, Quarkus, Kafka, Kubernetes, and others; creating an ecosystem.

Join this webinar to learn what’s new in Camel 3 and about Camel projects:

  • Latest features in Camel 3
  • Quick demos of Camel 3, Camel Quarkus, Camel K, and Camel Kafka Connector
  • Present insights into what's coming next

Registration is required at the following link.

2020-04-24

Free book on Knative covering Camel K and Kafka and upcoming webinar with live demos

I want to congratulate two of my fellow Red Hatters, Burr Sutter & Kamesh Sampath, who have published a new book on Knative - Knative Cookbook.



Blog post announcement: book launch.

The book is around 150 pages and is written in a cookbook style, so it's a great book for learners, with step by step instructions to try out first hand.

The book has a chapter on Apache Camel K which is fantastic. They show you how to get started with Camel K and then continue to add Knative into the mix and how Camel K easily adapts to Knative being present in the platform.

Kamesh will show all the greatness of Knative together with Kafka and Kamel in his upcoming webinar: 4K Kubernetes with Knative, Kafka, and Kamel, scheduled for April 30th. So this is a great opportunity to hear first hand from the author, see live demos, and just relax with a cup of coffee/tea or maybe even a cold beer. I surely will do that, maybe all 3 ... or skip the tea ;)

So I suggest you go download the free book, and register for the webinar.

2020-04-14

How to quickly run 100 Camels with Apache Camel, Quarkus and GraalVM

Today I continued my practice on YouTube and recorded a 10 minute video on creating a new Camel and Quarkus project that includes Rest and HTTP services with health checks and metrics out of the box.


Then I compare the memory usage of running the example in JVM mode vs natively compiled with GraalVM. For the finale I show how to quickly run 100 instances of the example, each on its own TCP port, and how quickly Camel starts up and services the first requests, faster than you can type and click.

For this demo I am using Java 11, Apache Camel 3.2.0, Quarkus 1.3.2 and GraalVM CE 20.0.0. You can find the source code for the example at the camel-quarkus GitHub repository, with instructions on how to try it for yourself.

We are working on reducing the binary image size for Camel 3.3 by eliminating more classes that GraalVM includes but that are not necessary. We also have an experiment with an alternative lightweight CamelContext that is not dynamic at runtime, which can improve this further. And GraalVM and Quarkus will of course also keep innovating and make it smaller and faster.


2020-03-25

Best Practices for Middleware and Integration Architecture Modernization with Apache Camel

Yesterday I gave the following virtual talk at the Stockholm Meetup.

Best Practices for Middleware and Integration Architecture Modernization with Apache Camel
What are important considerations when modernizing middleware and moving towards serverless and/or cloud native integration architectures? How can we make the most of flexible technologies such as Camel K, Kafka, Quarkus and OpenShift. Claus is working as project lead on Apache Camel and has extensive experience from open source product development.
I thank the organizers Forefront Consulting for inviting me. Unfortunately there was a glitch with the talk yesterday. As I could not be there in person, the talk was pre-recorded, and the recording was cut off half way. So I promised to post a blog today and upload the talk to YouTube.


The title and abstract of the talk were somewhat given to me, and so was the session length of 30 minutes. As I am so heavily invested in Apache Camel, I focused the talk on Camel, using its evolution over the last 10 years as the introduction and the latest innovations from Camel K, Camel Quarkus and Camel Kafka Connectors as the meat of the talk, with 3 demos.

The talk can be watched on youtube and the slides are here.

2020-03-22

Apache Camel 3.1 - Fast loading of XML routes

A feature that was added to Camel 3.1 is the ability to load XML routes much faster. This is part of the overall work we are doing on making Camel much smaller and faster.

You may say ewww XML. But frankly there are many users of Camel that have built applications with XML for defining routes. In Camel 2.x you would have to use Spring or OSGi Blueprint for XML routes, both of which are becoming heavy in the modern cloud native world.

In Camel 3 we have a standalone mode for Camel called camel-main. We use camel-main as a common way to bootstrap and configure Camel for standalone, camel-k, camel-quarkus, and for most parts of camel-spring-boot as well. This ensures a unified and consistent developer experience across those runtimes.

Okay this is probably a topic for another blog post to dive into camel-main as a great runtime for quickly running ... just Camel.

So what I wanted to say in this blog post is that we have made it possible to load XML routes much quicker and with a lot less overhead. In Camel 2.x, Spring XML and Blueprint XML rely on JAXP and JAXB which ... are heavy.

So what we have done for Camel 3.1 is to source code generate an XML parser based on the Camel DSL. This means that whenever we change the DSL, the parser is re-generated. The parser just uses standard Java so there are no additional 3rd party library dependencies.
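To illustrate the style of parsing (this is not the generated camel-xml-io parser), here is a tiny hand-written pull parser built on the StAX API that ships with the JDK. It only understands `from` and `to` elements with a `uri` attribute; the class name and the `name=uri` result format are assumptions made for this sketch. A generated parser would contain this kind of per-element dispatch for the whole DSL, with no annotation scanning or reflection at runtime.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical pull parser for a very small subset of route XML.
class RouteXmlPullParser {
    // Returns entries such as "from=timer:tick" and "to=log:out" in document order.
    static List<String> parse(String xml) {
        List<String> steps = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        String name = reader.getLocalName();
                        // Hand-written dispatch stands in for generated per-element code.
                        if ("from".equals(name) || "to".equals(name)) {
                            steps.add(name + "=" + reader.getAttributeValue(null, "uri"));
                        }
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Invalid route XML", e);
        }
        return steps;
    }
}
```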

For loading XML routes in Camel we now have 2 parsers in the following JARs

- camel-xml-jaxb   (traditional JAXB based as in Camel 2.x)
- camel-xml-io       (new fast and lightweight source code generated parsers)

The example camel-example-main-xml is set up to use the new parser. But you can try for yourself and switch to the jaxb parser by changing the JAR dependency.

Let's see some numbers (note this is just a quick test on my laptop to run this example with the 2 XML parsers).

camel-xml-jaxb: Loaded 1 (808 millis) additional Camel XML routes from: routes/*.xml
camel-xml-io: Loaded 1 (76 millis) additional Camel XML routes from: routes/*.xml

So the new parser is about 10 times faster (76 vs 808 millis).


By profiling the JVM we can see that there are a lot fewer classes loaded as well: 4734 vs 3892. And on top of that JAXB leaves more objects and classes around in the JVM that may or may not easily be garbage collected, and would also be using more CPU and memory during its parsing.

And on GraalVM the new parser should be much quicker, as you can avoid having the entire JAXB and JAXP API and implementation on the classpath for the GraalVM compiler to crunch and compile. Speaking of GraalVM, we are working on some great improvements in the upcoming Camel 3.2 that should help reduce the image size and compilation time, and allow more dead code elimination and whatnot to make Camel even more awesome. That's yet another topic for another blog post, so stay tuned.


Apache Camel 3.2 - Reflection free configuration of Camel

At the Apache Camel project we are working towards the next upcoming Apache Camel 3.2.0 release, which is planned for next month.

One of the areas we have worked hard on in Camel 3 is to make it smaller and faster. And one aspect of this is also configuration management. You can fully configure Camel in many ways and according to the 12 factor principles, to keep configuration separated from the application. A popular way to configure is to use properties files (e.g. application.properties), or in Kubernetes you can configure from config maps or environment variables as well.

So we have gradually over Camel 3.0, 3.1 and now 3.2 made configuration faster. With the latest work we are now fully reflection free.



Camel is capable of reporting when reflection based configuration is being used. This can be enabled with:

# bean introspection to log reflection based configuration
camel.main.beanIntrospectionExtendedStatistics=true
camel.main.beanIntrospectionLoggingLevel=INFO

We have prepared the camel-example-main-tiny to report this. The numbers for Camel 3.0, 3.1, and 3.2 are as follows:

Camel 3.0: BeanIntrospection invoked: 12 times
Camel 3.1: Stopping BeanIntrospection which was invoked: 11 times
Camel 3.2: Stopping BeanIntrospection which was invoked: 0 times

What this means is that you can fully configure all your Camel endpoints, components, routes, EIPs, data formats, languages, camel main, camel context, and whatnot, in declarative properties files etc and then at runtime all of this ends up invoking the actual setter methods on all these instances (ie just direct java method calls, no java.lang.reflect).

This is possible because we source code generate configurer classes based on what options are present. And these configurer classes are reflection free. There can be many options so it would be impossible to implement this by hand, see for example the kafka endpoint configurer.
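To give an idea of what such a generated configurer boils down to, here is a minimal sketch. The class names (DemoEndpoint, DemoEndpointConfigurer) and the options are made up for illustration and are not the actual generated Camel classes; the point is only that each option name maps to a direct setter call.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical endpoint with a few options; stands in for a real Camel endpoint.
class DemoEndpoint {
    private String topic;
    private int maxRetries;
    private boolean autoCommit;

    void setTopic(String topic) { this.topic = topic; }
    void setMaxRetries(int maxRetries) { this.maxRetries = maxRetries; }
    void setAutoCommit(boolean autoCommit) { this.autoCommit = autoCommit; }

    String getTopic() { return topic; }
    int getMaxRetries() { return maxRetries; }
    boolean isAutoCommit() { return autoCommit; }
}

// Sketch of a generated configurer: one switch case per option, each calling
// the setter directly, so no java.lang.reflect is involved.
class DemoEndpointConfigurer {
    boolean configure(DemoEndpoint target, String name, String value) {
        switch (name) {
            case "topic":
                target.setTopic(value);
                return true;
            case "maxRetries":
                target.setMaxRetries(Integer.parseInt(value));
                return true;
            case "autoCommit":
                target.setAutoCommit(Boolean.parseBoolean(value));
                return true;
            default:
                return false; // unknown option; the caller decides what to do
        }
    }
}

class ConfigurerDemo {
    public static void main(String[] args) {
        // Stands in for key/value pairs read from a properties file.
        Map<String, String> props = new LinkedHashMap<>();
        props.put("topic", "orders");
        props.put("maxRetries", "3");
        props.put("autoCommit", "true");

        DemoEndpoint endpoint = new DemoEndpoint();
        DemoEndpointConfigurer configurer = new DemoEndpointConfigurer();
        props.forEach((k, v) -> configurer.configure(endpoint, k, v));

        System.out.println(endpoint.getTopic() + " " + endpoint.getMaxRetries() + " " + endpoint.isAutoCommit());
    }
}
```

With hundreds of options per component, writing and maintaining such a switch by hand is not realistic, which is why generating it from the known options makes sense.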

And btw another feature coming in Camel 3.2 is that we made all of the component options available for configuration; before, we didn't include nested configuration options. And if you do not like configuring in properties files, then we have type-safe component-dsl and endpoint-dsl as well.


2020-02-27

Upcoming Webinar - What's new in Apache Camel 3

On March 3rd, at 3pm CET (Central European Timezone) I will co-host, together with Andrea Cosentino, a webinar session for 1 hour, where we cover all the great new features that are in the Apache Camel v3 release.


Andrea and I will cover in more details the high level goals of Apache Camel 3, and focus on the key elements about making Camel smaller, lighter, and faster for the cloud native world. So you will find details about what we have done internally to make this happen.

We also cover and introduce Camel K, Camel Quarkus and Camel Kafka Connector where we have 4 live demos ready for you. And finally we present the roadmap for the upcoming releases.

At the end we have a Q&A session where we will assist and answer as many of your questions as we can.

The webinar is free to attend, but requires registration, as the webinar is run by a professional media company.

More details and registration here.

PS: Yes we will cover details up till the latest Camel 3.1 release which is going to be released today ;)

2020-02-12

Apache Camel 3.1 - More camel-core optimizations coming (Part 3)

I have previously blogged about the optimizations we are doing in the next Camel 3.1 release


Today I wanted to give a short update on the latest development we have done, as we are closing down on being ready to build and release Camel 3.1 as early as end of this week or the following.

Since part 2, we managed to find an additional 10% reduction in object allocations during routing.

We have also continued the effort of configuring Camel via source code generated configurers that perform direct Java method calls instead of using Java bean reflection. This is now complete for all components, data formats, languages, and EIP patterns. Only more advanced use-cases, where configuration is based on nested complex objects that are dynamically configured, are outside the scope of the source code generated configurers, and there Camel falls back to using reflection.

We also found a way to optimize property placeholder resolution on EIPs to avoid using source code generated configurers, which means that there are 200 fewer classes to load on the classpath, and about 90kb of memory is saved. This is great as these classes and memory were only used during bootstrap of Camel, and now they are all gone.

We also managed to further modularize camel-core, so JAXB and XML routes are optional.
Even for XML routes (not Spring or Blueprint as they have their own DOM XML parser) we have created an alternative, fast and light-weight pull based parser. The camel-example-main-xml is using this, and comparing JAXB vs Camel XML it's 6x faster (approx 1500 millis vs 250) and loads 700 fewer classes than JAXB.

However for non XML users (eg using Java DSL) JAXB can be avoided on the classpath altogether, and you can have tiny Camel applications, such as camel-example-main-tiny with the following dependency tree (the org.apache.camel entries are the Camel JARs; the example uses the bean and timer components):

[INFO] org.apache.camel.example:camel-example-main-tiny:jar:3.1.0-SNAPSHOT
[INFO] +- org.apache.camel:camel-main:jar:3.1.0-SNAPSHOT:compile
[INFO] |  +- org.apache.camel:camel-api:jar:3.1.0-SNAPSHOT:compile
[INFO] |  +- org.apache.camel:camel-base:jar:3.1.0-SNAPSHOT:compile
[INFO] |  +- org.apache.camel:camel-core-engine:jar:3.1.0-SNAPSHOT:compile
[INFO] |  +- org.apache.camel:camel-management-api:jar:3.1.0-SNAPSHOT:compile
[INFO] |  +- org.apache.camel:camel-support:jar:3.1.0-SNAPSHOT:compile
[INFO] |  \- org.apache.camel:camel-util:jar:3.1.0-SNAPSHOT:compile
[INFO] +- org.apache.camel:camel-bean:jar:3.1.0-SNAPSHOT:compile
[INFO] +- org.apache.camel:camel-timer:jar:3.1.0-SNAPSHOT:compile
[INFO] +- org.apache.logging.log4j:log4j-api:jar:2.13.0:compile
[INFO] +- ch.qos.logback:logback-core:jar:1.2.3:compile
[INFO] \- ch.qos.logback:logback-classic:jar:1.2.3:compile
[INFO]    \- org.slf4j:slf4j-api:jar:1.7.30:compile

I ran this example with the profiler and configured it to use 10 MB as max heap (-Xmx10M), and as the summary shows this can easily be done. About 5 MB of the heap is used.



There have also been a few other minor improvements to avoid using Camel 2.x based type converter scanning by default. This saves a scan of the classpath.

Okay it's time to end this blog series and finish up the last bits so we can get Camel 3.1 released.

2020-01-30

Apache Camel 3.1 - More camel-core optimizations coming (Part 2)

I have previously blogged about the optimizations we are doing in the next Camel 3.1 release (part 1).

Today I wanted to post a status update on the progress we have made since, about 4 weeks later.

We have focused on optimizing camel-core in three areas:

  • unnecessary object allocations
  • unnecessary method calls
  • improve performance

In other words we are making Camel create fewer objects, call fewer methods, and perform better during routing.

To help identify these issues in camel-core we were using a simple Camel route:

from timer:foo
  to log:foo

And other times we focused on longer routes:


from timer:foo
  to log:foo1
  to log:foo2
  to log:foo3
  ...
  to log:fooN

Or the focus on the bean component:

from timer:foo
  to bean:foo

And so on. We also added an option to the timer component to not include metadata, so the message does not contain any body, headers or exchange properties. This allowed us to focus on the pure routing engine and its overhead.

So all together this has helped identify many smaller points for improvements that collectively gains a great win.


tl;dr - Show me the numbers

Okay let's post some numbers first and then follow up with details what has been done.

Object Allocations - (5 minute sampling)
Camel 2.25     2.9 M objects created
Camel 3.0       55 M objects created
Camel 3.1      1.8 M objects created

Okay we have to admit that Camel 3.0 has an issue with excessive object allocations during routing. There are no memory leaks but it creates a lot of unnecessary objects. And I will get into details below why.

However what is interesting is the gain between Camel 2.25 and 3.1 (roughly 40% fewer objects created).

Method Calls - (5 minute sampling)

Camel 2.25     139 different Camel methods in use
Camel 3.0      167 different Camel methods in use
Camel 3.1       84 different Camel methods in use

The table above lists the number of methods from Camel that Camel calls during routing. The data does not include the methods from the JDK, as we cannot optimize those, but we can optimize the Camel source code.

As you can see from the table we have an improvement. Camel 3.1 uses about half as many methods as 3.0, and 40% fewer than Camel 2.25.



Camel 3.0
Okay so Camel 3.0 has a problem with using too much memory. A big reason is the new reactive executor which now executes each step in the routing via event looping, by handing over tasks to a queue and having workers that execute the tasks. So this handoff now requires creating additional objects and storing tasks in queue etc.

Some of the biggest wins were to avoid creating TRACE logging messages, which unfortunately were always created regardless of whether the TRACE logging level was enabled. Another big win was to avoid creating the toString representation of the route processes with child elements. Instead Camel now only outputs the id of the process, which is a fast operation and does not allocate new objects.
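The logging fix follows a well known pattern: only build the message when the level is enabled. Below is a small sketch of the idea. It uses java.util.logging (with FINEST standing in for TRACE) only to keep the example free of dependencies, and the describe helper is hypothetical; it is not the actual Camel code.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class TraceGuardDemo {
    static final Logger LOG = Logger.getLogger(TraceGuardDemo.class.getName());

    static {
        // Make sure FINEST (the equivalent of TRACE here) is disabled.
        LOG.setLevel(Level.INFO);
    }

    // Counts how often the (potentially expensive) description is built.
    static int descriptionsBuilt = 0;

    // Hypothetical helper that builds a verbose description of a processor.
    static String describe(String id, int children) {
        descriptionsBuilt++;
        return "Processor[" + id + ", children=" + children + "]";
    }

    // Eager: the message string is always built, even when FINEST is off.
    static void eager(String id, int children) {
        LOG.finest("Processing " + describe(id, children));
    }

    // Guarded: the message is only built when the level is enabled.
    static void guarded(String id, int children) {
        if (LOG.isLoggable(Level.FINEST)) {
            LOG.finest("Processing " + describe(id, children));
        }
    }
}
```

Calling eager builds the string on every call and then throws it away, while guarded does no string building at all as long as the level is disabled.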

Another problem was new code that uses java.util.stream. This is both a blessing and a curse (mostly a curse for fast code). So by using plain for loops, if structures, and avoiding java.util.stream in the critical parts of the core routing engine we reduce object allocations.

Camel 3 is also highly modularised. In Camel 2.x we had all classes on the same classpath and could use instanceof checks. In Camel 3 we had some code doing these kinds of checks that performed poorly (java.util.stream again).

Another problem was the reactive executor, which was using a LinkedList as its queue. If you have tasks going into the queue and workers processing them at the same pace, so the queue is constantly empty/drained, then LinkedList performs poorly as it allocates/deallocates a node object per task. By switching to an ArrayDeque, which has a pre-allocated size of 16, there is always room in the queue for tasks and no allocation/deallocation happens.

There are many more optimisations, but those mentioned above were likely the biggest problems. Then a lot of smaller optimisations gained a lot combined.
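Here is a rough sketch of such an array-backed task queue with a draining worker loop. It assumes java.util.ArrayDeque as the queue (whose default constructor reserves room for 16 elements), and the TinyExecutor class is made up for illustration; the real reactive executor in Camel is more involved.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Minimal sketch of a task queue drained by a worker loop (single threaded).
class TinyExecutor {
    // ArrayDeque is backed by an array, so adding and polling reuse array
    // slots instead of allocating a node object per task as LinkedList does.
    private final Deque<Runnable> queue = new ArrayDeque<>();
    private boolean running;

    void schedule(Runnable task) {
        queue.addLast(task);
        // Only start draining if we are not already inside the worker loop.
        if (!running) {
            drain();
        }
    }

    private void drain() {
        running = true;
        try {
            Runnable task;
            // Tasks scheduled while a task runs are picked up by this loop.
            while ((task = queue.pollFirst()) != null) {
                task.run();
            }
        } finally {
            running = false;
        }
    }
}
```

A task that schedules a follow-up task (like one routing step handing over to the next) just appends to the same array, and the loop picks it up once the current task returns.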


Many smaller optimisations

The UUID generator of Camel is using a bit of string concat which costs. We have reduced the need for generating UUIDs in the message and unit of work so we only generate 1 per exchange.

The internal advices in the Camel routing engine (advice = before/after AOP) have also been optimized. Some of these advices have state which they need to carry over from before to after, which means an object needs to be stored. Before, we allocated an array for all advices, even for those that do not have state and thus store a null. Now we only allocate the array with the exact number of advices that have state (a very small win, eg object[6] vs object[2] etc, but this happens per step in the Camel route, so it all adds up). Another win was to avoid doing an AOP around UnitOfWork if it was not necessary from the internal routing processor. This avoids additional method calls and allocating a callback object for the after task. As all of this happens for each step in the routing, it's a good improvement.

Some of the most used EIPs have been optimized. For example the to EIP allows you to send the message to an endpoint using a different MEP (but this is rarely used). Now the EIP detects this and avoids creating a callback object for restoring the MEP. The pipeline EIP (eg when you do to -> to -> to) also has a little improvement to use an index counter instead of java.util.Iterator, as the latter allocates an extra object.
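The index counter idea for the pipeline can be sketched as follows. The IndexedPipelineDemo class and its Runnable steps are hypothetical stand-ins for the processors in a pipeline, and the approach assumes a random-access list such as ArrayList.

```java
import java.util.ArrayList;
import java.util.List;

class IndexedPipelineDemo {
    // Runs each step in order using an int index. On a random-access list
    // such as ArrayList this avoids creating an Iterator object per run.
    static int runAll(List<Runnable> steps) {
        int executed = 0;
        for (int i = 0; i < steps.size(); i++) {
            steps.get(i).run();
            executed++;
        }
        return executed;
    }

    public static void main(String[] args) {
        List<Runnable> steps = new ArrayList<>();
        steps.add(() -> System.out.println("to log:foo1"));
        steps.add(() -> System.out.println("to log:foo2"));
        System.out.println("executed " + runAll(steps) + " steps");
    }
}
```

The JIT compiler can sometimes remove the iterator allocation on its own, so a change like this is only worth it in code that runs for every message.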

Camel also has a StopWatch that used a java.util.Date to store the time. This was optimized to use a long value.
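A stop watch that only keeps a primitive long could look like the sketch below. The LongStopWatch class is made up for illustration, and using System.nanoTime is my assumption here, not necessarily what the Camel StopWatch does.

```java
// Hypothetical stop watch that keeps the start time as a primitive long,
// so creating or restarting it does not allocate a java.util.Date.
class LongStopWatch {
    private long start;

    LongStopWatch() {
        this.start = System.nanoTime();
    }

    void restart() {
        start = System.nanoTime();
    }

    // Elapsed time in milliseconds since creation or the last restart.
    long takenMillis() {
        return (System.nanoTime() - start) / 1_000_000;
    }
}
```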

Another improvement is the event notification. We now pre-calculate if it's in use and avoid calling it altogether for events related to routing messages. BTW in Camel 3.0 the event notifier was refactored to use Java 8 Suppliers and many fancy APIs, but all of that created a lot of overhead. In Camel 3.1 we have restored the notifier to be like before in Camel 2.x, with additional optimisations.

So let me end this blog by saying that .... awesome. Camel 3.1 will use less memory, execute faster by not calling as many methods (mind that we may have had to move some code which was required to be called but doing this in a different way to avoid calling too many methods).

One of the bigger changes in terms of touched source code was to switch from using an instance based logger in ServiceSupport (base class for many things in Camel), to use a static logger instance. This means that there will be less Logger objects created and it's also better practice. 


Better performance
Another improvement is that we have moved some of the internal state that Camel kept as exchange properties to fields on the Exchange directly. This avoids storing a key/value in the properties map, and we can use primitives like boolean, int etc. This also performs better as it's faster to get a boolean via a getter than to look up the value in a Map via a key.


In fact, in Camel 3.1 Camel does not look up any such state from exchange properties during regular routing, which means there are no such method calls. There is still some state that is stored as exchange properties (some of it may be improved in the future, however most of that state is only used infrequently). What we have optimized is the state that is always checked and used during routing.

Exchange getProperty - (5 minute sampling)
Camel 2.25     572598   getProperty(String)
Camel 2.25     161502   getProperty(String, Object)
Camel 2.25     161502   getProperty(String, Object, Class)
Camel 2.25     141962   getProperties()

Camel 3.0      574944   getProperty(String)
Camel 3.0      167904   getProperty(String, Object)
Camel 3.0      167904   getProperty(String, Object, Class)
Camel 3.0       91584   getProperties()

Camel 3.1           0   getProperty(String)
Camel 3.1           0   getProperty(String, Object)
Camel 3.1           0   getProperty(String, Object, Class)
Camel 3.1           0   getProperties()


As you can see, Camel 2.25 and 3.0 look up this state a lot. In Camel 3.1 we have optimized this tremendously and there are no lookups at all - as said, the state is stored on the Exchange as primitive types which the JDK can inline and execute really fast.

The screenshot below shows Camel 2.25 vs 3.1. (The screenshot for 3.1 is slightly outdated as it was from yesterday and we have optimised Camel since.)
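The difference between the two ways of keeping state can be sketched like this. Both classes and the DemoRouteStop key are hypothetical and much simpler than the real Exchange; they only show a map lookup versus a plain field read.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical exchange that keeps internal routing state in the properties map.
class MapBackedExchange {
    private final Map<String, Object> properties = new HashMap<>();

    void setProperty(String key, Object value) {
        properties.put(key, value);
    }

    Object getProperty(String key) {
        return properties.get(key);
    }

    // Every check costs a hash lookup and an equals call on a boxed Boolean.
    boolean isRouteStop() {
        return Boolean.TRUE.equals(getProperty("DemoRouteStop"));
    }
}

// Hypothetical exchange that keeps the same state in a primitive field.
class FieldBackedExchange {
    private boolean routeStop;

    void setRouteStop(boolean routeStop) {
        this.routeStop = routeStop;
    }

    // A plain field read; no map access and nothing to box.
    boolean isRouteStop() {
        return routeStop;
    }
}
```

When a check like this runs for every step of every message, replacing the map lookup with a field read removes both the getProperty calls and the boxed values.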



Okay there are many other smaller optimizations and I am working on one currently as I write this blog. Okay let me end this blog, and save details for part 3.


2020-01-03

Apache Camel 3.1 - More camel-core optimizations coming

Hope all is good and you had a safe entry into 2020. 

The Camel team is already busy working on the next Camel 3.1 version. One of the goals is to continue optimizing camel-core, and this time we have had some time to look into finding some hot spots in the routing engine.

One of the aspects we have looked at is the object allocations that occur per message that Camel routes. The JVM itself is great at allocating objects and garbage collecting them when they are no longer in use. However there is room for improvement if you can identify a number of objects that are unnecessary per EIP in the route.

So today I found several of these by just running a basic Camel route that is

from("timer:foo?period=1")
  .to("log:foo");

Which basically routes 1000 messages per second. And prints each message to the log.

One of the bigger culprits in object allocations turned out to be human logging for the reactive executor which logs at TRACE level. So by avoiding this we can reduce a great deal of allocations, and string building for logging messages.

Another aspect we have optimised is the to EIP (the most used EIP), which is now smarter during startup and avoids creating caches that were not necessary. This goes together with areas where we now lazily create some features in Camel that are very rarely in use and would otherwise also set up and create some caches.

We also identified that, as part of the Camel 3 work, the LRUCache was no longer warmed up as early as before, which meant Camel would start up a bit slower than it is otherwise capable of. By moving this warmup to an earlier phase, Camel can start up faster by doing concurrent work on startup while the LRUCache is warming up (it's the Caffeine cache that needs this).

The log component has also been optimised to reduce its object allocations when building the logging message.

So all together a great day, and if we compare starting up Camel 3.0.0 vs 3.1.0-SNAPSHOT with the Camel route shown above, then we have an awesome reduction in object allocations per second (thanks to YourKit for the profiler).


The profile says that Camel 3.0.0 would generate roughly 22.000 objects per second (routing 1000 messages). That has been reduced to about 6.000 objects per second in Camel 3.1. That is fantastic, and is almost a 4x reduction.