
Tomcat Performance Tools

Posted by dave on Feb 22, 2013 in Blog | No Comments

Tomcat is a very popular deployment target for the J2EE web applications that we write at Kettle River. In fact, the majority of our clients use Tomcat as their Java application server. Most of these applications persist their data in a relational database. Additionally, these applications may use external Web services to add features or get data from external systems.

Prudent application development techniques help an application perform well under its anticipated workload. These techniques include:
• Sound relational database design
• Proper database implementation
• Database tuning including indexes and transaction management
• Proper attention to the performance of external Web services
• Adequate hardware for the application including memory and disk subsystem.
It is interesting that most of these application development techniques focus on the database. In our experience, this is where most of the problems lie. Modern hardware (physical or virtualized) can minimize the impact of bad code on processor performance. But you can’t get away from the cost of accessing a disk drive or a network to reach the database. A database call that averages one second and runs every time a page is displayed is going to make for a slow application.

Why is performance important? One obvious reason that applies to internal sites is user productivity. For e-commerce sites, a slow web site can hinder conversion rates. Statistics show that users will “bounce”, or leave a slow web site, with alarming frequency. Data from O’Reilly indicates that a 2-second slowdown drops revenue by 4%. There is also a nice writeup on how speeding up a web site improves business for more niche e-commerce sites.

So performance is important. What do you do when performance is poor? Or what do you do to prove to your management that your Web application meets its scalability and performance requirements? Most developers start looking for “obvious” bottlenecks and start optimizing away. Sometimes that approach works, but most of the time it leads to unnecessary changes. Or the team makes changes that actually hurt performance.

At Kettle River we use two tools to determine what is causing Tomcat applications to slow down. Those tools are VisualVM and psi-probe. VisualVM provides information about memory usage, garbage collection, CPU and other vital statistics for an application under test. It is very handy when you have a test environment that can replicate a performance issue. “psi-probe” is useful in a production environment when you can’t replicate the issue in a test environment.

VisualVM connects to a running Tomcat instance using JMX. There are a bunch of good tutorials on how to run VisualVM. Here are a couple of possibilities.

Connecting Visual VM to Tomcat
Running VisualVM over SSH

VisualVM produces some pretty graphs which can give you a quick visual idea as to what is happening. Here is a sample.

[Screenshot: VisualVM 1.3.5]

“psi-probe” is a web application that deploys in the container alongside your application, so it does not require changes to your running code. “psi-probe” is available from its project site.
If you suspect a memory issue, psi-probe can show the memory usage. The screenshot below shows a running system with 2 GB of heap allocated to the system. Note that you can advise the JVM to run a garbage collection from this screen as well as observe the memory usage. This can be quite handy.
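
The heap numbers these tools display come from the JVM's standard management beans, so you can read the same values from inside your own code. The following is a minimal sketch, not something taken from psi-probe or VisualVM; the class name `HeapSnapshot` and its method names are made up for illustration. It assumes only the `java.lang.management` API that ships with the JDK.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper for illustration; not part of psi-probe or VisualVM.
class HeapSnapshot {

    /** Heap bytes currently in use, as reported by the platform MemoryMXBean. */
    static long usedHeapBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** Maximum heap in bytes; the JVM may report -1 when no maximum is defined. */
    static long maxHeapBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getMax();
    }

    /** Advises the JVM to collect garbage; like System.gc(), this is only a request. */
    static void adviseGc() {
        ManagementFactory.getMemoryMXBean().gc();
    }

    public static void main(String[] args) {
        System.out.println("used before GC: " + usedHeapBytes() + " of max " + maxHeapBytes());
        adviseGc();
        System.out.println("used after GC:  " + usedHeapBytes());
    }
}
```

Logging these values around a suspect request is a cheap way to cross-check what the graphs show, but it is no substitute for watching the trend over time in the tools themselves.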


[Screenshot: psi-probe]

Both psi-probe and VisualVM are JMX based. So if your application is completely wrapped around the axle, you may not be able to see much. But these tools are invaluable for tracking down performance issues.

While tools like VisualVM and psi-probe can point out issues, the key is interpretation of the results. It is easy to draw conclusions about the cause of a performance issue that may not be correct. Careful gathering of metrics and analysis are vital to a successful performance tuning program. At the risk of sounding self-serving, hiring a consultant who has experience can help isolate performance issues faster and train your team on how to use these tools to their best effect.

Controlling Web Services Timeouts with Threads

Posted by dave on Feb 4, 2013 in Blog | No Comments

We have been working on performance tuning for a new release of a publicly-facing application.  Users hit the site regularly, and we average about 1.6M page views per month on the site, with a heavy concentration on a certain set of pages.  The most frequently used pages call external RESTful Web Services to provide analysis of user content.  The existing version of this site performs well enough, but we need more flexibility so we have decided to rewrite.  There are risks to a rewrite but we thought the added flexibility would be worth it.

One of the issues we have been seeing on the new release is random, sporadic slowdowns in performance.  Generally these slowdowns manifest themselves as large numbers of sockets in CLOSE_WAIT states.  These sockets were opened by Tomcat to call the RESTful Web Service.  Sometimes the slowdown is accompanied by a large spike in CPU utilization.  We first suspected garbage collection issues, but that doesn’t really explain the CLOSE_WAIT sockets owned by Tomcat.  Calling external Web services (or internal Web Services for that matter) is a common architectural approach, so we needed to test how a delay in the external Web service might get the performance problems to manifest themselves in a controlled environment.

Christine and I wrote a JMeter performance test to emulate the most frequently accessed pages of the site (more on that in a separate article). Then I created a mock server to emulate issues with the web services not returning results quickly. I should point out that performance issues in remote web services are hard to diagnose. These issues are NOT necessarily related to the implementation of the web service or its client code; sometimes the internet is just not that fast. Also, sometimes folks run DoS attacks that inflict damage on the remote web service. Nevertheless, this test revealed some interesting results. Running the test with 1,000 threads against random delays of 10 to 30 seconds promptly shut off all network access to the server until the network stack recovered. When I finally was able to log back in, I discovered over 100 sockets in CLOSE_WAIT status owned by Tomcat. Eureka!
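
If you want to try a similar experiment, a mock of a slow service does not need much code. The sketch below is not the mock server we used; it is a minimal stand-in built on the `com.sun.net.httpserver` package bundled with the JDK, written for a current JDK (it uses lambdas). The class name `SlowMockServer`, the `/analyze` path, and the JSON payload are all invented for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical mock of a slow web service; class, path, and payload are illustrative.
class SlowMockServer {

    /** Starts a server on an ephemeral local port that delays each reply by a random interval. */
    static HttpServer start(long minDelayMs, long maxDelayMs) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/analyze", exchange -> {
            long delay = ThreadLocalRandom.current().nextLong(minDelayMs, maxDelayMs + 1);
            try {
                Thread.sleep(delay); // simulate the slow remote service
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            byte[] body = "{\"ok\":true}".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        // Daemon threads, one per in-flight request, so slow replies do not block each other.
        server.setExecutor(Executors.newCachedThreadPool(r -> {
            Thread t = new Thread(r, "mock-service");
            t.setDaemon(true);
            return t;
        }));
        server.start();
        return server;
    }

    /** Starts the mock, calls it once, stops it, and returns the elapsed time in milliseconds. */
    static long timeOneCallMs(long minDelayMs, long maxDelayMs) {
        try {
            HttpServer server = start(minDelayMs, maxDelayMs);
            try {
                int port = server.getAddress().getPort();
                long begin = System.nanoTime();
                HttpURLConnection conn = (HttpURLConnection) URI
                        .create("http://127.0.0.1:" + port + "/analyze")
                        .toURL().openConnection(Proxy.NO_PROXY);
                try (InputStream in = conn.getInputStream()) {
                    while (in.read() != -1) { /* drain the response */ }
                }
                return (System.nanoTime() - begin) / 1_000_000;
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("elapsed ms: " + timeOneCallMs(200, 400));
    }
}
```

For a load test you would call `start(10_000, 30_000)` once, point the application's web service URL at the port it reports, and leave it running while JMeter drives traffic.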

This test tells me that we want to insulate ourselves from having the web service take a long time to execute.  It’s time for some threading to time out the request and let us keep on trucking.

Back in JDK 1.0.2, we would write a thread pool backed by a Vector of threads. The threads would do the actual call of the web service and the Tomcat thread would wait until notified or timed out. I won’t bore you with the code details. It involved writing an inner class and using synchronized blocks with notify() and wait(). But we are using Java 6 and have java.util.concurrent at our disposal. Let’s see how this can help us out.

We use ReCaptcha as one of our web services.  To start, I moved the code to call Recaptcha to a separate worker class.  This class implements java.util.concurrent.Callable.  Since we need a Boolean to tell us if the captcha was OK, we use the generic interface Callable<Boolean>.   The code is pretty simple.  We put the guts of the Recaptcha call in the call() method and return a Boolean.

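    // Assumes the recaptcha4j classes (ReCaptcha, ReCaptchaResponse),
    // commons-lang StringUtils, and a static logger in the enclosing class.
    static class ReCaptchaWorker implements Callable<Boolean> {
        ReCaptcha reCaptcha;
        String remoteAddress;
        String recaptcha_challenge_field;
        String recaptcha_response_field;

        ReCaptchaWorker(ReCaptcha reCaptcha,
                       String remoteAddress,
                       String recaptcha_challenge_field,
                       String recaptcha_response_field) {
            this.reCaptcha = reCaptcha;
            this.remoteAddress = remoteAddress;
            this.recaptcha_challenge_field = recaptcha_challenge_field;
            this.recaptcha_response_field = recaptcha_response_field;
        }

        public Boolean call() {
            boolean recaptchaOK = true;
            // Fail fast if either form field is missing.
            if (StringUtils.isEmpty(recaptcha_response_field)
               || StringUtils.isEmpty(recaptcha_challenge_field)) {
                // Assumption: the log level used here is error.
                logger.error("Missing captcha fields.");
                recaptchaOK = false;
            }

            if (recaptchaOK) {
                // Assumption: the check is recaptcha4j's
                // checkAnswer(remoteAddr, challenge, response).
                ReCaptchaResponse captchaResponse =
                    reCaptcha.checkAnswer(remoteAddress,
                                          recaptcha_challenge_field,
                                          recaptcha_response_field);
                if (!captchaResponse.isValid()) {
                    recaptchaOK = false;
                    // Assumption: the logged detail is getErrorMessage().
                    logger.error("Recaptcha error=" +
                                 captchaResponse.getErrorMessage());
                }
            }
            return recaptchaOK;
        }
    }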

Next, I created an Executor backed by a pool of threads.  I decided to rename the threads and make them daemon, so I could shut down Tomcat without killing the process.

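    ExecutorService executor;
    static int captchaWorkerCounter = 0;

    public CaptchaHelper() {
        ThreadFactory factory = new ThreadFactory() {
            public Thread newThread(Runnable r) {
                // Pass the Runnable to the Thread, or the task never runs.
                Thread t = new Thread(r);
                // Daemon threads do not keep the JVM alive at shutdown.
                t.setDaemon( true );
                // Increment so each worker gets its own name (not atomic;
                // the names are only used for diagnostics).
                t.setName("CaptchaWorker-" + captchaWorkerCounter++);
                return t;
            }
        };
        executor = Executors.newCachedThreadPool(factory);
    }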

IMPORTANT SAFETY TIP:  Don’t forget to create the Thread with the Runnable.  Otherwise your code won’t run.  Don’t ask how I know this.  I just wish I could get that hour back…
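
If you would rather not share a static counter between threads, one variation is a small named factory that numbers its threads with an AtomicInteger. This is a sketch of that alternative, not the code we shipped; the class name `NamedDaemonThreadFactory` is invented for illustration.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical alternative to an anonymous factory; the class name is illustrative.
class NamedDaemonThreadFactory implements ThreadFactory {
    private final String prefix;
    // AtomicInteger keeps the numbering correct when several threads are created at once.
    private final AtomicInteger counter = new AtomicInteger();

    NamedDaemonThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable r) {
        // The Runnable must be handed to the Thread, or the task never runs.
        Thread t = new Thread(r, prefix + "-" + counter.incrementAndGet());
        // Daemon threads do not keep the JVM alive at shutdown.
        t.setDaemon(true);
        return t;
    }
}
```

It plugs into the pool the same way: `Executors.newCachedThreadPool(new NamedDaemonThreadFactory("CaptchaWorker"))`. Distinct thread names pay off later, when you are reading a thread dump or the thread view in VisualVM.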

Calling the web service is easy. I have left the exception handling out for clarity. The key is the get() method. It waits up to 10 seconds for a result; if none arrives in that time, it throws a TimeoutException and the request can move on.

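            // Assumption: the challenge/response variable names at the call site.
            ReCaptchaWorker worker = new ReCaptchaWorker(
                                     reCaptcha, remoteAddr,
                                     recaptcha_challenge_field,
                                     recaptcha_response_field);
            Future<Boolean> captchaTask = executor.submit(worker);
            // Wait at most 10 seconds; get() throws TimeoutException after that.
            captchaOK = captchaTask.get(10, TimeUnit.SECONDS);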

Now we have a thread pool, a worker and a way to timeout the request, all without writing any notify/wait or synchronization blocks.  Pretty cool!
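
One detail is worth spelling out: a timed-out get() does not stop the worker. The task keeps running on its pool thread unless you cancel it. Here is a minimal sketch of the exception handling with that cancellation added. The helper name `callWithTimeout` and its fallback parameter are made up for illustration and are not from our code base. Note also that `cancel(true)` only interrupts the worker thread, so it helps only if the task responds to interruption.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper; the name and the fallback parameter are illustrative.
class TimedCall {

    /** Runs the task on the executor and returns the fallback if it fails or exceeds the timeout. */
    static <T> T callWithTimeout(ExecutorService executor, Callable<T> task,
                                 long timeout, TimeUnit unit, T fallback) {
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            // The caller stops waiting here; cancel so the worker is interrupted too.
            future.cancel(true);
            return fallback;
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            return fallback;
        } catch (ExecutionException e) {
            // The task itself threw; treat that as a failed call.
            return fallback;
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newCachedThreadPool();
        Callable<Boolean> slow = () -> {
            Thread.sleep(5000); // stands in for a web service that is not answering
            return Boolean.TRUE;
        };
        Boolean ok = callWithTimeout(executor, slow, 100, TimeUnit.MILLISECONDS, Boolean.FALSE);
        System.out.println("captchaOK=" + ok); // prints captchaOK=false
        executor.shutdownNow();
    }
}
```

Whether a timeout should count as a failed captcha or be reported to the user as "try again" is a design decision; the fallback parameter just makes that choice explicit at the call site.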

Why is this important? Code that performs well and handles unexpected delays is a key part of writing “boring software”.   Your users expect your code to be fast and predictable.  This technique can help your code meet those goals.

Credits: I picked up some examples from other articles online, and found a tutorial by Jakob Jenkov useful when I messed up the newThread code.
