ColdFusion performance issues and troubleshooting
Performance issues are one of the biggest challenges to expect when designing and implementing web applications. Performance problems can disrupt your business, which can result in short and long-term loss of revenue.
The major performance issues can be categorized as: CPU hikes, website crashes, slow request processing, memory issues (for example, OutOfMemory errors and memory leaks), Error 503/Service unavailable errors, slow database queries, SecureRandom seed generation on some Linux servers, network latency, and other similar issues.
ColdFusion is a Java-based application server, so any Java-related change directly impacts ColdFusion. With the introduction of Java 1.8, ColdFusion had to be optimized for it. Even after that optimization, a few parameters might still cause a performance hit on the ColdFusion server.
Spikes in CPU usage are the most common performance issue we have seen. Without load and performance testing, the impact of a change on existing CPU utilization usually cannot be predicted. CPU hikes can occur for various reasons, such as:
- Out of memory issues
- Excessive Garbage collection
- Slow database query processing
- Network latency
- Linux random number generation
- Security scanner
Out of memory issues
When CPU surges or spikes are seen in your ColdFusion application, check the ColdFusion logs for OutOfMemory entries. There are two possible scenarios:
An OutOfMemory: heap error generally occurs when application usage exceeds the configured upper limit. In addition, a maximum heap value lower than actual usage can slow down the JVM.
OutOfMemory issues also occur when the garbage collector is unable to reclaim memory. This can happen because of strong references to stale objects, or under aggressive load, when the OOM error is thrown before GC can clean up. The default value for Maximum JVM Heap Size is 1 GB in ColdFusion (2016 release).
Based on your application’s memory usage, update the maximum heap value. Change the value in ColdFusion Administrator or in jvm.config (ColdFusionXXXX/instance_name/bin).
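As a minimal sketch, assuming the standard HotSpot -Xmx option on the java.args line of jvm.config, a raised maximum heap could look like the fragment below. The 2048m value is purely illustrative, not a recommendation; derive the real value from your measured memory usage.

```text
# Fragment of the java.args line in jvm.config (illustrative value only)
-Xmx2048m
```

ColdFusion must be restarted for a jvm.config change to take effect.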
A new flag is available in Java 1.8 (MaxMetaspaceSize), which allows you to limit the amount of native memory used for class metadata.
With Metaspace, most class metadata is allocated out of native memory. By default, class metadata allocation is limited only by the amount of available native memory. Garbage collection of dead classes and classloaders is triggered once the class metadata usage reaches the “MaxMetaspaceSize”.
Proper monitoring and tuning of the Metaspace is required to limit the frequency or delay of such garbage collections. Excessive Metaspace garbage collections may be a symptom of a class or classloader memory leak, or of inadequate sizing for your application. If you don’t specify this flag, the Metaspace resizes dynamically depending on application demand at runtime.
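A hedged example of capping Metaspace with the flag named above. The 256m figure is an assumed placeholder chosen for illustration; size it from the class metadata usage you actually observe.

```text
# Fragment of the java.args line in jvm.config (illustrative value only)
-XX:MaxMetaspaceSize=256m
```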
Excessive Garbage collection
Extra load on a server triggers increased GC and causes CPU spikes. There are four types of garbage collectors in Java 1.8: Serial, Parallel, Concurrent Mark Sweep (CMS), and G1.
For more information on Garbage collections please refer to:
By default, ColdFusion uses parallel GC. You can change the values in jvm.config (ColdFusionXXXX/instance_name/bin):
-XX:+UseG1GC - This is recommended when the heap size is large (more than 4 GB).
For detailed investigation of memory leaks or out-of-memory errors, a heap dump analysis can be very useful. Add the following JVM arguments in jvm.config (ColdFusionXXXX/instance_name/bin) to obtain a heap dump:
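A minimal sketch, assuming the standard HotSpot heap dump options are the ones intended here. The dump path is a placeholder and must point to a writable location with enough free disk space for a file roughly the size of the heap.

```text
# Write a heap dump automatically when an OutOfMemoryError is thrown
-XX:+HeapDumpOnOutOfMemoryError
# Placeholder path: directory (or file) where the .hprof dump is written
-XX:HeapDumpPath=/path/to/dumps
```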
If you have a JDK installed, run the following command from the \jdk\bin directory:
jmap -dump:format=b,file=dump.hprof <pid>
where <pid> is the ColdFusion process ID.
You can use Eclipse Memory Analyzer Tool (MAT) to review heap dumps.
Slow Database query processing
ColdFusion logs (application, exception, and error logs) sometimes indicate whether your queries are timing out. You can then identify the slow queries and fix them.
ColdFusion closes connections after the timeout. Idle or unclosed connections are reused as and when required, unless the connection is still busy executing a query. If query execution takes too long, the problem lies with either the application or the database.
Technical details about the timeout:
The ColdFusion Administrator takes two parameters for this: Timeout and Interval.
CF closes a maximum of 5 timed-out connections at each interval. For example, if you have 20 open connections with the timeout set to 10 minutes and the interval set to 5 minutes, CF will close:
- 0 connections after 5 mins
- 5 connections after 5 more mins
- 5 more connections after 5 more mins
- 5 more connections after 5 more mins
- 5 more connections after 5 more mins
By this calculation, ColdFusion takes at least 25 minutes to close all 20 open connections. The limit of 5 timed-out connections closed per interval is by design and is not configurable.
The optimized values are a timeout of 5 and an interval of 1. You can tune them further as per your application requirements. You can change the database timeout value in the ColdFusion Administrator, under Advanced Settings in Data & Services > Datasources, to optimize the handling of idle/unclosed connections.
Network latency
If the application code of a ColdFusion application resides on and is accessed from a shared drive, network latency can cause slow request processing, resulting in performance issues. This can even cause the server to crash or become unresponsive. It is recommended to check your internal network throughput. You can also refer to the information available on the following blog:
You may try the following:
Add the JVM arguments below to speed up the processing of CFM pages on a network or shared location:
Note that 30 seconds is the default timeout.
This enables the canonical cache, which caches the canonical path of a file. This helps when many threads are waiting to get a path from WinNTFileSystem. When files are accessed from a network drive, each “getCanonicalPath” call goes to the network and becomes quite expensive. With this cache enabled, the JVM does not go back to disk to find the path of the same file for as long as the entry remains in the cache.
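A sketch of the canonical cache setting, assuming the Java 1.8 system property sun.io.useCanonCaches is the switch being described. This may not be the complete set of arguments intended for this step.

```text
# Cache canonical file paths inside the JVM instead of resolving them on every call
-Dsun.io.useCanonCaches=true
```

If jvm.config already contains this property with the value false, change that existing entry rather than adding a second one.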
Linux random number generation
Random number generation and server startup are slow on some servers on Unix platforms. This could be because /dev/random is used on Unix platforms for random number generation.
java.security.SecureRandom is designed to be cryptographically secure. It provides strong and secure random numbers. SecureRandom should be used when high-quality randomness is important and is worth the CPU cost. You can add the JVM argument below to get rid of the performance issue caused by random number generation:
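A minimal sketch, assuming the commonly used java.security.egd system property is the argument intended here. It points SecureRandom seeding at the non-blocking /dev/urandom device instead of /dev/random.

```text
# Seed SecureRandom from the non-blocking urandom device
-Djava.security.egd=file:/dev/./urandom
```

The extra "/./" in the path is deliberate. It is the form widely used with Java 1.8 so that the JVM does not treat the value as its built-in default.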
Security scanner
If you see CPU spikes at specific times of the day or week, this could be due to a third-party security scanner interfering with your ColdFusion application. The scanner hits the server monitoring port (5500 by default) on 0.0.0.0, which sends the server into an infinite loop and causes it to crash.
To fix this issue, modify jetty.xml in ColdFusionXXXX\cfusion\lib: change the server monitoring IP address from 0.0.0.0 to 127.0.0.1 and restart ColdFusion.
If your application has a large code cache set via -XX:ReservedCodeCacheSize, you can limit its impact by disabling code cache flushing. If flushing is disabled, the JIT does not compile methods after the code cache fills up, and hence there won’t be CPU hikes. You can add the following JVM argument, which controls code cache flushing:
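A hedged sketch, assuming the HotSpot option UseCodeCacheFlushing is the one meant here. Both forms are shown because the "+" prefix enables flushing and the "-" prefix disables it; use only one of them.

```text
# Enable code cache flushing
-XX:+UseCodeCacheFlushing
# Disable code cache flushing
-XX:-UseCodeCacheFlushing
```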
You can also disable tiered compilation with the argument below:
-XX:-TieredCompilation (Applicable only to Java 1.8. Java versions earlier than 8 do not have tiered compilation enabled by default.)
Service unavailable error
503 - Service unavailable is a generic error. Whenever you get this error, first check whether ColdFusion is started and running. If you experience intermittent 503s, investigate whether a less responsive server is dropping requests. This could be because of long GC pauses or anything else that delays the response from the ColdFusion server. Tuning the ColdFusion connector can help overcome the service unavailable error. The blog post below can be used to tune the ColdFusion connector and avoid such errors.
We have also seen some issues caused by bugs in a few specific update levels of Java. The best practice is to keep the Java version used by ColdFusion updated to the latest version. Use the blog below to keep Java up to date.
ColdFusion thread dumps
ColdFusion thread dumps can be used to analyze New, Runnable, Blocked, Waiting, Timed_Waiting, and Terminated threads.
Issues such as thread races, deadlocks, hung IO calls, GC/OutOfMemory exceptions, and infinite loops can be identified using thread dumps. The following blog can be used to take a thread dump on a ColdFusion server:
If you are on ColdFusion 11 Update 12 or ColdFusion (2016 release), you can skip copying threaddump.jar. Use the takethreaddump.cfm file to capture the thread dump.
Another issue we have seen in one or two cases is slow XML parsing. If performance is impacted by XML parsing, the JVM argument below can fix it:
Other causes of performance issues may include:
- Lack of proper database SQL tuning & capacity planning
- Application specific performance problems
- Lack of proper data caching
- Excessive data caching
- Excessive logging
If the above steps do not resolve the issue, contact Adobe support (https://helpx.adobe.com/support/coldfusion.html) for analysis and resolution.