The simplest is probably a basic jstat reporting tool. We take snapshots of our servers with jstat about once a minute and save the results in a database. From those results I can generate a lot of useful information about our JVM. A few basic cfchart calls give me charts like the following, which details heap usage on the server.
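As a rough illustration of the snapshot step, here is a minimal Python sketch that turns the text output of `jstat -gc <pid>` into a heap-usage figure you could store in a database. It is not the tool described above: the function names and the sample numbers are made up for the example, and it assumes the standard `-gc` columns, where S0U, S1U, EU and OU are the used kilobytes of the two survivor spaces, eden and the old generation.

```python
# Columns of `jstat -gc` that report used / total space (in KB) per heap region.
HEAP_USED_COLUMNS = ("S0U", "S1U", "EU", "OU")
HEAP_CAPACITY_COLUMNS = ("S0C", "S1C", "EC", "OC")

# Illustrative output only; these numbers are invented for the example.
SAMPLE_OUTPUT = """
 S0C    S1C    S0U    S1U      EC       EU        OC         OU       PC     PU    YGC     YGCT    FGC    FGCT     GCT
1024.0 1024.0  0.0   512.0  8192.0   4096.0   20480.0    10240.0   16384.0 8192.0   5    0.050   1      0.100    0.150
"""


def to_number(text):
    """Convert one jstat cell to a float; return None for cells such as '-'."""
    try:
        # Some locales print a decimal comma instead of a decimal point.
        return float(text.replace(",", "."))
    except ValueError:
        return None


def parse_jstat_gc(output):
    """Parse a header line plus one value line into a {column: value} dict."""
    lines = [line for line in output.strip().splitlines() if line.strip()]
    header, values = lines[0].split(), lines[-1].split()
    if len(header) != len(values):
        raise ValueError("header and value columns do not line up")
    return {name: to_number(value) for name, value in zip(header, values)}


def heap_used_kb(sample):
    """Used heap in KB: both survivor spaces + eden + old generation."""
    return sum(sample[column] for column in HEAP_USED_COLUMNS)


def heap_capacity_kb(sample):
    """Currently committed heap in KB for the same four regions."""
    return sum(sample[column] for column in HEAP_CAPACITY_COLUMNS)


if __name__ == "__main__":
    sample = parse_jstat_gc(SAMPLE_OUTPUT)
    print(heap_used_kb(sample), "KB used of", heap_capacity_kb(sample), "KB")
```

Parsing by column name rather than by position keeps the sketch working when the non-heap columns differ between JVM versions (for example, the permanent generation columns shown here versus the metaspace columns in newer JVMs).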
The chart is useful, but it doesn't tell you exactly WHAT is contributing to instability in the heap.
This brings us to a fun little tool: the Eclipse Memory Analyzer (MAT).
MAT is a tool for analyzing heap dumps. What's a heap dump? It's a snapshot of everything in your heap at the point in time the heap dump was made.
Heap dumps are made with the jmap utility, which comes in the bin directory of every JDK as of Java 5. To take a heap dump on a Windows machine with a ColdFusion server running, do as follows:
- Go to the Task Manager and find the process ID (pid) of the ColdFusion server.
- Execute "C:\mypathtojdk\bin\jmap" -dump:format=b,file=C:\MyPathToHeapDumps\heap.bin xxxx, replacing the two paths with your own and xxxx with the pid.
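If you take heap dumps often, the command in the steps above can be wrapped in a small script. This is only a sketch: the function names are hypothetical, the paths are the same placeholders used in the steps and must be replaced with your own, and the pid still has to be looked up first.

```python
import subprocess


def build_jmap_command(jmap_path, dump_file, pid):
    """Build the argument list for the jmap heap dump command shown above."""
    # int() rejects anything that is not a plain process ID.
    return [jmap_path, "-dump:format=b,file=" + dump_file, str(int(pid))]


def take_heap_dump(jmap_path, dump_file, pid):
    """Run jmap and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_jmap_command(jmap_path, dump_file, pid), check=True)


if __name__ == "__main__":
    # Placeholder paths and pid; replace all three before running.
    command = build_jmap_command(
        r"C:\mypathtojdk\bin\jmap",
        r"C:\MyPathToHeapDumps\heap.bin",
        1234,
    )
    print(" ".join(command))
```

Passing the arguments as a list rather than a single string avoids quoting problems when the JDK path contains spaces.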
Once you open the heap dump in MAT, the default option is a report that automatically checks for "Leak Suspects". This is sometimes useful, but not always. With ColdFusion you'll find in any case that you need to drill down through all of the Java objects that are part of the CF library before you get to the true offenders.
Here's the leak report, and yeah, I know there's not much in that JVM right now. :)
There are a number of reports, and you'll just have to play around to see which one tells you the most. It varies, and depends to some extent on preference. I generally go into the Histogram, sort the objects by retained heap, look for possible suspects, and then follow their paths to the GC roots to see what is keeping them alive.
One tip: at least as of CF 8, you will find that ColdFusion applications are represented as FastHashTables in the JVM. In the following screenshot I've opened one of these FastHashTables. Inside it is an object of type coldfusion.runtime.ApplicationScope, and we can see which application it is by looking at the attributes on the left.
Obviously if one of these is consuming a lot of memory in your "retained Heap" column, then you know where the culprit lies.