Monitoring a data grid running on a few physical servers is usually a pretty trivial task. Most solutions provide some kind of “console” or “GUI” that can display a few simple values, or it’s easy to set up JMX on each JVM and use, say, JConsole. The real challenge arrives when you scale out a data grid past a few servers (which is what people often do with Coherence: 20, 50, 200 or 500+ JVMs) and you need to “visualize” what is happening, especially in real time (i.e. accurate to, say, the second). To be honest, JConsole just doesn’t cut it for big clusters. What you might use to monitor a few servers or JVMs often doesn’t work when you have a few hundred. GUI design alone becomes an issue: how do you visually lay out information so that it’s useful?
One of the nice features of Oracle Coherence is that it provides almost every statistic imaginable through an out-of-the-box clustered implementation of JMX. What is clustered JMX? It basically means that you don’t need to hook up a separate JMX console/connection to monitor each JVM in a cluster (a complete pain). Instead you connect a single JMX console to a single JVM in the cluster and, via that JVM, access the information about every JVM in the cluster (regardless of how big the cluster is, and without having to reconfigure anything as the cluster size changes at runtime). While this makes collecting information about a large cluster just as easy as monitoring a single JVM, visualizing the relationships and potential correlations in a complex clustered system often requires more than something like JConsole.
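To make the “single connection, whole cluster” idea concrete, here is a minimal sketch in Java using only the standard `javax.management` API. The class and method names (`ClusterMBeanSurvey`, `countByKey`) and the `DemoGrid` domain are made up for illustration, and the demo runs against a local in-process MBean server so that it is self-contained. Against a real cluster you would pass a remote connection instead and, assuming Coherence’s usual naming where MBeans live in a `Coherence` domain and carry a `nodeId` key, a pattern such as `Coherence:type=Node,*`; check the MBean names your version actually registers.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import javax.management.MBeanServer;
import javax.management.MBeanServerConnection;
import javax.management.MBeanServerFactory;
import javax.management.ObjectName;

public class ClusterMBeanSurvey {

    /**
     * Queries ONE MBean server connection and counts the matching MBeans
     * per value of the given ObjectName key (for example "nodeId").
     */
    public static Map<String, Integer> countByKey(
            MBeanServerConnection conn, String pattern, String key) throws Exception {
        Set<ObjectName> names = conn.queryNames(new ObjectName(pattern), null);
        Map<String, Integer> counts = new TreeMap<>();
        for (ObjectName name : names) {
            String value = name.getKeyProperty(key);
            if (value != null) {
                counts.merge(value, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Trivial standard MBean, used only so the demo has something to register.
    public interface DemoMBean {
        int getValue();
    }

    public static class Demo implements DemoMBean {
        public int getValue() {
            return 1;
        }
    }

    public static void main(String[] args) throws Exception {
        // Local stand-in for a cluster node's MBean server. In a real setup you
        // would obtain the connection remotely, for example:
        //   JMXConnectorFactory.connect(new JMXServiceURL(
        //       "service:jmx:rmi:///jndi/rmi://HOST:PORT/jmxrmi"))
        //       .getMBeanServerConnection();
        // where HOST and PORT are placeholders for a management-enabled node.
        MBeanServer server = MBeanServerFactory.newMBeanServer();

        // Hypothetical names imitating per-node MBeans in a clustered registry.
        server.registerMBean(new Demo(), new ObjectName("DemoGrid:type=Node,nodeId=1"));
        server.registerMBean(new Demo(), new ObjectName("DemoGrid:type=Node,nodeId=2"));
        server.registerMBean(new Demo(), new ObjectName("DemoGrid:type=Cache,name=orders,nodeId=1"));

        // One query over one connection returns MBeans for every node.
        System.out.println(countByKey(server, "DemoGrid:*", "nodeId")); // prints {1=2, 2=1}
    }
}
```

The point of the sketch is that the client code never needs to know how many nodes exist: the node identity is just a key in each MBean’s name, so the same query keeps working as members join and leave.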
While there are several options available for visualizing JMX information presented by solutions such as Oracle Coherence, including the impressive SL RTV (real time view), Wily Tech Introscope and Oracle’s own Enterprise Manager, a new player has entered the market in the form of ClearStone Live from Evident Software.
As presented at the last Oracle Coherence SIG (in London) by Rob Minaglia and Ivan Ho of Evident Software (an Oracle Partner), ClearStone Live has been explicitly designed from the ground up to manage and visualize large volumes of real-time grid-based information, particularly from grids built on Oracle Coherence, in a simple and efficient manner.
While it may seem relatively straightforward to visualize and graph information about a grid, one of the biggest challenges (as explained by Ivan) is how to collect, store and report on that information in real time (say, accurate to the second). They had to build infrastructure to cope with these kinds of demands, both for real-time capture and for real-time interactive visualization. As explained, it’s basically pointless to be performing thousands, tens of thousands or hundreds of thousands of transactions per second if you can only monitor the system in 30-second (or longer) snapshots.
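A quick back-of-the-envelope sketch shows why snapshot granularity matters. The numbers below are invented for illustration (a steady 10,000 transactions per second with a three-second stall), and the class name `SnapshotResolution` is hypothetical; it has nothing to do with how ClearStone Live works internally.

```java
import java.util.Arrays;

public class SnapshotResolution {

    /** Averages consecutive, non-overlapping windows of `window` per-second samples. */
    public static double[] windowAverages(long[] perSecond, int window) {
        int windows = perSecond.length / window;
        double[] averages = new double[windows];
        for (int i = 0; i < windows; i++) {
            long sum = 0;
            for (int j = 0; j < window; j++) {
                sum += perSecond[i * window + j];
            }
            averages[i] = (double) sum / window;
        }
        return averages;
    }

    /** Smallest per-second sample, or 0 for an empty series. */
    public static long min(long[] perSecond) {
        return Arrays.stream(perSecond).min().orElse(0);
    }

    public static void main(String[] args) {
        // One minute of made-up throughput: 10,000 tx/s with a 3-second stall.
        long[] perSecond = new long[60];
        Arrays.fill(perSecond, 10_000);
        perSecond[12] = 0;
        perSecond[13] = 0;
        perSecond[14] = 0;

        // Per-second view: the stall is unmistakable.
        System.out.println(min(perSecond)); // prints 0

        // 30-second snapshots: the same stall is only a 10% dip in one window.
        System.out.println(Arrays.toString(windowAverages(perSecond, 30))); // prints [9000.0, 10000.0]
    }
}
```

At one-second resolution the stall shows up as three samples at zero; in 30-second snapshots it is averaged into a mild dip that is easy to miss, even though roughly 30,000 transactions went missing.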
To achieve the kinds of performance and throughput required by customers, Evident Software adopted a novel approach – ClearStone Live uses its own Data Grid (based on Oracle Coherence) to manage data. That is, ClearStone Live uses (embeds) Oracle Coherence internally to manage and report on up to 24 hours of real-time information.
Ok that’s cool, but probably the most impressive part in the sneak preview was the extremely rich interface (based on Adobe Flex)… oh and the support for simultaneously monitoring multiple clusters – perfect for multiple grid-based applications or clusters running on multiple/remote sites!
Here are some quick screen shots from the live demonstration of “Live”.
The first visualization is what Evident Software calls “a health visualization” (see below). It essentially shows a quick view of the number of objects, caches, servers, clients, connections, memory utilization and capacity in a Coherence Cluster.
Of course there are a whole bunch of metrics you can select to have displayed in your “health visualization”.
You can also display performance characteristics, configurable from a variety of sources, all in the one chart. The neat feature here (as in most places) was the ability to scroll backwards in time through the collected information and dynamically reconfigure the charts on the fly.
But one of my favorite, and possibly the most useful, interactive features is the annotated charts. They are a bit like the Google Finance charts in that, as you scroll backwards through time, ClearStone Live highlights important events that occurred in the life of the cluster (for example, a new server joining) so that you can correlate those events with their impact on other parts of the system.
Last, though by no means the only other visualization available, was the “heat map”. This was beautiful in its simplicity: you could use it to highlight “heat” at the Cache, Service or JVM level, and you could control the color ranges and thus how “hot” things appeared.
While the product is still in alpha (our viewing at the SIG was only a sneak preview), it was really impressive.
I’m certainly looking forward to being able to visualize both application-specific and custom MBeans, correlated against cluster-wide Java Platform information (like JVM GCs etc.), when this and Coherence 3.4 become publicly available.
But most importantly, it’s yet another option for those using Coherence to perform enterprise-level monitoring.