Monthly Archives: February 2008

Next Coherence Special Interest Group Meeting

Back again for 2008, the next Coherence SIG will be held on Wednesday the 9th of April at the usual place, the Oracle offices in London. We plan on starting at the usual times: registration at 14:00, with the first session beginning at 14:30. The schedule and registration will open in a few days.

Reserve a place in your calendars…

Solid State Disks and Oracle Coherence: Start Your Engines Now

Billy Newport (IBM) has a nice post on his blog about the possible demise of Data Grids and XTP given the advances being made in Solid State Disks (SSDs). He raises some very valid points about the potential improvements SSDs deliver over traditional disk systems, in particular performance, something that Data Grids and XTP also try to address. So the question is… is Data Grid doomed? Well, not exactly. As Billy points out, one of the core issues that Data Grids attempt to address is scalability, and SSD doesn’t solve this. No matter how much of the stuff you think you can stuff into a single server (or a collection of them), it still has to be managed, and that essentially means partitioning.

So how does Oracle Coherence fit into this discussion? How is SSD addressed? Put simply, Oracle Coherence (formerly Tangosol Coherence) was designed from day one (over seven years ago now) to virtualize storage. In fact, Coherence itself doesn’t care how or where the underlying data is stored. Coherence will happily manage and automatically partition data across the configured storage, whether it be disk, RAM or SSD. Essentially, Coherence virtualizes data management.
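To make the partitioning idea a little more concrete, here is a toy sketch in plain Java. It is not Coherence's partitioning algorithm or API; the class name `PartitionedStoreSketch` and its methods are made up for illustration, and each "node" is just an in-process map standing in for whatever storage (heap, disk, SSD) sits behind it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Illustrative only: a toy hash-partitioned store.
 * Hypothetical class, NOT how Coherence assigns partitions.
 */
class PartitionedStoreSketch<K, V> {
    // One backing map per "storage node"; in a real grid each could
    // live in another process and sit on heap, disk or SSD.
    private final List<Map<K, V>> nodes = new ArrayList<>();

    PartitionedStoreSketch(int nodeCount) {
        if (nodeCount <= 0) {
            throw new IllegalArgumentException("nodeCount must be positive");
        }
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new HashMap<>());
        }
    }

    /** Picks the owning node from the key's hash (keys must be non-null). */
    int nodeFor(K key) {
        return Math.floorMod(key.hashCode(), nodes.size());
    }

    void put(K key, V value) {
        nodes.get(nodeFor(key)).put(key, value);
    }

    V get(K key) {
        return nodes.get(nodeFor(key)).get(key);
    }

    /** Number of entries held by a given node. */
    int sizeOfNode(int index) {
        return nodes.get(index).size();
    }

    public static void main(String[] args) {
        PartitionedStoreSketch<String, Integer> store = new PartitionedStoreSketch<>(4);
        for (int i = 0; i < 100; i++) {
            store.put("key-" + i, i);
        }
        for (int n = 0; n < 4; n++) {
            System.out.println("node " + n + " holds " + store.sizeOfNode(n) + " entries");
        }
    }
}
```

The caller only ever sees `put` and `get`; which node (and which kind of storage) holds a given key is decided behind the scenes. A real grid also has to rebalance when nodes join or leave, which this sketch ignores.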

While most users adopt the standard practice of using out-of-the-box in-memory storage, the storage sub-system of Coherence can be completely customized and actually ships with several non-memory-based alternative mechanisms for managing data, including a scheme that lets you manage data outside the Java heap, say on ram-disk… SSD here we come! This is not an all-or-nothing option. Individual data storage areas (often called cache regions or domains, but called NamedCaches in Coherence terms) can use different schemes, all within the same cluster of applications. Better still, schemes can be composed from other schemes to produce a plethora of further schemes… making storage options virtually limitless (no pun intended :P)

For example, if you want your application to keep recently used data from the grid managed in your local process, but the rest of the data for the grid to be stored directly on disk (say SSD), you can use what’s called the “near-scheme”. The near-scheme typically combines a local-scheme (for your local data) and some other scheme, like an external-scheme (say using off-heap, memory-mapped storage), for the grid data. You could even compose a scheme with out-of-the-box Coherence such that the most recently used data is kept in the Java heap, the next most recent on SSD, and the rest in, say, flat files or on another site.
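The front/back composition behind a near-scheme can be sketched in a few lines of plain Java. This is only a conceptual illustration under my own assumptions, not Coherence's implementation or configuration: `NearCacheSketch` is a hypothetical class, the "front" is a small LRU map in the local process, and the "back" is any `Map` standing in for the grid storage (which in a real deployment might be off-heap or on SSD).

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Illustrative only: a small in-process LRU "front" layered over a
 * larger "back" map. Hypothetical class, NOT Coherence's near cache.
 */
class NearCacheSketch<K, V> {
    private final Map<K, V> front; // recently used entries, bounded
    private final Map<K, V> back;  // stand-in for the grid's storage

    NearCacheSketch(final int frontCapacity, Map<K, V> back) {
        this.back = back;
        // accessOrder = true makes the LinkedHashMap iterate from least
        // to most recently used, so removeEldestEntry gives LRU eviction.
        this.front = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > frontCapacity;
            }
        };
    }

    /** Reads from the front first, falling back to (and promoting from) the back. */
    V get(K key) {
        V value = front.get(key);
        if (value == null) {
            value = back.get(key);
            if (value != null) {
                front.put(key, value);
            }
        }
        return value;
    }

    /** Writes through to the back and keeps a copy in the front. */
    void put(K key, V value) {
        back.put(key, value);
        front.put(key, value);
    }

    /** True if the key is currently held in the local front map. */
    boolean isInFront(K key) {
        return front.containsKey(key);
    }

    public static void main(String[] args) {
        NearCacheSketch<String, String> cache =
                new NearCacheSketch<>(2, new HashMap<String, String>());
        cache.put("a", "1");
        cache.put("b", "2");
        cache.put("c", "3");
        System.out.println("a in front? " + cache.isInFront("a"));
        System.out.println("a from back: " + cache.get("a"));
    }
}
```

Because the back is just a `Map`, it could itself be another layered map, which is the same trick that lets schemes be composed from other schemes. A real near cache also has to keep the front in sync when other processes change the back; this sketch leaves that out.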

Here’s a brief list of some of the standard Coherence schemes: local-scheme, distributed-scheme, replicated-scheme, near-scheme, optimistic-scheme, overflow-scheme, disk-scheme, read-write-backing-map-scheme (usually used for database integration), external-scheme (often file-based), paged-external-scheme, remote-scheme… and so on.

If it somehow turns out that Coherence doesn’t support the way you would like to have data stored, or the devices, technologies or sites that you want to use, you can roll your own scheme… using the class-scheme. More information on the caching schemes in Coherence is available here.

Do people actually do this? Yep. All of the time. All of the investment banks in London that use Coherence in some form (and that’s most of them) customize the storage schemes to suit their applications and infrastructure, including a whole bunch that make use of off-heap, memory-mapped files (the next best thing to SSD) for managing data.

Want to know more?  If you’re in London drop by my talk at QCON 2008 where I’ll be talking about this kind of stuff:  Pimp My Data Grid:  New things to do with a Data Grid to deliver better application performance, scalability and resilience.

Turbo-Charging Applications with Data Grids

One of the truly great things about the Tangosol acquisition by Oracle – and so far there have been no bad things, only challenges 😉 – has been “being a part of the team scaling out the Tangosol organization throughout Oracle”. A crucial part of this work, and one in which I’ve been heavily involved, has been training up the great solutions architecture folks at Oracle, basically all around the world.

While it’s sometimes been personally challenging being on the road for many months at a time – like regularly crossing three different time zones a week – it’s great to see the proverbial army of Oracle Coherence solutions architects emerging, in almost every country and every solution domain. Much like scaling out Coherence itself, Oracle has been quietly and just as effectively scaling out the technical know-how of Coherence throughout the organization globally, simply by adding highly experienced and dedicated resources… and lots of them, in parallel, everywhere at once.

One of the countless passionate-about-solutions-architecture people I’ve met and worked with at Oracle is Tim Middleton, based in Perth, Australia. Tim’s got a truckload of experience with Java and Database technology, in almost every layer and across many domains. For the past few months, actually since I met him in August 2007, he’s been instrumental in working with Oracle Coherence in Asia Pacific (together with a bunch of other great people), and he has recently started publishing articles on Data Grids and turbo-charging applications.

Below is a link to his first article “Turbo Charging Applications…”, recently published by the Java Developer’s Journal. It’s a nice introduction to Distributed Caching and, separately, Data Grid concepts, covering the challenges faced by architects, along with solutions and code examples. Take a read.