Next London Oracle Coherence SIG – The Autumn Edition

The agenda for the next London Oracle Coherence SIG event – The Autumn Edition – on the 15th of October has now been finalized. The theme for the event is “Data Grid Patterns” and will cover some of the more advanced architectural patterns commonly seen in applications and systems using Oracle Coherence in the field.

Registration for the event is here.  Please note:  If you’re not a member of the UK OUG, you may register as a “visitor”.

Three technical talks will be presented. The first talk will be an introduction to the Data Grid “Command Pattern” – an extension and implementation of the standard Command Pattern but applied to Oracle Coherence. This is an extremely useful pattern that I’ve seen adopted in many projects around the world, especially in trading and workflow-based systems. Specialized implementations of this pattern have been used to implement some very large, globally recognized and complex trading systems – so it’s not ‘research’. It’s very fast, highly scalable (I’ll give figures at the talk), and very resilient. In the past 18 months I’ve personally used this pattern to help several customers migrate from competitor-pushed architectures where they’ve ended up requiring massive hardware spend to avoid poor performance and scalability. For less than 100k of Java, this pattern could save you a lot of time and money.

The second talk will be an introduction to an implementation of Store and Forward Messaging on Coherence, i.e. a JMS-like implementation (about 30k of Java) that completely avoids the need for individual messaging servers (i.e. hub-based architectures) but provides all of the usual messaging requirements like ordering and guaranteed delivery. As Nicholas Gregory says, “it’s hub-less messaging” with the ease, scalability and high-availability of Coherence. The talk will go into some depth about the implementation (based on the Command Pattern) and how it may be easily extended to perform all kinds of in-process, in-order, guaranteed message (or financial order) processing. The implementation is based on my earlier ideas about modeling Messaging systems as Financial Exchanges.

The last talk will be an introduction to a new pattern for global cluster replication called the “Push Replication Publishing Pattern”. As a simple extension to the Store and Forward Messaging Pattern, the talk will cover in detail how to configure, embed and further extend the implementation to solve a range of WAN-based replication requirements. As with the previous pattern talks, we’ll discuss how Push Replication is being used in the field (for global reference data management and synchronization).

For each of the talks, complete source code and documentation will be made available.

Look forward to seeing you at the event.

The Crying Game: When technology adoption goes horribly wrong

Over the past few years I’ve seen several financial system projects go horribly wrong (well, almost), not through any fault of the architects and engineers, but due to what I now like to call an unfortunate “Crying Game” experience.

If you haven’t seen the movie “The Crying Game” and would like to enjoy it without me giving the twist away, stop reading now! If you’re not going to see the movie, you can get a brief synopsis here. If you’ve seen it, enjoy the following while thinking about the twist at the end of the movie.

In essence a “Crying Game” experience occurs when you’ve made an assumption about the “fitness for purpose” of a technology component, either commercial or open-source, and you discover just as you’re about to “go live” (or bet your business on it) that it won’t perform, scale or behave as you expected, regardless of what you do to it, how much source-code you’ve got or what other people and vendors have led you to believe.

You’ve invested a bunch of time, effort, energy and for that matter passion into a technology only to unexpectedly discover that what you trusted is now completely unacceptable to you. It’s worked perfectly in all of the situations you’ve investigated, you’re very happy with your choice, but the “making it happen for real” experience goes horribly wrong due to an “unexpected surprise”.

We’re not talking about a “surprise” like the logging format isn’t quite what you wanted. We’re not talking about a technology that’s been working for a while and is slowly failing some SLA. We’re talking about new information that you discover for yourself as you attempt to go live or early in production that breaks all of your previous assumptions, your architecture and ultimately risks your business.

As in the movie, you’re left with very few choices:

a). Abandon it. Wash your hands of the situation. Walk away and start again, possibly without telling anyone about your embarrassing experience. Obviously this is not often an option in the technology business. Architects that do this soon become well known – all for the wrong reasons.

b). Accept the situation and attempt to work with what you’ve got. You’ll never be truly comfortable and will inevitably feel you’ve been ripped off or let down. If you’re using a commercial solution you can obviously work with the vendor to “resolve” the issues. If you’re using an open-source solution you can “fix it yourself” or “contribute a solution to better the community”. Alternatively you can get “help” from someone experienced in such matters. Ultimately “surgery” is being proposed. And as we all know, last minute surgery, especially DIY, is going to be messy. You’ll end up with a mutant.

c). Get angry, argue with the solution provider and then replace it with something more appealing – possibly telling everyone about your experience.

d). Change your business to match how the solution provider believes you should work. Again, not usually an option.

Regardless of the decision you take, disappointment may reign for a long time. “You’ll never do that again”, right?

Crying Games seem to occur for the following reasons:

1). Lack of adequate testing / domain knowledge / technical experience.

While it’s easy to say “you didn’t test enough”, there is only so much testing you can do. There comes a point when you simply have to trust the information you have (from your tests) and trust your expert judgment.

Given the Crying Games I’ve seen I would never claim that the architects, engineers and projects failed in their duty to test. Far from it. I would say they tested more than adequately and decisions were soundly based on the information they had at their disposal.

2). Lack of adequate references.

Taking references is always important to avoid unexpected and costly surprises. Unfortunately “references” these days are always “positive”, right? It’s very hard to find referees that offer full disclosure, especially if there is the thought that they may be flamed, harassed, bullied or sued by a vendor / community. The worst kind of referee is one that has been paid off, i.e. one that gets a discount on their adoption in return for offering references to others in the future.

If you can find a previous partner that will give you a truthful reference you’ll have a much greater chance of avoiding a Crying Game with your new architecture!

3). Lack of adequate interrogation.

Sounds nasty, but there is nothing like asking tough questions of solution providers. eg: What was your worst loss? Why? What was your best win? Why? What is the worst part of the product? Why? What is the best part? Why? When shouldn’t you use this product? Why? Who are your competitors? Why are they competitors? What deals have you lost to them? Why? Can I talk to them? Which deals did you win from them? Why? What would be a disaster to your business? What is your plan to avoid this? Who do you depend on? Why? Can I talk to them? What relationships / components do you depend on? Why? Have you done this before? When? With whom?

I think you get the picture. Attempt to find out what they are like “without all of the marketing makeup”. Interrogation reveals a lot of information. Do it fast. Be up front. Don’t drag it out. If you can, and you might not be able to, ask the same questions of the provided referees.

4). Lack of integrity.

There are basically two forms of deception: deliberate and accidental. While both are very worrying, being deliberately misled during the process of adopting software or hardware for an architecture is less likely to be forgiven, especially if your business (or employment) is at risk.

Given the recent projects on which I’ve been involved, it seems that beyond everything else integrity is key. If you can push technology egos to the side and ensure that marketing and sales objectives are not driving factors in your decisions, you’ll probably avoid a Crying Game.

If you ever have concerns about technology providers or your decisions, let people know. It’s not a sign of weakness, but strength. It’s not doubt, but a chance to test early. Always try to break what you have. Break your own ideas. Break others. Fail Fast.

Failing slow seems to put architectures on a path to a Crying Game.

Messaging Systems as Financial Exchanges?

Over the past 10 months I’ve been involved in a number of challenging enterprise projects that have, to put it simply, had to replace standard asynchronous messaging architectures (like JMS) and Space-based implementations in order to stay in business (meet SLAs etc). In one project in particular, the team had invested over 11 months of their lives in a Space-based implementation, only to find that the vendor-supplied architecture did not scale, lost transactions, failed to meet SLAs and ultimately did not make it into production (but that’s another story). That project led me to believe that the way we understand, implement and teach scalable system design is, well… broken (or heavily dependent on the concept of a Transaction, and transactions aren’t as scalable as people seem to assume).

Rather than talking about “what went wrong”, which I’ll be doing over time on this blog (and at Javapolis in December 2007), I thought it might be interesting to reflect on the fundamental challenges of messaging systems, message-based architectures and how they are implemented.

Given my five or so years of engineering experience with trading exchanges and automated trading systems (by no means an expert, by no means a pup, but I’ve implemented a couple) I can safely make the following observations.

  1. Most trading systems seem to scale better and have better performance and availability profiles than most implementations of JMS and Javaspaces (well, the ones I’ve worked on seem to anyway).
  2. Almost all successful trading systems seem only to make use of JMS (and Javaspaces if adopted) for system integration, i.e. exchange -> front office -> middle office -> back office. They are not integral to the exchange / matching processing as they tend to have very high latencies (seconds not milliseconds). Let me make this very clear – such systems don’t use these technologies in core business logic. Personal experience suggests that even modern messaging/space-based systems have 10x to 1000x higher latencies than is typically required.
  3. Trading systems get their performance and scalability by using different architectural approaches, e.g. avoiding JEE, multi-phase transactions, multi-pass hand-shake protocols, going to disk etc., and relying more on Recoverable Computing.

Why should I care? Why does this matter?

It’s pure personal frustration I guess. There’s nothing wrong with JMS, Javaspaces etc. They have a purpose and that purpose is typically integration or ordering of events.

I guess I’m routinely starting to see that the fundamental premises of stateless-ness and bus-based architectures are failing us as we demand scalability etc.

Every week I work with different projects, companies, architectures and architects, all of which face the challenge of delivering predictably scalable systems (10x to 1000x), with mandatory requirements such as high-availability (sub 1, 2 or 3 second recovery) and high-performance (1 to 2 millisecond response time)… to be delivered tomorrow – or in some cases, that afternoon!

In most circumstances I see that the aforementioned design approaches practically ensure that the requirements mentioned above simply can’t be met. By these I mean systems inter-connected via message-buses (put “enterprise” in front of that if you like) or, in some very rare cases, a space-based approach, whereby messages/entries are placed in a queue/space/topic, written to disk, read out and delivered to a consumer (within a transaction).

So here’s my challenge. If we are so reliant on messaging, why don’t we have implementations that operate in the same manner that we build financial exchanges? Why don’t we learn from the lessons of engineering financial trading systems? On one side you have “sellers”, i.e. Producers / Publishers / Writers, and on the other side you have “buyers”, i.e. Consumers / Takers etc. We then simply “cross” (match) the “buyers” and “sellers”. Simple really. In fact the conditions for message delivery are much simpler than those of financial exchanges / automated trading / matching systems, but achieve the same result. Buyers buy, Sellers sell, Publishers publish and Consumers consume.
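To make the analogy a little more concrete, here is a minimal sketch in plain Java of what “crossing” publishers and consumers might look like. It is only an illustration under stated assumptions: the class `MessageExchange` and its `publish`, `consume`, `fills` and `demo` methods are hypothetical names invented for this example, not part of JMS, Javaspaces or Coherence, and the sketch is single-threaded and in-memory with no recovery or distribution.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch: a message "exchange" with one order book per topic.
// Publishers act as sellers, consumers as buyers. Not a real messaging API.
class MessageExchange {
    // Resting "sell orders": messages published before any consumer asked.
    private final Map<String, Queue<String>> offers = new HashMap<>();
    // Resting "buy orders": consumers waiting for a message on a topic.
    private final Map<String, Queue<String>> bids = new HashMap<>();
    // Completed crosses, recorded as "consumer<-message" in match order.
    private final List<String> fills = new ArrayList<>();

    // A publisher "sells": cross with the oldest waiting consumer,
    // otherwise rest the message in the book (first in, first out).
    void publish(String topic, String message) {
        Queue<String> waiting = bids.get(topic);
        if (waiting != null && !waiting.isEmpty()) {
            fills.add(waiting.poll() + "<-" + message);
        } else {
            offers.computeIfAbsent(topic, t -> new ArrayDeque<>()).add(message);
        }
    }

    // A consumer "buys": cross with the oldest resting message,
    // otherwise rest the consumer's request in the book.
    void consume(String topic, String consumer) {
        Queue<String> resting = offers.get(topic);
        if (resting != null && !resting.isEmpty()) {
            fills.add(consumer + "<-" + resting.poll());
        } else {
            bids.computeIfAbsent(topic, t -> new ArrayDeque<>()).add(consumer);
        }
    }

    List<String> fills() {
        return fills;
    }

    // Small walk-through using made-up topic, message and consumer names.
    static List<String> demo() {
        MessageExchange exchange = new MessageExchange();
        exchange.publish("orders", "m1");   // rests: nobody is waiting yet
        exchange.publish("orders", "m2");   // rests behind m1
        exchange.consume("orders", "c1");   // crosses with m1
        exchange.consume("prices", "c2");   // rests: nothing published yet
        exchange.publish("prices", "p1");   // crosses with c2
        exchange.consume("orders", "c3");   // crosses with m2
        return exchange.fills();
    }
}
```

In this walk-through `MessageExchange.demo()` records the crosses `c1<-m1`, `c2<-p1` and `c3<-m2`, in that order. The only matching condition is the topic, which is the sense in which message delivery is simpler than a real exchange, where price and quantity also have to match.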

Perhaps the next JMS / Spaces implementation will be like this… but grid enabled.