Tuesday, 24 March 2009
Last week I had a small conversation about Oracle Coherence on Twitter (#OraCohMDM). Here is a more in-depth description of the experience we have had with this product. Last year we came across a couple of projects which were an excellent case for the usage of a grid database based on the middle tier.
The first case was about processing bulk data in a small timeframe. Every minute some 150MB of XML data was delivered (zipped) to our application by a Web Service. The requirement was that the data needed to be processed within 20 seconds, and selected parts of the data needed to be sent to a number of subscribers. The architecture had to be scalable in both consumers and data load. For historical reasons all the data needed to be stored in a database.
The second case could be described as having a high CIA constraint (Confidentiality, Integrity and Availability). Multiple providers sent data via a Web Service; the messages were, contrary to the first case, small in size. The number of messages, however, was much higher: some 10-100 messages per second.
Since we know that (Oracle) databases are stable, why can't we implement this with the help of a database? The answer is simple: there is a performance bottleneck between the middle tier and the database. In a 'normal' situation, for instance a web-based transaction application, the JDBC connection is able to do its work properly. But when faced with high data volume or high message volume, JDBC clogs up. So how does a Grid database do a better job? Elements of success for a Grid database are the ability to scale up and down easily and to minimize the risk of losing data when, for instance, a server goes down. Preferably there should be no master of the grid: all grid elements are equal.

Simply put, the Grid works as follows. A Grid is a series of interconnected nodes which communicate via an optimized network protocol. Every Grid node has knowledge of the grid, what kind of work to do, and the existence of its neighbours, just like bees in a beehive. When a new data object is entered into the Grid, one Grid node takes responsibility and contacts another Grid node to be its backup (preferably a node on another server). When a new node enters the grid, it communicates some sort of "Hello, I'm here", just like when you enter a party. All grid nodes start communicating with the new node, and (some) data is redistributed across the grid, in order to minimize the loss of data. When a node leaves the grid, (some) data is again redistributed, to avoid a Data Object being available on only one node.
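The primary/backup placement described above can be sketched in a few lines of plain Java. This is a toy simulation to illustrate the idea, not Coherence's actual partitioning scheme; the class and method names (`ToyGrid`, `ownersOf`, etc.) are invented for this example:

```java
import java.util.*;

// Toy simulation of primary/backup placement in a data grid.
// Coherence's real partitioning is far more sophisticated; this only
// illustrates the principle that every entry lives on two nodes.
class ToyGrid {
    private final List<String> nodes = new ArrayList<>();

    void join(String node) {
        nodes.add(node);        // "Hello, I'm here" - data would now be redistributed
    }

    void leave(String node) {
        nodes.remove(node);     // surviving nodes re-create the lost copies
    }

    // Pick a primary and a backup node for a key. The backup is always a
    // different node, so losing any single node never loses the entry.
    String[] ownersOf(String key) {
        int n = nodes.size();
        int primary = Math.floorMod(key.hashCode(), n);
        int backup = (primary + 1) % n;   // in practice: prefer another physical server
        return new String[] { nodes.get(primary), nodes.get(backup) };
    }
}
```

Because ownership is derived from the key and the current node list, adding or removing a node changes the placement of some keys, which is exactly the redistribution behaviour sketched above.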
Oracle acquired Coherence (Tangosol) in 2007 because of its excellent middle-tier Grid capabilities. We have performed tests with Oracle Coherence in different hardware and network configurations. Our tests have shown that Coherence scales nearly linearly, with an optimum of around one node (or JVM) per CPU core. Using the JRockit JVM squeezed out even more performance.
So how do you start with Oracle Coherence? Coherence is Java based, so a good knowledge of Java is essential. The data is stored as key/value pairs in maps (comparable to database tables). On each map a Listener can be placed, so that an object change results in an event. By adding Object-Relational mapping (for instance with TopLink) you can percolate newly added data into the database, or load data from the database. By adding an expiration time to the data within Coherence, you don't have to clean up your data yourself.
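The map-with-listener-and-expiry idea can be illustrated in plain Java. This is an illustrative sketch only (in Coherence itself you would use a NamedCache with a MapListener and built-in expiry); the `ListeningCache` class and its methods are invented for this example:

```java
import java.util.*;
import java.util.function.BiConsumer;

// Minimal sketch of a key/value map that fires an event on every change
// and silently expires stale entries. Illustrative only - not the Coherence API.
class ListeningCache<K, V> {
    private final Map<K, V> map = new HashMap<>();
    private final Map<K, Long> expiresAt = new HashMap<>();
    private final List<BiConsumer<K, V>> listeners = new ArrayList<>();

    void addListener(BiConsumer<K, V> listener) {
        listeners.add(listener);
    }

    // Every put fires an event, comparable to a Listener on a Coherence map.
    void put(K key, V value, long ttlMillis) {
        map.put(key, value);
        expiresAt.put(key, System.currentTimeMillis() + ttlMillis);
        for (BiConsumer<K, V> l : listeners) l.accept(key, value);
    }

    // Expired entries simply disappear - no manual clean-up needed.
    V get(K key) {
        Long deadline = expiresAt.get(key);
        if (deadline != null && System.currentTimeMillis() > deadline) {
            map.remove(key);
            expiresAt.remove(key);
            return null;
        }
        return map.get(key);
    }
}
```

In a real setup the listener callback is where you would hook in the Object-Relational mapping, writing the changed object through to the database.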
Where to get more information
- SOA Patterns, deferred service state
- Oracle Coherence
- Oracle Coherence Incubator
Thanks to Oracle PTS (especially Flavius Sana & Kevin Li)
Tuesday, 10 March 2009
We are moving away from a single-application landscape towards a more distributed, service-oriented landscape. This has impact on the way we have to deal with data integrity. In the 'old days' we were confined to one dataset, in which we could check our data integrity by means of simple programming with triggers and constraints, or by adding a Business Rule Engine bound to the data model. The only discussion we had was about the place where rules were checked (in the screen or in the database).
I am currently doing (nice) work for a company where they are moving away from their ‘single application’ back-offices (one being USoft, a Dutch company which used to be known as the Business Rule company) towards a Service Oriented environment. The landscape is going to be separated in domains, based upon ownership of the different functionalities.
Business Rule thinking is part of my client's DNA, so we really have to define a thorough solution for validating rules across domains.
The rules of engagement are:
- Domains are independent
- Transactions executed in one system may not be hindered by restrictions defined in another system
- Business rules are defined and owned by the domains. These rules can be used across domains, but are always started by the domain that owns the rule.
How do we deal with cross domain Business Rules?
Consider for instance two domains, Relation and Education. Within the Education domain we have a business rule 'An active education can only be connected to an active relation'. It is the responsibility of the Education domain to uphold this rule. When the relation is de-activated within the Relation domain, this is allowed, since the Relation domain is not responsible for the Education domain. So at this point in time we have an invalid situation.
How to deal with this?
We could agree that the Relation domain sends information about the de-activation to a central messaging mechanism, where other domains (such as the Education domain) can subscribe to these types of messages. The Education domain is then responsible for taking action in order to restore a valid situation. This could be the closing of the Education belonging to the Relation. Part of the governance of the Education domain is then that it periodically checks whether open educations exist that belong to closed relations, to detect invalid situations where the subscription mechanism did not work correctly.
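The subscription mechanism described above can be sketched in plain Java. This is an illustrative sketch, not a prescription for any particular messaging product; the names (`EventBus`, the "relation.deactivated" topic, `EducationDomain`) are invented for this example:

```java
import java.util.*;
import java.util.function.Consumer;

// Toy central messaging mechanism: domains subscribe to message types
// published by other domains. Illustrative only.
class EventBus {
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    void subscribe(String topic, Consumer<String> handler) {
        subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(handler);
    }

    void publish(String topic, String payload) {
        for (Consumer<String> h : subscribers.getOrDefault(topic, List.of())) {
            h.accept(payload);
        }
    }
}

// The Education domain owns the rule "an active education can only be
// connected to an active relation", so it reacts to relation changes itself.
class EducationDomain {
    final Map<String, String> educationToRelation = new HashMap<>();
    final Set<String> openEducations = new HashSet<>();

    void onRelationDeactivated(String relationId) {
        // close every open education belonging to the de-activated relation
        educationToRelation.forEach((education, relation) -> {
            if (relation.equals(relationId)) openEducations.remove(education);
        });
    }
}
```

Note that the Relation domain only publishes the fact of the de-activation; it knows nothing about educations, which keeps the domains independent as required by the rules of engagement.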
The ease of checking our Business Rules and upholding data integrity in a single-application environment is gone in the distributed, service-oriented world. Who has experiences or ideas on how to deal with the cross-domain validation problem?