Udi Dahan – The Software Simplist
Enterprise Development Expert & SOA Specialist
 
  
  

Archive for the ‘Autonomous Services’ Category



Inconsistent data, poor performance, or SOA – pick one

Sunday, September 18th, 2011

One of the things that surprises some developers that I talk to is that you don’t always get consistency even with end-to-end synchronous communication and a single database. This goes beyond things like isolation levels that some developers are aware of and is particularly significant in multi-user collaborative domains.

The problem

Let’s start with an image to describe the scenario:


Image 1. 3 transactions working in parallel on 3 entities

The main issue we have here is that the values transaction 2 gets for A and B are those from T0 – before either transaction 1 or 3 completed. The reason this is an issue is that these old values (usually together with some message data) are used to calculate what the new state of C should be.

Traditional optimistic concurrency techniques won’t detect any problem if we don’t touch A or B in transaction 2.

In short, systems today are causing inconsistency.
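To make the scenario concrete, here is a minimal sketch of it in Python. It assumes a toy in-memory store with a version number per row; the names `Row`, `commit` and `run_scenario` are made up for illustration and don't come from any real data access library. With `check_reads=False` only the rows being written are version-checked, which is roughly what typical per-row optimistic concurrency does.

```python
from dataclasses import dataclass


@dataclass
class Row:
    value: int
    version: int = 0


class ConcurrencyConflict(Exception):
    pass


def commit(store, reads, writes, check_reads):
    """Commit `writes` ({key: new value}) given `reads` ({key: version seen}).

    With check_reads=False only the written rows are version-checked.
    With check_reads=True every row that was read is checked as well.
    """
    checked = set(writes) | (set(reads) if check_reads else set())
    for key in checked:
        if key in reads and store[key].version != reads[key]:
            raise ConcurrencyConflict(key)
    for key, value in writes.items():
        store[key].value = value
        store[key].version += 1


def run_scenario(check_reads):
    store = {"A": Row(1), "B": Row(2), "C": Row(0)}
    # Transaction 2 reads A and B at T0 and calculates the new C from them.
    t2_reads = {key: store[key].version for key in ("A", "B", "C")}
    new_c = store["A"].value + store["B"].value
    # Transactions 1 and 3 commit their changes to A and B in the meantime.
    commit(store, {"A": 0}, {"A": 10}, check_reads)
    commit(store, {"B": 0}, {"B": 20}, check_reads)
    # Transaction 2 now writes C, calculated from the stale A and B.
    commit(store, t2_reads, {"C": new_c}, check_reads)
    return store["C"].value
```

Calling `run_scenario(check_reads=False)` quietly stores a C calculated from the old A and B, while `run_scenario(check_reads=True)` raises `ConcurrencyConflict` because the rows that were read have moved on.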

Some solutions

1. Don’t have transactions which operate on multiple entities (which probably isn’t possible for some of your most important business logic).

2. Turn on multi-version concurrency control – this is called snapshot isolation in MS SQL Server.

Yes, you need to turn it on. It’s off by default.

The good news is that this will stop the writing of inconsistent data to your database.
The bad news is that it will probably cause your system many more exceptions when going to persist.

For those of you who are using transactional messaging with automatic retrying, this will end up as “just” a performance problem (unless you follow the recommendations below). For those of you who are using regular web/WCF services (over TCP/HTTP), your “cross-cutting” exception management will likely end up discarding all the data submitted in those requests (but since that’s what you’re doing when you run into deadlocks this shouldn’t be news to you).
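As a rough illustration of what automatic retrying buys you, here is a small Python sketch. `ConcurrencyConflict` is a hypothetical stand-in for whatever exception your data access layer raises on a snapshot conflict, and `handle_with_retry` is an invented helper, not the API of any messaging framework.

```python
class ConcurrencyConflict(Exception):
    """Hypothetical stand-in for the error raised on a snapshot conflict."""


def handle_with_retry(handler, message, max_attempts=5):
    """Run handler(message), retrying when the transaction hits a conflict."""
    for attempt in range(1, max_attempts + 1):
        try:
            return handler(message)
        except ConcurrencyConflict:
            if attempt == max_attempts:
                # Give up; a messaging system would typically park the
                # message in an error queue rather than lose it.
                raise
            # Otherwise the transaction rolled back and we simply try again.


attempts = []


def flaky_handler(message):
    # Fails with a conflict on the first two attempts, then succeeds.
    attempts.append(message)
    if len(attempts) < 3:
        raise ConcurrencyConflict()
    return f"processed {message}"


result = handle_with_retry(flaky_handler, "order-42")
```

The data in the message survives the conflicts; the cost is the extra attempts, which is exactly the performance problem mentioned above.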

The solution to the performance issues

Eventual consistency.

Funny isn’t it – all those people who were afraid of eventual consistency got inconsistency instead.

Also, it’s not enough to just have eventual consistency (like between the command and query sides of CQRS). You need to drastically decrease the size of your entities. And the best way of doing that is to partition those entities across multiple business services (also known in DDD lingo as Bounded Contexts) each with its own database.

This is yet another reason why I say that CQRS shouldn’t be the top level architectural breakdown. Very useful within a given business service, yes – though sometimes as small as just some sagas.

Next steps

It may seem unusual that the title of this post implies that SOA is the solution, yet the content clearly states that traditional HTTP-based web services are a problem. Even REST wouldn’t change matters as it doesn’t influence how transactions are managed against a database.

The SOA solution I’m talking about here is the one I’ve spent the last several years blogging about. It’s a different style of SOA which has services stretch up to contain parts of the UI as well as down to contain parts of the database, resulting in a composite UI and multiple databases. This is a drastically different approach than much of the literature on the topic – especially Thomas Erl’s books.

Unfortunately there isn’t a book out there with all of this in it (that I’ve found), and I’m afraid that with my schedule (and family) writing a book is pretty much out of the question. Let’s face it – I’m barely finding time to blog.

The one thing I’m trying to do more of is provide training on these topics. I’ve just finished a course in London, am doing another this week in Aarhus, Denmark, and another next month in San Francisco (which is now sold out). The next openings this year will be in Stockholm and London; Sydney, Australia and Austin, Texas will be coming in January of next year. I’ll be coming over to the US more next year so if you missed San Francisco, keep an eye out.

I wish there was more I could do, but I’m only one guy.

Hmm, maybe it’s time to change that.



The Danger of Centralized Workflows

Wednesday, July 13th, 2011

It isn’t uncommon for me to have a client or student at one of my courses ask me about some kind of workflow tool. This could be Microsoft Workflow Foundation, BizTalk, K2, or some kind of BPEL/orchestration engine. The question usually revolves around using this tool for all workflows in the system as opposed to the SOA-EDA-style publish/subscribe approach I espouse.

The question

The main touted benefit of these workflow-centric architectures is that we don’t have to change the code of the system in order to change its behavior, resulting in ultimate flexibility!

Some of you may have already gone down this path and are shaking your heads remembering how your particular road to hell was paved with the exact same good intentions.

Let me explain why these things tend to go horribly wrong.

What’s behind the curtain

It starts with the very nature of workflow – a flow chart is procedural in nature. First do this, then that, if this, then that, etc. As we’ve experienced first hand in our industry, procedural programming is fine for smaller problems but isn’t powerful enough to handle larger problems. That’s why we’ve come up with object-oriented programming.

I have yet to see an object-oriented workflow drag-and-drop engine. Yes, it works great for simple demo-ware apps. But if you try to throw your most complex and volatile business logic at it, it will become a big tangled ball of spaghetti – just like if you were using text rather than pictures to code it.

And that’s one of the fundamental fallacies about these tools – you are still writing code. The fact that it doesn’t look like the rest of your code doesn’t change that fact. Changing the definition of your workflow in the tool IS changing your code.

On productivity

Sometimes people mention how much more productive it would be to use these tools than to write the code “by hand”. Occasionally I hear about an attempt to have “the business” use these tools to change the workflows themselves – without the involvement of developers (“imagine how much faster we could go without those pesky developers!”).

For those of us who have experienced this first-hand, we know that’s all wrong.

If “the business” is changing the workflows without developer involvement, invariably something breaks, and then they don’t know what to do. They haven’t been trained to think the way that developers have – they don’t really know how to debug. So the developers are brought back in anyway and from that point on, the business is once again giving requirements and the devs are the ones implementing them.

Now when it comes to developer productivity, I can tell you that the keyboard is at least 10x more productive than the mouse. I can bang out an if statement in code much faster than draggy-dropping a diamond on the canvas, and two other activities for each side of the clause.

On maintainability

Sometimes the visualization of the workflow is presented as being much more maintainable than “regular code”.

When these workflows get to be too big/nested/reused, they end up looking like the wiring diagram of an Intel chip (or worse). Check out the following diagram taken from the DailyWTF on a customer friendly system:

stateModel

The bigger these get, the less maintainable they are.

Now, some would push back on this saying that a method with 10,000 lines of code in it may be just as bad, if not worse. The thing is that these workflow tools guide developers down a path where it is very likely to end up with big, monolithic, procedural, nested code. When working in real code, we know we need to take responsibility for the cleanliness of our code using object-orientation, patterns, etc., and refactoring things when they get too messy.

Here is where I’d bring up the SOA/pub-sub approach as an alternative – there is no longer this idea of a centralized anything. You have small pieces of code, each encapsulating a single business responsibility, working in concert with each other – reacting to each other’s events.
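Here is a minimal sketch of that style in Python, assuming a toy in-process `Bus` class and made-up event names (`OrderAccepted`, `OrderBilled`). A real system would use a durable message bus, but the shape is the same: nobody owns the overall flow, each handler only knows which event it reacts to.

```python
from collections import defaultdict


class Bus:
    """Toy in-process publish/subscribe bus (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in self._subscribers[event_type]:
            handler(payload)


bus = Bus()
log = []


def bill_customer(order):
    # Billing's single responsibility: invoice accepted orders.
    log.append(f"billing: invoice for order {order['id']}")
    bus.publish("OrderBilled", order)


def ship_order(order):
    # Shipping's single responsibility: dispatch billed orders.
    log.append(f"shipping: dispatch order {order['id']}")


bus.subscribe("OrderAccepted", bill_customer)
bus.subscribe("OrderBilled", ship_order)
bus.publish("OrderAccepted", {"id": 7})
```

Since each handler is a plain function, it can be unit-tested, diffed and merged like any other code.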

Productivity take 2: testing and version control

If you’re going to take your most complex and volatile business logic and put it into these workflow tools, have you thought about how you’re going to test it? How do you know that it works correctly? It tends to be VERY difficult to unit-test these kinds of workflows.

When a developer is implementing a change request, how do they know what other workflows might have been broken? Do they have to manually go through each and every scenario in the system to find out? How’s that for productivity?

Assuming something did break and the developer wants to see a diff – what’s different in the new workflow from the old one, what would that look like? When working with a team, the ability to diff and merge code is at the base of the overall team productivity.

What would happen to your team if you couldn’t diff or merge code anymore?
In this day and age, it should be considered irresponsible to develop without these version control basics.

In closing

There are some cases where these tools might make sense, but those tend to be much more rare than you’d expect (and there are usually better alternatives anyway). Regardless, the architectural analysis should start without the assumption of centralized workflow, database, or centralized anything for that matter.

If someone tries to push one of these tools/architectures on you, don’t walk away – run!



Service Boundaries Aren’t Process Boundaries

Sunday, July 3rd, 2011

Richard Veryard blogged about the topic of service boundaries in SOA, specifically asking why more people aren’t talking about service boundaries – especially if they’re such a core principle in SOA.

I can only speak for myself on this one, but I guess it’s that there are only so many times you can repeat yourself.

So, why this post?

Well, Richard was able to dig up an old (2004) presentation I gave about SOA in which I said:

“Services run in a separate process from their clients
A boundary must be crossed to get from the client to the service – network, security, …”

And 7 years later I can say, hand on heart, I was wrong.

Luckily, I’ve spent much of those past 7 years trying to correct that recommendation. One blog post in which I tried to do that (in mid-2007) was On Intermediation and SOA in which I described the relationship between systems (i.e. process boundaries) and services:

“all of these “systems” might just end up within the same service, or having parts of them being used by multiple services

There can also be multiple services (or, more accurately, parts of multiple services) deployed together in the same system/process.

And this is nothing new – in the 4+1 Architectural View Model by Philippe Kruchten (1995) we can see very clearly the differentiation between the Logical View (our services) and the Physical View (a.k.a the Deployment View).

These views are orthogonal to each other – multiple elements from one view can map to a single element in another view (and vice versa).

This, if anything, makes it that much harder to identify service boundaries – if they have nothing to do with the existing applications and systems, then what are they? In my blog post on The Known Unknowns of SOA I point to the fact that Business Capabilities are much more appropriate constructs than, say, web services which (as it says in the referenced post) “[are] merely a standardized approach to accessing functionality on remote systems”.

As I bring this post to a close, I’m feeling more comfortable rehashing material I’ve published before:

Logical and Physical Architecture

and the rest of the SOA category on my blog here.

Happy boundary hunting.



The Known Unknowns of SOA

Monday, November 15th, 2010

One of the better known analysts in the enterprise software area, JP Morgenthal, wrote this post about the relationship between SOA, BPM, and EA. In it he defines SOA as follows:

“SOA is a practice that focuses on modeling the entities, and relationships between entities, that comprise the business as a set of services. This can be done on a small or large scale. Typically, the relationships in this model represent consumer/provider relationships.”

I have some serious concerns about the ramifications of this definition/description.

First of all, when reading “entities”, many people will interpret that to mean the entities found in Entity Relationship Diagrams [ERD] or in Object Oriented Analysis & Design [OOAD]. In both, these entities are identified as the “nouns” of the domain. Examples of these ERD/OOAD-type entities include things like Customer, Order, and Product.

These are almost always the wrong place to start for identifying services in SOA.

Second, on the consumer/provider relationship: on the one hand, this fits very well with how web services can consume (or call) other web services. However, the downsides of using web services as services in SOA are becoming well enough known that even in the same post we see this warning:

“Web Services is not SOA, it is merely a standardized approach to accessing functionality on remote systems.”

But the question remains, if a producer/consumer relationship is OK for SOA-type services, why doesn’t that hold for web services? And the answer is… it depends on the type of producer/consumer relationship. The typical relationship is one of synchronous calls from consumer to producer, and this is not OK for SOA-type services either.

You see, this synchronous producer/consumer implies a model where services are not able to fulfill their objectives without calling other services. In order for us to achieve the IT/Business alignment promised by SOA, we need services which are autonomous, i.e. able to fulfill their objectives without that kind of external help.

Instead, we need to look for a more loosely coupled producer/consumer relationship – like publish/subscribe, where the producer emits events, and the consumer subscribes and handles those events. The reason that this kind of relationship doesn’t hurt autonomy is that it disconnects services on the dimension of time. In order for a service to be able to make a decision autonomously without synchronously calling any other service, using only information provided by events it received in the past, it must be strongly aligned with the business.
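A small Python sketch of what "disconnected on the dimension of time" can look like. The `ShippingService` class and the `OrderBilled` event are hypothetical names chosen for illustration; the point is only that the decision uses facts learned from events received earlier, with no synchronous call to the service that published them.

```python
class ShippingService:
    """Decides whether an order can ship using only previously received events."""

    def __init__(self):
        # Local record of facts learned from events published by billing.
        self._billed_orders = set()

    def on_order_billed(self, order_id):
        # Event handler: invoked whenever an OrderBilled event arrives.
        self._billed_orders.add(order_id)

    def can_ship(self, order_id):
        # No synchronous call to billing here; billing may even be offline.
        return order_id in self._billed_orders


shipping = ShippingService()
before_event = shipping.can_ship(7)
shipping.on_order_billed(7)
after_event = shipping.can_ship(7)
```

Billing stays the authority on whether an order has been billed; shipping just remembers what it was told.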

Most projects which bandy about the SOA acronym aren’t actually made up of services – they’re made up of XML over HTTP functions calling other XML over HTTP functions, eventually calling XML over HTTP databases. You can layer as much XML and HTTP as you want on top of it, but at the end of the day, most projects are just functions calling functions calling databases – in other words, procedural programming in the large, and no amount of SOAP will wash away the stink.

Here’s a different definition of services for SOA that may communicate a bit better what it’s all about:

A service is the technical authority for a specific business capability.
Any piece of data or rule must be owned by only one service.

What this means is that even when services are publishing and subscribing to each other’s events, we always know what the authoritative source of truth is for every piece of data and rule.

Also, when looking at services from the lens of business capabilities, what we see is that many user interfaces present information belonging to different capabilities – a product’s price alongside whether or not it’s in stock. In order for us to comply with the above definition of services, this leads us to an understanding that such user interfaces are actually a mashup – with each service having the fragment of the UI dealing with its particular data.
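As a rough sketch of such a mashup in Python: the two fragment functions below stand in for UI pieces owned by hypothetical pricing and inventory services, and the hard-coded dictionaries stand in for each service's own data. The page itself only stitches fragments together and never touches the underlying data.

```python
def pricing_fragment(product_id):
    prices = {"p1": "19.99"}  # data owned by the (hypothetical) pricing service
    return f"<span class='price'>{prices[product_id]}</span>"


def inventory_fragment(product_id):
    in_stock = {"p1": True}  # data owned by the (hypothetical) inventory service
    label = "In stock" if in_stock[product_id] else "Out of stock"
    return f"<span class='stock'>{label}</span>"


def render_product_page(product_id, fragments):
    # The page knows nothing about prices or stock; it only composes fragments.
    body = "".join(fragment(product_id) for fragment in fragments)
    return f"<div id='product-{product_id}'>{body}</div>"


page = render_product_page("p1", [pricing_fragment, inventory_fragment])
```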

Ultimately, process boundaries like web apps, back-end, batch-processing are very poor indicators of service boundaries. We’d expect to see multiple business capabilities manifested in each of those processes.

I know that this may be more confusing than the traditional web services approach but, to paraphrase Donald Rumsfeld, it is better to know that you don’t know, than to not know that you don’t know 🙂



Logical and Physical Architecture

Monday, November 8th, 2010

One architectural misunderstanding I see repeatedly in my work with clients is in the relationship between logical and physical architecture. The most common building-block of these misunderstandings is the web service (or its “upgraded” .NET counterpart – the WCF service).

Don’t get me wrong, sometimes there is a place for a web service, just not everywhere.

So, what’s the problem?

Well, when developers and architects use web services as the building blocks of their designs, they are creating the same architecture for both the logical and physical elements of their system. Back in 1995, Philippe Kruchten documented his 4 + 1 Architectural View Model in which he outlined 4 + 1 different views that should be used to describe an architecture.

Even though since 1995 the number and types of recommended views of software architecture have evolved (with things like the Zachman Framework for enterprise architecture numbering some 30 views), there is broad agreement that (at the very least) the logical and physical artifacts should likely be designed differently.

Just because two distinct logical components have been identified in the architecture, that doesn’t necessarily mean they should be hosted separately (for example by making each one a web/wcf service). In fact, there are significant disadvantages to doing so (as described in the Fallacies of Distributed Computing).

In some cases, this mistake is exacerbated by mistaking these components for SOA-type services, resulting in an attempt by developers to have each component have its own contract, which can then be independently versioned. This often results in the need for transformation between the structures of these so-called contracts, not within the components themselves (oh no, they’re “autonomous”), but rather in between them using some kind of “ESB” technology.

This architectural style is known as the Broker, Hub and Spoke, Mediator, and most importantly – not SOA. If you find a technology that fits this style perfectly (like BizTalk), that technology is not a Bus, not a Service Bus, and definitely not an Enterprise Service Bus.

One of the problems of this approach is that when any “service” contract changes, you have to change all the transformations in your broker that involve it. Unfortunately, most brokers have no unit-testing facility so it’s very much trial and error, and error, and error. The matter is even more serious since most brokers don’t enable you to have your transformations or orchestrations in source control, so you can’t diff to see what changed from the previous version.

It’s really amazing how much pain can be traced back to that one original misunderstanding.



Clarified CQRS

Wednesday, December 9th, 2009

After listening to how the community has interpreted Command-Query Responsibility Segregation I think that the time has come for some clarification. Some have been tying it to Event Sourcing. Most have been overlaying their previous layered architecture assumptions on it. Here I hope to identify CQRS itself, and describe in which places it can connect to other patterns.


Why CQRS

Before describing the details of CQRS we need to understand the two main driving forces behind it: collaboration and staleness.

Collaboration refers to circumstances under which multiple actors will be using/modifying the same set of data – whether or not the intention of the actors is actually to collaborate with each other. There are often rules which indicate which user can perform which kind of modification and modifications that may have been acceptable in one case may not be acceptable in others. We’ll give some examples shortly. Actors can be human like normal users, or automated like software.

Staleness refers to the fact that in a collaborative environment, once data has been shown to a user, that same data may have been changed by another actor – it is stale. Almost any system which makes use of a cache is serving stale data – often for performance reasons. What this means is that we cannot entirely trust our users’ decisions, as they could have been made based on out-of-date information.

Standard layered architectures don’t explicitly deal with either of these issues. While putting everything in the same database may be one step in the direction of handling collaboration, staleness is usually exacerbated in those architectures by the use of caches as a performance-improving afterthought.

A picture for reference

I’ve given some talks about CQRS using this diagram to explain it:

CQRS

The boxes named AC are Autonomous Components. We’ll describe what makes them autonomous when discussing commands. But before we go into the complicated parts, let’s start with queries:

Queries

If the data we’re going to be showing users is stale anyway, is it really necessary to go to the master database and get it from there? Why transform those 3rd normal form structures to domain objects if we just want data – not any rule-preserving behaviors? Why transform those domain objects to DTOs to transfer them across a wire, and who said that wire has to be exactly there? Why transform those DTOs to view model objects?

In short, it looks like we’re doing a heck of a lot of unnecessary work based on the assumption that reusing code that has already been written will be easier than just solving the problem at hand. Let’s try a different approach:

How about we create an additional data store whose data can be a bit out of sync with the master database – I mean, the data we’re showing the user is stale anyway, so why not reflect that in the data store itself? We’ll come up with an approach later to keep this data store more or less in sync.

Now, what would be the correct structure for this data store? How about just like the view model? One table for each view. Then our client could simply SELECT * FROM MyViewTable (or possibly pass in an ID in a where clause), and bind the result to the screen. That would be just as simple as can be. You could wrap that up with a thin facade if you feel the need, or with stored procedures, or using AutoMapper which can simply map from a data reader to your view model class. The thing is that the view model structures are already wire-friendly, so you don’t need to transform them to anything else.
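To show how little code that leaves on the query side, here is a minimal Python sketch using the standard library's `sqlite3` module as a stand-in for the query data store. The table `CustomerSummaryView` and its columns are invented for the example; the only idea taken from the text is "one table per view, select by ID, bind the result".

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class CustomerSummaryView:
    """View model class; its fields mirror the table columns one-to-one."""
    customer_id: int
    display_name: str
    status: str


# An in-memory database stands in for the query data store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CustomerSummaryView ("
    "customer_id INTEGER PRIMARY KEY, display_name TEXT, status TEXT)"
)
conn.execute(
    "INSERT INTO CustomerSummaryView VALUES (1, 'Ms Jane Doe', 'Preferred')"
)


def load_view(customer_id):
    # One table per view: no joins, no domain objects, no DTO mapping.
    row = conn.execute(
        "SELECT customer_id, display_name, status "
        "FROM CustomerSummaryView WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return CustomerSummaryView(*row) if row else None


view = load_view(1)
```

The row comes back already shaped like the screen, so `view` can be bound to the UI as-is.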

You could even consider taking that data store and putting it in your web tier. It’s just as secure as an in-memory cache in your web tier. Give your web servers SELECT only permissions on those tables and you should be fine.

Query Data Storage

While you can use a regular database as your query data store it isn’t the only option. Consider that the query schema is in essence identical to your view model. You don’t have any relationships between your various view model classes, so you shouldn’t need any relationships between the tables in the query data store.

So do you actually need a relational database?

The answer is no, but for all practical purposes and due to organizational inertia, it is probably your best choice (for now).

Scaling Queries

Since your queries are now being performed off of a separate data store than your master database, and there is no assumption that the data that’s being served is 100% up to date, you can easily add more instances of these stores without worrying that they don’t contain the exact same data. The same mechanism that updates one instance can be used for many instances, as we’ll see later.
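A sketch of why extra instances are cheap, in Python. It assumes each query store instance has its own inbox of events (the `QueryStore` class and the event shape are made up for illustration). One instance can lag behind another for a while and nothing breaks, because nobody assumed the data was 100% up to date to begin with.

```python
from collections import deque


class QueryStore:
    """One instance of the query data store, fed by its own inbox of events."""

    def __init__(self):
        self.inbox = deque()
        self.rows = {}

    def catch_up(self, max_events=None):
        """Apply pending events; a lagging instance simply applies fewer."""
        applied = 0
        while self.inbox and (max_events is None or applied < max_events):
            event = self.inbox.popleft()
            self.rows[event["customer_id"]] = event["status"]
            applied += 1


def publish(stores, event):
    # The same update mechanism feeds every instance.
    for store in stores:
        store.inbox.append(event)


stores = [QueryStore(), QueryStore()]
publish(stores, {"customer_id": 1, "status": "Regular"})
publish(stores, {"customer_id": 1, "status": "Preferred"})

stores[0].catch_up()               # fully up to date
stores[1].catch_up(max_events=1)   # lagging one event behind
lagging_status = stores[1].rows[1]
stores[1].catch_up()               # eventually converges
```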

This gives you cheap horizontal scaling for your queries. Also, since you’re not doing nearly as much transformation, the latency per query goes down as well. Simple code is fast code.

Data modifications

Since our users are making decisions based on stale data, we need to be more discerning about which things we let through. Here’s a scenario explaining why:

Let’s say we have a customer service representative who is on the phone with a customer. This user is looking at the customer’s details on the screen and wants to make them a ‘preferred’ customer, as well as modifying their address, changing their title from Ms to Mrs, changing their last name, and indicating that they’re now married. What the user doesn’t know is that after opening the screen, an event arrived from the billing department indicating that this same customer doesn’t pay their bills – they’re delinquent. At this point, our user submits their changes.

Should we accept their changes?

Well, we should accept some of them, but not the change to ‘preferred’, since the customer is delinquent. But writing those kinds of checks is a pain – we need to do a diff on the data, infer what the changes mean, which ones are related to each other (name change, title change) and which are separate, identify which data to check against – not just compared to the data the user retrieved, but compared to the current state in the database, and then reject or accept.

Unfortunately for our users, we tend to reject the whole thing if any part of it is off. At that point, our users have to refresh their screen to get the up-to-date data, and retype in all the previous changes, hoping that this time we won’t yell at them because of an optimistic concurrency conflict.

As we get larger entities with more fields on them, we also get more actors working with those same entities, and the higher the likelihood that something will touch some attribute of them at any given time, increasing the number of concurrency conflicts.

If only there was some way for our users to provide us with the right level of granularity and intent when modifying data. That’s what commands are all about.

Commands

A core element of CQRS is rethinking the design of the user interface to enable us to capture our users’ intent such that making a customer preferred is a different unit of work for the user than indicating that the customer has moved or that they’ve gotten married. Using an Excel-like UI for data changes doesn’t capture intent, as we saw above.
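For the customer service scenario above, capturing intent might look something like this in Python. The command names and fields are my own guesses at a reasonable shape, not a prescribed schema; what matters is that each command is its own small message rather than one big "save customer" blob.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MakeCustomerPreferred:
    customer_id: int


@dataclass(frozen=True)
class CustomerMoved:
    customer_id: int
    new_address: str


@dataclass(frozen=True)
class CustomerGotMarried:
    customer_id: int
    new_last_name: str
    new_title: str


# What used to be one big "save customer" becomes three separate units of work.
commands = [
    MakeCustomerPreferred(42),
    CustomerMoved(42, "1 New Street"),
    CustomerGotMarried(42, "Smith", "Mrs"),
]
intents = [type(command).__name__ for command in commands]
```

The server no longer has to diff fields to infer what the user meant, and rejecting `MakeCustomerPreferred` doesn't force the user to retype the address.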

We could even consider allowing our users to submit a new command even before they’ve received confirmation on the previous one. We could have a little widget on the side showing the user their pending commands, checking them off asynchronously as we receive confirmation from the server, or marking them with an X if they fail. The user could then double-click that failed task to find information about what happened.

Note that the client sends commands to the server – it doesn’t publish them. Publishing is reserved for events which state a fact – that something has happened, and that the publisher has no concern about what receivers of that event do with it.

Commands and Validation

In thinking through what could make a command fail, one topic that comes up is validation. Validation is different from business rules in that it states a context-independent fact about a command. Either a command is valid, or it isn’t. Business rules on the other hand are context dependent.

In the example we saw before, the data our customer service rep submitted was valid, it was only due to the billing event arriving earlier which required the command to be rejected. Had that billing event not arrived, the data would have been accepted.

Even though a command may be valid, there still may be reasons to reject it.

As such, validation can be performed on the client, checking that all fields required for that command are there, number and date ranges are OK, that kind of thing. The server would still validate all commands that arrive, not trusting clients to do the validation.
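The distinction can be sketched in a few lines of Python. The function names and the `delinquent_customers` set are assumptions made for the example: `validate` looks only at the command, while `handle` also looks at the current state on the server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MakeCustomerPreferred:
    customer_id: int


def validate(command):
    """Validation: context-independent, depends only on the command itself."""
    return isinstance(command.customer_id, int) and command.customer_id > 0


def handle(command, delinquent_customers):
    """Business rule: context-dependent, depends on current server-side state."""
    if not validate(command):  # the server never trusts client-side validation
        return "invalid"
    if command.customer_id in delinquent_customers:
        return "rejected"
    return "accepted"
```

The same valid command gets `"accepted"` with an empty set of delinquent customers and `"rejected"` once billing has flagged customer 42.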

Rethinking UIs and commands in light of validation

The client can make use of the query data store when validating commands. For example, before submitting a command that the customer has moved, we can check that the street name exists in the query data store.

At that point, we may rethink the UI and have an auto-completing text box for the street name, thus ensuring that the street name we’ll pass in the command will be valid. But why not take things a step further? Why not pass in the street ID instead of its name? Have the command represent the street not as a string, but as an ID (int, guid, whatever).

On the server side, the only reason that such a command would fail would be due to concurrency – that someone had deleted that street and that that hadn’t been reflected in the query store yet; a fairly exceptional set of circumstances.

Reasons valid commands fail and what to do about it

So we’ve got a well-behaved client that is sending valid commands, yet the server still decides to reject them. Often the circumstances for the rejection are related to other actors changing state relevant to the processing of that command.

In the CRM example above, it is only because the billing event arrived first. But “first” could be a millisecond before our command. What if our user pressed the button a millisecond earlier? Should that actually change the business outcome? Shouldn’t we expect our system to behave the same when observed from the outside?

So, if the billing event arrived second, shouldn’t that revert preferred customers to regular ones? Not only that, but shouldn’t the customer be notified of this, like by sending them an email? In which case, why not have this be the behavior for the case where the billing event arrives first? And if we’ve already got a notification model set up, do we really need to return an error to the customer service rep? I mean, it’s not like they can do anything about it other than notifying the customer.

So, if we’re not returning errors to the client (who is already sending us valid commands), maybe all we need to do on the client when sending a command is to tell the user “thank you, you will receive confirmation via email shortly”. We don’t even need the UI widget showing pending commands.
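Here is one way to sketch that order-independence in Python. The `Customer` class, its method names and the notification text are illustrative assumptions; the thing to look at is that both arrival orders end in the same state and the same email.

```python
class Customer:
    def __init__(self):
        self.preferred = False
        self.delinquent = False
        self.notifications = []

    def make_preferred(self):
        # The command arrives; if billing got there first, notify instead of failing.
        if self.delinquent:
            self._notify_not_preferred()
        else:
            self.preferred = True

    def became_delinquent(self):
        # The billing event arrives; revert preferred status if it was already granted.
        self.delinquent = True
        if self.preferred:
            self.preferred = False
            self._notify_not_preferred()

    def _notify_not_preferred(self):
        self.notifications.append(
            "email: preferred status unavailable while account is delinquent"
        )


def final_state(steps):
    customer = Customer()
    for step in steps:
        getattr(customer, step)()
    return (customer.preferred, customer.delinquent, customer.notifications)


command_first = final_state(["make_preferred", "became_delinquent"])
event_first = final_state(["became_delinquent", "make_preferred"])
```

Neither path returns an error to the customer service rep; the customer simply gets the email.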

Commands and Autonomy

What we see is that in this model, commands don’t need to be processed immediately – they can be queued. How fast they get processed is a question of Service-Level Agreement (SLA) and not architecturally significant. This is one of the things that makes that node that processes commands autonomous from a runtime perspective – we don’t require an always-on connection to the client.

Also, we shouldn’t need to access the query store to process commands – any state that is needed should be managed by the autonomous component – that’s part of the meaning of autonomy.

Another part is the issue of failed message processing due to the database being down or hitting a deadlock. There is no reason that such errors should be returned to the client – we can just roll back and try again. When an administrator brings the database back up, all the messages waiting in the queue will then be processed successfully and our users will receive confirmation.
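A toy Python version of that behavior, assuming an in-memory `deque` in place of a durable queue and a fake `Database` class with an `up` flag. A command is only removed from the queue after the work has been saved, so a failure leaves it waiting rather than surfacing an error to the client.

```python
from collections import deque


class DatabaseDown(Exception):
    pass


class Database:
    """Fake database that can be switched between down and up."""

    def __init__(self):
        self.up = False
        self.saved = []

    def save(self, command):
        if not self.up:
            raise DatabaseDown()
        self.saved.append(command)


def process_queue(queue, db):
    """Process commands in order; on failure the command stays in the queue."""
    while queue:
        command = queue[0]
        try:
            db.save(command)
        except DatabaseDown:
            return False  # rolled back; nothing is reported to the client
        queue.popleft()   # only remove the command once the work is saved
    return True


queue = deque(["MakeCustomerPreferred:42", "CustomerMoved:42"])
db = Database()
first_pass = process_queue(queue, db)   # database is down, commands stay queued
db.up = True                            # the administrator brings it back up
second_pass = process_queue(queue, db)  # everything waiting is now processed
```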

The system as a whole is quite a bit more robust to any error conditions.

Also, since we don’t have queries going through this database any more, the database itself is able to keep more rows/pages in memory which serve commands, improving performance. When both commands and queries were being served off of the same tables, the database server was always juggling rows between the two.

Autonomous Components

While in the picture above we see all commands going to the same AC, we could logically have each command processed by a different AC, each with its own queue. That would give us visibility into which queue was the longest, letting us see very easily which part of the system was the bottleneck. While this is interesting for developers, it is critical for system administrators.

Since commands wait in queues, we can now add more processing nodes behind those queues (using the distributor with NServiceBus) so that we’re only scaling the part of the system that’s slow. No need to waste servers on any other requests.
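A minimal sketch of that visibility, assuming the pending-message count of each queue has already been read into a dictionary. The `QueueMonitor` class is a hypothetical helper, not part of NServiceBus, and how the counts are obtained is left out.

```csharp
using System;
using System.Collections.Generic;

public static class QueueMonitor
{
    // Returns the name of the queue with the most waiting commands,
    // or null when no queues are registered.
    public static string FindLongestQueue(IDictionary<string, int> pendingByQueue)
    {
        string longest = null;
        int max = -1;
        foreach (KeyValuePair<string, int> entry in pendingByQueue)
        {
            if (entry.Value > max)
            {
                max = entry.Value;
                longest = entry.Key;
            }
        }
        return longest;
    }
}
```

The queue this returns is the one to put more processing nodes behind.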

Service Layers

Our command processing objects in the various autonomous components actually make up our service layer. The reason you don’t see this layer explicitly represented in CQRS is that it isn’t really there, at least not as an identifiable logical collection of related objects – here’s why:

In the layered architecture (AKA 3-Tier) approach, there is no statement about dependencies between objects within a layer, or rather it is implied to be allowed. However, when taking a command-oriented view on the service layer, what we see are objects handling different types of commands. Each command is independent of the other, so why should we allow the objects which handle them to depend on each other?

Dependencies are things which should be avoided, unless there is good reason for them.

Keeping the command handling objects independent of each other will allow us to more easily version our system, one command at a time, not needing even to bring down the entire system, given that the new version is backwards compatible with the previous one.

Therefore, keep each command handler in its own VS project, or possibly even in its own solution, thus guiding developers away from introducing dependencies in the name of reuse (it’s a fallacy). If you do decide, as a deployment concern, that you want to put them all in the same process feeding off of the same queue, you can ILMerge those assemblies and host them together, but understand that you will be undoing many of the benefits of your autonomous components.

Whither the domain model?

Although in the diagram above you can see the domain model beside the command-processing autonomous components, it’s actually an implementation detail. There is nothing that states that all commands must be processed by the same domain model. Arguably, you could have some commands be processed by transaction script, others using table module (AKA active record), as well as those using the domain model. Event-sourcing is another possible implementation.

Another thing to understand about the domain model is that it now isn’t used to serve queries. So the question is, why do you need to have so many relationships between entities in your domain model?

(You may want to take a second to let that sink in.)

Do we really need a collection of orders on the customer entity? In what command would we need to navigate that collection? In fact, what kind of command would need any one-to-many relationship? And if that’s the case for one-to-many, many-to-many would definitely be out as well. I mean, most commands only contain one or two IDs in them anyway.

Any aggregate operations that may have been calculated by looping over child entities could be pre-calculated and stored as properties on the parent entity. Following this process across all the entities in our domain would result in isolated entities needing nothing more than a couple of properties for the IDs of their related entities – “children” holding the parent ID, like in databases.

In this form, commands could be entirely processed by a single entity – voilà, an aggregate root that is a consistency boundary.
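To make that concrete, here is a small sketch of what such isolated entities might look like. The `Customer` and `Order` classes, their properties, and the threshold rule are all hypothetical and chosen only for illustration.

```csharp
using System;

// Hypothetical parent entity: no Orders collection, only pre-calculated values.
public class Customer
{
    public Customer(Guid id)
    {
        Id = id;
    }

    public Guid Id { get; private set; }

    // Aggregates that would otherwise be computed by looping over child orders.
    public int OrderCount { get; private set; }
    public decimal TotalOrderValue { get; private set; }

    // Called when an order for this customer is accepted.
    public void RecordOrderAccepted(decimal orderValue)
    {
        OrderCount++;
        TotalOrderValue += orderValue;
    }

    // A command can be decided by this one entity alone.
    public bool QualifiesForPreferredStatus(decimal threshold)
    {
        return TotalOrderValue >= threshold;
    }
}

// Hypothetical child entity: it holds the parent's ID, like a foreign key.
public class Order
{
    public Order(Guid id, Guid customerId, decimal value)
    {
        Id = id;
        CustomerId = customerId;
        Value = value;
    }

    public Guid Id { get; private set; }
    public Guid CustomerId { get; private set; }
    public decimal Value { get; private set; }
}
```

Processing a command such as "make this customer preferred" then loads a single `Customer` by ID and never navigates a one-to-many relationship.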

Persistence for command processing

Given that the database used for command processing is not used for querying, and that most (if not all) commands contain the IDs of the rows they’re going to affect, do we really need to have a column for every single domain object property? What if we just serialized the domain entity and put it into a single column, and had another column containing the ID? This sounds quite similar to key-value storage that is available in the various cloud providers. In which case, would you really need an object-relational mapper to persist to this kind of storage?

You could also pull out an additional property per piece of data where you’d want the “database” to enforce uniqueness.
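Here is one way that idea could be sketched, assuming a runtime where `System.Text.Json` is available. The `CustomerState` and `BlobStore` classes are hypothetical; two in-memory dictionaries stand in for a table with an ID column, a serialized-entity column, and one extra column (email, as an example) pulled out so that uniqueness can be enforced.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical entity state; the property names are illustrative only.
public class CustomerState
{
    public Guid Id { get; set; }
    public string Email { get; set; }
    public bool IsPreferred { get; set; }
}

// Hypothetical in-memory stand-in for a key-value style table.
public class BlobStore
{
    // ID column -> serialized entity column.
    private readonly Dictionary<Guid, string> rows = new Dictionary<Guid, string>();

    // The one property pulled out of the blob so uniqueness can be enforced.
    private readonly Dictionary<string, Guid> emailIndex = new Dictionary<string, Guid>();

    public void Save(CustomerState customer)
    {
        Guid owner;
        if (emailIndex.TryGetValue(customer.Email, out owner) && owner != customer.Id)
            throw new InvalidOperationException("Email already in use: " + customer.Email);

        // If this customer was stored before, release the email it used to hold.
        string previous;
        if (rows.TryGetValue(customer.Id, out previous))
            emailIndex.Remove(JsonSerializer.Deserialize<CustomerState>(previous).Email);

        emailIndex[customer.Email] = customer.Id;
        rows[customer.Id] = JsonSerializer.Serialize(customer);
    }

    public CustomerState Load(Guid id)
    {
        return JsonSerializer.Deserialize<CustomerState>(rows[id]);
    }
}
```

Nothing here needs an object-relational mapper: loading is a lookup by ID followed by deserialization.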

I’m not suggesting that you do this in all cases – rather just trying to get you to rethink some basic assumptions.

Let me reiterate

How you process the commands is an implementation detail of CQRS.

Keeping the query store in sync

After the command-processing autonomous component has decided to accept a command, modifying its persistent store as needed, it publishes an event notifying the world about it. This event often is the “past tense” of the command submitted:

MakeCustomerPreferredCommand -> CustomerHasBeenMadePreferredEvent

The publishing of the event is done transactionally together with the processing of the command and the changes to its database. That way, any kind of failure on commit will result in the event not being sent. This is something that should be handled by default by your message bus, and if you’re using MSMQ as your underlying transport, requires the use of transactional queues.

The autonomous component which processes those events and updates the query data store is fairly simple, translating from the event structure to the persistent view model structure. I suggest having an event handler per view model class (AKA per table).
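The two halves of that flow can be sketched as follows. This is a simplified, in-memory illustration and not the NServiceBus API: `UnitOfWork` and `PreferredCustomersViewUpdater` are hypothetical names, and buffering events until the commit succeeds only approximates what a transactional queue gives you.

```csharp
using System;
using System.Collections.Generic;

// Illustrative event, named as the past tense of a "make customer preferred" command.
public class CustomerHasBeenMadePreferredEvent
{
    public Guid CustomerId { get; set; }
}

// Hypothetical unit of work: events are buffered and only handed to
// subscribers if the commit succeeds.
public class UnitOfWork
{
    private readonly List<object> pendingEvents = new List<object>();
    private readonly Action<object> publish;

    public UnitOfWork(Action<object> publish)
    {
        this.publish = publish;
    }

    public void Publish(object evt)
    {
        pendingEvents.Add(evt);
    }

    public void Commit(Action saveChanges)
    {
        try
        {
            saveChanges(); // may throw, for example on a deadlock
        }
        catch
        {
            pendingEvents.Clear(); // failure on commit: the event is not sent
            throw;
        }

        foreach (object evt in pendingEvents)
            publish(evt);
        pendingEvents.Clear();
    }
}

// One event handler per view model class (per table in the query store).
public class PreferredCustomersViewUpdater
{
    // Stand-in for a "preferred customers" table in the query data store.
    public HashSet<Guid> PreferredCustomerIds { get; } = new HashSet<Guid>();

    public void Handle(CustomerHasBeenMadePreferredEvent evt)
    {
        PreferredCustomerIds.Add(evt.CustomerId);
    }
}
```

The handler on the query side does nothing but translate the event into the shape of the view model, which is why it stays so simple.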

Here’s the picture of all the pieces again:

CQRS

Bounded Contexts

While CQRS touches on many pieces of software architecture, it is still not at the top of the food chain. CQRS, if used, is employed within a bounded context (DDD) or a business component (SOA) – a cohesive piece of the problem domain. The events published by one BC are subscribed to by other BCs, each updating their query and command data stores as needed.

UIs from the CQRS found in each BC can be “mashed up” in a single application, providing users a single composite view on all parts of the problem domain. Composite UI frameworks are very useful for these cases.

Summary

CQRS is about coming up with an appropriate architecture for multi-user collaborative applications. It explicitly takes into account factors like data staleness and volatility and exploits those characteristics for creating simpler and more scalable constructs.

One cannot truly enjoy the benefits of CQRS without considering the user-interface, making it capture user intent explicitly. When taking into account client-side validation, command structures may be somewhat adjusted. Thinking through the order in which commands and events are processed can lead to notification patterns which make returning errors unnecessary.

While the result of applying CQRS to a given project is a more maintainable and performant code base, this simplicity and scalability require understanding the detailed business requirements and are not the result of any technical “best practice”. If anything, we can see a plethora of approaches to apparently similar problems being used together – data readers and domain models, one-way messaging and synchronous calls.

Although this blog post is over 3000 words (a record for this blog), I know that it doesn’t go into enough depth on the topic (it takes about 3 days out of the 5 of my Advanced Distributed Systems Design course to cover everything in enough depth). Still, I hope it has given you the understanding of why CQRS is the way it is and possibly opened your eyes to other ways of looking at the design of distributed systems.

Questions and comments are most welcome.



The Fallacy Of ReUse

Sunday, June 7th, 2009

This industry is pre-occupied with reuse.

There’s this belief that if we just reused more code, everything would be better.

Some even go so far as saying that the whole point of object-orientation was reuse – it wasn’t, encapsulation was the big thing. After that, component-orientation was the thing that was supposed to make reuse happen. Apparently that didn’t pan out so well either, because here we are now pinning our reuseful hopes on service-orientation.

Entire books of patterns have been written on how to achieve reuse with the orientation of the day.
Services have been classified every which way in trying to achieve this, from entity services and activity services, through process services and orchestration services. Composing services has been touted as the key to reusing, and creating reusable services.

I might as well let you in on the dirty little secret:

Reuse is a fallacy

Before running too far ahead, let’s go back to what the actual goal of reuse was: getting done faster.

That’s it.

It’s a fine goal to have.

And here’s how reuse fits into the picture:

If we were to write all the code of a system, we’d write a certain amount of code.
If we could reuse some code from somewhere else that was written before, we could write less code.
The more code we can reuse, the less code we write.
The less code we write, the sooner we’ll be done!

However, the above logical progression is based on another couple of fallacies:

Fallacy: All code takes the same amount of time to write

Fallacy: Writing code is the primary activity in getting a system done

Anyone who’s actually written some code that’s gone into production knows this.

There’s the time it takes us to understand what the system should do.
Multiply that by the time it takes the users to understand what the system should do 🙂
Then there’s the integrating that code with all the other code, databases, configuration, web services, etc.
Debugging. Deploying. Debugging. Rebugging. Meetings. Etc.

Writing code is actually the least of our worries.
We actually spend less time writing code than…

Rebugging code

Also known as bug regressions.

This is where we fix one piece of code, and in the process break another piece of code.
It’s not like we do it on purpose. It’s all those dependencies between the various bits of code.
The more dependencies there are, the more likely something’s gonna break.
Especially when we have all sorts of hidden dependencies,
like when other code uses stuff we put in the database without asking us what it means,
or, heaven forbid, changing it without telling us.

These debugging/rebugging cycles can make stabilizing a system take a long time.

So, how does reuse help/hinder with that?

Here’s how:

Dependencies multiply by reuse

It’s to be expected. If you wrote the code all in one place, there are no dependencies. By reusing code, you’ve created a dependency. The more you reuse, the more dependencies you have. The more dependencies, the more rebugging.

Of course, we need to keep in mind the difference between…

Reuse & Use

Your code uses the runtime API (JDK, .NET BCL, etc).
Likewise other frameworks like (N)Hibernate, Spring, WCF, etc.

Reuse happens when you extend and override existing behaviors within other code.
This is most often done by inheritance in OO languages.

Interestingly enough, by the above generally accepted definition, most web services “reuse” is actually really use.

Let’s take a look at the characteristics of the code we’re using and reusing to see where we get the greatest value:

The value of (re)use

If we were to (re)use a piece of code in only one part of our system, it would be safe to say that we would get less value than if we could (re)use it in more places. For example, we could say that for many web applications, the web framework we use provides more value than a given encryption algorithm that we may use in only a few places.

So, what characterizes the code we use in many places?

Well, it’s very generic.

Actually, the more generic a piece of code, the less likely it is that we’ll be changing something in it when fixing a bug in the system.

That’s important.

However, when looking at the kind of code we reuse, and the reasons around it, we tend to see very non-generic code – something that deals with the domain-specific behaviors of the system. Thus, the likelihood of a bug fix needing to touch that code is higher than in the generic/use-not-reuse case, often much higher.

How it all fits together

Goal: Getting done faster
Via: Spending less time debugging/rebugging/stabilizing
Via: Having fewer dependencies reasonably requiring a bug fix to touch the dependent side
Via: Not reusing non-generic code

This doesn’t mean you shouldn’t use generic code / frameworks where applicable – absolutely, you should.
Just watch the number and kind of dependencies you introduce.

Back to services

So, if we follow the above advice with services, we wouldn’t want domain specific services reusing each other.
If we could get away with it, we probably wouldn’t even want them using each other either.

As use and reuse go down, we can see that service autonomy goes up. And vice-versa.
Luckily, we have service interaction mechanisms from Event-Driven Architecture that enable use without breaking autonomy.
Autonomy is actually very similar to the principle of encapsulation that drove object-orientation in the first place.
Interesting, isn’t it?

In summary

We all want to get done faster.

Way back when, someone told us reuse was the way to do that.

They were wrong.

Reuse may make sense in the most tightly coupled pieces of code you have, but not very much anywhere else.

When designing services in your SOA, stay away from reuse, and minimize use (with EDA patterns).

The next time someone pulls the “reuse excuse”, you’ll be ready.



Saga Persistence and Event-Driven Architectures

Monday, April 20th, 2009

When working with clients, I run into more than a couple of people that have difficulty with event-driven architecture (EDA). Even more people have difficulty understanding what sagas really are, let alone why they need to use them. I’d go so far as to say that many people don’t realize the importance of how sagas are persisted in making it all work (including the Workflow Foundation team).

The common e-commerce example

We accept orders, bill the customer, and then ship them the product.

Fairly straight-forward.

Since each part of that process can be quite complex, let’s have each step be handled by a service:

Sales, Billing, and Shipping. Each of these services will publish an event when it’s done its part. Sales will publish OrderAccepted containing all the order information – order Id, customer Id, products, quantities, etc. Billing will publish CustomerBilledForOrder containing the customer Id, order Id, etc. And Shipping will publish OrderShippedToCustomer with its data.

So far, so good. EDA and SOA seem to be providing us some value.

Where’s the saga?

Well, let’s consider the behavior of the Shipping service. It shouldn’t ship the order to the customer until it has received the CustomerBilledForOrder event as well as the OrderAccepted event. In other words, Shipping needs to hold on to the state that came in the first event until the second event comes in. And this is exactly what sagas are for.

Let’s take a look at the saga code that implements this. In order to simplify the sample a bit, I’ll be omitting the product quantities.

    public class ShippingSaga : Saga<ShippingSagaData>,
        ISagaStartedBy<OrderAccepted>,
        ISagaStartedBy<CustomerBilledForOrder>
    {
        public void Handle(OrderAccepted message)
        {
            this.Data.ProductIdsInOrder = message.ProductIdsInOrder;
        }

        public void Handle(CustomerBilledForOrder message)
        {
            this.Bus.Send<ShipOrderToCustomer>(
                (m =>
                {
                    m.CustomerId = message.CustomerId;
                    m.OrderId = message.OrderId;
                    m.ProductIdsInOrder = this.Data.ProductIdsInOrder;
                }
                ));

            this.MarkAsComplete();
        }

        public override void Timeout(object state)
        {
        }
    }

First of all, this looks fairly simple and straightforward, which is good.
It’s also wrong, which is not so good.

One problem we have here is that events may arrive out of order – first CustomerBilledForOrder, and only then OrderAccepted. What would happen in the above saga in that case? Well, we wouldn’t end up shipping the products to the customer, and customers tend not to like that (for some reason).

There’s also another problem here. See if you can spot it as I go through the explanation of ISagaStartedBy<T>.

Saga start up and correlation

The “ISagaStartedBy<T>” that is implemented for both messages indicates to the infrastructure (NServiceBus) that when a message of that type arrives, if an existing saga instance cannot be found, a new instance should be started up. Makes sense, doesn’t it? For a given order, when the OrderAccepted event arrives first, Shipping doesn’t currently have any sagas handling it, so it starts up a new one. After that, when the CustomerBilledForOrder event arrives for that same order, the event should be handled by the saga instance that handled the first event – not by a new one.

I’ll repeat the important part: “the event should be handled by the saga instance that handled the first event”.

Since the only information we stored in the saga was the list of products, how would we be able to look up that saga instance when the next event came in containing an order Id, but no saga Id?

OK, so we need to store the order Id from the first event so that when the second event comes along we’ll be able to find the saga based on that order Id. Not too complicated, but something to keep in mind.

Let’s look at the updated code:

    public class ShippingSaga : Saga<ShippingSagaData>,
        ISagaStartedBy<OrderAccepted>,
        ISagaStartedBy<CustomerBilledForOrder>
    {
        public void Handle(CustomerBilledForOrder message)
        {
            this.Data.CustomerHasBeenBilled = true;

            // Store the IDs so the other event can be correlated to this saga.
            this.Data.CustomerId = message.CustomerId;
            this.Data.OrderId = message.OrderId;

            this.CompleteIfPossible();
        }

        public void Handle(OrderAccepted message)
        {
            this.Data.ProductIdsInOrder = message.ProductIdsInOrder;

            // Same IDs as above: either event may be the one that starts the saga.
            this.Data.CustomerId = message.CustomerId;
            this.Data.OrderId = message.OrderId;

            this.CompleteIfPossible();
        }

        // Ships only once both events have been handled, in either order.
        private void CompleteIfPossible()
        {
            if (this.Data.ProductIdsInOrder != null && this.Data.CustomerHasBeenBilled)
            {
                this.Bus.Send<ShipOrderToCustomer>(
                   (m =>
                   {
                       m.CustomerId = this.Data.CustomerId;
                       m.OrderId = this.Data.OrderId;
                       m.ProductIdsInOrder = this.Data.ProductIdsInOrder;
                   }
                   ));
                this.MarkAsComplete();
            }
        }
    }

And that brings us to…

Saga persistence

We already saw why Shipping needs to be able to look up its internal sagas using data from the events, but what that means is that simple blob-type persistence of those sagas is out. NServiceBus comes with an NHibernate-based saga persister for exactly this reason, though any persistence mechanism which allows you to query on something other than saga Id would work just as well.

Let’s take a quick look at the saga data that we’ll be storing and see how simple it is:

    public class ShippingSagaData : ISagaEntity
    {
        public virtual Guid Id { get; set; }
        public virtual string Originator { get; set; }
        public virtual Guid OrderId { get; set; }
        public virtual Guid CustomerId { get; set; }
        public virtual List<Guid> ProductIdsInOrder { get; set; }
        public virtual bool CustomerHasBeenBilled { get; set; }
    }

You might have noticed the “Originator” property in there and wondered what it is for. First of all, the ISagaEntity interface requires the two properties Id and Originator. Originator is used to store the return address of the message that started the saga. Id is for what you think it’s for. In this saga, we don’t need to send any messages back to whoever started the saga, but in many others we do. In those cases, we’ll often be handling a message from some other endpoint when we want to possibly report some status back to the client that started the process. By storing that client’s address the first time, we can then “ReplyToOriginator” at any point in the process.

The manufacturing sample that comes with NServiceBus shows how this works.

Saga Lookup

Earlier, we saw the need to search for sagas based on order Id. The way to hook into the infrastructure and perform these lookups is by implementing “IFindSagas<T>.Using<M>” where T is the type of the saga data and M is the type of message. In our example, doing this using NHibernate would look like this:

    public class ShippingSagaFinder : 
        IFindSagas<ShippingSagaData>.Using<OrderAccepted>,
        IFindSagas<ShippingSagaData>.Using<CustomerBilledForOrder>
    {
        public ShippingSagaData FindBy(CustomerBilledForOrder message)
        {
            return FindBy(message.OrderId);
        }

        public ShippingSagaData FindBy(OrderAccepted message)
        {
            return FindBy(message.OrderId);
        }

        // Both events carry the order Id, so both lookups share one query.
        private ShippingSagaData FindBy(Guid orderId)
        {
            return sessionFactory.GetCurrentSession().CreateCriteria(typeof(ShippingSagaData))
                .Add(Expression.Eq("OrderId", orderId))
                .UniqueResult<ShippingSagaData>();
        }

        private ISessionFactory sessionFactory;

        public virtual ISessionFactory SessionFactory
        {
            get { return sessionFactory; }
            set { sessionFactory = value; }
        }
    }

For a performance boost, we’d probably index our saga data by order Id.

On concurrency

Another important note is that for this saga, if both messages were handled in parallel on different machines, the saga could get stuck. The persistence mechanism here needs to prevent this. When using NHibernate over a database with the appropriate isolation level (Repeatable Read – the default in NServiceBus), this “just works”. If/When implementing your own saga persistence mechanism, it is important to understand the kind of concurrency your business logic can live with.
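For those implementing their own persistence, here is one possible shape of such protection. Note that this sketch uses optimistic version checking, which is a different technique from the Repeatable Read isolation mentioned above, and that `VersionedSagaData`, `ConcurrencyException` and `InMemorySagaStore` are hypothetical names rather than NServiceBus types.

```csharp
using System;
using System.Collections.Generic;

public class VersionedSagaData
{
    public Guid OrderId { get; set; }
    public bool OrderAccepted { get; set; }
    public bool CustomerHasBeenBilled { get; set; }
    public int Version { get; set; }

    public VersionedSagaData Clone()
    {
        return (VersionedSagaData)MemberwiseClone();
    }
}

public class ConcurrencyException : Exception
{
    public ConcurrencyException(string message) : base(message) { }
}

// Hypothetical store: a save succeeds only if nobody else saved the same
// saga since the caller loaded it.
public class InMemorySagaStore
{
    private readonly Dictionary<Guid, VersionedSagaData> rows = new Dictionary<Guid, VersionedSagaData>();
    private readonly object gate = new object();

    public VersionedSagaData Load(Guid orderId)
    {
        lock (gate)
        {
            VersionedSagaData row;
            return rows.TryGetValue(orderId, out row)
                ? row.Clone()
                : new VersionedSagaData { OrderId = orderId };
        }
    }

    public void Save(VersionedSagaData data)
    {
        lock (gate)
        {
            VersionedSagaData current;
            int storedVersion = rows.TryGetValue(data.OrderId, out current) ? current.Version : 0;
            if (storedVersion != data.Version)
                throw new ConcurrencyException("Saga was changed by another handler; retry the message.");

            VersionedSagaData copy = data.Clone();
            copy.Version = storedVersion + 1;
            rows[data.OrderId] = copy;
        }
    }
}
```

When the second of two parallel handlers hits the exception, its message is rolled back and retried. On the retry it loads the state written by the first handler, so the saga sees both events and can complete instead of getting stuck.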

Take a look at Ayende’s example for mobile phone billing to get a feeling for what that’s like.

Summary

In almost any event-driven architecture, you’ll have services correlating multiple events in order to make decisions. The saga pattern is a great fit there, and not at all difficult to implement. You do need to take into account that events may arrive out of order and implement the saga logic accordingly, but it’s really not that big a deal. Do take the time to think through what data will need to be stored in order for the saga to be fault-tolerant, as well as a persistence mechanism that will allow you to look up that data based on event data.

If you feel like giving this approach a try, but don’t have an environment handy for this, download NServiceBus and take a look at the samples. It’s really quick and easy to get set up.



Backwards-Compatibility: Why Most Versioning Problems Aren’t

Friday, April 10th, 2009

I’ve been to too many clients where I’ve been brought in to help them with their problems around service versioning when the solution I propose is simply to have version N+1 of the system be backwards-compatible with version N. If two adjacent versions of a given system aren’t compatible with each other, it is practically impossible to solve versioning issues.

Here’s what happens when versions aren’t compatible:

Admins stop the system from accepting any new requests, and wait until all current requests are done processing. They take a backup/snapshot of all relevant parts of the system (like data in the DB). Then, bring down the system – all of it. Install the new version on all machines. Bring everything back up. Let the users back in.

If, heaven forbid, problems were uncovered with the new version (since some problems only appear in production), the admins have to roll back to the previous version – once again bringing everything down.

This scenario is fairly catastrophic for any company that requires even fairly continuous availability, never mind high availability – like public-facing web apps.

If adjacent versions were compatible with each other, we could upgrade the system piecemeal – machine by machine, with both the old and new versions running side by side, communicating with each other. While the system’s performance may be sub-optimal, it will continue to be available throughout upgrades as well as downgrades.

This isn’t trivial to do.

It impacts how you decide what is (and more importantly, what isn’t) nullable.

It may force you to spread certain changes to features across more versions (aka releases).

As such, you can expect this to affect how you do release and feature planning.
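As a small illustration of the nullability point, consider a message that gains a field in version N+1. Everything here is hypothetical: the `RequestedDeliveryDate` field, the `DeliveryPlanner` class and the three-day default are invented for the sketch.

```csharp
using System;

// Hypothetical message contract. Version N had only OrderId and CustomerId.
// Version N+1 adds a field, and makes it nullable so that messages produced
// by version N senders (which omit it) remain valid.
public class OrderAccepted
{
    public Guid OrderId { get; set; }
    public Guid CustomerId { get; set; }
    public DateTime? RequestedDeliveryDate { get; set; } // added in N+1
}

public static class DeliveryPlanner
{
    // Version N+1 handler: falls back to the version N behavior
    // (an assumed fixed three-day lead time) when the new field is absent.
    public static DateTime PlanDelivery(OrderAccepted message, DateTime acceptedOn)
    {
        return message.RequestedDeliveryDate ?? acceptedOn.AddDays(3);
    }
}
```

Had the new field been required, a version N+1 receiver would have to reject messages from machines still running version N, and the two versions could not run side by side. Making it mandatory is the kind of change that gets spread across a later release, once no version N senders remain.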

However, if you do not take these factors into account, it’s almost a certainty that your versioning problems will persist and no technology (new or old) will be able to solve them.

Coming next… Units of versioning – inside and outside a service.



Self-Contained Events and SOA

Saturday, December 13th, 2008

In the architectural principle of fully self contained messages, events “can – instantly and in future – be interpreted as the respective event without the need to rely on additional data stores that would need to be in time-sync with the event during message-processing.”

Also, “passing reference data in a message makes the message-consuming systems dependent on the knowledge and availability of actual persistent data that is stored “somewhere”. This data must separately be accessed for the sake of understanding the event that is represented by the message.”

The discussion of self-contained events can be compared to integration databases vs application databases.

Centralized Integration – Pros & Cons

If everything in a system can access a central datastore, it is enough for one party to publish an event containing only the ID of an entity that that party previously entered/updated. Upon receiving that event, a subscriber would go to the central datastore and get the fields it’s interested in for that ID. The advantage of this approach is that the minimal amount of data necessary crosses the network, as subscribers only retrieve the fields that interest them. Martin Fowler describes the disadvantages as:

“An integration database needs a schema that takes all its client applications into account. The resulting schema is either more general, more complex or both. The database usually is controlled by a separate group to the applications and database changes are more complex because they have to be negotiated between the database group and the various applications.”

This is far from being aligned with the principle of autonomy so important to SOA. In that respect, the architectural principle of self-contained messages points us away from those problems and towards more autonomous services.

However, once we have these autonomous business services in place, we may find that we don’t need 100% fully self-contained messages anymore.

A Real-World Example

Let’s say we have 3 business services, Sales, Fulfillment, and Billing.

Sales publishes an OrderAccepted event when it accepts an order. That event contains all the order information.

Both Fulfillment and Billing are subscribed to this event, and thus receive it.

Fulfillment does not ship products to the customer until the customer has been billed, so it just stores the order information internally, and is done.

Billing starts the process of billing the customer for their order, possibly joining several orders into a single bill. After completing this process, it publishes a CustomerBilled event containing all billing information, as well as the IDs of the orders in that bill. It does not put all the order information in that event, as it is not the authoritative owner of that data.

When Fulfillment receives the CustomerBilled event, it uses the IDs of the orders contained in the event to find the order information it previously stored internally. It does not need to call the Sales service for this information or contact some central Master Data Management system. It uses the data it has, and goes about fulfilling the orders and shipping the products to the customer, finally publishing its own OrderShipped event.

Notice, as well, that in the original OrderAccepted event there were the IDs of products the customer ordered. These product IDs originated from another service, Merchandising, responsible for the product catalog. The same thing can be said for the customer ID originating from another service – Customer Care.

The Issue of Time

One could argue that since subscribers use previously cached data when processing new events, that data might not be up to date. Also, we may have race conditions between our services. In the above example, if Billing were extremely fast and more highly available than Fulfillment, Billing could have received the OrderAccepted event, processed it, and published the CustomerBilled event before Fulfillment had received the OrderAccepted event. In short, the CustomerBilled and OrderAccepted messages could be out of order in Fulfillment’s queue.

What would Fulfillment do when trying to process the CustomerBilled message when it doesn’t have the order information?

Well, it knows that the world is parallel and non-sequential, so it does NOT return/log an error, but rather puts that message in the back of the queue to be processed again later (or maybe in some other temporary holding area). This enables the OrderAccepted message to be processed before the CustomerBilled message is retried. When the retry occurs, well, everything’s OK – it’s worked itself out over time.

In the case where we retry again and again and things don’t work themselves out (maybe the OrderAccepted event was lost), we move that message off to a different queue for something else to resolve the conflict (maybe a person, maybe software). If/when the conflict is resolved (got the Sales system / messaging system to replay the OrderAccepted event), the conflict resolver returns the CustomerBilled message to the queue, and now everything works just fine.
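The retry behavior described in the last two paragraphs can be sketched like this. It is an in-memory illustration only: `Envelope` and `FulfillmentEndpoint` are hypothetical names, the limit of three attempts is arbitrary, and a real endpoint would use durable queues.

```csharp
using System;
using System.Collections.Generic;

public class Envelope
{
    public string Type;    // "OrderAccepted" or "CustomerBilled"
    public Guid OrderId;
    public int Attempts;   // how many times processing has been tried
}

// Hypothetical sketch of Fulfillment's message loop.
public class FulfillmentEndpoint
{
    private const int MaxAttempts = 3; // arbitrary limit for this sketch
    private readonly HashSet<Guid> knownOrders = new HashSet<Guid>();

    public Queue<Envelope> Input { get; } = new Queue<Envelope>();
    public List<Envelope> ConflictQueue { get; } = new List<Envelope>();
    public List<Guid> Shipped { get; } = new List<Guid>();

    // Handles one message; returns false when the input queue is empty.
    public bool ProcessNext()
    {
        if (Input.Count == 0) return false;
        Envelope message = Input.Dequeue();

        if (message.Type == "OrderAccepted")
        {
            knownOrders.Add(message.OrderId); // store order information internally
        }
        else if (knownOrders.Contains(message.OrderId))
        {
            Shipped.Add(message.OrderId);     // both events seen: fulfil the order
        }
        else if (++message.Attempts >= MaxAttempts)
        {
            ConflictQueue.Add(message);       // hand off to a person or other software
        }
        else
        {
            Input.Enqueue(message);           // not an error: back of the queue, retry later
        }
        return true;
    }
}
```

A CustomerBilled message that arrives early simply goes around the queue once and is shipped on its second attempt. Only a message whose OrderAccepted never shows up ends in the conflict queue.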

As all of this is occurring, the only thing that’s visible to external parties is that it happens to be taking longer than usual for the OrderShipped event to be published. In other words, time is the only difference.

 

Summary

The problem of non-self-contained events is mitigated first and foremost by business services in SOA, and the apparent issue of time-synchronization by business logic inside these services.

Don’t be afraid to put IDs in your messages and events.

Do be afraid of using those IDs to access datastores shared by multiple “services”.

Using IDs to correlate current events to data from previous events is not only OK, it’s to be expected.

The architectural principle of fully self-contained messages steers us away from the problems of Integration Databases and towards Application Databases, autonomous services, and a better SOA implementation. From there, following the principle of autonomy from a business perspective will lead us to services not publishing data in their messages that is owned by other services, taking us the next step of our journey to SOA.





   







Creative Commons License  © Copyright 2005-2011, Udi Dahan. email@UdiDahan.com