Udi Dahan – The Software Simplist
Enterprise Development Expert & SOA Specialist
 
  
  

Archive for the ‘Availability’ Category



Asynchronous, High-Performance Login for Web Farms

Saturday, November 10th, 2007

Often during my consulting engagements I run into people who say, "some things just can’t be made asynchronous" even after they agree about the inherent scalability that asynchronous communication patterns bring. One often-cited example is user authentication – taking a username and password combo and authenticating it against some back-end store. For the purpose of this post, I’m going to assume a database. Also, I’m not going to be showing more advanced features like ETags to further improve the solution.

The Setup

Just so that the example is in itself secure, we’ll assume that the password is one-way hashed before being stored. Also, given a reasonable network infrastructure our web servers will be isolated in the DMZ and will have to access some application server which, in turn, will communicate with the DB. There’s also a good chance for something like round-robin load-balancing between web servers, especially for things like user login.

Before diving into the meat of it, I wanted to preface with a few words. One of the commonalities I’ve found when people dismiss asynchrony is that they don’t consider a real deployment environment, or scaling up a solution to multiple servers, farms, or datacenters.

The Synchronous Solution

In the synchronous solution, each one of our web servers will be contacting the app server for each user login request. In other words, the load on the app server and, consequently, on the database server will be proportional to the number of logins. One property of this load is its data locality, or rather, the lack of it. Given that user U logged in, the DB won’t necessarily gain any performance benefits by loading all username/password data into memory for the same page as user U. Another property is that this data is very non-volatile – it doesn’t change that often.

I won’t go too far into the synchronous solution since it’s been analysed numerous times before. The bottom line is that the database is the bottleneck. You could use sharding solutions. Many of the large sites have numerous read-only databases for this kind of data, with one master for updates – replicating out to the read-only replicas. That’s great if you’re using a nice cheap database like MySQL (of LAMP), not so nice if you’re running Oracle or MS SQL Server.

Regardless of what you’re doing in your data tier, you’re there. Wouldn’t it be nice to close the loop in the web servers? Even if you are using Apache, that’s going to be less iron, electricity, and cooling all around. That’s what the asynchronous solution is all about – capitalizing on the low cost of memory to save on other things.

The Asynchronous Solution

In the asynchronous solution, we cache username/hashed-password pairs in memory on our web servers, and authenticate against that. Let’s analyse how much memory that takes:

Usernames are usually 12 characters or less, but let’s take an average of 32 to be sure. Using Unicode (two bytes per character) we get to 64 bytes for the username. Hashed passwords can run between 256 and 512 bits depending on the algorithm; divide by 8 and you have at most 64 bytes. That’s about 128 bytes altogether. So we can safely cache 8 million of these with 1GB of memory per web server. If you’ve got a million users, first of all, good for you 🙂 Second, that’s just 128 MB of memory – relatively nothing even for a cheap 2GB web server.
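To make the arithmetic above concrete, here is a minimal sketch of the same estimate in C#. The class and member names are made up for illustration, and the numbers ignore the per-object and dictionary overhead of the runtime, so treat the result as a lower bound rather than a measurement.

```csharp
using System;

// Hypothetical helper that reproduces the back-of-the-envelope estimate.
public static class LoginCacheSizing
{
    public const int UsernameChars = 32;  // generous average username length (assumption from the text)
    public const int BytesPerChar = 2;    // two bytes per character, as in .NET strings
    public const int MaxHashBits = 512;   // largest hash size considered in the text

    // 32 chars * 2 bytes + 512 bits / 8 = 64 + 64 bytes
    public static long BytesPerUser()
    {
        return UsernameChars * BytesPerChar + MaxHashBits / 8;
    }

    // How many username/hash pairs fit into a cache of the given size.
    public static long UsersThatFit(long cacheBytes)
    {
        return cacheBytes / BytesPerUser();
    }

    // How much raw memory a given number of users needs.
    public static long CacheBytesFor(long users)
    {
        return users * BytesPerUser();
    }
}
```

Under these assumptions `BytesPerUser()` comes to 128, `UsersThatFit(1L << 30)` to 8,388,608, and `CacheBytesFor(1000000)` to 128,000,000 bytes.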

Also, consider the fact that when registering a new user we can check if such a username is already taken at the web server level. That doesn’t mean it won’t be checked again in the DB to account for concurrency issues, but that the load on the DB is further reduced. Other things to notice include no read-only replicas and no replication. Simple. Our web servers are the "replicas".
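As a rough sketch of what such a web-server-level cache might look like, consider the class below. The name `InMemoryCredentialCache`, its methods, and the choice to treat usernames as case-insensitive are all assumptions made for illustration; this is not the `WebCache` used later in the post. A lock guards the dictionary because requests and incoming messages would touch it from different threads.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory cache of username -> hashed password on a web server.
public class InMemoryCredentialCache
{
    private readonly object sync = new object();

    // Assumption: usernames are compared case-insensitively.
    private readonly Dictionary<string, byte[]> entries =
        new Dictionary<string, byte[]>(StringComparer.OrdinalIgnoreCase);

    // Called when a published username/hash pair arrives.
    public void SaveOrUpdate(string username, byte[] hashedPassword)
    {
        lock (sync) { entries[username] = hashedPassword; }
    }

    // Cheap local pre-check during registration; the DB still has the final say.
    public bool IsTaken(string username)
    {
        lock (sync) { return entries.ContainsKey(username); }
    }

    // Returns the cached hash, or null when the username is unknown.
    public byte[] Find(string username)
    {
        lock (sync)
        {
            byte[] hash;
            return entries.TryGetValue(username, out hash) ? hash : null;
        }
    }
}
```

A local hit on `IsTaken` lets the web server reject the registration without a round trip, while a miss still has to be confirmed by the app server and DB because another web server may have registered the same name a moment earlier.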

The Authentication Service

What makes it all work is the "Authentication Service" on the app server. This was always there in the synchronous solution. It is what used to field all the login requests from the web servers, and, of course, allowed them to register new users and all the regular stuff. The difference is that now it publishes a message when a new user is registered (or rather, is validated – all a part of the internal long-running workflow). It also allows subscribers to receive the list of all username/hashed-password pairs. It’s also quite likely that it would keep the same data in memory too.

The same message can be used both to publish single updates and to return the full list when using NServiceBus. Let’s define the message:

[Serializable]
public class UsernameInUseMessage : IMessage
{
    private string username;
    public string Username
    {
        get { return username; }
        set { username = value; }
    }

    private byte[] hashedPassword;
    public byte[] HashedPassword
    {
        get { return hashedPassword; }
        set { hashedPassword = value; }
    }
}

And the message that the web server sends when it wants the full list:

[Serializable]
public class GetAllUsernamesMessage : IMessage
{

}

And the code that the web server runs on startup looks like this (assuming constructor injection):

 

public class UserAuthenticationServiceAgent
{
    private readonly IBus bus;

    public UserAuthenticationServiceAgent(IBus bus)
    {
        this.bus = bus;

        // subscribe to future updates, then ask for the full list
        bus.Subscribe(typeof(UsernameInUseMessage));
        bus.Send(new GetAllUsernamesMessage());
    }
}

And the code that runs in the Authentication Service when the GetAllUsernamesMessage is received:

 

public class GetAllUsernamesMessageHandler : BaseMessageHandler<GetAllUsernamesMessage>
{
    public override void Handle(GetAllUsernamesMessage message)
    {
        this.Bus.Reply(Cache.GetAll<UsernameInUseMessage>());
    }
}

 

And the class on the web server that handles a UsernameInUseMessage when it arrives:

 

public class UsernameInUseMessageHandler : BaseMessageHandler<UsernameInUseMessage>
{
    public override void Handle(UsernameInUseMessage message)
    { 
        WebCache.SaveOrUpdate(message.Username, message.HashedPassword); 
    }
}

When the app server sends the full list, multiple objects of the type UsernameInUseMessage are sent in one physical message to that web server. However, the bus object that runs on the web server dispatches each of these logical messages one at a time to the message handler above.

So, when it comes time to actually authenticate a user, this is what the web page (or controller, if you’re doing MVC) would call:

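public class UserAuthenticationServiceAgent
{
    public bool Authenticate(string username, string password)
    {
        byte[] existingHashedPassword = WebCache[username];
        if (existingHashedPassword == null)
            return false;

        // == on byte[] compares references, so compare the contents instead
        byte[] candidate = this.Hash(password);
        if (candidate.Length != existingHashedPassword.Length)
            return false;

        int diff = 0;
        for (int i = 0; i < candidate.Length; i++)
            diff |= candidate[i] ^ existingHashedPassword[i];

        return diff == 0;
    }
}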

 

When registering a new user, the web server would of course first check its cache, and then send a RegisterUserMessage that contained the username and the hashed password.

[Serializable]
[StartsWorkflow]
public class RegisterUserMessage : IMessage
{
    private string username;
    public string Username
    {
        get { return username; }
        set { username = value; }
    }

    private string email;
    public string Email
    {
        get { return email; }
        set { email = value; }
    }

    private byte[] hashedPassword;
    public byte[] HashedPassword
    {
        get { return hashedPassword; }
        set { hashedPassword = value; }
    }
}

 

When the RegisterUserMessage arrives at the app server, a new long-running workflow is kicked off to handle the process:

public class RegisterUserWorkflow :
    BaseWorkflow<RegisterUserMessage>, IMessageHandler<UserValidatedMessage>
{
    public void Handle(RegisterUserMessage message)
    {
        //send validation request to message.Email containing this.Id (a guid)
        // as a part of the URL
    }

    /// <summary>
    /// When a user clicks the validation link in the email, the web server
    /// sends this message (containing the workflow Id)
    /// </summary>
    /// <param name="message"></param>
    public void Handle(UserValidatedMessage message)
    {
        // write user to the DB

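        // UsernameInUseMessage only has a default constructor, so set its properties
        UsernameInUseMessage inUse = new UsernameInUseMessage();
        inUse.Username = message.Username;
        inUse.HashedPassword = message.HashedPassword;
        this.Bus.Publish(inUse);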
    }
}

That UsernameInUseMessage would eventually arrive at all the web servers subscribed.

Performance/Security Trade-Offs

When looking deeper into this workflow we realize that it could be implemented as two separate message handlers, and have the email address take the place of the workflow Id. The problem with this alternate, better performing solution has to do with security. By removing the dependence on the workflow Id, we’ve in essence stated that we’re willing to receive a UserValidatedMessage without having previously received the RegisterUserMessage.

Since the processing of the UserValidatedMessage is relatively expensive – writing to the DB and publishing messages to all web servers – a malicious user could perform a denial of service (DoS) attack without that many messages, thus flying under the radar of many detection systems. Spoofing a guid that would result in a valid workflow instance is much more difficult. Also, since workflow instances would probably be stored in some in-memory, replicated data grid, the relative cost of a lookup would be quite small – small enough to avoid a DoS until a detection system picked it up.
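The cheap-lookup-first idea can be sketched as follows. This is only an illustration under assumptions: `PendingRegistrations` is a made-up stand-in for the workflow store (a plain dictionary rather than a replicated data grid), and it is not how NServiceBus persists workflows.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the store of running registration workflows.
public class PendingRegistrations
{
    private readonly object sync = new object();
    private readonly Dictionary<Guid, string> pending = new Dictionary<Guid, string>();

    // Called when a RegisterUserMessage arrives; the returned Id goes into the email link.
    public Guid Start(string username)
    {
        Guid workflowId = Guid.NewGuid();
        lock (sync) { pending[workflowId] = username; }
        return workflowId;
    }

    // Called when a UserValidatedMessage arrives. Unknown Ids are rejected with
    // a single in-memory lookup, before any DB write or publish happens.
    public bool TryComplete(Guid workflowId, out string username)
    {
        lock (sync)
        {
            if (!pending.TryGetValue(workflowId, out username))
                return false;

            pending.Remove(workflowId);
            return true;
        }
    }
}
```

Only when `TryComplete` returns true would the handler go on to the expensive part, so a flood of messages with made-up Ids costs one dictionary lookup each.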

Improved Bandwidth & Latency

The bottom line is that you’re getting much more out of your web tier this way, rather than hammering your data tier and having to scale it out much sooner. Also, notice that there is much less network traffic this way. Not such a big deal for usernames and passwords, but other scenarios built in the same way may need more data. Of course, the time it takes us to log a user in is much shorter as well since we don’t have to cross back and forth from the web server (in the DMZ) to the app server, to the db server.

The important thing to remember in this solution is doing pub/sub. NServiceBus merely provides a simple API for designing the system around pub/sub. And publishing is where you get the serious scalability. As you get more users, you’ll obviously need to get more web servers. The thing is that you probably won’t need more database servers just to handle logins. In this case, you also get lower latency per request since all work needed to be done can be done locally on the server that received the request.

ETags make it even better

For the more advanced crowd, I’ll wrap it up with the ETags. Since web servers do go down, and the cache will be cleared, what we can do is to write that cache to disk (probably in a background thread), and "tag" it with something that the server gave us along with the last UsernameInUseMessage we received. That way, when the web server comes back up, it can send that ETag along with its GetAllUsernamesMessage so that the app server will only send the changes that occurred since. This drives down network usage even more at the insignificant cost of some disk space on the web servers.
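One way to picture the ETag idea is a server-side change log where the "tag" is just the version number of the last change a web server has seen. The sketch below assumes exactly that; `UserChange`, `UserChangeLog`, and the use of a simple counter as the tag are illustrative choices, not something NServiceBus provides.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical change record: a username/hash pair plus the version at which it changed.
public class UserChange
{
    public long Version;
    public string Username;
    public byte[] HashedPassword;
}

// Hypothetical app-server-side log of changes, in the order they happened.
public class UserChangeLog
{
    private readonly List<UserChange> changes = new List<UserChange>();
    private long currentVersion = 0;

    // Records a change and returns the new version, which doubles as the tag
    // a web server stores on disk next to its cache.
    public long Record(string username, byte[] hashedPassword)
    {
        currentVersion++;
        UserChange change = new UserChange();
        change.Version = currentVersion;
        change.Username = username;
        change.HashedPassword = hashedPassword;
        changes.Add(change);
        return currentVersion;
    }

    // A tag of 0 means "I have nothing", which yields the full list.
    public List<UserChange> ChangesSince(long etag)
    {
        List<UserChange> result = new List<UserChange>();
        foreach (UserChange change in changes)
        {
            if (change.Version > etag)
                result.Add(change);
        }
        return result;
    }
}
```

A web server that restarts with tag 1 on disk would call `ChangesSince(1)` and receive only what was recorded after that point, instead of the whole list.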

And in closing…

Even if you don’t have anything more than a single physical server today, and it acts as your web server and database server, this solution won’t slow things down. If anything, it’ll speed it up. Regardless, you’re much better prepared to scale out than before – no need to rip and replace your entire architecture just as you get 8 million Facebook users banging down your front door.

So, go check out NServiceBus and get the most out of your iron.



Make non-stop software simple, make it possible

Friday, August 24th, 2007

Dan Pritchett is calling for Non-Stop Software and I for one am picking up the cry as well. I’d also like non-stop operating systems, databases, and tools to support it. I’d like a system where I won’t be afraid to back up the database while it’s running. I’d like a system where I can incrementally upgrade the middleware while it’s running. I’d like programming models where all of this is hidden from the programmer – I understand that this means preventing programmers from working in certain ways (synchronous RPC for one).

I don’t have that today. All I have are patterns which, when strictly applied, bring me close. This is a real challenge with geographically dispersed teams. It’s even hard with everyone sitting in the same room.

That is why this call is more directed at the platform and tool vendors than anyone else. Stop changing paradigms all the time. Stick with one or two, make them rock solid. Make it simple to get it right.

We need non-stop software.

Let’s start talking about how to get there from here.



What happens if it fails – circa 1989

Monday, August 13th, 2007

Via Patrick Logan’s Bits of Wisdom: HOPL III’s History of Erlang:

1989 also provided us with one of our first opportunities to present Erlang to the world outside Ericsson. This was when we presented a paper at the SETSS conference in Bournemouth. This conference was interesting not so much for the paper but for the discussions we had in the meetings and for the contacts we made with people from Bellcore. It was during this conference that we realised that the work we were doing on Erlang was very different from a lot of mainstream work in telecommunications programming. Our major concern at the time was with detecting and recovering from errors. I remember Mike, Robert and I having great fun asking the same question over and over again: “what happens if it fails?” – the answer we got was almost always a variant on “our model assumes no failures.” We seemed to be the only people in the world designing a system that could recover from software failures…

Those of you who’ve had the chance to hear me speak about technology and architecture know that one of my favorite ways to win an argument is by saying, “yeah, but what if the server restarts?”

Almost 20 years later, the more things change, the more they stay the same.



No such thing as a centralized ESB

Saturday, June 30th, 2007

Via David McGhee’s Q&A with Dr. Don Ferguson, but read the whole thing.

Q: Could you tell you your thoughts or preference for a distributed or centralized ESB?

DON: there is no such thing as a centralized ESB.

This is the problem with a lot of the products that call themselves ESBs. They are centralized brokers which may be clustered for availability. But they are in no way an implementation of the Bus Architectural Pattern. Please check this before cutting a check to your vendor.

Also, understand that if you do security-related things in your ESB, possibly as a part of your routing rules, and the security infrastructure is centralized, then your ESB is centralized too. Even if it really was distributed to begin with.

Buyer beware.



Space-Based Architecture – scalable, but not much to do with SOA

Wednesday, June 20th, 2007

Space-Based Architecture (or SBA for short) just might be in your future if you’re building large-scale distributed systems. By focusing on high-throughput and low latency, SBA joins messaging and in-memory data caching and adds a good measure of load partitioning. However, with the entire industry enamoured with SOA, what place is left for SBA?

Before going too far ahead, you might want to take a look at my previous post “Space-Based Architectural Thinking”, or listen to my podcast Space-Based Architecture for the Web. There’s also a 30 minute webcast online describing SBA more fully here. I’m also going to try to stay away from things concerning Jini this time after already discussing the connection between Jini and SOA, and the tradeoffs between two general approaches: Tasks and Spaces vs Message and Handlers.

OK, so the issue of state-management is a big one. Everybody wants to work stateless, because it scales. The only problem is that the business processes that we are automating are long running, meaning that there are external systems or people involved. This makes these processes inherently stateful. So, we need a way to scale statefully – SBA gives us that. For some background on the “Shared Nothing Architecture”, I suggest reading this post on inter-process SOA and this one as well.

Availability also has to be handled, not only in terms of having enough servers online to handle the required load but in having all the data required to process each request be accessible. This has often been handled by the database using ACID transactions – durability being that which solved availability issues, but also hurting latency the most. The problem with saving the state of our long-running business processes/workflows in the database is the load and the responsiveness requirements. In many verticals – telcos, financial, and defense to name a few, we need millisecond level latency on each stage of the workflow. This is what leads SBA to the in-memory, replicated data grid.

Note that SBA only intends to take these workflows out of the database, and not anything else – especially not Master Data. The lifetime of these workflows is incredibly short compared to that of master data like customers and products. It will have much different backup strategies as well. In terms of load, these workflows will be heavy on reads and writes together in the same transactions, but quite low in terms of just reads. If we have workflows that perform work in parallel, we easily end up with concurrency requirements that make DBAs cringe under the barrage of short transactions.

If you’re worried that Workflow Foundation (WF) won’t scale because of the above, you needn’t be. You can (more or less easily) replace the persistence mechanism of WF with your own, saving your workflow instances to an in-memory replicated data grid.

By enabling the objects in the grid to call back into logic on your servers, you have, in essence, done messaging and more. The added benefit that SBA receives from this is a unification of technology between caching and messaging. This translates directly to savings when it comes time to cluster each of those technology’s environments.

Finally, if we can find an attribute in the incoming stream of messages that creates a nice even distribution, we can then partition our load between our servers by that key. This will work up to the point where the load per key increases beyond a single server’s capacity, and then we have to look at re-partitioning, a non-trivial problem. However, if we put objects in our grid that represent the master data, and tie them to our workflow instances with both of those tied to the key of our load, a smart infrastructure can make sure all that data is already resident on the server that is handling that piece of the load. That decreases latency even more since we no longer have to pay network roundtrips to collect all the data needed before we can process it. That’s a substantial advantage for the above verticals.
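As a minimal sketch of partitioning load by key, the class below maps a key to one of N servers. It is an illustration under assumptions: `LoadPartitioner` is a made-up name, and the 32-bit FNV-1a hash was picked simply because it is deterministic across processes, which `string.GetHashCode()` is not guaranteed to be.

```csharp
using System;
using System.Text;

// Hypothetical partitioner: the same key always lands on the same server.
public static class LoadPartitioner
{
    // 32-bit FNV-1a over the UTF-8 bytes of the key.
    public static uint StableHash(string key)
    {
        unchecked
        {
            uint hash = 2166136261;
            foreach (byte b in Encoding.UTF8.GetBytes(key))
            {
                hash ^= b;
                hash *= 16777619;
            }
            return hash;
        }
    }

    // Returns a server index in the range [0, serverCount).
    public static int ServerFor(string key, int serverCount)
    {
        if (serverCount <= 0)
            throw new ArgumentOutOfRangeException("serverCount");

        return (int)(StableHash(key) % (uint)serverCount);
    }
}
```

Plain modulo partitioning like this also shows why re-partitioning is non-trivial: changing `serverCount` moves most keys to a different server, so the data tied to those keys has to move with them.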

But all of this has nothing to do with SOA.

Sure, it’ll change how we implement our Services internally, but it has no impact on their interfaces or the top-level service decomposition. In the Java community, the word “service” is often used to describe the logic of a system. Great significance is placed on keeping these “services” simple, as in Plain-Old Java Objects. The fact of the matter is that the logic of the system should be simple and independent of other concerns like data access and communications (a la Web Services), but that does not make it a service, not in the SOA sense.

For more information on what Services in SOA are like, check out this podcast on Business and Autonomous Components in SOA. Actually, SBA will probably have the biggest impact on the way autonomous components will handle service-level agreements.

So, it appears that even with SOA, SBA has its place. The former deals with business-level agility, the latter with all the technical aspects of supporting that agility. If you’re tasked with designing the architecture of a scalable, available, high-throughput, low-latency distributed system, I’d strongly advise you to look at SBA – the technical value is overwhelming. Even if you don’t utilize all elements of SBA and choose the Master Worker Pattern instead of load partitioning, you’ll find the technologies supporting SBA to be quite flexible in that respect.

Will Space-Based Architectures be a part of your future? I don’t know for sure, but they’re a most welcome part of my present.



On Intermediation And SOA

Saturday, May 19th, 2007

Nick Malik has an interesting post on The value of intermediation in SOA where he starts out suggesting a couple of books that stand at the basis of much of today’s SOA thinking. I agree that far too few people seem to have read them.

In his previous post Is it service-oriented if the message cannot be intermediated, Nick defines intermediability as “SOA should give us the ability to intercept a message going from point A to point B, and react to that message without informing either end of that pipe.”. I’ll respond to this in due course.

Anyway, he continues on by saying “SOA [is] an architecture for Enterprise Application Integration.”

I can’t agree with that statement. The main reason is that EAI puts the application in the center, and that integrating existing applications is one of its primary purposes. It is my assertion that in order to solve many of the problems that we are having today, we need to take a broader, business-based view of the enterprise and model that with services. A service may be implemented with one or more applications. However, my experience has been that these services tend to use parts of existing applications, with multiple services using different parts of the same application. The reason for this is that the applications we have today, especially the ERP monoliths, do a lot, and at the same time, not everything. This is part of the reality that EAI tried to solve, but then got mired down in cross-system hell. You just can’t solve poor business decomposition in the technology domain.

Putting services at the fore makes it possible to gradually phase out and evolve legacy applications, and migrate costly mainframe apps bit by bit without having these changes ripple out and break other services. The same is true for those systems’ data – backup strategies are defined at the service level, impacted primarily by their Service-Level Agreements.

While I whole-heartedly agree with what Nick has to say in terms of OO intermediation of the Dependency Injection variety, and that scaling up those same concepts in terms of messaging is the right way to go, I take issue with orchestration in the intermediation area. These “tactical changes” need to be done in the context of the top, business-level service strategy. That means that all logic belongs within a service. The “network” between services is just that, a “dumb” network – no business logic of any kind, just technological capabilities like knowing which physical server to route messages to.

In this spirit, I’d like to suggest an alternative solution to the example Nick gives. Here’s the scenario:

Let’s say that system 1 generates an invoice. It sends an event to the world saying “invoice here” and system 2 captures that message. System 2 asks for details about the invoice… perhaps it will place the information on a web site for internal support teams.

Let’s say that we are moving to a CRM solution in our internal support groups. We want to create the information in the CRM system related to the invoices that specific customers have been issued. We need to integrate these two systems. The existing web app needs to have a link to the CRM system’s data, to allow the user to move across easily.

And here is the solution he prescribes:

We can intercept the request for further information from the web app to the publisher. When the publisher responds with information about the invoice, we can insert the invoice in the CRM system, add a link to the CRM record for that invoice to the data structure, and resume our response to the web app. Assuming that our canonical schema has a field for ‘foreign key’, we have just integrated our CRM and web information portal… without changing either one.

Without getting into the business-level analysis of what the correct service decomposition might be, here’s what I suggest (although all of these “systems” might just end up within the same service, or having parts of them being used by multiple services).

First of all, have all information about the invoice available via the message only. This could be done by actually putting all the invoice data in the message, or by placing a URI instead where other systems can HTTP GET it from – REST style. This decreases coupling between the publisher and its subscribers. However, we haven’t solved the problem of our web apps getting access to the relevant data in the CRM system.

The solution presents itself at the business level. The invoice is not “complete” without the appropriate CRM data. Therefore, it does not make sense for a service to publish it that way. Let’s call this service the Purchasing Service. It would handle the workflow of receiving the first system’s event, adding the invoice to the CRM system, and taking the resulting full invoice data and publishing that. All external systems like the web apps would see just the final event. Orchestration, if there even is such a thing, occurs within the service boundary. This technological-level intermediation isn’t even a blip at the business level. We can also imagine other services, say a Sales Service, that would use the CRM system as well.

In summary, when moving to SOA, intermediation provides many technological benefits in getting data and behavior to work across existing systems and applications; however, it’s largely a NO-OP at the service level. After phasing out many of those existing applications behind the service boundaries, the same service-level interactions would persist. Your Service-Oriented Architecture would not be any different. That’s the technical agility aspect of SOA.



Grid computing and SOA

Friday, May 18th, 2007

For a great description of what grid computing really is, read this.

What they say about the connection to SOA, though, requires some clarification. Here’s the quote:

There is a lot of talk going on about synergy between Grid Computing and SOA. It is however driven primarily by implementation concerns at this point rather than by any deeper considerations. Clearly, Grid Computing can deliver unchanged value without SOA, yet WS-* based implementation (such as Globus) can be beneficial in some cases (highly distributed heterogeneous environments that should only exist in unfortunate legacy-support situations).

The main thing that I want to call out is that “grids” don’t cross service boundaries – not at the logical level anyway. And if you did share a single grid infrastructure between service implementations, you may have some problems maintaining service-level agreements, and autonomy may be put in danger.

Just something to keep in mind.



Using spaces with web services

Saturday, May 5th, 2007

William Brogden has an article up on SearchWebServices.com on How Web Services can use JavaSpaces. I don’t want all the Microsoft folks tuning out now that they’ve heard the “J” word, so let me just say that there are technologies out there for .NET too.

A “JavaSpace” is really just a space, which is, at the end of the day, a queryable distributed in-memory hashtable. Something many of us are already doing for caching. The reason you shouldn’t be doing this yourself is simple. While keeping a single hashtable in memory on a single computer and synchronizing it against changes to your database is simple, doing that in a highly available manner across multiple servers is not. Vendors providing solutions in this space include:

But there are others as well. Bottom line: don’t develop one of your own. Do a proof of concept with your short list of vendors and go from there.

The article sums it up nicely like this:

Although JavaSpaces servers are not trivial to set up, they are much easier than any other type of grid computing server. Furthermore, the simplicity of the interface makes the learning curve easier. The greatest advantage of the JavaSpaces approach is the ease with which additional workers can be added to the grid.

It should be clear from the example that there is a lot of extra communication traffic in a JavaSpaces solution so the only reason to use JavaSpaces or any other form of grid computing in support of a Web service is a requirement for computing power or special resources that are not feasible to supply on the server directly.

I have this to add to it. Whereas most traditional systems keep the idea of message-based communication and data caching separate, spaces allow you to kill two birds with one stone. Even if you don’t go the whole Space-Based Architecture route, you’ll find that spaces will fit nicely in your distributed architecture toolkit – I know I did.



Database performance optimization

Monday, April 30th, 2007

I’ve been doing quite a bit of consulting these past weeks around performance issues for database intensive systems. The fact that we use O/R Mapping makes the business logic possible to get right (using the Domain Model Pattern), but adds another dimension to the performance optimization – primarily around limiting the number of roundtrips to the database.

However, in the truly large scale scenarios, that isn’t enough.

One of my customers is having to scale up from 500 concurrent users to 50,000. You need to get seriously close to the metal to handle that. Here’s a great post on the kind of hardware storage considerations that I went through there. Of course, sometimes a database is just the wrong hammer for your screw – sometimes what you really need is a space.



The Enterprise Service Bus and Your SOA

Saturday, April 28th, 2007

It was about a year or so back—I was in the middle of figuring out how to pass an authorization token between trust boundaries when I got a call from our CIO. He had just come back from some conference sponsored by *** (Vendor’s name withheld to protect the clueless) and was brimming with new acronyms.

“Udi”, he says, “I just heard that for us to realize the potential of our SOA, we should be using an ESB.”

I’m sure he said a lot more than that, but that first sentence was enough for me to tune out. I managed to get through the conversation without catching a case of acronym-itis, but my train of thought was broken. I wasted one Google search to find out that ESB meant “Enterprise Service Bus,” got fed up, and went to Starbucks. Of all the overused terms in software, “Enterprise” is by far the most annoying—although “Service” seems to be moving up in the world.

I’ve been developing loosely-coupled systems for a while now, using all kinds of technologies, and never thought that I was doing anything particularly ground-breaking. So when the CIO came down and introduced me to the army of high-priced consultants that were going to help us redesign our software to become more “service-oriented” I really was interested in seeing what would be different. Soon after, I realized that the vendor-driven SOA meant nothing more than Web Services, XML, and loose coupling – with the mindset of loose coupling being the most important. I’d been “service-oriented” all this time and didn’t even know it. Oh, and it turns out that we didn’t have to redesign anything.

Imagine my utter joy in hearing that the reason our project was in trouble was that we didn’t have an ESB. And all this time I thought it was because the requirements were changing every two weeks.

It was about a month after that that our project manager got promoted for doing such a great SOA implementation and was now in charge of making the whole company, oops, sorry—enterprise, service oriented. I was made “acting project manager” and managed to do one really smart thing pretty quick—get our software through testing and deployed within the month, before too many new requests came in (I think it was possible because most of the stakeholders were on holiday). The system wasn’t that complex; we had the standard HR, accounting, inventory, etc., functionality split up into the same high-level components. Each top-level component had its own server-cluster and database. We pulled the data from each of the databases to the data warehouse using classic ETL. Data flowed between the components primarily using publish-subscribe semantics while the client-side software just request-responded what it needed from each component.

We kept on working, pushing out features as fast as we could, until one morning I found the sysadmin at my door.

“We had a problem with some of the disks on the accounting cluster, so we installed it on a different cluster and brought the first one down. We tried a simple test to make sure everything was OK, and it wasn’t. A bunch of things don’t seem to be working.”

After looking around for a bit I found out that the sysadmins had forgotten to update the config files on the servers and the clients. We restarted the server components and they worked fine, but we couldn’t really go around restarting and changing config files on all the clients. Luckily, the sysadmins had it set up so that every client on our domain that logged in to the network could be sent a script to run, so pushing out the new config files was easy. As for the restarting part, we called up the help desk and told them that if (when) someone called about why their software wasn’t working properly, just to tell them to close it and run it again—which apparently is their standard first suggestion anyway.

Well, things lurched along like that for a while as we put out more and more functionality—put up a Web front-end, tied in some business partners, etc. I thought I had everything under control until our COO charged in, with the CIO in tow.

“Our business partners haven’t been able to send us orders for almost a week,” he fumed. “What did you do?!”

When money talks, you’d better believe everybody listens. After some seriously hectic hours of peering through diffs between the deployed source and the previously deployed source, we were getting nowhere. Somebody, I don’t remember who, had the common sense to get the sysadmins in there too. It turned out to have been the same problem as before, but this time with the inventory cluster. So we used the same solution. The problem was that our business partners’ software didn’t get the updated config files. While I was pondering how we could get the same system to work with external partners, my boss called me in for an urgent meeting.

“I just got a call from Jim (the CIO) and he wants you and me to help him explain what happened to the CEO.”

I started to get that sinking feeling, the kind where you know things are going from bad to worse. That afternoon, we all filed in to the chief’s office, bracing ourselves for the worst. He got directly to the point.

“If anything like this happens again, you three are fired. Now get the hell out of my office.”

How’s that for motivation?

And to top it all off, before he hurried off to another meeting, Jim asked us, “You guys know about our first audit for Sarbanes-Oxley in three months, right? I don’t want to see any more screw-ups, and this SOX stuff is getting people anxious. We need full audit trails on everything.”

It looks like Moore’s law will continue indefinitely: You will need to handle twice as much crap today as you did 18 months ago.

It was at about this point that I realized that I needed help. I called up one of my old partners in crime, Clem, who had been doing large-scale distributed systems development for a while. I told him my sorry tale and asked if he could give me a hand. Unfortunately, he was in the middle of some serious crunch time, but he left me with these pearls of wisdom:

“Udi, it’s all in the message. Forget about remote method invocations and pub-subbing events—down on the wire it’s all just messages. The trick is to think of your system as passing messages at the application level as well.

Asynchronous message passing over queues. It’s really quite simple.

Once you’ve packaged everything into the message, that message can be dynamically routed anywhere, and so can its responses. The application doesn’t need to bind against any specific endpoint—it just drops a message addressed to some logical location. Infrastructure can make sure that messages get to the logical recipient, even if they change physical locations.

That infrastructure is what brings about the “Bus” architectural style between your distributed components.”
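Clem's point about logical addressing can be sketched in a few lines. This is only a toy, in-process illustration, not any vendor's product: the `Bus` class, its `register`, `send`, and `receive` methods, and the cluster names are all hypothetical names made up for this example.

```python
import queue


class Bus:
    """Toy in-process bus: senders use logical names, never physical endpoints."""

    def __init__(self):
        self._routes = {}  # logical name -> physical endpoint name
        self._queues = {}  # physical endpoint name -> its message queue

    def register(self, logical, physical):
        """Point a logical address at a physical endpoint (call again to move it)."""
        self._queues.setdefault(physical, queue.Queue())
        self._routes[logical] = physical

    def send(self, logical, message):
        """Drop a message addressed to a logical location."""
        self._queues[self._routes[logical]].put(message)

    def receive(self, physical):
        """Called by an endpoint to pull its next message; raises queue.Empty if none."""
        return self._queues[physical].get_nowait()


bus = Bus()
bus.register("accounting", "cluster-a")
bus.send("accounting", {"type": "PostInvoice", "id": 1})

# The sysadmins move accounting to another cluster: only the route changes,
# the sending code below is identical to the sending code above.
bus.register("accounting", "cluster-b")
bus.send("accounting", {"type": "PostInvoice", "id": 2})
```

The sender never names a cluster, so moving the service means updating one routing entry in the infrastructure rather than a config file on every client.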

Luckily I was writing down what he said, because I had to re-read it at least a dozen times for it to sink in. Flashback to that original conversation with the CIO—so that’s what ESBs are for! Well, you wouldn’t have guessed it with all the hype going on—IT/Business Alignment, like that’s going to happen any time soon.

After talking with some ESB vendors, I understood some nuances in what Clem told me. The message passing at the application level is really passing logical messages—a message is an object just like any other. The transformation that logical message undergoes in order to be sent across the wire is something else entirely. We can transform our message to and from XML, binary, text based key-value pairs—anything we need. Finally, the transport used to pass that wire-representation between machines is an infrastructure detail that is also independent of the logical message.
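A rough sketch of that separation, assuming a made-up `OrderPlaced` message, two illustrative serializers, and an in-memory stand-in for the transport (none of these names come from a real ESB product):

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class OrderPlaced:
    """The logical message: a plain object, unaware of wire formats."""
    order_id: int
    sku: str


def to_json(msg) -> bytes:
    """One possible wire representation: JSON."""
    return json.dumps(asdict(msg)).encode("utf-8")


def to_key_value(msg) -> bytes:
    """Another wire representation: text-based key-value pairs."""
    return "\n".join(f"{k}={v}" for k, v in asdict(msg).items()).encode("utf-8")


class InMemoryTransport:
    """Stand-in for a queue, socket, or HTTP call; it only ever sees bytes."""

    def __init__(self):
        self.sent = []

    def deliver(self, payload: bytes):
        self.sent.append(payload)


def send(msg, serialize, transport):
    """The serializer and the transport are chosen independently of the message."""
    transport.deliver(serialize(msg))


transport = InMemoryTransport()
order = OrderPlaced(order_id=42, sku="ABC")
send(order, to_json, transport)
send(order, to_key_value, transport)
```

The application code only builds `OrderPlaced`; swapping the serializer or the transport does not touch it.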

Once my mind wrapped itself around asynchronous messaging, the whole SOA thing became clear. The top-level components we were developing were providing top-level services—requests would queue up like people would at the teller in a bank. A component could send out the exact same message either as a broadcast or a unicast, where the recipients would be able to use the same semantics either way. Exposing a method, subscribing to an event, and handling a message were all the same, both internally and externally.
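To make the unicast/broadcast point concrete, here is a minimal sketch in which the recipient's `handle` method is the same whichever way the message arrives. The `Endpoint` and `Bus` classes are hypothetical, and dispatch is synchronous only to keep the example short; a real bus would put the message on a queue.

```python
from collections import defaultdict


class Endpoint:
    """A recipient with one handler, used for both unicast and broadcast."""

    def __init__(self, name):
        self.name = name
        self.handled = []

    def handle(self, message):
        self.handled.append(message)


class Bus:
    def __init__(self):
        self._endpoints = {}                   # endpoint name -> Endpoint
        self._subscribers = defaultdict(list)  # message type -> endpoint names

    def attach(self, endpoint):
        self._endpoints[endpoint.name] = endpoint

    def subscribe(self, name, message_type):
        self._subscribers[message_type].append(name)

    def send(self, name, message):
        """Unicast: deliver to one named recipient."""
        self._endpoints[name].handle(message)

    def publish(self, message):
        """Broadcast: deliver the same message to every subscriber of its type."""
        for name in self._subscribers[message["type"]]:
            self._endpoints[name].handle(message)


bus = Bus()
inventory, accounting = Endpoint("inventory"), Endpoint("accounting")
bus.attach(inventory)
bus.attach(accounting)
bus.subscribe("inventory", "OrderPlaced")
bus.subscribe("accounting", "OrderPlaced")

order = {"type": "OrderPlaced", "order_id": 7}
bus.send("inventory", order)  # unicast
bus.publish(order)            # broadcast of the exact same message
```

Neither endpoint can tell, or needs to care, whether the message was sent to it directly or published to all subscribers.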

I can’t explain how much this simplified my view of the distributed world. It kind of felt like dominos—as one thing fell into place, it knocked down something else. I was finally beginning to understand what needed to be changed in our system—and all it took was a multi-million dollar mistake and nearly getting fired.

Needless to say, the whole SOX thing caused all hell to break loose. Our team wasn’t compliant, but then neither was any other team. The same goes for most of the company’s software. But the reassuring thing for me was knowing where I was going with our system. It took some time—we redesigned most of the communication paths, found a vendor whose product met our needs (at the right price), and several months later, we rolled out the new version. I wouldn’t say that the rollout was flawless, but I will tell you this—when the sysadmins moved a service from one cluster to another, no config files needed to be pushed out in order for things to work, and, more importantly, no orders were lost. I even got promoted from “Acting Project Manager” to “Project Manager” 🙂



   







Creative Commons License  © Copyright 2005-2011, Udi Dahan. email@UdiDahan.com