Mobile.UdiDahan.com
It’s that simple to get my feed straight to your mobile.
Comments Posted on Tuesday, July 15th, 2008.
MSDN Magazine Article On Losing Data
Posted in Articles
I’ve made the cover of the latest issue of MSDN magazine. OK, enough of that. I really am thrilled that Microsoft has taken a non-technological article and promoted it in such a way. Change is happening, and I like it.
With a name as long as most of my variables, this article is quite the failure-fest:
Build Scalable Systems That Handle Failure Without Losing Data
No drag and drop, no wizards, code completion, or anything. Real world problems with no magic solutions. A real analysis of message loss over HTTP, what additional problems durable messaging brings with it, how integrated systems can lose consistency as the result of database deadlocks, and more.
The solutions described are first and foremost about thought processes – knowing the nature of the problem, how to design a solution that addresses it, and what new problems now need to be dealt with. You’ll see exception management strategies (no, you shouldn’t be try-catching), how that feeds into deserialization exceptions, error queues, and finally into a versioning strategy that also addresses the human element.
There’s a lot of architectural meat. I had to cut out a lot of filler in order to get it all into the 7 page limit. If you haven’t yet read the book Release It! this article might be considered the Cliff Notes version. But you still should read the book. It really is worth your time.
Comments and questions are most welcome.
Make WCF and WF as Scalable and Robust as NServiceBus
This topic is getting more play as more people are using WCF and WF in real-world scenarios, so I thought I’d pull the things that I’ve been watching in this space together: Reliability
Comments Posted on Monday, June 30th, 2008.
And doesn’t handle concurrency!
Unless you don’t expose setters.
I guess it depends, doesn’t it?
Well, that was Ted’s assertion in his recent Pragmatic Architecture column on data access.
But, “it depends” doesn’t get the system built, does it?
So, here are some rules for using o/r mapping that will get you 99% of the way there.
Yes, you heard me.
Rules.
They do not depend.
If you’re doing something significantly bigger than enterprise-scale development, and you are already doing this, and it isn’t enough, give me a call. Here we go.
I mean it. Don’t report off of live data.
This isn’t just an o/r mapping thing.
Users can tolerate some, if not quite a lot of, latency. And it’s not like objects are even used. It’s just rolled up data. Not a single behaviour for miles.
You want multiple users sharing and collaborating on data, right? Then don’t force them to either overwrite each other’s data, or throw away their own. There is one simple way to avoid that: Get an object, call a method. Once the object has the most up to date data, pass all the client data in via a method call. The object will decide if it’s valid, from a business perspective as well, and then update the appropriate fields.
Now your DBAs can vertically partition tables accordingly, and improve throughput. After that, you can increase the isolation level, to improve safety, without hurting throughput.
This will also keep your logic encapsulated, bringing you closer to a true Domain Model.
If your O/R mapping tool requires you to have setters on your domain classes, hide those from your service layer behind an interface.
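To make the “get an object, call a method” rule concrete, here is a minimal sketch in C#. The names ICustomer, Customer and ChangeAddress are made up for illustration, and the validation rule is just a placeholder for whatever your business logic needs. The setter stays on the class only for the sake of an O/R mapper that demands one; the service layer works against the interface and never sees it.

```csharp
using System;

// Hypothetical interface that the service layer sees: behaviour only, no setters.
public interface ICustomer
{
    string Address { get; }
    void ChangeAddress(string newAddress);
}

// Hypothetical domain class. The public setter exists only because some
// O/R mapping tools require one; callers holding an ICustomer cannot reach it.
public class Customer : ICustomer
{
    public virtual string Address { get; set; } = "";

    public virtual void ChangeAddress(string newAddress)
    {
        // The object decides whether the client data is valid from a
        // business perspective before touching any field.
        if (string.IsNullOrWhiteSpace(newAddress))
            throw new ArgumentException("An address is required.", nameof(newAddress));

        Address = newAddress.Trim();
    }
}
```

The service layer would load the customer inside the transaction, call ChangeAddress with the client’s data, and leave it to the mapper to persist whichever fields actually changed.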
No o/r mapping required there either. While you probably won’t be showing grids of yesterday’s data to users in an interactive environment, it’s still just data – no behaviour.
However, users should NOT update data in those grids. This gets back to rule 2. Have users select a specific task they want to perform, pop open a window, and have them do it there. Change customer address. Discount order. You get the picture. That way you’ll know what method to call on those objects you designed in rule 2.
Before wrapping up, one small thing.
You can use an O/R mapping tool to do reporting; just, for the love of Bill, don’t use the same classes you designed for your OLTP domain model. But just because you can doesn’t necessarily mean you should. Datasets and datatables are probably just as viable a solution.
Comments [23] Posted on Wednesday, June 25th, 2008.
It turns out that there was a subtle, yet dangerous problem in the use of System.Transactions – a transaction could timeout, rollback, and the connection bound to that transaction could still change data in the database.
Think about that a second.
Scary, isn’t it?
At TechEd Israel I had a discussion with Manu on this very issue, just under a different hat:
What’s the difference between a short-running workflow and a long-running one?
Manu suggested that we look at the actual time that things ran to differentiate between them. I asserted that if any external communication was involved in some part of state-management logic, that logic should automatically be treated as long-running.
Manu’s reasoning was that the complexity involved in writing long-running workflows was not justified for things that ran quickly, even if there was communication involved. Many developers don’t think twice about synchronously calling some web services in the middle of their database transaction logic. In the many Microsoft presentations I’ve been at on WF, not once has it been mentioned that state machines should be used when external communication is involved.
The problem that I have with this guidance is how do you know how quickly a remote call will return?
Do you just run it all locally on your machine, measure, and if it doesn’t take more than a second or so, then you’re OK?
The fact of the matter is that we can never know what the response time of a remote call will be. Maybe the remote machine is down. Maybe the remote process is down. Maybe someone changed the firewall settings and now we’re doing 10KB/s instead of 10MB/s. Maybe the local service is down and we’re communicating with the backup on the other side of the Pacific Ocean.
But the thing is, Manu’s right.
Writing long-running workflows (with WF) is more complex than is justified. My guess is that since WF wasn’t specifically designed for long-running workflows only, this complexity crept in.
Sagas in nServiceBus were specifically designed for long-running workflows only.
Maybe that’s what kept them simple.
Since all external communication is done via one-way, non-blocking messaging only, each step of a saga runs as quickly as if no communication were done at all. This keeps the time the transaction in charge of handling a message is open as short as possible. That, in turn, leads to the database being able to support more concurrent users.
In short, sagas are both more scalable and more robust.
No need to worry about garbaging-up your database.
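Here is a rough sketch of that shape in C#. It is not the nServiceBus saga API: IBus, InMemoryBus, OrderSaga and all the message classes are hypothetical names invented for this illustration. What it shows is that each handler reads one message, updates the saga’s own state, sends one-way messages, and returns without ever waiting on a remote call.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical one-way bus: Send hands the message off and returns immediately.
public interface IBus
{
    void Send(object message);
}

// Hypothetical messages, for illustration only.
public class OrderPlaced { public Guid OrderId; }
public class AuthorizePayment { public Guid OrderId; }
public class PaymentAuthorized { public Guid OrderId; }
public class ShipOrder { public Guid OrderId; }

// Hypothetical saga. No handler blocks on external communication, so the
// transaction around each handler only covers local state changes.
public class OrderSaga
{
    private readonly IBus bus;

    public Guid OrderId { get; private set; }
    public bool PaymentRequested { get; private set; }
    public bool Completed { get; private set; }

    public OrderSaga(IBus bus) { this.bus = bus; }

    public void Handle(OrderPlaced message)
    {
        OrderId = message.OrderId;
        PaymentRequested = true;
        bus.Send(new AuthorizePayment { OrderId = message.OrderId });
    }

    public void Handle(PaymentAuthorized message)
    {
        // Ignore replies that do not belong to this saga instance.
        if (!PaymentRequested || message.OrderId != OrderId) return;

        bus.Send(new ShipOrder { OrderId = OrderId });
        Completed = true;
    }
}

// In-memory stand-in for a durable queue, so the sketch can run on its own.
public class InMemoryBus : IBus
{
    public List<object> Sent { get; } = new List<object>();
    public void Send(object message) { Sent.Add(message); }
}
```

The reply from the payment service may arrive a millisecond or a week later; either way it is just another message, handled in another short transaction.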
Comments [2] Posted on Monday, June 23rd, 2008.
For those people who couldn’t come to TechEd USA and didn’t see my talks on how to build highly scalable web architectures, you’re in luck – Craig, the man behind the Polymorphic Podcast, sat down with me and we chatted about the problems, common solutions, and effective tactics in this space. For those of you who were at TechEd and still didn’t come to my talk – what were you thinking?!
🙂
Some of this stuff is a bit counter-intuitive (and not readily supported by the tools available in Visual Studio) so please, do feel free to ask questions (in the comments below).
Comments Posted on Thursday, June 19th, 2008.
One of the things I haven’t liked about using IoC containers, AKA dependency injection frameworks, was the string-based configuration model they exposed. In order to set these values, developers had 2 options: either use XML config (usually without the benefit of intellisense or refactoring support), or use code (still quoting property names – again, no intellisense or refactoring support).
In short, there seemed to be a hole in the development model.
Here’s an example from how nServiceBus used to do this:
builder.ConfigureComponent(typeof(HttpTransport), ComponentCallModelEnum.Singleton)
    .ConfigureProperty("DefaultNumberOfWorkerThreads", 10)
    .ConfigureProperty("DefaultNumberOfSenderThreads", 10);
The problem was that if a developer got the case of the property wrong, misspelled it in some way, or somebody later refactored/renamed that property, the system would break. It would also be very difficult to figure out why.
Then, a couple of weeks ago, it dawned on me.
This was the same problem we used to have with testing using mock objects – before we had today’s more advanced frameworks. So, the solution must be to use the same techniques. The container should give the developer an object that looks just like their class, but that would intercept all calls. Then, that interceptor could turn those into the config calls shown above. Here’s what the new config model looks like:
HttpTransport transport = builder.ConfigureComponent<HttpTransport>(ComponentCallModelEnum.Singleton);
transport.DefaultNumberOfSenderThreads = 10;
transport.DefaultNumberOfWorkerThreads = 10;
Granted, you’re not going to have tons of code like this. However, for all those parameters which are factory-configured and that customers/integrators shouldn’t tinker with, it makes a difference. The biggest difference is during that time of development where you’ve gotten into preliminary integration tests but the system’s components are still being “polished”.
Aside: On the current project that has adopted this model, we’ve probably saved (conservatively) about 3 months of effort with this tiny (?) thing, and this isn’t a huge project. If that’s more than you would’ve thought, well, I was surprised myself. First, understand that in the old config model, everything still compiles and unit tests pass, even though it’s broken.
Just consider what happens in the lab when this occurs. You have N testers that can’t test the new version, waiting. You have the person who installed the version, trying to figure out what’s wrong. They then call in one of the developers where most of the new development occurred since the previous version. They fiddle around with it, looking at exception traces and whatnot. In the best case, we’re talking about 2 hours from noticing it’s broken until a new version comes out fixed. Multiply that by N+3 people. Then multiply by the number of versions you do integration tests on in the lab.
Caveat: In the current version, properties must be virtual in order for this to work.
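To see why the caveat exists, here is a hand-written sketch of roughly what a generated interceptor has to do. This is not the actual nServiceBus implementation: the HttpTransport below is a stand-in class with the same property names, and RecordingHttpTransport is a hypothetical name. The subclass can only hook the setters because they are virtual; a non-virtual property gives it nothing to override.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the component being configured (an assumption, not the real class).
public class HttpTransport
{
    public virtual int DefaultNumberOfWorkerThreads { get; set; }
    public virtual int DefaultNumberOfSenderThreads { get; set; }
}

// Hand-written equivalent of a generated proxy: every setter call is
// intercepted and recorded as a (property name, value) pair, which is
// the same information the old string-based ConfigureProperty calls carried.
public class RecordingHttpTransport : HttpTransport
{
    public Dictionary<string, object> Recorded { get; } = new Dictionary<string, object>();

    public override int DefaultNumberOfWorkerThreads
    {
        get { return base.DefaultNumberOfWorkerThreads; }
        set
        {
            Recorded[nameof(DefaultNumberOfWorkerThreads)] = value;
            base.DefaultNumberOfWorkerThreads = value;
        }
    }

    public override int DefaultNumberOfSenderThreads
    {
        get { return base.DefaultNumberOfSenderThreads; }
        set
        {
            Recorded[nameof(DefaultNumberOfSenderThreads)] = value;
            base.DefaultNumberOfSenderThreads = value;
        }
    }
}
```

The developer writes against HttpTransport with full intellisense and refactoring support, while the container reads the recorded pairs. A real container would presumably generate such a subclass at runtime rather than have you write it by hand.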
For those of you who want just this feature without nServiceBus, I’ve put up all the binaries here. For the source, you’ll need to go here.
Let me know what you think – especially if you can take the implementation to the point where it won’t need virtual properties to work 🙂
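One possible direction, offered purely as a sketch and not as what nServiceBus does: name the property with a lambda and read the member out of the expression tree. This is a different technique from the interception described above. It keeps compile-time checking and refactoring support and works on non-virtual properties, at the cost of a slightly clunkier call. ComponentConfig, ConfigureProperty and PlainTransport are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical component with an ordinary, non-virtual property.
public class PlainTransport
{
    public int DefaultNumberOfWorkerThreads { get; set; }
}

// Hypothetical config builder: the property is named by a lambda, so a
// rename or typo is caught by the compiler instead of in the lab.
public class ComponentConfig<T>
{
    private readonly Dictionary<string, object> values = new Dictionary<string, object>();

    public IReadOnlyDictionary<string, object> Values { get { return values; } }

    public ComponentConfig<T> ConfigureProperty<TValue>(
        Expression<Func<T, TValue>> property, TValue value)
    {
        // Only a direct property access (t => t.Name) is accepted.
        var member = property.Body as MemberExpression;
        if (member == null || !(member.Member is PropertyInfo))
            throw new ArgumentException(
                "Expected a simple property access such as t => t.Name.", nameof(property));

        values[member.Member.Name] = value;
        return this;
    }
}
```

Usage would look like new ComponentConfig<PlainTransport>().ConfigureProperty(t => t.DefaultNumberOfWorkerThreads, 10), which records the same name and value pair as the string-based call did.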
Comments [12] Posted on Friday, June 13th, 2008.
Prism, AKA Composite Application Guidance + Composite Application Library, is rolling towards a release. I’ve been talking with Glenn Block quite a bit about Prism, and am even on the advisory board (what were they thinking?).
One of the topics not covered by Prism is occasional connectivity, and I would like to say a word or two about that. First of all, if you’re building a standalone client (one that doesn’t communicate with anything), then there’s a good chance that Prism isn’t for you, although you could be composing other standalone client modules. So, if your client isn’t communicating with anything, well, then this post probably won’t interest you that much. Let’s start with…
Networks fail. Period.
This means that your client machine will not always be connected to other servers.
Also, servers fail – critical Windows patches and just regular power outages.
Ergo, your “smart” client will be occasionally connected, whether you planned for it or not.
And please don’t take this post as a “dumping on Prism” post – it isn’t intended that way. Rather, it is about how you should think about designing modules in Prism, and why.
Consider the case where we have two modules being composed in a single client. Each module communicates with a different server. Let’s call these modules Ma and Mb, and the servers Sa and Sb respectively. Now, let’s discuss what occurs given that the modules weren’t designed with occasional connectivity in mind.
User clicks something in Mb which requires communication.
Mb tries to call Sb, say, over HTTP, using a regular web service invocation.
The calling thread, in this case, the one used for user interaction, is blocked waiting for a response from Sb.
Sometime in this call, Sb fails, connectivity goes down, whatever.
30 seconds after the call, the HTTP connection times out.
If something important were happening in Ma at the same time, the user couldn’t even see it, let alone do anything about it, since the user interaction thread is stuck. This is a serious concern in the financial services domain, and in many others as well.
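As one small illustration of the alternative, here is a sketch of a call that never ties up the calling thread and gives up after a timeout the module chooses, rather than whatever the HTTP stack defaults to. It uses C#’s Task-based async support, which is my assumption about tooling here and only one of several ways to get the call off the user interaction thread. RemoteCalls and CallWithTimeoutAsync are hypothetical names, and the remote call itself is passed in as a delegate so the sketch stays self-contained.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class RemoteCalls
{
    // Runs the given remote call without blocking the caller. If the call
    // has not finished within the timeout, it is cancelled and the fallback
    // is returned. This only works if the call honours the cancellation
    // token; other failures (for example, connection errors) still propagate.
    public static async Task<string> CallWithTimeoutAsync(
        Func<CancellationToken, Task<string>> call,
        TimeSpan timeout,
        string fallback)
    {
        using (var cts = new CancellationTokenSource(timeout))
        {
            try
            {
                return await call(cts.Token);
            }
            catch (OperationCanceledException)
            {
                return fallback;
            }
        }
    }
}
```

A click handler in Mb would await this method, so while Sb is slow or down the user interaction thread stays free to show whatever is happening in Ma. It does not solve occasional connectivity by itself; it only stops one module’s outage from freezing the whole client.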
I can go on, but I think that that’s enough to paint the picture that if you are building a smart client, there are a lot more things to think about than just learning Prism. That’s my main concern after witnessing what happened around the CAB. Given the learning curve around these frameworks, many developers don’t seek to deepen their understanding beyond just becoming proficient with them. This isn’t just centered on the developers; evangelists in Microsoft tend to paint the picture this way:
Once you understand X (CAB, Prism, BizTalk, whatever), all your problems are solved.
That’s not to say there aren’t good things in those technologies, but that’s just it, they’re just tools. Silver hammers and “laser” guided saws do not a master carpenter make. There’s actually a pretty good chance the regular guy will saw their arm off.
I do hope more “instruction manuals” will be coming out of Microsoft on these topics. That’s not to say there aren’t any. Specifically on the topic of occasional connectivity, there is Chapter 4 of the Smart Client Architecture & Design Guide. Unfortunately, it doesn’t say anything about how that connects with the MVC/MVP being used client side (the bits affected by Prism). Chapter 6 of the same guide deals with the client-side threading, but doesn’t address issues like:
I haven’t yet documented all the patterns that answer these questions, but until I do (or Microsoft does), let me offer these few resources which I’ve put out over the years:
There are also some more links under the Smart Client link of my “First time here?” page.
Also, please join me in asking Microsoft for an update to these guides – comments below or your own blog posts would be great.
Comments Posted on Monday, June 9th, 2008.
For all the people who came to my talk on Web Scalability with Asynchronous Systems Architecture, thanks for coming and being such a great audience. For all my other readers and loyal subscribers, I’ve updated the code since last it was published so you can find the new stuff here.
Here’s the powerpoint
And here’s the code
Comments [7] Posted on Friday, June 6th, 2008.
© Copyright 2005-2011, Udi Dahan. email@UdiDahan.com