Announcement: First meeting of the IASA-RJ study groups

March 13, 2009

The first official meeting of the IASA-RJ study groups will take place on Thursday, March 26, from 7 to 9 PM. The idea is eventually to run (at least) three separate self-organizing study groups. But for this first meeting, and until we decide that we have enough interest for each, we will try to touch on all three topics:

  • A study of the ATAM methodology for evaluating software architectures
  • Study and preparation for The Open Group’s ITAC certification for IT Architects
  • Study of The Open Group Architecture Framework (TOGAF) and preparation for those who want certification

That’s a lot to cover in a single meeting, and a very ambitious undertaking for a single study group. In this first get-together I will be giving an overview of the ATAM, and Marcelo Sávio will present the other two topics. Time permitting, we will then get down to business and discuss how we can organize the study group(s) and where to go from here.

If you’re interested in learning more about any of these topics, and happen to live in the Rio de Janeiro area, let me know. I’ll be happy to provide full information regarding the location of the meeting. Also, space is limited, so the sooner you let me know, the better. Lastly, if you are not yet a member of the IASA-RJ group, go to the Google Groups site and request permission to join, or send me your email and I’ll be happy to oblige.

Hope to see you soon!

What can quality attributes tell you about a system?

October 10, 2008

Are there quality attributes that can be attributed to specific patterns of system or software architecture? And can you learn about systems based on their attributes?

I’m standing at the cashier of a local furniture store, trying to pay for a new dresser with my credit card, and after swiping, the guy looks at the monitor and says, “your account must be blocked.” I’m thinking, “what? I use this card every day, and I pay it on time…”, so I tell him to try again. He swipes it 3 more times (each time having to go through an annoying loop of cancellations and submission screens), each time with the same result. The message is very clear on the screen, albeit without those helpful little details that we like: “Account invalid.”

So the cashier tells me to go over to the phone to call the credit card company and see if they can clear up the confusion. I follow his instructions because there’s not much else I can do about it, although in the back of my mind, I’m thinking “that message popped up WAY too quick to have come from my credit card provider.”

So, there you have it: I’ve made a judgment call on a problem based on a quality attribute. I happen to know that credit card authentication systems follow a basic client-server architectural pattern, usually over the internet or some form of extranet. One thing we know about client-server architectures is that the latency of communication between the client and the server (especially when first establishing a connection) is orders of magnitude higher than that of in-process or intra-machine exchanges.

Based on this, there’s that voice in the back of my head telling me, “get off the phone, you idiot! That was a client-side validation error!” That’s a pretty common design pattern for client-server architectures: validate as much of your data locally before you go and waste your time with a long-running remote request that is doomed to fail.
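That validate-locally-first pattern is easy to sketch. Here is a minimal, hypothetical example in Python (the names `charge`, `looks_valid_locally`, and `remote_authorize` are my own inventions for illustration, not anything from a real payment system): a cheap local check, here a Luhn checksum, fails in microseconds, and only data that passes is sent on the slow remote round trip.

```python
import re

def remote_authorize(card_number: str) -> str:
    """Stand-in for the slow client-server authorization round trip."""
    return "Approved"

def looks_valid_locally(card_number: str) -> bool:
    """Cheap client-side checks: no network involved."""
    digits = re.sub(r"\D", "", card_number)
    if not 13 <= len(digits) <= 19:
        return False
    # Luhn checksum catches most mistyped or misread card numbers.
    total = 0
    for i, d in enumerate(reversed(digits)):
        n = int(d)
        if i % 2 == 1:       # double every second digit from the right
            n *= 2
            if n > 9:
                n -= 9
        total += n
    return total % 10 == 0

def charge(card_number: str) -> str:
    if not looks_valid_locally(card_number):
        return "Account invalid"          # fails locally, in microseconds
    return remote_authorize(card_number)  # the long-running remote request
```

The trade-off is exactly the one in the story: the fast local rejection is great for latency, but when the local check itself is wrong, it produces an instant, confident-looking "Account invalid" that never consulted the server at all.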

I can’t discern from this whether they have a layered structure on the client that first validates data and then passes it to some business delegate, whether they use some sort of strong domain object called “CreditCard” that does its own validation, or whether they just slipped a little “if” statement into some 1000-line monstrosity of a procedure that will do everything, whether the developers intended it or not. Is there any way to tell this from the outside just by the system behavior? Not directly (that I can think of, at least), but what are the quality impacts of these different approaches? We know that the “monster procedure” approach is initially fast to develop but increasingly hard to maintain: a higher rate of bugs and longer time-to-resolution. I suppose I could try to fish around for bugs, but not with my credit card! A form-based validation scheme tends to decentralize validation and error messages, so I could try different (invalid) credit cards from different points of entry to see if I get different responses.

But I don’t have time for any of this – remember, I’m still on the phone. Yes, although I’ve dialed in to talk to a person, they have (well, their voice recording has) asked me to first enter my credit card number, the 4-digit security code (the example they gave me was for the 4-digit expiration date – wtf?), and the year I became a “member”. Hmm… I guess this is another way of doing up-front validation. This is actually a business process, not a system process: they are trying to save operator time and reduce errors by performing the validation with the customer ahead of time. Nine times out of ten, by the time I get to an operator, they ask me for all the same details all over again. But not this time – the lady on the phone already knows who I am! What does this tell me about their enterprise architecture? I guess it means that, at the very least, it’s a distributed environment with multiple systems interconnected. This could be a SOA-style web services architecture, some sort of peer-to-peer interaction, a shared data store (via mainframe?), or something else.

Later on in the (totally futile) conversation, the operator transfers me to someone else. Wouldn’t you know it, the other person automatically has all my information without having to ask for it yet again – WOW! Well, this probably isn’t peer-to-peer. Although it would work in this situation, there’s generally auditing and tracking of customer support cases, and that sort of supportability requirement calls for a central server, or at least a shared data repository. It’s possible, however, that a peer-to-peer broadcast request-response pattern was used for locating an available support person in the right sector and for handing off the case number. It’s less traceable, but more scalable when a lot of data needs to be exchanged and peers come and go frequently. But if I had to make a guess, I’d say this type of system usually calls for more of a centralized hub (mediator? workflow queue?) in order to keep a tight lid on things.

I forgot to mention – after entering my credit card data for validation, there was a 5-second delay before there was any response on the line. This seems to indicate a synchronous validation process… again, client-server. Then they tossed me into the musical wonderland of the on-hold queue. That’s right, I said queue. They didn’t regale me with fascinating tales of how many people were ahead of me, or how long I should expect to wait, but they very well could have. Queues tend to follow a “competing consumer” pattern, in which each “consumer” (customer support person) asks the central queue provider for the next item (message, job, impatient caller) in FIFO order as soon as they’re available. We know that this is a highly scalable, low-latency way to distribute work among a variable number of “fungible” worker resources. The process was so efficient in this case that it cut off my all-instrumental ecstasy prematurely, halfway through the crescendo in “Feelings”. Fortunately, they quickly confirmed for me that my account was fine and that I was wasting my time and theirs with this multi-system information exchange – it was a client-side error, after all.
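The competing-consumer shape is simple enough to sketch in a few lines. This is an illustrative toy in Python, not anyone's actual call-center implementation: several interchangeable "operator" threads all block on one shared FIFO queue, and whichever operator is free takes the next caller.

```python
import queue
import threading

calls = queue.Queue()        # the central FIFO "on-hold" queue
handled = []                 # (operator, caller) pairs, for inspection
handled_lock = threading.Lock()

def operator(name: str) -> None:
    """Each operator competes for the next caller as soon as it is free."""
    while True:
        caller = calls.get()     # blocks until a caller is available
        if caller is None:       # poison pill: the shift is over
            calls.task_done()
            return
        with handled_lock:
            handled.append((name, caller))
        calls.task_done()

# Three fungible operators compete for work from the one queue.
workers = [threading.Thread(target=operator, args=(f"op{i}",)) for i in range(3)]
for w in workers:
    w.start()
for caller in range(10):         # ten impatient callers join the queue
    calls.put(caller)
for _ in workers:                # one poison pill per operator
    calls.put(None)
for w in workers:
    w.join()
```

Every caller is handled exactly once, in arrival order from the queue's point of view, and adding capacity is just a matter of starting more identical workers: that is the scalability property the pattern is known for.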

Finally, back to the cashier. He’s trying to cancel my order because he’s given up on waiting for me, and people are getting upset. Apparently, maintaining multiple sessions in memory was not a usability concern, although there are patterns that would support it. I get back right about the time he hits “cancel”, so he has to start my transaction over from the beginning. Having confirmed that the problem is client-side, he calls the manager, who applies a contingency mechanism known as “Scotch tape” to the back of my card, and voilà! After a several-second-long client-server validation, my purchase is approved.

Really, without a comprehensive investigation, the quality behaviors can only give you an inkling of the architecture backing a black-box system that you are using. But rarely are we asked to do that. The point here is that the common architectural patterns that we use DO have an impact on those quality attributes. In fact, that’s the whole point: we use architectural patterns to get the job done, but we choose between different options based on which one will get it done with the right balance of trade-offs for our system.

That’s the basis of the Architecture Tradeoff Analysis Method (ATAM). Architectures can be analyzed before they are even built based on the architectural patterns they use, and the prioritized quality goals that are meant to be supported by the system. Given a known set of patterns and the impact they can have (both positive and negative) on these attributes, it is possible to know ahead of time if you have chosen a suitable architecture.
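As a toy illustration of that idea (the patterns, attributes, and numeric scores below are invented for the example, not taken from the ATAM literature): give each candidate pattern a rough impact score per quality attribute, weight those by the system's prioritized goals, and compare candidates before anything is built.

```python
# Hypothetical impact scores (-2 bad .. +2 good) of candidate patterns
# on a few quality attributes. Real ATAM analyses are far richer.
IMPACTS = {
    "client-server":      {"performance": -1, "modifiability": +1, "scalability": +1},
    "monolith":           {"performance": +2, "modifiability": -2, "scalability": -1},
    "competing-consumer": {"performance": +1, "modifiability":  0, "scalability": +2},
}

def score(pattern: str, priorities: dict) -> int:
    """Weighted sum of a pattern's impacts against prioritized quality goals."""
    return sum(weight * IMPACTS[pattern].get(attr, 0)
               for attr, weight in priorities.items())

# Prioritized goals for some hypothetical system: scalability matters most.
goals = {"scalability": 3, "modifiability": 2, "performance": 1}
best = max(IMPACTS, key=lambda p: score(p, goals))
```

The numbers are crude, but the exercise is the point: making the trade-offs explicit forces you to notice, before building anything, which quality goals a candidate architecture quietly sacrifices.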

Practitioners of ATAM are supposed to maintain their own templates of architectural patterns and their trade-offs. There are also some public architecture pattern repositories, and IBM has their “IBM Patterns for e-Business” site (an IBM consultant once showed me a monstrous internal database that was much more complete than this). But none that I know of are very usable in the context of the ATAM, or just picking between trade-offs. Personal experience and intuition always end up being the guiding forces. Do you know of any good pattern catalogs that can tell you the ups and downs of using a pipe-and-filter approach? Of avoiding downtime via a Watchdog pattern, or how to choose between a relational and a hierarchical database? If so, I’d like to know!