The other day, I was working on enhancements to a type of “job execution harness” that executes parallel tasks in our system. We had started out with the concept of “jobs”, essentially individual commands in a Command design pattern, and had recently evolved the idea of “groups” of jobs for managing dependencies between sets of parallel tasks. (note: just for fun I’m testing out yUML for the images here)As with pretty much any complexity creep story, this one starts out with a pretty simple and elegant design. You basically had an execution harness, which took care of scheduling the jobs to be executed, and the jobs themselves, which were totally independent POJOs, with no knowledge of each other, nor of the harness itself. The harness itself provided monitoring capabilities which reported the start and end of the groups, and of the individual jobs. We were living in separation-of-concerns bliss until the day we were given a new requirement: to support monitoring of job “steps”. “Um, what’s a job step?” we asked. It turns out that some of these jobs can take a looong time to run (several hours). Users wanted a way to see what was going on during these jobs in order to get a feel for when they would be done, and if everything was going ok. We wanted at all costs to preserve the harness-agnostic nature of our jobs. We thought about breaking the bigger jobs up into smaller jobs, but unfortunately, it wasn’t possible since they are essentially atomic units of work. We considered solutions for providing some sort of outside monitor which could somehow generically track the execution, but these steps were basically execution milestones that only the job itself could know. Finally, we knew we were defeated, and gave in: because of one little logging requirement, we were going to have to introduce some sort of callback mechanism to let the once harness-agnostic jobs signal to the harness whenever a step was completed. From a pure layering perspective, you can see that we are in trouble if we are trying to keep all the harness code (in orange) in a top layer, and the business code below. So, what are some possible solutions to the problem? We could:
- Let the monitor know about the progress of the Job steps through some indirect method (e.g. through special text log statements, or indirect indications via data in the database). While it would avoid placing any explicit compile-time dependencies on the Job class to the harness, it would create a very fragile “know without knowing” relationship that the Jobs would have with the harness. Nasty.
- Create a special “StepLoggingJob” abstract class that these Jobs would extend in order to gain access to some special logging facilities. Basically, these Jobs would no longer be POJO classes, in the sense that I used the terms, since they would have to extend a harness-specific framework. Unfortunately, this introduces a circular dependency.
- Inject a special “StepLogger” utility class into the Jobs, either as a class member, or as a parameter on their “execute()” (or whatever) method
Note that we still haven’t really solved the problem… the Job class still requires something in the “harness layer”. If we were using a dynamically typed language, we could do something of a mix between option 1 and option 3 by using duck typing (the Job would know it was getting SOMETHING that could log, but wouldn’t have to know it’s from the harness layer). In order to really separate the dependencies in Java, which we use, we have to create a new layer, the “harness API layer”, and place only the StepLogger interface there:So, what happened? To summarize: because we wanted just a little more logging, we were forced to introduce a whole new layer into our application, and break the concept of 100% harness-agnostic commands. Is this an isolated case? Of course not. You see it all over the place, and logging is a great example of this. Have you ever heard someone talk about aspect-oriented programming (AOP), and give logging as an example? It’s PERFECT! With some simple configuration, you can automatically enable logging on all your methods without a single line of code. So, you can get rid of a ton of boiler-plate code related to logging, and focus on just the business logic, right? Wrong. If that were true, we all would have thrown our logging libraries in the garbage years ago. Instead, Log4J is still one of the most heavily-used tools in our library. Why? Because aspects work by wrapping your methods (and so on) with before and after logic, but they can’t get INSIDE the methods themselves.
I call this Aspect Infiltration: when your non-business infrastructure creeps into your business code. You can see this elsewhere, as well: in the J2EE container whose transaction control isn’t fine-grained enough for you (introduce a UserTransaction interface); in the security service that isn’t sufficiently detailed (give the code access to a security context), and so on. It’s a common issue for any container that wants to do all the work for you. There will come a time when the business code itself just knows better. And you’d better be ready to give it the API that it needs.