The Golden Rule for Coding Standards

July 22, 2011

It can be really tedious to define coding standards to the finest detail. The following rule covers about 80% of what you really need to know:

Do unto your code as you would have others do

This rephrasing of the Golden Rule gets to the heart of what standards are all about. Whatever code you write, realize it is something that you are inflicting on other developers further down the line. Think of it as your legacy, because legacy code is what it is. Imagine yourself having to maintain the code (and you may, in fact, have to), and ask yourself if that’s code you can work with.

Also keep in mind that to some degree, you are setting the standards for all work that follows, especially considering the corollary part of this rule:

Do unto the code as others have already done

If code already exists in the project, chances are there’s a slew of already-established coding conventions. Before writing a single line of code on a project that’s new to you, take a look around at what’s already there. Ask yourself some questions, like:

  • How is everyone formatting their code?
  • Which libraries are they already using?
  • Are there already examples of how to do what I have to do?
  • What are the overall design patterns, application layers, and so on that are already in force?

and, lastly,

  • Will I have to come up with some new conventions or solutions that others will have to follow?

By looking around, you can save yourself a LOT of time in terms of decision-making for insignificant issues (8-ball decisions), and may come up with examples that solve the majority of your work right off the bat. If you find that you need to create some new conventions, try to follow the “feel” of the code that has already been established. Otherwise, you’re good to go – just remember the Golden Rule!


Camel Hair, the A-Team and Programmer Cross-pollination

March 22, 2010

Nsaa Adinkra Symbol “He who does not know the real Nsaa buys the fake of it”

The Joys of Cross-Pollination

One subject that comes up often in my work, but in many different guises, is the benefit of “cross-pollinating” your team. It’s a no-brainer that knowledge dissemination across your team results in more consistent work, more flexibility in terms of resource assignments, and in the end, better programmers all around. The ways of disseminating that knowledge range from brown-bag lunch presentations (which generally work for 1 or 2 lunches before they run out of steam) to required code reviews to the extreme programming approach of pair programming. I know a lot of people who cringe at this technique for any number of reasons, but even in places I’ve worked where I’ve encountered unmovable resistance to this method (either because it’s a “waste” of resources, or because the programmers themselves don’t like it), I have always been able to get an exception to the rule under one circumstance: as a way to quickly transfer knowledge from a more experienced developer to a less experienced one. That’s because there’s no documentation in the world that can replace having someone explain something to you AS YOU DO IT.

Cross-pollination doesn’t end at training, however. And neither does the purpose of pair programming. Common practice states that if there’s a disparity of knowledge, the least “experienced” developer is the one who should be at the keyboard, to ensure that they don’t get passed over and left behind. But it also states that the pairs should be swapped with some frequency. The exact frequency is up to the team, but I generally hear anywhere from 2 hours to one day, max. The reason for this is, in part, that “more” or “less experienced” isn’t just a question of who’s older: it’s a matter of context. I, for example, have been working in Java for more than a decade. But when I work on a Rails project, in PHP, or on something else out of my comfort zone, I have pretty much everything to learn from someone who’s been working with the technology longer than I have. And when I roll onto a new project, I have to learn everything about their process, business logic, architecture, etc.

Of course, even “equal” developers have a lot to teach each other. I can’t count the number of times I was watching a presentation, doing a code review, or sitting at someone’s table when I saw their fingers hit some magic key combination that has since saved me hours of pointing and clicking. When you work that closely with others, you pick up new keyboard shortcuts, find out about cool tools, see what blogs they’re reading, and on and on. I’ve been trying out ways to replicate this kind of knowledge transfer for distributed teams using Enterprise 2.0 tools and techniques, but nothing so far can match what you get out of working on the same machine as someone else.

Cross-pollination works on other levels, too. I consider myself very lucky for the time I spent working for Sapient. That’s where I learned what it is that an architect does, and chose my personal career path in life. It’s also where I learned how teams get inspired, and how a process can work to bring people of different disciplines together. I’m also lucky to have worked with fantastic people here in Brazil, where I learned how a process (yes, even a “heavy” process like RUP with CMMI) can work, where I learned about TDD, Continuous Integration, and more importantly, how to stay current and involved in the community at large.

In the meantime, I hear endless stories about software development shops that are stuck in the 20th-century mindset in terms of how to develop software. Everyone’s heard of TDD, but no one bothers to do it. Everyone says they do agile, but no one seems to get it right. I had a recent conversation with a good friend of mine who said that he had just been offered a promotion to be manager of the whole development team after only 6 months at a fairly successful small software and consulting company because he’d just turned around one of their most important projects from a glaring failure to a raging success. What did he do? He introduced the “radical” concepts of unit testing, continuous builds, and functional testing (and I don’t mean automated functional tests – I mean testing the software AT ALL). “In the land of the blind, the one-eyed man is king.”

The A-Team


This is all old news, but what do I make of it? The first thing that came to mind was that there’s a great opportunity here to start a company specializing in fixing up run-down software factories and development teams. There are consultants out there doing the mentoring thing, but the ones I’ve seen generally focus on a couple of weeks of training, then they cut and run. It might be interesting to put an A-Team spin on things to send in some experts that will actually work with the team to get all the right tools installed, and then make sure everyone is properly “cross-pollinated” before leaving. My ex-team at Sakonnet seems to have gotten its share of fame in the local market, since there were a number of employers hoping to snatch us up as soon as they heard we were available, so maybe I could pull us all back together again for this mission – as long as I get to be the one with the cigars.

But thinking about this a little deeper, it occurs to me that it’s a pretty sad thing that there could be such disparity between work environments. I was discussing with my “one-eyed” friend why this happens, and I came to the conclusion that people only learn what they have to, and only what they are exposed to. There is a saying in Ghana that “He who does not know the real Nsaa [a coarse cloth made from camel hair] buys the fake of it”. And people who have NEVER worked in a place that encourages process improvements and the quest for better practices won’t know what they’re missing. Any newcomer who aspires to modernize their work area is fighting against the tide, and if they want to make any permanent changes, they have to do it quickly. As soon as they themselves get comfortable in their new environment, it’s all over. Perhaps the biggest enemy to following best practices and continuous improvement is safe, comfortable, long-term employment. I’ve been reading articles promoting what people are calling “Employment 2.0” (when people start coining the phrase “Sex 2.0”, I’m unplugging the internet. Umm… nevermind). The idea is that by loosening ties to one single job, you increase competitiveness, you let the cream rise to the top, blah blah blah. And you INCREASE CROSS-POLLINATION.

Cross-Pollination as… a Business Model?

So there it is: it’s good to cross-pollinate your developers with each other. It can also be good to cross-pollinate them with developers from OTHER companies. This already happens at conferences, which may be sponsored by your employer, and in outside professional groups, like user groups, programming Dojos and the like. But… what if we could do this as a business? What if we could combine the idea of the A-Team with employer-supported open cross-pollination? You send in a crack team of two or three senior developers to fix up a dev team’s practices. You also get a team of less-seasoned developers to help out, for CHEAP, or even for FREE. Why? Because some other company is loaning them to the cause as a form of boot camp training for their own employees. When the short-term gig is over, the sponsoring company gets them back, knowing that they have gained some real on-the-job experience working with the A-Team. These sorts of developer loan-outs could be staggered as well. A company might loan out some key employees in advance of having the A-Team mentor the whole development department, or they may send them out on a project afterwards as a type of refresher (or because they promised to do the A-Team one favor some time in the future as payment…).

I don’t know if the idea above would really work. I’m just starting to think this through. It could be that with Employment 2.0 on the horizon, no one will want to invest in their employees anymore (in that case, it might be a good investment for you to loan out YOURSELF…). But imagine what regular company cross-pollination could do for the software and IT industry as a whole. If these assumptions are true:

  1. There is great disparity between the productivity of different software development teams
  2. This disparity is caused by a lack of awareness and experience in better practices

then it seems to follow that we would all have something to gain by getting out of our comfort zone every now and then and making ourselves juniors to someone else’s senior developer. I pity the fool who doesn’t. I’ll save any Hannibal quotes for when I have this plan working…


Test Coverage Reporting on Oracle 10g

February 17, 2009

In a previous post, I mentioned that my colleague Eduardo Morelli and I had made an effort to bring some of the standard quality tools from the software development world (automated tests, and test coverage reporting) over to the RDBMS world. In that post, I went into some detail on FIT4DBs, our automated unit testing framework. Now I’d like to reveal some of the secrets to our Oracle test coverage reporting tool, called OracleTC.

Test Coverage Reporting

OracleTC was more or less modeled on the Cobertura test coverage tool that we are using for our Java code. The basic idea of any tool like this is to perform the following steps:

  1. Somehow wrap your normal code (via an interceptor, a decorator, or some other pattern, usually in as unobtrusive a way as possible) with some special code that can observe what is being executed
  2. Execute your suite of tests
  3. Collect statistics regarding the number of lines/paths/functions executed during your test
  4. Generate a report comparing the executed code to the total set of code that SHOULD have (or at least could have) been executed during your tests

The result is what we call the “code coverage” of your test suite. For example, if you are measuring at the level of functions, and your suite executes 15 out of 20 possible functions, your test suite “covers” 75% of your total number of functions. Ideally, you want to be finer-grained than that. Just because a function has been executed once doesn’t mean it’s been fully tested. A good test suite should cover all possible variations on the outcome of a method or function, including edge cases (errors, exceptions, unusual inputs and so on). So, many test coverage suites report on line, or statement, coverage (did your tests execute every single statement in the function?).

The more sophisticated ones will actually try to capture path coverage, which is even more comprehensive than line coverage. For example, imagine you have the following code:

myObj = new Obj();
if (a == true) {
    myObj = null;
}
if (b == true) {
    myObj.doSomethingInteresting();
}

In the above example, we could get 100% line coverage by running one test with a=true and b=false, and another with a=false and b=true. Does that mean we’ve tested everything? No. We’ve missed the rather unfortunate case where both a and b are true, which would be a disaster (a null myObj getting dereferenced).
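
To make that concrete, here is a minimal JUnit sketch (the class and method names are made up for illustration) of the two tests just described. Together they cover every line of the snippet, yet they never execute the one combination that fails:

import org.junit.Test;

public class PathCoverageGapTest {

    // Stand-in for the object in the snippet above (hypothetical).
    static class Obj {
        void doSomethingInteresting() { /* ... */ }
    }

    // The snippet above, wrapped so we can drive it with different flags.
    private void runSnippet(boolean a, boolean b) {
        Obj myObj = new Obj();
        if (a == true) {
            myObj = null;
        }
        if (b == true) {
            myObj.doSomethingInteresting(); // blows up if a was also true
        }
    }

    @Test
    public void firstBranchOnly() {
        runSnippet(true, false);   // covers the "myObj = null" line
    }

    @Test
    public void secondBranchOnly() {
        runSnippet(false, true);   // covers the "doSomethingInteresting()" line
    }

    // Every line of runSnippet is now covered, yet the one combination that
    // actually fails -- a and b both true -- was never executed.
}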

Observing Tests in Oracle

Anyway, back to OracleTC. We wanted to see if we were covering all our bases with our automated tests. To do this, we had to find a solution for the first step above: how could we slip in some code to actually observe what was going on inside Oracle? The usual solution to this problem, when you don’t have access to the guts of your system, is to write some sort of “pre-compiler” that modifies your code (in this case, the stored procedures and triggers and such) to add some sort of tracing to the works (INSERT INTO ORACLETC_TRACE (PROCNAME, START_TIME, END_TIME, PARAMS) VALUES (blah blah…)). This can be pretty intrusive, and will probably only get you as far as procedure-level coverage reporting. It’s more work than we were ready to put into the tool, and I would rather throw in the towel than go through that.

Fortunately, Eduardo (and Oracle) came to the rescue! He found a package called DBMS_PROFILER, which is provided with Oracle. It’s meant to be used as a tool for performance profiling. But look what it can tell you (text copied from reference link):

  1. Total number of times each line was executed.
  2. Time spent executing each line. This includes SQL statements.
  3. Minimum and maximum duration spent on a specific line of code.
  4. Code that is actually being executed for a given scenario.

There you have it! Test coverage reporting nearly out of the box! The package unobtrusively wraps your code executions, once you’ve enabled it for your session, and reports all its data, line-by-line, into some reporting tables that it sets up. We did have to figure out how to enable the profiling for ALL sessions, since our tests are multi-threaded, but the solution is to use a simple trigger (note that we created our own utility package called PKG_PROFILER to make things easier, but you get the idea):

CREATE OR REPLACE TRIGGER On_Logon
  AFTER LOGON ON SCHEMA
DECLARE
  V_WHO VARCHAR2(200);
BEGIN
  SELECT sys_context('USERENV', 'OS_USER') || '_' ||
         sys_context('USERENV', 'SID') || '_' ||
         sys_context('USERENV', 'TERMINAL')
    INTO V_WHO
    FROM dual;
  PKG_PROFILER.PRC_START_PROFILING('STARTING PROFILER FOR (OS_USER, SID, TERMINAL): ' || V_WHO);
END;
/

CREATE OR REPLACE TRIGGER On_Logoff
  BEFORE LOGOFF ON SCHEMA
BEGIN
  PKG_PROFILER.PRC_END_PROFILING;
END;
/

Querying the Results

Once you execute your tests, reports can be easily generated by running queries on this data. As an example, the following query displays which lines took the longest execution time:

select * from (
  select u.unit_name, d.line#, s.text,
         numtodsinterval(round(d.total_time / 1000000000, 2), 'MINUTE') minutes
    from plsql_profiler_data d
   inner join plsql_profiler_units u
      on d.unit_number = u.unit_number
   inner join user_source s
      on (s.name = u.unit_name and s.type = u.unit_type and s.line = d.line#)
   where u.runid = 1
   order by 4 desc)
 where rownum <= 5;

Top 5 most consuming statements

We weren’t able to go the whole 9 yards and figure out path coverage, but thanks to DBMS_PROFILER, OracleTC easily supports line-level coverage reporting. Here’s an example of a procedure we created to report the line coverage at a package-level granularity:

PROCEDURE PRC_TEST_COVERAGE(I_RUNID_START NUMBER, I_RUNID_END NUMBER) IS
  /*
  | Generates a Test Coverage report based on a range of runids.
  | Before activating this procedure, do not forget to set serveroutput on.
  |
  | Parameters:
  |   I_RUNID_START: bottom run id
  |   I_RUNID_END:   top run id
  |
  | Example:
  |   PKG_PROFILER.PRC_TEST_COVERAGE(I_RUNID_START => 1, I_RUNID_END => 862);
  |
  | @TODO:
  |   Create a temporary table to store results
  |   Deal with anonymous blocks
  */
  CURSOR CUNITS IS
    SELECT DISTINCT UNIT_NAME
      FROM PLSQL_PROFILER_UNITS
     WHERE UNIT_NAME LIKE 'PKG_%'
     ORDER BY 1;
  V_COVERAGE NUMBER(5, 2);
BEGIN
  FOR REG IN CUNITS LOOP
    SELECT ROUND(EXEC.NBR / TOTAL.NBR * 100, 2) COVERAGE
      INTO V_COVERAGE
      FROM (SELECT COUNT(*) NBR
              FROM PLSQL_PROFILER_DATA D, PLSQL_PROFILER_UNITS U
             WHERE D.RUNID = U.RUNID
               AND D.UNIT_NUMBER = U.UNIT_NUMBER
               AND D.RUNID BETWEEN I_RUNID_START AND I_RUNID_END
               AND U.UNIT_NAME = REG.UNIT_NAME) TOTAL,
           (SELECT COUNT(*) NBR
              FROM PLSQL_PROFILER_DATA D, PLSQL_PROFILER_UNITS U
             WHERE D.RUNID = U.RUNID
               AND D.UNIT_NUMBER = U.UNIT_NUMBER
               AND D.RUNID BETWEEN I_RUNID_START AND I_RUNID_END
               AND U.UNIT_NAME = REG.UNIT_NAME
               AND D.TOTAL_OCCUR > 0) EXEC;

    DBMS_OUTPUT.PUT_LINE(RPAD(REG.UNIT_NAME, 30, ' ') || '-------------' ||
                         TO_CHAR(V_COVERAGE, '999.99'));
  END LOOP;
END;

The results of executing this procedure look more or less like so:

SQL> EXECUTE PKG_PROFILER.PRC_TEST_COVERAGE (1,862);
PKG_MY_INTERFACE         ------------- 100.00
PKG_MY_RUN               -------------   8.35
PKG_CALC_HIST            -------------  33.33
PKG_COPY_STUFF           -------------  47.89
PKG_PT_OUTPUT_CALC       -------------  16.29
PKG_DATA_VALIDATION      -------------  18.03
PKG_DENORMALIZE_DATA     -------------  42.48
...

This only shows those packages that were actually used during our test suite. What about those that we missed (i.e. with 0% coverage)? Here’s another query that can list those:

select distinct p.object_name
  from plsql_profiler_units u
 right outer join user_procedures p
    on p.object_name = u.unit_name
 where u.unit_name is null
 order by 1;

We can run similar queries to report on triggers as well.

Shiny Happy Reports

But as I said before, our inspiration was the Cobertura project, and they set the bar pretty high when it comes to coverage reporting. Our next step was to try to replicate their web-based reports, which show an overview of the results (package-level) and then let you drill all the way down to the code itself!

EXTRACTING THE RESULTS

Our first task was to figure out how to get all this data out in a format that could be masticated and munged by simple scripts until it looks and acts like well-behaved HTML. We chose XML, which has tool support in nearly every scripting and programming language around, and is good for structured data. Once again, Oracle comes to the rescue with their package DBMS_XMLGEN, which can format query result sets into a canonical XML format.

We wanted the whole enchilada, including the source code for our packages, along with the line-by-line statistics. So Eduardo created some procedures that:

  1. Join the data from the profiling stat tables to the Oracle user_source view
  2. Store the results in a “staging table”
  3. Convert the rows to XML-formatted CLOBs (one per module or package – equivalent to one per original source file in our code base)
  4. Spit them out into a folder as XML-formatted text files, using the package name as the file name

Here’s the code that populates the staging table for procedures which will later be converted to XML (note that this work must be repeated with some differences to capture trigger executions as well; the way these are reported in USER_SOURCE is actually a bit tricky):

-- XML_CLOBS will store one CLOB per row. Each row will correspond to a package, procedure or trigger
 create table xml_clobs (name varchar2(30), result clob);
-- Staging area for procedures and package bodies
create table xml_stage as
 select name, line, max(total_ocurr) total_occur, sum(total_time) total_time, text from
 (select s.name name, s.line line, p.total_occur total_ocurr, p.total_time total_time, s.text text
 from user_source s,
 (select u.unit_name, u.unit_type, d.line#,
 d.total_occur, d.total_time/1000000 total_time
 from plsql_profiler_data d, plsql_profiler_units u
 where u.runid = d.runid
 and u.unit_number = d.unit_number
 and u.unit_type in ('PROCEDURE','PACKAGE BODY')) p
 where s.name = p.unit_name (+)
 and s.line = p.line# (+)
 and s.type = p.unit_type (+)
 and s.type in ('PROCEDURE','PACKAGE BODY'))
 group by name, line, text
 order by 1,2;

Here’s the code for creating the XML CLOBs:

-- XML_CLOBS loading
declare
 q dbms_xmlgen.ctxHandle;
 result clob;
 cursor cModules is select distinct  name from xml_stage order by 1;
 begin
 for reg_module in cModules loop
 q := dbms_xmlgen.newContext ('select line, total_occur, total_time, text from xml_stage where name = '
 || '''' || reg_module.name || '''' || 'order by line');
 dbms_xmlgen.setRowTag(q, 'LINE');
 result:= dbms_xmlgen.getXML(q);
 insert into xml_clobs values (reg_module.name, result);
 dbms_xmlgen.closeContext(q);
 commit;
 end loop;
 end;
 /
 call pkg_utils.PRC_DROP_INDEX('xml_clobs','ix_xml_clobs_01');
 create index ix_xml_clobs_01 on xml_clobs (name);

I’ll skip the code for exporting the CLOBs to the file system, which isn’t too complicated, but a little bit bulky to reproduce here. Once we’re done, we should have a folder full of XML files looking something like this:

<?xml version="1.0"?>
 <ROWSET>
 <LINE>
 <LINE>1</LINE>
 <TEXT>PACKAGE BODY PKG_UTILS AS
 </TEXT>
 </LINE>
 <LINE>
 <LINE>2</LINE>
 <TEXT>
 </TEXT>
 </LINE>
 <LINE>
 <LINE>3</LINE>
 <TOTAL_OCCUR>0</TOTAL_OCCUR>
 <TOTAL_TIME>0</TOTAL_TIME>
 <TEXT>  PROCEDURE PRC_DROP_TABLE(I_TABLE_NAME VARCHAR2,
 </TEXT>
 </LINE>
 <LINE>
 <LINE>4</LINE>
 <TEXT>                           I_FORCE_DROP IN VARCHAR2 := 'FALSE') AS
 </TEXT>
 </LINE>
 ...
 <LINE>
 <LINE>293</LINE>
 <TOTAL_OCCUR>42</TOTAL_OCCUR>
 <TOTAL_TIME>.212</TOTAL_TIME>
 <TEXT>    IF V_PARTITION_EXISTS = 1 THEN
 </TEXT>
 </LINE>
 <LINE>
 <LINE>295</LINE>
 <TOTAL_OCCUR>23</TOTAL_OCCUR>
 <TOTAL_TIME>4782.023</TOTAL_TIME>
 <TEXT>      EXECUTE IMMEDIATE 'ALTER TABLE ' || I_TABLE_NAME ||
 </TEXT>
 </LINE>
 ...
 <LINE>
 <LINE>1129</LINE>
 <TEXT>END;</TEXT>
 </LINE>
 </ROWSET>
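
As I said, I’ll spare you our actual export code, but if you’re curious, something along these lines would also do the job from the Java side with plain JDBC (this is just a sketch of one possible approach, not what we actually ran; the connection details and file naming are placeholders, and the table/column names follow the xml_clobs definition above):

import java.io.FileWriter;
import java.io.Reader;
import java.io.Writer;
import java.sql.Clob;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class XmlClobExporter {

    public static void main(String[] args) throws Exception {
        // Connection details are placeholders -- adjust to your environment.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@localhost:1521:ORCL", "scott", "tiger");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("select name, result from xml_clobs")) {

            while (rs.next()) {
                String name = rs.getString("name");
                Clob clob = rs.getClob("result");
                // Write each CLOB out as <package name>.xml in the current directory
                try (Reader in = clob.getCharacterStream();
                     Writer out = new FileWriter(name + ".xml")) {
                    char[] buffer = new char[8192];
                    int read;
                    while ((read = in.read(buffer)) != -1) {
                        out.write(buffer, 0, read);
                    }
                }
            }
        }
    }
}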

TRANSLATING TO THE WEB

Next, I created an Ant script to read each of the XML files, and perform an XSLT transformation on each to produce equivalent HTML files. The XSL script itself is too gruesome to reproduce here (if anyone wants a copy, I’ll be glad to oblige), but in essence it does the following:

  1. Write the HTML headers
  2. Go through the list of procedures, calculate the coverage of each, and generate a table of the totals
  3. Show all the source code, with counts of the number of times each line was executed and how long the total executions took

The transformation script also sprinkles the output with some useful eye candy: it decorates executable lines with either a red or a green background, depending on whether or not the line was executed in testing. It highlights the start and end of each individual procedure in bold, and provides links from the scores at the top down to the specific code listings. Since it’s a simple text transformation, it must do this based on whatever information is contained in the text file itself. This means we had to make certain assumptions about the source code: a line that begins with “PACKAGE BODY” must be the start of a package definition, “PROCEDURE FOO” or “PROCEDURE FOO(” must be the start of the declaration of a procedure named “FOO”, “END;” is the end of a procedure or block, and so on. It’s not infallible, so a little bit of standardization in code conventions can go a long way here.

Example of a package source report

Calculating the coverage was much more challenging than it sounds. It’s based on line coverage, which means you should be able to calculate a percentage by taking [100 * lines executed / total executable lines]. The problem is with the word “executable”. In Java, you generally have a 1-to-1 correspondence between lines of code and executable lines (this isn’t true, but you could probably get close enough by making that assumption). In PL/SQL, you may have SELECT clauses that span tens of lines (we have queries that actually span hundreds), but count as only a single executable statement. Fortunately, as you may see in the XML example above, not every line contains an entry for the values <TOTAL_OCCUR> and <TOTAL_TIME>. It was almost sufficient to count only those lines that contain these tags as executable. The problem is that the profiler provides execution data for some irrelevant lines, such as the package declaration and “END;”. Again, we resorted to a hard-coded list of values that could be ignored for this purpose.

TYING UP SOME LOOSE ENDS

Now our coverage reports were *almost* complete, but for one detail. As I mentioned above, the profiler doesn’t report stats for packages that aren’t executed during testing. We were able to extract their source code into the XML reports, but without stats, there was no way to determine which lines are executable and which aren’t. As a result, our reports were showing a score of “0 out of 0”, which doesn’t quite compute. For this, we came up with what is so far our most inelegant solution, given that it requires manual intervention to execute and maintain. The trick is to force the profiler to execute the packages without confusing these results with the real stats generated during testing. For each non-executed package, Eduardo created a “wrapper” script which exercises the package somehow (it doesn’t matter how – the results will be ignored), converts the profiling data to invalid scores (-1 for execution occurrences and scores) so they won’t be confused with real results, then merges this data into the “xml_stage” table where the real test results have already been stored so that they may be extracted in the final reports. Unfortunately, this approach requires an extra post-execution step before the report is generated, and requires a custom script for each package that you know isn’t being tested. If anyone has a better solution, please let us know!

BRINGING IT ALL TOGETHER

The cherry on top of our coverage reports is the high-level index page, listing all the packages and summing the scores across packages. At this point, I’d had it with SQL and XSLT, and fell back on my trusty Java knowledge. I created a class that reads all the HTML files spit out by the XSL transformation, greps out the total score from the report, sums them up, and generates an HTML file with all totals, plus links to the individual source file reports.

Example report index page
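
The class really is nothing fancy. A stripped-down sketch of the idea (the “Total:” marker, the regex and the HTML here are illustrative; the real report format is a bit richer) looks something like this:

import java.io.File;
import java.io.FileWriter;
import java.nio.file.Files;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CoverageIndexBuilder {

    // Assumes each per-package report contains a marker like "Total: 37/112".
    private static final Pattern SCORE = Pattern.compile("Total:\\s*(\\d+)/(\\d+)");

    public static void main(String[] args) throws Exception {
        File reportDir = new File(args[0]);
        long covered = 0, total = 0;
        StringBuilder rows = new StringBuilder();

        for (File html : reportDir.listFiles((dir, name) -> name.endsWith(".html"))) {
            String content = new String(Files.readAllBytes(html.toPath()));
            Matcher m = SCORE.matcher(content);
            if (m.find()) {
                long c = Long.parseLong(m.group(1));
                long t = Long.parseLong(m.group(2));
                covered += c;
                total += t;
                rows.append("<tr><td><a href=\"").append(html.getName()).append("\">")
                    .append(html.getName()).append("</a></td><td>")
                    .append(c).append("/").append(t).append("</td></tr>\n");
            }
        }

        try (FileWriter out = new FileWriter(new File(reportDir, "index.html"))) {
            out.write("<html><body><h1>Total coverage: "
                    + (total == 0 ? 0 : 100 * covered / total)
                    + "%</h1><table>" + rows + "</table></body></html>");
        }
    }
}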

Tying all this together is a Maven project which compiles the Java class, and packages it up with an Ant project containing all the scripts, images, stylesheets and so on needed to generate the reports. Unfortunately, the harness doesn’t automate the Oracle part of the report generation, but once the XML has been exported to files, a single call to “ant” is enough to generate the HTML reports.

Conclusion

It sounds like a lot of work, but we were able to create the initial reports in a couple of days, part-time. The project came together really by accident as a thought experiment that unexpectedly worked. For that, we owe our thanks to Oracle for providing so many out-of-the-box tools and features, to Cobertura for providing an excellent example of coverage reporting for Java, and to Scott Ambler for throwing down the gauntlet in the first place.

Although this tool may seem trivial at first glance, it turns out we now have our hands on something quite valuable. The profiling can be enabled for analysis at any time, and with OracleTC, we can quickly report on the results. While it was created with our automated unit test suite in mind, it is by no means limited to this. For example, we have a type of exercise that we call a “blitz test”, in which we try to simulate realistic production situations. The test entails running some automated scripts while anywhere from 5 to 20 people simultaneously log in to the system and begin executing a variety of test scripts based on different user profiles. With OracleTC, we are able to show afterwards just how much of our PL/SQL was executed during the blitz, allowing us to identify gaps and better focus our tests. You could also conceivably enable this profiling in production for a period of time in order to try and identify functionality that isn’t being used by customers, although I wouldn’t do this if your system is sensitive to performance impacts.

I hope you find this information useful and that some of you are encouraged to try this out for yourselves. I tried to provide enough information to give you a big head start, but this article is already too big for a typical blog post, and couldn’t show you everything. Rather than leave you hanging by breaking this up into multiple posts, I did my best to be brief when possible. My hope is that I’ll have the time (and the permission) to open source the code that we have, and provide adequate documentation. Maybe if enough of you show support, it’ll help get the ball rolling. In the meantime, if anyone out there needs a hand, drop me a line (either reply here, or use the email address in the About page) and let me know. But if you are writing to tell me that Oracle (once again) already has a tool that does this, please be gentle!


Unit tests for the database

December 19, 2008

A while back (actually, over 2 years ago now!), Scott Ambler was here in Rio for the Rio Java Summit that we were hosting. In his usual controversial way, he made the point that people on the database side of things are way behind the curve on the latest advances in software development, especially in relation to agile practices and software development techniques. Of course, he was promoting his then-recently-released book “Refactoring Databases”. But he was certainly spot-on about the fact that people that generally work directly with databases, including programming stored procedures and triggers, don’t follow the same practices as other software developers, especially those in the object-oriented paradigm. Perhaps this is because database schemas are more risky and slow to change than software (his Database Refactoring book shows how hard this is – even “agile” schemas are refactored on a time scale of weeks, months or even years, rather than minutes for software). Or, more likely, it’s because there’s something of an organizational and historical wall between the two camps.

Whatever the cause, one thing is clear: there are very few tools available to database developers for things that software developers have come to take for granted: code metrics and analysis, IDEs with support for refactoring, and frameworks for unit testing. What is available is generally either commercial and highly proprietary (Oracle and Quest (the Toad guys) have some fantastic tools for profiling, analysis and even unit testing), or so limited as to be almost trivial. Even when the tools exist, almost no one uses them. Ok, I admit I have no data to support that statement, but I’ve never met a pure DBA, Data Architect or database developer who has ever written a unit test harness in the way that software developers know them (have you?). The point is that unit testing and related practices just haven’t made their way into the database community, and as a result there are almost no tools to support them.

FIT4DBs

As part of his presentation, Scott Ambler sent out a challenge to anyone interested to start building these tools on their own. As it so happens, my own team had just recently started a project to adapt the FIT testing framework for use with SQL and stored procedures. We’d looked at DbUnit and some others, but the XML-based test and data declarations seemed somewhat unwieldy to us. The result was a framework I called FIT4DBs. We had plans to open source it, but unfortunately the time just hasn’t materialized. I’ll let you know if we ever get around to it, but for now I’d like to just mention a few aspects of the project.

FIT4DBs, as I already mentioned, is based on the FIT framework for testing. FIT comes with its own philosophy about testing, in which the business analysts themselves can define requirements in simple HTML or Excel (or some other table-oriented format). The developers are then able to write “Fixtures” which essentially “glue” these requirements to the system code. The result is requirements that are actually testable. Sounds fantastic, and when it works, it really is.

Simple FIT test for a web site shopping cart

For FIT4DBs, however, it wasn’t this philosophy that interested us – it was the table-driven format for declaring tests. It occurred to me that this format is perfect for when you have an algorithm or procedure that is highly data-driven. If you’ve written a lot of unit tests, I’m sure you’ve dealt with these cases before: some methods only require a few tests to cover all the variations, but there are some methods where, even after 10 or 15 or 20 variations, you keep coming up with more (I find I do this a lot when testing regular expressions, for example).


    public void testCalculateLineItemPriceQty1() {
    	ShoppingCart cart = new ShoppingCart();
    	Item item = new Item(101, "The Best of Wonderella", 10.99F);
    	cart.add(item, 1);
    	
    	assertEquals("Price of one item should equal the unit price", 
    			10.99F, cart.calculateLineItemPrice(item), 0.001F);
    }


    public void testCalculateLineItemPriceQtyMany() {
    	ShoppingCart cart = new ShoppingCart();
    	Item item = new Item(693, "1001 Yo Mama Jokes", 43.00F);
    	cart.add(item, 5);
    	
    	assertEquals("Price of multiple items should follow simple multiplication", 
    			215.00F, cart.calculateLineItemPrice(item), 0.001F);
    }

...and so on...


The problem here is that you have a lot of excess code for what is essentially repetitive work – all you’re doing is trying out different inputs and expecting different outputs. If you come up with a new combination of inputs, you have to copy and paste all these lines of code, just to modify one or two parameters. And after a while, it becomes nearly impossible to see exactly which tests you have already created (not to mention all the ridiculous method names). FIT provides a very convenient alternative to this pattern. All you have to do is list the variations of inputs and expected outputs in a table, and voilà! If you want to add a new variation, you just have to add a new row in the table.

Same test with some new exception cases
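
For the curious, the developer-side “glue” behind a FIT table like the ones above is tiny. Here is an illustrative sketch using FIT’s ColumnFixture and the ShoppingCart/Item classes from the JUnit example earlier (the class and field names are assumptions for this post, not our actual fixture):

import fit.ColumnFixture;

// Each row of the FIT table sets these public fields; FIT then calls price()
// and compares the return value against the "price()" column of that row.
public class LineItemPriceFixture extends ColumnFixture {

    public int itemId;
    public String title;
    public float unitPrice;
    public int quantity;

    public float price() {
        ShoppingCart cart = new ShoppingCart();
        Item item = new Item(itemId, title, unitPrice);
        cart.add(item, quantity);
        return cart.calculateLineItemPrice(item);
    }
}

That one small class backs every row of the table, no matter how many variations you add.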

So, the first realization was that database procedures tend to be, well, data-driven. But not in the way that we were talking about here. Usually, what matters is the data that already exists in the database… in the form of tables. And really, that’s the hardest part about testing code that depends on the database: setting up the data the way you want it, and making sure it’s there in a pure and unadulterated form (otherwise, you may get incorrect results). What better way to declare your test data than to put it in tables, just the way it should look in the database? So, we made special set up fixtures that let you declare the name of a table and the columns into which you want to insert your test data. Each row in the HTML or spreadsheet declares a row that will be inserted into the database before the test begins. When the test has completed, the tables that were set up are either truncated (the tests assume they were empty prior to initialization), or the transaction is rolled back. In 9 out of 10 cases, this is sufficient for removing any side effects of the test and keeping each test independent of the others.

A set up table for Items in the database

Next, we needed some Fixtures for testing stored procedures and free-form SQL. This was relatively simple. A pure text call to SQL or procedure is transformed into a JDBC call through Java.

Commands to execute directly in the database
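
Under the hood, that translation is little more than standard JDBC boilerplate. A simplified sketch (not the framework’s actual code) of what gets executed for a free-form statement or a procedure call:

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.Statement;

public class DbCommandRunner {

    private final Connection connection;

    public DbCommandRunner(Connection connection) {
        this.connection = connection;
    }

    // Free-form SQL taken straight from a cell in the FIT table.
    public void executeSql(String sql) throws Exception {
        try (Statement stmt = connection.createStatement()) {
            stmt.execute(sql);
        }
    }

    // A stored procedure call such as "PKG_FOO.DO_STUFF(?, ?)" with its arguments.
    public void callProcedure(String call, Object... args) throws Exception {
        try (CallableStatement stmt = connection.prepareCall("{call " + call + "}")) {
            for (int i = 0; i < args.length; i++) {
                stmt.setObject(i + 1, args[i]);
            }
            stmt.execute();
        }
    }
}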

The hard part is then comparing the results. While simple JUnit tests can usually limit the scope of their actions to the simple input and output of a single method (some tests require mock objects to test side-effects), stored procedures can affect any number of tables in any number of ways (they can even affect the schema itself). Not only that, but triggers may affect even tables not explicitly included in the scope of the procedure itself (and we DO want to test triggers, right?). This is not an easy thing to test, but FIT4DBs makes a heroic effort to make it simple for you.

What it comes down to in the simplest of terms is looking for “diffs” on the data contained in the tables. This is done by first “registering” the tables and the starting data via the set up tables. That’s right, the data goes not only into the database, but into in-memory hash maps that keep track of what the “before” data looked like. Once the test has been executed, the tables can be re-queried to see if their data has changed in any way. A “SELECT *” is executed on each table, and the results are sucked into “after” hash maps. There are then Fixtures that can be used to declare tables for diff comparisons. Each row that is declared in the test is an expected difference. The differences can be of type “inserted”, “deleted” or “updated”. Comparisons are based on the primary keys, which are declared as part of the table syntax, so if a primary key changes as part of the test, it will look like the row has been deleted and a new one has been inserted.

Testing expected changes in the items table
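
Conceptually, the diff step boils down to comparing two snapshots keyed by primary key. A bare-bones sketch of the idea (the real framework also handles multiple tables, composite keys, type coercion and reporting):

import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TableDiff {

    // A table snapshot: primary key -> full row (column name -> value).
    public static Map<String, Map<String, Object>> snapshot(
            List<Map<String, Object>> rows, String pkColumn) {
        Map<String, Map<String, Object>> byKey = new HashMap<>();
        for (Map<String, Object> row : rows) {
            byKey.put(String.valueOf(row.get(pkColumn)), row);
        }
        return byKey;
    }

    // Compare "before" and "after" snapshots and report inserted/updated/deleted keys.
    public static void diff(Map<String, Map<String, Object>> before,
                            Map<String, Map<String, Object>> after) {
        for (String key : after.keySet()) {
            if (!before.containsKey(key)) {
                System.out.println("inserted: " + key);
            } else if (!before.get(key).equals(after.get(key))) {
                System.out.println("updated:  " + key);
            }
        }
        for (String key : before.keySet()) {
            if (!after.containsKey(key)) {
                System.out.println("deleted:  " + key);
            }
        }
    }
}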

And that’s pretty much it! There are also facilities for things like declaring your connections, and managing suites of tests by, for example, reusing set up data via a custom “include” facility. DBUnit and similar frameworks work in some of the same ways, but they usually declare their set up data via XML. That just never felt right to me – it’s too hard to edit and to read what you have. Additional benefits of FIT itself include the ability to throw in comments wherever you want (anything that’s not in a table is basically considered a “comment” and ignored – this lets you add formatting and even images to your tests if it can help readability), and the fact that the results show up as exactly your input HTML, with green, red and yellow colors to show passes, fails and errors.

Test Coverage

One other piece that is typically missing from the database development suite is tooling for reporting on the test coverage of procedural code by the unit test suite. This more or less goes without saying, since unit testing in general is rather neglected. Apparently Quest Software offers a rare tool that can provide this sort of information if you happen to have written your tests with their suite, but most of our tests are defined with the FIT4DBs framework. We also have other automated tests not specific to FIT4DBs which exercise our PL/SQL code, so checking coverage only with that tool would be insufficient.

Eduardo Morelli, our Data Architect, has come to the rescue with an excellent, thorough and flexible solution to this problem. Unfortunately, it’s Oracle-specific, so it will only help you if you work with Oracle. I’ll save the details for another post – stay tuned!

So, Mr. Ambler, I hope this all comes as some encouragement to you. It’s been a long road to rise to your challenge, and it’s still ongoing, but Sakonnet at least is one shop that prefers to use agile development practices on both the software and database sides. I hope that some time soon we can get the resources to open source our FIT4DBs framework for others to use. In the meantime, you can give DBUnit and other tools a shot, or just test your code in an integration environment through JUnit, FIT and other automated harnesses. If you use Oracle, you can even use the techniques I’ll describe in my next post to report on your test coverage. It’s all about the data, so get out there and test the code that’s mucking around with it!


Legaciness revealed!

October 1, 2008

It looks like my never-ending quest for a Legaciness metric has already come to a satisfyingly swift end! I was just browsing some blogs on agile techniques when I came across a post about fixing bad developer habits on a blog called Agile Tips. Waaaaay down in the footnotes is a link to a handy little tool on the Google Code site called the Testability Explorer.

I couldn’t believe my eyes when I saw it. I felt like Francisco Orellana stumbling across El Dorado. Could this really be the fabled Legaciness metric after all? I have to investigate more, but what I’ve read so far seems to be exactly what I’ve been searching for: a tool designed to calculate the difficulty of testing Java code based on its static dependencies! And from what I’ve read so far, it looks like I was on the right track with the algorithm:

  1. Each class attribute has to be considered by how “injectable” it is (their term, and I like it)
  2. “Global” variables are considered bad if they’re accessible, but not injectable
  3. There’s a transitive penalty for methods which call methods that are not injectable (the “inheritance” penalty that I talked about)

They don’t directly address a number of concerns I have, such as:

  1. System calls that make testing difficult (e.g. I/O calls)
  2. Calls to “new” within a method
  3. The ability to “inject” something using the Legacy Code Patterns like “Extract and Override Factory Method” (if the class itself is considered injectable in a method, at least that method won’t get a penalty)

I’m guessing that some of these concerns, like the calls to “new”, would be covered by the transitivity of the complexity scores. I also like that they’ve included cyclomatic complexity in the mix, although I don’t have a feel yet for whether that would outweigh the injectability problems – my guess is no. I also don’t know if they take into account 3rd-party libraries. All in all, it looks like their metric is pretty straightforward, easy to understand and useful. I like the simplicity of their approach.
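
To illustrate what “injectable” means here (my own made-up example, not one from the Testability Explorer docs): the first class below should score badly, because a test has no way to substitute the MailServer it reaches for internally, while the second should score much better:

// A global singleton dependency, reached from inside the method under test.
class MailServer {
    private static final MailServer INSTANCE = new MailServer();
    static MailServer getInstance() { return INSTANCE; }
    void deliver(String invoiceId) { /* sends mail */ }
}

// Scores badly: the collaborator is not injectable, so a test can't swap it out.
class InvoiceMailer {
    void send(String invoiceId) {
        MailServer.getInstance().deliver(invoiceId);
    }
}

// Scores well: the collaborator is injected, so a test can pass in a fake or mock.
class TestableInvoiceMailer {
    private final MailServer mailServer;

    TestableInvoiceMailer(MailServer mailServer) {
        this.mailServer = mailServer;
    }

    void send(String invoiceId) {
        mailServer.deliver(invoiceId);
    }
}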

All that’s left now is to try it out! I’ll give it a shot tomorrow and see what it thinks about a certain 550-line file-reading Excel-interpreting monster of a method that’s been keeping me up at night…


Legacy Patterns Decision Tree

August 27, 2008

Way back when I first read the Michael Feathers book on “Working Effectively with Legacy Code”, I decided to write up a quick description of the patterns in the book for my coworkers to look up whenever they need them (basically, when trying to write unit tests for big, monolithic “legacy” classes).

These patterns are extremely intuitive once you’ve gone over them a few times. They’re really just obvious techniques for working around some constraints with as little risk and impact as possible. However, to reduce the learning curve, I created the decision tree below. I actually don’t think it’s very useful in practice, but creating it was a great way for ME to fully understand these patterns and their relationships. Now, I pass it on to you. My hope is that any of you developers out there may learn something from having a look at it. And, who knows, maybe there’s some insight here that can be used in my quixotic quest for the Legaciness metric.

Legacy Code Patterns Decision Tree

Note that some of the leaf nodes recommend the use of mock objects, which are not part of the legacy code patterns themselves. I recommend the use of EasyMock, which is the library I’m using, but you can use whatever you prefer. If you can use fake objects, I find that to be generally easier than, and thus preferable to, using mock objects.
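
To show the difference with a throwaway example (mine, not from the book or the tree): with a fake you hand-roll a trivial implementation that is good enough for the test, while with EasyMock you record the expected interaction, replay, and verify it happened:

import static org.easymock.EasyMock.createMock;
import static org.easymock.EasyMock.expect;
import static org.easymock.EasyMock.replay;
import static org.easymock.EasyMock.verify;

import junit.framework.TestCase;

interface PriceService {
    double priceFor(String sku);
}

class Checkout {
    private final PriceService prices;

    Checkout(PriceService prices) {
        this.prices = prices;
    }

    double total(String sku, int quantity) {
        return prices.priceFor(sku) * quantity;
    }
}

public class CheckoutTest extends TestCase {

    // Fake: a tiny, hand-rolled implementation that is good enough for the test.
    public void testTotalWithFake() {
        PriceService fake = new PriceService() {
            public double priceFor(String sku) {
                return 2.50; // every item costs 2.50 in this fake
            }
        };
        assertEquals(7.50, new Checkout(fake).total("APPLE", 3), 0.001);
    }

    // Mock: record the expected interaction, replay, exercise, then verify.
    public void testTotalWithMock() {
        PriceService mock = createMock(PriceService.class);
        expect(mock.priceFor("APPLE")).andReturn(2.50);
        replay(mock);

        assertEquals(7.50, new Checkout(mock).total("APPLE", 3), 0.001);
        verify(mock);
    }
}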


Varying levels of quality

August 12, 2008

I recently gave a presentation to our development team about Emergent Design, based on the presentation given by Scott Bain at the Dr. Dobbs conference in Chicago. The presentation goes way beyond the use of design patterns to talk about the underlying “principles” (defined more or less as a particular “wisdom” about development) upon which design patterns are based, and specific “practices” (defined as things you should always do because they don’t cost much, and generally embody one of the principles in some concrete way).

Examples of principles include the Open-Closed Principle, and the DRY Principle. Programming by Intention is classified as a practice. He also defines TDD and “Pattern-Oriented Development” as “disciplines”, because they embody principles, but require some level of effort to implement. In the case of disciplines, he says, you should carefully consider the value they provide in your specific situation to see if they make sense for you.

As an architect, it is commonly my responsibility to define standards and best practices, and then make sure one way or another that they are followed. I am often confronted with “special case” scenarios, where for some reason (it’s throwaway code, it’s for a different application, it’s “just a demo”) the developer feels that it would be overkill to follow all our usual standards. This usually goes beyond what Scott Bain defined as disciplines to just about anything: Javadoc, Programming by Intention, writing unit tests, worrying about proper design, using System.out.println(), whatever.

In essence, the question is: is it OK to have different levels of quality for different types of code? Over the years, I’ve found myself waffling on this issue of whether or not it’s necessary to follow the standards in these special cases. Recently, I’ve been pretty flexible on the issue. In part it’s because I was introducing some new standards that I felt would be an extra burden, initially, to developers, and I wanted to let them get used to them slowly without them feeling I was being unreasonable about it. In part it’s because I don’t think any practice (or discipline or whatever) should be used blindly without a little thought as to whether or not it’s practical. Allowing for varying levels of quality is one way I could give developers that type of flexibility to decide for themselves.

Giving this presentation has provided me an opportunity to reflect once again on this subject. And once again, I feel I’m swinging back to the other side on this one. There are two reasons for this: the so-called disciplines, and the “killer demo” syndrome.

There’s a reason why “disciplines” are called what they are. It’s because to do them right, to do them efficiently to the point that they really pay off, they require some practice. They also require a conscious decision on the part of the developer to use them. If you’ve ever written code using TDD, you know what that feels like. You’re all ready to start spouting out brilliant gobs of code that will revolutionize the lives of millions, mow your lawn and pick up the kids from school, get rid of unwanted facial hair, get rid of embarrassing age spots… but then you remember that before all that, you have to write a TEST! Bummer. But for those who have gone through this process, you know that in the end, it’s well worth it, and even enjoyable once you get over that first hump.

Practices aren’t necessarily any easier, at least not until you’ve ingrained them into the way you do your work in general. But these days, I find it hard NOT to write my code using a Programming by Intention approach. It’s just the way I think.

Now, if following disciplines and using best practices takes some practice to get used to, and if in the end you are more productive and produce better quality code, why on earth would you look for opportunities NOT to get in some good practice? It reminds me of the day I learned how to really type (and not just hunt-and-peck). I knew all about the “home” positions and all that, but had always put off doing it. Then one day, I had a paper to write, and decided that “this is it”. I wasted about two hours’ worth of productivity struggling to feel out the keys. 17 years later, I’ve never regretted it, and never gone back. My hope would be that programmers would try to follow best practices of their own volition. But when you’ve got deadlines, or when you’re working on a piece of code that you know is heading straight for the recycle bin, perhaps you feel like “cheating” just this once.

As the one who maybe has to play the role of the enforcer now and then, that’s really not a strong enough reason for me to stick my nose where it doesn’t belong. I have enough things to worry about already. The thing is that it can become my problem when that code that was heading straight for the garbage compactor makes a sudden left turn towards the big production onramp, and merges with the flow of traffic right into trunk. This is what Joe Yoder calls the “Killer Demo”, or “Throwaway Code” problem in his Big Ball of Mud article. It happens all the time. I just recently oversaw two or three “demos” that were merged into our application. Some were given the proper time for a complete rewrite, but some were just kinda tidied up, and that’s all. I am also working with some “throwaway” scripts that we’ve been maintaining for about 18 months now. That’s a pretty long time to work with something that was written on the sly.

So, what to do? Should I make an about face and start enforcing standards on all code at all times? I still think developers should be allowed to think for themselves. If they are coming to a different conclusion than I would, it could be a matter of opinion. I also think it’s a matter of culture. If developers aren’t being inspired to do these things for themselves, then maybe there’s more I could do to change that. What was that commandment again? Oh, right. “Show the way”.