Test Coverage Reporting on Oracle 10g

February 17, 2009

In a previous post, I mentioned that my colleague Eduardo Morelli and I had made an effort to bring some of the standard quality tools from the software development world (automated tests, and test coverage reporting) over to the RDBMS world. In that post, I went into some detail on FIT4DBs, our automated unit testing framework. Now I’d like to reveal some of the secrets to our Oracle test coverage reporting tool, called OracleTC.

Test Coverage Reporting

OracleTC was more or less modeled on the Cobertura test coverage tool that we are using for our Java code. The basic idea of any tool like this is to perform the following steps:

  1. Somehow wrap your normal code (via an interceptor, a decorator, or some other pattern, usually in as unobtrusive was as possible) with some special code that can observe what is being executed
  2. Execute your suite of tests
  3. Collect statistics regarding the number of lines/paths/functions executed during your test
  4. Generate a report comparing the executed code to the total set of code that SHOULD have (or at least could have) been executed during your tests

The result is what we call the “code coverage” of your test suite. For example, if you are measuring at the level of functions, and your suite executes 15 out of 20 possible functions, your test suite “covers” 75% of your total number of functions. Ideally, you want to be more finer-grained than that. Just because a function has been executed once, it doesn’t mean it’s been fully tested. A good test suite should cover all possible variations on the outcome of a method or function, including edge cases (errors, exceptions, unusual inputs and so on). So, many test coverage suites report on the line, or statement coverage (did your tests execute every single statement in the function?).

The more sophisticated ones will actually try to capture path coverage, which is even more comprehensive than line coverage. For example, imagine you have the following code:

myObj = new Obj()
if (a == true) {
 myObj = null
}
if (b = true) {
 myObj.doSomethingInteresting()
}

In the above example, we could get 100% line coverage by running one test with a=true and b=false, and another with a=false and b=true. Does that mean we’ve tested everything? No. Unfortunately, we’ve missed the rather unfortunate case of when both a and b are true, which would be a disaster.

Observing Tests in Oracle

Anyway, back to OracleTC. We wanted to see if we were covering all our bases with our automated tests. To do this, we had to find a solution for the first step above: how could we slip in some code to actually observe what was going on inside Oracle? The usual solution to this problem, when you don’t have access to the guts of your system, is to write some sort of “pre-compiler” that modifies your code (in this case, the stored procedures and triggers and such) to add some sort of tracing to the works (INSERT INTO ORACLETC_TRACE (PROCNAME, START_TIME, END_TIME, PARAMS) VALUES (blah blah…)). This can be pretty intrusive, and will probably only get you as far as procedure-level coverage reporting. It’s more work than we were ready to put into the tool, and I would rather throw in the towel than go through that.

Fortunately, Eduardo (and Oracle) came to the rescue! He found a package called DBMS_PROFILER, which is provided with Oracle. It’s meant to be used as a tool for performance profiling. But look what it can tell you (text copied from reference link):

  1. Total number of times each line was executed.
  2. Time spent executing each line. This includes SQL statements.
  3. Minimum and maximum duration spent on a specific line of code.
  4. Code that is actually being executed for a given scenario.

There you have it! Test coverage reporting nearly out of the box! The package unobtrusively wraps your code executions, once you’ve enabled it for your session, and reports all its data, line-by-line, into some reporting tables that it sets up. We did have to figure out how to enable the profiling for ALL sessions, since our tests are multi-threaded, but the solution is to use a simple trigger (note that we created our own utility package called PKG_PROFILER to make things easier, but you get the idea):

CREATE OR REPLACE TRIGGER On_Logon
 AFTER LOGON ON schema
 DECLARE
 V_WHO VARCHAR2(200);
 BEGIN
 SELECT sys_context('USERENV', 'OS_USER') || '_' ||
 sys_context('USERENV', 'SID') || '_' ||
 sys_context('USERENV', 'TERMINAL')
 INTO V_WHO
 FROM dual;
PKG_PROFILER.PRC_START_PROFILING('STARTING PROFILER FOR (OS_USER, SID, TERMINAL): ' ||   V_WHO);
 END;
 /
CREATE OR REPLACE TRIGGER On_Logoff
 BEFORE LOGOFF
 ON schema
DECLARE
 BEGIN
 PKG_PROFILER.PRC_END_PROFILING;
 END;
 /

Querying the Results

Once you execute your tests, reports can be easily generated by running queries on this data. As an example, the following query displays which lines took the longest execution time:

select * from (
 select u.unit_name, d.line#, s.text , numtodsinterval (round(d.total_time/1000000000,2), 'MINUTE') minutes
 from plsql_profiler_data d inner join plsql_profiler_units u
 on d.unit_number = u.unit_number
 inner join user_source s on (s.name = u.unit_name and s.type = u.unit_type and s.line = d.line#)
 where u.runid = 1
 order by 4 desc)
 where rownum <= 5
Top 5 most consuming statements

Top 5 most consuming statements

We weren’t able to go the whole 9 yards and figure out path coverage, but thanks to DBMS_PROFILER, OracleTC easily supports line-level coverage reporting. Here’s an example of a procedure we created to report the line coverage at a package-level granularity:

PROCEDURE PRC_TEST_COVERAGE(I_RUNID_START NUMBER, I_RUNID_END NUMBER) IS
 /*
 |
 | Generates a Test Coverage report based on a range of runids.
 | Before activating this procedure, do not forget to set serveroutput
 |
 | Parameter:
 |       I_RUNID_START: bottom run id
 |       I_RUNID_END: top run id
 |
 | Example:
 |       PKG_PROFILER.PRC_TEST_COVERAGE (I_RUNID_START => 1,I_RUNID_END = 862);
 | @TODO:
 |       Create a temporary table to store results
 |       Deal with anonymous blocks
 |
 */
 CURSOR CUNITS IS
 SELECT DISTINCT UNIT_NAME
 FROM PLSQL_PROFILER_UNITS
 WHERE UNIT_NAME LIKE 'PKG_%'
 ORDER BY 1;
 V_COVERAGE NUMBER(5, 2);
 BEGIN
 FOR REG IN CUNITS LOOP
SELECT ROUND(EXEC.NBR / TOTAL.NBR * 100, 2) COVERAGE
 INTO V_COVERAGE
 FROM (SELECT COUNT(*) NBR
 FROM PLSQL_PROFILER_DATA D, PLSQL_PROFILER_UNITS U
 WHERE D.RUNID = U.RUNID
 AND D.UNIT_NUMBER = U.UNIT_NUMBER
 AND D.RUNID BETWEEN I_RUNID_START AND I_RUNID_END
 AND U.UNIT_NAME = REG.UNIT_NAME) TOTAL,
 (SELECT COUNT(*) NBR
 FROM PLSQL_PROFILER_DATA D, PLSQL_PROFILER_UNITS U
 WHERE D.RUNID = U.RUNID
 AND D.UNIT_NUMBER = U.UNIT_NUMBER
 AND D.RUNID BETWEEN I_RUNID_START AND I_RUNID_END
 AND U.UNIT_NAME = REG.UNIT_NAME
 AND D.TOTAL_OCCUR > 0) EXEC;
 DBMS_OUTPUT.PUT_LINE(RPAD(REG.UNIT_NAME, 30, ' ') || '-------------' ||
 TO_CHAR(V_COVERAGE, '999.99'));
 END LOOP;
 END;

The results of executing this procedure look more or less like so:

SQL> EXECUTE PKG_PROFILER.PRC_TEST_COVERAGE (1,862);
PKG_MY_INTERFACE         ------------- 100.00
PKG_MY_RUN               -------------   8.35
PKG_CALC_HIST            -------------  33.33
PKG_COPY_STUFF           -------------  47.89
PKG_PT_OUTPUT_CALC       -------------  16.29
PKG_DATA_VALIDATION      -------------  18.03
PKG_DENORMALIZE_DATA     -------------  42.48
...

This only shows those packages that were actually used during our test suite. What about those that we missed (i.e. with 0% coverage)? Here’s another query that can list those:

select DISTINCT p.object_name
 from plsql_profiler_units u
 right outer
 join user_procedures p on p.object_name = u.unit_name
 where u.unit_name is null
 ORDER BY 1;

We can run similar queries to report on triggers as well.

Shiny Happy Reports

But as I said before, our inspiration was the Cobertura project, and they set the bar pretty high when it comes to coverage reporting. Our next step was to try to replicate their web-based reports that show an overview of the results (package-level), and then lets you drill all the way down to the code itself!

EXTRACTING THE RESULTS

Our first task was to figure out how to get all this data out in a format that could be mastigated and munged by simple scripts until it looks and acts like well-behaved HTML. We chose XML, which has tool support in nearly every scripting and programming language around, and is good for structured data. Once again, Oracle comes to the rescue with their package DBMS_XMLGEN, which can format query result sets into a canonical XML format.

We wanted the whole enchilada, including the source code for our packages, along with the line-by-line statistics. So Eduardo created some procedures that:

  1. Join the data from the profiling stat tables to the Oracle user_source view
  2. Store the results in a “staging table”
  3. Convert the rows to XML-formatted CLOBs (one per module or package – equivalent to one per original source file in our code base)
  4. Spit them out into a folder as XML-formatted text files, using the package name as the file name

Here’s the code that populates the staging table for procedures which will later be converted to XML (note that this work must be repeated with some differences to capture trigger executions as well; the way these are reported in USER_SOURCE is actually a bit tricky):

-- XML_CLOBS will store one clob per line. Each line will correpond to a package, procedure or trigger
 create table xml_clobs (name varchar2(30), result clob);
-- Staging area for procedures and package bodies
create table xml_stage as
 select name, line, max(total_ocurr) total_occur, sum(total_time) total_time, text from
 (select s.name name, s.line line, p.total_occur total_ocurr, p.total_time total_time, s.text text
 from user_source s,
 (select u.unit_name, u.unit_type, d.line#,
 d.total_occur, d.total_time/1000000 total_time
 from plsql_profiler_data d, plsql_profiler_units u
 where u.runid = d.runid
 and u.unit_number = d.unit_number
 and u.unit_type in ('PROCEDURE','PACKAGE BODY')) p
 where s.name = p.unit_name (+)
 and s.line = p.line# (+)
 and s.type = p.unit_type (+)
 and s.type in ('PROCEDURE','PACKAGE BODY'))
 group by name, line, text
 order by 1,2;

Here’s the code for creating the XML CLOBs:

-- XML_CLOBS loading
declare
 q dbms_xmlgen.ctxHandle;
 result clob;
 cursor cModules is select distinct  name from xml_stage order by 1;
 begin
 for reg_module in cModules loop
 q := dbms_xmlgen.newContext ('select line, total_occur, total_time, text from xml_stage where name = '
 || '''' || reg_module.name || '''' || 'order by line');
 dbms_xmlgen.setRowTag(q, 'LINE');
 result:= dbms_xmlgen.getXML(q);
 insert into xml_clobs values (reg_module.name, result);
 dbms_xmlgen.closeContext(q);
 commit;
 end loop;
 end;
 /
 call pkg_utils.PRC_DROP_INDEX('xml_clobs','ix_xml_clobs_01');
 create index ix_xml_clobs_01 on xml_clobs (name);

I’ll skip the code for exporting the CLOBs to the file system, which isn’t too complicated, but a little bit bulky to reproduce here. Once we’re done, we should have a folder full of XML files looking something like this:

<?xml version="1.0"?>
 <ROWSET>
 <LINE>
 <LINE>1</LINE>
 <TEXT>PACKAGE BODY PKG_UTILS AS
 </TEXT>
 </LINE>
 <LINE>
 <LINE>2</LINE>
 <TEXT>
 </TEXT>
 </LINE>
 <LINE>
 <LINE>3</LINE>
 <TOTAL_OCCUR>0</TOTAL_OCCUR>
 <TOTAL_TIME>0</TOTAL_TIME>
 <TEXT>  PROCEDURE PRC_DROP_TABLE(I_TABLE_NAME VARCHAR2,
 </TEXT>
 </LINE>
 <LINE>
 <LINE>4</LINE>
 <TEXT>                           I_FORCE_DROP IN VARCHAR2 := 'FALSE') AS
 </TEXT>
 </LINE>
 ...
 <LINE>
 <LINE>293</LINE>
 <TOTAL_OCCUR>42</TOTAL_OCCUR>
 <TOTAL_TIME>.212</TOTAL_TIME>
 <TEXT>    IF V_PARTITION_EXISTS = 1 THEN
 </TEXT>
 </LINE>
 <LINE>
 <LINE>295</LINE>
 <TOTAL_OCCUR>23</TOTAL_OCCUR>
 <TOTAL_TIME>4782.023</TOTAL_TIME>
 <TEXT>      EXECUTE IMMEDIATE 'ALTER TABLE ' || I_TABLE_NAME ||
 </TEXT>
 </LINE>
 ...
 <LINE>
 <LINE>1129</LINE>
 <TEXT>END;</TEXT>
 </LINE>
 </ROWSET>

TRANSLATING TO THE WEB

Next, I created an Ant script to read each of the XML files, and perform an XSLT transformation on each to produce equivalent HTML files. The XSL script is too gruesome to reproduce here (if anyone wants a copy, I’ll be glad to oblige):

  1. Write the HTML headers
  2. Go through the list of procedures, calculate the coverage of each, and generate a table of the totals
  3. Show all the source code, with counts of the number of times each line was executed and how long the total executions took

The transformation script also sprinkles the output with some useful eye candy: it decorates executable lines with either a red or a green background, depending on whether or not the line was executed in testing. It highlights the start and end of each individual procedure in bold, and provides links from the scores at the top down to the specific code listings. Since it’s a simple text transformation, it must do this based on whatever information is contained in the text file itself. This means we had to make certain assumptions about the source code: a line that begins with “PACKAGE BODY” must be the start of a package definition, “PROCEDURE FOO” or “PROCEDURE FOO(” must be the start of the declaration of a procedure named “FOO”, “END;” is the end of a procedure or block, and so on. It’s not infallible, so a little bit of standardization in code conventions can go a long way here.

Example of a package source report

Example of a package source report

Calculating the coverage was much more challenging than it sounds. It’s based on line coverage, which means you should be able to calculate a percentage by taking [100 * lines executed / total executable lines]. The problem is with the word “executable”. In Java, you generally have a 1-to-1 correspondence between lines of code and executable lines (this isn’t true, but you could probably get close enough by making that assumption). In PL/SQL, you may have SELECT clauses that span tens of lines (we have queries that actually span hundreds), but count as only a single executable statement. Fortunately, as you may see in the XML example above, not every line contains an entry for the values <TOTAL_OCCUR> and <TOTAL_TIME>. It was almost sufficient to count only those lines that contain these tags as executable. The problem is that the profiler provides execution data for some irrelevant lines, such as the package declaration and “END;”. Again, we resorted to a hard-coded list of values that could be ignored for this purpose.

TYING UP SOME LOOSE ENDS

Now our coverage reports were *almost* complete, but for one detail. As I mentioned above, the profiler doesn’t report stats for packages that aren’t executed during testing. We were able to extract their source code into the XML reports, but without stats, there was no way to determine which lines are executable and which aren’t. As a result, our reports were showing a score of “0 out of 0”, which doesn’t quite compute. For this, we came up with what is so far our most inelegant solution, given that it requires manual intervention to execute and maintain. The trick is to force the profiler to execute the packages without confusing these results with the real stats generated during testing. For each non-executed package, Eduardo created a “wrapper” script which exercizes the package somehow (it doesn’t matter how – the results will be ignored), converts the profiling data to invalid scores (-1 for execution occurances and scores) so they won’t be confused with real results, then merges this data into the “xml_stage” table where the real test results have already been stored so that they may be extracted in the final reports. Unfortunately, this approach requires an extra post-execution step before the report is generated, and requires a custom script for each package that you know isn’t being tested. If anyone has a better solution, please let us know!

BRINGING IT ALL TOGETHER

The cherry on top of our coverage reports is the high-level index page, listing all the packages and summing the scores across packages. At this point, I’d had it with SQL and XSLT, and fell back on my trusty Java knowledge. I created a class that reads all the HTML files spit out by the XSL transformation, greps out the total score from the report, sums them up, and generates an HTML file with all totals, plus links to the individual source file reports.

Example report index page

Example report index page

Tying all this together is a Maven project which compiles the Java class, and packages it up with an Ant project containing all the scripts, images, stylesheets and so on needed to generate the reports. Unfortunately, the harness doesn’t automate the Oracle part of the report generation, but once the XML has been exported to files, a single call to “ant” is enough to generate the HTML reports.

Conclusion

It sounds like a lot of work, but we were able to create the initial reports in a couple of days, part-time. The project came together really by accident as a thought experiment that unexpectedly worked. For that, we owe our thanks to Oracle for providing so many out-of-the-box tools and features, to Cobertura for providing an excellent example of coverage reporting for Java, and to Scott Ambler for throwing down the gauntlet in the first place.

Although this tool may seem trivial at first glance, it turns out we now have our hands on something quite valuable. The profiling can be enabled for analysis at any time, and with OracleTC, we can quickly report on the results. While it was created with our automated unit test suite in mind, it is by no means limited to this. For example, we have a type of exercize that we call a “blitz test”, in which we try to simulate realistic production situations. The test entails running some automated scripts while anywhere from 5 to 20 people simultaneously log in to the system and begin executing a variety of test scripts based on different user profiles. With OracleTC, we are able to show afterwards just how much of our PL/SQL was executed during the blitz, allowing us to identify gaps and better focus our tests. You could also conceivably enable this profiling in production for a period of time in order to try and identify functionality that isn’t being used by customers, although I wouldn’t do this if your system is sensitive to performance impacts.

I hope you find this information useful and that some of you are encouraged to try this out for yourselves. I tried to provide enough information to give you a big head start, but this article is already too big for a typical blog post, and couldn’t show you everything. Rather than leave you hanging by breaking this up into multiple posts, I did my best to be brief when possible. My hope is that I’ll have the time (and the permission) to open source the code that we have, and provide adequate documentation. Maybe if enough of you show support, it’ll help get the ball rolling. In the meantime, if anyone out there needs a hand, drop me a line (either reply here, or use the email address in the About page) and let me know. But if you are writing to tell me that Oracle (once again) already has a tool that does this, please be gentle!