So I’m a little confused by some of the mixed messaging. I would like the opportunity for each of you to weigh in on the actual problem set in front of me. In Jessica’s email for the off-site, she asked Mike and me to spend roughly 2 hours discussing automated unit testing. That’s fine and dandy, but after reading through Steve’s Quality Plan, discussing one on one with Mike and again one on one with Marc Nadeau, I’m not convinced that everyone is aligned on the problem.
Each of you has one or more arguments about unit testing. In general, each of you desires greater coverage of our APIs through more unit testing so that we can hand over a more functionally stable product. From my conversation with Marc, he’s concerned that the unit tests today are not only lacking in coverage, but also fail to be executed by engineering on a consistent basis. Rather, the only execution occurs during the nightly builds and is then managed and controlled by QA. Essentially, our engineers are throwing their code over the wall with the promise that the nightly testing loops will catch their mistakes. When I discussed this further with David Ashman and ad hoc with a few engineering managers, they confirmed that unit tests were not being run on any consistent basis. The reason…the API harness takes too long to run. So it’s safe to say that there are opportunities for improvement with unit testing in categories such as: coverage, decomposition, execution and accountability.
When I talked with Mike, he had a slightly different opinion about the problem set I am to tackle. He called out the problem of improving the reliability and timing of code handoffs. This is covered to a degree in your quality plan as item #3 (Requirements – Collaboration, Validation & Traceability). I’m not quite sure this is a problem I can get my hands around, as I might be too removed from the sprint lifecycle to fairly address it.
My request would be to focus on the process framework for improving quality as part of the engineering lifecycle. Below are some of my initial thoughts…
Steve’s slide 17 does a great job of calling out the key issues of our unit test approach: coverage gaps, quality issues, time management, execution and measurement. I think they are a good place for us to start. My only change would be to separate static analysis, and even dynamic analysis, from unit testing, and keep both under the umbrella of code quality.
Coverage: I personally have not performed a coverage analysis effort yet. So ideally I would like the time to perform an audit, whether by an independent inspector or by a small team responsible for providing thorough measurements. I would love to be able to go back and introduce more coverage where we are lacking. We should, as we always intended, empower the module/product teams to do so, while at the same time augmenting the coverage through some professional services. The coverage problem is really about the go-forward plan. Steve calls out an interesting point about unit tests not being accounted for with more precision during the sprints. I would add that unit tests are also not collaboratively discussed and transparently presented as sprint deliverables.
Talking Point 1 on Coverage: Recommend we bring in an outside consulting firm with our $133k to help us measure our true coverage and the quality of our tests, and to help us work through the problems of collaborative unit test development. There are companies out there like Stelligent, or we could bring the former president of Stelligent, Andrew Glover, in to work on this problem with us. I actually grew up with Andrew and we have traded a few emails over the past 3 years. He’s got a lot of ideas and could be really helpful in addressing our deficiencies. If a consultant comes in, it’s going to take time away from team members, whether those team members are Engineering Managers, Architects, Engineers or Performance Engineers. We also invested in Clover, but are we seeing a return on that investment?
Talking Point 2 on Coverage: We have said over and over that the teams would be tasked with going back and adding coverage. It’s been lip service until now. We haven’t seized any opportunities in our schedule to take this on. In fact, we haven’t even made it a measurable output for the development teams. I propose that we put together a measurable objective for development managers to plan for unit test coverage windows throughout 2011, and that we put coverage objectives for all legacy APIs in the plans of all contributing engineers. We have to make time and set targets. This might require a proof of concept with one product team for a short period to gauge feasibility and effort, as well as to work out the wrinkles before rolling something of this magnitude out to the whole team.
Talking Point 3 on Coverage: We only get so far with Talking Point 2, as our developers are consumed with new functional development and maintenance. I suggest we invest some of our professional services funds to tackle the coverage backlog. I’m not sure how much Talking Point 1 will cost; I would assume at least $50k, which might leave us only a little over $80k to spend. That’s not a whole lot and might only add up to 1 or 2 full-time off-shore developers. This might be good for chipping away at consistent contributions, but how productive would 1 or 2 contractors really be? It may make sense to try to obtain enough funding for each product/module team. If contractors are used, managing them will require planning, training, reviewing, etc. from one or more people.
Talking Point 4 on Coverage: As it relates to new development, or even maintenance development for that matter, I strongly advise that each team be 100% transparent about their unit tests. I might have to explain this in person, but I would propose a scenario in which each engineer produces a unit test plan as part of the sprint. The plan could obviously change each week depending on changes in requirements and use cases. You make the plan a measurable artifact that the engineer has to present to the architects and development managers, and it would be accessible to the entire team, including QA, UX, etc. The team would have to account for the plan, as well as complete the work, in their estimation and development efforts. To do something like this, we need to figure out how to make it fit without taking away functional development time. My main motivation is to empower the engineers on the team to be responsible for more than just their code. They have to present a plan on the testability of their code and execute on that plan.
Unit Testing: If we look beyond coverage, there are some tangible problems for us to tackle. The first, which is more engineering oriented, is the time it takes to execute unit tests. The unit testing framework that exists today is not well decomposed. For example, if I’m working on the Grade Center, I can’t run an automated suite of tests scoped to the Grade Center; I have to manually target the one or two tests I want to run. If I could run a sub-set of tests, then I obviously wouldn’t have to wait an hour. Considering there is a belief that the coverage we do have is minimal (less than 30%), adding more tests only makes the full run last longer. Second, there is evidence that unit tests are being avoided on some of the most complex pieces of code in the product. Third, not all code should be tested by a JUnit test. We may have to consider alternative testing approaches such as HttpUnit, JSUnit, DBUnit, or even our own Selenium scripts. We can’t look at testing as the sole responsibility of our QA organization.
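To make the decomposition point concrete, here is a minimal sketch, in plain Java, of what a module-scoped harness looks like: tests register under the module they exercise, and an engineer runs only the suite for the component they touched. The module names and the Runnable-based tests are illustrative assumptions, not our actual harness.

```java
import java.util.*;

// Hypothetical sketch of a decomposed harness: tests are grouped by
// module so a subset can be run without executing the full hour-long
// suite. Module names here are assumptions for illustration.
public class ModuleHarness {
    private final Map<String, List<Runnable>> suites =
            new HashMap<String, List<Runnable>>();

    // Register a test under the module it exercises.
    public void register(String module, Runnable test) {
        List<Runnable> suite = suites.get(module);
        if (suite == null) {
            suite = new ArrayList<Runnable>();
            suites.put(module, suite);
        }
        suite.add(test);
    }

    // Run only the named module's tests; returns how many were executed.
    public int runModule(String module) {
        List<Runnable> suite = suites.get(module);
        if (suite == null) return 0;
        for (Runnable test : suite) {
            test.run();
        }
        return suite.size();
    }
}
```

The same grouping can be expressed with JUnit 4 suite runners; the point is that scoping has to be designed into the harness, not bolted on by manually picking tests.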
Talking Point 1 on Unit Testing: I think this is a solvable problem, but one that I don’t have firsthand experience with. So I would need to either run the tests myself and see how the API test harness has been coded, or assign one of my engineers to do so. We could establish a project that tackles making the harness faster, as well as more decomposed for better decoupling of tests within the suite.
Talking Point 2 on Unit Testing: As I noted above in Talking Point 4 on Coverage, we have to make a change in the way we do business by making our teams accountable for the transparency of their unit tests and approaches to testability as part of the planning/design during a sprint. The teams should agree that the most complex areas of code and the areas with the greatest risk should in fact have the greatest amount of engineering testability. This will require executive motivation and measurement with accountability.
Talking Point 3 on Unit Testing: I mentioned the need for transparency and collaboration around unit testing plans. I also mentioned that the more complex or higher-risk the code, the stronger the need for a unit test. That still doesn’t address QA’s problem that engineering is not running their tests. So I am of the mindset that we need to establish some process changes around code check-ins. We mandate code reviews; as part of that process, or as a supplementary one, we need to establish a review of unit tests, new and old, for the affected area or component being developed against. It might be as simple as providing the results of the unit tests as part of the check-in changelist, or as having team members run the suite and authorize the check-in based on its success. In either case, engineers have to be accountable for running their tests and measured as part of their code commit lifecycle.
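The check-in gate above can be sketched in a few lines. This is a hypothetical illustration only: the Suite interface and the result-line format are assumptions, standing in for whatever our review tooling actually records against the changelist.

```java
// Hypothetical sketch of a check-in gate: a commit is authorized only
// if the affected module's suite passes, and the result line is what
// would be attached to the changelist as evidence the tests were run.
public class CheckinGate {
    // Assumed stand-in for a runnable module test suite.
    public interface Suite {
        boolean runAll(); // true if every test passed
    }

    // Produce the authorization line for the changelist.
    public static String review(String module, Suite suite) {
        boolean passed = suite.runAll();
        return (passed ? "APPROVED" : "REJECTED") + ": " + module
                + " suite " + (passed ? "passed" : "failed");
    }
}
```

Whether the gate is enforced by tooling or by a reviewing teammate matters less than the fact that the evidence of a run travels with the commit.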
Code Quality: I would prefer not to lump code quality and static analysis under unit testing, but rather to group unit testing and static analysis under the category of code quality. It’s clear that we failed as an organization to implement static code analysis as both a formal process and an enterprise tool suite. The fall code project that engineering undertook missed the mark. It was more of a tool evaluation that never made it into our lifecycle and never formalized into our methodology. It was, sadly, a tool project. I believe static analysis has to be a process and workflow with decision points and accountability. No one tool can solve all of our problems, and each team will have different stakes and interests. For example, we in performance engineering make use of 2 tools (FindBugs and PMD) as well as custom rules. Security engineering will attempt to use those tools as well, but we plan on using IBM’s AppScan source code suite and a handful of other tools to help us meet our static analysis goals. Engineering may need additional tools, and so may UX. Right now we aren’t doing anything formal from a UX code quality standpoint other than accessibility testing. We are doing some light JSLint, but nothing really formal on our CSS.
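For anyone who hasn’t written one, a custom rule of the kind we run in performance engineering boils down to something like the toy below. This is deliberately not the FindBugs or PMD API; it is a minimal illustration of what a rule is — scan source, report the violating lines — and the banned pattern (System.out.println) is an arbitrary example.

```java
import java.util.*;

// Toy illustration of a custom static-analysis rule (not the FindBugs
// or PMD API): flag every line containing a discouraged call so a
// workflow step can respond to the violations.
public class ConsoleOutputRule {
    // Returns the 1-based line numbers that violate the rule.
    public static List<Integer> findViolations(String source) {
        List<Integer> hits = new ArrayList<Integer>();
        String[] lines = source.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            if (lines[i].contains("System.out.println")) {
                hits.add(i + 1);
            }
        }
        return hits;
    }
}
```

Real rules work on parse trees rather than raw text, but the process question is the same either way: once the rule reports line 2, who is accountable for responding, and when?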
Talking Point 1 on Code Quality: My first point, about building a process and not simply selecting a tool set, is absolutely critical. We need to figure out how static analysis can be incorporated into our lifecycle. How and when do we define rules? Who is responsible for responding to rule violations, and when? How do we guarantee that these tools are not simply ignored, but are built into our workflow? How do we make teams accountable for responding to their issues?
Talking Point 2 on Code Quality: The tool set does become important once we have established processes. We need to make the tool sets more enterprise oriented, meaning they don’t run only on someone’s desktop; desktop execution should be limited to the IDE. What I am talking about is server-side continuous analysis. We need to integrate these tools into a formal configuration management process. The workflow discussed in Talking Point 1 on Code Quality has to establish rules and procedures for maturing rules, phasing them out when they are no longer applicable and, most importantly, responding to issues.
Talking Point 3 on Code Quality: Any team contributing and/or inspecting code, such as DBAs, UX, Perf and Security, has to identify the appropriate static analysis tools for inspection. We can’t approach this with a one-tool-fits-all model. We have to be open to the idea that the tools are interchangeable while the process matures but stays consistent.