One day, we were discussing some thoughts about defect root cause analysis in our office. “An exception is always a mistake of a programmer,” a project manager explained his rule of thumb to me. “An analyst would never prescribe an exception in the specs, now would he?”
As a former tester and programmer, I couldn’t resist the temptation to run scenarios in my mind where the exception is exactly the result of the analyst’s work - mostly unintentionally, though. For example, mistakes or omissions in an interface description can easily cause an exception. Specifying changes in one part of a system while neglecting other affected parts can cause the outdated parts to malfunction. The specification of a change request might not fully take into account all the features that had been built into the system before, thus causing hiccups in some less common situations. Whenever some aspect has managed to escape the attention of everybody in the line, it’s hard to blame anybody specifically.
Of course, there are many more ways an exception might sneak into existence without any vicious deed from the programmer. An exception could have been caused by the system architect deciding to use a library that turns out to be riddled with bugs. The exception could be the fault of the guy who voted to buy those cheap network hubs on sale at the discount store instead of the more expensive ones that had real metal casings and had actually been tested for compliance with a gazillion standards. An exception could be caused by some bizarre setting in the application server’s configuration that Mike the Server Guy tweaked to gain speed. The occasional strange noises from the air conditioner in the server room could be the lead to the memory leaks that seem to occur whenever the sun has been smiling all day.
Now, complacently gazing over the ruins of the “rule of thumb” I just bashed, I’d like to ask whether blaming anybody is necessary at all. In a software development team, which is more important - producing working software, or punishing people for making mistakes? Mind you, learning, development and finding the optimal solution are truth-seeking activities that inherently involve making many mistakes. By shooting down the curious explorers we inadvertently tell everybody to keep treading the old known track and to avoid trying anything new, to avoid evolving. While some mistakes will be avoided and productivity will therefore be somewhat higher in the short term, killing the desire to improve oneself will hurt a lot in the long run. The software industry evolves way too fast to afford die-hard developers with only the hard-wired practices from their early days.
In an environment where hunters are beating the bushes looking for witches to burn, fear is the driving force. Dreams of a jelled team or happy employees can be buried six feet under.
What to do instead? How to survive without dead bodies hanging from trees as a lesson for all? In my opinion, mistakes in software should be tolerated, but care must be taken not to miss their educational value. Striving for a deep understanding of every single bug and the factors that allowed it to emerge, instead of applying “rules of thumb”, not only helps to avoid similar bugs in the future, but also helps to craft a better fix and gives a deeper understanding of the system. In a team, the most interesting or hideous cases could be published to share the knowledge - in a neutral, objective tone, without vilifying any single person. Unfortunately, more often than not schedules exist in denial of the fact that properly debugging a feature takes more time than writing the first version of it.
I am happy to see this philosophy giving good results in our product, Changelogic. We acknowledge that mistakes are made during development, so we have several traps to catch them - change review, task verification, version acceptance. The traps are there to isolate the customer from the bugs that slip in during development. While we do show statistics about how effective each trap is and how fast tasks are processed by the team as a whole, there is intentionally no easy way to get summaries by person. In my mind, seeing a software product as formed from the efforts of a team is much closer to the truth than thinking of it as a sum of the results of individual developers. After all, successful individuals are no use when the team fails.
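To make the idea of per-trap statistics concrete, here is a minimal sketch in Python. It is not Changelogic’s actual code: the function name `trap_effectiveness`, the `"customer"` label for escaped defects and the sample numbers are all made up for illustration. Each defect is recorded only by the stage where it was caught, so there is simply no author field to summarize by.

```python
from collections import Counter

# Stage names come from the post; the ordering and the "customer" label
# for defects that escaped every trap are assumptions for this sketch.
TRAPS = ["change review", "task verification", "version acceptance"]


def trap_effectiveness(caught_at):
    """For each trap, return the share of defects reaching it that it caught.

    `caught_at` is a list with one entry per defect: the name of the trap
    that caught it, or "customer" if it slipped through all of them.
    No person is recorded, so a per-person summary is not possible.
    """
    counts = Counter(caught_at)
    remaining = len(caught_at)  # defects still alive when a trap is reached
    rates = {}
    for trap in TRAPS:
        caught = counts[trap]
        rates[trap] = caught / remaining if remaining else 0.0
        remaining -= caught
    return rates


# Hypothetical sample data: 10 defects, one of which reached the customer.
defects = (
    ["change review"] * 5
    + ["task verification"] * 3
    + ["version acceptance"] * 1
    + ["customer"] * 1
)
rates = trap_effectiveness(defects)
for trap, rate in rates.items():
    print(f"{trap}: {rate:.0%}")
```

The rate for each trap is measured against the defects that actually reached it, so a late trap is not penalized for bugs an earlier one already caught.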
4 Comments
1. Kristjan Kanarik | September 25th, 2006 at 12:12
Right on point, Imre. During the last 10 months I’ve been in a major client-facing role (analyst/PM), and when something has gone wrong, I’ve been asked a number of times “how could that have happened? who is to blame?”. I mean, come on, does anyone actually include “looking for victims and punishing them, softly” as an activity in the project plan? Not so.
Same goes for any other activity that hasn’t been planned for - even refactoring, which Karel has bashed, endless optimizing (Karel has probably bashed that as well!), etc. If it isn’t sensible (doesn’t create value for the customer), don’t do it.
2. Tim Williscroft | September 26th, 2006 at 00:16
You might find solace in the works of W. Edwards Deming.
(The TQM guy)
One of his 14 points for management is “Drive out fear”.
It works for manufacturing, so why couldn’t it work for software?
3. Blanka | September 26th, 2006 at 13:59
Blaming and punishing should really be the last option. The worst thing about it is that it sends this message to the team: don’t try anything new, because if you fail for any reason, you will be punished. Why should someone give the very best of themselves, their creativity, for a fair chance of being ridiculed later?
4. AhtiK | October 1st, 2006 at 21:46
I see blaming often as an inhumane and unprofessional way of trying to urge people to find the real cause.
“What went wrong, how can we improve ourselves next time?”
vs
“You have 50% of bugs still unfixed and we were supposed to be live since yesterday and the client is paying less than we pay you and your shirt stinks!”
Blaming is like an arrogant problem statement made from a distance, without any cause analysis, and delivered in a bossy way.