Changelogic weblog

Nonlinearities of software development
  

What you’ll write tomorrow is not what you think today

My favourite idea from Extreme Programming is that you should not let your imagination run wild and try to conjure future requirements from a crystal ball. Instead, you should design and write something lightweight that works for current requirements and refactor it later, if needed.

Why have I grown fond of this philosophy? Wouldn’t it be easier to implement those future features if they have already been taken into account? Maybe, in theory. But along the way, I have seen enough really bad code (oops, some of it written by me) that tries to fulfill needs that were imagined to become worthwhile one day. Of course, requirements changed as they always do (unless you work in some dreamy place with thick forests and nice waterfalls), and the feature was never fully implemented. Often, the feature never even made it into the requirements list.

As the application doesn’t really support the anticipated feature, such code is confusing to read and hard to understand. Since the feature was not needed at the time, it was not thoroughly tested, so there are inevitably some bugs that might even hinder useful parts of the code. And since later changes have not taken the fancy feature-ought-to-be into account, the code has become so stale that it would be much easier to design the feature from scratch anyway.

Of course, like any useful idea, this practice has its limitations. For an intended public API, you’ll want to get as much of the naming, structure and scope nailed down as possible before considering the work done. Any reorganizing you’d want to introduce after publication would break applications that have already been written against your API. That would eat up your API’s trust quota quite fast. You’ll need at least three applications experimentally using your API before you can consider it generic enough for wider use.

6 comments  October 3rd, 2006  Posted by Imre Lumiste  

Aim at a developer, shoot the team

One day, we were discussing some thoughts about defect root cause analysis in our office. “An exception is always a mistake of a programmer,” a project manager explained his rule of thumb to me. “An analyst would never prescribe an exception in the specs, now would he?”

As a former tester and programmer, I wasn’t able to withstand the temptation to run scenarios in my mind where the exception is exactly the result of the analyst’s work - mostly unintentionally, though. For example, various mistakes or omissions in an interface description can easily cause an exception. Also, specifying changes in one part of a system while neglecting other affected parts can cause the outdated parts to malfunction. A specification of a change request might not fully take into account all the features that had been built into the system before, thus causing hiccups in some less common situations. Whenever some aspect has managed to escape the attention of everybody in the line, it’s hard to blame somebody specifically.

Of course, there are many more ways an exception might sneak into existence without any vicious deed from the programmer. An exception could have been caused by the system architect deciding to use a library that turns out to be crippled with bugs. The exception could be the fault of the guy who voted to buy those cheap network hubs on sale at the discount store instead of the more expensive ones that had real metal casings and had actually been tested for compliance with a gazillion standards. An exception could be caused by some bizarre setting in the application server’s configuration that Mike the Server Guy has tried to tune to gain speed. The occasional strange noises from the air conditioner in the server room could be the lead to the memory leaks that seem to occur whenever the sun has been smiling all day.

Now, complacently gazing over the ruins of the “rule of thumb” I just bashed, I’d like to ask whether blaming anybody is necessary at all. In a software development team, which is more important - producing working software, or punishing people for making mistakes? Mind you, learning, development and finding the optimal solution are truth-seeking activities that inherently include making many mistakes. By shooting down the curious explorers we inadvertently tell everybody to keep stomping the old known track and avoid trying anything new, avoid evolving. While some mistakes will be avoided and production will therefore be somewhat higher, killing the desire to improve oneself will hurt a lot in the long run. The software industry evolves way too fast to afford die-hard developers with only the wired-in practices from their early days.

In an environment where hunters are bashing bushes looking for witches to burn, fear is the driving force. Dreams of a jelled team or happy employees can be buried six feet under.

What to do instead? How to survive without dead bodies hanging from trees as a lesson for all? In my opinion, mistakes in software should be tolerated, but care must be taken not to miss their educational value. Striving for a deep understanding of every single bug and the factors that allowed it to emerge, instead of applying “rules of thumb”, not only helps to avoid similar bugs in the future, but also helps to craft a better fix and gives a deeper understanding of the system. In a team, some of the most interesting or hideous cases could be published to share the knowledge - in a neutral, objective tone, without vilifying any single person. Unfortunately, more often than not the schedules exist in denial of the fact that properly debugging a feature takes more time than writing the first version of it.

I am happy to see this philosophy giving good results in our product, Changelogic. We do acknowledge that mistakes are made during development, thus we have several traps to catch them - change review, task verification, version acceptance. The traps are there to isolate the customer from the bugs that slip in during development. While we do show statistics about how effective each trap is and how fast tasks are processed by the team as a whole, there is intentionally no easy way to get summaries by person. In my mind, seeing a software product as formed from the efforts of a team is much closer to the truth than thinking of it as a sum of the results of individual developers. After all, successful individuals are no use when the team fails.

4 comments  September 21st, 2006  Posted by Imre Lumiste  

Software risk profiles

What I’ve been saying so far about refactoring probably looks like this:

Karel Kravik: “What happens if you refactor your software? You devote a lot of resource to it, but its current value remains the same and you’re happy if it still works the same way. So why should one do refactoring?”

And as I gather from the feedback, people think that an arrogant salesman is now speaking out against refactoring.

Actually, that is not the case.

What I’ve been trying to say is that when you refactor your software or do something that does not directly increase its value, like improving maintainability or simplifying future enhancements, you’re not doing something wrong, but you should consider how it affects your risk profile.

Investments into software

A lot of decisions people make in the software scene can be considered as investments, but people making them are not aware of it. What else would you name the process of putting in money (people’s salaries and other expenses) in order to win back more money later (by selling your products and services)?

When developing a software product, you can invest in two kinds of assets:

  • Tangible assets: new features, new functionality, bug fixing, etc
  • Intangible assets: maintainability, scalability, extendibility, reusability, etc

These decisions in turn affect your product’s current value and future value as well as your risk and payoff.

Let’s put these terms onto a basic chart:

The blue part is probably where most software falls. This picture does not say how valuable a piece of software is; it just tries to guess the proportion of current value to future value, so the centre is not zero but a 50-50 distribution.

An example from the lower left corner could be a script that parses your account statement and calculates how much you’ve been spending on gasoline, as opposed to the Spring framework in the upper right corner.

But let’s go on and see what we read out of this picture.

How is the value perceived?

Your end users see only current value and tend to underestimate future value, meaning short-term money is in the lower left quarter.

On the contrary, techies want to invest only in future value and tend to underestimate the need for current value.

Customer Guy: “We need this fancy starring feature like you see in GMail tomorrow.”

Architecture Guy: “Can’t do. Our architecture doesn’t support Ajax so we need to deal with it on the architecture level. Let’s get it right the first time; this way it will be a lot cheaper than refactoring later.”

Framework Guy: “You have to begin from a good framework. Everything else will follow. I suggest you take Snooble Web Toolkit.”

The decision-maker must find the right balance for the situation. If you’re developing an open-source project or have shitloads of VC money to put at risk, you can enter the scene from the upper right corner, but if your existence depends on the current value, you have to start from the lower left quarter.

Risk and payoff

Investments into intangible values protect you in the situations you have been thinking about. Say you invested a lot of money up front into an architecture that scales up to a million concurrent users, and now you suddenly have one million concurrent users: your architecture works and you make a fortune.

The alternative is that you have invested a lot of money and the tiny chance of having a million users never materializes, so the architecture has been a waste.

Karel Kravik: “It would be reasonable to buy a lottery ticket when your chances multiplied by the prize are more than the money you actually pay for it. Usually what you see is exactly the opposite, but people still buy them because it’s pocket change and fun.”

Karel Kravik: “I’m sure aspect oriented programming is fun, but introducing it is no pocket change.”

Imre Lumiste: “The same actually applies to the lower left quarter too – just because something is cheap and less risky to implement doesn’t mean the customers actually want it.”

So the decision maker should consider whether the money invested into intangible values is really balanced by the probability of the situations in which they will make money.
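The lottery rule above fits into a few lines of arithmetic. This is only a minimal sketch: the function name is made up for illustration, and every number in it (costs, probabilities, payoffs) is a hypothetical placeholder, not data from any real project.

```python
def expected_net_value(cost, probability, payoff):
    """Chance multiplied by the prize, minus what you pay up front."""
    return probability * payoff - cost


# Hypothetical: a million-user architecture that costs 200,000 extra,
# earns 5,000,000 if the users show up, and has a 2% chance of being needed.
architecture = expected_net_value(cost=200_000, probability=0.02, payoff=5_000_000)

# Hypothetical lottery ticket: pocket change, but a losing bet on average.
ticket = expected_net_value(cost=2, probability=1 / 10_000_000, payoff=5_000_000)

print(f"architecture: {architecture:+,.2f}")
print(f"ticket:       {ticket:+,.2f}")
```

Both bets come out negative with these made-up numbers; the difference is that the ticket loses pocket change while the architecture loses a budget. In real life the hard part is that you rarely know the probability at all.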

Paths you can choose

Every decision toward concreteness increases current value (assuming it’s made by homo economicus) but also narrows the set of paths you could choose between in the future, that is, it might decrease your future value. So abstraction introduces some risks, but also takes down others – in a way, abstraction is a method of handling uncertainty, too.

If you’ve made enough investments into the values we previously called intangible (maintainability, scalability, etc.), you are not so vulnerable to fluctuations in the scene. Let’s take the earlier example – thanks to a scalable architecture your systems run regardless of whether you have 10, 1000 or a million users.

Karel Kravik, who just couldn’t keep quiet: “The problem is if you make money with 10 users.”

A good example of handling uncertainty with abstraction can be seen in amateurish startups (where there is quite a lot of uncertainty involved) developing systems that want to be everything to everybody.

Karel Kravik: “When Webmedia was a little less than half a year old and I had just finished my very first project there, we started a ‘universal content management system’, where everything was described with 3 tables. Later we found that we can describe any database with just 3 tables and even later when the XML hype started we saw that the system could be made even simpler – just one table, one column and one record containing all the XML.”

Weebl: “How rare.”

Stories from the wild - Changelogic

  1. Changelogic started out as an in-house pet project for change management; the first version was 3 screens (change list, adding form, editing form) and was called Arendusweb (Development web in Estonian).
  2. It slowly gained future value as people used it, so we saw it had some potential; we added an English version, task and release management processes and a new design for the user interface, and the new name Changelogic was introduced.
  3. During the previous year Changelogic made a huge investment into intangible values – the client side was rewritten from scratch in Java (it was in Ant XML/Javascript before), it was redesigned so that it would be possible to support other version control tools without much hassle, it’s a lot more maintainable now, it’s a whole lot easier to test the client with automatic tests, the logging is improved, web services were ported to WSDL. Almost nothing that would please our users directly, but it was simple to implement Subversion support after that – a tangible value.
  4. During step 4 (ongoing efforts) we plan to add support for other branching models, configurable workflow, etc.

Karel Kravik: “At almost every meeting someone bashes me for not including some Webmedia-specific properties in tasks or things like that. My answer has always been ‘no’ because that would make it applicable only in-house and cut its future value.”

Disclaimer: the curve on the chart is nothing but a curve, a speculation of what it might look like.

Stories from the wild - Araneaframework

  1. Aranea too started as a quick need for some framework in project X-Gate, it had basic session management, authentication and things like that.
  2. After X-Gate was finished, the interfaces were cleaned up, bugs were fixed and the package names were changed from xgate to JWLF (Java Web Layer Framework – don’t blame yourself if you haven’t heard about it).
  3. At the time some other guys felt an urgent need for a library that would make it easier to build standard lists and forms, so you don’t have to paste around the code that displays 10 rows in a list and adds pages if there are more than 10 records to show.
  4. Later the lists&forms library was merged with JWLF into something called WM-UI (Webmedia User Interface tools), which cut the nasty and resource-hungry XML/XSL transformations coming from JWLF, added JSP support and didn’t expect you to use EJB.
  5. And now the whole project has been refactored a hundred times, cleaned up, merged with more dirty code and cleaned up once again, documented, abstracted, generalized and is about to be released as Araneaframework 1.0.

Disclaimer: I’m sure people who have actually written code using the frameworks I named, will consider my interpretation oversimplified, so feel free to correct me.

It’s about the direction

It’s actually not very important how you position yourself on the chart I gave; it’s enough if you can choose the quarter.

However, what is important is the direction where your next decisions will take you and if you’re happy with your new risk profile.

September 1st, 2006  Posted by Karel Kravik

Agile Manifesto is a journey, not the destination

We cannot get started without making clear what the Agile Manifesto is. So let’s get that out of the way quickly. The Agile Manifesto is a document that states we should prefer:

  • Individuals and interactions over processes and tools
  • Working software over comprehensive documentation
  • Customer collaboration over contract negotiation
  • Responding to change over following a plan

Being a curious person I tend to ask “why?”. Does following these principles make your process agile? If you want to be agile, you probably have to deploy these techniques, but are they enough?

It depends on how you define agile:

Brian Marick, Agile software development and Glade air freshener: “While I like the word “agile” as a token naming what we do, I was there when it was coined. It was not meant to be an essential definition. It was explicitly conceived of as a marketing term: to be evocative, to be less dismissible than “lightweight” (the previous common term).”

James Bach, Who Stole Agile?: “It was always my understanding that “agile” meant agile. The Agile Manifesto looks to me like an enumeration of the factors that allow software development to be agile. That’s why I like the manifesto.”

So if you define “agile” as the marketing form of lightweight, you’ve probably done enough when you’ve followed the Manifesto. But if you want agile in its dictionary sense, that is, quick, the Manifesto can show you the way, but not take you to the destination.

As marketing can assign some good-sounding label to anything and find blind followers, we’re not interested in the first definition. We’d rather take a look at why we should produce software fast. Why exactly is faster better? Why should we take agility further than the Manifesto states?

Karel Kravik: “There’s a simple truth behind the scenes but nobody has yet stated it in that way.”

Software has time value

Time value is best explained with a money example: €100 now is worth more than €100 in a year – if you had €100 now, you could put it into a deposit and earn interest during the year; that is, if the interest rate were 10%, you would have €110 at the end of the year. Yet there is another factor that eats away your interest earnings – inflation. Inflation is the tendency of money to lose value over time. So if inflation during the same year is 3%, you end up with €110 that has the purchasing power of only about €106.8.
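For those who prefer to see the arithmetic spelled out, here is a minimal sketch of the same example. The function names are my own for illustration; the rates (10% interest, 3% inflation) are the ones used in the paragraph above.

```python
def future_value(amount, interest_rate, years=1):
    """What a deposit grows to with yearly compounding."""
    return amount * (1 + interest_rate) ** years


def purchasing_power(nominal, inflation_rate, years=1):
    """What a future sum is worth in today's money."""
    return nominal / (1 + inflation_rate) ** years


nominal = future_value(100, 0.10)       # about 110 euros after one year
real = purchasing_power(nominal, 0.03)  # about 106.8 euros in today's money

print(f"nominal: €{nominal:.2f}, real: €{real:.2f}")
```

Note that the real value comes from dividing by 1.03, not from subtracting 3 percentage points; that is why the result is about €106.8 rather than €107.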

Wikipedia thoroughly explains the time value of money.

Now picture you’re a custom software developer. Your clients order software from you because:

  • They can cut more costs than they pay for your software
  • They can earn more money with software than they pay for it

People don’t order software because they find it fashionable to use it instead of holding paper documents in archives; they lose money while not having it.

If you’re a software product developer, it’s even simpler – as long as you cannot sell your product, you’re burning cash developing it.

Basically, not having software costs money.

It may not be simple, but if we really wanted to, we could calculate the expenses Hansabank would have if all the payments now going through the Internet bank were made in bank offices. We would only have to take the number of payments and the average cost of a payment which, in turn, depends on the average salary of cashiers in different regions…

Imre Lumiste, Changelogic technical lead, also engaged in job interviews: “That’s a good impossible question.”

Karel Kravik: “And here is a rhetorical question – have you ever asked yourself why we see so much beta software nowadays?”

We see beta products because they’re there to gain mindshare, to get publicity, to build community around the product. Now when you want to calculate future value of money, you have to know the interest rate, but can you predict future mindshare?

You don’t know the interest rate

Here the analogy ends; it’s not nearly as simple to calculate the future value of software, in fact, it’s impossible. What makes it impossible are the things you don’t know and cannot predict – you are aware of neither the risks you’re taking nor the optimal criteria for setting your priorities.

Karel Kravik: “Imagine you put €10M into deposit and you don’t know the interest rate – would you prefer taking your interest yearly, monthly or weekly all other circumstances being the same? Or maybe even daily?”

The point I’m trying to make is that as long as you don’t have the software (or feature), you don’t know how it will affect the future value. Now it’s your choice whether you want to find out once a year or once a week - the fundamental difference is that the wider the time window, the bigger the risk you’re exposing yourself to.

You’re exposing yourself to the possibility that maybe the customers don’t like your software; maybe it doesn’t satisfy their needs nor match their expectations. So they end up not buying it any more, in turn meaning no money for the developers. And isn’t it rational to find that out after a month rather than a year?

Here is the place where all the fluff about agility and adaptability starts to make sense – rather than trying to predict the future value long ahead, you check it periodically, the shorter the period the less you have to predict.

Karel Kravik: “Just like long development in isolated branches is more likely to result in conflicts during merge, long development without a reality check is likely to end up with conflicts between your opinion of what is good and your customers’ expectations.”
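One crude way to put a number on that exposure: assume that everything spent between two reality checks could turn out to be built on a wrong guess. That assumption, the function name and the weekly burn figure below are all hypothetical, so treat this as a sketch of the reasoning, not a risk model.

```python
def worst_case_exposure(weekly_burn, weeks_between_checks):
    """Money spent on unverified assumptions before the next reality check."""
    return weekly_burn * weeks_between_checks


WEEKLY_BURN = 10_000  # hypothetical team cost per week, in euros

for label, weeks in [("yearly", 52), ("monthly", 4), ("weekly", 1)]:
    at_risk = worst_case_exposure(WEEKLY_BURN, weeks)
    print(f"{label:8} check: up to €{at_risk:,} at risk")
```

The burn rate is the same in all three cases; only the amount you can lose before noticing changes.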

There is one more factor that comes into play when we’re calculating time value – inflation.

You don’t know the inflation rate

The other essential component when calculating future value of money is inflation – the rate by which money loses its value.

By default your software loses value too; it loses value even if it’s not ready. And it may never even get ready, because your competitor brings out a superior product half a year before you.

So when you’re planning a release you have to draw a line somewhere – feature A goes in, feature B doesn’t, can we afford refactoring or not; you have to draw the line what is “good enough”.

The problem is that “good enough” is perceived differently by developers and the people who really use the software. Developers like it if the code is elegant, abstracted to the right level, optimized to the right level, readable and maintainable to the right level. A programmer is actually never quite satisfied with the code; he can tinker with it forever.

Now from the standpoint of business people – they are satisfied if it looks beautiful and does what it should do, they don’t care exactly how ugly it might look to a developer.

The twist is that if you focus too much on, say, maintainability, it could happen that you will never have to maintain it because there will be no chance to sell it in the first place. Focusing on technical perfection does not increase your user base.

Karel Kravik: “It’s like taking a loan with a 10% interest rate and putting it into a deposit with a 3% yearly interest rate.”

To turn that interest rate differential in your favor you have to be lightweight; you have to operate cheaply; you have to be really focused on the core concept and push it out as soon as you can; as soon as it is “good enough”.

Karel Kravik: “Just as developers occasionally write temporary code to test technical concepts, software products or individual features can be considered as tests too – tests of business ideas. There is no point in designing them for the long run until you are sure they’ll survive.”

The Manifesto

While the Manifesto gives us some clues about how to be more lightweight, that is, cheap (and if development is cheaper, we can probably release faster too), it doesn’t explicitly state the reasoning behind this; neither does it tell us how to be quick.

Karel Kravik: “If I had to sum this post up in Manifesto-like manner it would look like this:”

  • Working software now over working software in the future
  • Frequent reality checks over long predictions
  • Good enough over perfect

August 23rd, 2006  Posted by Karel Kravik

Is refactoring economically justified?

Villu Ruusmann, programmer, Changelogic: “Why are we committing these jar files to CVS? This is such a waste! We should keep them in a Maven repository.”

Karel Kravik, product manager, Changelogic: “Why? How much does this extra 200KB cost?”

Villu Ruusmann: “C’mon man, this just isn’t an elegant solution! It needs serious refactoring.”

Karel Kravik, mumbling angrily: “Yeah, exactly. Elegance. Refactoring. J2EE, XML, BMW.”

What do we know that Villu doesn’t? And what does Villu know that we don’t?

Villu knows that setting up a Maven repository and configuring the project to use it will take 8 hours.

What he doesn’t know is that 8 hours of his time will cost us, let’s roughly guess, €100. What he could calculate, but it doesn’t come to his mind, is that 200KB of disk space costs as little as €0.00057 (assuming you can get 70GB for €200, probably even cheaper since I don’t follow the prices). Even if we store that 200KB file in CVS 3000 times, it still costs us - only - €1.7.

So where exactly is the elegance in paying €100 to save €1.7?
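Here is the back-of-the-envelope calculation in runnable form. The prices and sizes are the ones guessed above; the one assumption I had to add is decimal units (1GB = 1,000,000KB), which is what reproduces the €0.00057 figure.

```python
EUR_PER_GB = 200 / 70            # "70GB for €200"
FILE_SIZE_GB = 200 / 1_000_000   # 200KB, assuming decimal units

cost_per_copy = FILE_SIZE_GB * EUR_PER_GB
cost_of_3000_copies = 3000 * cost_per_copy
refactoring_cost = 100           # 8 hours of Villu's time, a rough guess

print(f"one copy:    €{cost_per_copy:.5f}")
print(f"3000 copies: €{cost_of_3000_copies:.2f}")
print(f"refactoring costs about {refactoring_cost / cost_of_3000_copies:.0f}x more")
```

With binary units the storage comes out slightly cheaper still, so the conclusion does not depend on which convention you pick.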

Alternative cost

Poor Villu, we are still not finished with him. What can Villu do with 8 hours?

Imre Lumiste, senior programmer, Changelogic: “He could fix some bugs.”

Karel Kravik: “Oh really. What bugs do we have?”

The Answer I Would Like To Hear: “We have here bug number 3828, which is causing every 10th client to turn down our software. Of course everybody knows that an average client will bring us €5000 a year. At the moment we have 10 potential clients per day, so we’re losing €5000 every day. The problem is identified; it takes 8 hours to fix.”

But what is the real answer? It goes something like this:

Imre Lumiste: “We have here bug number 3828. This is some strange memory leak in our transaction engine that is probably caused by miscommunication with the database driver, so that every once in a while a transaction times out. We have to investigate it carefully and maybe even contact the vendor support.”

Mmm-mmm, what else can we do with €100?

John B., marketing director: “Hey man, this is our moment! I just got an offer to advertise our product on the first page of Arvutimaailm. And it’s only €99.”

Suppose I am not a very technical person (that’s sad, but true) and the memory leak leaves me somewhat untouched as I didn’t get the Answer I Would Like To Hear, so I decide to spend my €100 on marketing and buy the advertisement in Arvutimaailm (Computerworld, Estonian PC Magazine).

What I don’t know yet is that the CEO of Krakozhia Programming Farms, Slobodan Skradin (older brother of Milan Skradin, CIO of Krakozhia Telecom, which outsources all development to Krakozhia Programming Farms), will find this issue of Arvutimaailm on the plane seat next to him on his flight from London to New York. As the Changelogic commercial is the only dang thing in a language he can understand, that is, English, the rest of the content being in some strange Eastern European dialect called Estonian, it grabs his attention.

The rest is history. After a month he signs an agreement to purchase no less than 3500 (it is what the name suggests - a farm!) Changelogic licenses along with another 650 grand worth of consulting, customizations and on-site training.

This is the single biggest software deal in Estonia since the Skype founders got rich, which, in turn, doesn’t count because they sold their company, not their software.

Karel Kravik, in the cover story of Äripäev (Business Day, Estonian Wall Street Journal): “I’m telling my team every day that you must have clear priorities set.”

Nassim Taleb, Fooled By Randomness: “Lucky fool.”

We are option blind

Backspin.

Mister Krakozhia Programming Farms missed that plane. Arvutimaailm was never left on the seat next to his: a jammed printer caused a small delay, so the plane left before the first copies of Arvutimaailm were delivered to subscribers. So it didn’t even reach the guy who could have left it on that plane. Nobody else looked at the commercial longer than it takes to turn the page.

Backspin.

You’re still at the point where you have to make the decision. You don’t know which scenario will come true; you can only guess.

Steve Jobs, Stay Hungry, Stay Foolish: “You can’t connect the dots looking forward; you can only connect them looking backwards. So you have to trust that the dots will somehow connect in your future. You have to trust in something - your gut, destiny, life, karma, whatever.”

Nassim Taleb: “[This is what I call] Narrative Fallacy. Creating a story post-hoc so that an event will seem to have a cause.”

Problems with optimal choices

Firstly, people are probably not even aware of the options they can choose between. For example, I’m pretty sure most of the technical staff will insist on fixing the bugs before continuing with new developments. Basically that is not bad, but there still has to be an economic justification. Some seasoned development leads choose between bugs and new features, but they never think that maybe they could use the available resources in some area they aren’t involved in as part of their day job, for example marketing.

Karel Kravik: “I have seen people fixing a ‘serious flaw’ which consists of optimizing by 10% something that takes 0.5% of overall request time, happily unaware of whatever takes the remaining 99.5%.”
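The numbers in that quote are worth multiplying out. A minimal sketch follows; the function name is made up, and it assumes the improvement simply scales with the share of total request time the optimized part takes.

```python
def total_time_saved(share_of_total, local_improvement):
    """Fraction of overall request time saved by speeding up one part."""
    return share_of_total * local_improvement


# The "serious flaw": 10% faster on something that is 0.5% of the request.
flaw = total_time_saved(share_of_total=0.005, local_improvement=0.10)

# The same 10% effort applied to whatever takes the remaining 99.5%.
rest = total_time_saved(share_of_total=0.995, local_improvement=0.10)

print(f"optimizing the 0.5% part:  {flaw:.2%} of request time saved")
print(f"optimizing the 99.5% part: {rest:.2%} of request time saved")
```

The same 10% improvement is worth about two hundred times more when it is aimed at the part that actually dominates the request.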

Secondly, people are not aware of the optimization criteria; they cannot make the best decisions simply because they don’t even come close to being aware of what the right optimization criteria might be. For techies it may be some kind of abstraction, optimization, refactoring, the Elegance of the solution.

However, when your product ships, it’s very likely that nobody cares about the Elegance, unless the Elegance is part of the solution, something included in your sales pitch. If you optimize a video game to run really fast, it’s justified; saving 200KB of disk space nowadays is not, unless you’re optimizing Linux to run from a 1.44MB diskette.

Karel Kravik: “Sometimes this is due to poor communication, but during a lot of debates I’ve been holding, people simply do not accept the argument of economic justification.”

Raakel Kaakiv, Karel’s alter ego: “Maybe this is because the hype is blind: you either refactor or you don’t, 1 or 0. Actually there is a full scale between them. What complicates the problem is that the scale is highly context-dependent; what works in one project or team won’t do in others.”

Rakuke Kakuke, another of Karel’s alter egos, a better one: “Typically innovative things are not economically justified from day one.”

Raakel Kaakiv: “Would you be still in business after a year if all you do is innovation? Would the decision be simpler if you knew that your chances of success are 1:1 000 000 when you’re innovating and 1:100 if you spend your resources in a bit more risk averse way?”

Rakuke Kakuke: “Define what is more risk averse! I see no risk because I haven’t put a single penny in this business.”

Let’s leave these guys arguing, as usual, and see what would happen if we knew the right optimization criteria.

Thirdly, people are not very good at evaluating the options they have. That is not because these people are incompetent or plain foolish, but because the outcomes are often unpredictable. Even worse, they represent the kind of unpredictability where you cannot calculate your odds.

And still, at the end of the day, you have to make choices.

Despair Inc, Indifference: “It takes 43 muscles to frown and 17 to smile, but it doesn’t take any to sit there with a dumb look on your face.”

Making it easy

Karel Kravik: “Just bringing out the key points this messy rant was about.”

So when you have to choose between multiple options ask yourself:

  1. Is this the full set of options? Do you see the whole picture?
  2. What are the actual criteria you’re evaluating your options against? Do you see the whole spectrum of criteria?
  3. Can you evaluate the alternatives? Can you predict the whole chain of events that will unfold after your decision?

Here is a little hint about the ultimate optimization criteria: 

Joel Spolsky: “The real goal for software companies should be converting capital into software that works.”

4 comments  August 14th, 2006  Posted by Karel Kravik  

Software development is risk taking

With this post I’ll open the next top-level topic - handling uncertainty in software development.

Let me list some risks from an average software development project:

  • Will the functionality be useful for our customers? (“user risks”)
  • Will there be showstopper bugs? Problems with updates? (“quality risks”)
  • Will we hit the due date? Will we be in budget? (“resource risks”)
  • (this is not a full list; you can come up with your own items)

I’m lining them up in this order because they depend on each other in the same order. If users don’t find our product useful, it cannot be considered a quality product, and we will most certainly have to spend more money to make it better. On the other hand, if we are short of resources, our product probably won’t be as good as we’d expect.

Now suppose you have exposed yourself to these risks and run into problems.

Jason Fried: “[You can create] Less Software. Less Software allows you to distribute your time and energy across less features. More attention to less stuff will make that less stuff better.”

Karel Kravik: “Sometimes cutting the scope really is an option. But it’s more usable in the product development world and not always possible in custom software development, where you have your functionality fixed in a contract.”

Angry reader: “I didn’t know this kind of contract still existed!”

Karel Kravik: “Sorry, they do. Go to your Department of Defense and say you’re now going agile. Gimme €15M and let’s see what we can do.”

The application management case

So when we’re starting to develop or maintain software, we’re always taking risks. Sometimes bigger, sometimes smaller, but there is always some amount of risk.

Milan Skradin, CIO, Krakozhia Telecom: “Downtime of my applications cannot be more than 8 hours in a year under any circumstances. I’m losing thousands an hour.”

Branko Slavic, project manager, Krakozhia Telecom: “I need to deliver new functionality at least every month.”

Yes, the point can best be made with software maintenance. Picture yourself signing an SLA under these conditions.

Mihajlo D., release manager, Krakozhia Telecom, comments with an evil grin: “Keep in mind that every update takes at least one hour. This is the absolute minimum.”

Although this post is turning out to be more like a play, we must give the floor to Kristjan Kanarik, project manager, Webmedia: “The only solution is that every update must be successful. We go through rigorous test procedures as usual, no, heck, we’ll double these efforts. We have well-documented acceptance and release procedures, we have excellent people. We can do it.”

Andrus Tamboom, perfectionist, Webmedia: “We’ll write a test robot.”

Karel Kravik, Changelogic salesman: “They use Changelogic to manage the process.”

Hubert, La Haine: “Heard about the guy who fell off a skyscraper? On his way down past each floor, he kept saying to reassure himself: So far so good… so far so good… so far so good. How you fall doesn’t matter. It’s how you land!”

Of course, you do everything to keep the risk down. Documented procedures, senior staff, doubled hardware. But a few minutes can be all it takes to breach this SLA.

The problem is that you’re depending on some random outcome. We may argue that it’s not entirely random, yes, it’s just so complex that to the naked eye it’s indistinguishable from randomness. Suppose some programmer (who wished to stay anonymous) made a mistake writing the shutdown procedure for the database engine you’re using. So the database shutdown hangs for 14 minutes, then comes a cold reboot and a recovery from backup: an hour and a half altogether instead of the usual 2 minutes.
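To put rough numbers on this, here is a minimal sketch using only the figures from the story: the 8-hour yearly downtime limit, the usual 2-minute shutdown and the hour-and-a-half incident. The function names are made up for illustration, and it assumes all downtime is counted equally against the SLA, which the story does not spell out.

```python
# Figures taken from the story above; everything else is illustrative.
ANNUAL_BUDGET_MINUTES = 8 * 60   # SLA: no more than 8 hours of downtime a year
USUAL_SHUTDOWN_MINUTES = 2       # the usual database shutdown
BAD_SHUTDOWN_MINUTES = 90        # "an hour and a half altogether"


def budget_share(outage_minutes, budget_minutes=ANNUAL_BUDGET_MINUTES):
    """Fraction of the yearly downtime budget consumed by one outage."""
    return outage_minutes / budget_minutes


def incidents_until_breach(outage_minutes, budget_minutes=ANNUAL_BUDGET_MINUTES):
    """Smallest number of equal outages whose total exceeds the budget."""
    return budget_minutes // outage_minutes + 1


print(budget_share(USUAL_SHUTDOWN_MINUTES))          # share used by a normal shutdown
print(budget_share(BAD_SHUTDOWN_MINUTES))            # share used by the bad one
print(incidents_until_breach(BAD_SHUTDOWN_MINUTES))  # 6
```

One such incident uses up almost a fifth of the year’s allowance, and the sixth one breaches the SLA, no matter how good the test procedures were.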

It has nothing to do with the professionalism of your team or the quality of your testing procedures. But it’s very hard to argue it was force majeure, too. Nor does it give you any means to justify the bad performance in your annual report.

You cannot observe the generator

No formal methods exist to manage this kind of risk; this is the kind of randomness where you don’t even know what your chances are.

Nassim Taleb, Fooled By Randomness, basing his example on Russian roulette: “Unlike a well-defined precise game like Russian roulette, where the risks are visible to anyone capable of multiplying and dividing by six, one can not observe the barrel of reality. Very rarely the generator is visible to the naked eye.”

This is what he calls “unstructured randomness”, contrasting it with structured randomness like we see in, say, gambling, where you typically can calculate your odds.

Or to put it even more simply: we don’t know what we don’t know.

Probably the term “random” is what makes it really hard to accept, so let’s rather use the term “unpredictable”; it doesn’t change the result.

Karel Kravik: “Take any of the risks I mentioned at the beginning, call me an amateur, but I say they are quite unpredictable. Users can change their mind, no software can be called bug free, estimates are rarely correct.”

My coworker, whispering: “I can almost hear the angry crowd.”

Karel Kravik: “What’s the problem? If they have accepted agile development for the right reasons, that is, because of better risk management and not because they can write less documentation, they’re already accepting that software development is somewhat unpredictable.”

Martin Fowler, The New Methodology: “Predictability is a very desirable property. However if you believe you can be predictable when you can’t, it leads to situations where people build a plan early on, then don’t properly handle the situation where the plan falls apart. You see the plan and reality slowly drifting apart. For a long time you can pretend that the plan is still valid. But at some point the drift becomes too much and the plan falls apart.”

Martin Fowler: “Usually the fall is painful.”

Why waterfall doesn’t work

We had an example, quite an extreme one, from software maintenance, but let me bring another one from the development world too.

Nassim Taleb, on ITConversations, author’s free transcript: “To predict the trajectory of one billiard ball - trivial, two - it’s OK, but to predict nine you need to take weight of every person in a room into account, because of gravity effect. […] To predict up to 53, you need to take into account every single particle in the universe.”

Karel Kravik, project manager: “Here is a list of about 400 tasks. We have to give estimates. This is a plan for our next year and a half in project Changelogic.”

Imre Lumiste, senior programmer: “Hmm, OK. Let’s say we implement the 10 with the highest impact first; should I be able to estimate how long the next 10 after these will take (and so on), considering they depend on each other? Of course I can come up with some numbers, but they ain’t any better than your own estimates. Or those of any other guy from the hallway.”

Karel Kravik, wannabe Dilbert character: “Be professional.”

It could be your project plan.

If nothing more, this is one single example of where your predictions are shaky at best. But with a bit of imagination, you’d probably find it also explains why there is so much agility hype out there instead of the highly predictive waterfall.

There will be no grande finale here about the Changelogic guys inventing new risk management methodologies. Instead, maybe the next time you take on responsibilities, think about how big the part you don’t know about may be, and how surprising a turn your venture might take because of the things you didn’t see coming.

Or at least how much you may be underestimating the error rate of your predictions.

2 comments  August 7th, 2006  Posted by Karel Kravik  

Changelogic’s branching model

In a previous issue we discussed the evolution of branching models and identified 4 levels of sophistication, but we didn’t solve the problems occurring on Level 2.

In this issue we take a look at how the Changelogic branching model copes with these problems.

The problems when using the “branch-off on release” model were:

  1. The difference between mainline and release line may become significant, and multiple merges bring even more complexity
  2. Customers cannot handle the situation where bugs are fixed in production but not in the latest test version
  3. Developers are neither able nor willing to match the risk associated with a change to the right codeline

Sergey, developer: “I committed the code and sent you the time report. My job is done. I’m pretty tired, I’ll go home now.”

Karel Kravik, release manager: “Where did you commit it?”

Sergey, looking angry: “Whaddaya mean where?? Into CVS!”

To relieve the first two problems listed above, Changelogic employs a technique we call “continuous forward integration”. Don’t confuse it with “continuous integration”, which is a much broader term.

Continuous forward integration

Basically, if you apply continuous forward integration to the branch-off on release model, you get a situation where every bug fix committed to a release line is also automatically merged into mainline. As a result, mainline has every bug fix that is available in the release lines from the moment it is committed, and may also have some new functionality.

As the bug fixes are usually quite small pieces of code touching only a handful of files, the double integration does not need much effort. If there are conflicts, they are easier to solve compared to the situation where you have a lot of different bug fixes and you don’t exactly know which line of code belongs to which fix.
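As a toy illustration of the idea, here is a minimal sketch in which a codeline is just an ordered list of change identifiers. The class and method names are invented for this example and say nothing about how Changelogic implements it; real merges deal with file contents and conflicts, which this model ignores.

```python
class Codeline:
    """A codeline reduced to an ordered list of change identifiers."""

    def __init__(self, name, changes=()):
        self.name = name
        self.changes = list(changes)

    def apply(self, change):
        # Applying the same change twice is a no-op in this toy model.
        if change not in self.changes:
            self.changes.append(change)


class Repository:
    """Hypothetical repository with one mainline and named release lines."""

    def __init__(self):
        self.mainline = Codeline("mainline")
        self.release_lines = {}

    def branch_release(self, name):
        # The release line starts from mainline's current history.
        line = Codeline(name, self.mainline.changes)
        self.release_lines[name] = line
        return line

    def commit_to_mainline(self, change):
        self.mainline.apply(change)

    def commit_bug_fix(self, release_name, fix):
        """Commit a fix to a release line and forward-integrate it at once."""
        self.release_lines[release_name].apply(fix)
        self.mainline.apply(fix)  # the forward merge, done per fix


repo = Repository()
repo.commit_to_mainline("feature-A")
repo.branch_release("release-1.0")
repo.commit_to_mainline("feature-B")          # new functionality, mainline only
repo.commit_bug_fix("release-1.0", "fix-101")

# Nothing on the release line is missing from mainline.
missing = set(repo.release_lines["release-1.0"].changes) - set(repo.mainline.changes)
print(missing)  # set()
```

The point of the sketch is the last line of `commit_bug_fix`: the merge happens fix by fix, so there is never a pile of unmerged fixes to untangle later.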

Sergey, the angry developer: “You mean that every time I commit something to the release line, I have to merge it to mainline too? Come on, this is not optimal! I’m a programmer, not an integrator.”

Karel Kravik: “There is a separate role called integrator, but you may also adopt the ‘merge your own code’ pattern.”

Branko Slavic, project manager, Krakozhia Telecom: “At least if a bug is fixed now, it IS fixed unconditionally, I don’t have to mess around and look for the branches where it’s fixed and where it’s not. Or if it’s fixed at all.”

Karel Kravik, Changelogic salesman: “Actually there is a functionality called ‘version differences’ in Changelogic that enables you to find out which changes were integrated between any two versions of your software.”

Early integration

Continuous bugfix propagation also promotes early integration as I understand it.

Karel Kravik, configuration manager: “I think the whole point of early integration is that the earlier you have a likely release package assembled, the earlier you can tackle the possible issues (compare it to the cherry picking philosophy, where you choose the changes shortly before release).”

Or to put it in other words: the earlier you discover problems, the cheaper they are to fix.

I don’t know where the problem lies, but having spoken to many people about early integration, I found that to them it means having no branches at all and committing (integrating) code as soon as there is any code.

Agile developer: “Have you read James Bach? He says: if it exists, I want to test it.”

Karel Kravik: “No problem, if you want to cooperate with a tester, share your private environment or private branch.”

The way I picture software development, changes with different risk levels cannot all be integrated into one line just because we misunderstand early integration. I fully support early integration in the sense I described: once you have established a release line, you should send your current most likely release candidate to testing, even knowing it’s not really there yet.

Which brings us to risk levels.

Risk levels (the tofu scale)

Laura Wingerd, Practical Perforce: “Release codelines are highest on the tofu scale; they are firm. They don’t change much, and even the slightest changes to them can impact release schedules because of their rigorous review and testing requirements. Development codelines are soft - they’re changing rapidly, the software in them is farthest from release, and there may not even be tests yet for their newest development.”

Karel Kravik: “Along with assessing the codeline’s ability to absorb risk, we should assess the risk of changes and match them.”

If we think about the changes flowing into our application, we see that not all of them have the same impact on it. If we generalize a little, we get the following risk levels (one can always add more, but this is a likely classification):

  • experimental
  • new functionality
  • not critical bug fix
  • critical bug fix

On the other hand we have release lines with various maturities:

  • new development
  • stabilization
  • accumulated maintenance
  • critical bug fix

All we have to do is recognize the risk level of a change and match it to the release line where it should go. It’s not a one-to-one relation, and there may be some outside constraint like, say, a customer’s wish (“I want to see the contracts list in next week’s release”), but the mapping is not so hard either. It would probably look like this:

  • experimental change goes to new development line
  • new functionality goes there too
  • not critical bug fixes go to stabilization or accumulated maintenance line depending on where the bug is found
  • critical bug fixes go to stabilization or critical bug fix line, again, depending on where the bug is found
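The mapping above is small enough to write down as a lookup table. This is only a sketch of the list as stated; the names `TARGET_LINES`, `candidate_lines` and `is_allowed` are invented for the example, and outside constraints such as customer wishes are left out.

```python
# Risk level of a change -> maturities of the release lines it may go to,
# copied from the list above.
TARGET_LINES = {
    "experimental": ["new development"],
    "new functionality": ["new development"],
    "not critical bug fix": ["stabilization", "accumulated maintenance"],
    "critical bug fix": ["stabilization", "critical bug fix"],
}


def candidate_lines(change_risk):
    """Return the line maturities a change of this risk level may target."""
    try:
        return TARGET_LINES[change_risk]
    except KeyError:
        raise ValueError(f"unknown risk level: {change_risk!r}") from None


def is_allowed(change_risk, line_maturity):
    """True if a change of this risk level fits a line of this maturity."""
    return line_maturity in candidate_lines(change_risk)


print(is_allowed("experimental", "stabilization"))      # False
print(is_allowed("critical bug fix", "stabilization"))  # True
```

Where a risk level maps to two lines, the choice between them depends on where the bug was found, so a check like `is_allowed` can only reject clearly wrong targets, not pick the right one.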

There is one more detail we should pay attention to: a release line’s maturity changes over time, beginning with new development and ending with closed, which comes right after the “critical bug fixes only” level, with the difference that we don’t support the line any more.

We’re practicing what we preach - let me bring you an example from project Changelogic itself (that is managed in Changelogic too). At this very moment we have the following developments:

  • releases prior to 1.29 are not supported
  • releases 1.29 and 1.33 get only very critical bug fixes
  • release 1.36 is being actively maintained, including some minor new functionality
  • release 2.1 is being stabilized, meaning it’s in production usage in-house
  • release 2.2 is where the new functionality goes
  • there are also some open changes containing experimental code
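The snapshot above can be read as each release line sitting at some point of the same lifecycle. Here is a minimal sketch of that reading; the `Maturity` enum and `advance` function are invented for illustration, and treating release 1.36 as “accumulated maintenance” is my assumption based on the description “actively maintained”.

```python
from enum import IntEnum


class Maturity(IntEnum):
    """Release line maturities in the order a line passes through them."""
    NEW_DEVELOPMENT = 0
    STABILIZATION = 1
    ACCUMULATED_MAINTENANCE = 2
    CRITICAL_BUG_FIX = 3
    CLOSED = 4


def advance(maturity):
    """Move a line one step along the lifecycle; closed lines stay closed."""
    return Maturity(min(maturity + 1, Maturity.CLOSED))


# The Changelogic release lines listed above. Releases prior to 1.29 are
# closed and therefore not tracked here.
RELEASES = {
    "1.29": Maturity.CRITICAL_BUG_FIX,
    "1.33": Maturity.CRITICAL_BUG_FIX,
    "1.36": Maturity.ACCUMULATED_MAINTENANCE,  # assumption, see lead-in
    "2.1": Maturity.STABILIZATION,
    "2.2": Maturity.NEW_DEVELOPMENT,
}

for release, maturity in RELEASES.items():
    print(release, maturity.name, "->", advance(maturity).name)
```

The lines only ever move in one direction, which is why the set of lines open to risky changes keeps shifting towards the newest releases.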

If you are looking for a moral here, it could be something like this: differentiate between change risks and integrate changes early into the right release lines, and, if you like, try out continuous forward integration in Changelogic.

Of course, this model does not come without cost; in the next essay we’ll cover parallelism, isolation and another agile practice, “keep it releasable”.

Add comment  August 2nd, 2006  Posted by Karel Kravik  

Evolution of branching models

This article is intended as an introduction to a longer series about version control, and branching in particular. Along the way I’ll also explain why things in Changelogic work the way they do.

I’ll make a rough classification of project organizations here based on the sophistication of version control usage:

Level 0 - not using version control

Karel Kravik: “Honestly, I cannot imagine how any serious development can be done without it, but still, it happens.”

Level 1 - using version control, no branches

If we’re talking in CVS terms, all the development happens in MAIN and we only care about HEAD. MAIN and HEAD are CVS’s reserved tags for its internal bookkeeping, MAIN denoting the default branch and HEAD the very latest revision of any file we have in MAIN.

This is where almost everybody begins.

The first problem arises when we’ve made our first release and are now proceeding with new development. As a rule, customers do find something they really really want to see fixed in that first release; usually there are plenty of these things. Now we face three options:

  • we fix the bugs and hand over the release WITH the new alpha stage functionality we just started
  • we let the customer live with the bugs until the alpha functionality is stabilized and do the release ASAP
  • if it’s not the first release, it may be possible to roll back the last release to the previous one, until the new functionality stabilizes

Branko Slavic, project manager, Krakozhia Telecom: “Hey guys, don’t tell me it’s not possible. I want this bug fixed. Find a solution.”

Mihajlo D., release manager, Krakozhia Telecom: “COME ON YOU LAZY … (inaudible - ed.), FIX IT UP, I DON’T CARE HOW!!!”

None of these options could be considered professional software development: the latter two leave the client unsatisfied, and the first one drives us into a vicious circle where we find new bugs shortly after the release, fix them fast, release fast, and find ourselves back at the beginning, because in such a rush we certainly introduce new bugs.

Taavi Kotka, Webmedia: “The more your software sucks, the more nights you need to fix it. You start with coffee, at some point it just doesn’t taste any more. Then you take caffeine pills, 6 pieces at 6 am keeps you going at least half of the next day. But still, I believe the ultimate solution is energy drink.”

Anton Masik, occasional writer: “I just love the chemical taste of it.”

Usual workaround: we introduce some kind of freeze, be it a code freeze or a functionality freeze or whatever you call it; generally, a stabilization period. It helps a bit: the released software is of higher quality, and for some time we can fix the bugs and support production. But sooner or later we have to start on new functionality, and if production still needs support, we’re again choosing from the three options described above.

Freezes have another serious shortcoming: as stabilization rarely consumes all the resources, some of our developers end up sitting there doing nothing.

Level 2 - branch-off with release

This is perhaps the most broadly used solution as we can see it almost everywhere; maybe with little nuances, but the point remains the same. You can easily notice its popularity when looking at the open source projects with repositories online.

Branch-off with release means that every time we make a release, we create a new branch right from the point in code history where the release was made. Now we can proceed with developing new functionality in MAIN and collect the bug fixes in the release branch. Somewhere in the future we merge the release branch into MAIN, thus propagating the bug fixes into MAIN too.
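Here is the same thing as a toy sketch, with histories reduced to lists of made-up change names. It is not a model of CVS itself, just of the order of events: branch at the release, let both lines move on, merge later.

```python
# MAIN's history at the moment the release is made.
mainline = ["change-1", "change-2"]

# Branch off: the release branch starts as a copy of that history.
release_1_0 = list(mainline)

# Both lines now evolve in isolation.
mainline += ["feature-3", "feature-4"]   # new development continues in MAIN
release_1_0 += ["fix-a", "fix-b"]        # bug fixes collect on the release branch


def pending_merge(release, main):
    """Changes on the release branch that MAIN has not seen yet."""
    return [change for change in release if change not in main]


def merge(release, main):
    """Return MAIN's history with the pending release changes appended."""
    return main + pending_merge(release, main)


before = pending_merge(release_1_0, mainline)
mainline = merge(release_1_0, mainline)
after = pending_merge(release_1_0, mainline)

print(before)  # ['fix-a', 'fix-b']
print(after)   # []
```

Until the merge happens, everything in `before` exists in production but not in MAIN, and the longer we wait, the longer that list gets.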

Yes, we solve two of the hardest problems here - we can support production and we’re not wasting the resources during code freeze, but soon we still run into little and maybe not so little annoyances:

  • as the release line evolves in isolation, the difference may become quite large and the merge far from trivial
  • if you’ve shipped bug fixes to production and your customer sees a test environment with the pre-merge version, it creates a lot of confusion and becomes one source of the “old bugs come back” problem
  • developers cannot assess the risk of any change and don’t know into which branch it should go. The sad part is that they usually don’t even care.

There is actually one more problem ruining the picture: we might have to do multiple merges from release lines to MAIN, say, before branching off another release line. Doing so, we mess things up quite badly, and if we still want to control the situation, we have to keep track of all the branches and merges manually, which is, as you’ve already guessed, not a pleasant thing to do.

Dave Jones, Buried Alive in Patches: “By the time Linus put out pre1, my tree was 6MB away from mainline.”

Karel Kravik: “It’s not exactly about using CVS, but rather about an analogous branching model. Interesting piece.”

Usual workaround: at this point people usually start dreaming about a perfect version control system that would keep track of everything and let them merge however they want, avoiding dupes and resolving conflicts on the fly. In configuration managers’ jargon it’s usually called “cherry picking”, meaning you can add and subtract changes at will, especially before making a release.

If it doesn’t work, let’s cut it. If the risk level is not appropriate, let’s merge it to another branch.

But this story has a twist: where will it lead if we can (re)package changes at will? It’s not as simple and perfect a solution as it seems to be. Why?

Because there are always some hidden dependencies.

Subtract one change, test, you have to subtract some more, test, now you have to add one to make it at least compile and so forth. You cannot judge the application’s final state by just looking at the code; you have to test it. If you have more changes than you can count on the fingers of one hand, you have a combinatorial explosion in QA, the exact opposite of the common practice called early integration.

Albert Stone, mathematician: “Combinatorial explosion happens when your search space grows as n factorial.”

Lieutenant Albert Stone, Marine Corps: “Picture this - line up a hundred soldiers in a random manner and figure out if they are in order of height; if they ain’t, choose the next random combination…”
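To get a feel for the numbers, here is a small sketch. Counting every way of including or leaving out each of n changes as 2^n packages is my own assumption about how to count cherry-picked releases; the n factorial is the number of orderings, as in the soldiers example.

```python
import math


def packages(n):
    """Ways to include or leave out each of n changes (2 to the power n)."""
    return 2 ** n


def orderings(n):
    """Ways to line up n distinct items (n factorial)."""
    return math.factorial(n)


for n in (5, 10, 20):
    print(f"{n} changes: {packages(n)} possible packages, {orderings(n)} orderings")

# The lieutenant's hundred soldiers: the number of possible line-ups
# has this many digits.
print(len(str(orderings(100))))  # 158
```

Even by the milder count, ten open changes already mean over a thousand candidate packages, and each one has to be tested, not just read.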

Level 3 - anything more complicated

Basically anything more complicated is already a niche model, dedicated to solving problems like platform or locale specific code, distributed development and things like that.

Respectable work on branching patterns has been done by Brad Appleton et al.

In the next piece we’ll be looking at how some of these problems are handled in Changelogic, the configuration management software we’re developing.

Add comment  July 31st, 2006  Posted by Karel Kravik  

Nonlinearities and us

In this weblog (I just hate the word blog; it somehow reminds me of Oracle’s blob, a data type quite impossible to get out of the database, so we’ll call ours a weblog) we are going to write about several topics that will probably, though I’m afraid not always, describe some nonlinearity, asymmetry or plain wicked situation from the software development world, be it usability, project planning or whatever comes to mind with SD. Ever wondered how unpredictable users really are? Or seen that asymmetry between the resources you need and the resources you have?

I’m pretty sure there will be also some entries describing a nonlinearity literally, meaning, for example, source code branching models of various kinds, because that’s what we actually do here. You can find a small list of topics we’re about to cover here: what we do?

The tagline about nonlinearities is itself a loan from a book called “Fooled By Randomness” by Mr. Nassim Nicholas Taleb. Although the book mostly meditates on financial markets, their randomness and how our lives are really driven by chance, it’s for sure an entertaining and thought-provoking read.

By saying “we” I mean the team currently developing a software product called Changelogic. Changelogic is a simple application that integrates task and bug management with version control, giving you change management functionality. It comes with a branching model included, meaning it very much systematizes and automates the way branches for different tasks and releases are created and merged later. If you want to know more about it, check us out at www.changelogic.com.

But back to talking about “us” - by now there are six of us:

  • Karel Kravik, product manager, that’s me
  • Imre Lumiste, technical lead
  • Sander Muru, PHP programmer
  • Villu Ruusmann, Java programmer
  • Sigmar Muuga, PHP programmer
  • Artur Assor, tester

But don’t be afraid that there will be only a few postings; we expect to at least double our team by the end of the year. Maybe a few more words about posting frequency: although by now even my high-school classmate, who works in plumbing, blogs bi-daily, we will at least try to keep the quality bar high and not copy stories from Slashdot, Digg or del.icio.us, but rather try a bit harder and season your valuable time with software development experiences from a not so well-known corner of the world called Estonia, the home of the personal computer Juku.

Besides our product, we’re building a company here too, with the ambitious plan of being the best one to work for in Estonia, even better than Webmedia, which we’re actually part of, or Skype, for example. We’ve promised to reverse the tradition of being invited to Webmedia’s summer days and invite them to ours someday.

So stay tuned.

Add comment  July 14th, 2006  Posted by Karel Kravik  

 
