Friday, March 17, 2006

CM is a multidimensional problem

Traditionally, software configuration management (CM) is concerned with version control of software modules. Many CM systems are very capable of handling source code files and files generated from those sources (derived objects) in workspaces that look very similar to ordinary file systems.
Most of them are also able to control parallel evolution of those files (branching and merging), attach metadata to those files as a status indicator (labels, attributes) and allow automation on particular events (triggers). Only a few CM systems support higher-level concepts such as subsystems/components/modules, projects, streams, baselines and tasks/activities.

In most organizations that I have seen, there is an enormous gap between the CM for "code", the CM for "documents" and the CM for "data" that is neither code nor documents (e.g. a requirements database or a test data set), not to speak of CM for physical objects (e.g. hardware components, mechanical components, books or CD-ROMs).

There are some questions that I find very difficult to answer during development, but that I would expect a configuration manager to be able to answer. Of course, after product release, when all documents are approved and all functionality has been implemented, these questions are easy, but not during development.

  • Here is a test system with this software baseline installed on it. What is the corresponding document baseline that describes how the system should behave and how it should work internally (in other words: which versions of which documents)?
  • Which versions of the hardware boards and of the chips on those boards should we use?
  • Which version of each of the requirements in the requirements database corresponds to this software baseline?
  • Which test cases, test scripts and test data sets should I use for testing, and which version of them?
  • Which tools have been used to make this software baseline, and which version of them? Which tools and which versions of them should I use to analyse the system's behaviour and data output?
  • Which versions of the plans (project plan, configuration management plan, quality assurance plan, integration plan, test plan, etc.) apply?
  • Which problem reports apply, i.e. which ones describe the known problems of this baseline?
Maybe the default answer is: the latest version. But if I am using the software baseline of last week, then the latest versions will not correspond to that software baseline, because some of those versions did not yet exist when the software baseline was created. The latest set of problem reports contains problems that were not yet known at the time of the software baseline, as well as problems that have since been solved but were still unsolved then.

Maybe the default answer should be: the latest version at the time of the software baseline. But possibly (and probably) some of the documents or other objects were out of date at that time, and some of them had already been updated for the next baseline. Of the set of problems solved by that time, some solutions may not (yet) be in that particular software baseline, since the source versions stem from some time before the software baseline.
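The difference between "the latest version" and "the latest version at the time of the software baseline" can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `Version` record, the item names and the dates are all hypothetical and do not come from any real CM tool.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Version:
    item: str      # hypothetical item name, e.g. "design-doc"
    number: int    # version number of that item
    created: date  # date on which this version was created


def latest_as_of(history, cutoff):
    """Return, per item, the highest version created on or before cutoff."""
    selected = {}
    for v in history:
        if v.created <= cutoff:
            current = selected.get(v.item)
            if current is None or v.number > current.number:
                selected[v.item] = v
    return selected


# Hypothetical version history of two documents.
history = [
    Version("design-doc", 1, date(2006, 3, 1)),
    Version("design-doc", 2, date(2006, 3, 15)),
    Version("test-plan", 1, date(2006, 2, 20)),
    Version("test-plan", 2, date(2006, 3, 9)),
]

baseline_date = date(2006, 3, 10)  # assumed date of the software baseline

as_of_baseline = latest_as_of(history, baseline_date)
latest_now = latest_as_of(history, date.max)
```

With this data the two selections disagree on the design document (version 1 versus version 2). Note what the timestamp cannot tell us: whether version 2 of the test plan, created one day before the baseline, was written for this baseline or already for the next one. That information has to come from an explicit relationship, not from the clock.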

Another issue is that various individuals in various teams all work in parallel. Project assets (i.e. code, documents, hardware, data, knowledge and other assets and information) evolve asynchronously. So it is not trivial to draw a timeline across all running activities to capture exactly the right versions of everything that belongs together as an overall baseline. To do that, there must be a system that manages not only the relationship between every (physical or informational) asset that plays a role in development, but also the relationship across versions of every asset.

Configuration management is actually a multidimensional problem and a "baseline" is a line in the multidimensional space. Some examples of dimensions are:
  • Identification of the controlled items (a.k.a. "configuration items" or CI)
  • Logical decomposition/clustering
  • Storage locations, directory paths
  • Versions
  • Variants
  • Branches
  • Maturity or quality level
  • Competence area
  • Department, division, third-party supplier, team, site/location
The most difficult issue that CM has to deal with is that during development a change can (and often will) happen in any of those directions. For example, an update results in a new version, a move results in a new location, a generic item may become specific for a particular variant or vice versa. But in addition to the change itself, the relationships between all those versions and items must also change. For example, when a requirement is changed or added, all design documents and models, hardware, mechanics and software modules and other dependent items must still relate to the previous set of requirements. But as soon as any of those items is updated for the changed requirement it must relate to the new set of requirements. But... in the meantime another requirement change happens, and also design changes, code changes, hardware changes, data set changes.
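To make the requirement example a bit more concrete, here is a minimal sketch, assuming that every item version records the version of the requirement set it was written against. The `ItemVersion` record, the `stale_items` helper and the sample data are hypothetical names that I made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemVersion:
    item: str                  # hypothetical item name
    version: int               # version of the item itself
    requirements_version: int  # requirement set this version relates to


def stale_items(selection, requirements_version):
    """Names of selected items that relate to a different requirement set."""
    return sorted(
        v.item for v in selection
        if v.requirements_version != requirements_version
    )


# A candidate baseline: one chosen version per item (assumed data).
selection = [
    ItemVersion("design-doc", 4, requirements_version=7),
    ItemVersion("software-module", 12, requirements_version=7),
    ItemVersion("hardware-board", 2, requirements_version=6),
]

# Items not yet updated for requirement set 7.
stale = stale_items(selection, 7)
```

Here the hardware board still relates to requirement set 6, so the selection is not a consistent baseline for requirement set 7. The sketch covers only one dimension (requirements); the real problem has such a relationship along every dimension at once, and all of them keep moving.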

Everything constantly changes during development!

Because it is a multidimensional space that is constantly in motion, it is humanly impossible to maintain a constant overview of which version of each of the thousands of items belongs to which version of any other of those thousands of items. It's like making holographic pictures of an N-dimensional waterfall.

Did you notice that I did not take into account that organizations are constantly changing? Think of turnover of people, shifts in responsibility, process improvements, policy and strategy changes, marketing fluctuations and changing business objectives. People themselves also change constantly in their motivation, knowledge level, moods, etc. The effect on configuration management is rather indirect, but, for example, the interpretation of a design decision may be different when another architect is in charge, or another set of documents must be updated when the production of a certain piece of hardware is subcontracted to another supplier.

Maybe you do not consider all of this to be "configuration management", but the whole conglomerate of interdependent physical and informational items (and all of their versions) must be maintained somewhere. Because this problem is too complex for a human to comprehend, and because nobody has ever invented a (CM) system to cope with it, we still rely on the collection of human brains with shared knowledge, logic, communication, perceptions and opinions.
