The VLT Software Control group is proud of the fact that VLT commissioning has never suffered delays due to software. A decisive factor in achieving this was the use of software engineering practices. This paper reports on the experience of applying these practices to the VLT Control Software and of building a group culture around a simple but effective set of standards and tools.
Main areas of interest are analysis and design methodology and tools, software configuration control, software testing, problem reporting and tracking, and documentation. The paper will also discuss what didn't work and changes that are needed and planned.
The European Southern Observatory (ESO) Very Large Telescope (VLT) project consists of four 8-meter telescopes, fifteen instruments, and the VLT Interferometer (VLTI). The VLT Control Software project provides all control and monitoring functions for the installation and can be characterized as follows:
The VLT Software has been and will be reused on other projects (e.g., NTT, 3.6, 2.2, VST, etc.).
In 1991 when the VLT development began, the question was whether a traditional, basically no-rules approach would suffice, or whether a more structured software engineering approach was needed for a project of its size and characteristics. In retrospect, it is easy to identify the decisions responsible for the successful approach:
At the beginning very little was available; hence SE had to be put rapidly in place while the project was developing (see Table 1) in order to provide timely standards and tools for early project needs. It was found that by paying attention to the proper synchronization between the two processes, both SE and project needs could be met. This may be of interest for organizations or groups that intend to start but are afraid that SE practices have to be fully in place before starting. Of course this is preferable, but the other approach can also work.
|Year||SE||Project|
|'91||Start of a SE/QA role in the software (S/W) team||VLT SW Requirements|
|'92||Basic development environment||VLT SW Specifications|
|'93||OS and development tools||Start Coding|
|'94||First VLT Common S/W release; S/W Problem Reports||First external distribution|
|'95||Code management (cmm); Automatic test support (tat)||First Field Test (NTT)|
|'96||S/W process audited||Wide external distribution|
|'97||Use & tune||Installation at Paranal|
|'98||Use & tune||UT1 First Light, First external Instrument (FORS)|
|'99||Use & tune||UT1 inauguration, UT2 first light|
|'00||Use & tune||UT2 inauguration, UT3 and UT4 first light|
The SE approach applied to the VLT project can be grouped into the following areas, each discussed below: Software Life Cycle, Documentation, Development Environment, (Automatic) Testing, Configuration Management, Releases, and Problem Reporting.
For the VLT we used the traditional waterfall model (sequence of phases: requirements, analysis, design, implementation, test, etc.) in an ``incremental'' way, i.e., identical sequences were started at different times for the different parts of the project. Every release of each major package repeated a similar sequence of phases.
The documents followed the numbering system defined at the VLT level and all had the same layout. A document template was provided for each of the major document types.
Whenever possible, documentation was extracted from the code. This was applied mainly to interface specifications (.h files, command tables, etc.) and to ``man pages'', which were used both in the design phase as a detailed description of each function and later as user documentation. The ``man page'' is maintained in the file that contains the corresponding code.
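As an illustration of extracting documentation from code, the following minimal sketch pulls a ``man page'' block out of a C source file. The `MAN-BEGIN`/`MAN-END` markers and the function name `extract_man_page` are hypothetical and chosen only for this example; the actual VLT header conventions are not described here.

```python
import re

# Hypothetical markers delimiting the man page inside a C block comment.
# The end marker may be preceded by the usual " * " comment decoration.
MAN_BLOCK = re.compile(
    r"/\*\s*MAN-BEGIN\s*\n(.*?)\n[\s*]*MAN-END\s*\*/", re.DOTALL
)


def extract_man_page(source_text):
    """Return the man-page text embedded in source_text, or None if absent."""
    match = MAN_BLOCK.search(source_text)
    if match is None:
        return None
    lines = []
    for line in match.group(1).splitlines():
        # Drop the leading " * " decoration of block comments.
        lines.append(re.sub(r"^\s*\*\s?", "", line).rstrip())
    return "\n".join(lines)


# Illustrative source file: code and its documentation live together.
EXAMPLE_SOURCE = """\
/* MAN-BEGIN
 * NAME
 *   demoAdd - add two integers
 * SYNOPSIS
 *   int demoAdd(int a, int b)
 * MAN-END */
int demoAdd(int a, int b) { return a + b; }
"""

if __name__ == "__main__":
    print(extract_man_page(EXAMPLE_SOURCE))
```

Because the text lives next to the code it describes, a single edit updates both, and the same extracted block can serve as design description and as user documentation.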
A standardized development environment is an essential requirement for managing different people, in different places, who are all developing software for subsequent integration. The cornerstones of the development environment have been:
The developer has to provide at least one test executable (built from C/C++ programs or from shell, tcl, or other scripts), ordered in a so-called TestList. Because the test tool compares the current output with a previously generated one, test programs can be quite simple: execute an action and print out the result. No interpretation with error-prone ``if-then-else'' provisions is needed in the test program itself. Minimising test program complexity is important both to limit the cost and to improve reliability. Make the automatic test affordable, and not a nightmare.
Execution of the test is then accomplished via a standard command: ``cd moduleName/test; tat''. The needed environment is created (every time!) to allow the test to be executed with the same initial conditions. All tests are executed and the result is compared with a reference file. PASSED or FAILED is the only test output. PASSED means the result of the current run is identical to the one defined as reference. FAILED means that something was different and requires that an investigation begin. A very compact output is necessary when tests are used for non-regression during release integration.
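The reference-comparison principle described above can be sketched as follows. This is not the tat implementation; it is a minimal illustration that assumes each test is a command whose output is captured and compared with a stored reference file. The function name `run_reference_test` and the file name `demo.ref` are hypothetical.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_reference_test(commands, reference_file):
    """Run each command in order, capture its output, and compare the
    concatenated result with the stored reference.

    Returns "PASSED" if the output is identical, "FAILED" otherwise.
    """
    captured = []
    for command in commands:
        result = subprocess.run(command, capture_output=True, text=True)
        captured.append(result.stdout + result.stderr)
    current = "".join(captured)
    reference = Path(reference_file).read_text()
    return "PASSED" if current == reference else "FAILED"


if __name__ == "__main__":
    # A test program only performs an action and prints the result;
    # it contains no pass/fail logic of its own.
    test_list = [[sys.executable, "-c", "print(2 + 3)"]]
    # A fresh working directory stands in for the recreated environment.
    with tempfile.TemporaryDirectory() as workdir:
        reference = Path(workdir) / "demo.ref"
        reference.write_text("5\n")
        print(run_reference_test(test_list, reference))
```

All judgement is concentrated in the comparison step, so the test programs themselves stay trivial. The price is that the output must be deterministic, which is why the same initial conditions are needed for every run.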
Despite the ``magic'' word ``automatic'', testing is not without its costs. Dedicated resources, both people and computers, are needed for this task in addition to the work of each developer in preparing and using the test software. The size of test software has been of the same order as delivered software.
In addition to the internally developed tool, commercial software products, like Purify, have been used.
Key aspects have been: identify all components (OS, tools, ESO-developed S/W), maintain a reasonably paced upgrade cycle (6-12 months), document both in man pages and on paper, enforce backward compatibility, and impose systematic non-regression testing before release.
One central archive, accessible to all sites and all people, was essential. The archive is the coordination point: if a software unit is in the archive it means that it is ready to be used, because it has been tested and all files are consistent.
The supporting tool (cmm) is a thin layer of procedures on top of an RCS archive. Its basic rules are:
- All files belonging to a module are handled at the same time;
- Only one person at a time can modify a module;
- Branches are supported;
- There is only one archive;
- A client-server protocol implements all operations;
- The command set is very simple.
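Three of the rules listed above (whole-module handling, a single modifier per module, and a single archive) can be illustrated with a toy in-memory model. This sketch is not cmm: the class `ModuleArchive` and its methods `archive`, `modify`, and `copy` are hypothetical names, and branches, RCS storage, and the client-server protocol are deliberately left out.

```python
class ModuleArchive:
    """Toy archive: whole-module versions, one modifier per module."""

    def __init__(self):
        self._versions = {}  # module name -> list of {filename: content}
        self._locks = {}     # module name -> user currently modifying it

    def archive(self, module, user, files):
        """Store all files of a module together as one new version."""
        if module in self._versions and self._locks.get(module) != user:
            raise PermissionError(f"{user} has not reserved {module}")
        self._versions.setdefault(module, []).append(dict(files))
        self._locks.pop(module, None)  # archiving releases the module
        return len(self._versions[module])  # version numbers start at 1

    def modify(self, module, user):
        """Reserve a module for one user and return a working copy."""
        holder = self._locks.get(module)
        if holder is not None and holder != user:
            raise PermissionError(f"{module} is being modified by {holder}")
        self._locks[module] = user
        return dict(self._versions[module][-1])

    def copy(self, module, version=None):
        """Return a copy of a version for read-only use; no reservation."""
        snapshots = self._versions[module]
        chosen = snapshots[-1] if version is None else snapshots[version - 1]
        return dict(chosen)


if __name__ == "__main__":
    store = ModuleArchive()
    store.archive("demo", "alice", {"demo.c": "v1", "demo.h": "v1"})
    work = store.modify("demo", "alice")
    try:
        store.modify("demo", "bob")  # second modifier is refused
    except PermissionError as error:
        print(error)
    work["demo.c"] = "v2"
    print(store.archive("demo", "alice", work))
```

Treating the module, rather than the individual file, as the unit of versioning keeps every archived version internally consistent, at the cost of serializing work on that module.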
Figure 1 gives the total number of transactions executed per month over the last three years. ``Archiving'' accesses are when a new version of a module is created, ``copying,'' when a read-only copy is retrieved. Note the high number of read-only copies, especially at the end of the integration process of a release or during first telescope integration and commissioning. This is a sign that regeneration from scratch was the rule. Figure 2 gives the same data but limited to transactions serving non-ESO sites, mainly European institutes that are part of one of the several Instrument Consortia providing the VLT instruments.
Tracing bugs and modifications is a necessary complement to Configuration Management. For VLT software, a Web-interfaced central database to submit, query, and modify Software Problem Reports (SPRs) has been implemented using the commercial tool Action Remedy. This application supports the following work flow:
Figure 3 gives the total number of SPR users, i.e., people who submitted a problem or proposed a change. In principle this could be the number of software people who participate in the development and integration of the project. A bit less than 60% of the SPRs were generated by people from the development team (either at the Garching headquarters or on assignment at the Observatory), about 20% by technical support staff based at the Paranal Observatory, and the last 20% by people from Consortia.
Figure 4 gives the cumulative number of SPRs each year and shows a continuous increase in the total number due to new software made available each year. Except for the early days, the number of open SPRs at any time was always below a critical level (about 500 SPRs), which was more or less what the team could deal with between releases.
Figure 5 gives the answer to the most important question: Is the system becoming stable, or will a maintenance team bigger than the development team be needed to maintain it? A healthy system should show a peak corresponding to the first integration, followed by a decline. In Figure 5, three different areas are compared over a three-year period: the Common Software had its ``glory days'' earlier and is declining as a sign of stability, while the Telescope Control Software shows its peak at the integration and commissioning of the first Unit Telescope (1998, UT1 first light). The third group includes other software packages that were at the initial stages of their life cycles in the years shown.
The established SE system was audited by an external team (in 1996) that found no deficiencies in the approach or in the implementation. Besides this, it has been positively recognized by the integration and commissioning teams of both Paranal and La Silla.
A July 2000 survey asked those who had used the VLT Common Software to identify the three best and three worst aspects of the system. Of a total of 22 responses collected, 15 people mentioned SE practices as one of the best items, while none mentioned SE among the worst aspects. Table 2 contains these reported opinions.
|1||S/W Dev. Environment + Standards (Makefile,|
|1||S/W engineering (configuration, SPR, programming|
|1||S/W engineering, standards|
|1||The development environment (makefile, file|
|1||standard for programming, directory structure,|
|1||std module structure - INTROOT/VLTROOT concept|
|2||Standard module structures with makefile and|
|2||Lots of documentation|
|3||software configuration control and vltMakefile|
Standards and practices have been enforced more by means of such tools than by ``police inspections.'' With the exception of a few key areas in which deviations were simply not permitted, it has been the voluntary behavior of developers that has determined the results.