CAOS and AIRY are two sets of tools designed to allow the building of complex simulation programs specifically targeted to adaptive optics (CAOS) and interferometric image restoration (AIRY).
Each package consists essentially of a number of modules which can be assembled together in a simulation program (Carbillet 2001, Correia 2002).
A module is actually a single routine, and a simulation program is usually built by assembling a sequence of calls to the appropriate modules and wrapping everything within an iteration loop. In order to simplify the design of application programs by hiding the actual code of the simulation algorithms, the packages are provided with the Application Builder (Fini 2001), a Graphic Programming Environment where application programs can be assembled in a graphical manner and the actual simulation code is generated automatically (see Figure 1).
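The "sequence of module calls inside an iteration loop" structure can be illustrated with a minimal sketch. It is written in Python rather than IDL for brevity, and the module names (`make_wavefront`, `sense_wavefront`, `reconstruct`) and their toy contents are hypothetical placeholders, not actual CAOS or AIRY modules.

```python
# Hypothetical stand-ins for simulation modules: each module is a single
# routine that receives its inputs and returns its output.
def make_wavefront(step):
    # Toy "atmosphere": four phase samples that depend on the iteration.
    return [step + i for i in range(4)]


def sense_wavefront(wavefront):
    # Toy "sensor": differences between neighbouring samples.
    return [b - a for a, b in zip(wavefront, wavefront[1:])]


def reconstruct(slopes):
    # Toy "reconstructor": mean of the measured slopes.
    return sum(slopes) / len(slopes)


def run_simulation(n_iter):
    # The simulation program: a fixed sequence of module calls
    # wrapped in an iteration loop.
    results = []
    for step in range(n_iter):
        wavefront = make_wavefront(step)
        slopes = sense_wavefront(wavefront)
        results.append(reconstruct(slopes))
    return results
```

In a generated program, the body of the loop would be produced automatically from the graphical description instead of being written by hand.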
The CAOS/AIRY systems and the Application Builder have been developed under IDL and are currently targeted to IDL programs, although the same techniques could be applied to other programming environments as well.
Simulation programs are typically very CPU intensive, so the ability to run production simulations on high-performance parallel machines is quite appealing.
Unfortunately, most current programs (notably both CAOS and AIRY) have been designed for scalar machines and would have to be rewritten, at least partially, to exploit the capabilities of parallel architectures. Moreover, programming for parallelism is not usually an easy task.
Parallelizing programs is in principle a simple job: the program is subdivided into partially independent tasks which can be executed concurrently on various CPUs. The actual improvement in performance, however, depends critically on how the program is divided into tasks and on the architecture of the parallel machine and its actual performance. To divide a program into tasks one can follow essentially three strategies:
All three strategies, alone or in combination, can yield improvements in execution time, but in every case the need for data exchange among tasks must be carefully considered. On most parallel machines, and notably on Beowulf clusters, task-to-task data communication is performed through a comparatively slow channel, so the related overhead can easily cancel the time gained through parallelization.
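The effect of communication overhead can be made concrete with a deliberately crude timing model. This is only a sketch: it assumes that the computation splits evenly among CPUs and that each additional CPU adds a fixed communication cost. Neither assumption comes from the CAOS or AIRY codes, and the function and parameter names are illustrative.

```python
def parallel_time(t_compute, n_cpus, t_comm_per_cpu):
    # Assumption: compute time divides evenly among the CPUs, while every
    # CPU beyond the first adds a fixed communication cost.
    return t_compute / n_cpus + t_comm_per_cpu * (n_cpus - 1)


def speedup(t_compute, n_cpus, t_comm_per_cpu):
    # Ratio between the scalar run time and the modelled parallel run time.
    return t_compute / parallel_time(t_compute, n_cpus, t_comm_per_cpu)
```

Under this model a 100 s job on 4 CPUs reaches a speedup of 4 with no communication cost, but a cost of 25 s per extra CPU brings the parallel time back to 100 s, so the whole gain is lost.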
The third parallelization strategy quoted above is the most general, but also the most challenging, in that the complexity of the program structure may increase considerably when the code is redesigned for parallelism. Moreover, parallel programming skills are uncommon in the usual curricula of physicists and engineers, who are the main users of the CAOS system.
The aim of this work is thus to exploit the automatic code generation capabilities of the Application Builder to implement a code generator which can automatically produce source code optimized for a target Beowulf architecture.
What is most important is that all the essential tools to analyze the structure of the program and to generate the equivalent code are already there because they are also needed for the scalar version of the Application Builder.
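As an illustration of what analyzing the structure of a program can provide, the sketch below groups modules into levels whose members do not depend on one another and could therefore run concurrently. It is a generic dependency-graph technique shown under assumed inputs, not the Application Builder's actual algorithm; the function name and the module names in the usage note are hypothetical.

```python
def concurrency_levels(deps):
    """Group modules into levels that could run concurrently.

    deps maps each module name to the list of modules whose output it
    needs. Every module must appear as a key, even if it has no inputs.
    """
    remaining = {module: set(inputs) for module, inputs in deps.items()}
    levels = []
    while remaining:
        # Modules with no unresolved inputs are ready to run now.
        ready = sorted(m for m, inputs in remaining.items() if not inputs)
        if not ready:
            raise ValueError("cyclic or unresolved dependency between modules")
        levels.append(ready)
        for module in ready:
            del remaining[module]
        # Mark the outputs of the modules just scheduled as available.
        for inputs in remaining.values():
            inputs.difference_update(ready)
    return levels
```

For example, with `{"atm": [], "src": [], "wfs": ["atm", "src"], "rec": ["wfs"]}` the first level contains `atm` and `src`, which are mutually independent, followed by `wfs` and then `rec`.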
In order to allow the Application Builder to generate optimized parallel code, it must be augmented with three tools:
A simulation program life cycle can thus be as follows: a) build the graphic representation of the simulation program by using the Application Builder just as before (or open an existing simulation program); b) select the profiling option and run a sample version of the program on a representative but small subset of the input data using any scalar machine; c) evaluate the cluster characteristics by using the suitable tool (or get the same information from a previously stored table); d) let the Application Builder generate code for your target grid of computers; e) run the parallel version of the program to your satisfaction.
Most of the code needed is already part of the existing CAOS Application Builder, and we have presented the architecture of the tools to be added to it for the purpose of parallel code generation. The coding is currently under way and will yield a beta version in the near future.
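The profiling run in step b) could, in principle, be as simple as timing each module call. The sketch below is one way to do this in Python, assuming that per-module wall-clock totals are the quantity of interest. The `Profiler` class is hypothetical and does not describe the profiling option of the Application Builder.

```python
import time
from collections import defaultdict


class Profiler:
    """Accumulate wall-clock time and call counts per module (illustrative)."""

    def __init__(self):
        self.total = defaultdict(float)  # seconds spent in each module
        self.calls = defaultdict(int)    # number of calls to each module

    def wrap(self, name, func):
        # Return a version of func that records its own execution time.
        def timed(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                self.total[name] += time.perf_counter() - start
                self.calls[name] += 1
        return timed
```

Wrapping every module before the sample run yields a table of per-module costs, which can then be weighed against the measured cluster characteristics when deciding how to distribute the work.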
More information about the CAOS and the AIRY packages can be found at:
Carbillet, M., Fini, L., Femenía, B., Riccardi, A., Esposito, S., Viard, E., Delplanke, F., & Hubin, N. 2001, in ASP Conf. Ser., Vol. 238, Astronomical Data Analysis Software and Systems X, ed. F. R. Harnden, Jr., Francis A. Primini, & Harry E. Payne (San Francisco: ASP), 349
Correia, S., Carbillet, M., Boccacci, P., Bertero, M., & Fini, L. 2002, A&A, 387, 733
D'Amore, L., Guarracino, M. R., & Laccetti, G. 2003, "On the Parallelization of a Commercial PSE for Scientific Computation", to appear in Proceedings of IEEE International Conference on Parallel and Distributed Processing, Genova, Italy
Fini, L., Carbillet, M., & Riccardi, A. 2001, in ASP Conf. Ser., Vol. 238, Astronomical Data Analysis Software and Systems X, ed. F. R. Harnden, Jr., Francis A. Primini, & Harry E. Payne (San Francisco: ASP), 253