Grosbøl, P., Banse, K., & Ballester, P. 1999, in ASP Conf. Ser., Vol. 172, Astronomical Data Analysis Software and Systems VIII, eds. D. M. Mehringer, R. L. Plante, & D. A. Roberts (San Francisco: ASP), 151

Pipeline Processing of Data in the VLT Era

Preben Grosbøl, Klaus Banse, Pascal Ballester
European Southern Observatory,
Karl-Schwarzschild-Str. 2, D-85748 Garching, Germany


The VLT Data Flow System pipeline and quality control subsystem provides a general infrastructure for standard reduction and quality assessment of data obtained at the VLT facility. The main design challenge is to support a wide range of instruments in a distributed environment. The pipeline system can be configured, through a set of ASCII files, to handle several instruments simultaneously. It was designed using object-oriented methodology, and major parts of the baseline version will be written in Java, using OMG/CORBA technology to support distributed objects.

1. Introduction

The VLT Data Flow System (DFS) provides a single, homogeneous, end-to-end system for handling science data from the VLT facility (see Grosbøl & Peron 1997). It can be divided into three main parts: a) pre-observation tasks, which include the preparation of observing proposals, the detailed specification of observations, and tentative scheduling; b) observation support, including archiving services; and c) post-observation processing, including pipeline reduction and quality control of the acquired data.

This paper focuses on the post-observation modules with emphasis on the standard pipeline reduction of VLT data. A discussion of the quality control aspects was given by Ballester et al. (1998).

2. Requirements and Challenges

The vast amount of data produced by the VLT and the multitude of instruments demand that raw data be reduced very efficiently and with a minimum of manual intervention. The DFS pipeline has been designed for this purpose and will be used in four main scenarios:

Whereas it is trivial to write a dedicated pipeline to reduce data from a given instrument, the main challenge for the DFS pipeline is to create a single infrastructure that can serve all of the more than 15 different instruments on the four 8.2 m telescopes, plus the ancillary units for interferometry and wide-field imaging. The long expected lifetime of the facility makes it mandatory to rely on a single concept, to ease operation and reduce maintenance costs, while employing a modular design that enables the gradual replacement of components over time.

There must be a clear separation between the pipeline infrastructure and the data processing tasks to ensure that any suitable data reduction system (DRS) can be used. Some reduction tasks for a specific instrument may already be available in a particular system (e.g., AIPS++ could provide interferometric reduction procedures). It is also prudent to assume that not all current DRSs will be fully updated or supported over the next decades. A smooth migration of reduction tasks from one system to another is therefore important. Further, the large data volume makes it essential to support parallel processing, to take advantage of multi-processor machines or loosely coupled computers such as Beowulf-type systems.
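
The separation described above can be illustrated with a minimal Java sketch. It is not the DFS design: the interface `DataReductionSystem`, the stand-in back end `EchoDrs`, and the class `ParallelReducer` are hypothetical names introduced here, and a local thread pool stands in for a multi-processor or Beowulf-type system.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical adapter: hides which data reduction system (DRS) runs a task. */
interface DataReductionSystem {
    /** Runs one reduction task on one frame and returns the product name. */
    String reduce(String task, String frame);
}

/** Stand-in back end; a real adapter would call an external DRS here. */
class EchoDrs implements DataReductionSystem {
    @Override
    public String reduce(String task, String frame) {
        return task + "(" + frame + ")";
    }
}

/** Infrastructure side: schedules frames without knowing any DRS internals. */
class ParallelReducer {
    private final DataReductionSystem drs;
    private final int workers;

    ParallelReducer(DataReductionSystem drs, int workers) {
        this.drs = drs;
        this.workers = workers;
    }

    /** Reduces all frames concurrently; results are returned in input order. */
    List<String> reduceAll(String task, List<String> frames) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String frame : frames) {
                pending.add(pool.submit(() -> drs.reduce(task, frame)));
            }
            List<String> products = new ArrayList<>();
            for (Future<String> result : pending) {
                products.add(result.get()); // blocks until this frame is done
            }
            return products;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("reduction failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        ParallelReducer reducer = new ParallelReducer(new EchoDrs(), 2);
        System.out.println(reducer.reduceAll("bias", Arrays.asList("a.fits", "b.fits")));
    }
}
```

In such a layout, migrating a task to another DRS would only require a different `DataReductionSystem` implementation; the scheduling code would stay unchanged.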

3. Pipeline Model and Assumptions

The pipeline processing model assumes that every raw frame can be uniquely associated with a specific instrument and that the full description of the instrument setup can be obtained from the frame's FITS header. It must also be possible to determine the observational and operational context of a raw frame relative to others by analyzing its FITS keywords. This provides a hierarchical grouping of frames, e.g., based on their relation to Observation Blocks and Templates (see Grosbøl & Peron 1997). Each instrument must define a unique classification of all frames used by the pipeline, including raw frames, calibration data, and the products generated.
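
As an illustration of a keyword-based classification, the following Java sketch assigns a category to a frame from its header, represented here as a plain keyword/value map. Everything in it is an assumption made for the example: the class `FrameClassifier`, the category names, the keyword `OBSTYPE`, and the convention that the first matching rule wins. Only `INSTRUME` is a standard FITS keyword.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative classifier: the first rule whose keywords all match wins. */
class FrameClassifier {
    // Category -> required keyword/value pairs, checked in insertion order.
    private final Map<String, Map<String, String>> rules = new LinkedHashMap<>();

    /** Registers a rule; returns this classifier so calls can be chained. */
    FrameClassifier addRule(String category, Map<String, String> required) {
        rules.put(category, required);
        return this;
    }

    /** Returns the category of a frame, or "UNKNOWN" if no rule matches. */
    String classify(Map<String, String> header) {
        for (Map.Entry<String, Map<String, String>> rule : rules.entrySet()) {
            boolean matches = true;
            for (Map.Entry<String, String> kv : rule.getValue().entrySet()) {
                // A missing keyword yields null and therefore never matches.
                if (!kv.getValue().equals(header.get(kv.getKey()))) {
                    matches = false;
                    break;
                }
            }
            if (matches) {
                return rule.getKey();
            }
        }
        return "UNKNOWN";
    }

    public static void main(String[] args) {
        FrameClassifier classifier = new FrameClassifier()
                .addRule("BIAS", Map.of("INSTRUME", "DEMO", "OBSTYPE", "BIAS"))
                .addRule("SCIENCE", Map.of("INSTRUME", "DEMO", "OBSTYPE", "OBJECT"));
        System.out.println(classifier.classify(
                Map.of("INSTRUME", "DEMO", "OBSTYPE", "OBJECT", "EXPTIME", "30")));
    }
}
```

For the classification to be unique in the sense required above, the rules of one instrument should not overlap; the first-match order in this sketch only makes the outcome deterministic if they do.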

Actual pipeline tasks are triggered by different events, such as the arrival of a new raw frame or the end of an observing template. The frames associated with the event are identified, and the necessary set of reduction recipes is obtained by following rules defined for the instrument. The recipes specify the algorithms and the required parameters, including calibration data such as CCD flat fields or wavelength tables. Calibration files are stored either in a local database or in a file directory structure. The appropriate calibration frames for a given input data set are found through their classification and by matching a primary key defined as a set of FITS keywords.
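
The calibration lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the DFS implementation: the class `CalibrationIndex`, the in-memory list standing in for the database or directory structure, the category names, and the `FILTER` keyword are all hypothetical. A calibration frame is selected when its classification equals the requested one and every keyword of the primary key has the same value as in the raw frame.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Illustrative calibration lookup: classification plus a FITS-keyword key. */
class CalibrationIndex {
    /** One stored calibration frame. */
    private static final class Calibration {
        final String file;
        final String category;
        final Map<String, String> header;

        Calibration(String file, String category, Map<String, String> header) {
            this.file = file;
            this.category = category;
            this.header = header;
        }
    }

    private final List<String> primaryKey; // keywords whose values must agree
    private final List<Calibration> store = new ArrayList<>();

    CalibrationIndex(List<String> primaryKey) {
        this.primaryKey = primaryKey;
    }

    void add(String file, String category, Map<String, String> header) {
        store.add(new Calibration(file, category, header));
    }

    /** Returns the first calibration of the category matching the raw frame. */
    Optional<String> find(String category, Map<String, String> rawHeader) {
        for (Calibration cal : store) {
            if (!cal.category.equals(category)) {
                continue;
            }
            boolean same = true;
            for (String keyword : primaryKey) {
                String wanted = rawHeader.get(keyword);
                // A keyword missing from either header counts as a mismatch.
                if (wanted == null || !wanted.equals(cal.header.get(keyword))) {
                    same = false;
                    break;
                }
            }
            if (same) {
                return Optional.of(cal.file);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        CalibrationIndex index = new CalibrationIndex(Arrays.asList("INSTRUME", "FILTER"));
        index.add("flat_V.fits", "FLAT", Map.of("INSTRUME", "DEMO", "FILTER", "V"));
        index.add("flat_R.fits", "FLAT", Map.of("INSTRUME", "DEMO", "FILTER", "R"));
        System.out.println(index.find("FLAT", Map.of("INSTRUME", "DEMO", "FILTER", "R")));
    }
}
```

Keywords outside the primary key, such as an exposure time, are ignored by the match, so the choice of key determines how strictly a calibration must correspond to the instrument setup of the raw frame.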

4. Architectural Design

The architecture of the pipeline system is based on a distributed, object-oriented design. The system can be either driven by events or activated through graphical user interfaces. Five main applications are defined to support the basic pipeline functionality:

The first implementation is aimed mainly at an automatic, batch-type environment, but it is expected that more clients will be added to provide better interactive control. The applications use a set of general services:

Figure 1: Main applications and services of the DFS Pipeline subsystem.

The communication between clients and servers is based on the OMG/CORBA distributed object model. The components are shown in Fig. 1, together with several CORBA services that may be used. Whereas the Naming Service is available in most CORBA implementations, the Event and Trader Services are not yet provided as standard.

5. Configuration and Implementation

It is essential that new instruments can be easily integrated into the pipeline environment. This is facilitated by defining their behavior in a set of ASCII configuration files. They make it possible to define the following instrument-specific items:

The files are under configuration control, which makes it possible to process data from the Science Archive using the appropriate versions of the instrument definitions.

It is foreseen that the DFS pipeline will be deployed on computer systems distributed over the ESO sites and possibly exported to external institutes. Web-based interfaces are also expected to play an important role in operating and monitoring the pipelines. The major parts of the DFS pipeline will be implemented in Java, which provides excellent support for distributed object systems and user interfaces. It is expected that OMG/CORBA-based tools will be used for the object bus and general services.

References


Grosbøl, P. & Peron, M. 1997, in ASP Conf. Ser., Vol. 125, Astronomical Data Analysis Software and Systems VI, ed. G. Hunt & H. E. Payne (San Francisco: ASP), 23

Ballester, P., Kalicharan, V., Banse, K., Grosbøl, P., Peron, M., & Wiedmer, M. 1998, in ASP Conf. Ser., Vol. 145, Astronomical Data Analysis Software and Systems VII, ed. R. Albrecht, R. N. Hook, & H. A. Bushouse (San Francisco: ASP), 259

© Copyright 1999 Astronomical Society of the Pacific, 390 Ashton Avenue, San Francisco, California 94112, USA