The main characteristics of the NOAO Mosaic Pipeline System are:
To the outside world, the NOAO Pipeline System appears as a black box (Fig. 1). Data is submitted to the Pipeline either directly from the telescope or from the NOAO Science Archive. A pipeline operator can continuously monitor the health and performance of the system and, if necessary, take full control of the processing.
Quality control data is produced at several steps in the Pipeline, covering both basic telemetry and advanced image parameters (e.g. sky uniformity and PSF variations). Monitor GUIs can subscribe to these streams of information, enabling, for instance, the instrument scientist to monitor the performance of the instrument.
The system produces calibrated data, master calibration frames, catalogues and data quality information that can be delivered to the observer and ingested in the Science Archive.
Processing nodes have a layered architecture, as illustrated in Fig. 2. The Processing Software (e.g. IRAF tasks, scripts, compiled code) does the actual number crunching. The Software is logically organized in modules. Modules are then grouped into standalone pipelines. Pipelines form the full processing system. Any number of instances of each module can be started, to fully exploit the processing power of the host machine.
The Black Board subsystem is responsible for making data flow through the processing modules and pipelines, which can be dynamically chained together at run-time using an XML-based configuration system. The Black Board also provides an event handling and message passing framework that individual modules and pipelines use.
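As a rough illustration of run-time chaining from an XML configuration, the sketch below builds a pipeline from a list of module names. It is not the Pipeline's actual configuration format: the element and attribute names (`pipeline`, `module`, `name`), the module functions, and the registry are all hypothetical, and the real modules wrap IRAF tasks, scripts or compiled code rather than Python functions.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration; the element and attribute names are illustrative only.
CONFIG = """
<pipeline name="calibrate">
  <module name="bias"/>
  <module name="flat"/>
  <module name="catalog"/>
</pipeline>
"""


def make_module(label):
    """Return a stand-in processing module that records its own name."""
    def module(frame):
        # Return a new dict so that each stage leaves its input untouched.
        return {**frame, "steps": frame["steps"] + [label]}
    return module


# Hypothetical registry mapping module names to callables.
REGISTRY = {label: make_module(label) for label in ("bias", "flat", "catalog")}


def build_pipeline(xml_text, registry):
    """Parse the configuration and chain the named modules in document order."""
    root = ET.fromstring(xml_text.strip())
    stages = [registry[m.get("name")] for m in root.findall("module")]

    def run(frame):
        for stage in stages:
            frame = stage(frame)
        return frame

    return root.get("name"), run


name, run = build_pipeline(CONFIG, REGISTRY)
result = run({"file": "obj001.fits", "steps": []})
print(name, result["steps"])
```

Because the chain is assembled from the configuration text, reordering or removing a `module` element changes the processing sequence without touching the module code.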
The Node Manager is a high performance server, running on each node. It fully controls the operation of the Pipeline System on that node, allowing pipeline operators (via Control GUIs) to:
The Node Manager also serves as a load balancer. This functionality is implemented by an algorithm that "predicts" the load of a processing node from its current CPU load, its number of processors, the number of running instances of a given pipeline, and the number of files in its queue.
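The text does not give the prediction formula, so the following is only a minimal sketch of how the four listed inputs could be combined into a single score. The `NodeStatus` type, the `predicted_load` function, and the equal weighting of the two terms are assumptions made for illustration, not the Node Manager's algorithm.

```python
from dataclasses import dataclass


@dataclass
class NodeStatus:
    name: str
    cpu_load: float     # current CPU load, e.g. a load average
    n_cpus: int         # number of processors on the node
    n_instances: int    # running instances of the pipeline of interest
    queue_length: int   # files waiting in the node's queue


def predicted_load(status: NodeStatus) -> float:
    """Hypothetical score: per-processor load plus backlog per instance."""
    per_cpu = status.cpu_load / status.n_cpus
    backlog = status.queue_length / max(status.n_instances, 1)
    return per_cpu + backlog


def pick_node(nodes):
    """Choose the node with the lowest predicted load."""
    return min(nodes, key=predicted_load)


nodes = [
    NodeStatus("node-a", cpu_load=2.0, n_cpus=2, n_instances=2, queue_length=4),
    NodeStatus("node-b", cpu_load=1.0, n_cpus=4, n_instances=2, queue_length=2),
]
best = pick_node(nodes)
print(best.name, predicted_load(best))
```

In this example `node-b` is selected: it has a lower load per processor and fewer queued files per running instance.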
The current architecture implements well-defined interfaces for inter-machine communication and for communication with Monitor and Control GUIs. Data being processed, software and state are always kept local to each machine, and each machine has its own private copy of the Black Board. As a result, each node of the processing network is an independent entity. This allows the NOAO Pipeline to handle the failure of one or more nodes by transparently re-routing data to the machines that remain available.
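The re-routing behaviour can be pictured with a short sketch. It assumes a hypothetical `submit` callable that raises `ConnectionError` when a node cannot be reached; the actual inter-machine interface is not described here, so the names and the error-handling convention are illustrative only.

```python
def route(file_name, nodes, submit):
    """Try each node in preference order and return the first one that accepts the file."""
    for node in nodes:
        try:
            submit(node, file_name)
            return node
        except ConnectionError:
            # The node is unreachable; fall through to the next candidate.
            continue
    raise RuntimeError(f"no node available for {file_name}")


# Hypothetical stand-in for the network layer: node-a is down.
DOWN = {"node-a"}
accepted = []


def fake_submit(node, file_name):
    if node in DOWN:
        raise ConnectionError(node)
    accepted.append((node, file_name))


chosen = route("obj001.fits", ["node-a", "node-b", "node-c"], fake_submit)
print(chosen)
```

Because each node keeps its data, software and state locally, the caller only needs the list of candidate nodes; nothing has to be recovered from the failed machine before the file is sent elsewhere.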
Hiriart, R., Valdes, F., Pierfederici, F., Smith, C. & Miller, M. 2004, this volume, 74.