
ADASS XIII presentations

Session P1: Surveys and Large-Scale Data Management


P1.1: Data Archive and Transfer System (DARTS) of ISAS

Takayuki Tamura, ISAS, Hajime Baba, ISAS, Keiichi Matsuzaki, ISAS, Akira Miura, ISAS, Iku Shinohara, ISAS, Fumiaki Nagase, ISAS, Masahiro Fukushi, Fujitsu Limited, Kenji Uchida, Fujitsu Limited

DARTS is a central database for the scientific data obtained by satellites of the Institute of Space and Astronautical Science (ISAS) in Japan. We have archived and released data from ASCA (X-ray astronomy), IRTS (infrared astronomy), Yohkoh (solar physics), Geotail (geomagnetosphere), Akebono (aurora observation), and several other missions. In the near future, data from the Astro-E2 (X-ray astronomy), Astro-F (infrared astronomy), and Solar-B (solar physics) missions will be released through DARTS. These satellites will provide data of far greater volume and quality than previous satellites. To manage these upcoming data, we are developing a new system using technologies such as Java and XML. In addition, a new data distribution system on 'super SINET', an ultrahigh-speed network dedicated to academic institutes in Japan, is under construction. We will introduce the DARTS system and present our future plans.

P1.2: Metadata generic server prototyping at CNES

Paul KOPP, Centre National d'Etudes Spatiales (France)

CNES holds in its archives a huge number of datasets, the product of more than 40 years of spaceborne instrument measurements. Whilst the physical integrity of the datasets is no longer a real concern at CNES, the knowledge about the information contained in the datasets, i.e. the semantics needed to make good use of them, sometimes vanishes as time elapses. The reason is that semantics is seldom organised in a structured way and tightly attached to the datasets. Very often, it has to be mined from different documents. Sometimes no digital version of these documents is available. Sometimes there is no document at all and one must ask a scientist who may still know about the dataset. In any case, if the semantics of a dataset is not retrievable, the dataset becomes permanently unusable after only a few years, just as if it were physically unrecoverable.

Metadata is the name usually given to semantics attached to datasets. CNES has undertaken the development of an information server that collects metadata, manages them according to predefined structures and makes them available for analysis by future dataset users through an interface allowing different kinds of queries. This metadata server has been designed to meet the needs of very different provider and user profiles without requiring any recoding of the underlying software. In particular, the structures according to which metadata are handled by the metadata server, the way metadata are portrayed and the kinds of queries available to the user are easy to change. This characteristic is called genericity.

The metadata server is currently a prototype used inside CNES, and it should be made available over the Web by the end of this year. The presentation gives a view of the technical foundations that have led to a generic metadata server.

P1.3: User provided reduced data, catalogues and atlases in the ISO Data Archive

Alberto Salama, Christophe Arviset, Iñaki Ortiz, Jose Hernandez, Pedro Osuna, John Dowson

Five years after the end of the Infrared Space Observatory (ISO) operations, its data continue to be an exemplary resource for scientific exploitation. To date, just over one thousand papers based on ISO results have been published in the refereed literature, and the rate has not yet peaked. ISO data have yielded an abundance of exciting discoveries and many more are still to be expected. Many papers are based on a systematic reduction of ISO data, producing what we call 'Highly Processed Data Products' (HPDP). These products include DATA (images, spectra etc.) that have been processed beyond the automatic processing pipeline and/or using new, refined algorithms, and have therefore been improved to some degree relative to the legacy pipeline products, as well as any resulting CATALOGUES and ATLASES. In this direction, projects have been undertaken by the ISO Data Centre, in collaboration with the national instrument data centres, for the systematic data reduction of specific instrument modes, which will produce homogeneous sets of HPDP. The ISO Data Archive has been enhanced to host these new products. Version 6, released in July 2003, has the functionality for continuous ingestion of new data, catalogues and atlases, after screening by the ISO Data Centre. All datasets can be queried and retrieved in a user-friendly way.

P1.4: On the analysis of old objective prism spectra with modern systems

Corinne Rossi, Dipartimento di Fisica, Universita' La Sapienza, Roma, Italy, Roberto F. Viotti, IASF-Istituto di Astrofisica Spaziale e Fisica Cosmica, Roma, Italy, Alessandro Omizzolo, Specola Vaticana, Castelgandolfo, Roma, Vatican State, Angela Filippi & Silvia Pedicelli, Dipartimento di Fisica, Universita' La Sapienza, Roma, Italy

Objective prism (OP) plates collected with Schmidt telescopes are a heritage of pre-electronic astronomy that may still contain useful data for statistical research, as well as precious information on unrecorded peculiar events. Two critical aspects of the problem have to be considered: the digitisation of the large number of spectra recorded on a single plate, and the creation of a standard procedure for spectrophotometric calibration. For our project we used a set of 23 OP plates obtained by Lubos Kohoutek during 1969-1975 with the Hamburg-Bergedorf 80/120/240 cm Schmidt telescope equipped with a 4 deg objective prism, with the aim of studying the spectral evolution of the symbiotic star HBV 475 (V1329 Cyg) and of the nearby Nova Cygni 1970.

For each plate, a central region of about one square degree has been scanned using an EPSON 1640 XL scanner, and hence at a resolution far below that obtainable with classical microdensitometers. We used IRAF packages to compute the density from the digitised data and derived the wavelength scale for the best-exposed spectra, which was then used to derive the general dispersion curve of the plates.

For the spectrophotometric calibration, following the procedure developed by Baratta et al. (ApJ 187, 651, 1976) in their OP study of the symbiotic star V1016 Cyg, we selected in the field a number of stars with known photometric data and spectral classification. As a first step of our analysis, we have used these stars to derive the plate contrast in the blue and visual of the different photographic emulsions, and to obtain the B and V magnitudes of other stars in the field. Preliminary results of our project will be presented and discussed.

P1.5: The Chandra Multiwavelength Project (ChaMP): Optical Data Processing and Catalog Generation

Robert Cameron, Harvard-Smithsonian Center for Astrophysics, Wayne Barkhouse, Harvard-Smithsonian Center for Astrophysics, Paul Green, Harvard-Smithsonian Center for Astrophysics, Amy Mossman, Harvard-Smithsonian Center for Astrophysics, John Silverman, Harvard-Smithsonian Center for Astrophysics

A principal objective of the Chandra Multiwavelength Project (ChaMP) is the optical identification and cataloging of serendipitously detected background X-ray sources in Chandra archival data. The ChaMP uses a program of multi-filter optical imaging of observed Chandra fields to detect optical counterparts to X-ray sources. We describe the methods used for reduction, analysis and cataloging of optical sources in the ChaMP fields. Automated pipeline processing of the optical data includes source extraction, photometric calibration and optical to X-ray source matching. Visual inspection tools have been developed for quality control of the resultant source lists and for identification of interesting objects for follow-up spectroscopic observations. Methods and tools for management, presentation and access of the ChaMP catalogs are also described. This work was supported in part by NASA contract NAS8-39073 and by Chandra grants AR1-2003X and AR3-4018X.

P1.6: From FITS to SQL - Loading and Publishing the SDSS Data

Ani Thakar, JHU, Alex Szalay, JHU, Jim Gray, Microsoft

The Sloan Digital Sky Survey Data Release 1 (DR1) contains nearly 1 TB of catalog data published online as the Catalog Archive Server (CAS) at http://skyserver.pha.jhu.edu/dr1. The DR1 CAS is the end product of a data loading pipeline that takes the FITS file data exported by the LINUX-based SDSS Operational Database (OpDB), converts it to CSV (comma separated values) format, and loads it into an MS Windows-based relational DBMS (SQL Server).

Loading the data is potentially the most time-consuming and labor-intensive part of archive operations, and it is also the most critical: it is realistically your one chance to get the data right. We attempted to automate it as much as possible, and to make it easy to diagnose data and loading errors.

We describe this pipeline, focusing on the highly automated SQL data loader framework (sqlLoader) - a distributed workflow system of modules that check, load, validate and publish the data to the databases. The workflow is described by a directed acyclic graph (DAG) whose nodes are the processing modules. It is designed for parallel loading on a cluster and is controlled from an ASP web interface (Load Monitor).
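
As an illustration of the DAG-driven workflow described above, here is a minimal Python sketch. The step names (check, load, validate, publish) follow the abstract, but the dependency table, the `run_workflow` helper and the placeholder actions are hypothetical and are not taken from sqlLoader itself; Python's standard `graphlib` module stands in for the real workflow engine.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency table: each step maps to the steps it depends on.
# The step names mirror the abstract; the real sqlLoader graph is not shown there.
WORKFLOW = {
    "check": set(),
    "load": {"check"},
    "validate": {"load"},
    "publish": {"validate"},
}


def run_workflow(graph, actions):
    """Run each step once, only after all of its predecessors have run."""
    executed = []
    for step in TopologicalSorter(graph).static_order():
        actions[step]()
        executed.append(step)
    return executed


# Placeholder actions that only record their own name.
log = []
actions = {name: (lambda n=name: log.append(n)) for name in WORKFLOW}
order = run_workflow(WORKFLOW, actions)
print(order)
```

In a graph with independent branches, the `prepare()`, `get_ready()` and `done()` methods of `TopologicalSorter` would let ready steps be handed to different workers, which is one way to picture parallel loading on a cluster.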

The validation stage, in particular, represents a systematic and thorough scrubbing of the data before it is deemed worthy of publishing. The publish step merges the different data products into a set of linked tables that can be efficiently searched with specialized indices and pre-computed joins.

We are in the process of making the sqlLoader generic and portable enough so that other archives may adapt it to load, validate and publish their data.

P1.7: Adapting the BIMA Image Pipeline for Miriad using Python

David Mehringer, Univ. of Illinois/NCSA, Raymond Plante, Univ. of Illinois/NCSA

Through our experience using AIPS++ in the BIMA Image Pipeline, we found that a sophisticated scripting environment is crucial for supporting an automated pipeline. Miriad V4, now in development, introduces support for calling Miriad programs from a Python environment (referred to as Pyramid). We are creating processing recipes using Miriad through Python that can be used with the BIMA Image Pipeline. As part of this work, we are prototyping tools that could be integrated into Pyramid. These include two Python classes, UVDataset and Image, for examining the contents of Miriad datasets. These simple tools have allowed us to recast our Pipeline using Miriad in a few months. Python recipes are used for such things as determining line-free channels for continuum subtraction and determining if data will benefit from self-calibration. We are currently using the pipeline to do massive processing of hundreds of tracks of archival data.
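
To give a feel for what a recipe that determines line-free channels might do, here is a minimal sketch in plain Python. It does not use the Pyramid, UVDataset or Image interfaces, whose APIs are not described in the abstract; the `line_free_channels` function, its threshold and the toy spectrum are all assumptions, and the robust-sigma clipping shown is a generic technique rather than the authors' method.

```python
from statistics import median


def line_free_channels(spectrum, threshold=3.0):
    """Return indices of channels within `threshold` robust sigmas of the median."""
    centre = median(spectrum)
    # Median absolute deviation, scaled to approximate a Gaussian sigma.
    mad = median(abs(value - centre) for value in spectrum)
    sigma = 1.4826 * mad
    if sigma == 0:
        # Perfectly flat spectrum: every channel counts as line-free.
        return list(range(len(spectrum)))
    return [i for i, value in enumerate(spectrum)
            if abs(value - centre) <= threshold * sigma]


# Toy spectrum (made up): flat continuum near 1.0 with a line in channels 4-6.
spectrum = [1.0, 1.1, 0.9, 1.0, 5.0, 9.0, 5.2, 1.05, 0.95, 1.0, 1.1, 0.9]
free = line_free_channels(spectrum)
continuum = sum(spectrum[i] for i in free) / len(free)
print(free, round(continuum, 3))
```

The mean of the selected channels gives a continuum estimate that could then be subtracted from every channel.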

P1.8: LIGO Data Analysis Software and Systems

Gregory Mendell, LIGO Hanford Observatory

The Laser Interferometer Gravitational-wave Observatory (LIGO) project began to take science data in 2002. An overview of the software and systems being used to analyze the data by the LIGO Science Collaboration will be presented. Results of the analysis effort to date and its future direction will be discussed.

P1.9: Searching for a cosmic string through the gravitational lens effect: Japanese Virtual Observatory science use case

Y. Shirasaki, NAOJ, Y. Mizumoto, NAOJ, E. Matsuzaki, TITech, M. Oishi, NAOJ, N. Yasuda, NAOJ, M. Tanaka, NAOJ, S. Honda, NAOJ, H. Yahagi, NAOJ, M. Nagashima, Univ. of Durham, G. Kosugi, NAOJ, N. Kashikawa, NAOJ, F. Kakimoto, TITech, S. Ogio, TITech

This paper describes a method to search for a cosmic string using its unique gravitational effect, which produces aligned double images, and the implementation of this method in the Japanese Virtual Observatory (JVO).

Grand unified theory predicts that superheavy cosmic strings with a linear mass density of 10^22 g/cm were produced at a phase transition in the early universe. The lensing effect of a long straight object can be characterized by undistorted double images of galaxies and quasars that are co-aligned along the direction of the string network and distributed over a very large scale. Because of this large-scale nature, a wide-field deep survey is crucial for discovery.
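
To put the quoted mass density in observational terms, the short sketch below evaluates the standard deficit angle of a straight cosmic string, 8πGμ/c². This formula is a textbook result and is not stated in the abstract; the function name and the rounded CGS constants are assumptions made for illustration. The observed image separation also depends on the observer-string-source geometry and is at most of this order.

```python
import math

G_CGS = 6.674e-8    # gravitational constant, cm^3 g^-1 s^-2 (rounded)
C_CGS = 2.998e10    # speed of light, cm/s (rounded)
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def deficit_angle_arcsec(mu_g_per_cm):
    """Deficit angle 8*pi*G*mu/c^2 of a straight string, in arcseconds."""
    return 8.0 * math.pi * G_CGS * mu_g_per_cm / C_CGS**2 * ARCSEC_PER_RAD


angle = deficit_angle_arcsec(1e22)
print(f"{angle:.2f} arcsec")
```

For 10^22 g/cm the result is a few arcseconds, a separation that ground-based wide-field imaging can resolve.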

We have constructed a database of Subaru Suprime-Cam catalogs and images for a selected area and applied our search method through the JVO prototype.

P1.10: Towards a large field of view archive for the European VLBI Network

Huib van Langevelde, Friso Olnon, Harro Verkouter, Bauke Kramer, Mike Garrett, Arpad Szomoru

Traditionally, VLBI observations focus on a small patch of sky and typically image a few hundred mas around a bright source, which is often used to self-calibrate the data. High spectral and time resolution is needed to image a larger area of the sky, up to the primary beam of the individual telescopes. The EVN MkIV data processor at JIVE is being upgraded to make such high resolution data its standard product. From the archive of high resolution data it will be possible to image many sources in each field of view around the original targets.

P1.11: The WFCAM Data Pipelines

James Lewis, Institute of Astronomy, Cambridge, Mike Irwin, Institute of Astronomy, Cambridge

The infrared mosaic camera WFCAM will be commissioned on the United Kingdom InfraRed Telescope (UKIRT) in the first quarter of 2004. In order to maximise observing efficiency and data quality, two separate data reduction pipelines will be employed. The first will run at the summit and will provide rapid feedback to observers on data quality and observing conditions. The second will run in Cambridge and reduce and extract the information that will ultimately feed into the WFCAM science archive. Although the environments in which the pipelines will run are different, both share the same underlying processing software (CIRDR). It is part of a package of basic astronomical processing software originally written at CASU to process CIRSI data. In this paper we describe the architecture as well as the processing strategy of each pipeline.

P1.12: The INTEGRAL Archive System

Mohamed Tahar Meharga, ISDC - INTEGRAL Science Data Centre, Switzerland, Mathias Beck, ISDC, Switzerland, Pavel Binko, SYNSPACE SA and ISDC, Switzerland, Tom McGlynn, LHEA, Code 660, NASA's GSFC, USA, Katja Pottschmidt, Max-Planck-Institut für extraterrestrische Physik, Germany and ISDC, Switzerland, Roland Walter, ISDC, Switzerland

We present the archive system designed for the long-term storage of the data provided by ESA's International Gamma-Ray Astrophysics Laboratory (INTEGRAL, launched October 2002). About 6 GB are archived at the INTEGRAL Science Data Centre (ISDC, near Geneva) for every ~3 day revolution of INTEGRAL. We first give a short overview of the main components of the INTEGRAL archive - namely the data ingestion software, the data organization concept, the ORACLE meta-database, the data distribution pipeline, and the modified W3Browse data access interface - reviewing their current operational performance. Building on those rather standard archive components, the unique properties of INTEGRAL's data (large field of view, coded mask imaging technique, complex auxiliary information, multi-version data) triggered the development of some additional features. We will discuss several of them - e.g., the option for external users to trigger the distribution pipeline via the modified W3Browse facility. The pipeline then builds and provides (via FTP) the complex dataset of all related science and auxiliary files necessary for further analysis of public observations.

P1.13: Clustering the large VizieR catalogues, the CoCat experience

François Ochsenbein, CDS, Sébastien Derriere, CDS, Sébastien Nicaisse, CDS, André Schaaff, CDS

VizieR is a database containing about 4000 astronomical catalogues with homogeneous descriptions. Most of the catalogues are stored in a relational database, but very large catalogues containing over 10 million rows are stored as compressed binary files and have dedicated tools for very fast access by celestial coordinates; this method proved to be much more efficient, in terms of both speed and disk usage, than storing the huge tables in relational databases. The main goal of the CoCat (Co-processor Catalogue) project was to parallelize the processing of the large VizieR catalogues (data extraction, cross-matching) in order to reduce the response time. A wide range of free and commercial clustering tools is available. Our project is based on a new free clustering tools package, CLIC (Cluster LInux pour le Calcul), which is based on the Mandrake Linux 9.0 distribution.
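
The cross-matching operation mentioned above can be illustrated with a brute-force Python sketch. It is not the VizieR or CoCat implementation, which relies on dedicated binary files and access tools; the function names, the matching radius and the coordinates are made up for illustration.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def cross_match(cat_a, cat_b, radius_arcsec):
    """Pair each source in cat_a with its nearest cat_b source within the radius."""
    radius_deg = radius_arcsec / 3600.0
    matches = []
    for name_a, ra_a, dec_a in cat_a:
        best = None
        for name_b, ra_b, dec_b in cat_b:
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b)
            if sep <= radius_deg and (best is None or sep < best[1]):
                best = (name_b, sep)
        if best is not None:
            matches.append((name_a, best[0]))
    return matches


# Made-up (name, RA, Dec) entries in degrees, for illustration only.
cat_a = [("A1", 10.0000, 20.0000), ("A2", 150.0, -30.0)]
cat_b = [("B1", 10.0002, 20.0001), ("B2", 200.0, 5.0)]
matched = cross_match(cat_a, cat_b, radius_arcsec=2.0)
print(matched)
```

Each source in the first catalogue is handled independently of the others, so the outer loop can be split across processes or nodes; this is the kind of work a cluster can share out.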

P1.14: An Authenticated FTP Staging Area for Proprietary Data

Tim Kimball, STScI, Farshid Khoui, STScI

MAST is implementing an FTP staging server for proprietary data from the HST and FUSE archives. This server will authenticate connections against the DADS registered users database, providing protection for proprietary data staged from DADS. The server software is the Covalent Enterprise FTP Server, a commercial product based on Apache 2.0. Authentication is done using a mod_perl handler written in house. This handler was the only software development required; the rest of the server's functionality is implemented through Apache and Covalent configuration directives. In addition to staging, the server can be used by research teams to exchange non-public data with collaborators.

P1.15: HST ACS Associations, the Next Step after WFPC2

Richard Hook, ST-ECF/ESO, Daniel Durand, CADC, Luc Simard, CADC, Anton Koekemoer, STScI, Alberto Micol, ST-ECF/ESA

After the release of the WFPC2 associations, the CADC, ST-ECF and STScI are now working on joint pipeline software to produce associations of images from the HST's Advanced Camera for Surveys instrument.

Although the basic approach is very similar to the WFPC2 associations (Durand et al., this conference) there are some fundamental differences because of the high level of geometric distortion of the ACS optics. The core of the ACS association pipeline will perform image combination using the Drizzle method and hence there will be no need to constrain the position angle of associated observations as was done with WFPC2. Our goals are the production of high quality products for the HST archive users and the 'publication' of these products within the Virtual Observatory.

P1.16: Distributed Data Storage Systems for Astronomy

Justin Kanoa Withington

In response to the current generation of wide-field imagers, many astronomy facilities have remarkable online data storage capacity requirements. Distributed data storage systems are an attractive solution since they can utilize low-cost hardware and scale well. Parameters in the design of such systems relate directly to their cost and performance. This paper examines some of those parameters and relates them to examples of distributed data storage at the CFHT and other data centers. The purpose is to provide a discussion of distributed storage design for astronomy.

P1.17: Increasing the Accessibility of GBT Data

Karen O'Neil, NRAO - Green Bank, Ron Maddalena, NRAO - Green Bank, Nicole Radziwill, NRAO - Green Bank

The Green Bank Telescope (GBT) currently outputs its raw data as a suite of binary FITS files, approximately one per component device on the telescope, which are then consolidated and pre-processed before being written into an AIPS++ Measurement Set for more extensive analysis. This essentially restricts astronomers to a single data analysis package and reduces the productivity of those who prefer analysis packages other than AIPS++. To maximize the scientific returns from the unique features of the GBT, and to support a broader cross-section of observers' backgrounds and interests, work is being done to combine raw GBT data from the disparate FITS files into a variety of standardized FITS file formats such as SDFITS and CLASS FITS. These files can then be analyzed using tools such as IDL, CLASS, Mathematica, and Matlab. Here we describe prototyping exercises that were initiated in Green Bank during the summer of 2003 to identify how to make GBT data more readily accessible to a wider variety of data reduction tools. Although further refinement is needed to support the standard observing modes of the GBT in a production capacity, early results from the investigation demonstrate the feasibility and applicability of the approach.

P1.18: The ACS Science Data Archive

Terence Allen, JHU, William Jon McCann, JHU

The ACS GTO Science Data Archive offers access to all products generated by the ACS GTO data processing pipeline (APSIS) through a database-driven web interface. It enables science team members to construct ad hoc and deeply structured queries through visually intuitive navigation and search tools. The current capabilities include: color and magnitude constrained object searches, cross-correlated searches with other online catalogs, direct download access to all FITS images, download of complete object catalogs and search results in a variety of formats (including VOT, HTML, XML, CSV).

P1.19: The Gemini Science Archive

Severin Gaudet, CADC/HIA/NRC, David Bohlender, CADC/HIA/NRC, Adrian Damian, CADC/HIA/NRC, Sharon Goliath, CADC/HIA/NRC, Norm Hill, CADC/HIA/NRC

The National Research Council of Canada's Canadian Astronomy Data Centre (CADC) at the Herzberg Institute of Astrophysics is currently developing an archive system for the Gemini North and South telescopes: the Gemini Science Archive (GSA). The CADC will also operate the GSA and anticipates the start of basic operations in the fall of 2003. Early archive facilities available to the user will include basic catalogue browsing through a web interface and retrieval of public, unprocessed data from facility instruments. Additional functionality will be added to the GSA operations in an incremental fashion in parallel with the archive development.

P1.20: Webservices for the Sloan Digital Sky Survey

Joerg M. Colberg, University of Pittsburgh, Andy Connolly, University of Pittsburgh, Ryan Scranton, University of Pittsburgh

Over the past few years, computer technology has been pushed by the software industry to integrate the internet into applications rather than treating it as a static way to exchange data. Custom-designed pieces of information can now be sent and retrieved by programmes that know how to find what they need and how to access and process it. Such services are usually called webservices. Webservices are of interest to anybody who has to deal with large amounts of static and non-static data and with a potentially large number of users interested in those data. Thus, webservices have a lot of potential for scientific applications such as the Sloan Digital Sky Survey (SDSS), one of the largest existing surveys of part of the sky to date. A lot of work has already been devoted to database issues and to data mining. We focus on giving users access to data mining tools through an easy interface. We are going to show a first webservice developed at the University of Pittsburgh. The webservice allows a user to specify a dataset, to run the dataset through a very efficient N-point correlation code (a very popular astrophysical analysis) and then to retrieve only the final results, thus reducing network traffic to a minimum. The application incorporates all steps from database querying to a version of the N-point code embedded in Microsoft's .NET to dealing with user interfaces.
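
The abstract does not describe the N-point code itself. As a rough illustration of the simplest case (N = 2), the following Python sketch counts point pairs in separation bins by brute force; the `pair_counts` function, the bin edges and the positions are assumptions made for this example, and an efficient code would avoid the quadratic loop.

```python
import math
from itertools import combinations


def pair_counts(points, bin_edges):
    """Count point pairs whose separation falls in each [edge_i, edge_i+1) bin."""
    counts = [0] * (len(bin_edges) - 1)
    for (x1, y1), (x2, y2) in combinations(points, 2):
        r = math.hypot(x2 - x1, y2 - y1)
        for i in range(len(counts)):
            if bin_edges[i] <= r < bin_edges[i + 1]:
                counts[i] += 1
                break
    return counts


# Made-up positions in arbitrary units.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
counts = pair_counts(points, bin_edges=[0.0, 1.2, 2.0, 10.0])
print(counts)
```

Only the short list of counts needs to travel back to the user, which reflects the idea of running the analysis next to the data and returning just the final result.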

P1.21: The NOAO Pipeline Data Manager

Rafael Hiriart, NOAO, Frank Valdes, NOAO, Francesco Pierfederici, NOAO, Chris Smith, NOAO, Michelle Miller, NOAO

The Data Manager for the NOAO Pipeline system is a set of interrelated components being developed to fulfill the pipeline system's data needs. It includes:

1. Management of calibration files (flat, bias, bad pixel mask and xtalk calibration data).

2. Management of the pipeline stages' configuration parameters.

3. Management of the pipeline processing historic information, for each of the data products generated by the pipeline.

The Data Manager components use a distributed, CORBA-based architecture, providing a flexible and extensible object-oriented framework capable of accommodating present and future pipeline data requirements.

The Data Manager communicates with the pipeline modules, with internal and external databases, and with other NOAO systems such as the NOAO Archive and the NOAO Data Transport System.

P1.22: The D4A Digitiser

Jean-Pierre De Cuyper, Royal Obs. Belgium, Lars Winter, Sternwarte Bergedorf, Germany, Joost Van Ommeslaeghe, NGI Belgium

The pilot project D4A - Digital Access to Aerial and Astro photographic Archives - initiated and financed by the Belgian Federal Science Policy Office, aims to acquire the necessary know-how, hardware and software to digitise the astro photographic collections of the Royal Observatory of Belgium and the aerial photographic collections of the National Geographic Institute and the Royal Museum of Central Africa, as well as the associated metadata. The project set out to offer the results to the public and to make them directly usable for scientific research through the modern techniques of the information society. The D4A project is constructing a high precision two dimensional plate digitiser that will be operated on the step in order to create a precise digital copy of the original greyscale (or colour) image or spectrum with a pixel size of about 5 micrometres. The digital camera-objective unit is mounted on a granite bridge, perpendicularly above the plate, which is mounted on the inner open frame of a granite based air bearing XY-table with a geometric positioning accuracy and repeatability of some ten nanometres. The objective used is a two-sided 1:1 telecentric lens, which ensures that if the image surface is not perfectly flat, the introduced error will only slightly enlarge the projected image of a point source, while keeping it isotropic and without displacing it. The part of the footprint of the telecentric objective used will be limited to its central part, where the distortion is smaller than a pre-defined maximum. In this way an optical contact copy of the original image onto the digital detector will be achieved. In order to reach and maintain a high geometric and radiometric accuracy, the D4A digitiser will be placed in a clean room, at a temperature of 18 °C ± 0.1 °C (1 sigma) and at a relative humidity of 50% RH ± 1% RH (1 sigma).
To obtain the sharpest possible image, the air between the photographic emulsion and the objective must be kept at rest. A benchmark was developed using a geometric dotted grid (on a chrome on glass plate) in order to test and fully calibrate the absolute positioning of the XY-table at each point of its 335 mm x 335 mm travel area. This is achieved by tabulating the systematic errors of the ZERODUR linear encoder on each of the four linear motor rails into the positioning software. The D4A digitiser will be equipped with a computer operated plateholder on a 270° turntable suited for both glass plates and film sheets up to 350 mm x 350 mm and with an automatic film roll transport system. A comparison with graphical and photogrammetric commercial scanners will also be presented.

P1.23: Transparent XML Binding using the ALMA Common Software (ACS) Container/Component Framework

Heiko Sommer, ESO, Gianluca Chiozzi, ESO, Bogdan Jeram, ESO, David Fugate, NRAO, Matej Sekoranja, Cosylab

ALMA software, from high-level dataflow applications down to instrument control, is built using the ACS framework. The common architecture and infrastructure used for the whole ALMA software is presented at this conference in another paper [1].

ACS offers a CORBA-based container/component model and supports the exchange and persistence of XML data. For the Java programming language, the container integrates transparently the use of type-safe Java binding classes to let applications conveniently work with XML transfer objects without having to parse or serialize them.

This talk will show how the ACS container/component architecture serves to pass complex data structures, such as observation meta-data, between heterogeneous applications.

To give an overview and a visual impression of ACS, this talk and a poster presentation on control system aspects of ACS [2] will be complemented by a joint demo.

[1] J. Schwarz et al., "The ALMA Software System", ADASS 2003

[2] B. Jeram et al., "Interfacing software with hardware in a Control System based on the ALMA Common Software", ADASS 2003

P1.24: ALMA Proposal Preparation: Supporting the novice and the expert.

Alan Bridger, UK Astronomy Technology Centre, Joe Schwarz, European Southern Observatory, David Clarke, UK Astronomy Technology Centre, Maurizio Chavan, European Southern Observatory, Heiko Sommer, European Southern Observatory, Marcus Schilling, European Southern Observatory

The Atacama Large Millimetre Array Observatory (ALMA) is an international collaboration between Europe and North America to build a synthesis radio telescope that will operate at millimetre and submillimetre wavelengths. Other papers in this conference outline the overall ALMA Software design and cover various aspects of the software system.

In this paper we describe the subsystem that will provide the Astronomer's main interface to observing with ALMA: the Proposal and Observation Preparation subsystem. Tools to handle Proposal and Observing preparation for the World's major telescopes are of course now commonplace (HST, Gemini, VLT, UKIRT, JCMT to mention a few), and we will build on the experience of those tools, but the ALMA telescope will present some novel challenges.

We will outline the technical side of the subsystem and how it integrates fully with the overall ALMA software system, and describe our proposed solutions to the challenges on the user-interface side of the system: the major requirement is to fully support the needs of both advanced users, who are expert in submillimetre aperture synthesis observing, and novices, who may know very little about the domain.

P1.25: The ALMA Prototype Pipeline

Lindsey Davis, NRAO, Brian Glendenning, NRAO, Doug Tody, NRAO

The ALMA prototype pipeline project is a joint initiative of the NRAO Data Management group and the ALMA Computing IPT to develop a prototype pipeline data processing capability for ALMA. The pipeline prototype will use Python for pipeline scripting, the CORBA-based ALMA ACS as a scalable computational framework, and AIPS++ components written in C++ as the processing modules. Control GUIs will be implemented as pluggable Java components. Data from an ongoing VLA GRB monitoring program will be used to provide a real-life use case to test the pipeline.

The current project is divided into 3 phases covering a period of one year, terminating in mid-2004. Phase A, already underway, involves connecting several AIPS++ distributed objects to ACS and demonstrating a basic tasking capability including messaging. Phase B will demonstrate a basic Python and Java scripting capability for AIPS++ components and requires porting the remaining major AIPS++ distributed objects to the ACS framework. Phase C will demonstrate hosting a simple Python processing script as a servant script and the capability to perform automated processing.

P1.26: The ALMA Archive: A centralized system for information services.

Andreas Wicenec, ESO, Simon Farrow, UMIST, Severin Gaudet, HRI, Norman Hill, HRI, Holger Meuss, ESO, Alastair Stirling, JBO

ALMA will produce enormous data rates and volumes. In full operation it will generate up to 60 MB/s of scientific data and, in addition, auxiliary and logging data at sampling intervals down to 48 ms. These data have to be made persistent as early as possible after their production. Consequently the archive is placed at the very center of the ALMA data flow system and all other subsystems utilize the services it provides. In addition to these services, the archive subsystem has to implement the standard archive functionalities for PIs and archive researchers, and it is probably the first archive to have VO compliance written into its science requirements. This paper gives an overview of the design, the implementation and the current status of the ALMA archive subsystem.

P1.27: Flexible Storage of Astronomical Data in the ALMA Archive

Holger Meuss, Andreas Wicenec, Simon Farrow

The requirements for the archiving of ALMA observation data are challenging: Not only are the expected rates of observation and monitor data extremely high (0.5 TeraByte/day), there is also the need to archive metadata about projects, proposals, observations, scheduling blocks, etc. in a flexible way that allows for changes in the structure of these data over the years.
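
The 0.5 TeraByte/day quoted here can be compared with the peak of up to 60 MB/s quoted in the P1.26 abstract by a simple unit conversion. The sketch below assumes decimal units (1 TB = 10^6 MB) and treats 0.5 TB/day as a sustained daily average; neither assumption is stated in the abstracts, and the function names are chosen for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MB_PER_TB = 1_000_000  # decimal units assumed


def daily_volume_to_rate(volume_tb_per_day):
    """Convert a daily volume in TB/day to a mean rate in MB/s."""
    return volume_tb_per_day * MB_PER_TB / SECONDS_PER_DAY


def rate_to_daily_volume(rate_mb_per_s):
    """Convert a sustained rate in MB/s to a daily volume in TB/day."""
    return rate_mb_per_s * SECONDS_PER_DAY / MB_PER_TB


average_rate = daily_volume_to_rate(0.5)   # about 5.8 MB/s
peak_volume = rate_to_daily_volume(60)     # about 5.2 TB/day
print(f"{average_rate:.2f} MB/s, {peak_volume:.2f} TB/day")
```

Under these assumptions the expected daily volume corresponds to roughly a tenth of the peak rate, so the archive has to absorb bursts well above its average load.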

The ALMA archive is divided conceptually into three parts: (1) the BulkStore for the observation data itself, (2) the MonitorStore for monitor data collected by all instruments, and (3) the XMLStore for metadata about observation and monitor data. The entities in the three distinct stores are highly interrelated.

We will give an overview of the architecture of the ALMA archive with a special focus on XML storage. XML (eXtensible Markup Language) was chosen not only as the format for communicating data in the ALMA computing infrastructure, but also for archiving data, since it provides the flexibility needed by the ALMA archive: XML is designed to represent semistructured data, i.e. data whose structure is irregular, changing over time or even unknown. This makes it the format of choice for software that has to work over many years, during which changes in the underlying data structures are unavoidable.

P1.28: Dynamic Scheduling in ALMA

Allen Farris, NRAO, Sohaila Roberts, NRAO

The scheduling subsystem of the ALMA software system is described. Since weather will play such an important role in science observations with ALMA, the telescope will operate primarily in a dynamic scheduling mode. Current environmental conditions together with the current state of the telescope itself will be used to algorithmically determine the best scheduling block to execute at any time. This technique implements a micro-scheduler; at each moment in time it answers the question: What is the best thing to do now? The fundamental concepts used to solve this problem are presented as well as the software architecture of this subsystem.
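
The micro-scheduler idea, picking the best scheduling block for the current conditions, can be sketched in a few lines of Python. Everything specific here is hypothetical: the `SchedulingBlock` fields, the use of precipitable water vapour and source elevation as the conditions, and the scoring rule are illustrative assumptions, not the algorithm of the ALMA scheduling subsystem.

```python
from dataclasses import dataclass


@dataclass
class SchedulingBlock:
    name: str
    priority: float           # scientific ranking, higher is better
    max_pwv_mm: float         # worst water vapour the block tolerates
    min_elevation_deg: float  # lowest usable source elevation


def best_block(blocks, pwv_mm, elevation_deg):
    """Return the highest-scoring block that is feasible right now, or None."""
    best, best_score = None, float("-inf")
    for block in blocks:
        elevation = elevation_deg.get(block.name, 0.0)
        if pwv_mm > block.max_pwv_mm or elevation < block.min_elevation_deg:
            continue  # not observable under the current conditions
        # Hypothetical score: priority plus a bonus for weather-demanding
        # blocks, so that rare good conditions are not spent on easy ones.
        score = block.priority + 1.0 / block.max_pwv_mm
        if score > best_score:
            best, best_score = block, score
    return best


# Made-up blocks and current source elevations in degrees.
blocks = [
    SchedulingBlock("submm_deep", 8.0, 1.0, 30.0),
    SchedulingBlock("mm_survey", 6.0, 5.0, 20.0),
    SchedulingBlock("low_source", 9.0, 5.0, 40.0),
]
elevations = {"submm_deep": 55.0, "mm_survey": 60.0, "low_source": 25.0}

print(best_block(blocks, pwv_mm=0.8, elevation_deg=elevations).name)
print(best_block(blocks, pwv_mm=3.0, elevation_deg=elevations).name)
```

Calling `best_block` again whenever the conditions change reproduces the "what is the best thing to do now?" loop; in this toy example the dry-weather call selects the submillimetre block and the wetter one falls back to the millimetre survey.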

P1.29: ALMA On-Line Calibration Software

Robert Lucas, IRAM, Dominique Broguière, IRAM, Fanny Cosson, IRAM, Heiko Hafok, MPIfR, Juan R. Pardo, IEM

On-line calibration consists of calibration operations performed to keep the ALMA observing system properly tuned so that the planned observations can be executed successfully. Many of these calibrations are reduced on a short time scale so that the results are available to the observing process in due time. Results are also used as input to hardware quality control, to the dynamic scheduling of observing projects, and to final data processing by the Science Pipeline. The processing operations include the calibration of the atmospheric absorption, of the radiometric phase correction, of pointing scans, and of phase and amplitude reference measurements, among others. We will describe the developments as planned and the options taken so far.
