The performance of current hardware and networks gives astronomers the possibility of organising their own data in local databases. Nevertheless, setting up such databases is still difficult, especially for the complex and heterogeneous data used in astronomy. SAADA is a tool that enables astronomers to easily create their own databases from archival files (images, spectra, tables, ...) or from imported data. It aims to make the process of database creation as automatic as possible. SAADA's functionality will include data loading, automatic web interfacing, and some interoperability features. The system is built on widely used standards and freeware modules. SAADA is an object-oriented system based on a class diagram matching the different types of stored data. Data persistence is handled by a relational database management system. The system is able to automatically generate specialized classes modelling the input data. Logical links between records can easily be set up by astronomers in order to add scientific content to the database.
Mining large quantities of uncalibrated archive data for specific sources can prove to be a hard task. Even an automated search engine able to use an archive's metadata (instrument, filter, exposure time, ...) is not completely sufficient. Indeed, without calibration it is difficult to know whether an interesting source can be seen on the images without actually looking at them. Here, we show how a "reversed" exposure time calculator can be used to lightly process the database-stored image descriptors of the ESO/Wide Field Imager (WFI) archive and compute the corresponding limiting magnitudes.
The end result is a more scientific description of the ESO/ST-ECF archive contents, allowing a more astronomer-friendly archive user interface and hence increasing the usability of the archive in the context of a Virtual Observatory. This method is being developed to improve the Querator search engine of the ESO/HST archive, in the context of the EC-funded ASTROVIRTEL project.
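As a rough illustration of what a "reversed" exposure time calculation involves, the sketch below inverts the usual signal-to-noise relation to obtain a limiting magnitude from stored image descriptors. It assumes a sky-limited point source; the function name and all inputs (photometric zero point, sky rate, aperture size) are hypothetical and are not taken from the Querator or WFI implementation.

```python
import math


def limiting_magnitude(zero_point, exptime_s, sky_rate, npix, snr=5.0):
    """Sky-limited limiting magnitude of a point source (illustrative only).

    zero_point : magnitude that yields 1 count/s (assumed calibration value)
    exptime_s  : exposure time in seconds
    sky_rate   : sky background in counts/s/pixel
    npix       : number of pixels in the measurement aperture
    snr        : signal-to-noise ratio required for a detection
    """
    # Poisson noise of the sky collected in the aperture, in counts.
    noise_counts = math.sqrt(sky_rate * exptime_s * npix)
    # Source count rate that reaches the requested signal-to-noise ratio.
    limiting_rate = snr * noise_counts / exptime_s
    return zero_point - 2.5 * math.log10(limiting_rate)
```

In this approximation the limiting count rate scales as the inverse square root of the exposure time, so a fourfold longer exposure goes deeper by 2.5 log10(2), about 0.75 mag.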
Within ESA’s Science Operations and Data Systems Division, the Archive Development Group in Villafranca, Spain is responsible for developing and maintaining the ESA Scientific Archives. In particular, the ISO Data Archive (IDA) and the XMM-Newton Science Archive (XSA) have been developed using the same flexible and modular 3-tier architecture, which has allowed them to be interoperable with other astronomical archives and applications.
The standard way of accessing these ESA archives is normally through a powerful Java interface (http://www.iso.vilspa.esa.es/ida for ISO and http://xmm.vilspa.esa.es/xsa for XMM-Newton) where users can interactively query, visualize and retrieve the observations and sources’ catalogues of these missions.
For several years already, the IDA and XSA have offered interoperability services which allow external archives or applications to bypass the standard user interfaces and access the archive catalogues and products without human intervention. Furthermore, the IDA and XSA user interfaces also access remote archives (ADS, SIMBAD, NED, IRAS, ...) to provide links to relevant information for the end user.
In recent months, these interoperability services have been adapted to comply with the new VO standards, in particular the SIAP (Simple Image Access Protocol). We will explain how the modular architecture of the archives has greatly helped in implementing these new standards, how the services can be accessed and used, and the interest ESA sees, as a Data Provider, in adopting these VO standards.
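To make the SIAP access pattern concrete, the following sketch builds a query URL from the three basic parameters of the Simple Image Access Protocol (POS, SIZE and FORMAT). The helper name and the base URL in the usage note are placeholders; the actual IDA and XSA service endpoints are not given in the abstract.

```python
from urllib.parse import urlencode


def siap_query_url(base_url, ra_deg, dec_deg, size_deg, image_format="image/fits"):
    """Build a Simple Image Access query URL (illustrative helper).

    POS is "ra,dec" in decimal degrees, SIZE is the region size in degrees,
    and FORMAT selects the image type that the service should list.
    """
    params = {
        "POS": f"{ra_deg},{dec_deg}",
        "SIZE": str(size_deg),
        "FORMAT": image_format,
    }
    # Append to an existing query string if the base URL already has one.
    separator = "&" if "?" in base_url else "?"
    # Keep ',' and '/' unescaped so the parameters stay human-readable.
    return base_url + separator + urlencode(params, safe=",/")
```

For example, `siap_query_url("http://archive.example.org/siap", 180.0, -30.5, 0.2)` yields `http://archive.example.org/siap?POS=180.0,-30.5&SIZE=0.2&FORMAT=image/fits`. A compliant service would answer such a request with a VOTable listing the matching images.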
Well-defined data specifications are fundamental for the operation of large observing facilities. Collecting and organizing these specifications into a dictionary provides easy access to a complex body of FITS keyword conventions and precise definitions for a mission. This is particularly important for facilitating multi-mission data analysis. Such dictionaries are also important as an aid to projects and missions that are in the process of defining their conventions. Furthermore, in the context of a Virtual Observatory, electronic access to this information is essential.
In response to these needs and in accordance with HEASARC requirements, the Chandra Data Archive (CDA) has been tasked with providing a FITS keyword dictionary defining the keywords used in the headers of approximately 240 different types of Chandra FITS files. We present an account of the functionality of the dictionary, as well as a description of the database design and details of the tools which display the dictionary.
This work is supported by HEASARC grant 16613384 and NASA contract NAS8-39073 (CXC).
In the Eros experiment and the XMM-Newton Large Scale Structure Survey, a large part of the physical data (images, light curves, FITS files, ...) has been stored as plain files, while the organization of these files, the directory structure and the metadata have been kept in a relational database management system. To access these files remotely, two applications were developed that allow physicists to display and edit data using a web browser such as Mozilla or Internet Explorer. These applications, EDBG and L3SDB, are implemented using servlets and JavaServer Pages technology with the Apache Tomcat server in a common framework.
The terabyte-scale SuperCOSMOS Sky Survey (SSS; Hambly et al., 2001, MNRAS, 326, 1279) consists of digitised scans of Schmidt photographic material in a multi-colour (BRI), multi-epoch, uniformly calibrated product. It covers the whole southern hemisphere, with an extension into the north currently underway. Public online access to SSS pixel data and object catalogues has been available for some time; data are being downloaded at a rate of several gigabytes per week, and many new science results are emerging from community use of the data.
In this paper we describe the SuperCOSMOS Science Archive (SSA), which is a recasting of the SSS catalogue system from flat files into an RDBMS, with a greatly enhanced user interface. We describe some aspects of the hardware and schema design of the SSA, which aims to produce a high performance, VO-compatible database, suitable for data mining by 'power users', while maintaining the ease of use praised in the old SSS system.
Initially, the SSA will allow access through web forms and a flexible SQL interface. It acts as the prototype for the next generation of survey archives to be hosted by the University of Edinburgh's Wide Field Astronomy Unit, such as the WFCAM Science Archive of infrared sky survey data, as well as being a scalability testbed for use by AstroGrid, the UK's Virtual Observatory project. As a result of these roles, its functionality will expand over time as web services, and later Grid services, are deployed on it.
To compare observations of objects made at various wavelengths, such as are to be found in the Virtual Observatory, good positions are needed. New instruments for ground-based observing, such as multi-fiber spectrographs, also need very accurate positions for objects fainter than those already catalogued. Recent large catalogs, such as the US Naval Observatory's 526,280,881 source A2.0 and 1,036,366,767 source B1.0 Catalogs, the 998,402,801 source Guide Star Catalog II, and the 470,992,970 source 2 Micron All Sky Survey Point Source Catalog, have revolutionized our ability to do astrometry with CCD images. The recently published FITS World Coordinate System standard has provided a standard way of parameterizing that astrometry, and the WCSTools and SExtractor software packages allow the automation of the
The Chandra Data Archive is distributed at three physically remote locations, two in Cambridge, MA and one in Leicester, UK. Each site operates local hardware and a local installation of the same software release. The data are stored at a single site or in synchronized copies at multiple sites. The architecture enables processes to access the site that is closest to the user or another site if the first becomes overloaded or unavailable.
This paper presents the archive architecture for the multiple installations. We explain how the software release is configured to operate at each site and what mechanisms are used to synchronize the data. We analyze the differences in data holdings across sites and discuss how users are routed to a site depending on their profile. Finally, we describe the load balancing and failover mechanisms built into the system.
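The routing behaviour described in this abstract (prefer the closest site, fall back to another one when it is overloaded or unavailable) can be sketched as follows. This is only a minimal illustration: the `Site` fields, the load threshold and the site names used in the usage note are assumptions, not details of the Chandra Data Archive software.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    distance: float          # hypothetical "closeness" of the site to the user
    load: float              # fraction of capacity currently in use (0 to 1)
    available: bool = True   # False while the site is down
    holds_data: bool = True  # False if the requested data are not stored here


def choose_site(sites, max_load=0.8):
    """Return the closest usable site, avoiding overloaded ones when possible."""
    candidates = [s for s in sites if s.available and s.holds_data]
    if not candidates:
        return None
    not_overloaded = [s for s in candidates if s.load < max_load]
    # Fall back to an overloaded site rather than refusing the request.
    pool = not_overloaded or candidates
    return min(pool, key=lambda s: s.distance)
```

With sites `Site("cambridge-1", 1.0, 0.95)`, `Site("cambridge-2", 2.0, 0.30)` and `Site("leicester", 9.0, 0.10)`, the function skips the overloaded first site and returns the second; if that one becomes unavailable, the request goes to the third.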
Current development of globally accessible astrophysical data systems typically builds on Grid computing concepts, with data description and formatting standards such as VOTable and Uniform Content Descriptors providing a basis for system-system interoperability. To date, a diverse set of database management systems have been used for catalog storage within these systems. We present a virtual observatory service for the HI Parkes All Sky Survey, implemented on an IBM Lotus Domino R6 database management system. Domino's distributed computing architecture, with in-built support for replication and clustering, sets it apart from more general database systems as being inherently suitable for Grid computing applications.
We present an overview of work completed at the Multimission Archive at the Space Telescope Science Institute (MAST) during the past year. The Web search interface has been totally rewritten using PHP, and offers several new features including the ability to perform searches using a list of targets from an uploaded file. By offering search results in various output formats (including VOTable), the interface can also be used as a web service and allows queries to be performed using the same parameters available from the HTML search forms. The interface also supports the NVO Simple Cone Search protocol. Other new items include: support for NICMOS, ACS and WFPC2 associations in the HST Pointings search tool, creation of FUSE quick-look previews, access to WFPC2 Associations and "High Level Science Products", support for the ADEC data set verifier web service, and online access to HST, FUSE, IUE and EUVE proposal abstracts.
ESA's space observatory INTEGRAL (INTErnational Gamma-Ray Astrophysics Laboratory) was launched on October 17, 2002. The OMC (Optical Monitoring Camera) is one of the onboard instruments, designed to obtain V-Johnson photometry of the prime targets of the two INTEGRAL Gamma-ray instruments (15 keV-10 MeV) with the support of the X-ray monitor (3-35 keV). OMC offers the first opportunity to make long observations in the optical band simultaneously with those in X-rays and Gamma-rays. This capability will provide invaluable diagnostic information on the nature and the physics of the sources over a broad wavelength range.
LAEFF has developed a scientific archive containing the data generated by the OMC, and an access system capable of performing complex searches. A notable feature is the availability of visualization and analysis tools from the user interface, aimed at optimizing the scientific return of the OMC data. In this poster, the main functionalities and contents of the system are described.
We intend to present the Skysoft project (http://www.skysoft.org). Skysoft is a
To be useful, Skysoft needs to be a long-lived project, so it must put little pressure on its maintainers, cause very little nuisance to the developer community, and have a low maintenance cost.
Our choice is to design Skysoft as a community-supported directory to which everyone, both developers and end-users, can contribute. We use a modified version of one of the currently available content management systems (postnuke), tied to a multi-table relational database (mysql). The workload is intended to be distributed and thus acceptable: each contributor has only a small share of the responsibility for keeping the site up to date, easing the process.
KUG is a catalog of ultraviolet-excess galaxies detected on two- or three-color Kiso Schmidt plates. From 1984 to 1993, the first KUG survey listed 8104 objects in about 5100 square degrees, and the second KUG survey has already listed 1642 objects in about 300 square degrees. We present a combined list of both surveys, together with analyses comparing it with other catalogs and databases.
The prototype version of the database server of the Korean Astronomical Data Center (KADC) is presented. The first dataset is from the 1.8m optical telescope of the Bohyunsan Optical Astronomy Observatory (BOAO). The total amount of data is about 400 GB, obtained between September 1997 and December 2002 using the 1K CCD, the 2K CCD, and the medium-dispersion spectrograph. The prototype KADC database offers two modes to users: (i) users can look through the list of titles of observing runs, the observation logs, and the parameters of each FITS file, and (ii) users can search for data by entering an object name, coordinates, search box size, etc. The FITS files for the target objects can be retrieved together with all accompanying files needed for processing. In the KADC database, searches are performed in SQL on a metadata table which contains the FITS header parameters. The KADC database will be expanded to include data from the Taeduk 14m radio telescope and from future projects such as the Korean VLBI Network (KVN, to be completed in 2007) and an 8m-class optical telescope which is currently being designed.
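The SQL search on a FITS header metadata table described above might look like the sketch below, which uses Python's built-in sqlite3 module as a stand-in for the actual DBMS. The table name, column names and demo rows are invented for illustration, and the box search ignores the RA wrap at 0/360 degrees.

```python
import math
import sqlite3


def make_demo_db():
    """Create an in-memory metadata table with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fits_header "
        "(filename TEXT, object TEXT, ra_deg REAL, dec_deg REAL)"
    )
    conn.executemany(
        "INSERT INTO fits_header VALUES (?, ?, ?, ?)",
        [
            ("m31_001.fits", "M31", 10.6847, 41.2687),
            ("m31_002.fits", "M31", 10.7000, 41.3000),
            ("m33_001.fits", "M33", 23.4621, 30.6599),
        ],
    )
    return conn


def search_by_object(conn, name):
    """Mode (ii), by name: exact match on the object column."""
    rows = conn.execute(
        "SELECT filename FROM fits_header WHERE object = ? ORDER BY filename",
        (name,),
    )
    return [row[0] for row in rows]


def search_by_box(conn, ra_deg, dec_deg, box_deg):
    """Mode (ii), by position: files whose pointing falls in a square box."""
    half = box_deg / 2.0
    # Widen the RA range by 1/cos(dec) so the box keeps its angular width.
    ra_half = half / max(math.cos(math.radians(dec_deg)), 1e-6)
    rows = conn.execute(
        "SELECT filename FROM fits_header "
        "WHERE ra_deg BETWEEN ? AND ? AND dec_deg BETWEEN ? AND ? "
        "ORDER BY filename",
        (ra_deg - ra_half, ra_deg + ra_half, dec_deg - half, dec_deg + half),
    )
    return [row[0] for row in rows]
```

Parameterised queries (the `?` placeholders) keep user-typed object names and coordinates out of the SQL text itself, which matters for a publicly accessible search form.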
The
Data retrieval and presentation to the user is relatively simple when only one source (catalogue) is involved; queries involving from a few to thousands of catalogues present a big challenge in terms of grouping equivalent information from the different sources. Cone-search type queries, for instance, may become very common among VO users. It is easy to represent results involving RA and Dec and the angular distance between targets and matches. It is more complicated to group equivalent physical quantities such as FLUX for presentation purposes, and much more complex if arithmetic expressions are to be computed (e.g., differences or ratios). Some of these issues are covered in this paper, based on the studies of AstroGrid's Data Federation Research Group.
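The "easy" part of a cone-search result, the angular distance between a target and a match, can be computed with the haversine formula, which stays numerically accurate at the small separations typical of cross-identifications. The function name is illustrative and not part of any AstroGrid interface.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two (RA, Dec) positions in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_half_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_half_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_half_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_half_dra ** 2
    # Clamp against rounding errors before taking the arcsine.
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))
```

A match then belongs to a cone of radius `sr` around the target when `angular_separation_deg(...) <= sr`.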
Identification of equivalent quantities among catalogues should be possible by use of each catalogue's meta-data (column name, UCDs, units, explanation, etc.), but some problems remain and need to be addressed. For instance, is the quantity in question represented on a linear or a log scale? Are there zero point differences (particularly common in quantities like Epoch)? Do two columns represent exactly the same quantity? For example, some astrometric values related to RA may or may not be corrected by cos(Dec). These differences cannot be detected using the meta-data listed above alone; additional meta-data such as zero point, minimum value, maximum value and mean value may provide a way to decide whether the quantities listed are truly equivalent. User intervention should be permitted through a flexible GUI, which may allow users to generate not just tables but also plots and other means of verifying the validity and quality of operations such as a cross correlation, leaving the final decision on a merge in the hands of the user.
Using the Region part of the VO Space-Time metadata representation, we implemented a set of functions that perform complex spatial queries over objects stored in a SQL Server database. These include various spatial operations on points and polygons: (i) point inside of a polygon, (ii) Boolean operators on spherical polygons, (iii) polygons containing a point, (iv) all points in a polygon, etc. The functions are implemented in Transact-SQL and rely on a small number of basic functions in the HTM2 library linked to the database through a DLL. The SDSS Data Release 1 includes an extensive table of geometric objects and boundaries which can be used to select objects from observations and to censor objects in regions masked by satellites and bright stars. We are also implementing a Web Service that can calculate the intersection of an external input region with the survey footprint, or can tell whether a point is inside the survey or not. These procedures can be used to calculate the intersections of various observational and operational constraints (target selection areas, spectroscopic plates) and subsequently compute separate targeting and observing efficiencies from the database. Such data are extremely important to estimate statistical completeness for studies of large-scale structure.
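Operation (i), testing whether a point lies inside a polygon on the sphere, can be illustrated for the simple case of a convex polygon bounded by great-circle arcs. The sketch below is a pure-Python stand-in written for this purpose; it is not the Transact-SQL or HTM2 code of the abstract, and the vertex-ordering convention is an assumption of the example.

```python
import math


def unit_vector(ra_deg, dec_deg):
    """Cartesian unit vector for a position given in degrees."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra), math.cos(dec) * math.sin(ra), math.sin(dec))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def point_in_convex_polygon(point, vertices):
    """True if `point` lies inside (or on the edge of) a convex spherical polygon.

    `vertices` are (RA, Dec) pairs in degrees, ordered counter-clockwise as
    seen from outside the sphere, so the interior is to the left of each edge.
    Each edge is the great-circle arc joining two consecutive vertices.
    """
    p = unit_vector(*point)
    vecs = [unit_vector(*v) for v in vertices]
    for i, a in enumerate(vecs):
        b = vecs[(i + 1) % len(vecs)]
        # The normal of the edge's great circle points towards the interior.
        if dot(cross(a, b), p) < 0.0:
            return False
    return True
```

Each edge defines a half-space through the centre of the sphere, and the polygon is the intersection of those half-spaces, so the test reduces to one dot product per edge. For the 2-degree square `[(359.0, -1.0), (1.0, -1.0), (1.0, 1.0), (359.0, 1.0)]`, the point (0, 0) is inside while (0, 2) and (2, 0) are not.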
We discuss the redesign of the SkyQuery architecture, originally built as a simple proof of concept for dynamic federation of astronomical archives. In keeping with the Virtual Observatory philosophy of hierarchical services, the design of Open SkyQuery is based upon higher level services extending the basic functionality of the current VO standard, the ConeSearch.
Open SkyQuery implements the VO specifications for data access, retrieval and spatial join. Data are published via Web Services called SkyNodes providing a rich functionality including footprint coverage. SkyNodes are discovered through the VO registry. We propose to have at least two levels of SkyNode compliance (Core and Advanced). We will also provide templates for publishing data into a SkyNode. Keywords: SkyQuery, VOTable, Registry, ADQL
The Galaxy Evolution Explorer (GALEX), led by the California Institute of Technology, was successfully launched on April 28, 2003. GALEX data products will include a series of near- and far-UV all-sky surveys and deep-sky searches in imaging mode and partial sky surveys in spectroscopic mode. GALEX data will populate a large archive (~ 5TB) available to the entire astronomical community and to the general public via the MultiMission Archive at Space Telescope Science Institute (MAST). The GALEX archive is linked to other archives that share a common cross-referencing schema to provide a conceptual “digital sky”. The archive and the web user interface are developed within an NVO framework and offer many NVO-compliant Web Services to achieve the VO goal. Cone Search and SIAP (Simple Image Access Protocol) are currently available. A Simple Spectra Access Protocol (SSAP) Web Service for 1-D and 2-D spectra will be available pending the release of a standard definition from the VO community. We adopted MS .NET Framework technology as the development environment, integrated with the MS SQL Server 2000 DBMS and with IIS as the Web Server for Data Server products, using the C# language and ASP.NET.
We discuss the design and implementation of a scheme enabling authors to refer and link to online datasets available from astronomical archives. This will provide the readers of electronic papers with direct access to the data discussed therein. The software tools used to create and maintain links from published papers to the datasets make use of Web-Services-based technology. The system has been designed in collaboration with the NASA Astrophysics Data Centers and the University of Chicago Press. It will be maintained by the NASA Astrophysics Data System, funded by NASA Grant NCC5-189.
We present easy-to-use web applications and Web Services to search, plot and manage spectral energy distributions and filter profiles. We provide keyword search, advanced query forms and SQL interfaces to select spectra or bandpasses that may be retrieved in a variety of formats including XML, VOTable and ASCII.
All SDSS DR1 spectra have been loaded into a database, as well as the entire 2dF catalog, adding up to almost half a million SEDs. Registered users can also upload their own data, making them available to the rest of the community, and are free to modify or delete them at any time. Scientific services allow rest-frame composite spectra to be built from selected spectra.
The bandpass database has a growing collection of photometric filters and the same search interfaces. Using the spectrum and filter profile core services, we plan to build higher level services to help astronomers create color-color diagrams, simulated catalogs and estimate distances to extragalactic objects.
The most basic data federation operation for catalogues of celestial objects is that of cross-matching, for example to identify the same object detected in different wavebands. In DBMS terms this requires a spatial join. We have tested the performance of three DBMSs (MySQL, Postgres, and DB2) using spatial indexing (such as R-trees) and using the pixel-code method. The latter maps celestial positions to integers, after which B-tree indices can be used to do the join. The results are presented and have implications for the best way to set up an astronomical data exploration facility.
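A minimal sketch of the pixel-code idea follows. It uses a simple equal-angle grid as the pixelisation and a Python dictionary in place of a B-tree index; both are simplifying assumptions, since the abstract does not say which pixelisation was tested (production systems typically use equal-area schemes such as HTM or HEALPix). The sketch is not valid within about one degree of the poles.

```python
import math
from collections import defaultdict


def unit_vector(ra_deg, dec_deg):
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra), math.cos(dec) * math.sin(ra), math.sin(dec))


def within_radius(pos_a, pos_b, radius_deg):
    """Exact separation test using the chord between the two unit vectors."""
    ua, ub = unit_vector(*pos_a), unit_vector(*pos_b)
    chord_sq = sum((x - y) ** 2 for x, y in zip(ua, ub))
    return chord_sq <= (2.0 * math.sin(math.radians(radius_deg) / 2.0)) ** 2


def cell_indices(ra_deg, dec_deg, cell_deg, n_ra):
    """Row (Dec) and column (RA) of the grid cell containing a position."""
    dec_idx = int((dec_deg + 90.0) / cell_deg)
    ra_idx = int((ra_deg % 360.0) / cell_deg) % n_ra
    return dec_idx, ra_idx


def cross_match(cat_a, cat_b, radius_deg):
    """Return (i, j) index pairs of sources closer than radius_deg."""
    cell = radius_deg  # cell size equal to the match radius
    n_ra = int(math.ceil(360.0 / cell))
    # Index catalogue B by integer pixel code (stand-in for a B-tree index).
    index = defaultdict(list)
    for j, (ra, dec) in enumerate(cat_b):
        d, r = cell_indices(ra, dec, cell, n_ra)
        index[d * n_ra + r].append(j)
    matches = []
    for i, (ra, dec) in enumerate(cat_a):
        d, r = cell_indices(ra, dec, cell, n_ra)
        # RA cells shrink by cos(dec), so more RA neighbours are probed at
        # high |dec|; the extra +1 covers the partial cell at the RA wrap.
        cos_dec = math.cos(math.radians(min(89.0, abs(dec) + radius_deg)))
        reach = int(math.ceil(1.0 / cos_dec)) + 1
        codes = {(d + dd) * n_ra + (r + dr) % n_ra
                 for dd in (-1, 0, 1) for dr in range(-reach, reach + 1)}
        for code in sorted(codes):
            for j in index.get(code, ()):
                if within_radius((ra, dec), cat_b[j], radius_deg):
                    matches.append((i, j))
    return matches
```

The integer lookup only produces candidates; the chord test then confirms each pair, which is what allows an ordinary one-dimensional index to stand in for a true spatial index.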
ISDC's DAL and DAL3 libraries
This presentation will discuss the experiences of the Integral Science Data Centre (ISDC) with its Data Access Layers (DAL). At ADASS '99 we presented an overview of our Data Access Layers. Now that Integral has been launched and we have 9 months of operational experience, we can more fully evaluate DAL's benefits for Integral as well as its potential benefits to other missions. ISDC's DAL was designed to solve problems anticipated from the combination of Integral's 4 instruments, wide fields of view, many instrument modes, and new pointings approximately every half hour. This would mean that any sensible scientific data analysis would require processing hundreds of individual files. Selecting and managing these files was seen as a large and unnecessary overhead to push onto the scientist. DAL and DAL3 were our solutions, and they have generally allowed scientists to work while remembering only one file name.
Object access is a tool that is feasible for interoperability of telescope operations. Different subsystems need to be accessed in a standardized way. Simple Object Access Protocol, SOAP, provides the core of this approach. Web Services is the wider context of this peer-to-peer protocol.
A protocol that is used in virtual observatory interconnectivity can also be used for controlling telescope operations. The learning curve of available packages is easy to overcome and people having experience from different operating systems can start coding in SOAP in a matter of hours.
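To give an idea of how little code a SOAP call requires, the sketch below assembles a SOAP 1.1 request envelope using only the Python standard library. The operation name, parameter names and service namespace are invented for illustration and do not correspond to any real telescope control interface.

```python
import xml.etree.ElementTree as ET

# Namespace of the SOAP 1.1 envelope.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def build_soap_request(operation, params, service_ns="urn:example:telescope"):
    """Serialise an operation call as a SOAP 1.1 envelope (illustrative)."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The operation element and its arguments live in the service namespace.
    call = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{service_ns}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")
```

A call such as `build_soap_request("SlewTo", {"ra": 83.63, "dec": 22.01})` returns the XML text that would be sent in an HTTP POST to the subsystem's endpoint. In practice a SOAP toolkit generates this envelope from the service description.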
ASTROVIRTEL(*) has concluded its three-year life cycle. A review of the last two cycles is presented here. The program selection process was instrumental in ensuring that the tools and methods developed for the successful PIs were general enough to be reused by a wider community of users of the ESO/STECF archive. We will show how this goal was achieved. The programs of the last two cycles will be described, touching upon science goals, scientific and technical requirements, technical challenges (quite typical for any astronomical archive), and technical and scientific achievements. The overall exercise of hosting scientific investigators with widely varying scientific interests has been very effective in revisiting and augmenting various ESO/STECF archive functionalities and scientific products. Some of the developed tools and methods are already an integral part of the ESO-ECF archive, while others are still being optimised before becoming operational. ASTROVIRTEL forced the developers to look into both the HST and ESO archives, each with its own peculiarities, and come up with solutions as general as possible. Furthermore, ASTROVIRTEL has also played an important role in the Virtual Observatory phase A study, particularly in the areas of data centre interoperability and scientific requirements.
-- (*) ASTROVIRTEL is a project supported by the European Commission under the 'Access to Research Infrastructures' action of the 'Improving Human Potential Programme', FP5 contract No. HPRI-CT-1999-00081
SkyQuery is an excellent VO prototype application that marries Web Services technology with emerging VO standards to enable dynamic cross-matching queries between different VO-enabled archives. The archive data is stored in databases that are published online as SkyNodes.
As the available data from Sky Surveys and new digital archives rapidly multiplies every year, more than 80 percent of the data will exist outside of large data centers at any given moment, making it very important to have dynamic cross-identification tools like SkyQuery.
Loading an entire survey like 2MASS or SDSS into a database involves making decisions about issues like data formats and indices for tables. I describe the process of loading such a large amount of data into a relational DBMS (SQL Server) and generating a sky index using the Hierarchical Triangular Mesh (HTM), which provides a really fast way to find objects. This can be easily done even for a large survey like the 2MASS All-Sky Data Release (150GB uncompressed, 471M objects) in as little as 2 days including the required computation time for HTM.
I show how a database like this can be set up and published as a SkyNode, so that it can be included in cross-matching queries with other archives. In less than a month, I have created SkyNodes for 2MASS, 2dFGRS, FIRST, IRAS, 2QZ, PSCz and NVSS Surveys, enabling each of them to be cross-queried using SkyQuery.
Many of the datasets that will be served through the NASA/IPAC Infrared Science Archive (IRSA) will consist of diverse and complex collections, of arbitrary size, of images, source tables, and spectra that cover specific regions on the sky of arbitrary size. The purpose of ATLAS is to provide a uniform access to such collections. We define a set of files to be generated for each collection (image metadata, a list of source tables, etc) and some standards for how data should be organized and referenced and how collection `homepages' should be structured. Then, a single CGI-program can be used to search, subset, and present any collection's data to the user. This method of organizing and presenting the data makes it easy to update the data without major modifications or upgrades to the system or software. Most importantly, one program enables us to serve through a common interface multiple data collections, of any size, containing any mixture of images, spectra and/or source catalogs, in a simple and uniform manner.
Much of the tabular data in these collections are of modest size (approximately 50,000 records), and querying and subsetting them are most efficiently performed outside a DBMS. We are therefore developing tools based on Open Source expression evaluators to provide SQL-like tabular search and subsetting capabilities.
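SQL-like subsetting of a modest table outside a DBMS can be sketched with a small, restricted expression evaluator. The example below uses Python's `ast` module as a stand-in for the Open Source evaluators mentioned above (which the abstract does not name); the column names in the usage note are hypothetical.

```python
import ast
import operator

# Operators that a WHERE-style expression is allowed to use.
_OPS = {
    ast.Lt: operator.lt, ast.LtE: operator.le,
    ast.Gt: operator.gt, ast.GtE: operator.ge,
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
}


def _eval(node, row):
    """Evaluate a parsed expression against one table row (a dict)."""
    if isinstance(node, ast.Expression):
        return _eval(node.body, row)
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        return row[node.id]  # a column reference
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand, row)
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left, row), _eval(node.right, row))
    if (isinstance(node, ast.Compare) and len(node.ops) == 1
            and type(node.ops[0]) in _OPS):
        return _OPS[type(node.ops[0])](
            _eval(node.left, row), _eval(node.comparators[0], row))
    if isinstance(node, ast.BoolOp):
        values = [_eval(v, row) for v in node.values]
        return all(values) if isinstance(node.op, ast.And) else any(values)
    # Function calls, attribute access, etc. are rejected outright.
    raise ValueError(f"unsupported expression element: {type(node).__name__}")


def select(rows, where):
    """Return the rows for which the WHERE-style expression is true."""
    tree = ast.parse(where, mode="eval")
    return [row for row in rows if _eval(tree, row)]
```

For instance, `select(rows, "mag < 16 and dec > -10")` keeps only the rows whose `mag` and `dec` columns satisfy both conditions. Walking the syntax tree with a whitelist, rather than calling `eval`, prevents a user-supplied constraint from executing arbitrary code.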
We will also show how Virtual Observatory (VO) protocols and standards must be extended to support such complex data collections.
A poster covering how Astrogrid's infrastructure can be used to publish data to the world wide telescope.
After releasing the first version of the new HST WFPC2 associations in November 2002, the CADC, ST-ECF and STScI are jointly releasing Version 2. This presentation will discuss the software pipeline steps that were put in place to construct these associations. Our goals are the following: firstly, to make higher quality products available to scientists using the HST archives; secondly, to translate all the necessary parameters used for the processing into a data model for inclusion in the Virtual Observatory. We will then present the different products which are accessible to the potential user. Finally, we will discuss the pros and cons of doing this for non-survey data.
Numerical and bibliographical Databases in Atomic and Molecular Physics are essential for both the modelling of various astrophysical media and the interpretation of astrophysical spectra provided by ground or space-based telescopes. We will report on our current project concerning the access to Atomic and Molecular Physics Databases within the Virtual Observatories, addressing the problems of UCD, the existence of well-known databases, organisation/access of data for specific astrophysical applications. As an example we will present the current status of a numerical and bibliographical database concerning collisional ro-vibrational excitation rate coefficients of molecules (basecol.obs-besancon.fr).
We present the general characteristics of a database for binary and multiple stars from all observational categories, specifically designed to address the awkward problem of identifying stellar components. BDB is based on a modular architecture to allow the easy integration of data from various sources. We describe in particular the setup of connections with other double star databases through the Internet. Additional tools are being developed for the processing of image data. The implementation of standards for the connection of BDB with Virtual Observatory projects is reviewed.
We present the main new features of the Aladin interactive sky atlas. The Aladin image server is now able to provide a result compliant with the Simple Image Access Protocol (SIAP). The Aladin tool is also able to query a remote server providing SIA output, and to interpret the result. Aladin is able to build a metadata tree from the description of a dataset, in order to easily browse through all data available in a sky region on a server or in a local folder. It is also possible to recalibrate the astrometry of images, using different astrometric reduction methods, or to create manually a new calibration using a reference catalogue.
Additional tools have been developed for the ACS GTO Science Data Archive web interface to supplement the core functionality. These tools add value to the standard pipeline data products. These tools provide: on-the-fly color image generation, a clickable object map that integrates image and catalog data, image cutouts for any cataloged object or observed position, overplotting of calculated object apertures, overplotting of a WCS compass and scale, arbitrary image zooming and scaling, multiple types of image intensity scaling, pedigree charts of data product relationships, and a dynamically generated data product inventory. These functions are performed in real time so that additional storage space is not required and in order to guarantee that only the most recent source products are used in the calculations.
PostgreSQL, the open-source RDBMS, is probably one of the best database solutions for astronomy and astrophysics. Compared to several available commercial and non-commercial database engines, it appears to be the most versatile.
At present PostgreSQL is being used in several well-known astronomical projects, for example in the HyperLEDA database (http://leda.univ-lyon1.fr/) and in the MAPS (Minnesota Automated Plate Scanner, http://aps.umn.edu/) Catalog of the POSS1.
Extensibility is the most remarkable feature of this RDBMS: it allows users to develop custom data types, queries and indexed access methods, optimized for specific tasks, without knowledge of database internals. This is very important for 'non-standard' tasks, typical of scientific research.
We present two backend modules for PostgreSQL.
*pgSphere, the ready-for-use contribution module for PostgreSQL, offers the capability for dealing with geometric objects in spherical coordinates. This module demonstrates all the possibilities of backend programming using the GiST interface for spatial indexing of spherical data. pgSphere can be very useful in astronomy and the geo-sciences. Performance tests are included.
*pgAstro, the contribution module based on the pgSphere engine, provides astronomy-specific functions and methods, for example positional astronomy and physical models.
We show that PostgreSQL is the most advanced database solution for astronomical applications and would be very useful for different Virtual Observatory projects.
The Information Bulletin on Variable Stars is a bulletin fully available in electronic form. We are working on converting the text, tables and figures of the papers published to a database, and, at the same time, making them accessible and addressable. IBVS Data Service will serve information on variable stars - like finding charts, light curves, etc., and will be VO compatible. Other services could link to individual figures, data files, etc. this way.