With the advent of large CCD detectors, the way astronomy is done is changing rapidly. Because of the exponential growth in the size and speed of silicon chips, new surveys are expected to have significantly higher data rates. These survey projects become both the authors and the publishers of their data (Szalay et al. 2002). In this exponential world, only 10% of all astronomical information is available in central archives at any given time. To have access to all up-to-date observations, we need a way to federate geographically separated astronomical archives.
Current sky surveys such as SDSS, 2MASS and DPOSS have shown that discoveries are made at the boundaries, by going deeper or using more colors. By covering different wavelength ranges, surveys can complement one another well, provided there is a way to combine them. In the past, crossmatching many catalogs was prohibitively complex and expensive. The Virtual Observatory, and Open SkyQuery in particular, aim to make it simple and affordable.
The prototype SkyQuery was built last year in six weeks as a feasibility study (Budavári et al. 2003; Malik et al. 2002). It used a hierarchy of XML Web Services to implement a distributed query system that provided seamless access to SDSS, 2MASS and FIRST data. Since the launch of the SkyQuery web site, many other catalogs have become available (Purger et al. 2004), including the Isaac Newton Telescope's Wide Field Survey (INTWFS), IRAS, NVSS, 2dF, PSCz, 2QZ and ROSAT; see Figure 1.
The SkyQuery architecture is being redesigned to use the emerging VO standards such as VOTable, the Astronomical Data Query Language (ADQL; Yasuda et al. 2004), the VO Query Language (VOQL) and the VO Registry services (Greene et al. 2004). The data will be published by SkyNodes, which implement XML Web Services that extend the basic functionality of ConeSearch.
We propose three layers of VOQL, each building on top of the previous one; see also Yasuda et al. (2004):
Figure 2 illustrates the Open SkyQuery architecture and shows the relations between the components.
To ensure a fast response, one needs to optimize the query plan. Our simulations show that simple sequential execution is optimal, because wire speed is the limiting factor today. One needs to arrange the SkyNodes in ascending order of the number of matching records so that the least amount of data is transferred. This simplifies the logic of the portal significantly. However, the SkyNodes are designed to handle more complicated query plans, so that the system can easily be enhanced later on. Another possible enhancement is asynchronous data flow (O'Mullane et al. 2004).
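The ordering rule can be illustrated with a minimal sketch. It assumes the portal has already obtained, for each SkyNode, an estimate of how many records match the query; how those counts are gathered is not shown. The names `NodeEstimate` and `plan_sequential` and the example counts are hypothetical and are not part of the Open SkyQuery interface.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class NodeEstimate:
    """Hypothetical record: one SkyNode and its estimated match count."""
    name: str            # short archive name, e.g. "SDSS"
    matching_rows: int   # estimated number of records matching the query


def plan_sequential(estimates: Iterable[NodeEstimate]) -> List[str]:
    """Return the SkyNode names in ascending order of matching records.

    The node with the fewest matches comes first, so the smallest
    partial result is the one shipped along the sequential chain.
    """
    ordered = sorted(estimates, key=lambda e: e.matching_rows)
    return [e.name for e in ordered]


# Illustrative counts only; they are not measurements from any archive.
example = [
    NodeEstimate("SDSS", 50_000),
    NodeEstimate("2MASS", 12_000),
    NodeEstimate("FIRST", 800),
]
plan = plan_sequential(example)
print(plan)  # ['FIRST', '2MASS', 'SDSS']
```

Because the plan is a plain ordered list, the portal only has to sort the estimates and hand the chain to the SkyNodes; a more elaborate plan would need a richer structure than this sketch provides.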
The emerging Virtual Observatory infrastructure makes it possible to develop a new generation of astronomical tools. These online tools promise to be easy to use and to open new dimensions for scientists.
Open SkyQuery is just one of the first steps. Its catalog services will enable us to analyze geographically separated IVOA archives, a.k.a. SkyNodes, as if they were part of the same dataset. As of today, the VO building blocks are already in place to make Open SkyQuery a reality.
Budavári, T., et al. 2003, in ASP Conf. Ser., Vol. 295, Astronomical Data Analysis Software and Systems XII, ed. H. E. Payne, R. I. Jedrzejewski, & R. N. Hook (San Francisco: ASP), xii:O10-131
Greene, G., et al. 2004, this volume, 285
Kunszt, P. Z., Szalay, A. S., & Thakar, A. R. 2001, in Mining the Sky: Proc. of the MPA/ESO/MPE Workshop, Garching, ed. A. J. Banday, S. Zaroubi, & M. Bartelmann (Berlin Heidelberg: Springer-Verlag), 631
Malik, T., et al. 2002, CIDR '03, p. 17, 'SkyQuery: A WebService Approach to Federate Databases'
O'Mullane, W., et al. 2004, this volume, 372
Purger, N., et al. 2004, this volume, 201
Szalay, A. S., et al. 2002, Proc. of SPIE, 4846, 'Web Services for the VO'
Yasuda, N., et al. 2004, this volume, 293