In order to make it easy for astronomy information services to participate in the VO, we propose a system for metadata management based on a hierarchy of descriptive schemas. At the top level we require a minimum amount of information, sufficient primarily to note the existence of a resource and to describe who is responsible for it. At lower levels, the metadata are more extensive and complex, allowing for the description of query syntax, access protocols, and usage policies.
A resource is a general term referring to any VO entity that can be described and which can be given a name and unique identifier. Just about anything can be a resource: it can be an abstract idea, such as sky coverage or an instrumental setup, or it can be fairly concrete, like an organization or a data collection. This definition is consistent with its use in the general Web community as "anything that has an identity" (Berners-Lee et al. 1998). We expand on this definition by saying that it is also describable.
An organization is a specific type of resource that brings people together to participate in VO applications. Organizations can be hierarchical and range greatly in size and scope. At a high level, an organization could be a university, observatory, or government agency. At a finer level, it could be a specific scientific project, space mission, or individual researcher. A provider is an organization that makes data and/or services available to users over the network.
A service is any VO resource that can be invoked by a user or software agent to perform some action on their behalf. Associated with any service are descriptive metadata about the service. These metadata generally include the information the user needs to determine whether a service is of interest and how the service may be invoked. Specific types of metadata are described below. Note that the service itself need not be aware of the metadata that describe it.
A query service supports a query/response protocol. The user submits a query to the service that may define characteristics of interest, and the service returns a set of information to the user. The query may be null: for example, a current-time service may support only a null query, and some services may respond to a null query with appropriate default actions. Non-query services may also exist, e.g., services to copy or delete files on remote file systems, to mail information to other users, to kill existing jobs, to authorize actions, etc.
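The query/response pattern, including the null query, can be illustrated with a minimal sketch. Everything here is hypothetical and for illustration only: the function `make_query_service`, the `default_limit` parameter, and the sample rows are assumptions, not part of any VO protocol.

```python
def make_query_service(rows, default_limit=2):
    """Build a toy query service over an in-memory list of records.

    Hypothetical sketch: a real VO service would be invoked over the network.
    """
    def service(query=None):
        # Null query: fall back to a default action (return the first few rows).
        if query is None:
            return rows[:default_limit]
        # Otherwise return every row matching all requested characteristics.
        return [row for row in rows
                if all(row.get(key) == value for key, value in query.items())]
    return service


# Made-up sample data, used only to exercise the sketch.
rows = [
    {"name": "source-1", "band": "r"},
    {"name": "source-2", "band": "g"},
    {"name": "source-3", "band": "r"},
]
service = make_query_service(rows)
null_response = service()                 # null query, default action
band_response = service({"band": "r"})    # query on one characteristic
```

A service that supports only the null query, such as the current-time service mentioned above, would simply ignore or reject any non-null argument.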
A registry is a service which aggregates and serves resource metadata. The metadata may be added to the registry via an input form or harvested from the resources themselves. A registry may serve all resource metadata (full registry), selected types of resources (limited registry), or resources at a specific location (local registry). Any registry may also support a query interface which allows searching for resources based on various combinations of metadata values.
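The distinction between full, limited, and local registries can be sketched as one registry class with an optional acceptance rule. This is an illustrative sketch only: the class names, the field names (`resource_type`, `publisher`, `location`), and the `example/...` identifiers are assumptions and do not reflect the actual registry schema or identifier format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    identifier: str      # unique identifier (hypothetical format)
    title: str
    resource_type: str   # e.g. "organization", "collection", "service"
    publisher: str       # who is responsible for the resource
    location: str        # hypothetical field used to model a local registry


class Registry:
    """Aggregates resource metadata and answers simple equality queries."""

    def __init__(self, accepts=None):
        # accepts=None models a full registry; a predicate models a
        # limited (by type) or local (by location) registry.
        self._accepts = accepts
        self._records = {}

    def register(self, resource):
        """Add one record, as an input form would; return True if kept."""
        if self._accepts is not None and not self._accepts(resource):
            return False
        self._records[resource.identifier] = resource
        return True

    def harvest(self, other):
        """Pull records from another registry; return the number kept."""
        return sum(self.register(r) for r in other.search())

    def search(self, **criteria):
        """Return records whose fields equal every given metadata value."""
        return [r for r in self._records.values()
                if all(getattr(r, name) == value
                       for name, value in criteria.items())]


full = Registry()
full.register(Resource("example/stsci", "Space Telescope Science Institute",
                       "organization", "STScI", "stsci"))
full.register(Resource("example/sdss-catalog", "SDSS source catalog",
                       "collection", "STScI", "stsci"))
full.register(Resource("example/cone-search", "Hypothetical cone search",
                       "service", "Example Observatory", "elsewhere"))

limited = Registry(accepts=lambda r: r.resource_type == "service")
local = Registry(accepts=lambda r: r.location == "stsci")
limited.harvest(full)
local.harvest(full)
```

A call such as `full.search(resource_type="collection", publisher="STScI")` then corresponds to searching on a combination of metadata values.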
A sample of the metadata that would be used to describe the Sloan Digital Sky Survey source catalog as hosted at the Space Telescope Science Institute is shown in Fig. 1. Further information concerning the encoding of such metadata and their incorporation into resource registries is described by Plante et al. (2004) and Greene et al. (2004).
Both the NVO and AstroGrid projects have implemented prototype registries. The NVO prototype has been used as a data discovery engine for the Data Inventory Service (http://heasarc.gsfc.nasa.gov/vo/data-inventory.html, McGlynn et al. 2004). The prototype registry was constructed primarily through manual entry of metadata about known cone search and Simple Image Access Protocol services. It took about a week to populate a prototype registry of 100 resources. During this time period, we recognized certain patterns in data entry as well as inconsistencies in metadata descriptions. This experience leads to the following conclusions and questions:
In addition, the resource metadata concepts described here must be encoded and structured in machine-readable registries. Work continues on XML schemas that more fully show the relationships among metadata elements and that simplify data entry and maintenance efforts (e.g., by allowing an organization to register its curation-related metadata once and apply it to a number of different collections).
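The idea of registering curation-related metadata once and applying it to several collections can be sketched as a simple reference from each collection to its curating organization. The sketch below is an assumption-laden illustration in Python rather than XML schema: the class names, the `curated_by` field, and all identifiers and contact values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Curation:
    publisher: str
    contact: str


@dataclass(frozen=True)
class Collection:
    identifier: str
    title: str
    curated_by: str   # identifier of the organization holding the curation record


def describe(collection, curation_by_org):
    """Merge a collection's own metadata with its organization's curation record."""
    curation = curation_by_org[collection.curated_by]
    return {
        "identifier": collection.identifier,
        "title": collection.title,
        "publisher": curation.publisher,
        "contact": curation.contact,
    }


# Curation metadata entered once per organization (placeholder values).
curation_by_org = {
    "example/org": Curation("Example Observatory", "help@example.org"),
}
collections = [
    Collection("example/catalog-a", "Catalog A", "example/org"),
    Collection("example/images-b", "Image set B", "example/org"),
]
records = [describe(c, curation_by_org) for c in collections]
```

Because each collection stores only a reference, a change to the organization's contact information needs to be made in one place.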
Berners-Lee, T., Fielding, R., & Masinter, L. 1998, IETF RFC2396, http://asg.web.cmu.edu/rfc/rfc2396.html
Greene, G., O'Mullane, W., Hanisch, R., & Gaffney, N. 2004, this volume, 285
McGlynn, T., Lee, J., Hanisch, R., O'Mullane, W., & Greene, G. 2004, this volume, 319
Plante, R., Greene, G., Hanisch, R., McGlynn, T., O'Mullane, W., Williams, R., & Williamson, R. 2004, this volume, 585