Next: Multiple World Coordinate Systems for DEIMOS Mosaic Images
Previous: Error and Bias in the STSDAS fitting Package
Up: FITS-Flexible Image Transport System
Table of Contents - Index - PS reprint - PDF reprint

Astronomical Data Analysis Software and Systems VI
ASP Conference Series, Vol. 125, 1997
Editors: Gareth Hunt and H. E. Payne

Practical Applications of a Relational Database of FITS Keywords

De Clarke and S. L. Allen
UCO/Lick Observatory, Kerr Hall, UCSC, 1156 High Street, Santa Cruz, CA 95064



In the course of designing a major mosaicked instrument (DEIMOS) we need tools to manage the large set of FITS keywords required for instrument control and data reduction/archiving. We have therefore constructed a relational database schema which describes FITS keywords, and have implemented it using Sybase. In this schema FITS keywords and structures are entities, having attributes such as format, units, datatype, minimum and maximum value, semantics, etc. The schema was expanded and generalized to document information flow and database schema (becoming self-referential). Many useful output products (documentation, diagnostics, configuration, and source code) can be generated from one authoritative source of keyword information.


1. Introduction

Using a centralized database or information service to document FITS keywords is not a new idea. However, in previous incarnations of this concept, the object has either been purely documentary or single-purpose (e.g., the SaveTheBits database schema). Our goal was to capture enough information, and to generalize the information sufficiently, that multiple related applications could all use one authoritative, online reference database for information about FITS keywords. For example, live observing and engineering interfaces to an instrument could rely on the same database that was used to generate printed documents, drawings, and even sections of source code.

A more complete documentation of our work, with examples, and live demos, can be found online at Memes (Keywords) Database Homepage.

2. Keywords and Memes

The first challenge was to determine the list of attributes necessary to document a FITS keyword. Some, such as name, FORTRAN datatype, semantics, and min/max values, were obvious. Others were more subtle, and the list of attributes was revised several times as we explored the syntax and semantics of existing FITS headers and our proposed DEIMOS instrument keywords.

FITS keywords have provenance or context (i.e., an institution, instrument, author, or standards document which defines their valid use). Knowing the context is essential, given the lack of constraint on duplication in FITS namespace.

FITS keywords have relationships to other keywords in the same dataset (for example, the value of NAXIS controls the occurrence of NAXISn). We developed attributes for a generalized description of these relationships.

FITS keywords occur in groups or ``bundles''; a standard set of required keywords must appear in a valid table extension header, for example. A FITS header itself is a special case of such a bundle, as is a FITS table. We established ancillary tables to describe the grouping or bundling of keywords.

We found it necessary to establish special types of keywords called ``tuple keywords,'' whose values are actually a list or other parsable structure of subvalues. We established a general-purpose way of documenting such tuple values, which handles any such keyword by defining subsemantics and subformats, separators, delimiters, etc. We accommodated the case where identical semantics are repeated N times, and the case where N sub-elements have distinct semantics.

Given that our instrument and telescope control systems use a keyword/value architecture, we needed to establish access control for keywords: some are writable, some readable, some are both. Access control attributes are needed for the configuration of dynamically-generated graphical user interfaces.

We found it necessary to establish a special relationship between certain keywords and an ``archetypal'' keyword of the same semantics. For example, one instrument system may format a value as F8.4 where another will format it as F6.4 or as a string. It is evident that these are really ``the same'' keyword.

As we struggled with subsemantics and archetypes, it became clear that we were documenting something more abstract than ``a FITS keyword.'' We borrowed the term ``meme'' to describe ``a unit of meaning''-a more general-purpose name for the entities we were manipulating.

This generalization led swiftly to the use of the ``Memes'' database to document itself, since the fields (columns) of a database table are memes with a large subset of the attributes of a FITS keyword (such as name, datatype, semantics, etc.).

Exploration of the relationship between database tables and Meme bundles led equally swiftly to automated translation between these formats.

In other words, we are able to extract (outload) a Sybase table as a FITS table extension, and to import (inload) a defined FITS table extension to a Sybase table. We are able to generate Sybase table definitions for any meme bundle, including standard FITS headers and FITS table extensions. There are many practical applications for these abilities (see Table 1).

Although we cannot adequately discuss the Memes schema within the present page limit, we invite the FITS community to investigate the Memes table definition and related tables, using the demo reports available at the Memes (Keywords) Database Demos page. If you choose ``Documentation of Memes by Name'' and enter the name Memes, the resulting report will be the current Memes table definition. We propose these attributes as the first draft of a standard set of attributes of FITS keywords, and invite comments from the FITS community.

3. Information Flow

The schema was expanded to include a set of ancillary tables describing protocols, formats, event timings, and agents (software, hardware, and human) so that we could document the flow of information through any large system.

The Agents schema permits us to document the source language, revision, authorship, and other attributes special to software agents. (This model corresponds well with the ``gaming table'' model of Noordam & Deich 1996.) As with Memes, a URI field permits the attachment of detailed documentation to any Agent.

A table of Paths documents the passage of Memes between Agents. The attributes of a Path are the sending and receiving agent, a controlling agent, the Meme ID, and a cluster of attributes describing the transaction: format, protocol, event timing, and elapsed time. Since human beings can be agents and Key Entry is an acceptable protocol, it is possible to document the Memes/Agents toolset itself using this schema.

We use a digraph generation package (see Acknowledgments) to generate graphical representations of information flow. A hierarchy of agents and paths was required in order to generate both simplified and detailed drawings. Accordingly, the schema was adjusted to permit the definition of ``superAgents'' which consist of multiple Paths.

The Agents/Paths database can be used not only to generate functional diagrams, but to assist in diagnostic procedures. It can be used to trace any FITS keyword value back to the originating or authoritative source; conversely, it can reveal which agents handle a keyword and thus may be affected by changes to the syntax or semantics of that keyword; it can reveal which keywords are handled by an agent, and which ``downstream'' agents may be affected if an agent is disabled or altered.

Currently we are addressing the distinction between syntactic and functional relationships among Memes. We have addressed, for example, the fact that the value of NAXIS controls the appearance of keywords NAXISn (a syntactic relationship). We have now to model those relationships in which a set of FITS keywords acts (in use, rather than in syntactic specification) as a state engine. This is particularly relevant for instrument control systems, in which (for example) a HATCHCLO keyword and a HATCHOPN keyword cannot both be true, but may both be false. The solution to this problem will almost certainly require one additional ancillary table in the schema.


We would like to acknowledge the invaluable contributions of the free software community, with particular gratitude to Tom Poindexter (Talus Technologies) for the SybTcl package, Stephen North and Eleftherios Koutsoufios (Lucent) for their digraph toolkit, Per Cederqvist and friends for CVS, John Ousterhout (Sun) for Tcl/Tk, and Mark Diekhans (Information Refinery) and Karl Lehenbauer (Neosoft) for TclX.


Noordam, J. E., & Deich, W. T. 1996, in Astronomical Data Analysis Software and Systems V, ASP Conf. Ser., Vol. 101, eds. G. H. Jacoby and J. Barnes (San Francisco, ASP), 229

© Copyright 1997 Astronomical Society of the Pacific, 390 Ashton Avenue, San Francisco, California 94112, USA

Next: Multiple World Coordinate Systems for DEIMOS Mosaic Images
Previous: Error and Bias in the STSDAS fitting Package
Up: FITS-Flexible Image Transport System
Table of Contents - Index - PS reprint - PDF reprint