Next: Formatting Journal Tables in XML at the ADC
Up: Archiving and Information Services
Previous: Archive Storage Subsystem for the ESO VLT

Pirenne, B., Albrecht, M., & Schilling, J. 1999, in ASP Conf. Ser., Vol. 172, Astronomical Data Analysis Software and Systems VIII, eds. D. M. Mehringer, R. L. Plante, & D. A. Roberts (San Francisco: ASP), 269

The Prospects of DVD-R for Storing Astronomical Archive Data

Benoît Pirenne1, Miguel Albrecht2
European Southern Observatory, Karl-Schwarzschild-Str. 2, D-85748 Garching bei München, Germany

Jörg Schilling
Research Institute for Open Communication Systems, Berlin

Abstract:

The Digital Versatile Disk (DVD), heralded as the successor of the CD-R as a direct-access, cheap archive medium, has been slow to materialize. This paper reviews the current situation regarding DVDs and assesses the suitability and maturity of the technology for astronomical data archives. Considerations such as cost per unit of capacity, durability, format issues and compatibility are addressed. The current projects involving DVD technology at ESO are presented.

1. Introduction

In the past 10 years the photon-collecting efficiency, and hence the data throughput, of astronomical equipment has been multiplied by almost two orders of magnitude: larger mirrors ( x 4), vastly superior efficiency of CCDs ( x 3), and scheduling tools that make better use of the time available ( x 2). Moreover, the quest for ever better resolution has spawned CCD mosaics that can produce extremely large images ( x 5). New infrared detectors, with their rapid sequences of exposures, are also challenging data acquisition systems.
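
As a rough check of the factors quoted above, and assuming that the individual gains are independent and simply multiply (an assumption that the text does not state explicitly), the combined growth can be written as:

$$4 \times 3 \times 2 = 24 \approx 10^{1.4}, \qquad 4 \times 3 \times 2 \times 5 = 120 \approx 10^{2.1}$$

Under that assumption, the product reaches about two orders of magnitude only once the factor for CCD mosaics is included.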

On top of the multiplication factor described above, the cumulative effect caused by systematic archival of all observations complicates the problem faced by archivists. Also, using new observing facilities is now expensive enough to make data worth keeping (almost) forever.

2. Towards Higher Density

To address the challenges posed by this dramatic increase, technologists need to re-assess the means at their disposal in the area of data storage. Until recently, archives were a mere ``bit bucket'': data was kept on some off-line medium, but hardly ever retrieved. Pioneered by IUE (Wamsteker 1991) and later by the Hubble Space Telescope (Pirenne et al. 1993), a new era began in which archives became a more lively system: data was stored with the idea that any particular exposure would be retrieved many times over, during and after the lifetime of the instrument which produced it (Crabtree et al. 1996).

A third-generation archive is probably necessary for the early years of the next decade, when we will see requirements for rapid access to huge quantities of digital data coming from different instruments, different observatories and different epochs. These data will have to be rapidly re-calibrated, associated, analyzed and cross-correlated prior to their distribution to the final users. This is triggering the need for fast access to thousands of science and calibration exposures, each occupying perhaps dozens of megabytes.

3. From CDs to DVDs

For the past 3 to 5 years, the needs of our archives have been satisfied by an existing low-cost, vendor-independent, medium-density, long-lifetime medium: the ``CD-Recordable'' (CD-R). This medium is indeed extremely cheap ($1/volume), and the recording system is also very affordable and available from multiple vendors. Moreover, large jukeboxes capable of making hundreds of such CD-Rs available quasi on-line are also rather affordable.

When the DVD technology was announced, it really looked like the logical step forward for growing archives. Unfortunately, the technology has only recently become available and real testing could only start this summer.

Another complication, perhaps explaining some of the delay, was the competition between potential vendors of the technology: each was coming up with its own version. As a result, there was great confusion amongst potential buyers faced with unclear choices: DVD-R, DVD-ROM, DVD-RAM, DVD-R/W, DVD+RW, ...

So far, those who had to make a choice have largely turned away from the technology and started to explore other avenues.

So far a reasonable choice in this area has been the CD-R. Comparisons are shown in Table 1.

In the ESO/ST-ECF archive, the idea of non-erasable, passive media, located in jukeboxes has always seemed very attractive and has been successfully implemented and used for a number of years. Therefore, a ``jump'' on the first available similar technology from the DVD world was very tempting.

In early 1998, the only such system available was the DVD-R system from Pioneer, with the following positive and negative characteristics:


\begin{deluxetable}{l p{4.5cm} r c}
\scriptsize\tablecaption{Currently available...
...
\hline
tape robot & DLT juke box & \$26/GB & yes \nl
\enddata
\end{deluxetable}
Magnetic disk systems are RAID (Redundant Array of Inexpensive Disks) systems: their cost is increased to account for a necessary backup on, say, tapes, or for the setup of a given level of data redundancy.

4. Current experiments at ESO

In early 1998, our major problem was the lack of Unix-based software to drive the DVD-R recorder. This problem was solved by Jörg Schilling, co-author of this article, who adapted his popular public-domain program cdrecord to deal with the DVD-R device as if it were simply a larger CD-R. As a matter of fact, the file system written on the new DVD-R medium is still ISO 9660-compliant, as we did not have the time (or the real need) to compose a UDF file system, which would have been the format required to call our medium a ``DVD''. Again, our goal was to record data for our own archive, not for redistribution.

With this system in place, we could start preparing media. At the same time, it became clear that we could not start large-scale storage on the new medium without a juke box system to make the disks available on-line. This was the sore point: only one company (Pioneer) had a model available in the third quarter of the year, and it offered only 100 slots. With other vendors promising larger and more capable models, we decided to wait a bit longer before purchasing such a device.

In the meantime, we recorded a number of media and tried to check them fully. Roughly 30 media with various types of astronomical data files were recorded, mostly as GNU-compressed FITS files. This had the advantage that we could check not only the file system consistency, but also any read-write errors in the files themselves, using the ``test'' option of the decompression command. Under Unix, this gave the following simple medium verification command:


find . -type f -print -exec gunzip -t {} \;

Using this combination, we could uncover occasional incomplete file reads and ``CRC'' errors in the files, giving us clues as to the source of the errors. With this method, applied to the same disks read with various reader devices, we concluded that not all reader devices are born equal: we could only get error-free reads with the Panasonic LF-D101 and with the Pioneer DVD S101 recorder. We still think the recording quality is only marginal and that a better recording device might alleviate our problems.
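
When the same check has to be repeated for many disks and several reader devices, it can help to collect only the failing files and return an exit status. The following is a minimal sketch of that idea, not the procedure actually used at ESO: the function name verify_medium is hypothetical, and restricting the test to files ending in .gz is an assumption made here so that uncompressed files are not reported as errors.

```shell
# Hypothetical helper: test every gzip-compressed file below a directory.
# Prints the paths that fail "gunzip -t" and returns 1 if there are any.
# Assumption: compressed files end in ".gz"; other files are skipped.
verify_medium() {
    dir=${1:-.}
    # gunzip -t reads each file completely and checks its CRC without
    # writing any output; a truncated or corrupted file gives a non-zero
    # exit status, in which case the inner shell prints the file name.
    bad=$(find "$dir" -type f -name '*.gz' -exec sh -c '
        for f in "$@"; do
            gunzip -t "$f" 2>/dev/null || printf "%s\n" "$f"
        done' sh {} +)
    if [ -n "$bad" ]; then
        printf 'FAILED:\n%s\n' "$bad"
        return 1
    fi
    # Note: a directory without any .gz file also ends up here.
    echo "all files verified"
}
```

Called as verify_medium /mnt/dvd (a hypothetical mount point), the function can be run once per reader device, and the lists of failing files compared between devices.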

Fortunately, the DVD jukebox we are currently in the process of acquiring will be equipped with the devices that have proven capable of reading the media we have already produced.

5. What if it fails?

The DVD(-R) technology is brand new and is not yet on firm ground. In particular, the multiplicity of its variants and the commercial battle amongst vendors proposing different, incompatible recordable formats favor only one thing right now: consumer confusion and distrust.

If it turns out that DVD-R technology does not have a future, we will have to find an alternative route to store our data. We therefore want to keep all options open and have decided to choose a juke box system that will simultaneously accommodate most kinds of CDs and DVDs. This will protect the investments we might have made in any technology so far, while keeping the possibilities open for such improvements as double-sided media.

If all else fails, we will still be able to re-use this particular juke box system and retrofit it to support any kind of $5 {1\over4} ^{\prime\prime}$ recording medium.

6. Conclusion

In looking early on at the DVD technology, we have tried to simultaneously satisfy our need to support the data volume growth that current and future astronomical instruments will impose on us, while continuing with our existing data storage paradigm: cheap, non-erasable, passive, $5 {1\over4} ^{\prime\prime}$ media.

The early experiments have been plagued with reliability problems, which have both slowed the adoption of the technology and cast some doubt on its future.

As we will have to select a method for storing the data we will receive anyway, we have decided to make the kind of careful technological investment that will enable us to change our minds in time, without having to re-consider the entire investment and set of decisions.

References

Crabtree, D., Durand, D., Gaudet, S., Hill, N., & Pirenne, B. 1996, in ASP Conf. Ser., Vol. 101, Astronomical Data Analysis Software and Systems V, ed. G. H. Jacoby & J. Barnes (San Francisco: ASP), 505

Pirenne, B., Benvenuti, P., Albrecht, R., & Rasmussen, B. F. 1993, in SPIE Proc., Vol. 1945, Space Astronomical Telescopes and Instruments II, ed. P. Y. Bely & J. B. Breckinridge (Bellingham: SPIE), 83

Wamsteker, W. 1991, in Databases & On-Line Data in Astronomy, ed. M. Albrecht & D. Egret (Dordrecht: Kluwer), 35



Footnotes

1 (Pirenne) Space Telescope - European Coordinating Facility
2 (Albrecht) Data Management Division

© Copyright 1999 Astronomical Society of the Pacific, 390 Ashton Avenue, San Francisco, California 94112, USA
