At present, a typical data set resulting from the Robert C. Byrd Green Bank Telescope (GBT) is composed of individual FITS files for each device required for an observation (e.g. the antenna, LO, backend) as well as a log (also a FITS file) which indexes all of the device files according to scans. GBT data can be assimilated into the AIPS++ DISH utility by using the AIPS++ d.import command, or by using the gbtmsfiller command, called from the UNIX command line. Either step transforms the raw data into a representation that is sensible from the astronomical perspective.
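The role of the scan log can be illustrated with a minimal Python sketch. It assumes the log has already been read into a flat list of (scan, device, file name) records. The record layout, the file names, and the helpers `group_by_scan` and `missing_devices` are hypothetical and do not reflect the actual GBT log format.

```python
from collections import defaultdict

# Hypothetical log entries: (scan number, device name, FITS file name).
# The real scan log is itself a FITS file; this layout is assumed for illustration.
SCAN_LOG = [
    (1, "Antenna", "antenna_scan1.fits"),
    (1, "LO", "lo_scan1.fits"),
    (1, "DCR", "dcr_scan1.fits"),
    (2, "Antenna", "antenna_scan2.fits"),
    (2, "LO", "lo_scan2.fits"),
]


def group_by_scan(log_entries):
    """Return {scan: {device: filename}} built from flat log entries."""
    scans = defaultdict(dict)
    for scan, device, filename in log_entries:
        scans[scan][device] = filename
    return dict(scans)


def missing_devices(scans, required):
    """Return {scan: sorted list of required devices that have no file}."""
    gaps = {}
    for scan, devices in scans.items():
        absent = sorted(set(required) - set(devices))
        if absent:
            gaps[scan] = absent
    return gaps


if __name__ == "__main__":
    scans = group_by_scan(SCAN_LOG)
    print(scans[1]["LO"])
    print(missing_devices(scans, ["Antenna", "LO", "DCR"]))
```

Grouping by scan first makes it easy to detect an incomplete scan before any attempt is made to combine the device files.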
Because the GBT was designed to produce its raw data as a collection of FITS files, it is a challenge for any data reduction package to combine the information for analysis. To fill data into an AIPS++ Measurement Set, the development team spent up to two years resolving issues associated with the data itself, and was eventually able to produce the gbtmsfiller routine which is in use today. Prior to the launch of the GBT data accessibility exploration, IDL users (for example) had to follow a similar process independently, writing their own modules to extract and pre-process relevant information from the collection of GBT FITS files. Users of other packages are still faced with this barrier.
The demand for greater accessibility has been expressed within NRAO as well as by visiting observers. Several astronomers at Green Bank have expressed a desire to process data in IDL, making use of third-party IDL modules developed for astronomy. Engineers working on the Precision Telescope Control System (PTCS) project (a major initiative currently underway which will provide the pointing, collimation and surface accuracy required to allow the GBT to operate effectively at 3 mm; see papers in this volume by Constantikes 689 and Marganian 724) do much of their analysis in Matlab and need to access data from astronomical observations within the Matlab application. Requests have also been made to allow ready data reduction within the CLASS, Classic AIPS, and Mathematica packages.
The primary goal of this effort is to make GBT data more readily accessible to various data analysis packages. It is understood that each package has its own unique strengths and limitations, and not all packages may be able to reduce all types of GBT observations. However, with a clear understanding of what is possible with each package, an astronomer will have greater leverage in choosing the tool that best suits his or her needs for a particular investigation.
This is not exclusively a data format issue, although knitting together the disparate FITS files currently produced into one cohesive structure is one important step to enable many of the data paths. The intention is not to create a new, all-encompassing data format for the GBT, but to arrive at a reasonable representation that will make it straightforward to transition to future, standardized single dish data formats. (One possibility is the MBFITS specification that is under discussion by ALMA.)
Meeting several objectives will facilitate the accomplishment of these goals:
Once this process is complete, we will be able to verify the consistency of scientific results between data analysis packages (e.g. IDL vs. AIPS, AIPS++ vs. CLASS, CLASS vs. IDL); until now we have not had two or more packages with which cross-comparisons could be performed. Being able to perform cross-comparisons will aid the process of commissioning data reduction for new capabilities on the GBT, ensuring that errors are caught well in advance of live observations using a new device.
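A cross-comparison of this kind could, in its simplest form, be a channel-by-channel check of two reduced spectra. The following is a minimal sketch under stated assumptions: the two spectra are assumed to have already been exported from each package as equal-length lists of floats, and the function name `compare_spectra` and the tolerance values are illustrative rather than part of any GBT tool.

```python
import math


def compare_spectra(a, b, rel_tol=1e-6, abs_tol=0.0):
    """Return the indices of channels where two spectra disagree beyond tolerance."""
    if len(a) != len(b):
        raise ValueError("spectra must have the same number of channels")
    return [
        i
        for i, (x, y) in enumerate(zip(a, b))
        if not math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
    ]


if __name__ == "__main__":
    # Made-up values standing in for the same scan reduced in two packages.
    package_a = [1.0, 2.0, 3.0]
    package_b = [1.0, 2.0000001, 3.5]
    print(compare_spectra(package_a, package_b))
```

Reporting the mismatched channel indices, rather than a single pass/fail flag, makes it easier to trace a disagreement back to a specific processing step.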
Three types of data were evaluated during the initial exercises: continuum data taken with the Digital Continuum Receiver and spectral line data from both the GBT spectrometer and spectral processor.
Python was chosen as the programming language for all accessibility prototypes because it is a powerful language with the array handling needed for working with GBT data. It is also quick to learn: skilled software engineers in Green Bank with no prior knowledge of Python were able to produce useful results within two to three days of beginning to work with the language. Additionally, several ALMA prototypes are being written in Python, indicating that Python could become a core competency among software engineers throughout NRAO.
Proof-of-concept exercises have been performed using IDL, and Matlab experiments are in progress (Figure 1). These experiments take advantage of the FITS Query Language to create an intermediary data format based on SDFITS. The next phase of prototype work, to be completed by the end of the year, will explore data accessibility by other analysis packages.
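The idea of an intermediary format can be sketched as merging per-device values into one flat row per scan. The sketch below is an illustration only, not the generator's actual implementation. It uses plain dictionaries in place of FITS tables, the column names are loosely modelled on SDFITS-style names, and the `build_rows` helper and all sample values are hypothetical.

```python
# Hypothetical per-device values keyed by scan number. In practice these
# would be extracted from the individual device FITS files.
ANTENNA = {1: {"OBJECT": "SRC_A", "DATE-OBS": "2003-11-24T01:00:00"}}
LO = {1: {"RESTFREQ": 1.420405752e9}}
BACKEND = {1: {"EXPOSURE": 30.0, "DATA": [0.1, 0.2, 0.3]}}


def build_rows(*device_tables):
    """Merge per-device tables into a list of flat rows, one per scan."""
    rows = {}
    for table in device_tables:
        for scan, values in table.items():
            row = rows.setdefault(scan, {"SCAN": scan})
            for key, value in values.items():
                # Refuse to silently overwrite a conflicting value.
                if key in row and row[key] != value:
                    raise ValueError(f"conflicting {key!r} for scan {scan}")
                row[key] = value
    return [rows[scan] for scan in sorted(rows)]


if __name__ == "__main__":
    for row in build_rows(ANTENNA, LO, BACKEND):
        print(row["SCAN"], row["OBJECT"], row["EXPOSURE"], len(row["DATA"]))
```

Raising an error on conflicting values is a deliberate choice in this sketch: when two devices disagree about the same quantity, the disagreement should be resolved explicitly rather than hidden by whichever file happens to be read last.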
Making GBT data accessible to additional data analysis packages is being done in a staged approach, aligned with demand from visiting observers and other development priorities of the GBT project. IDL is being targeted immediately, because of the strong demand that has been expressed by visiting observers and local astronomers alike. Accessibility of GBT data to Matlab is also being addressed at the present time to support critical PTCS experiments. In the next stage, access to CLASS will be investigated to support a wider audience of radio astronomers, and accessibility to AIPS will be explored, in part to support research for GBT development projects now in their earliest stages. Mathematica, which has the fewest identified users to date, will be explored once solutions are in place for other packages which are used more widely.
On November 24, 2003, the beta version of the SDFITS generator was released for wide internal review. Continuum data from the DCR, as well as spectral line data from both the spectrometer and the spectral processor, are fully supported. File sizes are somewhat smaller than the total size of the raw data files, and much smaller than equivalent MeasurementSets. The output in the SDFITS files has been validated against the AIPS++ filler and is at least as accurate, although the generator runs much more slowly. Future plans include making the preprocessing components used to generate the SDFITS files fast enough to replace the AIPS++ filler, so that data to be reduced in most data reduction packages will be preprocessed by the same, uniformly validated components.
The GBT project does not intend to provide dedicated support to users of all the packages described herein; however, limited hands-on support for select packages such as AIPS++ and IDL will be available. The intent is to provide sufficient documentation of all of the options to make it possible for any observer to easily use the data analysis package of their choice.
Up-to-date information on this project can be found online at http://wiki.gb.nrao.edu/bin/view/Data/WebHome.