Chapter 6 - Spectroscopy Pipeline and Data Analysis
This chapter describes the additional reductions beyond those described in Chapter 3.
The spectroscopy pipeline is divided into two stages. In the first stage, whole field images are created and a selection of bright star spectra is extracted, primarily to allow immediate quality analysis of newly arrived data. The second stage of the pipeline is run only after the best possible GALEX direct image source catalog has been generated. From this catalog, sources expected to yield a spectrum of significant signal-to-noise are selected and the final spectral catalog is produced.
In the first stage, the photons are processed similarly to the direct image pipeline: photon positions are aligned and whole field images (-cnt.fits, -int.fits, -rrhr.fits) of the photon intensity and system response are created. However, unlike the direct pipeline, source positions are not determined from the grism data and a source position catalog is not generated. Instead, object positions are determined from star catalogs (Tycho and NOAO). The pipeline selects the 100 brightest sources and generates spectral image strips (-pri_rtastar.fits) of photon count and response for each source. Each 80 arcsecond by 900 arcsecond strip contains an image of the strongest spectral orders in each band: 1st and 2nd for NUV, and 2nd and 3rd for FUV. In addition, a catalog of extracted spectra is generated (-xg-gsp_rtastar.fits).
In the second stage, which may occur much later, object positions are obtained from the best GALEX direct image catalog of a field as close in position as possible to the grism observation. (Often this will have the same tile or target name, but not always.) Depending on the eventual exposure depth of the grism target, a few hundred to a thousand or more of the brightest sources are spectrally extracted. A catalog of image strips (photon count and response) is produced for each source and each band (-pri.fits), and a final spectral catalog is generated (-xg-gsp.fits).
In addition to the photon count (-cnt.fits), intensity (-int.fits), and response (-rrhr.fits) FITS image files, which are also produced by the direct imaging pipeline, the spectroscopy pipeline produces four file types of its own. These include:
- -gsax.fits : GALEX Spectral Accumulation and Extraction file, which contains extraction parameters and a spectral source list. This file contains three FITS tables: 1) a list of extracted sources with the extraction parameters for each source, 2) a table of accumulated exposures, and 3) a profile model used for optimal extraction. An example of all four FITS HDUs is given here: gsax_header.txt . Here is an abbreviated example of the tables: gsax_tables.txt
- -pri.fits : Photon and Response Image strip file, which contains 2-D image strips for each source covering the area of the two strongest orders of each band. The image strip for each source appears in a separate header data unit (HDU). Examples of FITS headers are shown here: pri_header.txt .
- -xsp.fits : Raw Extracted Spectra file which contains 1-D uncalibrated spectra of each source in terms of photons/second versus arcseconds (position on the image).
- -gsp.fits : Calibrated spectral catalog of all spectrally extracted sources in terms of relative flux versus wavelength. A detailed description of the catalog columns is in gsp_columns_long.txt . For GR4, there are 22 new columns in the merged spectral catalog file (-xg-gsp.fits files). These include detector positions and Q values of each source at the undeviated wavelength position and at the middle of the primary order (NUV 1st and FUV 2nd). The 22 new columns are: n/fuv_nc, n/fuv_nr, n/fuv_mag, n/fuv_detx/y, n/fuv_Q, n/fuv_mid_detx/y, n/fuv_mid_Q, n/fuv_mid_ra, and n/fuv_mid_dec .
Each of these file types exists in two versions, one for the NUV band and one for the FUV band (e.g. -ng-gsax.fits and -fg-gsax.fits). For the calibrated spectral catalog, there is also a combined file (-xg-gsp.fits). A complete description of all GALEX product files is given in pipeline_files.txt . Below is an example of FUV and NUV image strips and a flux calibrated spectrum. Note that the spectrum always goes to zero between the two bands at about 1800 Angstroms.
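The naming pattern described above can be sketched in a few lines of Python. This is only an illustration: the helper name `grism_product_names` and the prefix `EXAMPLE_TILE` are hypothetical, and reading `ng`, `fg` and `xg` as the NUV, FUV and merged grism products is inferred from the example file names quoted in this chapter. The first-stage `_rtastar` variants are not covered; pipeline_files.txt remains the authoritative list.

```python
# Band codes inferred from the file names quoted in this chapter:
# "ng" (NUV grism) and "fg" (FUV grism). This mapping is an assumption.
BAND_CODES = {"NUV": "ng", "FUV": "fg"}

# The four spectroscopy-specific product types listed above.
PER_BAND_PRODUCTS = ("gsax", "pri", "xsp", "gsp")


def grism_product_names(base):
    """List the spectroscopy product file names for one observation.

    `base` is a hypothetical file-name prefix; real GALEX prefixes are
    not reproduced here.
    """
    names = [
        f"{base}-{code}-{product}.fits"
        for product in PER_BAND_PRODUCTS
        for code in BAND_CODES.values()
    ]
    # Only the calibrated spectral catalog has a combined NUV+FUV file.
    names.append(f"{base}-xg-gsp.fits")
    return names


if __name__ == "__main__":
    for name in grism_product_names("EXAMPLE_TILE"):
        print(name)
```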
- Wavelength Scale -- The dispersion function of the spectra (wavelength versus position on the detector) was determined using ground-based data. The average dispersion for NUV 1st order is approximately 4.04 Angstroms per arcsecond. The offset in arcseconds relative to the "undeviated wavelength point" as a function of wavelength in Angstroms is given for NUV 1st order by: offset = -882.1 + (0.7936 x wave) - (2.038 x 10^-4 x wave^2) + (2.456 x 10^-8 x wave^3). For FUV 2nd order, the average dispersion is approximately 1.64 Angstroms per arcsecond and the formula is: offset = -4530.7 + (7.3859 x wave) - (3.9254 x 10^-3 x wave^2) + (7.4750 x 10^-7 x wave^3). The relative wavelength scale is accurate to within a few tenths of an arcsecond. The absolute wavelength scale is computed for each exposure by correlating bright star spectra with known templates (see Grism Aspect Correction below) and is usually accurate to within 1 arcsecond.
- Position Dependent Response Correction -- A position-dependent GALEX flat field response correction is applied to the grism field in the same manner as in the direct image processing. This correction is contained in the FITS response image file (-rrhr.fits). In addition, starting with the GR4 release, a relative grism flat field correction is applied, which is separated into 4 wavelength bins for the NUV 1st order part of the spectrum and 2 wavelength bins for the FUV 2nd order spectrum. The standard star LDS 749B was observed at 35 different positions on the detector. These data were used to create the 6 "wavelength-dependent" grism flat field images (4 for NUV, 2 for FUV). The average correction amplitude in these images is about 1%, with a maximum amplitude of about 5% in a few areas. This wavelength- and position-dependent correction is stored in the last row of the response image strips (in the -pri.fits file) and applied during the spectral extraction after background subtraction. Future improvements to the grism pipeline will include new observations of a brighter standard star so that each band can be divided into a greater number of wavelength sections.
- Wavelength Dependent Response Correction -- The GALEX spectral calibration was updated for the GR4 release using the fainter standard star HZ21, to avoid possible saturation issues encountered in GR3 when the brighter standard BD+33 2642 was used. The new spectral calibration was cross-checked against the photometric calibration using GALEX grism observations of the photometric standard LDS 749B and found to agree to within the errors. The largest change is in the FUV, where the average Effective Area (EA) is 79% of the GR3 EA, with excursions to as low as 57%. In the NUV, the change is much less significant: the GR4 EA curve averages 99% of the GR3 curve, with excursions to as low as 86% and as high as 106%. The plot below compares the two sets of effective area curves for the primary spectral orders (Order 2 for FUV and Order 1 for NUV). The new curves are solid while the older curves are dashed.
- Photon Alignment -- The relative photon positions are computed in a manner similar to the direct imaging pipeline. However, the absolute correction uses a different method, based on the extracted spectra of bright stars.
- Grism Angle -- The "grism angle" is the sum of the detector grism angle and the roll of the spacecraft (GRSPA = DGA + Roll - 90). "DGA" is defined as 0 degrees when the spectra point (blue to red) along the positive X axis direction, and 90 degrees when they point in the positive Y direction. "GRSPA" is 90 degrees when the spectra point East and 0 degrees when the spectra point North. Before computing the correct aspect (RA and Dec) of the field, an approximate grism angle is determined by iteratively comparing the RMS of a 1-D array obtained by summing the rows of a field image at various trial grism angles. After an approximate aspect correction is done, a more precise grism angle (within 0.1 degrees) is determined using bright stars in the field.
- Offsets and Field Rotation -- Using about 20 bright stars in the field, offsets are computed in the grism dispersion direction of the spectra and in the spatial direction (perpendicular to the dispersion) by cross-correlating the extracted star spectra with a synthetic template of main-sequence star spectra. These values are stored in the FITS header cards 'SXOFF' and 'SYOFF' in the '-gsax.fits' and '-gsp.fits' product files. A field rotation is also computed (stored as the 'STWIST' card). For the GR4 release, these values are converted into RA and Declination offsets, which are applied to the whole field images (-int.fits and -rrhr.fits). In these images, the coordinates of a given source are located at the 'undeviated wavelength position'. This is the position which would remain the same on the sky as the grism is rotated, while the spectrum of a given object would always point out from this position (blue to red).
- Quality Check -- The aspect correction is probably the most critical part of the reduction of a given exposure. In order to verify the accuracy of this correction, several 'QA' files are created by the grism pipeline. One of these files (-xg-offset_profile_qa_rtastar.jpg) is shown below.
- Blind Extraction -- Source positions are selected from a known star catalog (Tycho, NOAO), a GALEX direct image source catalog (-xd-mcat.fits), or a user-supplied list of positions (by special request). Objects are not detected using the grism data; spectra are 'blindly' extracted given an RA and Declination position and extraction parameters (object and background windows). The list of extracted sources and parameters appears in tables in the '-gsax.fits' and '-xg-gsp.fits' product files. The source list in the '-gsax.fits' file contains both extracted sources and sources used only for masking purposes (see the 'flag' column).
- Image Strips -- For each source, two image strips (usually 80 x 900 arcseconds) are created and stored (see -pri.fits)-- a photon accumulation image and a response image. The pixel size is 1 arcsecond (not 1.5 arcseconds which is used in the -int.fits and -rrhr.fits images).
- Masking Nearby Sources -- In addition to the list of extracted sources, there is also a somewhat larger list of sources used for masking. Given the direct image NUV and FUV magnitudes of these masking sources, the position and intensity of their spectra are estimated on the grism image. If the flux of a neighboring source is significant compared to the primary source in a given image strip, the neighboring source is masked (the affected pixels are set to a negative value). Masking of nearby sources is evident when viewing the '-pri.fits' file image strips.
- Extraction Parameters -- The default object extraction window is 10 arcseconds wide, centered on the predicted position of a source. The separation between the object window and the start of the background window (on either side) is 6 arcseconds. The total image strip is 78 arcseconds in the spatial direction (78 rows), which leaves 28 arcseconds on either side for background determination. At each wavelength point (or column), the background is determined by averaging all pixels in the background rows of that column and of the 3 columns on either side. These parameters are given in the tables (-gsax.fits and -xg-gsp.fits) and could vary, although the current standard pipeline uses these values for all sources.
- Simple and Optimal Extraction -- The default spectrum is a simple summation of pixels across the object window at each wavelength point for the primary (strongest) order in each band (1st for NUV, 2nd for FUV). The error is computed from counting statistics using the total number of photons at each wavelength point. This default spectrum is stored as the fixed length array columns 'obj' and 'objerr' in the -xg-gsp.fits binary FITS table file. In addition, an 'optimal extraction' spectrum is computed using a single model profile for a given exposure (the profile model is stored in the -gsax.fits file) and stored in the 'opx' and 'opxerr' columns. A 'secondary spectrum' is also extracted using the 2nd order NUV and 3rd order FUV spectra, and is stored as 'objs' and 'objserr'. For coadded data, a 'median spectrum' is computed using the median image strips (see below).
- Rebinning to a Linear Scale -- The uncalibrated spectral data appear in the '-xsp.fits' file as photons/second versus position relative to the undeviated wavelength point in arcseconds. These data are rebinned to a scale of 3.5 Angstroms per bin so they can be stored on a linear wavelength scale in the '-xg-gsp.fits' file. The rebinning is done by summing up partial pixels based on the wavelength dispersion function.
- Calibration -- The photon data are divided by the wavelength-independent response prior to background subtraction. After the spectrum is extracted, the wavelength-dependent response correction is applied. This includes the position- and wavelength-dependent response correction mentioned above, which is stored in the last row of the response image strip. This correction yields photons/second/cm^2/Angstrom in the final spectra in the '-xg-gsp.fits' file.
- Merging -- The NUV and FUV data are processed separately (although the aspect correction is derived from the NUV only and applied to the FUV data). The final '-xg-gsp.fits' file is created by combining the '-gsp.fits' files from each band. An inter-band gap of 10 Angstroms between FUV and NUV at 1820 Angstroms is set to zero in the spectra (-1 in the error spectra). These values are given by the 'BANDSPLT' and 'BANDSGAP' header cards and may change in the future (based on better calibration).
- Adding Image Strips -- Each exposure (or 'visit') has a set of photon and response image strips for each object (-pri.fits). To create the coadded data for a given target (or 'tile'), the image strips from every visit are added together for each extracted object. Masked portions of an image strip, which contribute zero photons and zero response in a given visit, will tend to be filled in by data from other visits or from different grism angles. (An optimal set of data for a deep survey will contain a wide variety of grism angles.) The final coadded photon and response image strips can then be extracted as described above. The coadd products will include a '-prc.fits' file (instead of a '-pri.fits' file). Other filename extensions remain the same.
- Median Image Strip -- For multiple visit data, a median image strip is created for each object. These data are stored in the '-prm.fits' file and are used to create the 'median' spectra which appear in the '-xg-gsp.fits' file (column name 'objmdn'). There is no error spectrum for the median spectrum, which is generally useful only for investigating possible non-physical artifacts in a coadded spectrum (since any strong unmasked artifact in a single visit may appear in the sum, but not in the median).
- Inter-visit Masking -- Some features in the GALEX grism data, including optical reflections and bright stars just off the field of view, are not masked at the visit level since they cannot be predicted accurately. To improve the quality of the data in targets with many exposures, these features are identified by comparing the image strip data for a given object with the median image strip data (from all visits). Using a variety of criteria, including contiguous deviant pixels, most significant artifacts can be eliminated. For targets with many visits and many different grism angles, the quality of the coadded spectra can be much better than for single visit spectra. An example of a coadded whole field image with multiple grism angles is shown below.
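The coaddition and median steps described in the Adding Image Strips and Median Image Strip items can be illustrated with a small pure-Python sketch. It is not the pipeline's implementation. It assumes that image strips are nested lists of equal shape and that a masked pixel is flagged by a negative value; the function names, the `MASKED` output flag, and the choice to take the median of whatever per-visit quantity is passed in are all illustrative assumptions.

```python
from statistics import median

MASKED = -1.0  # assumed output flag for a pixel with no usable data


def coadd_strips(count_strips, response_strips):
    """Sum per-visit count and response strips pixel by pixel.

    Each strip is a list of rows (lists of floats) with identical shape.
    A pixel with a negative count is treated as masked (an assumption)
    and adds zero counts and zero response for that visit.
    """
    n_rows = len(count_strips[0])
    n_cols = len(count_strips[0][0])
    total_counts = [[0.0] * n_cols for _ in range(n_rows)]
    total_response = [[0.0] * n_cols for _ in range(n_rows)]
    for counts, response in zip(count_strips, response_strips):
        for r in range(n_rows):
            for c in range(n_cols):
                if counts[r][c] >= 0:  # unmasked pixel
                    total_counts[r][c] += counts[r][c]
                    total_response[r][c] += response[r][c]
    return total_counts, total_response


def median_strip(strips):
    """Pixel-wise median over visits, ignoring masked (negative) pixels."""
    n_rows = len(strips[0])
    n_cols = len(strips[0][0])
    result = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            values = [s[r][c] for s in strips if s[r][c] >= 0]
            row.append(median(values) if values else MASKED)
        result.append(row)
    return result


if __name__ == "__main__":
    # Three hypothetical visits of a 1 x 3 strip: the second visit has an
    # artifact in the last pixel, the third has its middle pixel masked.
    counts = [[[1.0, 2.0, 3.0]], [[1.0, 2.0, 50.0]], [[1.0, -1.0, 3.0]]]
    response = [[[10.0, 10.0, 10.0]] for _ in counts]
    print(coadd_strips(counts, response))
    print(median_strip(counts))
```

In this toy case the artifact in the second visit dominates the summed counts for the last pixel but leaves the median unchanged, which is why comparing the two is useful for spotting unmasked artifacts.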
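For reference, the two dispersion polynomials quoted under Wavelength Scale can be evaluated and inverted numerically. The sketch below is not pipeline code: the function names and the 1000 to 3500 Angstrom search bracket are assumptions made for illustration, and only the coefficients come from this chapter.

```python
# Polynomial coefficients (constant term first) copied from the
# Wavelength Scale item: offset in arcseconds vs. wavelength in Angstroms.
NUV_ORDER1 = (-882.1, 0.7936, -2.038e-4, 2.456e-8)
FUV_ORDER2 = (-4530.7, 7.3859, -3.9254e-3, 7.4750e-7)


def offset_arcsec(wave, coeffs):
    """Offset from the undeviated wavelength point, in arcseconds."""
    c0, c1, c2, c3 = coeffs
    return c0 + c1 * wave + c2 * wave**2 + c3 * wave**3


def dispersion(wave, coeffs):
    """Local dispersion in Angstroms per arcsecond: 1 / d(offset)/d(wave)."""
    _, c1, c2, c3 = coeffs
    return 1.0 / (c1 + 2.0 * c2 * wave + 3.0 * c3 * wave**2)


def wavelength_at(offset, coeffs, lo=1000.0, hi=3500.0, tol=1e-6):
    """Invert the polynomial by bisection.

    Both cubics increase monotonically with wavelength, so bisection is
    safe; the default bracket [lo, hi] is an assumption, not a pipeline value.
    """
    if not offset_arcsec(lo, coeffs) <= offset <= offset_arcsec(hi, coeffs):
        raise ValueError("offset lies outside the bracketed wavelength range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if offset_arcsec(mid, coeffs) < offset:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(dispersion(2300.0, NUV_ORDER1))
    print(dispersion(1500.0, FUV_ORDER2))
    print(wavelength_at(offset_arcsec(2300.0, NUV_ORDER1), NUV_ORDER1))
```

For example, `dispersion(2300.0, NUV_ORDER1)` evaluates to roughly 4.07 Angstroms per arcsecond, close to the quoted NUV 1st order average of 4.04, and `dispersion(1500.0, FUV_ORDER2)` gives about 1.53, to be compared with the quoted FUV 2nd order average of 1.64.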