This section presents a brief overview of the data reduction software
for the NOAO CCD Mosaic. It concentrates on describing the general data
flow and the main data reduction software components. The details of each
component are given in later sections. The figure below illustrates the
components and data flow with the data reduction components highlighted.
The NOAO CCD Mosaic Software System
The data acquisition system (DAS) sends pixel and descriptive
information down the message bus. Various components receive the
information from the message bus. Two major components are the real
time display (RTD) and the data capture agent (DCA). The DCA
writes the information for the observation to a FITS Multiextension
file (FITS-ME) in the Mosaic data format. It
also sends a message to the data reduction agent
(DRA) as each observation file is completed.
The data reduction agent automatically (or by user command) operates on
the observation files. Basic calibrations are applied by CCDPROC. This removes detector biases and defects.
Another basic calibration applies a world coordinate system (WCS)
calibration prepared earlier from a standard astrometry field. This is
done by MSCWCS. The basic WCS calibration defines a
fairly accurate mapping between pixels and celestial coordinates. If a
catalog of sources for the field of observation is available MSCWCS finds
the sources based on the WCS calibration and updates the WCS for small
errors. The DRA may automatically obtain source information for the field
from on-line catalogs.
Calibration observations, such as flat fields, generally include many
exposures to minimize noise. The multiple exposures are combined by COMBINE to create master calibrations to be applied to
the science observations. The DRA automatically senses sequences of
calibration exposures and combines them once the sequence is finished. It
also keeps track of the master calibrations and applies the appropriate one
to new science exposures. Science exposures are generally not combined by
the DRA since these are usually dithered or rastered.
In the basic processing the individual amplifiers and CCDs are kept
separate except that multiple amplifiers from a single CCD may be merged
together into a single extension for the CCD. The user may analyze the
calibrated exposures keeping the readouts separate. This avoids any
resampling of the pixel data. However, the user may wish to resample the
elements of the Mosaic into a single large image. MSCIMAGE uses the WCS to resample the pixels into a
uniform grid on the sky. This corrects alignment errors in the detector
and optical distortions.
When multiple exposures of a field are taken the images from MSCIMAGE
are combined with COMBINE. This would include offsets for dithering
and raster patterns. To avoid resampling the data a second time MSCIMAGE
produces images which are sampled on an even grid of pixels on the sky.
This means that multiple exposures can be combined using integer shifts
along the raster axes.
For this to work well the WCS used by MSCIMAGE must be consistent over
the data set. There are two ways in which this is assured. One is if the
WCS has been absolutely calibrated by MSCWCS using catalog sources.
If one does not have a catalog of sources to determine an absolute
WCS calibration then the objects in the images may be used to derive
a self-consistent WCS over a set of overlapping images. The objects
in each image are cataloged by an automatic detection algorithm. Each
object will have a coordinate based on the approximate WCS calibration.
The objects in the catalogs are matched using an automatic matching
algorithm. The WCS coordinates of matched objects are adjusted
to define a consistent WCS for all the images from the field. In
effect this registers objects in the images so that when the images
are combined the common objects will be aligned. The task that does
this is MSCREGISTER.
In addition to calibrating, registering, and combining the pixel
data the data reduction software also creates and maintains auxiliary
data. This includes bad pixel masks,
uncertainty arrays, and exposure
maps.
Mosaic Data Format
The NOAO CCD Mosaic data format consists of a single FITS file
for each observation. The FITS file contains a primary header
with no associated data and a number of extensions. The primary
header is used to describe the contents of the file and contains
global keyword information applicable to all the image
extensions. The extensions include the image data from each amplifier,
pixel masks, uncertainty
arrays, exposure maps, auxiliary tables, etc. The image extensions are
always present while the other information is added at various stages
during the reductions.
The following figure illustrates the data structure. The PDU stands
for the primary data unit. The figure also shows how the inheritance
convention defines the header for each image extension as the combination
of the global keywords and the keywords for each individual image header.
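The inheritance convention can be sketched as a dictionary merge. This is only an illustration, not the FITS kernel implementation. The function name effective_header, the sample keywords and values, and the rule that an extension keyword overrides a global keyword of the same name are all assumptions.

```python
def effective_header(global_keywords, extension_keywords):
    """Combine the PDU keywords with one extension's own keywords.

    Assumption: when both headers define a keyword, the extension wins.
    """
    merged = dict(global_keywords)      # start from the global (PDU) keywords
    merged.update(extension_keywords)   # add or override with extension keywords
    return merged


# Hypothetical keyword values, for illustration only.
pdu = {"FILTER": "V", "EXPTIME": 300.0}
ext = {"EXTNAME": "im1", "GAIN": 2.1}
header_im1 = effective_header(pdu, ext)
```

The global header is left untouched, so the same PDU keywords can be merged with each image extension in turn.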
Pixel masks assign a non-negative integer value to each pixel
in an image. The meaning of a mask value depends on the purpose of the
mask and on the application that will use it, and more than one mask may
be assigned to an image. Because it is often the case that most pixels have the
same mask value IRAF provides a special representation called a "pixel
list". The representation is very compact. One of the issues still to be
resolved for the Mosaic data format is how pixel lists will be
stored as FITS extensions. Use of the IMAGE extension clearly defeats the
purpose of the compact list.
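To see why a mask with mostly constant values can be stored compactly, one line of a mask can be run-length encoded. The sketch below only illustrates the idea. It is not the IRAF pixel list format, and the function names are hypothetical.

```python
def encode_runs(line):
    """Collapse a line of mask values into (value, run_length) pairs."""
    runs = []
    for value in line:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1            # extend the current run
        else:
            runs.append([value, 1])     # start a new run
    return [(value, count) for value, count in runs]


def decode_runs(runs):
    """Expand (value, run_length) pairs back into the full line."""
    line = []
    for value, count in runs:
        line.extend([value] * count)
    return line


# A 1000-pixel line with a three-pixel defect needs only three runs.
mask_line = [0] * 500 + [1] * 3 + [0] * 497
runs = encode_runs(mask_line)
```

Storing the same line as an IMAGE extension would take one value per pixel regardless of content, which is the overhead the text refers to.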
Because the pixel list format is so compact it will also be used to
represent some real values associated with the pixels. This is appropriate
when there are only a few values that have large regions of constant value
or when a range of values can be mapped to a set of discrete values with
some desired precision. The mapping will often be linear, comparable to
the FITS mapping of real values to integers using BSCALE and BZERO, though
non-linear maps may also be used.
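A linear mapping of this kind might look like the following sketch. The function names and the scale and zero values are assumptions chosen for illustration. The only point is that a real value is stored as an integer level and recovered to within half a step.

```python
def to_level(value, scale, zero):
    """Map a real value to the nearest integer level (linear mapping)."""
    return int(round((value - zero) / scale))


def from_level(level, scale, zero):
    """Recover the real value represented by an integer level."""
    return zero + scale * level


# Hypothetical exposure-map value in seconds, stored in steps of 0.5 s.
level = to_level(12.3, scale=0.5, zero=0.0)
restored = from_level(level, scale=0.5, zero=0.0)
```

The step size sets the precision: a smaller scale gives finer resolution but more distinct levels, which may reduce the size of the constant regions in the pixel list.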
The types of integer pixel masks being considered are:
Identification of good and bad pixels
Number of original input pixels contributing to a pixel
Data quality flags
Identification of regions for various purposes
The types of real pixel masks being considered are:
Uncertainty values (discussed further elsewhere)
Exposure maps showing accumulated exposure time contributing to a pixel
The pixel masks that will definitely be associated with the NOAO CCD
Mosaic data are bad pixel masks and uncertainty values. When combining
multiple calibrations or dithered exposures there will also be an exposure
map. Specific extension names will be defined for these associated data.
The bad pixel mask will identify good and bad pixels. The proposed values
for the mask are:
0 = good pixel
1 = CCD defect (inherited from a static calibration map)
2 = Saturated (determined by the acquisition system or processing)
3 = Rejected by user (e.g. pixel editing)
4 = Rejected by software (e.g. cosmic ray algorithm)
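The proposed codes could be written down as an enumeration so that applications refer to them by name. This is a sketch only. The class name, member names, and helper function are hypothetical, and the values are those proposed in the list above.

```python
from enum import IntEnum


class BadPixel(IntEnum):
    """Proposed bad pixel mask values (names are illustrative)."""
    GOOD = 0
    CCD_DEFECT = 1          # inherited from a static calibration map
    SATURATED = 2           # set by the acquisition system or processing
    USER_REJECTED = 3       # e.g. pixel editing
    SOFTWARE_REJECTED = 4   # e.g. cosmic ray algorithm


def count_bad(mask_line):
    """Count the pixels in a mask line that are not flagged as good."""
    return sum(1 for value in mask_line if value != BadPixel.GOOD)


n_bad = count_bad([0, 0, 1, 2, 0, 4])
```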
Uncertainties
A very important aspect of the image data is the uncertainties. Many
of the concepts are reasonably well understood such as the characterization
of the uncertainties in the raw CCD data in terms of a readout noise and
Poisson statistics and how uncertainties are propagated when combining
pixels with independent errors. Others are less well understood such as
what happens with resampling. The biggest dilemma has been how to maintain
the uncertainty information without doubling the data volume by using an
associated data array of uncertainty values of the same size as the image
data. The NOAO CCD Mosaic Software Project provides an opportunity to
address the question of uncertainties. In terms of the data structure we
need something that will be compact yet offer the flexibility to
characterize the uncertainties of each pixel.
The model we propose for CCD uncertainties is
V(i,j) = A + (B + I(i,j)) * f(U(i,j))     (1)
where V(i,j) is the variance (sigma squared), I(i,j) is the data, A and B
are constants, U(i,j) is an array of values, and f is a mapping function.
In order to provide a compact description U(i,j) is represented as a pixel
list of integers which, hopefully, have large regions of constant value.
The use of integers means that the variances will be quantized at some
precision. The mapping function f can be defined to adjust the resolution
at different levels. Note that there is already a mapping relative to the
pixel sigmas because of the definition in terms of the variance.
This model allows easy propagation of errors in the common cases. The
A value is a constant noise term. Typically this would be the CCD readout
noise. When adding or subtracting two images the corresponding A terms add.
The B term is used when adding or subtracting constant values from images.
For raw CCD data this value is zero.
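The model can be evaluated for a single pixel as in the sketch below. The mapping function f used here (a simple linear map) and all numeric values are assumptions. The second call shows one reading of the B term: if a constant is subtracted from the data, adding the same constant to B leaves the variance unchanged.

```python
def variance(data, u, a, b, f):
    """Evaluate V = A + (B + I) * f(U) for one pixel."""
    return a + (b + data) * f(u)


def linear_map(u):
    """Hypothetical mapping function: each integer step in U is worth 0.5."""
    return 0.5 * u


# Assumed values: A = 25 (for example a readout noise of 5 units, squared).
v_raw = variance(data=1000.0, u=2, a=25.0, b=0.0, f=linear_map)

# After subtracting a constant 100 from the data, B absorbs the constant.
v_shifted = variance(data=900.0, u=2, a=25.0, b=100.0, f=linear_map)
```

In practice the data and U values would be arrays, with U held as a pixel list. The scalar form is used here only to keep the arithmetic visible.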
The usefulness and compactness of this model, that is how well the idea
of largely constant areas in the U array will work in practice, still needs
to be investigated. Preliminary experiments show promise that this
approach will work effectively.
The problem of the storage format for the U pixel list is essentially
the same as that for pixel masks. As with the masks the format in a
FITS file is still to be determined.
Implementation Notes
FITS Extension for pixel lists needs to be defined
FITS Kernel support for pixel lists needs to be added
Tools for copying table extensions need to be created
The uncertainty array capabilities need to be researched and developed
The Data Reduction Agent
The data reduction agent (DRA) provides pipeline data handling of the
observational data. Its functions are
The DRA is a continuously running event-driven process.
The events which trigger the above functions are
when the data capture agent finishes writing an observation to disk
when the user initiates an action via the graphical user interface
The first case provides automatic processing and archiving. The second
case allows the user to perform manual calibrations or initiate recalibrations
of the automatic processing. Reprocessing would be done when additional or
improved calibration data becomes available. For example, the automatic
processing can proceed using calibration data from the start of the night
and recalibration can be done after additional calibrations at the end of
the night are obtained.
The pipeline calibration, reduction, and quality assessment are
defined by "recipes" selected from a list of recipes. A recipe is
basically a "macro" or "script" that is executed on a specified
disk file or set of disk files.
Graphical User Interface (GUI)
The DRA is controlled by a graphical user interface. This interface
provides
a browsing tool for the observations
status information of the processing
quality assessment information
a tool to view and manipulate the calibration data base
the ability to delete or exclude data
control of the automatic processing
selection of pipeline calibration, reduction, and assessment recipes
control of parameters
recall of raw data
initiation of recalibrations
Pipeline Data Calibration
Pipeline calibration consists of the standard CCD calibration operations.
These are
pre/overscan calibration
trimming of pre/overscan and bad edge regions
bad pixel and saturated pixel masking and replacement
zero level (also called bias) calibration
dark count calibration
flat field calibration
propagation of uncertainties from the detector readout characteristics and the calibration data
The zero level, dark count, and flat field calibrations are created by
combining multiple individual calibration exposures. The combining
provides
scaling to a common mean or mode
detection and rejection of bad pixels from masks and by various algorithms
averaging or medianing of good pixels
propagation of uncertainties
output of an exposure map
output of a pixel mask of the number of good pixels combined
Series of calibration exposures will be automatically detected and
combined by monitoring the exposure types. For example, when a flat field
type is first seen the individual exposures will be logged to the
calibration database and when the first exposure which is not a flat field
is detected all the preceding flat field exposures will be combined into a
master flat field.
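The sequence detection described above can be sketched as a scan over the exposure types in time order. The function name, the exposure type strings, and the input layout are assumptions. The DRA's actual bookkeeping in the calibration database is not shown.

```python
def calibration_sequences(exposures, calibration_types=("zero", "dark", "flat")):
    """Return the calibration sequences that are known to be finished.

    exposures is a list of (name, exposure_type) pairs in time order.
    A sequence is finished only when an exposure of a different type
    follows it, so a sequence still in progress at the end is not returned.
    """
    finished = []
    current_type, current_names = None, []
    for name, exposure_type in exposures:
        if exposure_type != current_type:
            # The type changed: close the previous run if it was a calibration.
            if current_type in calibration_types and current_names:
                finished.append((current_type, current_names))
            current_type, current_names = exposure_type, []
        current_names.append(name)
    return finished


night = [("f1", "flat"), ("f2", "flat"), ("f3", "flat"),
         ("o1", "object"), ("z1", "zero"), ("z2", "zero")]
ready = calibration_sequences(night)
```

Here the three flats are ready to be combined into a master flat field, while the two zero exposures are still an open sequence.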
The automatic pipeline calibrations will use the closest calibration or
master calibration in time. During initial automatic processing this
will be the most recent previous calibration. When recalibration is
done the nearest in time may be either before or after the exposure
being calibrated. However, the DRA can be instructed which calibration
to use if desired.
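The selection rule can be sketched as follows, with times given as plain numbers (for example hours). The function name and the allow_later flag are hypothetical. They only express the difference between initial processing, when only earlier calibrations exist, and recalibration.

```python
def choose_calibration(exposure_time, calibration_times, allow_later=False):
    """Pick the calibration closest in time to an exposure.

    With allow_later False only calibrations taken at or before the
    exposure are considered, as in initial automatic processing.
    Returns None if no calibration qualifies.
    """
    candidates = [t for t in calibration_times
                  if allow_later or t <= exposure_time]
    if not candidates:
        return None
    return min(candidates, key=lambda t: abs(t - exposure_time))


# Calibrations at the start (18.0) and end (30.0) of the night.
initial = choose_calibration(26.0, [18.0, 30.0])
recalibrated = choose_calibration(26.0, [18.0, 30.0], allow_later=True)
```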
The details of the pipeline calibrations are specified by selecting
a recipe and parameters in the DRA. Normally there will be one standard
calibration recipe which will be part of the initial implementation.
Variations of the standard recipe would be for special modes of
operation (e.g. drift scans) and for future types of detectors and data
(e.g. IR detectors where the order of calibration steps is different).
Pipeline Data Reduction
Pipeline data reductions are those operations automatically performed
after CCD calibration. Possible examples are spectral extraction and
object cataloging. The pipeline data reductions are selected from a
list of recipes in the GUI. In most cases there will be no data reductions
performed. Those that might be performed would generally be quicklook
reductions that are redone later by the investigators in a more interactive
manner.
The DRA does not provide the data reduction recipes. It only
provides the mechanisms for adding data reduction recipes. The initial
version of the DRA will probably not include any data reduction recipes.
Pipeline Data Assessment
Pipeline data assessment is a special kind of data reduction. It
does something to the calibrated (or possibly uncalibrated) data which
results in one or a few numbers. Often the numbers will be related to
the signal-to-noise of the observation. Examples of this are monitoring
the aperture photometry of some object(s) in a series of exposures of the
same field or computing the mean extracted counts in a spectrum.
The DRA provides for recipes that perform data assessment with the
results viewed as graphs or text output.
Archiving and Taping
When the DRA is notified of a completed observation it may queue the
raw observation and, possibly, pipeline calibrated data to be archived
and taped. The archiving would, at a minimum, be something like
"save-the-bits".
The archiving will include access control to prevent
general users from avoiding observatory mandated archiving.
Mosaic CCD Processing: CCDPROC
Basic CCD calibration processing is performed by the IRAF task
CCDPROC. It provides the standard CCD calibrations for each of the
amplifier/CCD readouts of the Mosaic.
pre/overscan calibration
trimming of pre/overscan and bad edge regions
bad pixel and saturated pixel masking and replacement
zero level (also called bias) calibration
dark count calibration
flat field calibration
propagation of uncertainties from the detector readout characteristics and the calibration data
The processing is performed on input data in the Mosaic data format.
The output data is also in the Mosaic data format with the CCD image data
calibrated and the associated pixel masks and uncertainties updated. The
output data is created in a temporary file until the processing is
successfully completed. Then the input data is renamed to a backup
directory and the output file is renamed to the input name.
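The temporary file and rename scheme can be sketched with standard file operations. This is an illustration of the pattern only, not how CCDPROC is implemented. The function name and the process callback are hypothetical.

```python
import os
import shutil
import tempfile


def process_in_place(path, backup_dir, process):
    """Run process(input_path, output_path) and replace the input on success.

    The output is written to a temporary file in the same directory. Only
    after processing succeeds is the input moved to backup_dir and the
    temporary file renamed to the input name.
    """
    directory = os.path.dirname(os.path.abspath(path))
    handle, temp_path = tempfile.mkstemp(dir=directory)
    os.close(handle)
    try:
        process(path, temp_path)
    except Exception:
        os.remove(temp_path)        # leave the input untouched on failure
        raise
    os.makedirs(backup_dir, exist_ok=True)
    shutil.move(path, os.path.join(backup_dir, os.path.basename(path)))
    os.replace(temp_path, path)
```

If processing fails partway through, the original observation file is still in place under its original name.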
One change of data format is when there are multiple amplifiers from
each CCD. The calibrated amplifier images are combined into a single
image for the CCD. The output Mosaic format then consists of multiple
extensions for the CCDs.
The Mosaic version of CCDPROC is actually a relatively simple task,
possibly an IRAF script, that understands the details of the Mosaic data
format. It extracts the individual amplifier images and associated data,
such as pixel masks and uncertainties, and passes them to a lower level
task to do the actual processing. It then takes the calibrated data and
updated associated data and puts them back into the Mosaic data format.
The lower level task is written to process individual images and
associated pixel masks and uncertainties from an input to an output
and has no knowledge of the details of the Mosaic data format.
The extraction from the Mosaic format to individual images and the
reconstruction of the Mosaic format from the individual calibrated images
do not actually involve extra copying of the data or intermediate files
for the bulk CCD data. The FITS image kernel allows individual input
images to be addressed directly in a multiextension FITS file and the
output images to be appended to new extensions of a multiextension file.
This is done so that an IRAF application does not need to know the disk
structure of the data and can be written as simply reading and writing
logically individual images. The Mosaic CCDPROC task controls the syntax
to the FITS kernel image specification.
To illustrate how this works consider the following command sequence.
The first statement copies the input data global header to a new output
FITS multiextension file. The second statement passes the image extension
"im1" to the lower level CCDTOOL task as a single image and tells CCDTOOL to
create a new output image "outdata[im1]", output pixel mask "tempmask", and
output uncertainties "tempvar". The FITS kernel appends the calibrated
data to "outdata" without CCDTOOL knowing that it is appending to an existing
file. The next statement appends (sequentially) the pixel mask and
uncertainty data from the temporary files to the output data file. Note
how this avoids simultaneous access to the output image. Mask and
uncertainty files are small and there is no significant overhead to using a
temporary disk image. The final part of the example shows that other
extensions in the input data can be copied by the task that knows about the
data format without requiring something like CCDTOOL to know about the
non-image extensions.
Implementation Notes
The following must be added to the current capabilities of CCDPROC.
Creation and propagation of uncertainty arrays
Use of bad pixel masks for cosmetic interpolation
Identification of saturated pixels and update of pixel masks
Mosaic Image Combining: COMBINE
Calibration Images
The combining of multiple calibration exposures from the Mosaic detector
is performed by an IRAF task COMBINE. It combines the individual elements
of the Mosaic matched by amplifier or CCD identification. The combining
is done pixel-by-pixel within each amplifier/CCD image. It also propagates
combined bad pixel masks, variance images, and exposure maps. The input
and output data formats for the combining are the Mosaic data format.
The Mosaic version of COMBINE is a relatively simple task that
understands the details of the Mosaic data format. It extracts the
individual amplifier/CCD images and associated data, such as pixel masks
and uncertainties, and passes them to a lower level task to do the actual
combining. The calibrated data and updated associated data are then put
back into the Mosaic data format.
The extraction from the Mosaic format to individual images and the
reconstruction of the Mosaic format from the individual calibrated images
do not actually involve extra copying of the data or intermediate files
for the bulk CCD data. The FITS image kernel allows individual input
images to be addressed directly in a multiextension FITS file and the
output images to be appended to new extensions of a multiextension file.
This is done so that an IRAF application does not need to know the disk
structure of the data and can be written as simply reading and writing
logically individual images. The Mosaic COMBINE task controls the syntax
to the FITS kernel image specification.
COMBINE also will define how the image headers and non-image extensions
are combined. In the initial implementation the output combined image will
have the image header and non-image extensions from the first input image
in the specified list of input images. This is the current approach in
most IRAF tasks, such as IMCOMBINE and IMARITH, that produce an output from
more than one input image.
The combining of calibration exposures will generally be controlled
by the data reduction agent. It will detect sequences of calibrations
and combine the sequence. Simple scripts layered on CCDPROC and COMBINE
will be used and may also be used by the observer. These are ZEROCOMBINE,
DARKCOMBINE, FLATCOMBINE, and COMPCOMBINE.
Dithered or Rastered Science Images
The combining of calibration exposures is straightforward in the sense
that there does not need to be any interpolation, shifting, and coordinate
manipulation. The combining of dithered or rastered science exposures is
more complex, particularly with regard to coordinate systems. Such data
are first resampled into a single image in a celestial coordinate system
that can be shifted by integer amounts along both image axes before
combining. This is done by MSCIMAGE. COMBINE
uses the coordinate system produced by MSCIMAGE to shift and then combine
dithered or rastered observations.
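Combining with integer shifts can be sketched as below using plain nested lists. This is a simple average with no rejection or scaling and is not the COMBINE algorithm. The function name and argument layout are assumptions. The count array plays the role of the map of the number of pixels combined.

```python
def combine_shifted(images, offsets, shape):
    """Average images placed on a common grid at integer offsets.

    images  : list of 2-D lists of pixel values
    offsets : list of (row, column) integer offsets of each image origin
    shape   : (rows, columns) of the output grid
    Returns the averaged image and the per-pixel count of contributions.
    """
    nrows, ncols = shape
    total = [[0.0] * ncols for _ in range(nrows)]
    count = [[0] * ncols for _ in range(nrows)]
    for image, (row0, col0) in zip(images, offsets):
        for r, row in enumerate(image):
            for c, value in enumerate(row):
                rr, cc = r + row0, c + col0
                if 0 <= rr < nrows and 0 <= cc < ncols:
                    total[rr][cc] += value
                    count[rr][cc] += 1
    combined = [[total[r][c] / count[r][c] if count[r][c] else 0.0
                 for c in range(ncols)] for r in range(nrows)]
    return combined, count


# Two one-line exposures, the second dithered by one pixel.
combined, count = combine_shifted([[[1, 2, 3]], [[4, 5, 6]]],
                                  [(0, 0), (0, 1)], (1, 4))
```

Because the offsets are whole pixels, no interpolation is needed at this stage. The only resampling is the one already done when the images were placed on the common grid.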
Implementation Notes
Define contents of combined image headers
Define contents of combined non-image extensions
Define combined WCS
Combine image extensions grouped by amplifier/CCD and by subset
Offset images based on WCS
Provide the combining algorithms available in IMCOMBINE
Propagate bad pixel masks
Propagate uncertainty arrays
Propagate exposure maps
Propagate maps of the number of pixels combined
Calibrating the Mosaic World Coordinate System: MSCWCS and MSCREGISTER
The Mosaic World Coordinate System (WCS) maps the image pixels to
celestial coordinates on the sky. The mapping is stored in the headers for
each amplifier/CCD image. The WCS is defined in two stages. The first
stage applies a predetermined calibration. The second stage either adjusts
this calibration based on a catalog of sources in the field of the
exposure or registers the WCS in multiple overlapping exposures based on
common objects in the images.
The WCS Calibration File
The WCS calibration file consists of "plate solutions" for each
amplifier/CCD determined from calibration exposures. This is done using
MSCMAPWCS. The plate solution is then applied to observations by adding
the telescope pointing and, possibly, instrument position angle. In other
words, the WCS is determined once at some telescope pointing reported by
the telescope control system. This WCS is used for other telescope
pointings with a zero point offset set by the difference in reported
telescope coordinates between the calibration and the observation. If the
detector may be rotated then the calibration also includes a rotation axis
origin determination and uses the difference in instrument position angles
to adjust the WCS.
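The zero point offset can be sketched as a shift of the calibration reference coordinate by the difference in reported pointings. This is deliberately simplified: it ignores instrument rotation and any projection effects of moving across the sky, and the function name and coordinate layout (right ascension and declination in degrees) are assumptions.

```python
def apply_pointing_offset(cal_reference, cal_pointing, obs_pointing):
    """Shift a calibration reference coordinate to a new telescope pointing.

    All arguments are (ra, dec) pairs in degrees. The reference coordinate
    is moved by the difference between the observation pointing and the
    pointing reported when the calibration was made.
    """
    d_ra = obs_pointing[0] - cal_pointing[0]
    d_dec = obs_pointing[1] - cal_pointing[1]
    return ((cal_reference[0] + d_ra) % 360.0, cal_reference[1] + d_dec)


# Hypothetical values: calibration made near (150, 2), observation at (10, -5).
new_reference = apply_pointing_offset((150.10, 2.20), (150.0, 2.0), (10.0, -5.0))
```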
The plate solution may be determined by an instrument support person at
some point prior to the observer's run or by the observer at the beginning of a
run. Hopefully the NOAO CCD Mosaic will have sufficient geometric
stability that the calibration need be done only when major maintenance is
done or when the detector is mounted on the telescope at the beginning of a
block of observing time. Regardless of whether this is done by an
instrument support person or the observer some standard calibration fields
with source catalogs will be prepared and a "cookbook" sequence
documented.
A secondary calibration tool, MSCZERO, allows marking a single
object in an exposure and entering a celestial coordinate to update
the calibration file to "zero" the coordinates relative to the telescope
pointing. For possible rotations two objects may be marked.
Applying the WCS Calibration File: MSCWCS
The first stage of setting the WCS for an observation using a
calibration file and the telescope pointing is a basic calibration
operation performed by MSCWCS. Note that if the WCS is set at an earlier
stage by the data acquisition system or the data capture agent then this
option of MSCWCS might not be needed.
Adjusting the WCS for the Observation: MSCWCS
The WCS set by the first stage is likely to be off by a small amount
due to errors in the telescope pointing and instrument flexure. The
second stage is to use objects in the image to adjust the WCS. This second
stage may use many objects and a full astrometric catalog to make a new
calibration. However it is more likely that there are only a few objects
and possibly no source catalog. In that case the few objects can be used
to make small zero point and rotation adjustments in either an absolute
sense if the objects have known celestial coordinates or a relative sense
if common objects in multiple exposures are used to register the
exposures.
The adjustment of the WCS using a catalog of sources in the field of
observation is performed by the task MSCWCS. It assumes that the existing
WCS is fairly close. It takes each source in an input source catalog and
searches near the expected position in the image for an object. The object
position is determined using a centering algorithm. Once a set of measured
pixel positions and catalog celestial coordinates is determined the WCS can
be adjusted for an offset and rotation or possibly a new plate solution can
be computed.
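An offset and rotation adjustment can be estimated from matched positions with a closed-form least-squares fit. The sketch below assumes that the positions predicted by the existing WCS and the catalog positions have both been expressed in the same flat coordinate system. The function name is hypothetical, and this is not necessarily the fit MSCWCS performs.

```python
import math


def fit_offset_rotation(predicted, catalog):
    """Fit catalog = R(theta) * predicted + offset by least squares.

    predicted and catalog are equal-length lists of (x, y) pairs for
    matched objects. Returns (theta in radians, (x_offset, y_offset)).
    """
    n = len(predicted)
    px = sum(p[0] for p in predicted) / n
    py = sum(p[1] for p in predicted) / n
    cx = sum(q[0] for q in catalog) / n
    cy = sum(q[1] for q in catalog) / n
    cos_sum = sin_sum = 0.0
    for (x, y), (u, v) in zip(predicted, catalog):
        dx, dy = x - px, y - py          # centered predicted position
        eu, ev = u - cx, v - cy          # centered catalog position
        cos_sum += dx * eu + dy * ev
        sin_sum += dx * ev - dy * eu
    theta = math.atan2(sin_sum, cos_sum)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    x_offset = cx - (cos_t * px - sin_t * py)
    y_offset = cy - (sin_t * px + cos_t * py)
    return theta, (x_offset, y_offset)
```

With only one matched object the rotation is undetermined, which is consistent with the text: a single object fixes the zero point, and at least two are needed for a rotation.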
MSCWCS can be run automatically given a good first WCS and a catalog of
sources. If the user supplies the source catalog or the data reduction
agent can automatically obtain a catalog (say by using the telescope
coordinates and a "catalog server") then this second stage WCS calibration
performed by MSCWCS can be part of the basic calibration performed by the
DRA.
MSCWCS applies both the initial calibration based on a calibration file
and the telescope information and an observation correction based on a
catalog of sources in the observation. Thus, while the logic is described
as two steps the DRA may do both operations at once with one call to
MSCWCS. The way MSCWCS works is if a calibration file is specified it does
the first stage and if a source catalog is specified it does the second
stage. If both are specified in one execution then both stages are done.
Mosaic WCS Registration: MSCREGISTER
The task MSCREGISTER uses objects in a set of Mosaic observations to
adjust the world coordinate system (WCS) for each observation to best
"register" the objects. This means that overlapping objects will have
nearly the same coordinates subject to the limitations set by the form of
the WCS description. The set of objects need not appear in all
observations but there must be some reasonable overlap so that each
observation has common objects with one other observation and all the
observations form a single continuous region.
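The overlap requirement amounts to a connectivity check: if each observation is a node and each pair of observations with common objects is a link, all nodes must be reachable from any one of them. A minimal sketch follows. The function name and input layout are assumptions.

```python
def forms_single_region(n_observations, overlaps):
    """Check that observations linked by common objects are all connected.

    overlaps is a list of (i, j) index pairs for observations that share
    at least one object. Returns True if every observation can be reached
    from observation 0 through a chain of overlaps.
    """
    if n_observations == 0:
        return True
    neighbors = {i: set() for i in range(n_observations)}
    for i, j in overlaps:
        neighbors[i].add(j)
        neighbors[j].add(i)
    seen = {0}
    stack = [0]
    while stack:
        for nxt in neighbors[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == n_observations


chain_ok = forms_single_region(4, [(0, 1), (1, 2), (2, 3)])
two_islands = forms_single_region(4, [(0, 1), (2, 3)])
```

In the second case every observation overlaps one other, yet the set splits into two groups that cannot be registered to each other.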
Several algorithms are required. The objects in each amplifier/CCD
image must be cataloged. Then common objects between the many catalogs must be
identified. Finally the set of WCS must be registered in some
"best" way.
To simplify the problem the data are required to have some approximate
world coordinate system that places common objects within some distance of
each other. This is based on an astrometric calibration, offset by the
position of the telescope, that takes the CCD alignments and optical
distortions into account.
Implementation Notes
The FITS WCS descriptions for celestial coordinate systems are under
development. The least certain area is representations of the higher order
terms of plate solutions. The initial implementation will measure the
full plate solution but will set the image WCS using only one of the WCS
representations described in the FITS WCS draft. Since each amplifier/CCD
image has its own WCS the plate solution should be sufficiently accurate
without higher order terms.
The following needs to be considered.
Define format for WCS calibration file
Define source catalog format
Create a Single Image from a Mosaic Observation: MSCIMAGE
The individual amplifier/CCD pieces from a calibrated Mosaic exposure
are put together to create a single image using the task MSCIMAGE. This
operation
corrects for flips in the images introduced during the amplifier
readouts
corrects for alignment differences between the CCD chips
corrects for optical distortions
corrects for field rotations and subpixel shifts relative to a
uniform sky grid
Basically, a uniform sky grid of equal sized pixels about some point in the
sky is defined and the observed pixels are interpolated to this grid. By
using the same grid for dithered or rastered sets of observations, the
images can then be combined using only integer pixel shifts in the two
image axes. The goal is to require only a single interpolation of the
data.
The mapping between the coordinates of the input pixels and the
output pixels is defined by the world coordinate system in the
image headers. This is set during the calibration steps as described
in another section.
While the default action of MSCIMAGE is to create a single resampled
image from the elements of the Mosaic there is also an option to
preserve the Mosaic data format by keeping the resampled elements as
separate extensions.
Implementation Notes
The WCS for a transformed image made from the components of the Mosaic
will be a Cartesian projection which allows simple shifts to register
dithered and rastered images. This type of WCS is perfectly fine for the
scales on which dithered and rastered observations will be done. It is an
acceptable and defined WCS in the FITS draft standard. The point to note
is that this type of WCS projection has not been considered common in FITS
optical images. With the increased use of optical Mosaics with fields of a
square degree or less this will likely become much more common because of
its property of straightforward combining of rastered observations.