This is the inaugural post of a topic I hope to make a regular feature on this site. It is an excerpt from a recently written paper meant to inform the reader about various technical aspects of the IRAF system (we hope to make the paper public in its entirety as well). Topics will cover a range of technical IRAF system matters in the hope of prompting discussion and questions, but will by no means be complete. Your comments and suggestions for future topics are encouraged.

IRAF Memory Usage

IRAF memory requirements are small by any measure; the system will happily process a Mosaic image on a machine with as little as 32MB of memory installed. This is due in large part to the fact that most tasks were written to process images line by line rather than reading an entire image into memory; likewise, the extensions of a mosaic MEF file are processed serially where possible.

The result is that tasks tend to deal with no more than a few thousand pixels at a time (even mosaic detectors typically use 2K x 4K CCDs), although there are a few tasks that will try to use an entire image. This is a happy artifact of system architecture from a time when computer memory was much more limited (and expensive) than it is today.
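To make this processing model concrete, here is a minimal sketch in C of what line-by-line processing looks like. It is illustrative only: it assumes a hypothetical headerless raw format (rows of 32-bit floats of known dimensions), whereas real IRAF tasks do this through the SPP image I/O interfaces. The point is simply that only one row, a few kilobytes, is ever held in memory at once.

    /* Sketch: scale an image line by line, holding only one row in memory.
     * Hypothetical raw format: width * height 32-bit floats, no header.
     * Real IRAF tasks do the equivalent through imio, not stdio. */
    #include <stdio.h>
    #include <stdlib.h>

    int scale_image(const char *in_path, const char *out_path,
                    long width, long height, float factor)
    {
        FILE *in = fopen(in_path, "rb");
        FILE *out = fopen(out_path, "wb");
        float *line = malloc(width * sizeof *line);   /* one row only */
        if (!in || !out || !line)
            goto fail;

        for (long row = 0; row < height; row++) {
            if (fread(line, sizeof *line, width, in) != (size_t) width)
                goto fail;
            for (long col = 0; col < width; col++)
                line[col] *= factor;                  /* per-pixel operation */
            if (fwrite(line, sizeof *line, width, out) != (size_t) width)
                goto fail;
        }
        free(line); fclose(in); fclose(out);
        return 0;
    fail:
        free(line);
        if (in) fclose(in);
        if (out) fclose(out);
        return -1;
    }

A single 2K-pixel row of 4-byte floats is only 8KB, which is why even mosaic data fits comfortably within a 32MB machine.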

Even laptops today typically come with several GB of RAM, so it is natural to ask whether there would be any benefit in changing at least the major applications so that more processing is done over entire images (and presumably then “in memory”), and thus presumably faster to execute. The surprising answer is: not necessarily. Modern operating systems are fairly aggressive about caching frequently accessed files in virtual memory, so even though an application may be I/O intensive, not all of the data is coming from the hard drive all the time. Consider the case of a master dome flat used to calibrate hundreds of observations, or an astrometric reference image used to calibrate a night’s observations: chances are a programmer would optimize the algorithm to cache this master image in memory, but on a modern OS the image would already be in virtual memory and disk I/O would likely be irrelevant.
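The page cache effect is easy to demonstrate for yourself: time two successive reads of the same large file. The first pass comes from disk; the second is usually served from the OS cache and runs far faster. A small C sketch (nothing here is IRAF-specific):

    /* Sketch: time two sequential reads of the same file. On most systems
     * the second pass is much faster because the data is in the page cache. */
    #define _POSIX_C_SOURCE 199309L
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    static double read_file_seconds(const char *path)
    {
        struct timespec t0, t1;
        char buf[1 << 16];
        FILE *fp = fopen(path, "rb");
        if (!fp) { perror(path); exit(1); }

        clock_gettime(CLOCK_MONOTONIC, &t0);
        while (fread(buf, 1, sizeof buf, fp) == sizeof buf)
            ;                               /* discard data; we only time I/O */
        clock_gettime(CLOCK_MONOTONIC, &t1);
        fclose(fp);
        return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    }

    int main(int argc, char **argv)
    {
        if (argc != 2) { fprintf(stderr, "usage: %s file\n", argv[0]); return 1; }
        printf("first read:  %.3f s\n", read_file_seconds(argv[1]));  /* disk */
        printf("second read: %.3f s\n", read_file_seconds(argv[1]));  /* cache */
        return 0;
    }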

Where this model fails is when an algorithm is written such that this master image is frequently swapped out of memory; in that case, more direct control over locking data in memory would help. Such a scheme was implemented for the Mosaic data handling system to lock the on-the-fly calibration image in virtual memory when disk thrashing became an issue. Stubs for this facility already exist in the IRAF system but were never completely integrated, due to varying levels of support for the method among different operating systems. This could be revisited, but would require further investigation to determine whether there is any real benefit.
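The underlying mechanism is the standard POSIX mlock(2)/munlock(2) interface, which pins a range of pages so the kernel cannot evict them. A minimal sketch of pinning a calibration buffer follows; the 64MB size is an arbitrary placeholder, and note that mlock often requires privilege or a raised RLIMIT_MEMLOCK limit, which hints at why OS-level support for the method has been uneven.

    /* Sketch: pin a calibration buffer in physical memory with mlock(2) so
     * it cannot be paged out while in heavy use; release with munlock(2). */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>

    int main(void)
    {
        size_t nbytes = 64L * 1024 * 1024;  /* e.g., a 64MB master flat */
        float *flat = malloc(nbytes);
        if (!flat)
            return 1;

        if (mlock(flat, nbytes) != 0) {
            perror("mlock");                /* often fails: needs privilege */
            /* fall through: proceed unpinned rather than abort */
        }

        /* ... apply the flat to many frames; its pages stay resident ... */

        munlock(flat, nbytes);
        free(flat);
        return 0;
    }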

Another point not widely known is that the image I/O system in IRAF is already optimized to minimize the number of disk transfers required, and this behavior can be tuned via environment variables for packages dealing with especially large images, or when the hardware memory well exceeds the image size. This works by allowing either an explicit buffer size to be specified, or by setting some fraction of the image size as the file I/O buffer (up to a maximum of 32MB, which is considered adequate for most applications).
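The sizing policy amounts to: honor an explicit buffer size if one is given, otherwise take a fraction of the image size, and cap the result at the 32MB maximum. A sketch of that logic in C is below; IM_BUFSIZE and IM_BUFFRAC are placeholder names for illustration, not the actual IRAF environment variables, and the default fraction is an assumption.

    /* Sketch of the buffer-sizing policy described above: an explicit size
     * wins, otherwise a fraction of the image size is used, capped at 32MB.
     * IM_BUFSIZE / IM_BUFFRAC are placeholders, not real IRAF variables. */
    #include <stdlib.h>

    #define MAX_BUFSIZE (32L * 1024 * 1024) /* 32MB cap from the text */
    #define DEF_BUFFRAC 0.25                /* assumed default fraction */

    long imio_bufsize(long image_bytes)
    {
        const char *s;

        if ((s = getenv("IM_BUFSIZE")) != NULL)  /* explicit size, bytes */
            return atol(s);

        double frac = DEF_BUFFRAC;
        if ((s = getenv("IM_BUFFRAC")) != NULL)  /* fraction of image size */
            frac = atof(s);

        long size = (long)(frac * image_bytes);
        return size > MAX_BUFSIZE ? MAX_BUFSIZE : size;
    }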

Optimizations of this sort should be examined closely, especially for extremely large-format detectors; that said, the system defaults and the automatic tuning of this buffer appear to work well for most detectors in use today.
