[Note: I'd originally intended to post more often on these topics, sorry. Please comment with any suggestions for future topics; otherwise I may expand the scope to some non-IRAF projects that might be of interest.]

IRAF is inherently a multi-process system (i.e. even a simple script requires binaries for the CL and perhaps several packages) and has always had the ability to run background jobs, but users sometimes wonder whether tasks can be made multi-threaded as a means of improving performance for highly data-parallel operations (e.g. mosaic image processing). The short answer is that it is possible, but only with some restructuring of the core system and, of course, changes to the applications themselves to enable the threading.


Technically, the current limitation is the use of global data structures throughout the system, e.g. the file i/o common blocks. An IRAF kernel interface that allows thread operations has already been added in a prototype system, but it is only really useful for strictly computational routines such as image interpolation. At the applications level, what you really want is to be able to split an input list of images (or MEF extensions) across multiple threads. Doing so requires that we modify key parts of the system so that threads cannot corrupt global data, and that the applications be modified to split the processing. More thought would be needed before starting such an adventure, since everything from error handling in SPP to CL parameter passing is affected. However, expanding the scope of the prototype work to include basic image i/o would allow for better parallelism in tasks. This is strictly an R&D project at this stage, and moving forward would require a commitment to expanding the capabilities of the SPP language and the core IRAF system.

Processing a list of 1000 images on a quad-core machine can already be optimized by splitting the list amongst four background jobs on the CL command line. Adding system support to do this automatically on behalf of the user isn’t currently a high priority, nor should it be.
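As a rough illustration of the manual approach, here is a minimal Python sketch that divides an image list into one chunk per background job and writes each chunk to its own list file. The function names, the `chunk` file prefix and the image names are all hypothetical and are not part of IRAF.

```python
from typing import List


def split_list(items: List[str], n_jobs: int) -> List[List[str]]:
    """Split items into n_jobs contiguous chunks whose sizes differ by at most one."""
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    base, extra = divmod(len(items), n_jobs)
    chunks = []
    start = 0
    for i in range(n_jobs):
        # The first `extra` chunks each take one leftover item.
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def write_list_files(chunks: List[List[str]], prefix: str = "chunk") -> List[str]:
    """Write each chunk to '<prefix><n>.lst', one name per line; return the file names."""
    names = []
    for i, chunk in enumerate(chunks, start=1):
        name = f"{prefix}{i}.lst"
        with open(name, "w") as fh:
            for item in chunk:
                fh.write(item + "\n")
        names.append(name)
    return names


if __name__ == "__main__":
    # Hypothetical image names, purely for illustration.
    images = [f"img{i:04d}.fits" for i in range(1000)]
    print(write_list_files(split_list(images, 4)))
```

Each list file could then be handed to its own background job from the CL, along the lines of `mytask @chunk1.lst &`, where `mytask` stands in for whatever task is being run.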

Note that in something like the VAO applications framework being discussed in various circles (a topic for a future post), we can exploit the process-parallel nature of IRAF directly, i.e. we let the higher levels of the framework distribute the list to be processed amongst various compute components as needed. Were we to extend the current IRAF system itself, it would be more natural to do the same thing at the CL level, adding the functionality systematically through some clever way of creating multiple instances of a task to split up a list for processing, rather than modifying individual tasks to do the same thing at a lower level.

Making any piece of software run in parallel requires some decisions about the granularity of the parallelization. No doubt certain algorithms will draw this line at much smaller blocks of data, but when considering IRAF as a system, the natural lines fall either at the task level when considering processing components, or at the level of 2-D image arrays (entire images or extensions) or data files when considering data parallelization. In either case, and especially in a system with thousands of applications, the place to implement this functionality is clearly at a level above the application itself.
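To make the "level above the application" idea concrete, here is a minimal Python sketch of a generic dispatcher. It is not how the CL or any VAO framework works; `run_in_parallel` and its parameters are hypothetical. The point is only that the task itself is unmodified: it receives an ordinary image list, and all of the splitting happens in the layer that calls it.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def run_in_parallel(task: Callable[[Sequence[str]], object],
                    images: Sequence[str],
                    n_workers: int) -> List[object]:
    """Run task once per slice of images, concurrently; results are in slice order."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    # Interleaved slices: worker i gets images i, i + n_workers, i + 2*n_workers, ...
    slices = [images[i::n_workers] for i in range(n_workers)]
    slices = [s for s in slices if s]  # drop empty slices for short lists
    # Threads are used here only to wait on the workers. In a real setup the
    # task callable would be assumed to launch a separate task process.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(task, slices))


if __name__ == "__main__":
    # Stand-in task that just reports how many images it was given.
    counts = run_in_parallel(len, [f"img{i}.fits" for i in range(10)], 4)
    print(counts)
```

Because the dispatcher knows nothing about what the task does, the same few lines would serve any task that accepts a list, which is the argument for putting this functionality above the application rather than inside each one.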
