PC-IRAF V2.12/Suse 10.0 problem


Anonymous: Guest

02/16/2006 11:19AM

I have started using PC-IRAF V2.12 on a Linux machine (Suse 10.0),
and have run into a problem using daophot/allstar:
*** glibc detected *** free(): invalid pointer: 0x8f62202e ***
In a script the usage was this:
i=1
allstar (image="fakeImage00"//i//".imh", photfile="all.mag", psfimage="all.fits.psf.1.imh", allstarf="all.als.1", rejfile="all.da>
arj.1",subimage="all.sub.1", fitrad=6, fwhmpsf=2.8, psfrad=10,datamin=-1.,datamax=1000., sannulus=13,
wsannulus=9,sigma=0.05,verify-, recenter=yes, groupsky=no, fitsky=yes)
I hope you can help.
David


fitz

02/16/2006 11:19AM

Admin


Registered: 09/30/2005
Posts: 4040

There isn't much to go on here, and no similar reports have been filed. The pointer address is outside the 32-bit range, which may simply be a memory corruption issue. OTOH if this is a 64-bit OS, that could be part of the problem.
To help track it down, I'm assuming you can isolate this to a particular image, i.e. instead of running in a script where your 'i' is some loop variable, can you get the same crash for a particular value of 'i', or does this happen only with the first/N-th image? If it is some N-th image, does the old hack of putting a 'flpr' in the loop clear it? Do any other tasks fail similarly, e.g. "implot dev$pix", or is it just ALLSTAR? Have you tried unlearning all of the parameters first (including the datapars/daopars psets)? Is the cache parameter on or off, and does it make a difference?
Otherwise we'd need some way to reproduce the error, i.e. could you bundle up all the images/mag files and parameters (as in 'dpar allstar datapars daopars > parfile') and post the URL to them?
-Mike


butlerdj1

02/16/2006 11:19AM

Newbie


Registered: 02/16/2006
Posts: 2

The 'glibc/invalid pointer' problem did not happen with the daofind, wphot, psf, or pselect tasks. That the image name had a running variable 'i' does not appear to be the problem, as the error occurs when there is no variable.
If I add "export MALLOC_CHECK_=1" to the .bashrc file, the error is ignored, but allstar fails later on, giving this error message:
PANIC in `/usr/local2/misc//iraf/iraf/noao/bin.suse/x_daophot.e': Memory has been corrupted
The machine used is a 64-bit machine running Suse 10.0, which might be part of the problem based on one of Mike's comments. allstar works fine on a 32-bit machine, but fails on the 64-bit machine.
Is that expected?
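
For anyone repeating this test: MALLOC_CHECK_ can also be set for a single command instead of globally in .bashrc, so other programs are not affected. The following is only a minimal sketch. The helper name with_malloc_check is hypothetical (it is not part of IRAF or glibc), and the example in the trailing comment assumes the CL is started with the 'cl' command.

```shell
# Hypothetical helper (not part of IRAF or glibc): run one command with
# MALLOC_CHECK_ set only for that command, leaving .bashrc untouched.
# Levels as documented for glibc: 0 = ignore silently, 1 = print a
# diagnostic and continue, 2 = abort at once, 3 = print and abort.
with_malloc_check() {
    level="$1"
    shift
    case "$level" in
        0|1|2|3) ;;
        *) echo "with_malloc_check: level must be 0-3" >&2; return 2 ;;
    esac
    # The assignment prefix applies to this one command only.
    MALLOC_CHECK_="$level" "$@"
}

# Example (assumption: the CL is started with the 'cl' command):
#   with_malloc_check 2 cl
```

Because the variable is passed in the environment of the CL only, subprocesses started by the CL (such as x_daophot.e) should inherit it, while the login shell itself stays unchanged.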


fitz

02/16/2006 11:19AM

Admin


Registered: 09/30/2005
Posts: 4040

It sounds then like a "normal" memory corruption problem rather than anything specific to the 64-bit issue (other than you see it on the 64-bit machine and not the 32-bit, but we often see bugs expressed on linux which run fine on sparc so this isn't any different). You might try setting MALLOC_CHECK_=2 so the task aborts closer to the point of corruption, but even this may not be definitive.
One approach we use a lot is to link with the ElectricFence memory debugger. If you've got this installed, then before starting the CL (and it sounds like you're using bash) do
export LD_PRELOAD=libefence.so.0.0
The CL and each subprocess should print a banner saying efence is loaded. What this does by default is put a protected memory page at the end of each malloc and trigger a hardware exception when an array is overrun. There are environment variables to control other behaviors, so see the man page. What this gets you is a way to trap right at the point of corruption so you can get a stack trace.
To do this: once the CL is started, put the ALLSTAR binary in the process cache using "cl> prc allstar", then list the cache just using "cl> prc". You should see output like
daophot> prc
[04] tucana!17593(44B9X) HL noaobin$x_daophot.e
Here the 17593 is the process id. In a separate window attach GDB to that process using
% gdb $iraf/noao/bin.suse/x_daophot.e 17593
Then set a breakpoint in 'zpanic_' and type 'c' to continue. Run ALLSTAR as before in the other window; when it faults, your GDB should break at that point, and you can type 'where' to get a stack trace. Post this result.
Now, this may not actually tell us anything useful. The problem is likely to be data specific, so posting a URL to your images/params so we can reproduce it may still be the only way. If you could experiment with parameters to see which ones might be key to triggering the error, that would also help.
-Mike
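
The attach step above could be scripted roughly as follows. This is a sketch under stated assumptions, not a tested recipe for this system: the helper names prc_pid and gdb_attach_cmd are hypothetical, the 'prc' line format is taken from the single example shown above, and the second helper only prints the gdb command rather than running it. The -ex option is a standard GDB command-line option that executes one GDB command after the program and process have been loaded.

```shell
# Hypothetical helpers; names are illustrative and not part of IRAF or GDB.

# Extract the process id from one line of 'prc' output, e.g.
#   [04] tucana!17593(44B9X) HL noaobin$x_daophot.e   ->  17593
# Assumes the "host!pid(" layout shown in the example above.
prc_pid() {
    printf '%s\n' "$1" | sed -n 's/^.*!\([0-9][0-9]*\)(.*$/\1/p'
}

# Print (do not run) a gdb command line that attaches to the process,
# sets a breakpoint in zpanic_, and continues until the fault.
gdb_attach_cmd() {
    binary="$1"
    pid="$2"
    printf "gdb -ex 'break zpanic_' -ex 'continue' %s %s\n" "$binary" "$pid"
}

# Example:
#   pid=$(prc_pid '[04] tucana!17593(44B9X) HL noaobin$x_daophot.e')
#   gdb_attach_cmd "$iraf/noao/bin.suse/x_daophot.e" "$pid"
```

Once GDB stops at the breakpoint, typing 'where' gives the stack trace as described above. Printing the command first makes it easy to check the binary path and process id before attaching.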


butlerdj1

02/16/2006 11:19AM

Newbie


Registered: 02/16/2006
Posts: 2

Thanks for your ideas, Mike.
I spotted some spuriously high and low values (e.g. +/-1E31) in the input images, which appeared to cause the memory corruption early, at iteration 1, with cache=yes. Without the spurious values, the corruption error message occurred when allstar had nearly ended and was (presumably) writing the output files.
Solution/work-around:
By setting 'cache=no' in the allstar parameter file, the memory corruption problem is gone. I get the impression that it runs more slowly because of that, but that is not surprising.
David
