Welcome to iraf.net Wednesday, April 24 2024 @ 06:54 PM GMT


 ERROR: segmentation violation/temp file/x64/timing issue
   
richardbaxter
 08/25/2009 08:47AM (Read 12231 times)  
+----
Newbie

Status: offline


Registered: 08/25/2009
Posts: 6
Hi IRAF users,

Can I please ask whether anyone has seen the following IRAF error before (below)? It happens intermittently (about 3 out of 4 times) when executing Gemini GMOS commands (like gsflat or gsreduce). I have replicated this issue on both an FC10 x64 fast workstation and a CentOS 5.3 (~FC6) x64 fast workstation, and can confirm that the problem does not occur at all on an FC6 32-bit slow laptop. The exact same IRAF (latest) and extern/gemini (latest) installation has been performed on all machines. If I keep on executing the command, it eventually works.

"ERROR: segmentation violation" generally indicates a file does not exist at the time it is being read. I have confirmed that the temporary file does indeed exist after the command has been executed even though iraf appears to fail to read it while the command is being executed.

Based upon these symptoms, here are some possible causes (note that all of the possibilities listed suggest a general IRAF issue rather than one specific to the Gemini software package 1.10):
- the temporary file tmpimagexxx is not being created on the x64 machine in time for it to be able to be read by the next section of code
- another possibility would be something to do with the speed of the processor (~1Ghz laptop versus ~3Ghz workstation), but I doubt this would be the case since other gemini users would have picked up any CPU speed dependent issues.

[code:1:4b1319479d]
GIREDUCE: Image tmpimagexxx overscan subtracted
ERROR: segmentation violation
"if (no == mef || (no == imaccess(l_image//"[1]") && no == imaccess(l ..."
line 51: gemtools$gimverify.cl (hidden task)
called as: `gimverify (image=tmpimagexxx)'
[/code:1:4b1319479d]

I have done Google searches for this issue, but with no success.

Any ideas or diagnostic directions would be most appreciated.

Thank you for your time,
Richard Baxter

---

And here is the surrounding IRAF output, just for context:

[code:1:4b1319479d]
GPREPARE -- Tue Aug 25 17:48:44 EST 2009
Input list = N20090720S0081
Output list =
Output prefix = g
Raw path = rawdir$/
Add MDF = yes
Input MDF in case header keyword not found = Input rawdir$/N20090720S0081.fits Output gN20090720S0081.fits
GPREPARE: Using MDF defined in the header 2.0arcsec
GPREPARE: Taking MDF from directory gmos$data/

GPREPARE exit status: good.
----------------------------------------------------------------------------
Input files:
gN20090720S0081
Output files:
tmpimage13779sx
GIREDUCE: Image tmpimage13779sx overscan subtracted
ERROR: segmentation violation
"if (no == mef || (no == imaccess(l_image//"[1]") && no == imaccess(l ..."
line 51: gemtools$gimverify.cl (hidden task)
called as: `gimverify (image=tmpimage13779sx)'
"gimverify(l_image)"
line 28: gemtools$gsetsec.cl (hidden task)
called as: `gsetsec (image=tmpimage13779sx, key_datsec=DATASEC)'
"gsetsec (output[ii], key_datsec=l_key_datasec)"
line 916: gmos$gireduce.cl
called as: `gireduce (inimages=N20090720S0081.fits, outpref=, outimages=tmpimage13779sx.fits, fl_over=yes, fl_trim=yes, fl_bias=no, fl_dark=no, fl_flat=no, fl_vardq=no, fl_addmdf=yes, bias=biasframe.fits, dark=, flat1=, flat2=, flat3=, flat4=, key_biassec=BIASSEC, key_datasec=DATASEC, rawpath=rawdir$, gp_outpref=g, sci_ext=SCI, var_ext=VAR, dq_ext=DQ, key_mdf=MASKNAME, mdffile=, mdfdir=gmos$data/, bpm=, gaindb=/iraf/extern/gemini/gmos/data/gmosamps.dat, sat=65000, key_nodcount=NODCOUNT, key_nodpix=NODPIX, key_filter=GRATING, key_ron=RDNOISE, key_gain=GAIN, fl_mult=yes, fl_inter=no, median=no, function=chebyshev, nbiascontam=4, biasrows=default, order=1, low_reject=3., high_reject=3., niterate=2, logfile=geminireductionlogfile.txt, verbose=yes)'
"fl_mult=yes, sat=l_sat, biasrows=l_biasrows)"
line 647: gmos$gsreduce.cl
called as: `gsreduce (inimages=@tmpfilelist3779mx, outimages=, outpref=gs, fl_over=yes, fl_trim=yes, fl_bias=no, fl_dark=no, fl_flat=no, fl_gmosaic=yes, fl_fixpix=yes, fl_cut=no, fl_title=no, fl_vardq=no, bias=biasframe.fits, dark=, flatim=, gradimage=, refimage=, key_exptime=EXPTIME, key_biassec=BIASSEC, key_datasec=DATASEC, rawpath=rawdir$, sci_ext=SCI, var_ext=VAR, dq_ext=DQ, key_mdf=MASKNAME, mdffile=, mdfdir=gmos$data/, bpm=, gaindb=default, gratingdb=gmos$data/GMOSgratings.dat, filterdb=gmos$data/GMOSfilters.dat, bpmfile=gmos$data/chipgaps.dat, key_ron=RDNOISE, key_gain=GAIN, ron=3.5, gain=2.2, sat=65000, ovs_flinter=no, ovs_med=no, ovs_func=chebyshev, ovs_order=1, ovs_lowr=3., ovs_highr=3., ovs_niter=2, nbiascontam=4, biasrows=default, logfile=geminireductionlogfile.txt, verbose=yes)'
"refimage="", nbiascontam=l_nbiascontam, biasrows=l_biasrows)"
line 762: gmos$gsflat.cl
called as: `gsflat (inflats=N20090720S0081.fits, specflat=N20090720S0081_flat.fits, fl_over=yes, fl_trim=yes, fl_bias=no, bias=biasframe.fits, rawpath=rawdir$, order=23, niterate=5)'
"# task $geminireduce = /home/rich/iraf/reductionscript.cl"
line 7: /home/rich/iraf/reductionscript.cl
called as: `geminireduce ()'
[/code:1:4b1319479d]
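
One way to probe the first hypothesis above (the temporary image not being visible yet when the next step reads it) is to watch the file from outside IRAF. The following is only a minimal sketch in Python, not part of IRAF or the Gemini package; the function name `wait_for_file` and its defaults are made up for illustration. It polls until the file exists and is non-empty, and reports how long that took.

```python
import os
import time


def wait_for_file(path, timeout=5.0, interval=0.1):
    """Poll until `path` exists and is non-empty.

    Returns the number of seconds waited, or None if `timeout` expired.
    Hypothetical helper for diagnosis only.
    """
    start = time.monotonic()
    while True:
        try:
            if os.stat(path).st_size > 0:
                return time.monotonic() - start
        except FileNotFoundError:
            pass  # not created yet; keep polling
        if time.monotonic() - start >= timeout:
            return None
        time.sleep(interval)
```

If the reported wait is always close to zero while the segfault still occurs, late file creation is probably not the cause.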

 
fitz
 08/25/2009 08:47AM  
AAAAA
Admin

Status: offline


Registered: 09/30/2005
Posts: 4040
[quote:03bd29e56d] "ERROR: segmentation violation" generally indicates a file does not exist at the time it is being read.[/quote:03bd29e56d]

I assume you mean that you see the segfault only when the file doesn't exist? A segmentation violation is generally a memory error, e.g. dereferencing a null pointer (such as an invalid file descriptor) or an out-of-bounds array access; it isn't specifically anything to do with file existence.

[quote:03bd29e56d] I have confirmed that the temporary file does indeed exist after the command has been executed even though iraf appears to fail to read it while the command is being executed. [/quote:03bd29e56d]

There isn't a known issue with file syncing on any platform, but on "64-bit filesystems" the file information (e.g. the inode) is a 64-bit struct, and in some cases IRAF doesn't "see" the file because IRAF is still a 32-bit app. One way to test this is to reproduce the error, verify that the unix 'ls' command shows the file as non-empty, and then check whether the IRAF 'dir' or 'type' command cannot see the file. If that is the case, all you can do is wait for the 64-bit IRAF port or else move the data processing to a 32-bit filesystem partition.

Another possible cause is that the data are on an NFS-mounted disk. In this case there can be file sync issues, which are easily solved by adding the 'sync' option to the mount.

From the error trace, it appears the segfault is actually in the CL imaccess() function, but I'm not aware of any problems with this. Hope this helps.

-Mike
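
The 'ls' half of the test Mike describes can be complemented by checking whether the file's metadata would fit in 32-bit fields at all. This is a rough sketch in Python under stated assumptions: that the relevant fields are the inode number and the file size, and that a 32-bit application would use an unsigned 32-bit inode and a signed 32-bit size. The names `fits_32bit` and `check_path` are illustrative only; nothing here inspects IRAF itself.

```python
import os

INO_MAX_32 = 2**32 - 1   # assumption: unsigned 32-bit inode field
SIZE_MAX_32 = 2**31 - 1  # assumption: signed 32-bit file-size field


def fits_32bit(size, inode):
    """Report whether a file size and inode number fit the assumed 32-bit limits."""
    return {
        "size_fits": 0 <= size <= SIZE_MAX_32,
        "inode_fits": 0 <= inode <= INO_MAX_32,
    }


def check_path(path):
    """Stat `path` and report its size, inode, and whether both fit in 32 bits."""
    st = os.stat(path)
    report = {"size": st.st_size, "inode": st.st_ino}
    report.update(fits_32bit(st.st_size, st.st_ino))
    return report
```

A file for which `inode_fits` is false would be a candidate for the "IRAF cannot see it" behaviour described above; if both values fit, the cause is more likely elsewhere.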

 
richardbaxter
 08/25/2009 08:47AM  
+----
Newbie

Status: offline


Registered: 08/25/2009
Posts: 6
Hi Mike,

Thanks for your advice and all of your suggestions.

I have now tested this script on a CentOS 5.3 32-bit installation on my workstation, and exactly the same problem occurs as on the CentOS 5.3 64-bit installation on my workstation. Considering that CentOS 5.3 is based on FC6, and that this script was working on the FC6 install on my slower laptop, I must conclude this problem relates to either:
a) some FC6/CentOS 5.3 updates,
b) speed of PC, or
c) some hardware/kernel incompatibility issue on my workstation.

Note that all IRAF installations are identical (workstation x64/i386, laptop i386).

I will keep working on it.

Cheers,
Richard

 
richardbaxter
 08/25/2009 08:47AM  
+----
Newbie

Status: offline


Registered: 08/25/2009
Posts: 6
I have now tested this script on an identical FC6 32-bit installation on my workstation (the same as on my laptop), and exactly the same problem occurs as with the CentOS 5.3 32-bit, CentOS 5.3 64-bit and FC10 64-bit installations on my workstation. I must conclude this problem relates to either:
b) speed of PC (possibly the fact that it is dual core and the laptop is single core), or
c) some other hardware incompatibility issue on my workstation.

I will now be searching for commands to slow down IRAF... If anyone can think of any other tests I could try, please let me know.

Thanks for your help,
Richard

 
klabrie
 08/25/2009 08:47AM  
++---
Junior

Status: offline


Registered: 12/13/2005
Posts: 22
Hi Richard,

I am with the Gemini IRAF package group, and I would really like to get to the bottom of this.

First, thanks for all the information and troubleshooting.

We don't have a solution yet, but I just want to let you know that we have a couple of machines very similar to yours (64-bit, FC10 & 11, 3 GHz) and we will investigate the issue. Because of another similar report, we have tried to reproduce the problem on slower machines, and just as on yours, everything works on those. We haven't tried the faster machines yet.

If you could email me (see below) the lpar of the gsflat or gsreduce command, we could try to run exactly what you are running and see what we get.

Kathleen
Email: klabrie@gemini.edu

 
richardbaxter
 08/25/2009 08:47AM  
+----
Newbie

Status: offline


Registered: 08/25/2009
Posts: 6
I have found a workaround for this problem:

1. Downgrade from gemini 1.10 to gemini 1.9 (patch1).
2. If you get strange errors like "ERROR: Attempt to access undefined local variable `l_yoff'" or "ERROR: Attempt to access undefined local variable `grule'", it means that you are using the new GMOS-N B600 Grating G5307 (whose data appears not to have been available at the time of gemini software release 1.9). See http://www.gemini.edu/node/11255 for a solution, or just edit /iraf/extern/gemini/gmos/data/GMOSgratings.dat and add these lines to the GMOS North area of the file:
B600+_G5307 600 461 1688 276 300 1180 0.0 50.0
B600-_G5307 600 461 1688 276 300 1180 0.0 50.0

Thanks Kathleen, I will email you the lpar contents and the exact commands I am executing shortly.

Richard

 
richardbaxter
 08/25/2009 08:47AM  
+----
Newbie

Status: offline


Registered: 08/25/2009
Posts: 6
(For reference only, this is another issue I found with GMOS reductions when reverting to Gemini software version 1.9 from version 1.10. When you execute gstransform and set the wavtran parameter, make sure that you do not specify ".fits" on the end of the file name, else you may get the following error:

ERROR - GSTRANFORM: Wavelength or S-distortion calibration file not found
ERROR - GSTRANFORM: for image gsARCORSCIENCEFILENAME.fits[SCI,1]
ERROR - GSTRANFORM: output image tgsARCORSCIENCEFILENAME.fits will be incomplete

This may not be relevant for your own GMOS reductions, as you may never specify ".fits" extensions.)

 
richardbaxter
 08/25/2009 08:47AM  
+----
Newbie

Status: offline


Registered: 08/25/2009
Posts: 6
Note that I have resolved my issue with gemini v1.10 (compared to v1.9) by applying the following changes:

1. In login.cl, ensure that the reduction script is declared as 'task reductionscript = /home/rich/iraf/reductionscript.cl' (not '$reductionscript = /home/rich/iraf/reductionscript.cl').
2. In reductionscript.cl, ensure no comments (#...) are added before 'procedure reductionscript()'/'begin', OR do not list task definitions within comments.
3. In reductionscript.cl, add some dummy declarations between 'procedure reductionscript()' and 'begin' (e.g. string file_name, struct *flist, etc.).

Cheers,
Richard

 
jturner
 08/25/2009 08:47AM  
+++++
Active Member

Status: offline


Registered: 12/29/2005
Posts: 165
Hi Richard & Mike,

Sorry you haven't received further feedback from us to date (did you send the lpar etc. to Kathleen or open a helpdesk request?). The things you say resolved this don't make a lot of sense to me, but in any case I'm glad it sounds like you have it working (right?).

For the record:
[quote:410c2207fa]- another possibility would be something to do with the speed of the processor (~1Ghz laptop versus ~3Ghz workstation), but I doubt this would be the case since other gemini users would have picked up any CPU speed dependent issues.
[/quote:410c2207fa]
In fact we have picked up a speed-dependent issue lately. We have seen a case where apparently IRAF's FITS kernel became corrupted because the headers of a MEF file were being updated faster than the FITS kernel (or the OS) can resolve time-stamps. Since the time stamp hasn't changed, the FITS kernel incorrectly concludes that its old cache of what the file used to look like is still valid and goes on to munge the buffer, eventually causing a memory error. At least that's how it looked to me when I tried burrowing down into the maze of code, though I'm much less familiar with IRAF at that level than Mike or Nelson. I was able to work around that particular case by changing some header update code that was no longer needed in the CL script, but in general it may be true that this problem could crop up more as machines get faster.

Any comments, Mike? (Sorry it took a while to report; let me know if you want more info.)

Cheers,
James.
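
The failure mode James describes can be illustrated with a toy model. The sketch below is Python, not IRAF code, and every name in it (`HeaderCache`, `update`, the dict standing in for a file) is invented for illustration; it assumes a cache that is refreshed only when the file's timestamp, truncated to a one-second resolution, has changed. Two updates within the same second leave the cache stale.

```python
import math


class HeaderCache:
    """Toy cache that trusts its copy while the file's timestamp tick is unchanged."""

    def __init__(self, resolution=1.0):
        self.resolution = resolution  # assumed timestamp resolution in seconds
        self.tick = None
        self.header = None

    def read(self, file):
        tick = math.floor(file["mtime"] / self.resolution)
        if tick != self.tick:
            # Timestamp looks new: re-read the header from the "file".
            self.header = dict(file["header"])
            self.tick = tick
        return self.header


def update(file, key, value, now):
    """Write a header keyword and stamp the file with the time `now`."""
    file["header"][key] = value
    file["mtime"] = now


f = {"mtime": 100.2, "header": {"A": 1}}
cache = HeaderCache()
cache.read(f)                       # cache filled at tick 100
update(f, "B", 2, now=100.7)        # second write within the same second
stale = dict(cache.read(f))         # still {"A": 1}: the update is missed
update(f, "C", 3, now=101.1)        # a write in the next second
fresh = dict(cache.read(f))         # {"A": 1, "B": 2, "C": 3}
```

In this model the stale read is merely wrong; in the real kernel, as described above, acting on the stale copy appears to corrupt a buffer and end in a memory error.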

 
fitz
 08/25/2009 08:47AM  
AAAAA
Admin

Status: offline


Registered: 09/30/2005
Posts: 4040
Now that you mention the issue of timestamps, I do recall that the time-resolution issue came up once before and we decided it wasn't an easy thing to fix because of the way it was ingrained in the FITS kernel design. I looked briefly at whether it was possible to simply disable the cache or else force perpetual updates, but stopped when I found a code comment that the "FK doesn't seem to work properly when the cache is disabled" and it appeared it would need more debugging. I'll add it to the list of things to look at for v2.15 but can't make any promises.

In the longer term, one project being discussed is an alternative FITS kernel based on CFITSIO (we'd keep the old one for compatibility). This is one way to directly support compressed FITS and other conventions and we could simply layer IRAF conventions like inheritance on top of it. At the moment this is just an idea, no decisions have been made.

As for the workarounds mentioned: Adding parameters before a begin statement doesn't affect the FITS kernel at all, it may just be adding some extra processing in the CL interpreter and changing the timing slightly so the cache problem isn't seen. The same is likely true of modifying the header update code in James' scripts and probably isn't a permanent solution either. You could put a "sleep(1)" in a script, however this wouldn't help if the problem happens within a compiled task working on an image.
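
For anyone driving reductions from an outside script, the "sleep(1)" idea can be narrowed to waiting only until the next timestamp tick. This is a hedged sketch in Python, assuming a one-second timestamp resolution; the helper name `pause_until_next_tick` and its injectable `now`/`sleep` arguments are hypothetical. As noted above, nothing like this helps when the rapid updates happen inside a compiled task.

```python
import math
import time


def pause_until_next_tick(last_update, resolution=1.0,
                          now=time.time, sleep=time.sleep):
    """Sleep just long enough that the next write falls in a later
    timestamp tick than `last_update`. Returns the delay in seconds."""
    next_tick = (math.floor(last_update / resolution) + 1) * resolution
    delay = max(0.0, next_tick - now())
    if delay > 0:
        sleep(delay)
    return delay
```

Calling it between two successive header updates, with `last_update` set to the time of the first, costs at most one `resolution` interval rather than a fixed full second.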

 
jturner
 08/25/2009 08:47AM  
+++++
Active Member

Status: offline


Registered: 12/29/2005
Posts: 165
[quote:6465a069ed]In the longer term, one project being discussed is an alternative FITS kernel based on CFITSIO (we'd keep the old one for compatibility). This is one way to directly support compressed FITS and other conventions and we could simply layer IRAF conventions like inheritance on top of it. At the moment this is just an idea, no decisions have been made. [/quote:6465a069ed]
That sounds a promising idea. The existing kernel certainly looked like it could be tricky to modify.
[quote:6465a069ed]As for the workarounds mentioned: Adding parameters before a begin statement doesn't affect the FITS kernel at all, it may just be adding some extra processing in the CL interpreter and changing the timing slightly so the cache problem isn't seen. The same is likely true of modifying the header update code in James' scripts and probably isn't a permanent solution either. You could put a "sleep(1)" in a script, however this wouldn't help if the problem happens within a compiled task working on an image.[/quote:6465a069ed]
Yes, that's what I thought. Actually, in the particular case I looked at I was able to remove the header update because it was just a workaround for an old bug. In other cases we could try sleep(). Fortunately we don't have a lot of SPP code, and this seems to be showing up when making multiple header updates to MEF files (which core IRAF tasks don't really do). I hope it won't also show up within SPP tasks that add header keywords and then do something else.

Thanks,
James.

 
elehman
 08/25/2009 08:47AM  
+----
Newbie

Status: offline


Registered: 01/15/2010
Posts: 1
I have experienced this same problem when using gsreduce. I avoid it by hiding the output with verbose- or by hitting Control-o. This may be a superficial fix, but so far it hasn't caused any problems.

 
Anonymous:
 08/25/2009 08:47AM  



Useful hint. I tried it and it worked fine for me.

 