 PANIC ... Memory has been corrupted...
   
ping
 01/19/2007 07:56PM (Read 9490 times)  
+++--
Chatty

Status: offline


Registered: 04/13/2006
Posts: 45
Hello,

I am trying to run an IRAF daophot/allstar job. The job needs only about 1 GB of memory, and the Linux machine I am using has much more than 1 GB free, but the job always dies with this error:

PANIC in `/opt/IRAF2.12/iraf.pkg/noao/bin.redhat/x_daophot.e': Memory has been corrupted
ERROR: Abnormal termination of child process 'noaobin$x_daophot.e'

Any suggestions on how to solve this problem?

Thanks for the help!
Ping

 
fitz
 01/19/2007 07:56PM  
AAAAA
Admin

Status: offline


Registered: 09/30/2005
Posts: 4040
Ping,

Assuming this isn't a memory bug in the task, the first thing that comes to mind is that you're exceeding a personal resource limit on the size of the executable. I also notice a '2.12' in the pathname, and depending on exactly which version of 2.12 you're using this may be more likely. One thing to try is to reset the 'stacksize' limit before you log in to the cl, e.g.

[code]
prompt> limit stacksize unlimited
prompt> cl
[/code]

(If you use bash, the equivalent command is "ulimit -s unlimited".) This is done automatically in recent 'cl' command scripts, but doing it before you log in (or in your .cshrc) has the same effect.

If that doesn't fix it, please post more about which Linux distro and version you are using. There was a feature introduced in Fedora called 'exec-shield' that affected memory for IMFORT tasks but, as far as we know, didn't harm SPP tasks. However, what it did was basically add an offset to the memory pointer, so your 1 GB may be in the upper end of the memory address range and causing problems. Use the banner bar to do a forum search on 'exec-shield' to find out how to disable it. Hope this helps.

Cheers,
-Mike
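
For bash users, the stack-limit step could be sketched roughly as follows. The function name `raise_stack_limit` is made up for illustration and is not part of IRAF; only the `ulimit` builtin is real. The sketch raises the soft limit as far as the hard limit allows, which should not need root, and prints both values so you can see what the session will actually get.

```shell
#!/bin/bash
# Hypothetical helper (not part of IRAF): raise the soft stack limit
# as far as the hard limit allows, then report both values.
raise_stack_limit() {
    hard=$(ulimit -H -s)      # hard limit in KB, or "unlimited"
    ulimit -S -s "$hard"      # the soft limit may be raised up to the hard limit
    echo "stack soft=$(ulimit -S -s) hard=$(ulimit -H -s)"
}

raise_stack_limit
# Then start IRAF from this same shell so it inherits the limit:
#   cl
```

If the hard limit reported here is not "unlimited", the plain "ulimit -s unlimited" command would be refused for a non-root user, which is worth knowing before looking for other causes.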

 
ping
 01/19/2007 07:56PM  
Hi Mike,

Thanks for your reply!

First, I did start IRAF with
"limit stacksize unlimited ; cl"
but that didn't help.

I have tried this task on several machines. They all run
Fedora Core release 4 (Stentz)
All but one of them give this error:
PANIC in `/iraf/iraf/noao/bin.redhat/x_daophot.e': Memory has been corrupted
Ironically, the one machine that can run the task is the one with the smallest memory (2 GB).

I did a forum search for 'exec-shield' but didn't find anything. Then I did a Google search and found:
http://www.fedorafaq.org/fc1/
http://www.dummies.com/WileyCDA/DummiesArticle/id-2900.html

If this problem is really caused by 'exec-shield', I would have to log in as root to disable it, right?

Thanks,
Ping

 
fitz
 01/19/2007 07:56PM  
Ping,

Yes, disabling exec-shield requires root permission. It can be made permanent so you don't need root every time you run IRAF. I'm not sure why you couldn't find any forum search results; I see about a dozen. In any case, it isn't certain this is the problem here. It could just be that there is a memory bug only triggered when memory grows that large. Does processing in smaller sets of images work reliably?

You should be able to define a MALLOC_CHECK_ environment variable to do some error checking, but unless the code is compiled for debugging the most you'd get out of this is an indication that it is indeed a heap corruption. Is there anything else you can think of that might give me a clue? This doesn't sound like any known issue.

-Mike
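
One way to set MALLOC_CHECK_ for a single run from bash is sketched below. The wrapper name `run_with_malloc_check` is hypothetical and not part of IRAF; MALLOC_CHECK_ itself is a glibc feature, and level 2 asks glibc to abort as soon as it detects heap corruption. The sketch assumes the variable is inherited by child processes in the usual way, so a task started from the wrapped session would see it too.

```shell
#!/bin/bash
# Hypothetical wrapper (not part of IRAF): run one command with glibc's
# MALLOC_CHECK_ set, without changing the rest of the shell session.
# Level 2 asks glibc to abort as soon as it detects heap corruption.
run_with_malloc_check() {
    MALLOC_CHECK_=2 "$@"
}

# Example: the wrapped command (and its children) see the variable.
run_with_malloc_check sh -c 'echo "MALLOC_CHECK_=$MALLOC_CHECK_"'
# To apply it to an IRAF session, the wrapped command would be: cl
```

In csh or tcsh the variable would instead be set for the whole session with "setenv MALLOC_CHECK_ 2" before starting the cl.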

 
ping
 01/19/2007 07:56PM  
Hi Mike,

I just tried a few more machines and they all give me this error:
PANIC in `/iraf/iraf/noao/bin.redhat/x_daophot.e': Memory has been corrupted

Yes, they can run smaller images fine.

Since disabling exec-shield requires root permission, I want to be sure it is what is causing the problem before I ask our system people to help. Is there a way to check the status of exec-shield without a root login?

How do I set the MALLOC_CHECK_ environment variable you suggested?

Thanks,
Ping

 
ping
 01/19/2007 07:56PM  
Hi Mike,

It seems the problem is not related to exec-shield. I logged into each machine and typed:
"more /proc/sys/kernel/exec-shield"

The one machine that has no problem returned 9.
The machines with problems returned either 9 or 0.

Is this the right way to check the status of exec-shield? If so, then the problem is not related to exec-shield.

Thanks for the help,
Ping
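
The same check can be wrapped so it behaves sensibly on machines where the file is missing. This is only a sketch: the function name `exec_shield_status` is made up, and the file is assumed to exist only on kernels that carry the exec-shield patch. Because the meaning of the number may differ between kernel versions, the sketch just prints the raw value rather than interpreting it.

```shell
#!/bin/bash
# Hypothetical helper: print the raw exec-shield setting, or a note if
# the kernel does not expose it. The optional argument overrides the
# default path, which is the one used in the thread.
exec_shield_status() {
    f="${1:-/proc/sys/kernel/exec-shield}"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "not available"
    fi
}

exec_shield_status
```

Reading the file does not need root; only changing the setting does.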

 
ping
 01/19/2007 07:56PM  
Hi Mike,

I asked our syshelp to disable exec-shield, but it did not solve the problem. So the panic error is caused by something else.

Any suggestions?

Thanks,
Ping

 
fitz
 01/19/2007 07:56PM  
Hi Ping,

The error is pretty generic, and without being able to reproduce it myself I can't really say much about what might be causing it, other than that there isn't a known problem. Tools like ElectricFence and Valgrind (and the "setenv MALLOC_CHECK_ 2" trick) can be used to trap a memory overflow/underflow at the time it happens, but are only really useful when you have a debugger attached to the code.

Turning off the ALLSTAR 'cache' parameter may help; otherwise I'd need some way to reproduce the error (e.g. tell me that a list of 200 dev$pix images and setting this/that parameter will do it).

Cheers,
-Mike

 
ping
 01/19/2007 07:56PM  
Hi Mike!

Turning off cache seems to be working! So far I haven't got the error. What's the effect of cache? Should I just keep it off?

Thanks,
Ping

 
fitz
 01/19/2007 07:56PM  
Ping,

The 'cache' parameter tells the task to try to keep (up to) three images plus a starlist in memory to improve the processing time (as opposed to hitting the disk repeatedly). Practically speaking, what this does is consume a lot of the physical memory, and if your images are very large you may be using a lot of swap space and losing any benefits. It may also be amplifying the effect of a memory leak and leading to the error (just a guess). In any case, if it works without the cache then keep going with that.

Cheers,
-Mike

 
ping
 01/19/2007 07:56PM  
Hi Mike,

So it takes longer to run the job when I turn the 'cache' off. I am using a machine that has plenty of free physical memory, hoping to run the job as fast as possible, so ideally I should turn the cache on, right? Even using three times the memory, that is still far less than the memory available. But because of this error, I have to turn the cache off and settle for slower speed.

Ping

 