 Memory has been corrupted
   
ping
 12/05/2008 03:43PM (Read 8392 times)  
Hello,

I am running the allstar task and got this error:

PANIC in `/iraf/iraf/noao/bin.redhat/x_daophot.e': Memory has been corrupted

This is the first time I am running it on my new Linux machine, which has four times the memory of my old Linux machine. I am not running any other tasks. I ran "top" and saw that I have plenty of free memory. Any suggestions?

Thank you!
Ping Zhao

 
valdes
 12/05/2008 03:43PM  
Hello Ping,

There are two possibilities for this message. One is a bug in the code due to, well, memory being corrupted by overwriting part of memory that was allocated earlier and then released.

A second possibility, with applications that use a lot of memory like imcombine and allstar, is that the allocated pointer wrapped around a 32-bit integer, resulting in a negative pointer value (which is not allowed in IRAF). This is a problem we have recently found. There is no solution to this "running out of memory", but we can trap the bad pointer so that instead of a panic "memory corrupted" error there would be a normal error abort for out of memory.

If it is due to running out of memory (which could also be a programming bug or an actual attempt to use too much memory) then, whether or not a negative pointer problem is involved, you would still need to make adjustments to the task or image sizes to reduce the amount of memory needed by allstar.

I would put my money on running out of memory, because allstar really is an application that needs to do things "in memory" and so uses lots of memory. If you reduce the memory usage to what you think is a small amount and you still get the error, then we would need a way to reproduce the error here to debug it.

Yours,
Frank Valdes

 
valdes
 12/05/2008 03:43PM  
Oh, I should say that beyond a certain point it is not the memory on the machine that is the limit but the amount of memory IRAF can use in an application. So the fact that you now have a machine with more memory does not necessarily mean you should not have run out of memory. In fact it may encourage allstar or your parameters to try to use even more memory and trigger the 32-bit pointer limit of IRAF.

Frank
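
For readers following along, here is a minimal sketch of the 32-bit wrap-around described above. It is only an illustration in Python, not IRAF's actual allocator code: the helper names `as_int32` and `pointer_is_usable` are hypothetical, and the sample addresses are made up. It shows how an address above 2^31 - 1 turns negative when stored in a signed 32-bit integer, which is the kind of value that would be rejected as a bad pointer.

```python
import ctypes

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def as_int32(address: int) -> int:
    """Return the value seen when `address` is stored in a signed 32-bit integer."""
    # ctypes integer types do no overflow checking, so out-of-range values wrap.
    return ctypes.c_int32(address).value


def pointer_is_usable(address: int) -> bool:
    """Hypothetical check: a wrapped (negative) pointer is treated as invalid."""
    return as_int32(address) >= 0


if __name__ == "__main__":
    # Made-up sample addresses on either side of the signed 32-bit limit.
    for address in (0x10000000, INT32_MAX, INT32_MAX + 1, 0xF0000000):
        wrapped = as_int32(address)
        print(f"{address:#x} -> {wrapped} usable={pointer_is_usable(address)}")
```

The point of the sketch is that the wrap depends on where the allocation lands in the address space, not on how much RAM is installed, which is consistent with the remark that a bigger machine does not rule this failure out.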

 
ping
 12/05/2008 03:43PM  
Hi Frank,

Thank you very much for your reply. So what do you suggest I do to test which problem it is?

Thank you!
Ping

 
ping
 12/05/2008 03:43PM  
Hi Frank,

I tried different machines. I got the same error on all the powerful Linux machines, with up to 128 GB of memory. Now I have found one very slow Solaris machine with only 2 GB of memory, and it works there. Does this tell us something? How do I solve this problem?

Thanks,
Ping

 
ping
 12/05/2008 03:43PM  
Hi Frank,

I don't think this is running out of memory. I just ran the same task on the same image that I ran before on my old machine, where it worked. Now on my new machine it doesn't work. It has to be something to do with the configuration of the machines.

Thanks,
Ping

 
ping
 12/05/2008 03:43PM  
Hi Frank,

I found that allstar runs fine on 32-bit Linux machines but gives the same error on all the 64-bit Linux machines:

PANIC in `/iraf/iraf/noao/bin.redhat/x_daophot.e': Memory has been corrupted

Is it possible there is a bug in x_daophot.e for 64-bit machines?

Thanks,
Ping

 
fitz
 12/05/2008 03:43PM  
[quote]Is it possible there is a bug in x_daophot.e for the 64bit machine?[/quote]

See the thread https://iraf.net/phpBB2/viewtopic.php?t=88272 for a similar report. It may well be that the 32-bit daophot binary is hitting some extreme behavior on 64-bit systems and that these are related, but it could also simply be that an existing bug in the code is being triggered on the new hardware. I'm working on reproducing the other problem, and you might try the statically linked binary mentioned in the other thread, but if this is indeed a problem caused by FP precision on 64-bit machines, a proper fix likely won't be available until the 64-bit port is done. In any case, it would help to have as much information as possible (data files, parameters, etc.) in order to reproduce your problem.

-Mike

 
ping
 12/05/2008 03:43PM  
Hi Mike,

Thank you very much! I can give you my data and parameter files. Where should I put them?

Thanks,
Ping

 
fitz
 12/05/2008 03:43PM  
Just put a tarball of the data, the parameter files, and whatever instructions are needed to reproduce the problem in the anonftp archive ftp://iraf.noao.edu/pub or post a URL to your site.

-Mike

 
ping
 12/05/2008 03:43PM  
Hi Mike,

I have put memory_corrupt.tar in iraf.noao.edu/pub. The instructions are in the README file.

Thank you!
Ping

 
fitz
 12/05/2008 03:43PM  
Unfortunately (depending on your perspective), I wasn't able to reproduce the problem: the task ran for ~71 minutes and completed with no errors. I've replaced the tarfile you uploaded with the results if you want to have a look. The task never used more than about 627MB of memory when running. This was done using v2.14.1 on a 64-bit FC5 system with 1GB RAM.

Something you might try is to set a unix 'MAXWORKSET' variable to define an upper limit on the memory used by the process, e.g.

[code]setenv MAXWORKSET 512[/code]

would set the limit at 512MB of physical memory. If the corruption is being caused by allocating a pointer that overflows a 32-bit value, this may change the allocation behavior to avoid it.

Your README seems to indicate this happens fairly early on, but without being able to reproduce it there's not much I can say about what might be causing it, sorry.

-Mike

 
ping
 12/05/2008 03:43PM  
Hi Mike,

Thank you very much! Yes, the error happens immediately when I start the job. But now

setenv MAXWORKSET 512

works! Since my machine has much more than 512MB of memory, I tried

setenv MAXWORKSET 1024

but this doesn't work and gives the same error.

One difference is that with

setenv MAXWORKSET 512

allstar immediately produces the sub.a.fits file, which is usually produced only when the job is done. There is also a wtxxxa.fits file.

Thank you!
Ping

 
AstroRoyale
 12/05/2008 03:43PM  
I'm also getting a corrupt memory message when I try to scombine a large number (~100) of GHRS spectra from the HST. I've got plenty of memory (4 GB), and htop and conky both say memory is doing fine. I tried setenv MAXWORKSET 512, to no avail. Is there some way to get around this?

Cheers,
AstroRoyale

 
valdes
 12/05/2008 03:43PM  
A memory corruption error is usually not the same as running out of memory. This would be due to a bug in the program. I am not aware of a problem, but scombine is complex and old enough that I would not be surprised by a problem that doesn't affect imcombine. There is not much I can suggest. I am not very familiar with HST formats. Are the GHRS data 1D spectra? If they are not too big and the error is reproducible, you could give me the spectra, parameters, and host type and I could try to reproduce and debug it.

Yours,
Frank Valdes

 