Segmentation fault, MPI and gfortran

Status
Not open for further replies.

FloatingFeather

Programmer
Apr 17, 2017
Hi. I get the error below, but only when I run my code on a cluster. On my desktop computer it runs in parallel without any problem.

[compute-0-2:76201] *** Process received signal ***
[compute-0-2:76201] Signal: Segmentation fault (11)
[compute-0-2:76201] Signal code: Address not mapped (1)
[compute-0-2:76201] Failing at address: 0x7fc2fd2d7e80
[compute-0-2:76201] [ 0] /lib64/libpthread.so.0() [0x318a20f710]
[compute-0-2:76201] [ 1] /opt/openmpi/lib/libmpi.so.1(opal_memory_ptmalloc2_int_free+0x2a1) [0x7fbafcf757a1]
[compute-0-2:76201] [ 2] /opt/openmpi/lib/libmpi.so.1(opal_memory_ptmalloc2_free+0xd3) [0x7fbafcf75bc3]
[compute-0-2:76201] [ 3] pinv993.x(MAIN__+0x65d) [0x434fed]
[compute-0-2:76201] [ 4] pinv993.x(main+0x2a) [0x5155ca]
[compute-0-2:76201] [ 5] /lib64/libc.so.6(__libc_start_main+0xfd) [0x3189a1ed5d]
[compute-0-2:76201] [ 6] pinv993.x() [0x40d629]
[compute-0-2:76201] *** End of error message ***


Any idea what could be causing this problem? It occurs even if I run with only one processor. The cluster doesn't allow me to use the -fcheck=bounds flag, and when I use it on my desktop PC it doesn't find any errors.

Thanks in advance
 
When you run it locally, are you running 32 or 64 bit? The addresses reported for opal_memory_ptmalloc2_int_free don't look like 64-bit addresses - they normally end with a 0 or 8.
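
One way to check this against the trace is to parse the addresses and look at their width and alignment. The following is only a rough sketch in Python: the regular expressions and helper names (parse_frames, failing_address, describe) are made up for illustration, and it assumes the cluster nodes are x86-64, which the /lib64 paths suggest but do not prove. Keep in mind that the bracketed values in numbered frames are instruction addresses, which on x86-64 need not be aligned, while "Failing at address" is the data address that was touched.

```python
import re

# Lines copied from the backtrace in the original post.
BACKTRACE = """\
[compute-0-2:76201] Failing at address: 0x7fc2fd2d7e80
[compute-0-2:76201] [ 0] /lib64/libpthread.so.0() [0x318a20f710]
[compute-0-2:76201] [ 1] /opt/openmpi/lib/libmpi.so.1(opal_memory_ptmalloc2_int_free+0x2a1) [0x7fbafcf757a1]
[compute-0-2:76201] [ 2] /opt/openmpi/lib/libmpi.so.1(opal_memory_ptmalloc2_free+0xd3) [0x7fbafcf75bc3]
[compute-0-2:76201] [ 3] pinv993.x(MAIN__+0x65d) [0x434fed]
"""

# Hypothetical patterns for this trace layout: "[ N] module(symbol+off) [0xADDR]".
FRAME_RE = re.compile(r"\[\s*(\d+)\]\s+(\S+?)\(([^)]*)\)\s+\[(0x[0-9a-fA-F]+)\]")
FAIL_RE = re.compile(r"Failing at address:\s*(0x[0-9a-fA-F]+)")


def parse_frames(text):
    """Return a list of (frame_number, module, symbol, address) tuples."""
    return [
        (int(m.group(1)), m.group(2), m.group(3), int(m.group(4), 16))
        for m in FRAME_RE.finditer(text)
    ]


def failing_address(text):
    """Return the faulting data address as an int, or None if absent."""
    m = FAIL_RE.search(text)
    return int(m.group(1), 16) if m else None


def describe(addr):
    """Return (significant bits, remainder modulo 8) for an address."""
    return addr.bit_length(), addr % 8


if __name__ == "__main__":
    fail = failing_address(BACKTRACE)
    print("failing address", hex(fail), describe(fail))
    for number, module, symbol, addr in parse_frames(BACKTRACE):
        print(number, module, symbol or "?", hex(addr), describe(addr))
```

Run on the trace above, this shows that the failing address and the libmpi frames need more than 32 bits, so they cannot come from a 32-bit process. Frame 3 is the one inside the user's own executable (MAIN__+0x65d); if the program is built with -g, its address can be passed to addr2line -e pinv993.x to find the source line.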
 
Hi. Thanks for your reply. Locally I run on a 64-bit system (Ryzen 7). Even though the cluster doesn't allow me to use -fcheck=bounds, I was able to trace where the error originates, and it's quite weird: it occurs when deallocating an array. I can work around it simply by not deallocating that array, but I still don't understand what the problem is.
 
Hi. Is it possible to get a segmentation fault because somebody else is using the same node as I am? The cluster is having problems, so I think it's possible that somebody else's job is sharing a node with mine.
 