I'm using gcc 2.95.2 on a multi-processor Solaris box and am encountering an intermittent segmentation fault.
I'm creating an element in a thread and adding a pointer to it to an std::queue (and signalling a semaphore). This thread has no more access to the element (it drops the pointer, and never takes things out of the queue).
A second thread waits on the semaphore and then takes an element out of the queue. After processing, the element is deleted. The element contains an std::list, which is destroyed along with it and in turn deletes all of its contents. Sometimes (maybe once in 10,000 iterations) there is a segmentation fault while deleting a list entry. This only happens on our multi-processor box; it runs fine on all the single-processor machines I've tried, and sometimes runs fine on the multi-processor machine too.
Since it's a seg fault, I guessed that perhaps I was deleting an element twice (although I would expect that to cause an error every time) and confirmed that all copy constructors perform fully deep copies, etc. No effect.
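To make the setup concrete, here is a minimal sketch of the pattern described above. It is not the actual code: `Element`, `Semaphore`, and `run_demo` are made-up names, and it uses C++11 threading facilities that gcc 2.95.2 does not provide (the real code would use the platform's own thread and semaphore calls). The mutex around the queue is an assumption of the sketch. The semaphore only counts available elements, and std::queue does not itself synchronize a push in one thread against a pop in another.

```cpp
#include <cassert>
#include <condition_variable>
#include <list>
#include <mutex>
#include <queue>
#include <string>
#include <thread>

// Hypothetical element type: owns a std::list, as in the description.
struct Element {
    std::list<std::string> items;
};

// Minimal counting semaphore built from a mutex and a condition variable.
class Semaphore {
public:
    void post() {
        std::lock_guard<std::mutex> lock(m_);
        ++count_;
        cv_.notify_one();
    }
    void wait() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    int count_ = 0;
};

// Runs one producer and one consumer over n elements and returns
// how many elements the consumer processed and deleted.
int run_demo(int n) {
    std::queue<Element*> work;
    std::mutex work_mutex;   // assumption: guards every access to the queue
    Semaphore available;     // counts elements waiting in the queue
    int processed = 0;

    std::thread producer([&] {
        for (int i = 0; i < n; ++i) {
            Element* e = new Element;
            e->items.push_back("a");
            e->items.push_back("b");
            {
                std::lock_guard<std::mutex> lock(work_mutex);
                work.push(e);
            }
            // From here on the producer never touches e again.
            available.post();
        }
    });

    std::thread consumer([&] {
        for (int i = 0; i < n; ++i) {
            available.wait();
            Element* e = nullptr;
            {
                std::lock_guard<std::mutex> lock(work_mutex);
                assert(!work.empty());
                e = work.front();
                work.pop();
            }
            // ... processing would happen here ...
            delete e;        // destroys the list and all of its entries
            ++processed;
        }
    });

    producer.join();
    consumer.join();
    return processed;
}
```

Calling `run_demo(10000)` pushes and consumes 10,000 elements. If the queue lock is dropped from either side, the push and pop can race, which is one possible source of corruption that would only show up when the two threads really run at the same time.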
Now, if I take out the thread, so that I add to the std::queue and then immediately remove the element to perform the processing, the problem seems to go away (I tried 4 times as many iterations with no error).
Does anyone know whether the std::list implementation has threading problems on multi-processor machines? If for some reason the std::list dtor were running past the end of its array, it might explain the seg fault...
Any ideas?