PCIe error explanation?

Status
Not open for further replies.

CecilXavier

Technical User
Oct 2, 2007
104
US
I am trying to get a better understanding of PCIe data transmissions, specifically the kinds of errors and what they mean. For example, on one bus I see one side throw Rcbe_ERR_COR, reply_rollover and reply_timeout. On the other side of that bus (across layer 1) I see receiver_err, reply_timeout and reply_rollover. The link recovers from these errors, but there are enough of them to slow the job down significantly. I thought a receiver error meant that side of the bus was saying it "received" something it had to correct, i.e., the hardware on that side recovered from the error. Is this the right thought process?
 
This is a good article to read over: https://www.design-reuse.com/articles/38374/pcie-error-logging-and-handling-on-a-typical-soc.html

Also, can you elaborate on exactly what hardware you are using, what drivers are installed, where you are reading the error logs from, and any other information? If the above article gives you the info you need, then disregard.

One other thought I had was thermal creep. Have you tried something as simple as removing the card and reseating it? Or, if there are two ports, trying the other one to see if the errors follow?
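
On the error names themselves: they look like the correctable errors that PCIe Advanced Error Reporting (AER) defines, though that mapping is an assumption, since the tool producing the logs has not been identified. In AER terms, a Receiver Error is flagged by the physical layer of the port that received bad data, a Replay Timer Timeout means the transmitter did not get an ACK or NAK in time and resent the packet, and a REPLAY_NUM Rollover means several replays in a row failed and the link retrained. Below is a minimal Python sketch that decodes a raw AER Correctable Error Status register value into those names. The function name and the sample value are hypothetical; only the bit positions come from the AER register layout.

```python
from typing import List

# Bit positions in the PCIe AER Correctable Error Status register.
COR_STATUS_BITS = {
    0: "Receiver Error",
    6: "Bad TLP",
    7: "Bad DLLP",
    8: "REPLAY_NUM Rollover",
    12: "Replay Timer Timeout",
    13: "Advisory Non-Fatal Error",
}


def decode_cor_status(value: int) -> List[str]:
    """Return the names of the correctable-error bits set in `value`.

    Hypothetical helper for illustration; `value` is the raw 32-bit
    register contents, however your platform exposes them.
    """
    return [
        name
        for bit, name in sorted(COR_STATUS_BITS.items())
        if value & (1 << bit)
    ]


if __name__ == "__main__":
    # Hypothetical sample value with bits 0, 8 and 12 set.
    sample = 0x1101
    for name in decode_cor_status(sample):
        print(name)
```

If the three error types keep showing up together on both ends of the same link, that pattern usually points at the physical connection (seating, cabling, signal integrity) rather than at either device's logic, which is why reseating is worth trying first.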
 