STM32-P407 Ethernet randomly dropping TX packets

Started by cnlohr, January 23, 2014, 08:08:01 PM


cnlohr

I am having a difficult time with reliable ethernet communications on my STM32-P407.  I am using the same core as the demo provided and am having the following situation.

I am testing this by restarting the processor, pinging it 10,000 to 60,000 times, and printing the number of pings the processor received and sent, and recording how many pings I received and sent.

Almost all the time (~90%) the packets being dropped are the TX packets.

When there is no chatter on the line, i.e. connected directly to my laptop, or into a switch with other network devices disconnected, I get virtually no dropped packets, maybe .001%.

When there is chatter on the line, i.e. random network broadcasts, etc., my dropped packet percentage rises to ~0.04%.

When I add chatter, i.e. making a lot of broadcast packets, my dropped packet percentage rises to ~.2%.
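
For reference, the percentages quoted here are lost pings divided by pings sent. A minimal sketch of that arithmetic follows; the function name `drop_percent` is made up for illustration and is not part of the demo code.

```c
#include <stdint.h>

/* Percentage of packets lost, given how many were sent and how many
 * came back. Returns 0.0 if nothing was sent or the counters are
 * inconsistent (more received than sent). */
double drop_percent(uint32_t sent, uint32_t received)
{
    if (sent == 0 || received > sent) {
        return 0.0;
    }
    return 100.0 * (double)(sent - received) / (double)sent;
}
```

With these definitions, 4 lost replies out of 10,000 pings works out to roughly 0.04%.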

Again, almost all drops happen on TX from the board.  I have verified that the OWN bit is not timing out on TX, and that there are no faults.  I now trap the following (and none are happening):  EnetDmaTx.Tx.TxDesc0.ES | EnetDmaTx.Tx.TxDesc0.FF | EnetDmaTx.Tx.TxDesc0.IPE | EnetDmaTx.Tx.TxDesc0.JT | EnetDmaTx.Tx.TxDesc0.IHE | EnetDmaTx.Tx.TxDesc0.LSC | EnetDmaTx.Tx.TxDesc0.NC | EnetDmaTx.Tx.TxDesc0.LC | EnetDmaTx.Tx.TxDesc0.EC | EnetDmaTx.Tx.TxDesc0.ED | EnetDmaTx.Tx.TxDesc0.UF | EnetDmaTx.Tx.TxDesc0.DB.

I have noticed the following flags are set after every period: ETH_DMA_IT_TBU and ETH_DMA_IT_ET.

ALSO: As a bonus, I tested in 10MBit/s mode.

In 10mbit full duplex, with regular network traffic, the error ratios are higher! (almost 10x higher!)

In 10mbit half duplex, even with a chatty network, there are fewer dropped packets.

These rates are unacceptably high, and I have no idea how to track the lost packet bug down further.

I have checked my network, by pinging another device on the same switch, and it has 0 dropped packets.

Could it be the PHY on the P407?  Maybe the stack?  Does anyone else know if the MAC in the STM32F407 handles lots of network chatter gracefully?  Why am I getting "Transmit buffer unavailable interrupts"  after every TX packet?

Summary:
* Drop ratio is affected by link speed and half/full duplex.
* Dropped packets are almost exclusively TX.
* Packets only drop in presence of extra network traffic going into the NIC.
* Under normal conditions, dropped packet ratio is around .04% (with ~40-50 broadcast messages across the network per second)
* Only dropped packets when talking to the STM32-P407.  Other devices do not exhibit dropped packets.



I switched to my own IP stack now, and am getting the same results.  I am going to try using the tx and rx functions in stm32f4xx.h/.c instead of ethernet.c  - I just can't find any example code for that >.<.


Any help or ideas would be greatly appreciated!!!


Thanks,
Charles

cnlohr

Just for any poor soul who may happen across this thread, I found my biggest problem:  The LCOL (or LC) flag was set.  I don't know what late collisions are, but it seems if you actively search for any time the ES (or Error Summary) flag is set, almost all of the failures are due to a Late Collision.
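
A rough sketch of the kind of check described here: once the DMA has released a TX descriptor, look at the Error Summary bit in its status word and report which error caused it. The bit positions are assumed from the TDES0 layout in the STM32F4 reference manual and should be checked against it; the macro names and `tx_error_name` are hypothetical, not taken from ethernet.c or the ST library.

```c
#include <stddef.h>
#include <stdint.h>

/* TDES0 status bits. Positions are an assumption based on the
 * STM32F4 reference manual; verify before relying on them. */
#define TDES0_OWN (1u << 31) /* descriptor still owned by the DMA */
#define TDES0_ES  (1u << 15) /* error summary */
#define TDES0_LCA (1u << 11) /* loss of carrier */
#define TDES0_NC  (1u << 10) /* no carrier */
#define TDES0_LCO (1u << 9)  /* late collision ("LC" / "LCOL" above) */
#define TDES0_EC  (1u << 8)  /* excessive collisions */
#define TDES0_UF  (1u << 1)  /* underflow */

/* Returns a short name for the first error found in a completed TX
 * descriptor, or NULL if the status is not valid yet or no error is
 * flagged. */
const char *tx_error_name(uint32_t tdes0)
{
    if (tdes0 & TDES0_OWN) return NULL;    /* DMA has not finished */
    if (!(tdes0 & TDES0_ES)) return NULL;  /* no error recorded */
    if (tdes0 & TDES0_LCO) return "late collision";
    if (tdes0 & TDES0_EC)  return "excessive collisions";
    if (tdes0 & TDES0_LCA) return "loss of carrier";
    if (tdes0 & TDES0_NC)  return "no carrier";
    if (tdes0 & TDES0_UF)  return "underflow";
    return "other error";
}
```

The status word is only meaningful after the DMA clears OWN, so a check made right after queueing the frame can miss an error that is written back later.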

Now - I just have to figure out why it seems as though my receive buffer is one packet.

Jan K.

Hello,

First off: yes, I know, this topic is already quite old. But I happened to have the same problem and found this post. It helped quite a lot, but it still took me the better part of a day to actually solve it.

So here is what happened to me:
- Randomly dropped packets, higher droprate at higher link load
- Ethernet configured in CubeMX to use the LAN8742 Phy (I am using a Nucleo F2 Board)
- Autonegotiation enabled, advertising 100M/10M full/half (all capabilities of the Phy)
- After reading this post, I also noticed that dropped frames had the 'late collision' error set

According to my research, this bit is exclusively used in half-duplex connections. My computer told me the autonegotiation resulted in 100M full-duplex, so the bit should never be set.
I also learned that late collisions are a sign of duplex-mismatch as the full-duplex partner does not check for activity before sending. This is exactly what happened.
So, how did I get a duplex mismatch while using autonegotiation with full-duplex advertised by both sides?

It was CubeMX (of course!). (Version 4.22.1)

When I created the project, I pretty much used the default values under Configuration/ETH (LAN8742 was already set as default). After the autonegotiation, the CubeMX code checks a status register of the Phy for the result. The address of this status register (and the position of some bits) can be adjusted under ETH configuration/Advanced Parameters/Extended: External PHY Configuration. I just kept them at the default value as the correct Phy was already selected. However, the default address was 0x10 (which is the EDPD NLP / Crossover Time Register in the LAN8720). The correct register address would be 0x1F (PHY Special Control/Status Register). Thus, the code read a wrong autonegotiation result and configured everything as half-duplex.

For a LAN8720, the correct values are:
PHY special control/status register offset: 0x1F
PHY Speed mask: 0x0004
PHY Duplex mask: 0x0010
PHY Interrupt Source Flag register Offset: 0x1D
PHY Link down interrupt: 0x0010
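
To illustrate how the speed and duplex masks above are applied, here is a minimal sketch that decodes a value read from register 0x1F. It assumes, from the speed indication field in the LAN8720 datasheet, that the speed bit is set for 10 Mbit/s and clear for 100 Mbit/s. The macro, struct, and function names are made up for this example and are not the CubeMX-generated ones; the MDIO read itself is left out because it depends on the HAL in use.

```c
#include <stdbool.h>
#include <stdint.h>

#define PHY_SCSR_OFFSET 0x1Fu   /* PHY special control/status register */
#define PHY_SPEED_MASK  0x0004u /* assumed: set when the link is 10 Mbit/s */
#define PHY_DUPLEX_MASK 0x0010u /* set when the link is full duplex */

struct link_mode {
    bool     full_duplex;
    unsigned speed_mbit;
};

/* Decode the negotiated link mode from the status register value. */
struct link_mode decode_link_mode(uint16_t scsr)
{
    struct link_mode m;
    m.full_duplex = (scsr & PHY_DUPLEX_MASK) != 0;
    m.speed_mbit  = (scsr & PHY_SPEED_MASK) ? 10u : 100u;
    return m;
}
```

If the value comes from the wrong register and bit 4 happens to be clear, the same decode reports half duplex, which is the mismatch described above.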

Hoping this will save someone a few hours,
Jan

cf

Thanks Jan,

You definitely saved me some hours.

A little additional information:

For the KS8721B PHY on the Olimex STM32-P407, the settings are as Jan states, except:

Interrupt Source Flag Register Offset = 0x1B
PHY interrupt Link down interrupt = 0x0600

Carsten F