MOD-ENC28J60 and Linux (Debian/GNU)

Started by leo, December 03, 2012, 05:43:03 pm


lorenzo

Hi viniciusfre,

yes, I think you are right. It is better to use spi_transfer_buf instead of data. I'm going to try this code modification this morning and will let you know if anything changes.

RX still has problems; it may be related to DMA.

Thanks.

lorenzo

Hi all,

thanks to the help of viniciusfre, my enc28j60 is working fine now!
It looks like the data buffer was causing the problems and corrupting some data in the packet. I think the data buffer is valid only locally, so by the time it reaches the SPI layer it is no longer valid...
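
A minimal user-space sketch of the idea behind this fix, assuming the problem is buffer lifetime: the caller's bytes are copied into a buffer owned by the driver state before the transfer is queued, so the queued data stays valid even if the caller reuses its own buffer. Only spi_transfer_buf is a name from this thread; enc_priv, SPI_TRANSFER_SIZE, mock_spi_queue and queued_matches are hypothetical stand-ins, not the real driver code. (The Linux SPI core also expects transfer buffers to be DMA-safe, which a caller's stack buffer generally is not.)

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SPI_TRANSFER_SIZE 64   /* hypothetical size, for this sketch only */

/* Hypothetical stand-in for the driver's private state. */
struct enc_priv {
    uint8_t spi_transfer_buf[SPI_TRANSFER_SIZE]; /* lives as long as the device */
};

/* Hypothetical stand-in for queuing an SPI write: it only records the
 * pointer and length, the way an asynchronous or DMA transfer would. */
static const uint8_t *queued_buf;
static size_t queued_len;

static void mock_spi_queue(const uint8_t *buf, size_t len)
{
    queued_buf = buf;
    queued_len = len;
}

/* Copy the caller's bytes into the long-lived buffer, then queue that
 * buffer instead of the caller's pointer. Returns 0 on success, -1 if
 * the data does not fit. */
static int enc_write_buf(struct enc_priv *priv, const uint8_t *data, size_t len)
{
    if (len > sizeof(priv->spi_transfer_buf))
        return -1;
    memcpy(priv->spi_transfer_buf, data, len);
    mock_spi_queue(priv->spi_transfer_buf, len);
    return 0;
}

/* Checking helper: does the queued transfer still hold these bytes? */
static int queued_matches(const uint8_t *expect, size_t len)
{
    return queued_len == len && memcmp(queued_buf, expect, len) == 0;
}
```

If the caller overwrites its own buffer after enc_write_buf() returns, queued_matches() should still report the original bytes, because the queued pointer refers to spi_transfer_buf rather than to the caller's memory.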

You can find all the files you need here: https://dl.dropbox.com/u/96533863/enc28j60_v1.tar.gz

Please try it and let me know!

Thank you again!
Lorenzo

lorenzo

Hi all,

please use this version: https://dl.dropbox.com/u/96533863/enc28j60_v2.tar.gz

I removed the dmesg debug output, so the network is faster and more usable...

Thanks.
Lorenzo

vinifr

Hi Lorenzo.

Nice! ;D This is good news, because I plan to buy this module for my project.

I was not sure, but I suspected that 'u8 *data' was being overwritten, because it is a pointer variable.

Good!  ;)

Try submitting this patch to linux-mainline or linux-sunxi.

lorenzo

Hi viniciusfre,

do you know who the maintainer of the linux-sunxi kernel is?

Thank you again for the help ;) :D

Lorenzo


leo

January 30, 2013, 12:35:51 pm #36 Last Edit: January 30, 2013, 12:38:46 pm by leo
Awesome work!  Thanks a lot!
I tested the patch today with the latest 3.0 and 3.4 kernels and your script.bin.
At the beginning everything worked like a charm.
Then I started testing with some massive traffic and huge file transfers and ran into trouble.
I tried a flood ping.

At the beginning everything looks good.


eth0      Link encap:Ethernet  HWaddr de:5d:5c:bd:5b:71 
          inet addr:192.168.192.215  Bcast:192.168.192.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:74585 errors:0 dropped:40 overruns:0 frame:0
          TX packets:74131 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:7606752 (7.2 MiB)  TX bytes:7264290 (6.9 MiB)
          Interrupt:28


but then...


[  125.900000] [spi]: drivers/spi/spi_sunxi.c(L899) cpu tx data time out!
[  133.050000] ------------[ cut here ]------------
[  133.050000] WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0x2a8/0x2d0()
[  133.060000] NETDEV WATCHDOG: eth0 (enc28j60): transmit queue 0 timed out
[  133.060000] Modules linked in: enc28j60
[  133.070000] [<c003a450>] (unwind_backtrace+0x0/0xfc) from [<c00582a0>] (warn_slowpath_common+0x4c/0x64)
[  133.080000] [<c00582a0>] (warn_slowpath_common+0x4c/0x64) from [<c005834c>] (warn_slowpath_fmt+0x30/0x40)
[  133.090000] [<c005834c>] (warn_slowpath_fmt+0x30/0x40) from [<c0438950>] (dev_watchdog+0x2a8/0x2d0)
[  133.100000] [<c0438950>] (dev_watchdog+0x2a8/0x2d0) from [<c0064b20>] (run_timer_softirq+0x124/0x280)
[  133.100000] [<c0064b20>] (run_timer_softirq+0x124/0x280) from [<c005e4dc>] (__do_softirq+0x88/0x118)
[  133.110000] [<c005e4dc>] (__do_softirq+0x88/0x118) from [<c005e760>] (irq_exit+0x60/0x68)
[  133.120000] [<c005e760>] (irq_exit+0x60/0x68) from [<c002e034>] (asm_do_IRQ+0x34/0x84)
[  133.130000] [<c002e034>] (asm_do_IRQ+0x34/0x84) from [<c0033e48>] (__irq_svc+0x48/0x12c)
[  133.140000] Exception stack(0xc06c7f78 to 0xc06c7fc0)
[  133.140000] 7f60:                                                       00000000 00000001
[  133.150000] 7f80: c06c7fc0 00000000 c06c6000 c0711b44 c06cb414 c06cb40c 40004059 413fc082
[  133.160000] 7fa0: 00000000 00000000 00000000 c06c7fc0 c0035448 c003544c 60000013 ffffffff
[  133.170000] [<c0033e48>] (__irq_svc+0x48/0x12c) from [<c003544c>] (default_idle+0x24/0x28)
[  133.180000] [<c003544c>] (default_idle+0x24/0x28) from [<c00358b0>] (cpu_idle+0x80/0xb4)
[  133.180000] [<c00358b0>] (cpu_idle+0x80/0xb4) from [<c00089d8>] (start_kernel+0x290/0x2e0)
[  133.190000] [<c00089d8>] (start_kernel+0x290/0x2e0) from [<40008054>] (0x40008054)
[  133.200000] ---[ end trace 79ba4c5eeb54235c ]---
[  133.210000] net eth0: link down
[  133.220000] net eth0: link up - Half duplex



eth0      Link encap:Ethernet  HWaddr de:5d:5c:bd:5b:71 
          inet addr:192.168.192.215  Bcast:192.168.192.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:97961 errors:0 dropped:41 overruns:0 frame:0
          TX packets:96774 errors:1 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:9990404 (9.5 MiB)  TX bytes:9483216 (9.0 MiB)
          Interrupt:28


The interface is still working, but I get more link down and link up messages, and the TX error count keeps climbing.


[ 1153.010000] net eth0: link down
[ 1153.060000] net eth0: link up - Half duplex
[ 1217.300000] [spi]: drivers/spi/spi_sunxi.c(L899) cpu tx data time out!
[ 1225.010000] net eth0: link down
[ 1225.060000] net eth0: link up - Half duplex
[ 1239.410000] [spi]: drivers/spi/spi_sunxi.c(L899) cpu tx data time out!
[ 1245.010000] net eth0: link down
[ 1245.060000] net eth0: link up - Half duplex
[ 1277.360000] [spi]: drivers/spi/spi_sunxi.c(L899) cpu tx data time out!
[ 1285.010000] net eth0: link down
[ 1285.060000] net eth0: link up - Half duplex
[ 1367.720000] [spi]: drivers/spi/spi_sunxi.c(L899) cpu tx data time out!


Is it just a problem with my hardware? Could you please check whether you get the same results when flood pinging or sending big files to the board?

Kind regards

Leo

vinifr

January 30, 2013, 01:32:33 pm #37 Last Edit: January 30, 2013, 01:53:21 pm by viniciusfre
Hi Leo,

This error does not come from the enc28j60.c driver, but from spi_sunxi.c:

spin_lock_irqsave(&aw_spi->lock, flags);
for(; tx_len > 0; --tx_len) {
    writeb(*tx_buf++, base_addr + SPI_TXDATA_REG);
}
spin_unlock_irqrestore(&aw_spi->lock, flags);
while(aw_spi_query_txfifo(base_addr)&&(--poll_time > 0) );/* txFIFO counter */
if(poll_time <= 0) {
    spi_wrn("cpu tx data time out!\n");
}


It means that a timeout happened: poll_time reached zero while the TX FIFO still held data.

This happens because DMA is not being used:

if(t->len <= BULK_DATA_BOUNDARY) {


Maybe now you can increase spi speed, .max_speed_hz = 20000000,
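
To make the timeout condition in the polling loop quoted above easier to follow, here is a small user-space model of it. It is only a sketch: mock_fifo and mock_query_txfifo are hypothetical stand-ins for the hardware and for aw_spi_query_txfifo(), and the drain rate is an invented parameter. The loop has the same shape as the driver's: it spins while the FIFO still reports queued bytes and the poll budget has not run out.

```c
#include <stdbool.h>

/* Hypothetical model of the TX FIFO: it holds `level` bytes and shifts
 * one byte out every `polls_per_byte` status reads. */
struct mock_fifo {
    unsigned int level;
    unsigned int polls_per_byte;
    unsigned int polls_seen;
};

/* Stand-in for aw_spi_query_txfifo(): returns the bytes still queued. */
static unsigned int mock_query_txfifo(struct mock_fifo *f)
{
    if (f->level > 0 && ++f->polls_seen >= f->polls_per_byte) {
        f->polls_seen = 0;
        f->level--;
    }
    return f->level;
}

/* Same shape as the loop in spi_sunxi.c: spin until the FIFO is empty
 * or the poll budget runs out. Returns true if the budget ran out
 * first, which is the case that prints "cpu tx data time out!". */
static bool tx_poll_times_out(struct mock_fifo *f, int poll_time)
{
    while (mock_query_txfifo(f) && (--poll_time > 0))
        ;
    return poll_time <= 0;
}
```

In this model, a FIFO that drains within the poll budget makes tx_poll_times_out() return false, while a FIFO that drains too slowly for the budget makes it return true. Whether the real hardware stalls for the same reason is not established in this thread.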

lorenzo

Hi viniciusfre, Leo

right viniciusfre, I also get that kind of error, which slows down the network performance.
Do you think that sending over SPI without DMA is the issue, or is the code correct?

Why wouldn't it use DMA?

Lorenzo

vinifr

January 30, 2013, 02:01:35 pm #39 Last Edit: January 31, 2013, 07:43:17 am by vinifr
The platform driver spi_sunxi.c uses DMA only when t->len is greater than BULK_DATA_BOUNDARY (64):

if(t->len <= BULK_DATA_BOUNDARY) {
    ...
}
else {
    //spi_msg(" tx -> by dma\n");
    #if defined(CONFIG_SUN4I_SPI_NDMA) || defined(CONFIG_SUN5I_SPI_NDMA)
    aw_spi_sel_dma_type(0, base_addr);
    #else
    aw_spi_sel_dma_type(1, base_addr);
    #endif

Maybe now you can increase spi speed, .max_speed_hz = 20000000,
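
The length check described above can be written as a tiny predicate, which makes it easy to see which transfers take which path. This is only an illustration: uses_dma is a hypothetical helper, not a function in spi_sunxi.c, and the boundary is passed as a parameter so that other thresholds can be tried.

```c
#include <stdbool.h>
#include <stddef.h>

#define BULK_DATA_BOUNDARY 64  /* value quoted in this thread */

/* Returns true when a transfer of `len` bytes would take the DMA
 * branch, i.e. when the `len <= boundary` check in the driver fails. */
static bool uses_dma(size_t len, size_t boundary)
{
    return !(len <= boundary);
}
```

With the boundary at 64, short register accesses of a few bytes stay on the CPU-driven path, and only transfers of 65 bytes or more go through DMA.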

There is still another problem. I think you should also keep this line:
memcpy(data, &rx_buf[SPI_OPLEN], len);
because the first byte (data[0]) can be a dummy byte. RX packets:74585 errors:0 dropped:40 overruns:0 frame:0... 40 packets were dropped!

The ENC28J60 datasheet says nothing about it, but the driver's author may be right.

lorenzo

Hi vinifr,


memcpy(data, &rx_buf[SPI_OPLEN], len);

This code could be right when using a single message for TX and RX, as in the original code. Now rx_buf should contain only RX data: I tried this code, but it leads to RX errors and no valid frames are received.

Now the problem seems to be related to SPI TX packets and timeouts...

I have already updated the max speed to 20 MHz.

Lorenzo

vinifr

Quote from: lorenzo on January 31, 2013, 05:54:35 pm
Hi vinifr,


memcpy(data, &rx_buf[SPI_OPLEN], len);

This code could be right when using a single message for TX and RX, as in the original code. Now rx_buf should contain only RX data: I tried this code, but it leads to RX errors and no valid frames are received.

Now the problem seems to be related to SPI TX packets and timeouts...

I have already updated the max speed to 20 MHz.

Lorenzo


Ok. The correct code is memcpy(data, rx_buf, len);

Strange. Did you have problems with dropped packets like Leo? Remember, the enc28j60 is 10BASE-T, so your download rate must be less than 10 Mbit/s.
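
For a rough sense of the numbers involved, here is a back-of-the-envelope helper. It is only a sketch: transfer_time_us is a hypothetical name, and it ignores all framing, preamble, SPI opcode and driver overhead, so real figures will be worse.

```c
/* Microseconds needed to move `bytes` over a serial link running at
 * `bits_per_sec`, ignoring all framing and protocol overhead. */
static double transfer_time_us(unsigned int bytes, double bits_per_sec)
{
    return (bytes * 8.0) / bits_per_sec * 1e6;
}
```

For a 1518-byte frame (the largest standard untagged Ethernet frame) this gives about 1214 microseconds on the 10 Mbit/s wire and about 607 microseconds over SPI at the 20 MHz set in this thread. Under these idealized assumptions the SPI link has only about a factor of two of headroom over the wire rate.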

Do you guys have any other ideas? :P

lorenzo

Hi vinifr,

yes, I have a similar problem too.
I modified spi_sunxi this way:


if(tx_buf){
        if(t->len <= 0) { //BULK_DATA_BOUNDARY) {


and things got better: I have a more stable and reliable connection... Removing the non-DMA TX path now generates these messages:


[ 2813.780000] dma0: IRQ with no loaded buffer?
[ 2821.010000] net eth1: link down
[ 2821.070000] net eth1: link up - Half duplex
[ 2837.240000] dma0: IRQ with no loaded buffer?
[ 2845.010000] net eth1: link down
[ 2845.060000] net eth1: link up - Half duplex


It seems that non-DMA TX times out quite often, and DMA transfers lead to DMA IRQ issues...
I don't know how to overcome this problem... Any ideas?

Lorenzo

vinifr

February 01, 2013, 12:22:54 am #43 Last Edit: February 01, 2013, 03:15:29 pm by vinifr
Hi,

You enabled only TX DMA, but not RX DMA. I found this snippet of code:
case SW_DMALOAD_NONE:
    printk(KERN_ERR "dma%d: IRQ with no loaded buffer?\n",
           chan->number);
    break;

And your if accepts an empty buffer:
if(t->len <= 0) { //BULK_DATA_BOUNDARY) {

About the TX timeout, I suggest this change:

aw_spi_start_xfer(base_addr);
// write
if(tx_buf){
    if(t->len <= BULK_DATA_BOUNDARY) {
        unsigned temp_tx_len = tx_len;
        unsigned int poll_time = 0x7ffff;
        //spi_msg(" tx -> by ahb\n");
        spin_lock_irqsave(&aw_spi->lock, flags);
        for(; tx_len > 0; --tx_len) {
            writeb(*tx_buf++, base_addr + SPI_TXDATA_REG);
        }
        spin_unlock_irqrestore(&aw_spi->lock, flags);
        while( temp_tx_len &&(--poll_time > 0) )/* txFIFO counter */
        {
            if(aw_spi_query_txfifo(base_addr))
            {
                --temp_tx_len;
                poll_time++;
            }
        }
        if(poll_time <= 0) {
            spi_wrn("cpu tx data time out!\n");
        }


Wow, will this never work? But do not give up, okay?  ;D ;)

lorenzo

Hi vinifr,

sorry, I had some other work to do and only got to try your patch today:


aw_spi_start_xfer(base_addr);
// write
if(tx_buf){
    if(t->len <= BULK_DATA_BOUNDARY) {
        unsigned temp_tx_len = tx_len;
        unsigned int poll_time = 0x7ffff;
        //spi_msg(" tx -> by ahb\n");
        spin_lock_irqsave(&aw_spi->lock, flags);
        for(; tx_len > 0; --tx_len) {
            writeb(*tx_buf++, base_addr + SPI_TXDATA_REG);
        }
        spin_unlock_irqrestore(&aw_spi->lock, flags);
        while( temp_tx_len &&(--poll_time > 0) )/* txFIFO counter */
        {
            if(aw_spi_query_txfifo(base_addr))
            {
                --temp_tx_len;
                poll_time++;
            }
        }
        if(poll_time <= 0) {
            spi_wrn("cpu tx data time out!\n");
        }


This patch can't work, because aw_spi_query_txfifo(base_addr) returns 0 when the TX queue has no more bytes to send. So if there are 0 bytes in the queue, the TX is done. This code always waits for the timeout before leaving the loop, and that is not good.
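
The behaviour described above can be reproduced in a small user-space model. This is a sketch under assumptions: mock_fifo and mock_query_txfifo are hypothetical stand-ins for the hardware and for aw_spi_query_txfifo(), modelled as returning the number of bytes still queued, and both functions count polls instead of touching any registers. drain_loop_polls has the shape of the original loop (stop as soon as the FIFO reports 0), while patched_loop_polls has the shape of the proposed patch.

```c
/* Hypothetical model of the TX FIFO: it holds `level` bytes and shifts
 * one byte out every `polls_per_byte` status reads. */
struct mock_fifo {
    unsigned int level;
    unsigned int polls_per_byte;
    unsigned int polls_seen;
};

/* Stand-in for aw_spi_query_txfifo(): returns the bytes still queued. */
static unsigned int mock_query_txfifo(struct mock_fifo *f)
{
    if (f->level > 0 && ++f->polls_seen >= f->polls_per_byte) {
        f->polls_seen = 0;
        f->level--;
    }
    return f->level;
}

/* Shape of the original loop: leave as soon as the FIFO is empty.
 * Returns the number of status polls performed. */
static unsigned int drain_loop_polls(struct mock_fifo *f, unsigned int poll_time)
{
    unsigned int polls = 0;

    for (;;) {
        polls++;
        if (mock_query_txfifo(f) == 0)
            break;                  /* FIFO drained: transfer done */
        if (--poll_time == 0)
            break;                  /* poll budget exhausted */
    }
    return polls;
}

/* Shape of the patched loop: it only makes progress while the FIFO is
 * non-empty, so once the FIFO reports 0 with temp_tx_len still above
 * zero, it spins until the poll budget is gone. */
static unsigned int patched_loop_polls(struct mock_fifo *f, unsigned int tx_len,
                                       unsigned int poll_time)
{
    unsigned int temp_tx_len = tx_len;
    unsigned int polls = 0;

    while (temp_tx_len && (--poll_time > 0)) {
        polls++;
        if (mock_query_txfifo(f)) {
            --temp_tx_len;
            poll_time++;
        }
    }
    return polls;
}
```

With a 2-byte transfer that drains one byte per poll and a budget of 1000, the first loop stops after 2 polls in this model, while the patched loop performs 1000 polls, using up its whole budget before giving up.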

It seems that the TX timeout errors were triggered by 2-byte SPI transfers, which should actually never happen... Where do the 2-byte transfers come from?

Lorenzo