A20 micro eth0 problem

Started by fuzero, December 05, 2017, 04:13:45 PM

fuzero

Hi all,
I have a problem on my A20-micro running this kernel:
# uname -a
Linux a20-116 3.4.103-00033-g9a1cd034181a-dirty #44 SMP PREEMPT Fri Mar 10 08:50:33 EET 2017 armv7l GNU/Linux

Sometimes, at random, eth0 stops working and I cannot ping the machine, although everything still works locally.

/var/log/kern.log says:


Dec  5 13:00:14 localhost kernel: [17688.240063] ------------[ cut here ]------------
Dec  5 13:00:14 localhost kernel: [17688.250884] WARNING: at net/sched/sch_generic.c:256 dev_watchdog+0x2b8/0x2c4()
Dec  5 13:00:14 localhost kernel: [17688.263966] NETDEV WATCHDOG: eth0 (sunxi_emac): transmit queue 0 timed out
Dec  5 13:00:14 localhost kernel: [17688.272706] Modules linked in: disp_ump mali_drm drm sunxi_can can_dev spidev spi_sun7i sun4i_csi0 videobuf_dma_contig videobuf_core pwm_sunxi g_ether sun4i_keyboard gpio_sunxi sunxi_cedar_mod cdc_acm sun4i_ts mali gt2005 nand nand_ecc nand_bch sg bch nand_ids mtd sunxi_gmac ump hdmi lcd disp cfbfillrect cfbimgblt cfbcopyarea
Dec  5 13:00:14 localhost kernel: [17688.307381] [<c0015418>] (unwind_backtrace+0x0/0x12c) from [<c0037464>] (warn_slowpath_common+0x54/0x64)
Dec  5 13:00:14 localhost kernel: [17688.315671] [<c0037464>] (warn_slowpath_common+0x54/0x64) from [<c00374a4>] (warn_slowpath_fmt+0x30/0x40)
Dec  5 13:00:14 localhost kernel: [17688.323423] [<c00374a4>] (warn_slowpath_fmt+0x30/0x40) from [<c047d06c>] (dev_watchdog+0x2b8/0x2c4)
Dec  5 13:00:14 localhost kernel: [17688.331384] [<c047d06c>] (dev_watchdog+0x2b8/0x2c4) from [<c0042a90>] (run_timer_softirq+0x11c/0x26c)
Dec  5 13:00:14 localhost kernel: [17688.339246] [<c0042a90>] (run_timer_softirq+0x11c/0x26c) from [<c003d6fc>] (__do_softirq+0xbc/0x14c)
Dec  5 13:00:14 localhost kernel: [17688.346143] [<c003d6fc>] (__do_softirq+0xbc/0x14c) from [<c003dbfc>] (irq_exit+0x90/0x94)
Dec  5 13:00:14 localhost kernel: [17688.352767] [<c003dbfc>] (irq_exit+0x90/0x94) from [<c000f5ac>] (handle_IRQ+0x60/0xb0)
Dec  5 13:00:14 localhost kernel: [17688.359935] [<c000f5ac>] (handle_IRQ+0x60/0xb0) from [<c00084e4>] (gic_handle_irq+0x28/0x58)
Dec  5 13:00:14 localhost kernel: [17688.367007] [<c00084e4>] (gic_handle_irq+0x28/0x58) from [<c000e900>] (__irq_svc+0x40/0x70)
Dec  5 13:00:14 localhost kernel: [17688.370771] Exception stack(0xef065f88 to 0xef065fd0)
Dec  5 13:00:14 localhost kernel: [17688.377647] 5f80:                   ffffffed 00000001 0fffc000 00000000 c086fd08 c0577a30
Dec  5 13:00:14 localhost kernel: [17688.384539] 5fa0: ef064000 ef064000 c082c710 ef064018 ef064000 00000000 001a888d ef065fd0
Dec  5 13:00:14 localhost kernel: [17688.388293] 5fc0: c000f8ec c000f8f0 60000013 ffffffff
Dec  5 13:00:14 localhost kernel: [17688.395187] [<c000e900>] (__irq_svc+0x40/0x70) from [<c000f8f0>] (default_idle+0x2c/0x30)
Dec  5 13:00:14 localhost kernel: [17688.402069] [<c000f8f0>] (default_idle+0x2c/0x30) from [<c000fbc4>] (cpu_idle+0xe4/0x118)
Dec  5 13:00:14 localhost kernel: [17688.407945] [<c000fbc4>] (cpu_idle+0xe4/0x118) from [<4056a0b4>] (0x4056a0b4)
Dec  5 13:00:14 localhost kernel: [17688.411558] ---[ end trace 17ff9fd797a92405 ]---
Dec  5 13:00:14 localhost kernel: [17688.421336] sunxi_emac sunxi_emac.0: tx time out, resetting emac
Dec  5 13:00:23 localhost kernel: [17698.241090] sunxi_emac sunxi_emac.0: tx time out, resetting emac
Dec  5 13:00:33 localhost kernel: [17708.240632] sunxi_emac sunxi_emac.0: tx time out, resetting emac
Dec  5 13:00:43 localhost kernel: [17718.240208] sunxi_emac sunxi_emac.0: tx time out, resetting emac
Dec  5 13:00:53 localhost kernel: [17728.239756] sunxi_emac sunxi_emac.0: tx time out, resetting emac
Dec  5 13:01:03 localhost kernel: [17738.239324] sunxi_emac sunxi_emac.0: tx time out, resetting emac
... and so on until I reset.


Does anyone have an idea what the problem is?
Thanks in advance.


fuzero

New test done:
the problem is I2C!
All the problems start when I send a command like
i2cset -y -f 1 0x58 0x10 0x01
which I use to talk to the MOD-IO on the UEXT connector.

Without I2C commands, the board has now been up for one week with no problems.

I also tried a Python script that sends the same I2C command, but it has the same problem.
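
For reference, the i2cset call above can be broken down argument by argument. This is only a minimal Python sketch, assuming the i2c-tools `i2cset` binary is on the PATH; the helper names `build_i2cset_args` and `send` are hypothetical and exist only for illustration. What register 0x10 and value 0x01 mean is defined by the MOD-IO firmware and is not decoded here.

```python
import subprocess


def build_i2cset_args(bus, address, register, value):
    """Build the argument list for: i2cset -y -f <bus> <address> <register> <value>."""
    return [
        "i2cset",
        "-y",                 # skip the interactive confirmation prompt
        "-f",                 # force access even if a kernel driver claims the address
        str(bus),             # I2C bus number (1 in the command above)
        f"0x{address:02x}",   # device address (0x58 in the command above)
        f"0x{register:02x}",  # data address / register (0x10)
        f"0x{value:02x}",     # byte to write (0x01)
    ]


def send(bus, address, register, value):
    """Run i2cset; raises subprocess.CalledProcessError if the tool reports failure."""
    subprocess.run(build_i2cset_args(bus, address, register, value), check=True)
```

Calling `send(1, 0x58, 0x10, 0x01)` runs the same command as the one quoted above, so it would be expected to trigger the same behaviour on the board.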



LubOlimex

It seems that the command toggles a relay. Does the whole board turn off? What do you have attached to the relay?

Switching very large loads with an electromechanical relay causes heavy electrical noise, and you would probably face a severe version of the typical problems associated with such relays. I would recommend that you either choose another method to control your load or, even better, think about how to step the load down to values that allow easier and more reliable switching via a relay. Inductive AC and DC loads are also hard on electromechanical relays, where welded contacts cause problems. The usual way to address this is to place clamp diodes or Zener diodes across the terminals of the load. However, this lengthens the turn-off delay and does not completely eliminate sparking at the relay contacts. I am sure there are many solutions suggested online. There are a number of workarounds and partial solutions, but they depend on your exact design and setup. There is no solution that I can recommend 100%; you will have to try different things.

If just the WIFI hangs, there might be software means to restart just that part of the board.
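
For the eth0 hang reported in this thread, one possible software approach is to watch the kernel log for the repeated timeout message and bounce the interface. The following is a minimal sketch under assumptions, not a tested fix: the function names and the threshold of 3 are hypothetical, it assumes the `ip` tool is available and the script runs as root, and whether cycling the link actually recovers the EMAC is not confirmed anywhere in this thread.

```python
import re
import subprocess

# Matches the driver message from the kern.log excerpt earlier in this thread.
TIMEOUT_RE = re.compile(r"sunxi_emac sunxi_emac\.0: tx time out, resetting emac")


def count_tx_timeouts(log_lines):
    """Return how many of the given log lines report an EMAC transmit timeout."""
    return sum(1 for line in log_lines if TIMEOUT_RE.search(line))


def should_restart(log_lines, threshold=3):
    """Decide whether the given lines contain enough timeouts to bounce the link.

    The threshold is an arbitrary illustrative value, not a recommendation.
    """
    return count_tx_timeouts(log_lines) >= threshold


def restart_interface(ifname="eth0"):
    """Bring the interface down and up again (requires root)."""
    subprocess.run(["ip", "link", "set", ifname, "down"], check=True)
    subprocess.run(["ip", "link", "set", ifname, "up"], check=True)
```

A periodic job could read the last few dozen lines of /var/log/kern.log, pass them to `should_restart`, and call `restart_interface("eth0")` when it returns true. This only treats the symptom; it does nothing about the electrical noise discussed above.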

Best regards,
Lub/OLIMEX
Technical support and documentation manager at Olimex

fuzero

Thanks for your answer.
The problem is the load: I have attached a bell, and the heavy noise it generates may be what causes the problem.

I sent the same I2C command to a relay with nothing attached, and it does not cause the problem.
Now I will try another solution for my load.

Best regards.
Fulvio