Olinuxino A20 Lime 2 problems

Started by ealex, April 15, 2016, 11:32:41 AM

ealex

Hello

I just got a new Olinuxino-Lime2 and I have some problems with it.
From the start - I accidentally applied 7.5V on the 5V power input and that might be causing some of them.

Test setup:
- Lime2 board without NAND
- 6600mAh battery pack from Olimex
- 16GB uSD card
- FTDI based USB-USART converter
- 5V/2.5A psu
- lab power supply
- "performance" governor selected
- heatsink on the cpu
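
A quick way to confirm which governor is actually active is to read the standard Linux cpufreq sysfs file. This is only a sketch: the function name is made up for illustration, and it assumes the kernel exposes cpufreq for cpu0 under the usual /sys/devices/system/cpu path.

```python
from pathlib import Path


def read_governor(cpu=0, sysfs_root="/sys/devices/system/cpu"):
    """Return the active cpufreq governor for one CPU, or None if unavailable.

    Hypothetical helper: the path layout is the usual cpufreq sysfs one,
    but it is absent when the kernel has no cpufreq driver loaded.
    """
    path = Path(sysfs_root) / f"cpu{cpu}" / "cpufreq" / "scaling_governor"
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    print(read_governor(0))
```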

Before applying 7.5V:

1. got the board powered from the battery ( one that i got with an older olinuxino-micro )
2. tried latest ARCH-linux for Lime2 - the kernel would freeze when AXP209 drivers were loaded
3. tried an older ARCH-linux that was running perfectly on A20-Micro, the kernel would freeze when AXP209 drivers were loaded
4. tried a 4.4.6 custom build kernel that is currently running in my A20-micro "nas" - it would work properly as all AXP209 support is removed from the kernel / drivers
5. tried the official Debian release - it worked properly, AXP209 drivers work ok, etc
=> mainline kernel support for AXP is useless ( i plan on making a driver for this )

All the tests were basic - boot, login, check that i can login via SSH and download a file.
The board was powered mainly from the battery and from a 5V lab psu when charging the battery ( to look at currents, power, etc)

After these tests I accidentally applied 7.5V on the 5V-only power supply pin, for ~ 10s. The charging led began blinking quickly -> that triggered me to remove the power. There was nothing attached to the board - no battery and no SD-card at that moment.

After that, the board booted properly with my custom image that was running on A20-micro -> everything seemed to work OK, including battery charging.

I've added "stress" to the board and kept it for ~30 minutes with "stress -c 2 -m 2 -d 2 -v" -> the board got quite warm ( 60 deg ) but it was stable.
I managed to get latest Lime2 ARCH running by altering the DTB and removing AXP209 related configurations -> got kernel 4.5 running properly.


The problem appears when I try to build something on the board - a new kernel for example.
The board will randomly freeze during the build process with no kernel dump, etc.
While it's building the input current tends to peak at ~350mA from 5V, as it is when running "stress", with a lot of variation, depending on what the CPU is doing.
When the freeze happens the current draw becomes very stable at 250mA, even though the idle draw is 160mA. The cpu remains warm - at ~ 40deg. The same for the AXP chip.
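
The current signature above (noisy draw up to ~350mA under load, collapsing to a flat 250mA that is still above the 160mA idle level) could be watched for automatically. A minimal sketch, assuming current samples in mA are being logged from the PSU by some means not covered here; the function name and the 5mA spread threshold are hypothetical, not measured values.

```python
def looks_frozen(samples_ma, idle_ma=160.0, max_spread_ma=5.0):
    """Return True if a window of current samples is flat and above idle.

    samples_ma    -- recent input-current readings in mA
    idle_ma       -- idle current of the board (160mA in this thread)
    max_spread_ma -- assumed max-min spread that still counts as "flat"
    """
    if len(samples_ma) < 2:
        return False
    spread = max(samples_ma) - min(samples_ma)
    mean = sum(samples_ma) / len(samples_ma)
    # Flat trace that sits clearly above the idle level -> suspicious.
    return spread <= max_spread_ma and mean > idle_ma + max_spread_ma


if __name__ == "__main__":
    print(looks_frozen([250, 251, 249, 250]))  # flat at 250mA
    print(looks_frozen([350, 200, 310, 180]))  # normal busy pattern
```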

After the first freeze i've checked the voltage test pads on the board:
- 5V_EXT : 5.050V as the PSU is dialed for 5.1V to cover cable losses
- IPSOUT : 4.995V and quite stable, does not change with different loads
- 1.2V_CPU : jumping between 1.35 to 1.44-1.45 depending on the load, a lot of noise
- 1.2V_INT : clean and stable
- 3.0VA : clean and stable
- 1.5V : clean and stable
- 3.3V : clean and stable
- 2.5V : clean and stable


The only supply rail that is out of specs is the 1.2V_CPU.
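
To put a number on that, here is a small sketch that compares the readings with the nominal voltage in each test pad's name. Only the rails with a numeric reading above are included. The 5% tolerance is an assumption for illustration, not a figure taken from the A20 or AXP209 datasheets.

```python
TOLERANCE_PERCENT = 5.0  # assumption, not a datasheet value

# rail name -> (nominal volts from the pad name, measured volts from the list above)
readings = {
    "5V_EXT": (5.0, [5.050]),
    "1.2V_CPU": (1.2, [1.35, 1.45]),
}


def deviation_percent(nominal_v, measured_v):
    """Signed deviation of a measurement from nominal, in percent."""
    return (measured_v - nominal_v) / nominal_v * 100.0


def out_of_tolerance(rails, tol_percent):
    """Return the names of rails with any reading outside +/- tol_percent."""
    bad = []
    for rail, (nominal, values) in rails.items():
        if any(abs(deviation_percent(nominal, v)) > tol_percent for v in values):
            bad.append(rail)
    return bad


if __name__ == "__main__":
    for rail, (nominal, values) in readings.items():
        devs = ", ".join(f"{deviation_percent(nominal, v):+.1f}%" for v in values)
        print(f"{rail}: {devs}")
    print("outside tolerance:", out_of_tolerance(readings, TOLERANCE_PERCENT))
```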
With the scope I can see quite a lot of noise, related to the CPU activity. The peak-to-peak noise amplitude can get over 350mV on high-load moments. Also, the base voltage tends to drop as the cpu is working harder.
When it freezes, the noise drops to 100mV peak-to-peak, almost constant. I can only see some periodic bursts, at ~ 2 seconds interval, where the noise reaches 138mV peak-to-peak.
Looking at the 1.2V_CPU line at start-up, it seems to properly start-up at 1.2V then rises to 1.45 as UBOOT is started.

The ripple patterns are different when "stress" is used versus a build. The build generates random load patterns that could

What could cause those freezes ? I've seen several forum posts related to this behavior on "fresh" Lime2 boards.
I assume the high ripple i am seeing on the 1.2V_CPU rail can be a reason: the voltage might sometimes dip enough to corrupt RAM / internal CPU state and send the CPU to a fault handler or just lock it.

The next step for debugging this is to add a SMD tantalum in parallel with C166 / C167 to try to reduce the ripple on that supply line - maybe that is causing all the random freezes.

Should i keep looking into this ? Or can I consider that AXP209 is damaged from applying 7.5V and just get a new board ? All the other reports of Lime2 freezing give me some hope that AXP209 is still ok and that it's a general problem.

ealex

Just found this: http://forum.armbian.com/index.php/topic/295-freezing-problem-with-olimex-a20-lime-2/ , page 2

I will try to make a new U-BOOT with the patch to keep the RAM at a lower speed, to see if it fixes the problem.

Cleaning up the power rails still looks like a thing to look at

olHelp

Also having a LIME2 that freezes under load (in my case: running an I2P node and moving large files).
I have not tried the custom u-Boot, but i still suspect that 2.5A may be missing the critical reserve in some cases. At least the board froze with several cheap 1A, 1.5A and 2.5A wall plugs. Next on my list is a 5V 4A plug, if i get around to testing it.
For reference, i got a pandaboard that runs fine with a load of ~6.5 and an uptime of whenever-i-need-a-new-kernel (may be >1 year) on a 4A plug

JohnS

Is it the version of uboot that's setting that too high voltage?

Is it actually too high for the chips concerned?

Either way, you need a good PSU with plenty of amps as well as stable volts.

John

ealex

the latest image i tried is latest armbian - it did not make any difference.
The power supply is more than enough. The measurements were made using my lab psu -> no cheap power plug.
Also, I had nothing connected to the board - just the ethernet and usb-serial converter

I did all the measurements early this morning, before leaving for work -> there might be some errors in there.

Yes, UBOOT seems to be setting the voltage to 1.45V instead of 1.2V. I did not look at A20's datasheet to see if 1.4V is too much.
I'll try cloning the u-boot section that's working properly on the A20-micro board. From the other forum thread it seems that the DRAM frequency is set to 480MHz for Lime2 instead of 384MHz for Micro. That could matter combined with the dirty 1.2V_CPU rail.
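
For checking what U-Boot programs into the PMU, it helps to convert between millivolts and the register code. This is a sketch under assumptions that should be verified against the AXP209 datasheet and the board schematic: that the 1.2V_CPU pad is fed by DCDC2, and that DCDC2 is set by a 6-bit code with 25mV steps starting at 0.7V. The helper names are made up.

```python
DCDC2_MIN_MV = 700      # assumed lowest DCDC2 setting
DCDC2_STEP_MV = 25      # assumed step size
DCDC2_MAX_CODE = 0x3F   # assumed 6-bit field


def dcdc2_code_from_mv(mv):
    """Convert a target voltage in mV to the assumed DCDC2 register code."""
    max_mv = DCDC2_MIN_MV + DCDC2_STEP_MV * DCDC2_MAX_CODE
    if not DCDC2_MIN_MV <= mv <= max_mv:
        raise ValueError(f"{mv}mV is outside {DCDC2_MIN_MV}-{max_mv}mV")
    code, remainder = divmod(mv - DCDC2_MIN_MV, DCDC2_STEP_MV)
    if remainder:
        raise ValueError(f"{mv}mV is not a multiple of {DCDC2_STEP_MV}mV steps")
    return code


def dcdc2_mv_from_code(code):
    """Convert an assumed DCDC2 register code back to mV."""
    if not 0 <= code <= DCDC2_MAX_CODE:
        raise ValueError(f"code {code} does not fit in 6 bits")
    return DCDC2_MIN_MV + DCDC2_STEP_MV * code


if __name__ == "__main__":
    for mv in (1200, 1450):
        print(f"{mv}mV -> code 0x{dcdc2_code_from_mv(mv):02X}")
```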

Maybe someone from Olimex has a better idea, as it seems it's not related to my board only.

Once i get back home I will
- get latest mainline UBOOT and build with 384MHz DRAM clock to see if that fixes anything
- add a tantalum cap over C166 + improve my measurement set-up, half of the noise could be just because I was not probing properly
- check where UBOOT is raising the psu rails, if it's doing that, and see how the board behaves with the original 1.2V



chradev

I have already tested the new A20-Olinuxino-Lime2-eMMC (HW rev. E) for a week and it has been stable.
Which HW revision and option (-4GB or -eMMC) of the Lime2 board do you have?

I work with an Armbian build that I customized myself, and I succeeded in adding eMMC support and the slow Ethernet patch to kernel 4.4.6 and U-Boot v2015-10.
I have also managed to run RPI Monitor customized to work with AXP209, which is very useful for monitoring Lime2 parameters. You can take a look at my posts:
http://forum.armbian.com/index.php/topic/155-testers-wanted-sunxi-adjustments-for-rpi-monitor/page-3#entry7725

I cannot say anything about a possible failure after applying high voltage, but in some forum post or in the documentation around Lime2 I read that using a 6 layer PCB makes it possible to run the RAM at 480MHz instead of 384MHz as on the Lime board. Of course using a high quality PSU is important, so I use a 5V/4A wall adapter.

I have observed freezing only at high temperatures, reached under high CPU and network load when the Lime2 board is placed in Olimex's plastic box, even though I have additional devices attached like a SATA SSD, USB-WiFi and USB-LAN.

There is a discussion about the lack of an axp209 driver in kernel 4.x and how to use sysfs instead:
http://forum.armbian.com/index.php/topic/611-wip-axp209-mainline-sysfs-interface/page-3
so I am truly interested if you plan to work on this subject and can take part in the process.

EDIT: In my Armbian build environment (next version with kernel 4.4.6) the DDR RAM frequency is set to 480MHz in the U-Boot config file.

ealex

back home.
I got some snapshots on the 1.2V_CPU line while waiting for a new u-boot to build.
Also managed to catch the cpu rail when it froze.
For the measurement set-up i used very short springs soldered directly to the test point and the nearest ground connection.

I can not upload images here -> they are linked to a personal server.
Normal noise samples:


With heavy sd-card access


Starting a kernel build


The moment it froze - you can see how much the noise drops


I will come back with an update once i get a new u-boot with slower DDR clock

ealex

update:

got it to 384MHz DDR clock - now it's stable, it did not crash during a build
the peak power usage got lower - 1.95W at most
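
For anyone repeating this, the DRAM clock change can be scripted against the board defconfig before building. A minimal sketch, assuming the option is called CONFIG_DRAM_CLK as in mainline U-Boot sunxi defconfigs; the function name and the sample defconfig text are made up for illustration and are not the real Lime2 file.

```python
import re


def set_dram_clk(defconfig_text, mhz):
    """Return defconfig text with CONFIG_DRAM_CLK set to the given MHz value.

    Replaces an existing CONFIG_DRAM_CLK line, or appends one if missing.
    The option name is an assumption based on mainline U-Boot sunxi configs.
    """
    line = f"CONFIG_DRAM_CLK={mhz}"
    pattern = re.compile(r"^CONFIG_DRAM_CLK=\d+$", re.MULTILINE)
    if pattern.search(defconfig_text):
        return pattern.sub(line, defconfig_text)
    separator = "" if (not defconfig_text or defconfig_text.endswith("\n")) else "\n"
    return defconfig_text + separator + line + "\n"


if __name__ == "__main__":
    # Made-up sample, not the actual board defconfig.
    sample = "CONFIG_ARM=y\nCONFIG_DRAM_CLK=480\nCONFIG_MACH_SUN7I=y\n"
    print(set_dram_clk(sample, 384))
```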

The psu wave-forms look worse with light system load but a lot better when heavily loaded:

one of the spikes that appear with light load:


light load:


heavy load:


next step - add a 10uF tantalum cap, to see if there is any improvement or it makes the AXP209 chip even more unstable

ealex

and some samples after adding a 10uF/16V tantalum on the 1.2V_CPU

light load, the change happens when i switched governor to performance:


during a make clean


during make:


looks like it's stable enough to leave overnight under stress / make.
tomorrow i plan to change the cap to a 22uF one, test again and re-check if it's stable with 480MHz

chradev

Which HW revision and option (-4GB or -eMMC) of the Lime2 board do you have?

ealex

It's the 4GB one, no NAND chip installed, just empty footprint.
Got it 2 weeks ago straight from Olimex.



(edit - fixed board type )

olHelp

to add my 2c: the board seems stable with a 2.5A wall wart and RAM at 360MHz. Will report back in some days

ealex

update - tried moving back to 480MHz with the added cap - no luck, froze as ever
for now - removed the added cap and dropped down to 384 and it's stable.
-> there is more to the problem than just the 1.2V_CPU rail.
the board / layout is too complex for me to debug -> i might be looking at the completely wrong point


anyway, i'm wasting too much time on it ... i wanted the board to start making a nice AXP209 all in one driver for the mainline kernel: with the official one my power supply almost caught fire when i connected the battery, as there was no way to configure the charging current.

for now i'll use it as it is, as it's stable now and does not seem to overheat even after 8 hours of "stress" - in free air, with a heatsink on the A20 chip



chradev

I suppose your board is probably the old HW rev. C.
Look for the white text "A20-OLinuXino-Lime2" between the CPU and the Flash on the top side of the board.
Could you confirm that?

Could you share what the CPU and PMU temperatures are at the end of the 8 hours of "stress" test?

I have the same problems with 2x A20-OLinuXino-Lime2-4GB boards (HW rev. C) and similar ones with 2x A20-OLinuXino-Lime2-eMMC boards (HW rev. E). You can take a look at my posts:
https://www.olimex.com/forum/index.php?topic=5180.0

ealex

I confirm it's revision C - it was partially obscured by the heatsink

i don't have any tool to properly measure the temperature now, i will reply here once i get something usable.