[A20-SOM204] Automatic shutdown (probably issues with AXP209)

Started by dpierleo, January 24, 2020, 05:21:52 PM

dpierleo

Hi,

I'm working with an A20-SOM204 mounted on an A20-SOM204-EVB evaluation board, running the latest official Olimex image:

Armbian_5.92.4_Olinuxino-a20_Ubuntu_bionic_next_5.2.21_desktop

The system often shuts down by itself and then hangs.

I tried two evaluation boards, two SOM204 modules and two SD cards, all with the same latest official image, and the faulty behavior is the same: sometimes only a few seconds after startup, the system shuts down automatically.

I have run several tests but could not find a reliable way to reproduce the bug. It appears randomly, and sometimes so persistently that it becomes impossible to work with the system.

I also noticed that the PWR_BUT button on the evaluation board often does not work (neither with a short press nor with a long press).

Is there a known problem with the AXP209 IC that manages power on the board? Is there a patch to apply as an overlay for that device?

Thank you in advance for your help

JohnS

Just wondering... what do you use to supply power as that would be #1 suspect (whether noise, sags, or whatever).

I suspect that PWR_BUT needs some magic in the software if you want it to do something useful.  (The schematic/etc probably says which bit of which port it's connected to, so at least that way you can go about reading it.)

John

LubOlimex

So sometimes the setup works fine? When exactly does the problem occur? How do you restart the board? If you remove the power supply, you need to wait 5 seconds before plugging the power jack in again; otherwise some capacitors will still be charged, which leads to improper voltage levels.

What is this about the button? Is it a mechanical fault, like the plastic being stuck?
Technical support and documentation manager at Olimex

dpierleo

Hi LubOlimex,

Quote from: LubOlimex on January 27, 2020, 09:00:04 AM
"So sometimes the setup works fine? When exactly does the problem occur?"

The problem is not in the boot stage: the system always boots correctly. The problem is that, while the system is running, at some point an automatic shutdown starts without any command from the user. It stops all services and hangs at the end of the sequence, as if the user had run "shutdown now" (but he did not!). I tried two different sets of devices (two EVBs, two SOM204 modules and two SD cards with the same official image) to rule out a fault in a specific piece of hardware.

We noted that the problem did not occur with an old 3.x kernel. For this reason I thought there could be a problem with the AXP209 driver in the new 5.x kernel.

Quote from: LubOlimex on January 27, 2020, 09:00:04 AM
"How do you restart the board?"

We tried different ways: with the RESET button and with the PWR_BUT button (but sometimes the latter does not work, which is another problem that in my opinion could be related to the first one).

Quote from: LubOlimex on January 27, 2020, 09:00:04 AM
"What is this about the button? Is it a mechanical fault, like the plastic being stuck?"

It is not a mechanical fault: I also tried shorting the contacts manually (bypassing the switch) and the system still did not shut down. Please also note that I used two sets of hardware: different SOMs and different EVBs, but the identical official image from Olimex.

LubOlimex

Based on your description this seems mainly like a software issue, but it can still be a powering issue, especially if the fault occurs during stressful tasks. Are you sure your 5V power supply is powerful enough? Can it provide at least 2A of current? What other peripherals do you power through the board?

Can you provide logs that show the board's behavior before and during the problem? Maybe our Linux personnel can analyze these logs. You might want to upload them elsewhere and provide download links here.
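
For anyone gathering such logs, a rough way to narrow them down is to pull out only the lines that usually accompany a shutdown request. This is a minimal sketch, not an Olimex tool: the function name find_shutdown_hints and the pattern list are assumptions, and the patterns will likely need adjusting to what your syslog actually contains.

```shell
#!/bin/sh
# Print lines (with line numbers) from a log file that commonly accompany
# a shutdown request. The pattern list is an assumption for illustration;
# extend it to match the messages in your own logs.
find_shutdown_hints() {
    logfile="${1:-/var/log/syslog}"
    grep -n -i -E \
        'power key pressed|powering off|reached target shutdown|critical|low battery' \
        "$logfile"
}
```

Running find_shutdown_hints /var/log/syslog prints the matching lines, which makes it easier to see what happened just before the shutdown sequence started.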

dpierleo

Quote from: LubOlimex on January 27, 2020, 12:11:51 PM
"Are you sure your 5V power supply is powerful enough? Can it provide at least 2A of current?"

My power supply provides 5V and 3A of current.

Quote from: LubOlimex on January 27, 2020, 12:11:51 PM
"...especially if the fault occurs during stressful tasks"

The fault occurs immediately after boot (I do not even log in) and with the original image, without any software added by me.

Quote from: LubOlimex on January 27, 2020, 12:11:51 PM
"What other peripherals do you power through the board?"

In my test there are no other peripherals: only the EVB and SOM, connected to the power supply and to a serial cable for debugging.

-----

Normally I do not use the RESET button to restart the system, but I noticed a sequence involving that button with which I can reproduce the incorrect behavior:

1) power the system
2) when booting is complete (the login prompt appears), I press the RESET button for 2 seconds
3) the system restarts
4) when booting is complete (the login prompt appears), I press the PWR_BUT button and the system shuts down
5) when "reboot: Power down" appears, I press the PWR_BUT button for 2 seconds
6) the system restarts, but now the faulty behavior appears: a few seconds after the login prompt, the system shuts down by itself

-----

At the link:
https://app.box.com/s/p80h5awdqzeo14csfc3bfyfpsyhyl9u5

you can find:
1) auto_shutdown.mp4: a video that shows what happens when the faulty behavior appears (after boot-up in the video, wait a few seconds to see the automatic shutdown sequence);
2) board_setup.jpg: a picture of the setup (you can see that only the power cable and the serial cable are connected);
3) power_supply_label.jpg: the label of my power supply;
4) the following log files from the /var/log folder: armbian-hardware-monitor.log, auth.log, kern.log and syslog (I removed the lines for days up to yesterday).

Thank you for your help

LubOlimex

I did exactly what you did and the board doesn't turn off. I used the same board revision and wrote the same image to a fresh card. Same sequence:

1. Let it boot to login
2. Brief press RESET button
3. Let it boot to login
4. Brief press PWR_BUT
5. Let it shut down and reach "reboot: Power down"
6. Hold PWR_BUT to turn on
7. Let it boot to login
8. Wait for hang up

At this point at your side you say the board shuts down; it doesn't shut down here.

Can you confirm that both boards that you have experience the same problem in the same sequence listed above?

What is the hardware revision of the bottom board A20-SOM204-EVB?

The only obvious difference is that I used a laboratory power supply. But it could be something else:

Can you check if the A20-SOM204 is well-aligned in the slot of A20-SOM204-EVB? Can it move left or right? Try different positions in the slot.

What are the chances of faulty PWR_BUT (some sort of mechanical fault that keeps it pressed or something similar)?

dpierleo

Hi LubOlimex,

Quote from: LubOlimex on January 28, 2020, 09:50:01 AM
"At this point at your side you say the board shuts down; it doesn't shut down here.
Can you confirm that both boards that you have experience the same problem in the same sequence listed above?"

As I told you in a previous message, I cannot always replicate the problem. The sequence that I explained to you is only the closest method I found to obtain the incorrect behavior but this sequence does not always lead me to the problem, exactly as happens to you.

Quote from: LubOlimex on January 28, 2020, 09:50:01 AM
"What is the hardware revision of the bottom board A20-SOM204-EVB?"

Rev.B 2017

Quote from: LubOlimex on January 28, 2020, 09:50:01 AM
"What are the chances of faulty PWR_BUT (some sort of mechanical fault that keeps it pressed or something similar)?"

I think I can exclude any mechanical fault of the button for the reasons explained below.

-----

I am trying to find a reliable way to reproduce the faulty behavior and I am running several tests. I would like to describe them and ask you to run the same tests, to see whether the same behavior occurs on your side.

I always do the same sequence:
1. Let it boot to login
2. When the login prompt appears, wait a few seconds (see explanation below)
3. Brief press PWR_BUT
4. Let it shut down and reach "reboot: Power down"
5. Hold PWR_BUT to turn on
6. Let it boot to login
7. Repeat sequence from point 1

I saw that if I wait only a short time at point 2 (2-4 seconds) before pushing the button, then PWR_BUT works at point 3. If instead I wait a long time at point 2 (15 seconds on one EVB+SOM204 set, 20 seconds on the other), then PWR_BUT does not work (in this case, when I am stuck at point 3, I have to push the RESET button to restart the sequence).

I ran the sequence with the short wait at least 20 times and PWR_BUT always worked. I ran the sequence with the long wait at least 20 times and PWR_BUT never worked.

I repeated all the tests above on two sets (EVB + SOM204) with the same official image and got the same behavior. The only difference I noticed is the length of the long wait: 15 or 20 seconds depending on the EVB+SOM204 set.

These tests also show that the button is not defective, because with the short wait PWR_BUT works many times in a row (both at point 3 and at point 5).

JohnS

Bearing in mind how it changes when you wait for a different time it's hard to see how it can be anything but software.  If you figure out what causes it please post a follow-up!

John

dpierleo

Hi LubOlimex,

Quote from: dpierleo on January 28, 2020, 04:37:08 PM
"I saw that if I wait only a short time at point 2 (2-4 seconds) before pushing the button, then PWR_BUT works at point 3. If instead I wait a long time at point 2 (15 seconds on one EVB+SOM204 set, 20 seconds on the other), then PWR_BUT does not work (in this case, when I am stuck at point 3, I have to push the RESET button to restart the sequence).

I ran the sequence with the short wait at least 20 times and PWR_BUT always worked. I ran the sequence with the long wait at least 20 times and PWR_BUT never worked."

Could you please tell me whether you tried the same tests? Did you get the same results?

It is important to know whether there is a bug in the AXP209 driver.

Thank you in advance

selfbg

Hi,

This issue is not an AXP209 driver bug. It's actually a very basic issue that you missed (as did I, for some hours) because you don't have an HDMI cable attached.

When you press the PWR button within 2-3 seconds, the shutdown works fine. This is because XFCE4 has not started yet.

After about 15 seconds, xfce4-power-manager maps the key. From then on, pressing the button opens the logout dialog (do you want to hibernate, log out, suspend, etc.) instead of shutting down, and without a display you never see it.

There are several workarounds.
  • Disable lightdm
    If you're not planning to use X, simply disable lightdm:
    systemctl disable lightdm
    systemctl stop lightdm
  • Change the default xfce4 action
    After you create a non-root user, log in as that user and edit this file:
    .config/xfce4/xfconf/xfce-perchannel-xml/xfce4-power-manager.xml

    Modify this
    <property name="power-button-action" type="uint" value="something"/>
    to
    <property name="power-button-action" type="uint" value="4"/>

    Save and reboot the board.
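
The file edit in the second workaround can also be scripted. The following is only a sketch under a few assumptions: GNU sed is available (as on a stock Armbian image), the property line already exists in the form shown above, and the function name set_power_button_action is made up for this example.

```shell
#!/bin/sh
# Rewrite the value of the power-button-action property in an
# xfce4-power-manager.xml file. Assumes GNU sed (-i and -E) and that the
# property line already exists; this helper is illustrative, not part of Xfce.
set_power_button_action() {
    file="$1"
    value="$2"
    sed -i -E \
        "s|(<property name=\"power-button-action\" type=\"uint\" value=\")[^\"]*(\"/>)|\\1${value}\\2|" \
        "$file"
}
```

For example: set_power_button_action "$HOME/.config/xfce4/xfconf/xfce-perchannel-xml/xfce4-power-manager.xml" 4. It is probably safest to do this while not logged in to the desktop, since a running session may write its own settings back. From inside a running session, xfconf-query -c xfce4-power-manager -p /xfce4-power-manager/power-button-action -s 4 should have the same effect, assuming the property path matches your Xfce version.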


dpierleo

Hi selfbg,

Thank you for your suggestions. Following your advice we found the source of the problem: xfce4-power-manager wrongly detects a battery even though none is connected. So, with the default settings, the system decides to shut down when it reads a battery level below a certain threshold (low battery).
As far as we can tell, the AXP209 driver does not work correctly, because the "Devices" tab of the "Xfce Power Manager" window shows inconsistent data.
Please see the following images:

[images]

Our scenario:
1) no usb power connected
2) no battery
3) ac power connected

As you can see, AC power is not shown at all. A battery appears to be connected and its voltage level changes randomly. We discovered that the automatic shutdown on our equipment is caused by this strange behavior: wrong battery charging/discharging readings.
Our system performs a critical shutdown because AC power appears to be absent and the battery level appears too low to keep the system running.

1) Is there any way to completely disable battery management in xfce4? We will never use the battery.

2) Is there any way to make the AC power supply appear in the "Xfce Power Manager" window? If it were correctly identified, the system would not perform a critical shutdown even with a wrong battery level reading.
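
A quick cross-check, independent of the Xfce window, is to read what the kernel itself exposes under /sys/class/power_supply, which is typically where desktop power managers get their data. The sketch below just prints a few standard attributes for each supply; the function name show_power_supplies is made up for this example, and the device names you will see (probably something like axp20x-battery and axp20x-ac) are an assumption that depends on the kernel.

```shell
#!/bin/sh
# Print one line per power supply with a few standard sysfs attributes.
# The root directory can be overridden, which is handy for testing.
show_power_supplies() {
    root="${1:-/sys/class/power_supply}"
    for dev in "$root"/*; do
        # Skip the unexpanded glob when the directory is empty.
        [ -d "$dev" ] || continue
        name=$(basename "$dev")
        type=$(cat "$dev/type" 2>/dev/null || echo "?")
        line="$name type=$type"
        # Not every supply has every attribute; print only the readable ones.
        for attr in present online status capacity voltage_now; do
            if [ -r "$dev/$attr" ]; then
                line="$line $attr=$(cat "$dev/$attr")"
            fi
        done
        echo "$line"
    done
}
```

If no entry of type Mains shows up in the output of show_power_supplies, the desktop has no way to know that AC power is connected, whatever the battery entry says.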

Thank you in advance for your help

dpierleo

Hi selfbg,

We solved the issues related to line power and the bad handling of the battery charge level!
First of all I tried removing the axp20x_battery module with the command:

modprobe -r axp20x_battery
With that, the system no longer shuts down automatically.
But I wanted to get to the origin of the problem, so I kept looking for a cleaner and more correct solution.
In the end I discovered that the kernel in your image is compiled without the axp20x_ac_power module, so I recompiled the kernel with it enabled:

<M> X-Powers AXP20X and AXP22X AC power supply driver
With this change everything works fine, because the system recognizes that AC power is plugged in and does not shut down even if the battery level reads very low.
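
To check another image for the same gap before rebuilding, one option is to look the driver up in the kernel configuration. This is a sketch with assumptions that should be verified against your kernel tree: that the config is installed as /boot/config-$(uname -r), and that the menu entry quoted above corresponds to the symbol CONFIG_CHARGER_AXP20X (with CONFIG_BATTERY_AXP20X for the battery driver). The function name kernel_option_state is made up for this example.

```shell
#!/bin/sh
# Report how a kernel config symbol is set: y (built in), m (module),
# or n (absent or "is not set"). The default config path is an assumption.
kernel_option_state() {
    symbol="$1"
    config="${2:-/boot/config-$(uname -r)}"
    value=$(grep -E "^${symbol}=" "$config" | cut -d= -f2)
    echo "${value:-n}"
}
```

For example, kernel_option_state CONFIG_CHARGER_AXP20X printing n would be consistent with the missing axp20x_ac_power module described above.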

Why the axp20x_battery driver reads a low battery level even when no battery is present remains a mystery! Probably that driver does not work properly.

The following picture shows how Xfce looks after enabling the AC power driver:

[image]

The following picture shows how Xfce wrongly reports the level of the missing battery:

[image]

Note that I always ran the tests not only on our own board but also on the evaluation board, and I can confirm that the automatic shutdown also occurs on the evaluation board with the official Olimex image, because it depends on both the missing axp20x_ac_power driver and the wrong readings from the axp20x_battery driver.

Thank you for your help