diagnosing A20 OLinuXino Lime2 lock-ups?

Started by wu-lee, August 06, 2020, 05:16:03 PM


wu-lee

I have an A20 OLinuXino Lime2 bought via Freedombox, as per:

https://wiki.debian.org/FreedomBox/Manual#FreedomBox.2FHardware.2FA20-OLinuXino-Lime2.A20_OLinuXino_Lime2

Subsequently I've installed NextCloudPi (an Armbian derivative) on it. (Versions below)

It works fairly well overall, although it struggles a bit and Nextcloud isn't very responsive. I've added an external 2.5 inch hard drive, on SATA, powered via the connector on the board. (This isn't solid state.)

The big problem I have is that it regularly freezes. This is typically not accompanied by any error messages, either in the logs (dmesg, kern.log, syslog etc.) or on the console.  The console just shows the log-in prompt, but the display is frozen, there is no keyboard response or networking etc.

The power supply it came with was only a 1A supply, which I understand is low if there's a hard disk attached. Therefore I bought a 5V 3A power supply.

However, a month or so later, this has not solved the problem. For now I have the system shut itself down each night and a timed supply power-cycles it shortly after, rebooting it. Despite this, sometimes the system freezes after a few hours of running, with no pattern I can see.

I've used armbianmonitor to try to check for low voltages, watching via a remote ssh connection. I don't have the logs to hand, but the voltage didn't seem to drop below 5V before a crash. The CPU temperatures likewise didn't seem to spike.
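One possible way to capture the last readings before a freeze is to log them on the board itself and force each line to disk, so the final sample survives even when the ssh session dies along with everything else. The following is a minimal Python sketch under stated assumptions, not a tested diagnostic: the function names, the log path and the sensor paths are all made up for illustration. /sys/class/thermal/thermal_zone0/temp is the generic Linux thermal sysfs location; the voltage path in the usage comment is a guess and needs checking on the actual board.

```python
import os
import time


def read_int(path):
    """Read a single integer from a sysfs-style file (e.g. millidegrees C)."""
    with open(path) as f:
        return int(f.read().strip())


def append_sample(log_path, readings, now=None):
    """Append one timestamped line and force it to storage before returning."""
    stamp = time.strftime("%Y-%m-%dT%H:%M:%S", time.localtime(now))
    fields = " ".join(f"{name}={value}" for name, value in readings.items())
    line = f"{stamp} {fields}\n"
    with open(log_path, "a") as f:
        f.write(line)
        f.flush()
        os.fsync(f.fileno())  # ask the kernel to write the line out now
    return line


def monitor(log_path, sensors, interval=5.0, samples=None):
    """Log all sensors every `interval` seconds.

    samples=None means keep going until the process (or the board) stops.
    Unreadable sensors are logged as "n/a" instead of aborting the loop.
    """
    count = 0
    while samples is None or count < samples:
        readings = {}
        for name, path in sensors.items():
            try:
                readings[name] = read_int(path)
            except (OSError, ValueError):
                readings[name] = "n/a"
        append_sample(log_path, readings)
        count += 1
        if samples is None or count < samples:
            time.sleep(interval)


# Hypothetical usage (both paths are assumptions; check what exists on the board):
# monitor("/var/log/lime2-sensors.log", {
#     "cpu_temp_mC": "/sys/class/thermal/thermal_zone0/temp",
#     "ac_uV": "/sys/class/power_supply/axp20x-ac/voltage_now",  # guess
# })
```

After a freeze and power cycle, the last line of the log shows what the sensors read within one interval of the lock-up. Forcing every sample to disk costs extra writes, so a longer interval may be preferable on an SD card.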

So I'm looking for ways to positively identify the cause. Can anyone suggest a good approach?

Thanks!

olimex

try using our images from images.olimex.com , not armbian and there will be no freezes

wu-lee

Thanks - although the reason I'm using armbian is that this is a pre-built image for nextcloudpi, which includes more than just the kernel and base OS.

To use olimex images, I may have to re-supply the missing pieces, which might not be trivial to get back to the state where I can treat it as nextcloudpi again (but even finding out would require research).

Is there a way of just installing the olimex kernel, perhaps? Do you have a debian PPA? (A quick search online doesn't find one.)

JohnS

Might be best to contact armbian to say the faults you see, and that you do not get them with the Olimex image(s), so armbian know and can set about fixing armbian.

John

olimex

@wu-lee what are the advantages of using NextCloudPi instead of installing Nextcloud with a single command on the Olimex image?

$ sudo snap install nextcloud


wu-lee

@olimex: the main advantage is not having to invest more time tinkering with my Nextcloud server, trying different things to avoid hangups, possibly to no avail.

I did have a snap-based Nextcloud install on a larger computer running Ubuntu prior to purchasing the Lime2. This worked ok, modulo some problems with AppArmor (mis)configurations preventing certain Nextcloud functions working correctly (I don't recall the details right now).

I switched to using the Lime2 in an attempt to go low-power. It was originally a FreedomBox, but I found FreedomBox less useful than I'd hoped. So: an experiment. But one I find  useful daily, so actually it's not convenient if it doesn't work well.

I chose NextCloudPi, because that is specifically tailored to run on single-board computers like the Pi, or the Lime2, which the snap-based package is not.  NextCloudPi claims to run on any Armbian-supported architecture, which the Lime2 is meant to be - as I understand it. Is this correct?

Quote: $ sudo snap install nextcloud

Switching my Lime2 to the Olimex image plus the Nextcloud snap is not a simple one-command operation like this. This is because I have about 400GB of data to transfer, and rebuilding it on the new OS saddles me with a certain amount of effort and risk I don't take entirely lightly. I am also not currently convinced the problems will be solved if I do that.

This is why I'm asking for help to diagnose the lock-ups, because I want to try and establish what is causing them and ensure they can be solved. I would rather avoid simply trying different things and hope I get lucky. :)

I reason that not only would this solve my problem, but it would help those people with a similar problem on Lime2. It would potentially show Lime2's capability (or otherwise?) for this application, and so other people could make appropriate choices.  I want to see Lime2 succeed here.

wu-lee

Quote from: JohnS on August 29, 2020, 04:19:30 PM
Might be best to contact armbian to say the faults you see, and that you do not get them with the Olimex image(s), so armbian know and can set about fixing armbian.

John

I would, but I would like to be able to say more than simply "it freezes randomly".  They'd just send me back to Olimex for support.

Perhaps if I knew what was different between the Olimex images and stock Armbian?  Perhaps if I'd run the Olimex kernel on Armbian and seen that the problems stop? Or even if I had some way to enable kernel debug messages, this could reveal why the freezes happen.

In fact I did try the last option, but the NextCloudPi kernel upgrade to a debug kernel isn't apt-based and uses the armbian-config program, which broke mid-upgrade and left my Lime2 in an unbootable state. This was painful! I have quite a lot of experience with Linux using GRUB/GRUB 2, but U-Boot is new to me, and when it breaks I don't find it easy to fix without a lot of RTFM.

Therefore I am hoping there are other (safer) options and, if not, that I can come prepared with advice and know-how...

JohnS

If you report SOMETHING to Armbian, however brief, you'd hope they would ask for more information if they need it and tell you what they want.

Failing to report a freeze to them gives them no chance.

It's likely they know what has changed.

If you'd rather find out, grab a complete build tree for both and do a tree diff.  Yes it's painful.
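A tree diff of this kind can be sketched with Python's standard filecmp module. This is only an illustration: tree_diff is a hypothetical helper, and the two directory arguments stand for wherever the Olimex and Armbian build trees have been unpacked, which this thread does not specify.

```python
import filecmp
import os


def tree_diff(left, right, prefix=""):
    """Yield (status, relative_path) for each difference between two trees.

    Uses filecmp.dircmp, which by default treats files with the same type,
    size and modification time as equal without reading their contents.
    """
    cmp = filecmp.dircmp(left, right)
    for name in sorted(cmp.left_only):
        yield ("only-left", os.path.join(prefix, name))
    for name in sorted(cmp.right_only):
        yield ("only-right", os.path.join(prefix, name))
    for name in sorted(cmp.diff_files):
        yield ("differs", os.path.join(prefix, name))
    # Recurse into directories present on both sides.
    for name in sorted(cmp.common_dirs):
        yield from tree_diff(
            os.path.join(left, name),
            os.path.join(right, name),
            os.path.join(prefix, name),
        )


# Hypothetical usage; both directory names are placeholders:
# for status, path in tree_diff("olimex-build", "armbian-build"):
#     print(status, path)
```

The output only says which files differ. Reading the actual changes still means running a textual diff on the files it reports.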

John

olimex

The "freeze" problem is that Armbian does not use safe values for the A20 RAM clock: some boards work, some don't, and some work and then fail after a time.
This can easily be seen by comparing the DDR clock settings in our U-Boot with theirs. We use conservative values because we want the board to operate reliably over a wide temperature range.
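The comparison described above could be sketched as follows, assuming both U-Boot builds expose their DRAM settings as CONFIG_DRAM_* lines in a defconfig-style file. To my knowledge CONFIG_DRAM_CLK is the U-Boot sunxi option for the DRAM clock in MHz, but that should be verified in the trees being compared. The helper names are hypothetical, and the values in the usage comment are invented placeholders, not the real Olimex or Armbian settings.

```python
import re

# Matches lines such as CONFIG_DRAM_CLK=384; "# CONFIG_X is not set" is skipped.
_CONFIG_LINE = re.compile(r"^(CONFIG_[A-Z0-9_]+)=(.*)$")


def parse_defconfig(text):
    """Return {option: value} for every CONFIG_*=value line in the text."""
    options = {}
    for raw in text.splitlines():
        match = _CONFIG_LINE.match(raw.strip())
        if match:
            options[match.group(1)] = match.group(2)
    return options


def diff_options(ours, theirs, prefix="CONFIG_DRAM_"):
    """List (option, our_value, their_value) where the two configs disagree.

    Only options starting with `prefix` are considered; an option missing
    from one side is reported with None for that side.
    """
    keys = sorted(k for k in set(ours) | set(theirs) if k.startswith(prefix))
    return [
        (k, ours.get(k), theirs.get(k))
        for k in keys
        if ours.get(k) != theirs.get(k)
    ]


# Hypothetical usage with made-up values (NOT the real board settings):
# ours = parse_defconfig("CONFIG_SPL=y\nCONFIG_DRAM_CLK=384\n")
# theirs = parse_defconfig("CONFIG_SPL=y\nCONFIG_DRAM_CLK=480\n")
# for option, a, b in diff_options(ours, theirs):
#     print(option, a, b)
```

A difference found this way shows where the two configurations diverge. Changing it on a running system would still mean rebuilding and reinstalling U-Boot.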

JohnS

Can't be that hard for wu-lee to diff & then change the armbian uboot.

John

wu-lee

@olimex: Thanks for saying there *is* a known problem with stock Armbian on Lime2 and what it is. That's really useful to know!


wu-lee

@john - I would consider making a diff as you suggest, but can anyone tell me where to get the files I need to diff?

What do you suggest I do once I've diffed? It doesn't quite solve my problem to know that there is a difference, even if I can pinpoint it exactly. I need to then get those fixes into a package I can install on my running system, otherwise I still only have the option of installing the Olimex disk image and then rebuilding that into a nextcloud server.


wu-lee

So, scanning these forums again I see the pinned post about A20-OLinuXino images. Apologies for not noticing these before.

It looks like there *are* .deb packages which would allow me to simply install the Olimex kernel without re-installing the whole system, here:

ftp://staging.olimex.com/Allwinner_Images/A20-OLinuXino/1.latest_mainline_images/buster/debs/

@Olimex - Can you confirm these should work when installed on a running stock Armbian OS like mine? I'm aware I could cripple my system if I do the wrong thing.


The images don't seem to be in an apt repository, however, so I guess if there are updates I need to notice and manually re-install. Correct?

(Also I see a nextcloud install script which I have yet to inspect to see if it helps. But that's another matter.)

olimex

As I wrote before, I do not recommend using obsolete Armbian images; I recommend the images provided by Olimex, which are known to work stably.
Nextcloud installation takes one line, and configuring the services you want would then be much less troublesome than altering images, which, judging from the questions you ask, is something you are not familiar with.

JohnS

Quote from: wu-lee on September 03, 2020, 11:43:20 AM
@john - I would consider making a diff as you suggest, but can anyone tell me where to get the files I need to diff?

What do you suggest I do once I've diffed? It doesn't quite solve my problem to know that there is a difference, even if I can pinpoint it exactly. I need to then get those fixes into a package I can install on my running system, otherwise I still only have the option of installing the Olimex disk image and then rebuilding that into a nextcloud server.

I don't mean to be insulting but if you need to ask those sorts of questions you should stay away from doing these things - yet - and instead go reading up on lots of things about developing software, doing web searches and so on.  After a bit you should realise how to do the things I mentioned.

John