Olimex Support Forum

OLinuXino Android / Linux boards and System On Modules => A64 => Topic started by: mossroy on March 23, 2021, 03:43:25 PM

Title: Some temperature sensors are missing in recent images
Post by: mossroy on March 23, 2021, 03:43:25 PM
In older images provided by Olimex, several temperature sensors were exposed, in particular some coming from the CPU.

Example with a board installed from A64-OLinuXino-buster-minimal-20200601-131837.img (running kernel 5.8.18):
$ sensors
axp813_adc-isa-0000
Adapter: ISA adapter
temp1:        +35.7°C 

gpu1_thermal-virtual-0
Adapter: Virtual device
temp1:        +48.3°C 

cpu0_thermal-virtual-0
Adapter: Virtual device
temp1:        +48.0°C  (crit = +90.0°C)

axp20x_battery-isa-0000
Adapter: ISA adapter
in0:          +0.00 V 
curr1:        +0.00 A 

axp813_ac-isa-0000
Adapter: ISA adapter
in0:              N/A  (min =  +4.00 V)

gpu0_thermal-virtual-0
Adapter: Virtual device
temp1:        +48.7°C

But most of these sensors have disappeared in more recent images.

For example, on a board installed from A64-OLinuXino-buster-minimal-20201207-193928.img (upgraded to kernel 5.10.23):
$ sensors
axp813_adc-isa-0000
Adapter: ISA adapter
temp1:        +36.7°C

The CPU sensors are valuable for checking that the CPU is not overheating, particularly now that the CPU governor is set to "performance" by default.
Title: Re: Some temperature sensors are missing in recent images
Post by: mossroy on May 15, 2021, 09:31:18 PM
This is unfortunately not fixed with an upgrade to the latest kernel provided in the olimex repo (5.10.36).

Could you please fix this regression? (and/or tell us if there is a workaround)
Title: Re: Some temperature sensors are missing in recent images
Post by: jch on May 26, 2021, 05:53:17 PM
They're now managed by the thermal subsystem, which allows the kernel to throttle the CPU when it overheats.  You can get the values from sysfs:
$ for t in /sys/devices/virtual/thermal/thermal_zone*; do cat "$t/type" "$t/temp"; done
cpu0-thermal
43972
gpu0-thermal
44323
gpu1-thermal
44206
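The values are in millidegrees Celsius. For tools that expect something closer to the old `sensors` output, a small wrapper can convert them. This is only a sketch: `print_thermal_zones` is a hypothetical helper name, and the optional directory argument is there just so the function can be pointed at a different sysfs path.

```shell
#!/bin/sh
# Print each thermal zone as "<type>: <temp> °C".
# print_thermal_zones is a hypothetical helper, not part of any Olimex image.
# Optional argument: sysfs directory holding the thermal_zone* entries.
print_thermal_zones() {
    base="${1:-/sys/devices/virtual/thermal}"
    for zone in "$base"/thermal_zone*; do
        # Skip when the glob matched nothing or the zone is not readable
        [ -r "$zone/temp" ] || continue
        name=$(cat "$zone/type" 2>/dev/null) || continue
        milli=$(cat "$zone/temp" 2>/dev/null) || continue
        # Integer arithmetic only: whole degrees plus one (truncated) decimal.
        # Assumes non-negative temperatures.
        printf '%s: %d.%d °C\n' "$name" "$((milli / 1000))" "$(( (milli % 1000) / 100 ))"
    done
}

print_thermal_zones
```

On a board without any thermal zone, the loop simply prints nothing.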
Title: Re: Some temperature sensors are missing in recent images
Post by: mossroy on May 30, 2021, 11:59:53 AM
Thanks for the info.

Unfortunately, this is not natively supported by the tools I was happily using so far:

All were working fine with previous versions of the kernel.

And all were working fine with the (unstable) mainline Debian Bullseye I managed to install (see https://www.olimex.com/forum/index.php?msg=31474), which is also based on kernel 5.10.x.

It's good news that the CPU is throttled when it overheats, but there has been a regression that seems specific to recent Olimex images.
Title: Re: Some temperature sensors are missing in recent images
Post by: jch on May 30, 2021, 05:38:48 PM
Right.  I've filed an issue at https://github.com/OLIMEX/olimage/issues/4
Title: Re: Some temperature sensors are missing in recent images
Post by: mossroy on June 04, 2021, 04:26:01 PM
About the CPU throttling when overheating, it does not always seem to work.

Today, I put a very heavy load on one of my A64 boards, and it shut down with the following syslog message:
> kernel:[ 5506.549851] thermal thermal_zone0: critical temperature reached (90 C), shutting down

This board had been very recently installed with the latest image, A64-OLinuXino-buster-minimal-20210513-112230.img, with kernel 5.10.36. It's in a room at an average temperature.

I would have much preferred CPU throttling to an abrupt shutdown that forces a manual restart.
Title: Re: Some temperature sensors are missing in recent images
Post by: jch on June 04, 2021, 07:16:24 PM
> About the CPU throttling when overheating, it seems to be not always working.

Strange, it works for me. At four threads, the board will reliably throttle at 70°C, and temperature remains stable.

> kernel:[ 5506.549851] thermal thermal_zone0: critical temperature reached (90 C), shutting down

My understanding is that the kernel should start throttling the CPU at 70°C, throttle it further at 80°C, and shut it down at 90°C.  If the CPU reached 90°C, then shutting down is the correct behaviour, but it shouldn't have happened in the first place.  Could you please check the value of
cat /sys/devices/virtual/thermal/thermal_zone0/trip_point_*_temp
It should print 70000, 80000 and 90000. If that's not the case, there's something wrong with your device tree.
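If you want to script that check, here is a minimal sketch. `check_trip_points` is a hypothetical helper name; the default expected values are the ones quoted above, and the function relies on the glob returning the trip points in index order, which holds as long as there are fewer than ten of them.

```shell
#!/bin/sh
# Compare the trip points of a thermal zone with the expected values.
# check_trip_points is a hypothetical helper written for this thread.
#   $1: thermal zone directory (default: thermal_zone0)
#   $2: expected values, space-separated, in millidegrees Celsius
check_trip_points() {
    zone="${1:-/sys/devices/virtual/thermal/thermal_zone0}"
    expected="${2:-70000 80000 90000}"
    actual=""
    for f in "$zone"/trip_point_*_temp; do
        # Skip when the glob matched nothing
        [ -r "$f" ] || continue
        # Append the value, separated by a space after the first one
        actual="${actual:+$actual }$(cat "$f")"
    done
    echo "trip points: ${actual:-none found}"
    # Return 0 only when the values match exactly
    [ "$actual" = "$expected" ]
}

check_trip_points || echo "unexpected trip points, check the device tree"
```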

Can you please confirm that throttling is happening?  Run
watch cat /sys/devices/virtual/thermal/thermal_zone0/temp /sys/devices/virtual/thermal/cooling_device0/cur_state
You should see the second value switch to 1 when the temperature goes above 70000.
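If the event is hard to catch by eye, the same two files can be sampled into a log instead. This is a rough sketch under the same assumptions as the `watch` command (zone 0 is the CPU, `cooling_device0` is the CPU cooling device); `log_throttle_state` and its arguments are hypothetical names.

```shell
#!/bin/sh
# Sample the CPU temperature and the cooling device state at a fixed interval.
# log_throttle_state is a hypothetical helper written for this thread.
#   $1: number of samples (default 10)
#   $2: seconds between samples (default 1)
#   $3: thermal sysfs directory (default /sys/devices/virtual/thermal)
log_throttle_state() {
    samples="${1:-10}"
    interval="${2:-1}"
    base="${3:-/sys/devices/virtual/thermal}"
    i=0
    while [ "$i" -lt "$samples" ]; do
        # Give up if either file is missing or unreadable
        temp=$(cat "$base/thermal_zone0/temp" 2>/dev/null) || return 1
        state=$(cat "$base/cooling_device0/cur_state" 2>/dev/null) || return 1
        printf '%s temp=%s cur_state=%s\n' "$(date +%T)" "$temp" "$state"
        i=$((i + 1))
        # Do not sleep after the last sample
        if [ "$i" -lt "$samples" ]; then
            sleep "$interval"
        fi
    done
}

log_throttle_state 5 1 || echo "thermal sysfs files not found" >&2
```

Redirecting the output to a file while the load is running makes it possible to see afterwards whether cur_state ever left 0 before a shutdown.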
Title: Re: Some temperature sensors are missing in recent images
Post by: mossroy on June 06, 2021, 10:57:50 AM
cat /sys/devices/virtual/thermal/thermal_zone0/trip_point_*_temp gives your expected output.

Regarding the throttling, I'll have to generate a heavy load again to check.
Title: Re: Some temperature sensors are missing in recent images
Post by: mossroy on June 07, 2021, 10:42:09 AM
After generating some heavy load again, I confirm that your second value switches to 1 each time the temperature exceeds 70000, and switches back to 0 when it comes back below.
Title: Re: Some temperature sensors are missing in recent images
Post by: jch on June 09, 2021, 03:11:30 PM
Then it looks like everything is working as intended. I've got no explanation for what could have gone wrong before.  If you find a reliable way to reproduce the issue, I'll be glad to have a look.
Title: Re: Some temperature sensors are missing in recent images
Post by: mossroy on June 10, 2021, 11:14:25 AM
The heavy load was produced by a compilation, probably multi-threaded on the 4 cores.

My board was inside the official metal box (https://www.olimex.com/Products/OLinuXino/A64/BOX-A64-BLACK/), with a heatsink (https://www.olimex.com/Products/Components/Misc/ALUMINIUM-HEATSINK-20x20x6MM/), in an apartment room at an average spring temperature.