Wandboard Quad Rev. C1 - Temperature failure/System freeze

Post by lmendes » Thu Mar 02, 2017 8:41 am

I own a Wandboard Quad from 2014 which works nicely inside the box, reaching about 75 °C under load, and keeps working under these conditions for days. The same is not true of 4 new Wandboard Quads produced in 2016, which I acquired from Mouser, the same distributor that supplied the 2014 board.
Essentially, all of them boot Linux and reach the accelerated X GUI. However, if I run a small stress load composed of glxgears, glxheads, a Vivante Qt test application and the screensaver, two of them fail after a few minutes and the other two fail before two hours of testing have passed. The tests are the same as the ones performed on the 2014 Wandboard, which works as expected. All tests share the same U-Boot, Linux distribution, applications and Vivante driver; the distribution can even be Yocto Krogoth with core-image-minimal-xfce plus the Mesa and Vivante test applications. The only difference is that the 2016 Wandboards are tested outside the box for better heat dissipation.
The observed failure is a system freeze, with the last rendered frame left on the screen. The CPU temperature when the failure occurs is 45 °C for the two Wandboards that fail after 2 hours; the other two fail at even lower temperatures. The CPU temperature was checked by logging the output of cat /sys/class/thermal/thermal_zone0/temp to a log file once per second.
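For anyone who wants to reproduce the temperature logging, here is a minimal sketch in Python rather than with cat. It assumes the kernel reports the value in millidegrees Celsius (older vendor kernels may report whole degrees instead) and that the CPU sensor is thermal zone 0. The function names and the log format are made up for illustration. Each line is flushed and synced to disk so that the last reading before a freeze is not lost in a buffer.

```python
import os
import time
from datetime import datetime
from pathlib import Path

# Assumed sysfs path; the zone index may differ on other kernels or boards.
TEMP_PATH = Path("/sys/class/thermal/thermal_zone0/temp")


def millidegrees_to_celsius(raw: str) -> float:
    """Convert a raw sysfs reading such as '45000' to degrees Celsius.

    Assumes the kernel reports millidegrees Celsius.
    """
    return int(raw.strip()) / 1000.0


def format_entry(timestamp: datetime, celsius: float) -> str:
    """Build one log line: ISO timestamp, a space, temperature in Celsius."""
    return f"{timestamp.isoformat(timespec='seconds')} {celsius:.1f}"


def log_temperature(log_path: str, interval_s: float = 1.0) -> None:
    """Append one reading per interval until interrupted (hypothetical helper)."""
    with open(log_path, "a") as log:
        while True:
            celsius = millidegrees_to_celsius(TEMP_PATH.read_text())
            log.write(format_entry(datetime.now(), celsius) + "\n")
            # Push the line to disk right away so it survives a hard freeze.
            log.flush()
            os.fsync(log.fileno())
            time.sleep(interval_s)
```

Calling log_temperature("temp.log") on the board starts the logging; after a freeze and power cycle, the last line of temp.log shows the final temperature that was recorded.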
If a 40x40 mm fan is placed on top of the heatsink, right above the CPU, the two Wandboards that would otherwise fail after 2 hours pass the test and work for days, with the CPU reaching at most 27 °C. The other two Wandboards still fail after a few minutes, even with the fan placed on top of the heatsink.
All Wandboards can be restarted and boot into the X GUI after a failure, simply by power cycling them with a few seconds between power-off and power-on.
Before a Wandboard freezes, at least for the ones lasting 2 hours, the displayed frames freeze for a few moments and then resume. These freezes become more frequent until the whole system freezes.
I suspect that the oscillator may be defective, failing to oscillate once the temperature reaches 45 °C.
Can you provide some information about what can be done to fix it? Maybe replacing a component? Is the problem known?
