Why turning a plane’s computer off and on again could save your life

Software, overflow and human errors all have the potential to cause serious problems.

Published: December 17, 2023 at 4:00 pm

On 4 June 1996, the maiden flight of the Ariane 5 launcher didn’t go well. About 40 seconds after take-off, the massive rocket suddenly veered from its flight path and exploded. The cause was a tiny software error: a floating-point number stored in 64 bits was converted to a 16-bit signed integer, but the conversion failed because the number was larger than 32,767, the largest value a 16-bit signed integer can hold.
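To see why such a conversion fails, here is a minimal Python sketch. It is an illustration only, not the rocket's flight software, and the function names are invented for this example. One version checks the range before converting; the other shows what an unchecked conversion can silently produce if only the lowest 16 bits are kept.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_checked(value: float) -> int:
    """Convert to a signed 16-bit integer, refusing values that do not fit."""
    n = int(value)  # drop the fractional part (truncates toward zero)
    if not INT16_MIN <= n <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a signed 16-bit integer")
    return n


def to_int16_wrapping(value: float) -> int:
    """Hypothetical unchecked conversion: keep only the low 16 bits."""
    n = int(value) & 0xFFFF
    # Reinterpret the 16-bit pattern as a two's-complement signed number.
    return n - 0x10000 if n >= 0x8000 else n


if __name__ == "__main__":
    print(to_int16_checked(32767.0))   # the largest value that still fits
    print(to_int16_wrapping(32768.0))  # one more, and the sign flips
    try:
        to_int16_checked(32768.0)
    except OverflowError as err:
        print("conversion failed:", err)
```

Either behaviour is dangerous if nothing downstream is prepared for it: the checked version stops with an error, while the wrapping version carries on with a wildly wrong number.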

This overflow error caused the guidance software to shut down and send out diagnostic data, which the rocket’s flight computer then treated as real flight data when steering the engines. The backup unit, running the same software, did no better, with the result that the rocket lost control and came to a fiery end.

In 2015 it was reported that tests had revealed a similar overflow error that could cut electrical power on Boeing 787 aircraft if their generator control units were left switched on continuously for 248 days. After that long, a software counter in each unit would reach 2,147,483,647, the maximum value for a 32-bit signed register. Turning the units off and on again resets the counter and, luckily, the bug never led to a disaster in the way the much faultier software of the 737 Max did three years later.
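The 248-day figure can be checked with a little arithmetic. The sketch below assumes the counter ticks 100 times a second; that rate is not stated above and is an assumption, chosen because it reproduces the reported figure. The function names are invented for this example.

```python
INT32_MAX = 2**31 - 1      # 2,147,483,647
TICKS_PER_SECOND = 100     # assumption: the counter counts hundredths of a second
SECONDS_PER_DAY = 24 * 60 * 60


def days_until_overflow(ticks_per_second: int = TICKS_PER_SECOND) -> float:
    """Days of continuous running before a signed 32-bit tick counter is full."""
    return INT32_MAX / ticks_per_second / SECONDS_PER_DAY


def increment_int32(counter: int) -> int:
    """Add one tick, wrapping the way a 32-bit two's-complement register does."""
    counter = (counter + 1) & 0xFFFFFFFF
    return counter - 0x100000000 if counter >= 0x80000000 else counter


if __name__ == "__main__":
    print(f"{days_until_overflow():.2f} days until the counter is full")
    print(increment_int32(INT32_MAX))  # one tick later the value turns negative
```

Restarting the unit sets the counter back to zero, which is why switching it off and on again buys another 248 days.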

While overflow errors like these are similar to rounding errors, there’s a subtle difference. An overflow happens when a number is too big for the space set aside to store it. A rounding error happens when a number can’t be stored exactly in binary, so a close approximation is stored instead.

For example, the results of some calculations are irrational numbers, such as Pi (3.14159265…). Its digits never end, so we have to approximate its value, perhaps as just 3.142. Even the result of a simple calculation such as 2/3 can’t be written down exactly in decimal, and a computer has to store the binary equivalent of a rounded value such as 0.667. Continue to perform calculations like this and the tiny errors accumulate, until they add up to be significant.
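The effect is easy to reproduce. The following Python sketch (the function name is invented for this example) uses ordinary 64-bit floating-point numbers, in which 0.1 cannot be stored exactly, and adds the same value over and over.

```python
import math
from fractions import Fraction


def naive_sum(step: float, count: int) -> float:
    """Add `step` to a running total `count` times, rounding after every addition."""
    total = 0.0
    for _ in range(count):
        total += step
    return total


if __name__ == "__main__":
    # The value actually stored for 0.1 is slightly different from one tenth.
    print(Fraction(0.1) - Fraction(1, 10))

    print(naive_sum(0.1, 10) == 1.0)       # False: ten tenths do not make one
    print(naive_sum(0.1, 1_000_000))       # the error grows with every addition
    print(math.fsum([0.1] * 10) == 1.0)    # True: fsum tracks the lost digits
```

`math.fsum` is one way to limit the damage when adding many values, because it keeps track of the parts that ordinary addition rounds away.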

Rounding errors can affect missiles...

One of the most notorious examples of this kind of error came in the Gulf War. A Patriot missile battery failed to intercept an incoming Scud missile, which struck a barracks, killing 28 soldiers and injuring many more. The cause was a rounding error in the tracking system’s calculation of time. The error grew the longer the system was left on, until the system was looking for the Scud in the wrong part of the sky.
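A rough Python sketch shows how a small timing error can grow this way. The specific figures are assumptions taken from commonly cited analyses of the incident, not from the account above: a clock that counts tenths of a second, a stored value of 0.1 cut off after 23 binary digits, and about 100 hours of continuous running. The function names are invented for this example.

```python
import math
from fractions import Fraction

FRACTION_BITS = 23       # assumption: binary digits kept after the point
TICK = Fraction(1, 10)   # assumption: the clock counts tenths of a second


def truncate(value: Fraction, bits: int) -> Fraction:
    """Chop `value` to `bits` binary fractional digits, discarding the rest."""
    scale = 2**bits
    return Fraction(math.floor(value * scale), scale)


def clock_drift_seconds(hours_on: int, bits: int = FRACTION_BITS) -> float:
    """Timing error built up after `hours_on` hours of counting truncated ticks."""
    ticks = hours_on * 3600 * 10
    per_tick_error = TICK - truncate(TICK, bits)
    return float(ticks * per_tick_error)


if __name__ == "__main__":
    for hours in (1, 8, 100):
        print(f"{hours:>3} h on: clock is off by {clock_drift_seconds(hours):.4f} s")
```

Under these assumptions the clock is off by about a third of a second after 100 hours. That sounds tiny, but it matters a great deal when the target is travelling at well over a kilometre per second.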

... and trains

Software bugs such as these are easy to overlook and can be tragic in their results. But you don’t even need the software to be buggy for accidents to happen. In May 2019 an experienced train driver unfamiliar with the new software in his train was attempting to restart the computer and instead accidentally accelerated to 15mph, crashing into another train and derailing his. Luckily, no-one was injured. 
