Well, if the primary is known not to be in a good state, you might as well fail over and hope that the issue was a fried disk or a cosmic bit flip or something.
The real safety feature is the 4-hour lead time before manual processing becomes necessary.
One of the key safety controls in aviation is “if this breaks for any reason, what do we do”, not so much “how do we stop this breaking in the first place”.
It's very hard to ensure you capture every single possible failure mode. Yes, the engineering control is important, but it's not the most critical one. What to do if it does fail (for any reason) is the truly critical control, because it covers the possibility that you don't know every way something might fail and so miss a way to prevent it.
One or more of three results can come from the engineering exercise of trying to keep something from breaking in the first place:
1. You could know the solution, but it would be too heavy.
2. You could know the solution, but it would include more parts, each of which would need the same process applied to it, and the process might fail the same way.
3. You miss something and it fails anyway, so your "what if this fails" path had better be well rehearsed and executed.
Real engineering is facing the tradeoffs head on, not hand waving them away.
The engineering controls don't independently make systems safe; they make things more reliable and cost-effective, and hopefully reduce the number of times the process controls kick in.
The process controls do however independently make things safe.
The reason for this is that there are 'unknown unknowns'. We accept that our knowledge and skills are imperfect, and that some failures could have been eliminated with the proper engineering controls, but we, as imperfect beings and organisations, did not implement those controls because we never identified the failure mode.
There are also known errors, where the cost of implementing engineering controls may simply outweigh the benefits when adequate process controls are in place.
It was in a bad state, but in a very inane way: a flight plan in its processing queue was faulty. The system itself was mostly fine. It was just not well-written enough to distinguish an input error from an internal error, and thus didn't just skip the faulty flight plan.
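To make the input-error versus internal-error distinction concrete, here is a minimal sketch of a queue processor that sets a bad record aside instead of halting. Everything in it is hypothetical: the `InvalidFlightPlan` exception, the space-separated waypoint format, and the duplicate-waypoint rule are invented for illustration and say nothing about how the real system works.

```python
from dataclasses import dataclass, field


class InvalidFlightPlan(Exception):
    """Raised when one input record is malformed (an input error)."""


@dataclass
class QueueResult:
    processed: list = field(default_factory=list)
    quarantined: list = field(default_factory=list)


def parse_plan(raw: str) -> list[str]:
    # Hypothetical format: waypoints separated by spaces, no duplicates.
    waypoints = raw.split()
    if not waypoints:
        raise InvalidFlightPlan("empty plan")
    if len(set(waypoints)) != len(waypoints):
        raise InvalidFlightPlan(f"duplicate waypoint in {raw!r}")
    return waypoints


def process_queue(raw_plans: list[str]) -> QueueResult:
    result = QueueResult()
    for raw in raw_plans:
        try:
            result.processed.append(parse_plan(raw))
        except InvalidFlightPlan as err:
            # Input error: set the record aside for manual review, keep going.
            result.quarantined.append((raw, str(err)))
        # Any other exception is treated as an internal error and propagates;
        # that is the case where halting or failing over makes sense.
    return result
```

With this shape, a single bad plan ends up in `quarantined` for a human to look at while the rest of the queue keeps flowing. Whether skipping a plan is acceptable at all is a separate safety question that the sketch does not answer.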
Indeed, that intention is quite transparent in this case. Anyway, I suspect that invalid input exists that would have made the system react in a similar way.