United States

US FAA Adopts New Safeguards After Computer Outage Halted Flights (reuters.com)

The Federal Aviation Administration (FAA) told lawmakers Monday it had made a series of changes to prevent a repeat of a key computer system outage that forced a nationwide Jan. 11 ground stop disrupting more than 11,000 flights. From a report: The FAA said it has implemented "a one-hour synchronization delay for one of the backup databases. This action will prevent data errors from immediately reaching that backup database." The FAA also said it "now requires at least two individuals to be present during the maintenance of the (messaging) system, including one federal manager."
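To illustrate the idea behind a synchronization delay, here is a minimal sketch in Python. It is not the FAA's implementation, which the summary does not describe; the `DelayedReplica` class, its method names, and the sample record keys are all hypothetical. The only detail taken from the statement is the one-hour delay. Writes made on the primary wait in a queue and reach the backup only after they have aged past the delay, so a bad write noticed within that window can be dropped before it touches the backup.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class DelayedReplica:
    """Hypothetical backup copy that applies each write only after `delay_s` seconds."""

    delay_s: float = 3600.0                          # one hour, per the FAA statement
    data: dict = field(default_factory=dict)         # current state of the backup
    pending: deque = field(default_factory=deque)    # (timestamp, key, value), oldest first

    def enqueue(self, now: float, key: str, value: str) -> None:
        """Record a write made on the primary at time `now` (in seconds)."""
        self.pending.append((now, key, value))

    def apply_due(self, now: float) -> int:
        """Apply every pending write that is at least `delay_s` old; return how many."""
        applied = 0
        while self.pending and now - self.pending[0][0] >= self.delay_s:
            _, key, value = self.pending.popleft()
            self.data[key] = value
            applied += 1
        return applied

    def discard_since(self, bad_time: float) -> int:
        """Drop pending writes made at or after `bad_time`; return how many were dropped."""
        kept = deque(w for w in self.pending if w[0] < bad_time)
        dropped = len(self.pending) - len(kept)
        self.pending = kept
        return dropped


# Usage: a good write at t=0 and a corrupt one at t=1800 (30 minutes later).
replica = DelayedReplica()
replica.enqueue(0.0, "notice-1", "runway closed")
replica.enqueue(1800.0, "notice-1", "<corrupt>")
replica.apply_due(3600.0)      # only the good write is old enough to apply
replica.discard_since(1800.0)  # the corrupt write never reaches the backup
```

The trade-off is that the backup is always up to an hour stale, which is the price of having a copy that a fresh error has not yet reached.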
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward

    Turn keys on my mark.

    MARK - system goes down....

  • Yeah, great fix. And I have seen "managers" making sure things were done right. Most of the time they were not even looking and were on their phones instead. One also directly told me he had no clue how things worked on the tech side.

    • Sounds like these are the "short-term band-aid" fixes. Hopefully they are also looking at long-term fixes.
      • by gweihir ( 88907 )

        You think? I do not. When something goes this badly wrong, the rot sits deep on all levels.

    • Yeah, I would think having a peer check your work as you do the deployment would be way more effective at preventing errors.

      • by gweihir ( 88907 )

        Not only you. Anybody that does this competently uses two _experts_ for anything like this because otherwise it is pointless.

        • And almost any major upgrade must have a "rollback" plan in place before it is approved. The Change Request checklist at our location included a rollback requirement where we had to explain the plan to reverse any changes we made, the procedure to do it, who could do it, and the timeline to accomplish it.

          If the risk assessment said the rollback could be completed in two hours by saving the data before it was upgraded and restoring it if a problem occurred, it was probably approved. If the procedure was to back

  • Even modern systems are Highly Available, not "Always Available" - nothing is 100%, but there are procedures and designs that protect and minimize the impact of mistakes. Change Requests and "manager over the shoulder" aren't really going to improve anything except creating paperwork and job angst.
    I'm not generically a fan of remaking working systems, but there does come a time when the requirements or outcomes have shifted far enough that throwing a system away and restarting is the better option. In my

    • by Anonymous Coward

      I'm not generically a fan of remaking working systems

      Yep, that's the reason COBOL is still in use.

    • Change Requests and "manager over the shoulder" aren't really going to improve anything except creating paperwork and job angst

      I would say this is just how it goes in US Government jobs, but at the same time I've seen what happens when someone tries to explain what you just said to a Congressional sub-committee. The folks in Congress are the ones that make the "eyes over your shoulder" happen way more often than not. Tell you the truth, I think it's just projection.

    • Privatization destroyed the UK's railways, gutted France's EDF, and took Prague's (CZ) water utilities to the brink of collapse...
      Privatization can go both ways.

    • by pesho ( 843750 )
      The NOTAM system that failed is not ATC. Your post is misguided.
  • The system is only mission critical in the eyes of the FAA, which has made obtaining NOTAMs a pre-flight requirement for pilots. Following other incidents where pilots did things like landing on a closed runway, directors of the FAA have admitted to Congress that NOTAMs are largely noise and that pilots often ignore them or miss a single key line in pages of cryptically encoded, non-useful notices. Air Traffic control also provides this information to pilots and even better, provides it to them as t
    • You make a good point, but NOTAMs still have their place. Flights between two uncontrolled airports come to mind, although of course if you're paranoid like me you'll call your destination beforehand in that case. The wider questions are of course 1. How can you fuck up something so simple and 2. How easy and cheap would it be to replace the current system with a simple web-based text solution?

  • 8-inch floppy drives to 5.25in.

  • As they should have done long ago, and as every dev shop that deploys software critical to the operation of its company should. It's way too common for developers to have free rein on production systems. If the *only* way to make changes to a production system is to create a deployment and test the deployment on a test system first, these kinds of issues will happen far less frequently.

