Friday, December 9, 2011

Complexity kills

In this case, it killed all 228 people aboard Air France flight 447 a couple years back.  Popular Mechanics has a detailed analysis of what we know from the recovered Black Box recordings, and it makes for fascinating reading.  Here are the parts most interesting to me:
Just then an alarm sounds for 2.2 seconds, indicating that the autopilot is disconnecting. The cause is the fact that the plane's pitot tubes, externally mounted sensors that determine air speed, have iced over, so the human pilots will now have to fly the plane by hand.

Note, however, that the plane has suffered no mechanical malfunction. Aside from the loss of airspeed indication, everything is working fine. Otelli reports that many airline pilots (and, indeed, he himself) subsequently flew a simulation of the flight from this point and were able to do so without any trouble. But neither Bonin nor Roberts has ever received training in how to deal with an unreliable airspeed indicator at cruise altitude, or in flying the airplane by hand under such conditions.

...

Almost as soon as Bonin pulls up into a climb, the plane's computer reacts. A warning chime alerts the cockpit to the fact that they are leaving their programmed altitude. Then the stall warning sounds. This is a synthesized human voice that repeatedly calls out, "Stall!" in English, followed by a loud and intentionally annoying sound called a "cricket." A stall is a potentially dangerous situation that can result from flying too slowly. At a critical speed, a wing suddenly becomes much less effective at generating lift, and a plane can plunge precipitously. All pilots are trained to push the controls forward when they're at risk of a stall so the plane will dive and gain speed.

The Airbus's stall alarm is designed to be impossible to ignore. Yet for the duration of the flight, none of the pilots will mention it, or acknowledge the possibility that the plane has indeed stalled—even though the word "Stall!" will blare through the cockpit 75 times. Throughout, Bonin will keep pulling back on the stick, the exact opposite of what he must do to recover from the stall.
What makes this particularly interesting is that the Airbus has a history of pilot/auto-pilot confusion.  An early crash (don't have the book here, sorry, but it was in the early 1990s) happened because the autopilot overrode the pilot's controls, preventing the plane from recovering from a stall and ultimately crashing it.  Here, the auto-pilot didn't override the pilot's controls, allowing him to basically bring the speed of the plane to zero.

The moral, from a software engineering perspective, is that there probably isn't any right way to design a system that will always work.  What is absolutely critical is to have obvious and transparent communication of critical information that the pilot needs.

Airbus has a spotty record on this.  Another crash (this one in Greece in the 1990s) was due to confusion about rate of descent vs. altitude, and the computer being incorrectly set.  That plane flew into a mountain.

Now, it's easy for me to say this; after all, I'm not a pilot and don't really know all the possible inputs that might lead to a crash.  However, there is a long history of strange "pilot error" in Airbus planes, and I quite frankly don't think that Air France hires incompetent pilots.  I think that the system complexity of their "Glass Cockpit" has exceeded the pilots' ability to understand what's going on, and in particular what is happening in emergency situations, and what the computer will do and what it won't do.

27 comments:

bluesun said...

Time to bring back the flight engineer?

NotClauswitz said...

Fortunately United has an all-Boeing fleet to Hawaii, since I dread the thought of swimming 200-odd miles to find an island...

Anonymous said...

I didn't bookmark it, unfortunately, but shortly after flight 447 crashed, I read that Air France does actually hire pilots who have very few hours by US standards, and particularly fewer hours in planes that don't think they can fly themselves.

But I can't find the reference, so take that as just some internet bonehead with a possibly faulty memory. Might have been "Ask the Pilot" at Salon.

Rob K said...

I remember in a software engineering class back in the early `90s talking about Airbus planes dropping from the sky because of their poor programming. Sounds like they haven't changed much.

Broken Andy said...

One question: if the air speed sensors had become inoperable, how does the computer know the plane is slowing down enough to sound the "stall" warning?

deadcenter said...

Rather than complexity, they forgot rule 1, Fly the dam plane. Okay, they didn't have a reliable airspeed indicator, they still had an Artificial Horizon that would have been showing the wings above the horizon, indicating a climb. The Altimeter would have shown them continuing to gain altitude. Shallow dive to a lower altitude. Wait for pitot tubes to defrost. Fly the dam plane. This was less too much complexity and more lack of training.

I am not a pilot. I took the ground school class as part of a Physics of Flight class in grad school and Dad was a carrier based pilot in the Navy.

lelnet said...

"neither Bonin nor Roberts has ever received training in how to deal with an unreliable airspeed indicator at cruise altitude, or in flying the airplane by hand under such conditions"

If this is actually true, then your assumption that Air France doesn't hire incompetent pilots is bogus.

You can't get a _private_ certificate from the school I went to without getting training like that. Let alone a freakin' ATP.

It may well be true that the system is unreasonably complex, and this contributed to the accident. But if they're hiring people who haven't been trained to deal with equipment failures, then they're hiring incompetent pilots.

NotClauswitz said...

Sat in the back of an Airbus going down to Guatemala in 1989, and during the entire 5+ hour flight the ass-end of the plane was in active motion as the computers were constantly adjusting and readjusting the elevators and rudder - there was not one quiet calm moment of flight in the back. It totally sucked from a passenger perspective and most of it was unnecessary extra work. But that's what I expect from a plane designed by Eurocrats.

Old NFO said...

Another 'problem' with Airbus is NO feedback through the stick. In most airplanes, if you get in a stall, you get a stick shaker...

WoFat said...

My country is rich. Your country is poor. We will share a currency that has the same value in both countries. Sure, that'll work.

Angus McThag said...

Andy: angle of attack indicator can give stall warnings.

The thing that stumped me was that you can pull one stick full back and the other stick doesn't give you any indication. How are the crew supposed to work together if they only have voice communication to tell what the other guy is up to? Bonin held that stick full rear the whole fall and the other guy could only neutralize the elevators with a full-forward input.
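To make the stick problem concrete, here is a minimal sketch in Python. It assumes, as the comment above implies, that the two stick inputs are simply added together and clipped to the range of a single stick. The function name and the -1.0 to 1.0 scale are made up for illustration; this is not Airbus code.

```python
def combined_pitch_command(left_stick: float, right_stick: float) -> float:
    """Combine two pitch inputs into one command (illustrative only).

    Each input runs from -1.0 (full forward, nose down) to +1.0
    (full back, nose up). Assumption: the inputs are summed and the
    result is clipped to what a single stick could command.
    """
    total = left_stick + right_stick
    return max(-1.0, min(1.0, total))


# One pilot holds full back; the other needs full forward just to get to neutral.
print(combined_pitch_command(1.0, -1.0))  # 0.0
# A half-hearted push forward still leaves a nose-up command.
print(combined_pitch_command(1.0, -0.5))  # 0.5
```

Under that assumption, neither pilot can tell from his own stick what the combined command is, which is the coordination problem described above.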

I wondered then, as I wonder now, why so much authority was given to the computer when there's no reason for an airliner to be inherently unstable. The Airbus isn't unstable, so it doesn't NEED the computer tweaking constantly to fly.

Jake (formerly Riposte3) said...

"Andy: angle of attack indicator can give stall warnings."

More specifically, there's a turbulence created near the leading edge of the wing just before an aircraft actually stalls. There are sensors designed to detect this turbulence that give a stall warning.

I don't know what the big planes use, but on the Cessnas and Beechcraft I rode in for my orientation flights in Civil Air Patrol, the stall sensor seemed to be just a little tab that could slide forward and back in a slot near the leading edge of the wing, underneath. It was designed so that the pre-stall turbulence would make it move extremely rapidly, and (I think - it's been almost 20 years now) the vibration would be transmitted through the airframe and into the cockpit. It was loud enough to be heard over the engine and through the radio headsets.

Anonymous said...

::::Sigh::::

Planes stall because of Angle Of Attack (AOA), not airspeed. A jet at 50 knots can be flying in control and not stalled, in some circumstances. The same jet at 250 knots can be fully stalled, in other circumstances. Depends only on what angle the wind is hitting the wings. Pulling on the stick (generally) increases the angle of attack.

A stall occurs from the TRAILING edge of the wing and moves forward, not the leading edge. That little tab on the leading edge that Jake talks about is a very simple AOA indicator, digital you might say. AOA below a critical value, no warning. AOA above a critical value, tab moves, warning sounds. The ones on Cessnas are even simpler. Just a New Year’s Eve horn in the cockpit that sounds when the air from the cabin flows out that hole (because the air pressure around the hole is low, because it's stalling).
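For the software folks, the "digital" AOA warning described above can be sketched in a few lines of Python. The threshold value and the names are hypothetical, picked only for illustration; real critical angles depend on the wing and the flight conditions. The point is that airspeed never enters the decision.

```python
from typing import Optional

# Hypothetical critical angle of attack in degrees, for illustration only.
CRITICAL_AOA_DEG = 15.0


def stall_warning(aoa_deg: float, airspeed_kt: Optional[float] = None) -> bool:
    """Return True when the warning should sound.

    airspeed_kt is accepted but deliberately never consulted: the
    warning depends only on angle of attack, so it keeps working even
    when the airspeed reading is missing or unreliable.
    """
    return aoa_deg > CRITICAL_AOA_DEG


print(stall_warning(4.0, airspeed_kt=50))     # False: slow, but low AOA
print(stall_warning(20.0, airspeed_kt=250))   # True: fast, but high AOA
print(stall_warning(20.0, airspeed_kt=None))  # True: no airspeed needed
```

That is also the short answer to the earlier question about how a warning can sound with the airspeed sensors iced over.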

The pilots had received training in how to recover from stalls, but the training they received didn't really help them. Their problem was symptomatic of the current trend toward less experienced pilots without much primary flight time moving into big jets. In big jets, the engine power will USUALLY be able to recover a stalled aircraft (powering THROUGH the stall). Pilots are trained in big jets not to push on the stick so that they don't sink much during the recovery, to hold a positive deck angle, and wait for the engines to fix the stall. They are trained this way to COUNTER the training that every pilot used to get in smaller planes, where the primary stall recovery method is to push the stick forward BRISKLY. This breaks the stall, but loses altitude. The syllabus assumes that the computer won’t allow the plane to depart (spin) and that the altitude loss is the biggest problem. The old method built solid reflexes into pilots that stalls have to be broken FAST, because stalling causes loss of control, which will kill you fast at low altitude (where stalling is most common and most dangerous).

Cont'd.
FormerFlyer

Anonymous said...

Cont'd.

The Airbus has a completely different flight control system, and it was designed by engineers. The engineers seem to have been under the impression that the pilots were trying to crash the planes, so they locked off a lot of functions and options that are available to pilots of other fleets and handled them in different ways. These differences are very counter-intuitive to most pilots, and have caused some spectacular incidents and some tragic accidents. A common conversation between traditional pilots and Airbus trainers and engineers goes like this:
Pilot: “Why can’t I do [basic function that is common in every other plane].”
Engineer: “Why would you want to do that?”
Pilot: “So I can fly the airplane”
Engineer: “The computer takes care of that. You don’t have to do that in this plane.”
Pilot: “But there are times that I need to do that. What if I need to?”
Engineer: “We don’t want you to, so we made it so you can’t. Don’t get in that situation.”
Pilot(under his breath): “Arrogant little pr1ck.”
As exhibit A, I submit the clip of an AIRBUS TEST PILOT crashing a brand new jet at an airshow in Paris. http://www.youtube.com/watch?v=KEH7OpnA-I4 . The pilot forgot to tell the computer that he wanted to go around, the computer was set to land, and even though the pilot jammed the throttles to the stops to increase power, the computer wouldn’t allow it because it was trying to land. In any other airplane, when the throttles go to the stops, the engines go to full power. Airbus thinks the pilot might be too stupid to know what he’s doing, so they locked off the throttles during that phase. There’s a good idea.

There will be more accidents from low-time pilots flying transport category jets as the years go along, because there is no longer a difficult apprenticeship that pilots have to go through to get into airline cockpits. Nobody wants the jobs anymore, and the airlines are hiring lower and lower time pilots, very especially overseas. Until training changes to account for the fact that these are pilots with VERY limited experience, there will be more accidents.

When I was flying, there were thousands of applicants for every airline job, most with 2000-4000 hours of Pilot In Command time in commercial aviation, or with years of military flight experience. Many of us were 2nd or 3rd generation pilots. Today, of all my former peers who fly the line, I don’t know of a single pilot whose kids want to follow in their father’s footsteps. The job sucks, the pay sucks, and the animosity between the unions and the pilots is one minor step short of open violence. This will not be corrected overnight. This may not be corrected at all.

FormerFlyer

Anonymous said...

Sorry, in the last paragraph I meant to say "The animosity between the union pilots and the airline management is one minor step short of open violence."

My bad.

FormerFlyer

Borepatch said...

FormerFlyer, thanks for the information-rich comments. I think you were saying from a flying perspective what I was saying from a security perspective: the designer can't know what is going to happen in every situation, and a robust design will take this into account.

I'm not at all confident that the Airbus software design is robust that way. It certainly has a long history of fatal crashes. While the engineers keep trying to say "pilot error", at some point you have to wonder if the design just sucks.

Jake (formerly Riposte3) said...

FormerFlyer, thanks for the correction and clarification. Like I said, it's been about 20 years, and you know what happens when you try to recall something you learned that long ago if you haven't kept it fresh - no matter how thoroughly you might have memorized it at the time!

The CAP squadron I was in had a flight instructor as a member, and he donated a couple of slots in one of his ground school classes right at the time that I was able to take advantage of the offer. Unfortunately, I couldn't afford to go on to the flight portion after that. I still regret it, but there really wasn't anything I could do about it at the time, and now I'd have to start all over again.

Rob K said...

The thinking of the Airbus engineers described by FormerFlyer about preventing pilots from doing the "wrong" thing sounds remarkably like the thinking of the Java language designers about preventing programmers from doing the wrong thing, and it is absolutely the wrong way to go about things. Good design in any field focuses on enabling people to do the right thing easily.

Anonymous said...

Borepatch:
Sorry, I get wordy. You are exactly correct: The designers aren't users, don't know everything the user will need or want, or think that what they want is stupid and that they know better. In aviation, we pay for those lessons in blood. We used to call it "Tombstone" or "Graveyard" safety. Lessons learned the really, REALLY hard way.

Jake:
Sorry you didn't get the chance to fly. There's nothing like it.

Rob:
My favorite class at Embry-Riddle Aeronautical University was a 400 level safety class called Human Factors in Engineering. Theme of the class was you have to design not for an ideal user, nor for an average user, but for your TYPICAL user. You also have to take into account those at the end of the bell curves that work with your system, and accommodate them.

The professor started the first day with my favorite class introduction of all time, “How many of you are already commercial pilots? 27? Ok, without knowing any of you very well, I can categorically say that, if I were to interview your friends and family about you, they would tell me that somewhere between 20 and 25 of you are total A$$H0L3S.” He went on to point out that he was not making a slur or a judgment, just an identification of his user group. He said that, to become commercial pilots, we had to be driven, outcome oriented, no nonsense, practical and hardnosed. He said that we needed to be a little smarter than average, but not nearly as smart as we thought we were. He said that most of the people that started out on that path were probably already pre-selected for most of those traits, and that the 300-400 hours Pilot in Command time we needed to get our ticket would have built those traits into those that lacked them, or weeded them out. We all knew people that were lacking in a critical trait or skill set that had quit or died of terminal wishful thinking or creative stupidity. That only served to reinforce those traits in us. He said that, all in all, we weren’t fun people to have around at a party. (He was right!) He also said that any designer that didn't take those personality traits into consideration when designing systems we'd be using was going to kill people.

Then he asked for each of us commercial pilots to raise our hands if we were the A$$H0L3S he was describing. Naturally, everyone of us 27 thought that we were the exceptions. So you can’t trust the user’s feedback entirely, either.

FormerFlyer

Broken Andy said...

As a software engineer and software system designer, I can tell you that the users who think they know what they are doing cause the most problems. The users who admit they don't know what they are doing generally are easy to help.

@Rob K, yes, Java sometimes tries to prevent you from doing the wrong thing, and some of those features are quite common in modern programming languages. That is in contrast to Perl, which will let you do the wrong thing about 15 different wrong ways.

Jake (formerly Riposte3) said...

@Andy: In software, that may make sense, because the guys who design software for programming also do programming.

How many of the Airbus engineers also fly large aircraft on a regular basis?

Rob K said...

@Andy, sure Perl will let you do the wrong thing in fifteen different ways, but it will also let you do the right thing in a hundred different ways. There's more than one way to do it, after all. :) Perl focuses on enabling the programmer. The idea is to make it easier to do the right thing, if you know what the right thing is, than to do the wrong thing.

No matter the language (or plane or any other system), no amount of trying to prevent the user from doing the wrong thing will prevent the user from doing something wrong if the user has no clue what the right thing to do is in the first place. Conversely, all your attempts to prevent the user from doing the wrong thing can completely prevent him from actually doing what he has determined to be the right thing.

Borepatch said...

Can I just say that this is the coolest comment thread ever seen here? You guys rock.

Broken Andy said...

Well, the stuff about airplanes and system design is cool. The Perl vs. Java is just juvenile. Still doesn't matter cuz Perl sucks!

:)

lelnet said...

"No matter the language (or plane or any other system) no amount of trying to prevent the user from doing the wrong thing will prevent the user from doing something wrong if the the user has no clue what the right thing to do is in the first place."

Also known as "I'm a human. You're a machine. There is no conceivable situation in which I don't outrank you."

The pilot is in the cockpit coping with the actual situation in which the plane is trying to fly through the air, at substantial risk of actual death if he finds himself having to get into an argument with the machines in an emergency. The engineer, on the other hand, is safely on the ground, and risks only embarrassment. I trust the pilot more, even if he _is_ an asshole.

Anonymous said...

. . . and, he is.

;-)

FormerFlyer

NotClauswitz said...

When I was designing (UI) for the on-screen guide we designed from the user-side with the user-perspective (it helps to stay guide-ignorant), and went to work on everything we could User-think of to do User-damage and especially Mom-damage that we could. We designed in fail-safes for ignorance - which kept the design pretty flat and without a lot of depth to swim deeply into, and with a rubber bottom so the headfirst divers wouldn't hurt themselves.