RubyConf 2021

This is not a talk about airplane crashes

by Andromeda Yelton

This video, titled "This is not a talk about airplane crashes," presented by Andromeda Yelton at RubyConf 2021, focuses on the intersection of human decision-making, safety protocols, and systemic issues in aviation, rather than merely recounting airplane accidents. Yelton highlights crucial lessons learned from various aviation incidents to illustrate how analyzing these events can benefit broader contexts, especially in technical systems where human interaction is involved.

Key Points Discussed throughout the Video:
- Human Factors in Aviation: The talk emphasizes the importance of recognizing human limitations such as fatigue and stress, which can lead to poor decision-making in high-pressure situations.
- Colgan Air Flight 3407: This incident highlights the tragic consequences of pilot fatigue and inadequate training on recognizing fatigue signs. Poor decision-making in response to a warning alarm led to the crash, resulting in 50 fatalities.
- United Airlines Flight 173: This flight underscores the need for Crew Resource Management (CRM) training. Although crew members remarked on the fuel state several times, a lack of communication and assertiveness meant the crew never acted on it, and the aircraft ran out of fuel and crashed.
- British Airtours Flight 28M: In this cabin fire, most of the deaths came from inhaling toxic gases released by flame-retardant materials rather than from burns. The investigation prompted changes in testing regulations and highlighted the need for effective emergency procedures and training.
- Air France Flight 358: This incident illustrates successful evacuation procedures thanks to well-trained cabin crew. Despite a challenging situation, effective teamwork and communication led to the safe evacuation of all passengers.

Conclusions and Takeaways:
- Systemic improvements and the sharing of lessons learned from past tragedies can make aviation safer. Understanding how individual mistakes occur within broader systems can enhance safety protocols.
- A participative culture in the cockpit encourages open communication and mitigates the risks associated with hierarchical structures.
- Properly training crew members (both pilots and flight attendants) for crisis situations can significantly improve survival rates during emergencies.

Overall, Yelton's talk connects the dots between aviation incidents and valuable insights applicable to myriad human-centric technical environments, demonstrating that through meticulous investigation and systemic learning, we can foster safer operational practices.

00:00:11.040 Hi, I'm Andromeda Yelton, @ThatAndromeda on Twitter, and this is not a talk about airplane crashes. That said, we will be talking about several airplane crashes that involve significant loss of life and damage to property.
00:00:17.760 So if that's not something you want to listen to right now, there are a lot of other excellent RubyConf talks, and I suggest that you go pull up one of those. All right, this is me. You can't see me, so I wanted to make sure you knew what I looked like and had the chance to imagine me with lots of facial expressions and talking with my hands, because that's what I do.
00:00:36.559 I am a software engineer who actually trained as a librarian, so I've written most of my software in libraries, academic and cultural heritage organizations such as the Wikimedia Foundation and the Library of Congress. In that context, I've actually written a lot of Ruby because it turns out Rails is super popular in the library and open source world. I am also a student pilot. I have flown this plane, the brown and white one in the front, and I’m a big nerd about aviation safety and history.
00:01:11.760 Now, if I were with you face to face, I'd say right now that we were all going to read this slide out loud every time we saw it, and we'd practice doing that as a group. Right now, I'm not there, so just picture that in your head or maybe say it out loud whenever you see this slide. Those of you who have seen Ernie Miller's brilliant talk 'How to Build a Skyscraper' know where I'm going with this and will recognize how much the skeleton of my talk owes to his. Those of you who have not seen it, do yourself a favor, Google Ernie Miller's talk 'How to Build a Skyscraper.' You won't regret it.
00:01:54.799 With that said, let's talk about Colgan Air 3407. This photograph is of a Bombardier Q400, which is the same type of plane that was in the accident, although not the exact same aircraft. This flight was bound from Newark Liberty International Airport to Buffalo Niagara International Airport on February 12, 2009.
00:02:11.360 It was delayed about two hours, resulting in its final departure from Newark at 9:18 PM with 49 people on board. As they were descending into Buffalo, they experienced a stick shaker alarm. This is one of the most serious alerts that an aircraft can give. It's a warning of an impending stall.
00:02:31.519 So what exactly is a stall? Well, when a plane is flying, it has wings, and those wings are airfoils which generate lift from the way that oncoming air flows around them. When oncoming air hits the leading edge of the wing, it separates, and part of it flows below and part of it flows above the wing. As you increase the angle of attack of the aircraft, meaning the angle at which the wings meet the oncoming air, you eventually reach a critical angle of attack at which that separated flow doesn't really come together after the wing or doesn't come together for a long time. Above the wing, there are vortices, vacuums, and really turbulent airflow that disrupt the lift being generated. The result is that the wing can no longer generate adequate lift.
00:03:56.720 You need to keep your aircraft below the critical angle of attack. This angle varies with different aircraft but is usually around 17 degrees. Up to that critical point, a higher angle of attack actually helps to generate lift. In fact, the slower you go, the higher the angle of attack you need in order to generate enough lift to stay airborne. This creates an envelope within which you can fly—if you go too slow, it's impossible to fly at all because the angle needed to generate enough lift meets the critical angle of attack, and you can no longer maintain flight. Therefore, pilots are trained to pitch their nose downward and add thrust during a stall. Pitching the nose down reduces the angle of attack, while adding thrust increases the aircraft's speed, thus increasing the range of angles of attack at which it's possible to generate enough lift to fly.
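The envelope described above follows from the standard lift equation, lift = ½ × air density × speed² × wing area × lift coefficient. The following is a minimal sketch rather than a flight model: it assumes the lift coefficient grows linearly with angle of attack up to the critical angle, and the mass, wing area, and lift slope are hypothetical values that do not come from the talk. Only the 17-degree critical angle is taken from the talk.

```python
import math

# Assumed values for illustration only. None of these come from the talk
# except the critical angle of attack ("usually around 17 degrees").
AIR_DENSITY = 1.225        # kg/m^3, sea-level standard atmosphere
GRAVITY = 9.81             # m/s^2
MASS_KG = 1000.0           # hypothetical small aircraft
WING_AREA_M2 = 16.0        # hypothetical wing area
LIFT_SLOPE_PER_DEG = 0.1   # assumed linear lift-coefficient slope
CRITICAL_AOA_DEG = 17.0    # from the talk


def required_aoa_deg(speed_ms: float) -> float:
    """Angle of attack needed for lift to equal weight at a given airspeed."""
    weight = MASS_KG * GRAVITY
    dynamic_pressure = 0.5 * AIR_DENSITY * speed_ms ** 2
    lift_coefficient = weight / (dynamic_pressure * WING_AREA_M2)
    return lift_coefficient / LIFT_SLOPE_PER_DEG


def stall_speed_ms() -> float:
    """Slowest airspeed at which the critical angle still yields enough lift."""
    max_lift_coefficient = LIFT_SLOPE_PER_DEG * CRITICAL_AOA_DEG
    weight = MASS_KG * GRAVITY
    return math.sqrt(
        weight / (0.5 * AIR_DENSITY * WING_AREA_M2 * max_lift_coefficient)
    )


def can_sustain_level_flight(speed_ms: float) -> bool:
    """True if the required angle of attack is at or below the critical angle."""
    return required_aoa_deg(speed_ms) <= CRITICAL_AOA_DEG


if __name__ == "__main__":
    for speed in (40.0, 30.0, 25.0, 20.0):
        print(f"{speed:5.1f} m/s -> needs {required_aoa_deg(speed):5.1f} deg, "
              f"flyable: {can_sustain_level_flight(speed)}")
    print(f"stall speed ~ {stall_speed_ms():.1f} m/s")
```

In this simplified model, slowing down raises the required angle of attack until it crosses the critical angle, which is the edge of the envelope. Adding speed moves the aircraft back inside it.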
00:05:07.039 As a student pilot, I have already been told to execute this maneuver, although I haven't practiced it yet. What did the crew of Colgan Air 3407 do in response to the stick shaker alarm? They pitched up and added thrust, but only a little thrust, not enough to increase the plane's speed. At this point, they were effectively climbing a hill in the sky, and the outcome was that their angle of attack increased while their airspeed decreased. Over the next 30 seconds, the angle of attack varied between 10 and 27 degrees, at its peak well above the critical angle. The result was that the plane generated much less lift, but there was still just as much gravity acting on it, leading to only one possible outcome: they ultimately lost control of the aircraft, which descended to the ground.
00:05:43.919 In this incident, there were ultimately 50 fatalities: the 49 people on board plus one person in a house that is no longer present in this photograph. So how did this happen? Why would trained, qualified pilots respond so poorly to this rudimentary but critical alarm? The National Transportation Safety Board (NTSB) of the United States identified numerous causes for this accident, and I will focus on one for the sake of time: the actions of the first officer the night before the crash.
00:06:35.760 As you'll recall, the plane departed from Newark, but the first officer actually lived in Seattle. It is common in aviation for pilots not to live in the city their flights depart from, because they can get a ride. The night before the incident, she had been on a red-eye flight from Seattle to Memphis and then departed Memphis around 3 AM for New York. The captain, based in Florida, had been in Newark the previous night but had not had a hotel; he had been sleeping in the airport. You might ask, where do they sleep? Well, we know that the captain had taken some naps in the staff lounge at Newark airport. This was a problem for Colgan Air, as many crew members were napping in staff lounges not designed for that purpose. To combat this, Colgan made those spaces less conducive to napping by increasing the lighting and enforcing compliance with crew fatigue policies. However, they hadn't trained their personnel on how to recognize the signs of fatigue or provided a reasonable place for them to sleep.
00:08:04.640 Why was the captain in the Newark airport and not in a hotel room? Because there was no hotel room available. The fact is, as humans, we live in bodies, and if those bodies are not adequately rested—even if we are capable and well-trained—we can't perform at our best. The first officer was caught on the cockpit voice recorder saying, 'I feel like Colgan walks all over me. This company treats me like crap.' One hour, 10 minutes, and 10 seconds later, she would be dead as a result of this incident.
00:08:50.560 Following recommendations from the NTSB, the Federal Aviation Administration (FAA) in the United States instituted legal restrictions on flight time, duty time, and rest for aircrews. Similar rules were later adopted abroad. But I want to remind you: this is not a talk about airplane crashes.
00:09:15.120 Now let's talk about United 173. This photo is of the accident aircraft, which was a DC-8 en route from JFK to Portland, Oregon, via Denver, on December 28, 1978. For the most part, what follows is taken directly from the cockpit voice recorder. The first two slides precede that recording, but the rest are all direct quotes with timestamps from that recorder.
00:09:45.440 At about 17:10, shortly after 5 PM local time, the landing gear were deployed as the pilots were on approach to Portland airport. They noticed a loud thump and a yawing motion of the aircraft, and the main landing gear light did not illuminate. The crew began to troubleshoot, trying to figure out what was going on: had the landing gear deployed successfully? They looked for other indicators that might provide that information.
00:10:01.600 They began discussing emergency landing and evacuation procedures in case their landing did not go smoothly. They briefed the flight attendant so that she would know what was happening and could prepare the cabin. At around 17:38, they called the United maintenance office in San Francisco and engaged in a radio conversation to troubleshoot the landing gear issues further.
00:10:21.920 The captain stated his intention to hold for about 15 to 20 more minutes—until just shy of 18:00. During the conversation with the maintenance crew, they asked him to confirm that he would ultimately be landing about five minutes past the hour. He confirmed that was correct. He didn't want to hurry; he wanted to give the flight attendants plenty of time to prepare the cabin. It was a clear day, and they expected no problems.
00:10:55.680 At 17:50:20, the captain asked the other pilots for an updated weight estimate and expressed the idea that they'd have about 15 more minutes until landing. The first officer confirmed that they would indeed have 15 minutes, but the flight engineer warned that would run them really low on fuel. About 12 minutes later, the flight engineer noted that they had approximately three minutes of fuel left. Twenty-two seconds thereafter, the first officer informed Portland air traffic control that they intended to land on 28 Left in about five minutes. Thirty-nine seconds later, Portland asked them for the souls on board and the fuel remaining. The flight crew then spent the next three minutes discussing the landing gear system.
00:12:15.680 At 18:06:46, the first officer stated, 'We're going to lose an engine, buddy,' to which the captain replied, 'Why?' At 18:13:25, the flight engineer announced, 'We just lost engines one and two.' The captain declared, 'They're all going; we can't make Troutdale!' The first officer affirmed, 'We can't make anything.' Twenty-one seconds later, very belatedly, they made the mayday call, but less than a minute after that, they crashed into the suburbs of Portland after running entirely out of fuel.
00:12:42.800 Of the 181 passengers and eight crew members aboard, eight passengers and two crew members were killed, while another 21 passengers and two crew members were seriously injured. Despite crashing in a heavily populated suburban area, no one on the ground was hurt. Why? You know from listening to the cockpit voice recorder that the plane crashed because it ran out of fuel, even though at several points the crew commented on the fuel situation and the time remaining.
00:13:00.240 At no point, however, did they compare their intentions for landing the aircraft against the fuel situation and the time situation. This raises the question of how it is possible that with all this information and a perfectly sound aircraft, they still managed to run it into the ground outside of Portland. I want to emphasize this incident because it was the first NTSB report to mention crew resource management, also known as cockpit resource management. There are actually many crashes where this is a factor. In a number of cases prior to United 173, it was a conspicuous element but did not yet have a name or formal recommendation in air crash investigation reports.
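The missing cross-check described above, comparing the landing plan against the fuel and time remaining, amounts to a single subtraction. The following is a minimal sketch under stated assumptions: the `LandingPlan` type, its field names, the reserve margin, and the example numbers are all hypothetical and are not taken from the cockpit voice recorder or the NTSB report.

```python
from dataclasses import dataclass


@dataclass
class LandingPlan:
    """Hypothetical record of what the crew intends versus what the tanks allow."""
    minutes_until_landing: float    # the crew's stated intention
    fuel_minutes_remaining: float   # estimated endurance at the current burn rate
    reserve_minutes: float = 0.0    # hypothetical safety margin to keep in hand


def fuel_margin_minutes(plan: LandingPlan) -> float:
    """Minutes of fuel left over after landing and holding back the reserve."""
    return (plan.fuel_minutes_remaining
            - plan.minutes_until_landing
            - plan.reserve_minutes)


def plan_is_viable(plan: LandingPlan) -> bool:
    """True only if the plan leaves a non-negative fuel margin."""
    return fuel_margin_minutes(plan) >= 0


if __name__ == "__main__":
    # Illustrative numbers only, not the actual figures from the flight.
    plan = LandingPlan(minutes_until_landing=15,
                       fuel_minutes_remaining=12,
                       reserve_minutes=5)
    print(f"margin: {fuel_margin_minutes(plan)} min, viable: {plan_is_viable(plan)}")
```

The arithmetic is trivial. What matters is that someone puts the two numbers side by side and says the result out loud, which is the behavior crew resource management training tries to make routine.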
00:13:46.240 So, what is crew resource management? The report indicated a desire to ensure that flight crews were trained in it, particularly with emphasis on the merits of participative management for captains and assertiveness training for other cockpit crew members. Crew resource management focuses on the dynamics of authority within the cockpit. Traditionally, aviation has been hierarchical and authoritarian, with the captain occupying a very commanding role on the flight deck.
00:14:14.480 However, the authority of the captain can exert subtle pressure on the crew to conform to his way of thinking, which may hinder interaction and proper monitoring, and may force another crew member to yield their right to express an opinion. In contrast, crew resource management suggests that it is the leader's responsibility to foster a participatory culture, inviting and welcoming feedback from all team members—even when that feedback may disagree with the captain's viewpoint. There may be situations when other crew members have the best situational awareness and understanding of a problem, as the cockpit voice recorder suggests happened in this incident.
00:15:27.520 Crew resource management emphasizes that while the captain is responsible for creating a participatory atmosphere, it is equally the responsibility of the followers to be assertive and speak up, avoiding passive reinforcement of potential errors by their inaction. While it does advocate for a level of hierarchy, it warns against the pitfalls of an overly steep authority gradient that can discourage participation. In aviation, a number of specific behaviors have been developed and formalized into training, ensuring that everyone understands how CRM works. This practice has become widespread and demonstrably contributes to safer skies.
00:16:35.120 Again, this is not a talk about airplane crashes. So let's talk about British Airtours 28M. This photo is of the accident aircraft, which on August 22, 1985, was bound from Manchester in the United Kingdom to Corfu in Greece for a vacation flight. The flight crew heard a loud thump during takeoff and, presuming it was a burst tire or bird strike, aborted the takeoff, which was the right decision. They were still on the ground, so stopping was the safest choice in an uncertain situation.
00:17:00.160 However, while a burst tire had been a reasonable hypothesis, what they actually experienced was the explosive failure of an engine part. The tower notified them of a fire, and fire trucks immediately noticed the situation, deploying within 25 seconds. Alarming as this looks, the report of the AAIB, which is the UK equivalent of the NTSB, stated that many factors should have biased events toward a favorable outcome for this incident. Despite the immediate emergency response, 55 of the 137 people on board died—a tragic outcome that shook the aviation industry as they believed it should've been nearly 100% survivable.
00:17:49.840 What happened? At the time, a main concern of safety testing was flashover: a catastrophic situation where a large area such as the cabin suddenly ignites entirely. Understandably, there was focus on preventing flashover, leading to an emphasis on cabin material flame retardance. However, the test conditions did not match what happened in the British Airtours 28M accident, where cross-ventilation from multiple open emergency doors prevented flashover from occurring. The AAIB realized this was more representative of real-world accidents than the tests typically conducted.
00:18:39.200 While flashover did not happen, of the 55 fatalities, 48 died from smoke inhalation rather than burns. Flame retardant materials, intended to slow down the spread of fire, actually produced toxic gases: carbon monoxide, sulfur dioxide, ammonia, hydrogen cyanide, nitrogen oxides, hydrogen chloride, and hydrogen fluoride. Those last three, when in contact with water—like humidity or moisture in lungs—form strong acids. Many people escaped before the smoke worsened due to swift action by the crew, while some incapacitated survivors were rescued by an assertive flight attendant who physically pulled them to safety.
00:19:46.080 While there was some prior awareness of toxic outgassing from cabin materials, neither US nor UK regulators had specific standards addressing this issue. Consequently, the tests being carried out measured the wrong factors. The report called for improved tests but went further. It noted that regulations at the time mandated that, for an aircraft to be certified as airworthy, it must be possible to evacuate it within 90 seconds, even if half of the doors are blocked. However, this requirement had not been effective in practice because there were no briefings for passengers seated in exit rows, leading to delays during evacuation. Moreover, panic in real-world scenarios caused people to jam in narrow spaces, impeding evacuation, a human factor that prior test scenarios had not taken into account.
00:20:59.840 Passengers had received no briefing on how to operate the doors, which prolonged the evacuation. This is why, today, flight attendants brief passengers seated in exit rows and give thorough instructions on emergency procedures. The flight attendant who helped passengers out of the cabin despite exposure to toxic smoke demonstrated assertiveness, entering the hazardous cabin to check for survivors until ordered to evacuate by fire personnel.
00:22:09.040 Now let's discuss Air France 358. The aircraft involved in this incident, on August 2, 2005, was en route from Paris to Toronto Pearson Airport, the larger of Toronto's two airports. On this day, the weather was appalling, with over 500 flights cancelled at Toronto Pearson Airport. In the time leading up to the accident, the rainfall rate was recorded at 4 inches per hour, or about 10 centimeters per hour. Furthermore, just two minutes before the plane landed, there were at least 20 lightning strikes around the runway, significantly affecting airport operations.
00:22:51.840 Due to the storm, several critical pieces of airport equipment were rendered inoperative or destroyed, including wind measuring devices. This meant air traffic control was unable to provide the pilots with accurate weather data. To make matters worse, with various runways unusable, all planes were redirected to land on 24L, the shortest available runway, which was flooded. Despite discussing possible diversions to cities like Niagara Falls, Ottawa, or Cleveland, the crew ultimately chose to attempt landing in the worsening weather since they believed conditions had previously been viable.
00:23:30.560 Despite taking on extra fuel prior to departure to mitigate the chance of a diversion, they mentally committed to their landing plan and lost flexibility as conditions deteriorated. Upon attempting to land on runway 24L, they required around 8,780 feet to stop safely, but they landed 3,800 feet down the runway. Compounding this situation, they delayed in deploying thrust reversers due to high workload, failing to slow down adequately. Consequently, they overran the runway and found themselves in a ravine, just short of the extremely busy Highway 401 during rush hour.
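The distances quoted above can be put side by side to show how little margin remained. The following is a rough sketch under stated assumptions. The 9,000-foot runway length is an assumption that is not stated in the talk. The sketch also treats the 8,780 feet as the distance needed from the touchdown point, which is a simplification of how landing distance is normally measured.

```python
# Figures from the talk.
REQUIRED_STOP_DISTANCE_FT = 8780   # "required around 8,780 feet to stop safely"
TOUCHDOWN_POINT_FT = 3800          # "landed 3,800 feet down the runway"

# Assumption: runway 24L length, which the talk does not state.
RUNWAY_LENGTH_FT = 9000


def remaining_runway_ft(runway_length_ft: int, touchdown_point_ft: int) -> int:
    """Runway left ahead of the aircraft at the moment of touchdown."""
    return runway_length_ft - touchdown_point_ft


def stopping_shortfall_ft(required_ft: int, remaining_ft: int) -> int:
    """How far past the runway end the aircraft would need to stop (0 if none)."""
    return max(0, required_ft - remaining_ft)


if __name__ == "__main__":
    remaining = remaining_runway_ft(RUNWAY_LENGTH_FT, TOUCHDOWN_POINT_FT)
    shortfall = stopping_shortfall_ft(REQUIRED_STOP_DISTANCE_FT, remaining)
    print(f"remaining runway: {remaining} ft")   # 5200 ft
    print(f"shortfall: {shortfall} ft")          # 3580 ft
```

Under these assumptions, the aircraft had roughly 5,200 feet of runway left against a requirement of several thousand feet more, before accounting for the late thrust reversers.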
00:24:05.120 Even though the plane was still on airport property and fire crews arrived within 52 seconds, it ultimately took four hours to control the ensuing fire. Remarkably, of the 309 individuals on board, all survived. Only 33 required hospitalization, and of those, 21 were treated and released. Despite facing catastrophic conditions, the crew's training and decisive actions resulted in a successful evacuation.
00:25:52.160 According to the Transportation Safety Board, the successful evacuation was due to exemplary and professional performance from the cabin crew. They utilized astute risk analyses and directed passengers toward suitable emergency exits while maintaining robust communication. Furthermore, the crew ensured that the required minimum number of flight attendants was present, allowing them to provide adequate coverage during the crisis. Ultimately, the evacuation took less than two minutes, with the last person to leave being one of the pilots, who confirmed that the cabin was empty.
00:27:02.080 The survival of 309 people in Canada in 2005 is attributable to lessons learned from events like the British Airtours accident in 1985. Throughout the world, accident investigators relentlessly pursue the truth, document it, and share findings. These investigations don't merely stop at individual pilot error; they delve into how specific circumstances could lead qualified individuals to make poor choices and strive to alter systemic factors moving forward, optimizing the environments for superior performance. Again, this is not a talk about airplane crashes. Thank you for your time.