Incident Response
You Got Ruby In My PHP! (You Got PHP In My Ruby!)

Summarized using AI

Rein Henrichs • April 07, 2011 • Earth

In his talk at Ruby on Ales 2011, Rein Henrichs discusses how to handle a security breach, drawing on his own experience of being hacked. The talk argues for taking security seriously in application hosting environments, particularly when hosting PHP applications in the cloud.

Key points of the discussion include:

- Initial Response to a Breach: Henrichs describes his sheer fright and panic upon learning of the hack. He emphasizes the importance of remaining calm and regaining control of both oneself and the systems involved.

- Containment Strategy: The safest immediate response to a breach is shutting down compromised systems to prevent further exploitation. Henrichs recounts having to take down over a thousand PHP applications to secure their infrastructure.

- Root Cause Analysis: The failure stemmed from inadequate security measures in a shared hosting environment, where mistakes such as lack of access control and shared SSH private keys facilitated the hackers' work.

- Communication with Customers: Informing users about the breach is crucial, and transparency about their data's potential exposure must be prioritized, despite the difficulty it presents.

- Rebuilding Post-Breach: Henrichs stresses that recovery should involve rebuilding systems from scratch and not merely trying to clean up compromised servers. Using configuration management can aid in the secure recreation of infrastructure.

Throughout the talk, Henrichs shares the profound lesson that security must take precedence over new feature development, especially in environments exposed to the internet. He urges startups to be proactive in securing their systems and to learn from their mistakes to prevent future breaches.

The key takeaway from the session is that security is an ongoing process requiring constant vigilance and improvement, and that it depends on a culture of security awareness among all team members. Henrichs concludes with a call for transparency and accountability, which foster trust with users. The talk is a reminder of the risks involved in web application security and of the practices that can mitigate them.

Ruby on Ales 2011

00:00:11.840 Thank you.
00:00:22.800 I was going to come here and give a talk about how we host a bunch of PHP applications in the cloud and how awesome it is, using fun words to impress you, or perhaps even scare you. However, on Saturday, that changed, so we'll talk about that instead because it's more fun.
00:00:51.480 On Saturday at around 11 PM, after I got home from the bar, I received a message from my assistant informing me that we had just been hacked.
00:01:00.000 Of course, my immediate response was, 'Where?' But then I heard the even worse news that it was everywhere. Literally, servers I didn’t even know we had were also compromised. My immediate reaction was sheer fright.
00:01:30.060 My next thought was, what do I do now? How do we fix this? The reason I'm giving you this talk is that I hope I can offer you some hard-earned advice on how to deal with a security breach, what to do immediately afterwards, and how to recover over the next days and weeks.
00:01:59.579 Maybe I'll come back next year if we're still around and tell you how everything ended up going for us. The first thing that most people will tell you when you get hacked is to not panic. It's very important not to panic, but that is often easier said than done.
00:02:06.600 The reality is, if you are already panicking, you can't control that. What you really need to do is try to stop panicking. Realize that the primal fear centers in your brain are already activated, releasing adrenaline into your system and making you want to make this bad thing go away. It’s critical to regain control of yourself and then of the systems you are responsible for.
00:02:41.879 Usually, that doesn’t mean logging in and picking up where you left off; it means shutting the system down. Take everything offline—shut it down, take it off the internet, and try to contain the failure as much as possible. It’s highly unlikely that you know the full extent of the exploit and which parts of your system are compromised.
00:03:09.840 The safest course of action is to assume that everything is compromised and shut it down. It's going to be difficult, especially if we're talking about live sites. In our case, we had to take down over a thousand PHP applications. It's going to hurt, but you have to do it. Keep all of your data, logs, and access records, because if anything was modified you will need them to understand what happened and how to prevent it in the future.
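One way to act on the advice about keeping data, logs, and access records is to record a checksum for every preserved file, so that later changes to the evidence can be detected. The sketch below is illustrative only and is not something described in the talk; the function names and the directory layout are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(evidence_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to evidence_dir) to its SHA-256 digest."""
    manifest = {}
    for path in sorted(evidence_dir.rglob("*")):
        if path.is_file():
            manifest[path.relative_to(evidence_dir).as_posix()] = sha256_of_file(path)
    return manifest


def changed_files(evidence_dir: Path, manifest: dict[str, str]) -> list[str]:
    """List files that are missing or whose contents no longer match the manifest."""
    current = build_manifest(evidence_dir)
    return sorted(name for name, digest in manifest.items() if current.get(name) != digest)
```

The manifest is only useful if it is stored somewhere the attacker could not reach, for example on a machine that was never part of the compromised environment.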
00:04:03.180 Let me talk a bit about how this happened to us. To do that, I have to explain our infrastructure. The way our PHP Fog infrastructure works is that we have a multi-tier system for every PHP application we host. This starts with a cache, for which we use Varnish. We then load balance using Nginx to a number of app servers, each reading from a DB master and clone.
00:04:20.579 In case all of our app servers are down—a rare occurrence—we have a shared Apache failover environment, and that is actually where our trouble started. The issue began in the shared environment because we did something very stupid: we didn’t secure it properly.
00:05:00.780 At the end of the talk, all of you who have never made stupid mistakes can tell me how I should have secured this properly. One thing you must understand is that when you're running a startup on a lean budget with a small team, you often make trade-offs between developing new features and ensuring security. We made the very bad choice to focus on new features instead of security.
00:05:40.740 If there's one takeaway I can give you, it’s this: if you have anything connected to the internet, make sure it's secure first. Don't assume you're not a target simply because you think you’re not popular yet.
00:06:05.100 Just to illustrate how unprepared we were: a script that was supposed to run in a dedicated user instance accidentally ran in the shared environment, which turned out to be insecure and granted full root access. The problem was compounded because the shared Apache server had no proper access control or containment for individual applications. We had also shared SSH private keys across our systems.
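Shared SSH keys are one of the mistakes named here that can be checked for mechanically. The following is a minimal sketch, assuming you can collect the public half of the key each host uses into a dictionary; the `identities` mapping and the host names are hypothetical and not from the talk. It computes the OpenSSH-style SHA256 fingerprint of each key and reports any fingerprint that appears on more than one host.

```python
import base64
import hashlib
from collections import defaultdict


def fingerprint(public_key_line: str) -> str:
    """Return the SHA256 fingerprint for a line like 'ssh-ed25519 AAAA... comment'."""
    # The second field of a public key line is the base64-encoded key blob.
    key_blob = base64.b64decode(public_key_line.split()[1])
    digest = hashlib.sha256(key_blob).digest()
    # OpenSSH prints the digest as unpadded base64 with a 'SHA256:' prefix.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def shared_keys(identities: dict[str, str]) -> dict[str, list[str]]:
    """Group hosts by key fingerprint, keeping only keys used by several hosts."""
    hosts_by_fingerprint = defaultdict(list)
    for host, public_key_line in identities.items():
        hosts_by_fingerprint[fingerprint(public_key_line)].append(host)
    return {fp: sorted(hosts) for fp, hosts in hosts_by_fingerprint.items() if len(hosts) > 1}
```

Any entry in the result means that compromising one of the listed hosts yields a key that works from the others as well.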
00:06:37.380 If you can think of any stupid thing we could have done to facilitate the hackers’ work, we probably did it. They gained access to the shared hosting box and quickly infiltrated other systems, including the load-balancing machine and our database servers, leading to a massive breach.
00:07:37.260 Among the credentials compromised was our Twitter password, which unfortunately was also the password for other accounts such as DNS and Google Apps. Because of this, the attackers gained control over multiple services, including our blog.
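The password reuse described here can be avoided by generating a separate random password for every service. This is a small sketch using Python's standard `secrets` module; the service names, the alphabet, and the default length are illustrative assumptions rather than details from the talk.

```python
import secrets
import string

# Assumed character set; some services restrict which characters are allowed.
ALPHABET = string.ascii_letters + string.digits


def generate_password(length: int = 24) -> str:
    """Return a random password drawn from a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def rotate_credentials(services: list[str]) -> dict[str, str]:
    """Create an independent password for each service so none is reused."""
    return {service: generate_password() for service in services}
```

For example, `rotate_credentials(["twitter", "dns", "google-apps", "blog"])` returns four unrelated passwords, so a leak of one no longer exposes the others. In practice the results would go straight into a password manager rather than being printed or logged.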
00:08:19.620 Once we managed to regain control of our servers, we shut down over a thousand of our customers’ application servers. While we didn’t have direct evidence that they were compromised, we didn’t want to take any chances, especially with other people’s private data.
00:09:04.740 After shutting everything down, we began figuring out how to communicate the situation to our customers. The most painful part of a security breach is informing them that their data might be insecure. This moment is difficult, but the bad thing has already happened, and all you can control now is your response.
00:09:50.940 Honestly disclose any security vulnerabilities to your customers, along with the steps they can take, such as changing their passwords, removing sensitive data, and not reusing passwords across different services. In our case, the exposure of customer data was limited because we don’t store credit card information and we hash our passwords.
00:10:35.120 However, it’s crucial to understand that even with hashing and salting, if your database is compromised, you must assume all passwords might be compromised. You may have to send out reset emails to all users to ensure their accounts are secure, especially if they reused passwords across multiple sites.
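To make the point about hashing, salting, and forced resets concrete, here is a minimal sketch using only the Python standard library. It is not the scheme the speaker's company used, which the talk does not describe; the PBKDF2 parameters, the `users` record layout, and the function names are assumptions for illustration.

```python
import hashlib
import hmac
import secrets

# Assumed work factor; choose a value suited to your hardware and current guidance.
ITERATIONS = 200_000


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) for a password using PBKDF2-HMAC-SHA256."""
    salt = secrets.token_bytes(16)  # a fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest and compare it in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)


def force_reset(users: dict[str, dict]) -> dict[str, str]:
    """Invalidate every stored password and return a one-time reset token per user."""
    tokens = {}
    for email, record in users.items():
        record["salt"] = None
        record["digest"] = None  # the old password can no longer be used to log in
        token = secrets.token_urlsafe(32)
        # Store only a hash of the token, so a later database leak does not expose it.
        record["reset_token_hash"] = hashlib.sha256(token.encode("ascii")).hexdigest()
        tokens[email] = token
    return tokens
```

The tokens returned by `force_reset` would be sent in the reset emails. Salting and a slow hash raise the cost of cracking a stolen database, but they do not make it impossible, which is why the reset is still needed.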
00:11:40.440 After regaining control and notifying users, the next step is to start recovery. The worst thing you can do is power on systems that have been hacked without understanding the nature of the hack. If you don't know why you were hacked, you can't simply clean the system and start over. The only way to recover properly is to rebuild your servers from scratch.
00:12:36.420 Rebuild from scratch. If you're in the cloud, simply launch new servers. If you're using physical hardware, reinstall the operating system. Trying to salvage anything from the compromised servers will only lead to disaster.
00:13:30.120 Use configuration management to rebuild your systems. After identifying the attack vectors and understanding how the breach occurred, we used this approach to recreate our infrastructure so that it was secure from the very beginning. We made sure we didn’t repeat the same mistakes.
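The talk does not name a specific configuration management tool, so the following is only a toy sketch of the idea such tools share: declare the desired state, compute the steps needed to reach it, and apply them so that running the process again changes nothing. The setting names in the usage note are hypothetical.

```python
def plan(desired: dict[str, str], actual: dict[str, str]) -> list[tuple[str, str, str]]:
    """Return (action, key, value) steps that move the actual state to the desired one."""
    steps = []
    for key, value in sorted(desired.items()):
        if actual.get(key) != value:
            steps.append(("set", key, value))
    for key in sorted(actual):
        if key not in desired:
            # Anything not declared is removed, so leftovers do not survive a rebuild.
            steps.append(("remove", key, actual[key]))
    return steps


def apply(steps: list[tuple[str, str, str]], actual: dict[str, str]) -> dict[str, str]:
    """Return a new state with the planned steps applied; the input is not mutated."""
    result = dict(actual)
    for action, key, value in steps:
        if action == "set":
            result[key] = value
        else:
            result.pop(key, None)
    return result
```

With a desired state such as `{"sshd.PermitRootLogin": "no", "sshd.PasswordAuthentication": "no"}`, a freshly launched server converges to exactly that state, and a second call to `plan` on the result returns an empty list. Because the desired state lives in version control, security fixes made after the breach are carried into every future rebuild.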
00:15:17.880 It's not that we didn’t know anything about security; it's that we set poor priorities. Now, we have the chance to reinforce our systems' security, employing best practices that should have been in place long ago. Learn from others who are knowledgeable about security, and don't be afraid to consult them for advice.
00:16:00.140 Once you've identified and fixed as many issues as you can, conduct another system audit before reconnecting your services. Bring your users’ systems back online first before worrying about your own infrastructure.
00:16:48.020 Finally, regain trust by openly admitting faults and detailing the exact nature of the breach. Transparency is crucial for rebuilding confidence with your users.
00:17:40.220 Security is a process, not a product. It requires constant improvement. Be aware that most vulnerabilities are really a series of smaller issues that together lead to larger exploits.
00:18:07.140 Always assume that any part of your system can be compromised. Don’t rely on certifications that suggest your system is secure. Security is about taking every precaution, and small oversights can lead to major issues.
00:19:33.340 I hope that what I’ve shared today helps prevent personal catastrophes. If anyone here learns from this experience and avoids similar situations, I consider that my job well done. I’m happy to take questions now.
00:20:50.100 Thank you all for your attention.