RailsConf 2018

Taking the Pain Out of Support Engineering

by Cecy Correa

The video titled "Taking the Pain Out of Support Engineering" presented by Cecy Correa at RailsConf 2018 addresses the challenges and best practices in support engineering, drawing on Ms. Correa's extensive experience in the field. The talk emphasizes the importance of effective support engineering in creating positive experiences for customers, particularly when dealing with technical support for software products such as APIs.

Key Points Discussed:

  • Definition of Support Engineering: Support engineering involves providing technical assistance to developers or end-users, emphasizing the role of support engineers in troubleshooting and communication.
  • Critical Thinking: The talk encourages teams to think critically about potential problems early in the product development process. An example cited is Correa’s experience with Electronic Arts, where early access to a game allowed the support team to predict user questions and create helpful documentation.
  • Prioritizing Relationships: Emphasizing the importance of maintaining a relationship between support teams and development teams, Correa shares a case involving a confusing DLC redemption process in Mass Effect 2, demonstrating how poor communication led to overwhelming support requests.
  • Support Structures: Correa outlines two models of support: dedicated support, where specific teams own tickets from start to finish, and rotating support, which may lead to inefficiencies. She argues that dedicated support fosters continuity and improves customer relationships.
  • Ownership and Escalation: The importance of proper ticket ownership is highlighted, comparing removed and owned escalations. Correa advocates for a system where support engineers both own the issue and communicate directly with clients to improve accountability and resolve issues faster.
  • Setting Boundaries: She discusses the importance of establishing clear boundaries with customers to avoid burnout, illustrating this with an anecdote where she had to create space between responses to prevent a developer from relying too heavily on her for coding help.
  • Reinforcing Good Behavior: An essential aspect of managing support involves setting expectations for customer communication to maintain morale among support staff.
  • Tools and Documentation: Correa emphasizes the significance of documentation and tools such as dashboards, log aggregators, and support playbooks, which provide transparency, reduce training time, and improve support efficiency.

Conclusion:

Cecy Correa concludes by asserting that good support engineers are valuable as they can identify patterns and inefficiencies within the product. Investing in support engineering contributes to a healthier engineering culture overall, ultimately benefiting customer service and product development. The discussion encourages companies to recognize and prioritize the role of support engineers to foster both client satisfaction and employee well-being.

00:00:11.179 And welcome to "Taking the Pain Out of Support Engineering." My name is Cecy Correa.
00:00:17.880 I'm a software engineer over at Context.IO. When I talk about support engineering, what I mostly do is handle support for a publicly available API.
00:00:24.390 Typically, my support users are other developers, so that's mostly how I'm framing this talk. I hope a lot of the topics we discuss today will also be applicable to other types of support teams.
00:00:38.700 Before we dive deeper into support, I want to start with a little story. Back when I was in college, I got a job working at an amusement park in Houston, Texas, called Six Flags AstroWorld.
00:00:53.969 If you've ever worked at an amusement park, you know it involves a crazy customer service-oriented type of work culture.
00:01:06.390 One of the biggest takeaways I had from that job, and honestly, it's been a while since I worked there, but I remember one key lesson they ingrained in everyone: if you don’t know the answer to something and you can't help a customer, don’t just walk away saying, "I don’t know." Instead, say, "I don’t know, but I will find out for you." This mindset has stuck with me throughout every job I've ever had.
00:01:18.689 I believe that this mindset of not knowing something but being willing to seek out the answer is at the heart of what makes a good support engineer or a good support engineering team.
00:01:32.840 So, what is a support engineer? I’ve been talking to various people throughout the conference. When they ask me about my work in support engineering, I've found that a lot of people have different definitions.
00:01:46.470 For me, a support engineer is a developer or a technical person who provides technical support to other developers or end-users, whether internally or externally. The specific type of support really depends on the size of your company and how support is currently handled. Again, for me, support engineering primarily means supporting developers who are integrating with our public API.
00:02:06.719 Why am I passionate about support engineering? Honestly, support has a bit of a negative reputation in tech, but I’m really proud to work in support, and I genuinely enjoy it.
00:02:20.670 This appreciation likely stems from the fact that I’ve had two significant roles in my career where my job was to support others. It might sound odd to say I had two first jobs, but my first job out of college was in the customer support division at Electronic Arts, mainly writing content for their website.
00:02:38.880 Years later, I shifted to programming, and my first official programming job, where I had "engineer" in my title, was as a support engineer for Context.IO. In both instances, support was my entryway into the industry, which is why I have a fondness for support.
00:02:54.330 Now, let's discuss some support engineering best practices. Today, we will learn how to think critically about problems, prioritize our relationship with the support team, ensure business continuity regarding tickets, discuss ownership of tickets, explore boundaries—something that isn’t often discussed but is crucial for maintaining a happy team—and finally, cover some indispensable tools for my support life.
00:03:17.880 Part one: thinking critically. I want to share an experience from my time at Electronic Arts where I was assigned to work on The Sims 3. This was quite some time ago, so allow me to give you a brief overview.
00:03:37.170 As the subject matter expert for The Sims 3, I received early access to the game, which allowed me to play it for a couple of weeks at work.
00:03:48.120 My goal was to identify elements that might confuse end-users and preemptively create content to help them navigate those issues. One of the things the studio did well was provide the support team with early access to the game.
00:04:06.000 This opportunity doesn’t always happen; often, support teams would receive either no build at all or a mere list of resources to consult. However, in this instance, the studio generously gave us a build of the game prior to its launch, which was invaluable.
00:04:24.840 They allotted us ample testing time before the launch so we could proactively create FAQs. I recently even found a FAQ I wrote about eight years ago during the launch, and it’s still available online. This serves as a testament to what can happen when you equip your team with the proper time and tools to develop quality content.
00:04:44.700 This taught me that if you suspect something could be an FAQ, it likely should be. Providing your support team sufficient time to think critically about what they are supporting is pivotal for every product launch.
00:05:01.830 Now, let’s contrast that with a less successful launch, Mass Effect 2. Despite being a great game, numerous customers faced issues redeeming their DLC codes when they pre-ordered the game. This generated a wave of calls and emails to our support center from gamers feeling confused about the process.
00:05:19.980 The root of the problem was a lack of early access for the support team to the game. Had we been given the opportunity to test it, we might have noticed the confusing flow early on and preemptively created content to assist users.
00:05:37.200 Eventually, we did create some helpful content to guide users on how to redeem their DLC, but it would have been far more effective to identify this issue before the game launched.
00:06:02.250 After acknowledging the volume of support requests, the studio did come to a realization and began to prioritize building a stronger relationship with the support team moving forward.
00:06:15.090 To their credit, I believe they improved their processes for future launches. Let’s now move to part two: continuity.
00:06:31.440 My experiences at gaming companies made me realize how continuity is crucial in support, particularly in the tech industry, especially when dealing with a public API.
00:06:45.790 Generally, I see two paradigms in support: dedicated and rotating support systems. In a dedicated support model, one person or a group of people is solely responsible for handling support inquiries.
00:07:08.250 On the other hand, rotating support means one person from the team rotates in and out of support responsibilities for a set period, usually weekly or bi-weekly. This is common in smaller teams where there's not enough volume to justify full-time support.
00:07:25.990 Let’s discuss why dedicated support typically works better. It ensures business continuity for support tickets. One person is responsible for taking the ticket from initial contact to resolution, providing a direct line of communication for customers.
00:07:41.010 This approach helps build a history of support and allows for identifying efficiencies over time. If someone is doing support roles continuously, they will find patterns that allow them to automate repetitive tasks, leading to greater efficiency.
00:08:06.390 Additionally, dedicated support helps establish relationships with customers. In my role, I support a product that is an API, which means my relationship with customers often spans years.
00:08:20.190 It’s critical that they feel comfortable reaching out to us; when they encounter bugs or issues, we need to hear about them so we can address them.
00:08:37.440 Dedicated support typically becomes ineffective when there’s no pathway to promote support engineers out of their roles. Many support positions are filled by junior developers, which can be a great way to level up skills.
00:08:53.320 However, without a clear exit strategy for junior developers, they can become stuck in support roles, losing out on valuable relationship-building and knowledge gained while working in support.
00:09:08.300 Now, rotating support has its challenges too. Often, when someone does rotating support, they handle inquiries for a week and then move on, leaving them disconnected from any ongoing issues.
00:09:23.190 This can lead to decreased continuity, as issues and bug fixes might slip through the cracks, causing confusion during handoffs.
00:09:39.740 In some cases, when team members rotate quickly, they can forget processes, leading to inefficiencies and requiring brief ramp-up times.
00:09:53.690 However, if your team does follow a rotating model, it’s vital to maintain clear processes for handing off tickets and responsibility. For any long-term issues, create support tickets in your ticketing system and make sure they’re placed in the active sprint.
00:10:18.900 In my experience, I’ve noticed if a ticket isn’t addressed actively, it might linger for too long, and you could get graded negatively on the response time.
00:10:34.360 It’s essential to clarify that once an issue comes in during your rotation, you still own that issue until it’s resolved. Clear ownership and accountability keep issues from being lost.
00:10:51.730 Let’s tackle ownership and escalations. I’d like to differentiate between two types of ticket escalations I like to call 'removed' versus 'owned.'
00:11:08.240 A removed escalation generally involves passing an issue to another team for resolution, where the support engineer does not directly communicate with the end-user.
00:11:25.789 This removed layer can dilute the sense of urgency for the developer fixing the issue, leading to misprioritization, as they tend to prioritize new feature work over fixing support issues.
00:11:43.259 This pattern can result in longer resolution times since the person temporarily fixing an issue often doesn’t feel pressure from the end-user.
00:12:01.040 On the other hand, an owned escalation means the developer who resolves the issue also communicates with the end-user. This level of ownership fosters a sense of accountability and helps keep urgency intact.
00:12:18.790 In an owned escalation, a support engineer can directly escalate to another team while also monitoring how quickly that team addresses the issue.
00:12:33.610 For example, if I escalate a ticket to John, I can let him know, 'Hey, here’s an issue that you’re responsible for. If you need more information, feel free to ask the end-user directly.' This reduces the number of barriers and keeps the communication channel open.
00:12:46.799 Now, let’s turn to the crucial topic of boundaries. Boundaries are essential in support roles to avoid burnout. If communication channels lack clarity, the support engineer can easily feel overwhelmed.
00:13:01.350 In my experience with smaller teams, we have set support hours and clearly communicated times when we are available for responses. We also implemented an automated response for after-hours inquiries, informing users of our operating hours.
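The after-hours auto-response described above can be sketched in a few lines. This is only an illustration: the support window (Monday to Friday, 09:00 to 17:00), the reply text, and the function names are all assumptions, since the talk does not say what hours or tooling the team used.

```python
from datetime import datetime, time
from typing import Optional

# Hypothetical support window (assumption): Monday-Friday, 09:00-17:00 local time.
SUPPORT_START = time(9, 0)
SUPPORT_END = time(17, 0)
SUPPORT_WEEKDAYS = {0, 1, 2, 3, 4}  # Monday=0 ... Friday=4

# Hypothetical reply text (assumption).
AUTO_REPLY = (
    "Thanks for contacting support. Our hours are Monday to Friday, "
    "09:00 to 17:00. We will reply when we are back online."
)


def is_within_support_hours(received_at: datetime) -> bool:
    """Return True if the message arrived inside the support window."""
    return (
        received_at.weekday() in SUPPORT_WEEKDAYS
        and SUPPORT_START <= received_at.time() < SUPPORT_END
    )


def after_hours_reply(received_at: datetime) -> Optional[str]:
    """Return the auto-reply text for after-hours messages, else None."""
    if is_within_support_hours(received_at):
        return None
    return AUTO_REPLY
```

A ticketing system's inbound hook could call `after_hours_reply` with the message timestamp and send the returned text only when it is not `None`.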
00:13:17.670 Additionally, allowing reasonable time between responses helps avoid the pressure of continuous support. I faced a situation with a developer who kept reaching out over trivial Ruby programming errors. I realized that if I continued answering immediately, we risked making ourselves overly accessible.
00:13:35.580 Setting boundaries about how much support to provide allows users to troubleshoot issues independently. I learned to timebox my responses, giving developers a chance to explore the documentation before coming back.
00:13:52.220 This buffer time led to impressive outcomes; I often received replies stating, 'I read your documentation, and it solved my issue.' That affirmation is rewarding.
00:14:06.390 To reinforce healthy behaviors, I encourage cooperation over hostility. Recently I received a frustrated email from a developer regarding our webhooks.
00:14:22.190 His tone was aggressive, and since our support tickets reach the entire engineering team, it was disheartening to see such negativity.
00:14:33.900 In my response, I asserted that we have zero tolerance for abusive language, emphasizing the need for mutual respect.
00:14:41.600 As a result, his subsequent emails were polite and professional. Establishing mutual respect like this is vital for team morale.
00:14:50.920 I recognize that my support engineers need backing so they can do their jobs effectively. Their happiness truly matters, and as management, it’s crucial to support them in addressing unacceptable behavior.
00:15:04.840 Lastly, let’s delve into tools for the job, specifically around documentation. We usually talk about code documentation, but it’s critical to also document troubleshooting processes.
00:15:20.520 I implemented a 'Support Playbook' for my team to be a living document containing common task instructions and solutions.
00:15:35.990 This approach made onboarding new support members quicker as they could reference the playbook rather than requiring constant pairing.
00:15:50.410 Empowering the support team with knowledge reduces training time and identifies potential areas for automation. If you notice team members frequently consulting the same page in the playbook, it may be ripe for improvement.
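One way to spot playbook pages that are consulted often, and so may be worth automating, is to count page views. The sketch below assumes the playbook tool can export a list of viewed page names; the function name, the page names in the usage note, and the threshold are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def automation_candidates(
    page_views: Iterable[str], min_views: int
) -> List[Tuple[str, int]]:
    """Return (page, views) pairs viewed at least min_views times, most viewed first."""
    counts = Counter(page_views)
    return [(page, n) for page, n in counts.most_common() if n >= min_views]
```

For example, given views of `"reset-api-key"` three times, `"rotate-webhook"` twice, and `"check-quota"` once, calling `automation_candidates(views, 2)` returns the first two pages with their counts, which are the ones to review first.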
00:16:05.000 Additionally, equipping the team with necessary data is invaluable. I rely heavily on dashboards to provide real-time information, helping me respond quickly to developer inquiries regarding the status of our API.
00:16:21.490 Tools like Datadog and Grafana are instrumental for monitoring status and logs. I can promptly reassure users that our API is operational or provide context about any issues.
00:16:39.380 I can't overstate my reliance on log aggregation tools, such as Scalyr, which not only allow me to check logs but also set alerts.
00:16:54.740 These alerts integrate with our Slack and incident response systems, which is crucial for monitoring and maintaining the integrity of our product.
00:17:09.110 We have also utilized Runscope for automated API testing to monitor our product health closely.
00:17:23.830 With this tool, tests run every 5 to 10 minutes, allowing us to monitor response times and catch issues quickly, which enhances our development processes.
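The kind of scheduled check described above, which calls an endpoint and judges it on status code and response time, can be approximated with the standard library. This is a generic sketch and not Runscope's API: `run_check`, `CheckResult`, `fetch_status_http`, and the 2-second threshold are assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable


@dataclass
class CheckResult:
    url: str
    status: int
    elapsed_seconds: float
    healthy: bool


def fetch_status_http(url: str) -> int:
    """Fetch the URL and return its HTTP status code (network errors propagate)."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; the status code is still useful here.
        return err.code


def run_check(
    url: str,
    fetch_status: Callable[[str], int] = fetch_status_http,
    max_seconds: float = 2.0,  # hypothetical response-time budget
) -> CheckResult:
    """Call the endpoint once and judge it on status code and response time."""
    started = time.monotonic()
    status = fetch_status(url)
    elapsed = time.monotonic() - started
    healthy = 200 <= status < 300 and elapsed <= max_seconds
    return CheckResult(url, status, elapsed, healthy)
```

A scheduler such as cron could invoke `run_check` every 5 to 10 minutes and raise an alert whenever `healthy` is false. Passing `fetch_status` as a parameter keeps the check easy to exercise without a live endpoint.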
00:17:38.790 Lastly, I want to stress that good support engineers are key to driving innovation, as they’re skilled in troubleshooting, communication, and pattern recognition.
00:17:54.680 I genuinely believe support engineering is vital for developing junior developers, particularly when you allow them growth opportunities beyond just support.
00:18:10.330 A robust culture of support engineering equates to cultivating a great engineering culture overall. If you have any questions about support engineering, I'm happy to discuss further after this, maybe in the hall.
00:18:26.250 My name is Cecy Correa, feel free to contact me on Twitter @CecyCorrea. I’m always open to talking about support.