Euruko 2019

Surrounded by Microservices

by Damir Svrtan

In this presentation at Euruko 2019, Damir Svrtan, a Senior Software Developer at Netflix, discusses the architectural challenges and solutions within the Netflix Studio, particularly focusing on the use of microservices in handling distributed data.

Svrtan opens with a brief history of Netflix, from its origins as a DVD rental service to becoming a leading player in the streaming and original content production industry. He outlines key phases of Netflix's evolution:

  • DVD Rental Service: Initially allowed customers to watch movies without late fees.
  • Transition to Streaming: This risky move in 2007 changed content consumption dynamics.
  • Original Content Production: This shift began in 2013 with 'House of Cards' and has expanded significantly.

The traditional movie-making industry relies heavily on manual processes and physical paperwork, which presents challenges for a modern production environment that Netflix has recognized and is addressing through technological solutions.

Key points highlighted in the presentation include:

  • Microservice Architecture: Netflix's early approach involved a monolithic Ruby on Rails application that gradually evolved into specialized microservices as understanding of the film industry improved.
  • Hexagonal Architecture: Svrtan advocates for this architecture model to handle inputs and outputs efficiently, facilitating easier swapping of data sources without altering core logic.
  • Decoupled Data Sources: Svrtan emphasizes defining repositories for entity management while keeping data source implementations interchangeable, whether they are SQL databases or REST APIs.
  • Error Management: In a microservices environment, effective error handling is essential to ensure a resilient application, where single failures do not compromise overall functionality.
  • Scalability Balance: Svrtan also discusses the trade-off between consistency and availability, particularly relevant to Netflix's operational context.

In conclusion, he stresses the importance of delaying decisions until more information is available, which leads to better architectural choices. He encourages ongoing discussion and collaboration around these topics during the event. Overall, the talk provides valuable insights into how Netflix is leveraging microservices and hexagonal architecture to support a dynamic and expansive production landscape.

00:00:05.680 Let's bring out our next speaker, Damir Svrtan, who is joining us from San Francisco. He works at Netflix as a Senior Software Developer and has a passion for punk music, so I’m hoping he’ll rock the boat today. We saw a tweet from Damir that mentioned he found a box of programming books on the street in San Francisco, which is something that definitely happens there. Without further ado, take it away, Damir. Thank you.
00:00:58.640 Great! Hi Euruko, I’m glad to see all of you here and I hope you’re having a good time. As mentioned, my name is Damir Svrtan, and I work in the Studio Engineering organization at Netflix. Netflix is often referred to as a pioneer of the microservice movement, and I want to share my practical experiences of being surrounded by microservices and some of the patterns our team uses to tackle distributed data. Many of us in Europe know Netflix primarily as a streaming service, but it actually existed for ten years before moving into streaming, starting as a business that shipped DVDs in red envelopes. Customers could open up the envelope, watch the movies as many times as they wanted, and return them at their convenience with no late fees. Interestingly, this service is still used by over two and a half million people in the United States.
00:02:24.000 The DVD rental business was the first phase of Netflix. The second phase was the streaming service, which changed how people watch content. It was a risky move back in 2007 when streaming technologies were not particularly reliable, but Netflix eventually provided a viable way for people to watch great content without the disruption of endless commercials. Cable TV in the U.S. often features more than 15 minutes of commercials within an hour of programming, which is nearly unbearable to watch.
00:03:35.840 In 2013, another big shift occurred at Netflix: the production of original content. It began with 'House of Cards' and continued with shows like 'Orange is the New Black' and 'Stranger Things.' Now, Netflix produces hundreds of original shows and movies each year, outpacing most major film studios combined. This shift transformed Netflix from a technology company focused solely on automation and efficiency into a significant player in the entertainment industry.
00:04:17.040 The movie-making business is a century-old industry that, unfortunately, has not changed much since its inception. Most of the work is still done on paper and requires significant manual labor. For instance, many movie productions still rely on fax machines to distribute large amounts of paperwork. It’s not uncommon for a single movie production to print out more than 50,000 pages within the first week of shooting. Given the scale at which Netflix produces content, these traditional methods do not work well.
00:05:02.080 The scope of movie productions and film studios involves many domains, from deciding which scripts to produce, acquiring content, and discovering talent, to managing contracts, paying actors, and handling marketing. It's an endless cycle. So, how does Netflix tackle these problems? By leveraging technology and creating applications that address key parts of each production. Netflix is in the process of developing its first fully digital studio to help automate the boilerplate and tedious tasks that creatives typically don’t want to deal with. We are talking about a suite of more than 50 applications covering the aforementioned areas.
00:05:46.960 Initially, Netflix tackled the studio space with a monolithic Ruby on Rails application, which allowed rapid development and quick changes despite the lack of knowledge about the film space. Gradually, as our team learned more, we began decomposing the Rails monolith and migrating data into specialized apps or microservices. This decision was not primarily driven by performance issues, but rather by the desire to set clear boundaries around various areas and domains.
00:06:46.320 When I joined Netflix about a year ago, our team began working on a new application that spans multiple domains of the business. We chose to build this app using Ruby on Rails due to our expertise with the framework. It supports agile development, enhances team velocity, minimizes time to market, and offers an abundance of gems, allowing us to leverage pre-built solutions rather than reinventing the wheel.
00:07:25.440 One of the key elements in Rails is its pragmatic use of the Active Record pattern. This pattern combines several responsibilities in one class: domain objects, business rules, validations, and persistence. While this is excellent for rapid development, it can leave models tightly coupled to the database, making decoupling challenging. The reality in our studio world is that we work with a lot of distributed data, especially when projects span multiple domains and require various data sources.
00:08:24.960 For example, user data and permissions might reside behind a REST API, while movie information may be behind a gRPC endpoint. Many applications consume data from these services, necessitating adjustments to the persistence layer in Rails, especially considering that the data in our database today might reside in a service tomorrow. Consequently, we need an architecture that clearly separates business logic from implementation details and protocols.
00:09:05.760 This realization brings us to the hexagonal architecture, which is ideal for addressing the problems we face. Its core premise is to position inputs and outputs at the edges of our design. Inputs could be requests hitting the app server or invocations from the Rails console, while outputs represent data flowing into the persistence layer. This could involve interactions with SQL databases, NoSQL databases, or messaging systems.
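The idea of keeping inputs and outputs at the edges can be illustrated with a minimal Ruby sketch. Everything named here (RenameShootingLocation, MemoryStore, handle_request) is a hypothetical example, not code from the talk. The core use case only knows about an injected store object, while one input adapter translates web-style string params and the other is a direct call, as from a console.

```ruby
# Core use case: knows nothing about HTTP, consoles, or databases.
class RenameShootingLocation
  # store is the output port: any object responding to #save(id, name).
  def initialize(store:)
    @store = store
  end

  def call(id:, name:)
    clean = name.strip
    raise ArgumentError, "name must not be empty" if clean.empty?

    @store.save(id, clean)
    clean
  end
end

# Output adapter (hypothetical): an in-memory hash standing in for a database.
class MemoryStore
  attr_reader :saved

  def initialize
    @saved = {}
  end

  def save(id, name)
    @saved[id] = name
  end
end

# Input adapter 1 (hypothetical): converts string params, as a web request
# would deliver them, into the arguments the core expects.
def handle_request(params, use_case)
  use_case.call(id: Integer(params["id"]), name: params["name"])
end

store = MemoryStore.new
use_case = RenameShootingLocation.new(store: store)

handle_request({ "id" => "7", "name" => " Split " }, use_case)
# Input adapter 2: a console session would simply call the use case directly.
use_case.call(id: 8, name: "Zagreb")

p store.saved
```

Replacing MemoryStore with another object that responds to save would not require touching RenameShootingLocation, which is the property the architecture is after.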
00:10:26.080 By employing this architecture, we isolate core logic from external concerns. Inputs and outputs can have their adapters swapped without altering the core code, providing flexibility. Let's explore some fundamental concepts inherent in this architecture, which may not be groundbreaking but are crucial. At the center of our domain are our entities. In the Netflix studio context, an entity could be a movie, a production, a shooting location, or an employee. Entities themselves remain agnostic to their persistence, diverging from the Active Record convention.
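A persistence-agnostic entity could look roughly like the following sketch. The ShootingLocation name and its attributes are assumptions made for illustration; the point is only that the object carries data and behavior but contains no database or network code.

```ruby
# Hypothetical entity: a plain Ruby object with no persistence concerns.
ShootingLocation = Struct.new(:id, :name, :country, keyword_init: true) do
  # Behavior lives on the entity itself, not on a database-backed model.
  def label
    country ? "#{name}, #{country}" : name
  end
end

location = ShootingLocation.new(id: 1, name: "Zagreb", country: "Croatia")
puts location.label
```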
00:11:07.520 We define repositories as interfaces for retrieving and manipulating entities. For instance, a production repository is designed to work with a production entity class and typically interacts with a movie production class, which could be an Active Record model. These repositories expose methods that communicate with the data sources, enabling us to decouple our core code from persistence concerns.
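As a rough sketch of the repository idea, assuming hypothetical names throughout (Production, ProductionRepository, InMemoryProductionDataSource), the repository below receives a data source, asks it for raw rows, and hands back entities. The in-memory data source stands in for whatever would sit there in a real application, such as an Active Record model.

```ruby
# Hypothetical entity returned by the repository.
Production = Struct.new(:id, :title, keyword_init: true)

# Hypothetical data source: an in-memory stand-in for a real storage adapter.
class InMemoryProductionDataSource
  def initialize(rows)
    @rows = rows
  end

  def find(id)
    @rows.find { |row| row[:id] == id }
  end

  def all
    @rows
  end
end

# The repository is the only place that knows how rows become entities.
class ProductionRepository
  def initialize(data_source:)
    @data_source = data_source
  end

  def find(id)
    row = @data_source.find(id)
    row && to_entity(row)
  end

  def all
    @data_source.all.map { |row| to_entity(row) }
  end

  private

  def to_entity(row)
    Production.new(id: row[:id], title: row[:title])
  end
end

rows = [{ id: 1, title: "Example Show" }, { id: 2, title: "Another Show" }]
repository = ProductionRepository.new(
  data_source: InMemoryProductionDataSource.new(rows)
)
puts repository.find(1).title
```

Callers depend on ProductionRepository and Production only, so the data source passed to the constructor can change without affecting them.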
00:12:09.920 Data sources act as adapters to various storage implementations, whether that’s a SQL database, a REST API, or even simpler formats like JSON or CSV files. We aim to maintain a consistent interface across different data sources while allowing for easy swaps when necessary. In the case of directly interfacing with SQL databases, the data source might simply be an Active Record model.
00:13:20.080 When working with a REST API, we would create classes that fetch data using a REST client, ensuring that both sources share the same interface. This independence from data storage ensures that changes to a data source do not disrupt our business logic. Moreover, implementing error handling for network communication is critical in a microservices environment, where a single failure should not compromise the entire application.
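The shared-interface idea might be sketched as follows. Both class names are hypothetical, the "SQL" source is an in-memory stand-in for an Active Record model, and the REST source takes an injected http_get callable (an assumption, so that the example runs without a network) where a real application would wrap an HTTP client.

```ruby
require "json"

# Stand-in for a database-backed data source (e.g. an Active Record model).
class SqlMovieDataSource
  def initialize(rows)
    @rows = rows
  end

  def find(id)
    @rows.find { |row| row[:id] == id }
  end
end

# REST-backed data source exposing the same #find interface.
# http_get is a hypothetical callable: path in, JSON body (or nil) out.
class RestMovieDataSource
  def initialize(http_get:)
    @http_get = http_get
  end

  def find(id)
    body = @http_get.call("/movies/#{id}")
    return nil if body.nil?

    JSON.parse(body, symbolize_names: true)
  end
end

# Fake API responses keyed by path, used instead of real network calls.
fake_api = { "/movies/1" => '{"id": 1, "title": "Example Movie"}' }

sources = [
  SqlMovieDataSource.new([{ id: 1, title: "Example Movie" }]),
  RestMovieDataSource.new(http_get: ->(path) { fake_api[path] })
]

# Calling code treats both sources identically.
sources.each { |source| puts source.find(1)[:title] }
```

Because both classes answer find with the same hash shape, code built on top of them does not need to know which one it was given.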
00:14:43.480 Caution is vital when dealing with errors; they need to be manageable and actionable. Monitoring and measuring system performance help identify pain points, ensuring a smooth functioning system. Without metrics, discovering issues becomes challenging. Our error reporting system helps maintain a healthy application without overwhelming us with alerts that lead to alarm fatigue.
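One possible way to keep a single failing dependency from taking down the whole application is to translate low-level network errors into one domain-level error at the data source boundary and let callers degrade gracefully. This is a sketch under assumptions, not the approach described in the talk: DataSourceUnavailable, PermissionsDataSource, and visible_sections are hypothetical names, and the injected fetch callable stands in for a real network client.

```ruby
require "timeout"

# Hypothetical domain-level error that hides transport details from callers.
class DataSourceUnavailable < StandardError; end

class PermissionsDataSource
  # fetch is a hypothetical callable wrapping the actual network call.
  def initialize(fetch:)
    @fetch = fetch
  end

  def permissions_for(user_id)
    @fetch.call(user_id)
  rescue Timeout::Error, IOError, SystemCallError => e
    # Re-raise as one actionable error type instead of leaking socket errors.
    raise DataSourceUnavailable, "permissions service failed: #{e.class}"
  end
end

# Caller degrades to a safe default when the dependency is down.
def visible_sections(user_id, data_source)
  [:overview] + data_source.permissions_for(user_id)
rescue DataSourceUnavailable => e
  warn "degraded mode: #{e.message}"
  [:overview]
end

healthy = PermissionsDataSource.new(fetch: ->(_id) { [:budget] })
broken = PermissionsDataSource.new(fetch: ->(_id) { raise Errno::ECONNREFUSED })

p visible_sections(1, healthy)
p visible_sections(1, broken)
```

The warn call marks the place where a real system would report to its error tracking or metrics, which is what makes the failure visible without crashing the request.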
00:15:52.880 Additionally, when we pursue scalability, maintaining a balance between consistency and availability is key. For example, in high-traffic scenarios like Amazon's shopping cart, availability is critical, but for Netflix studios, data consistency is paramount. We operate with a manageable user base and need accurate data for decision-making.
00:16:58.520 In conclusion, hexagonal architecture has proven to be a valuable tool for us, because it forces us to think about how our layers are coupled. If there’s one takeaway from my presentation, it's the importance of delaying decisions. Decisions made early in a project, when knowledge is still limited, can lead us astray; waiting until more information is available leads to better choices in architecture and implementation.
00:18:35.920 Thank you all for listening! I hope you gained a foundational understanding of the topic; this overview only scratches the surface. If you'd like to discuss it further, feel free to approach me in the hallway or at the GitHub booth anytime.
00:19:18.480 Thank you, thank you! We have a present for you as well. Thank you very much!