Scaling Ruby Applications: Challenges, Solutions and Best Practices

Summarized using AI

Dmitry Sadovnikov • April 11, 2024 • Sydney, Australia

The video titled "Scaling Ruby Applications: Challenges, Solutions and Best Practices", presented by Dmitry Sadovnikov at RubyConf AU 2024, covers the complexities of scaling Ruby applications, drawing on his eight years of experience. The presentation is structured around five key stages of project development, using an online store as a case study.

Key Points Discussed:
- Bootstrap Stage:
  - Initial focus on code scaling with one developer managing both front-end and back-end.
  - Importance of maintainable code to facilitate future development.
  - Emphasis on implementing testing practices using RSpec for a strong foundation.
  - Utilization of code linting tools like RuboCop for consistent style.
- Startup Stage:
  - Transition from a sole developer to a team by delegating tasks and reducing single points of failure.
  - Introduction of a playbook for procedural knowledge sharing and efficient project execution.
  - Establishing on-call schedules for incident management to balance support and feature development.
- Small Company Stage:
  - Integration of multiple teams (6 to 15 developers) and the emphasis on code isolation and responsibility.
  - Onboarding and offboarding processes to streamline team integration.
  - Use of staging environments and feature toggles for safe deployments and project changes.
- Medium Company Stage:
  - Scaling teams to 16 to 50 developers with independent management structures (tribes).
  - Recommendation to utilize cloud services and manage infrastructure through codified architecture (Terraform).
  - Exploration of service independence and database scaling for increased efficiency and effectiveness.
- Big Company Stage:
  - Strategies for vertical and horizontal growth, emphasizing division of workload (Dividing and Conquering).
  - Caution against premature optimizations to avoid hindering growth.

Main Takeaways:
- Invest more in people rather than just new technologies for the most sustainable development.
- Adapt processes and architectures sensibly as company size and project complexity grow.
- Ensure security measures such as data encryption are integral to project development.

The presentation underscores the importance of scaling thoughtfully, focusing on team dynamics, infrastructure management, and maintaining product quality amidst growth.

This presentation covers scaling Ruby applications using lessons learned over eight years. First, it touches on the importance of managing on-call duties, such as addressing incidents, fixing errors, and performing updates. The talk then moves to the topic of team structure, detailing roles such as domain lead and architect, and ways to split larger groups into smaller, agile teams.

The presentation also covers application updates, discussing methods like soft updates, force updates, and feature toggles. A focus on secure data management includes best practices for data encryption and protecting customers' data.

The importance of clear, understandable documentation is also discussed, along with strategies for fostering a testing culture in development teams and applying UML schemas before starting development.

Optimising the developer experience is essential for productivity - so the presentation explores ways to collect and implement developers' ideas. Finally, it explains how to use modern technologies to speed up releases, track success metrics, perform regular stress testing, and ensure ongoing security.

This presentation aims to help you navigate the complexities of scaling Ruby applications, saving both time and resources in your operations.

RubyConf AU 2024

00:00:04.200 Hi everyone! First, I want to thank the organizers, staff members, sponsors, participants, volunteers, and also the Sorty Team for helping me improve this presentation. My name is Dmitry. I am a senior software engineer at Cy, based in Melbourne. Over the last eight years, I have primarily worked with Ruby as my main programming language. I have been involved with companies of all sizes, from small startups to medium and large enterprises across various domains.
00:00:20.320 Currently, I am working at S, which is owned by Sik. I thank them for sponsoring me. If you want to learn more about my work or other personal projects, please follow the links to my GitHub and LinkedIn. In this presentation, I will walk through the five stages of a project's life. I will use the example of an online store with buyers and sellers, as this pattern can be found in many other companies.
00:00:52.800 We will start with one developer in a company who develops Rails applications and progresses through various stages: the bootstrap stage, startup stage, small company stage, medium company stage, and finish with a big company with multiple domains and teams. The presentation includes insights from my experience working in companies of different sizes, but your experience may differ. Some information may be apparent for experts, but it can still be interesting for engineers in small companies looking to improve their projects and for engineers in large corporations trying to understand the growth stages they might not have been part of.
00:01:31.280 The presentation contains information about scaling code, teams, databases, and services. We will also discuss a bit of DevOps and frontend considerations. In the beginning, we have the rails new command, and now you, as the owner of a company, think about how to scale your project.
00:02:08.040 The first stage of the project is the bootstrap stage, which focuses on code scaling. In this stage, one developer is responsible for both the front end and back end. Their primary goal is to deliver features that will generate revenue and enable the hiring of another developer. Writing readable and maintainable code is crucial for future scalability.
00:02:31.279 The first developer sets the code foundation, which will be difficult to change in the future. Be cautious, as a future developer might not appreciate your decisions, which could be dangerous if you write bad code. Instead of over-engineering, like adopting an asynchronous architecture with Kafka in microservices, think about how to ship code. Do not scale the project before you really need to.
00:03:23.959 At this stage, a key recommendation is to add testing. Sorry for stating the obvious, but RSpec is the best Ruby testing framework. You can quickly write BDD scenarios for tests. In this example, we have a simple testing concept. We have signup logic with two contexts: a failure with an invalid email and a success with a valid email. The success case includes two additional paths: when the user already exists, we shouldn't create a new one, and when the user doesn't exist, we must create one.
00:04:30.200 You can use this success and failure testing pattern with almost any logic. It is easy to read, and test coverage acts as a quality rating for your project. If you don't have time to write good tests, at least write smoke tests. Good tests make it easier to refactor the code and scale it in the future.
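The slide with the spec is not reproduced in this transcript. A minimal sketch of the success/failure layout described above could look like the following, assuming a hypothetical `SignUp` service and an in-memory hash standing in for the users table (neither comes from the talk). The spec is only defined when the file is loaded by the `rspec` runner.

```ruby
# Hypothetical sign-up service, used only to give the spec something to test.
class SignUp
  Result = Struct.new(:success, :user, :error)

  EMAIL_FORMAT = /\A[^@\s]+@[^@\s]+\.[^@\s]+\z/

  # `users` is an in-memory stand-in for a users table: { email => user }.
  def initialize(users)
    @users = users
  end

  def call(email)
    return Result.new(false, nil, :invalid_email) unless email.to_s.match?(EMAIL_FORMAT)

    # Reuse the existing user instead of creating a duplicate.
    user = @users[email] ||= { email: email }
    Result.new(true, user, nil)
  end
end

# Run with `rspec sign_up_spec.rb`; skipped when the file is run with plain `ruby`.
if defined?(RSpec)
  RSpec.describe SignUp do
    subject(:result) { described_class.new(users).call(email) }

    let(:users) { {} }

    context "failure" do
      context "with an invalid email" do
        let(:email) { "not-an-email" }

        it "returns an error and creates no user" do
          expect(result.success).to be(false)
          expect(users).to be_empty
        end
      end
    end

    context "success" do
      let(:email) { "buyer@example.com" }

      context "when the user already exists" do
        let(:users) { { email => { email: email } } }

        it "does not create a new user" do
          expect { result }.not_to(change { users.size })
        end
      end

      context "when the user does not exist" do
        it "creates a user" do
          expect { result }.to change { users.size }.from(0).to(1)
        end
      end
    end
  end
end
```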
00:05:05.279 To improve the project, make sure you add linting through RuboCop. You can use the default code styles or create your own rules, but do not get too fancy with it. Having a consistent code style helps to improve productivity, as it allows easier code reviews for developers who write and read in the same style.
00:05:54.839 Using namespaces in your code and database is a good practice. When a new developer looks at the app folder, they must have a general sense of what the project is about. Additionally, using namespaces will help you split the application into domains in the future. In this example, we have two namespaces: business and individual.
00:06:37.400 Service objects encapsulate the business logic, leading to easier code maintenance. One crucial aspect is that fat models and controllers usually represent nouns, while service objects can embody verbs, allowing you to write actions in any class. It's also essential to ensure that a good class is a small class, and namespace them to prevent clutter.
00:07:17.520 For example, when a user buys an item, they create an order, which triggers updates to statistics and notifications to the user. In other scenarios, such as when an admin creates an order for a user, there is no need to notify the user, but we must set the admin ID. Avoid writing all the logic in one model or controller.
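As a rough sketch of the two order scenarios above, each one can live in its own small, namespaced service object. All class names here are hypothetical, an array stands in for the orders table, and updating statistics in the admin flow is an assumption, since the talk only says that the admin flow skips the notification and records the admin ID.

```ruby
module Orders
  Order = Struct.new(:user_id, :item_id, :admin_id, keyword_init: true)

  # Shared step: build and store the order (the array stands in for a database table).
  class Create
    def initialize(store:)
      @store = store
    end

    def call(user_id:, item_id:, admin_id: nil)
      order = Order.new(user_id: user_id, item_id: item_id, admin_id: admin_id)
      @store << order
      order
    end
  end

  # A buyer purchases an item: create the order, update statistics, notify the user.
  class PlaceByUser
    def initialize(store:, stats:, notifier:)
      @create = Create.new(store: store)
      @stats = stats
      @notifier = notifier
    end

    def call(user_id:, item_id:)
      order = @create.call(user_id: user_id, item_id: item_id)
      @stats[:orders_count] += 1
      @notifier.call(user_id, "Your order was created")
      order
    end
  end

  # An admin creates an order for a user: no notification, but the admin ID is recorded.
  class PlaceByAdmin
    def initialize(store:, stats:)
      @create = Create.new(store: store)
      @stats = stats
    end

    def call(admin_id:, user_id:, item_id:)
      order = @create.call(user_id: user_id, item_id: item_id, admin_id: admin_id)
      @stats[:orders_count] += 1 # assumption: statistics are updated in both flows
      order
    end
  end
end
```

Keeping each scenario in its own verb-like class means the model and controller do not need to know which side effects apply to which flow.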
00:08:00.360 If you have multiple scenarios, it is better to create the order in service objects. Additionally, adding Docker Compose to package all requirements will save you time when booting the project. Continuous integration should be simple, like using GitHub Actions, which will save you hours when shipping the code.
00:08:43.000 Add error tracking and monitoring tools, enabling you to quickly identify bugs using user-friendly tools like Datadog and Honeybadger. Encryption is important to protect your users' personal information. While it is difficult to predict how data breaches can occur, they happen frequently. Attackers often require users' email to send spam and earn money, and having access to their full name makes it easier for them.
00:09:27.040 In this example, I demonstrate how to use simple Rails encryption for emails. However, we utilize more advanced tools and techniques at S and regularly rotate our keys to ensure maximum data protection. Do not encrypt everything; otherwise, your database will become slow and useless. Use hashing to improve searching. Ensure security regardless of the code quality, the number of developers you employ, or the architecture you are using.
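The encryption slide is not included in this transcript. In a Rails 7+ application the built-in `encrypts` declaration would be the usual starting point. To illustrate the idea of combining encryption with a hash for searching, here is a standard-library-only sketch. The module name, the keys generated at load time, and the storage layout are all assumptions for illustration; real keys would come from a secrets manager and be rotated.

```ruby
require "openssl"

# Illustrative keys only; do not generate keys at load time in a real application.
ENCRYPTION_KEY = OpenSSL::Random.random_bytes(32)
INDEX_KEY = OpenSSL::Random.random_bytes(32)

module EmailProtection
  module_function

  # AES-256-GCM with a fresh random IV per value.
  # The hex output packs: 12-byte IV + 16-byte auth tag + ciphertext.
  def encrypt(plaintext, key: ENCRYPTION_KEY)
    cipher = OpenSSL::Cipher.new("aes-256-gcm").encrypt
    cipher.key = key
    iv = cipher.random_iv
    ciphertext = cipher.update(plaintext) + cipher.final
    (iv + cipher.auth_tag + ciphertext).unpack1("H*")
  end

  def decrypt(encoded, key: ENCRYPTION_KEY)
    raw = [encoded].pack("H*")
    iv = raw[0, 12]
    tag = raw[12, 16]
    ciphertext = raw[28..-1]
    decipher = OpenSSL::Cipher.new("aes-256-gcm").decrypt
    decipher.key = key
    decipher.iv = iv
    decipher.auth_tag = tag
    (decipher.update(ciphertext) + decipher.final).force_encoding(Encoding::UTF_8)
  end

  # Keyed hash of the normalised email. Stored in an indexed column, it allows
  # exact-match lookups without decrypting every row.
  def search_hash(email, key: INDEX_KEY)
    OpenSSL::HMAC.hexdigest("SHA256", key, email.strip.downcase)
  end
end
```

A lookup would then compare `EmailProtection.search_hash(params_email)` against the stored hash column, while the encrypted column is only decrypted for the matching row.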
00:10:15.520 Failure to protect your data will quickly lead to business failure. Some useful tools I recommend for this stage of the application are ActiveInteraction, Trailblazer (especially operations), and almost all gems from dry-rb. For books, I recommend Effective Testing with RSpec.
00:11:05.000 The next stage of the project is the startup stage, which focuses on scaling the people in your team. Do not confuse this stage with company startup; some startups can have hundreds of developers. In this stage, the development department will have more than one developer.
00:11:47.160 The first developer begins delegating tasks to reduce the risk of a single point of failure and allow for scalability through teamwork. The lead developer should be responsible for about 80-90% of coding work and around 10% of management duties. Developers on the team should ideally be full-stack developers to ensure they can be replaced if necessary due to unforeseen circumstances, like illness.
00:12:35.600 The first recommendation for this stage is to introduce a good playbook to run projects or perform manual tasks such as password rotations or database updates. Avoid being the only person who knows how to do things. Instead, share information with others through documentation; otherwise, people will constantly ask you how to perform tasks, which can be problematic when you are on vacation.
00:13:02.960 Playbooks should live with the code and must be developer-centric. Add pull request templates to all repositories on GitHub or GitLab, making it easier for code reviews and saving time for everyone involved by providing valuable information such as links to documentation, tasks, screenshots, or lists of changes.
00:13:45.920 Staging is needed for code testing and must be identical to production. Developers, especially newcomers, have valuable ideas on how to improve the project. However, they may not generate good ideas on the spot. Having a designated place for collecting design ideas, like a documentation Slack channel, is essential.
00:14:22.680 Discuss these ideas regularly, at least once a month. Even seemingly silly ideas can lead to genius solutions. Pairing developers can help share knowledge, increase productivity, and foster learning. This involves absorbing insights from the tools people frequently use, such as shortcuts and debugging techniques.
00:15:04.760 Having a first-on-call schedule helps reduce developer distractions. Developers rotate between regular feature tasks and resolving errors and support requests. Each week, a designated person is on-call, which is a good tool for managing support.
00:15:34.880 Pay attention to project incidents and track them to prevent future mistakes and avoid unnecessary costs. I recommend the book Extreme Ownership for this stage of the project.
00:16:07.600 The third stage is the small company stage, which focuses on scaling teams. When you work with multiple teams, you must organize a better environment and ensure code isolation.
00:16:35.520 You will have two to four teams, with about 6 to 15 developers. At our store, we have a tech lead managing two teams. Team one is responsible for developing a mobile application for individuals or buyers, while team two develops a dashboard for sellers, enhancing their business experience. Both teams have their leads, and developers should have a good onboarding process with a checklist to complete before they begin coding.
00:17:20.440 The onboarding process should include information on getting a laptop, setting up accounts, understanding database models, code conventions, and company culture. Additionally, it’s essential to have a well-defined offboarding process with checklists to revoke access and return company property. Revoking access is crucial to avoid security risks.
00:18:02.120 To isolate responsibility in maintenance, utilize code owners on GitHub or GitLab. Anyone who changes code under a specific namespace must have the change reviewed by the team that owns it. This approach lets teams keep their own code styles and project planning, and it prevents bugs caused by one team's changes affecting another.
00:18:42.840 Each team requires staging to test features without mixing changes with other teams. Multiple staging environments can cost significant resources but save time in waiting for testing and releases. Feature toggles should be used to avoid issues with large new features that cannot be easily reverted. This approach prevents downtime during releases, because a feature can be switched on for the front end and back end at the same time.
00:19:29.520 This simple technique lets you redesign the entire project and continuously merge complete features into the main branch. However, it’s crucial to cover both enabled and disabled scenarios in your tests. When a feature is released, remember to remove the feature toggle by assigning the responsibility to a specific team.
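A minimal sketch of a feature toggle, assuming a hypothetical in-memory `FeatureToggle` store and a hypothetical `:new_checkout` feature (real projects often back this with a database table or a dedicated gem). The point is that both branches exist in the main branch at once, so both need test coverage.

```ruby
# Hypothetical in-memory toggle store.
class FeatureToggle
  def initialize(enabled = [])
    @enabled = enabled.map(&:to_sym)
  end

  def enabled?(name)
    @enabled.include?(name.to_sym)
  end
end

# The new checkout is merged into main but stays dark until the toggle is on.
class Checkout
  def initialize(toggles)
    @toggles = toggles
  end

  def call(cart_total)
    if @toggles.enabled?(:new_checkout)
      { flow: :new, total: cart_total }
    else
      { flow: :legacy, total: cart_total }
    end
  end
end

# Both scenarios are exercised, as the talk recommends.
enabled_result = Checkout.new(FeatureToggle.new([:new_checkout])).call(100)
disabled_result = Checkout.new(FeatureToggle.new).call(100)
```

Once the feature is fully released, the `else` branch and the toggle itself are deleted, which is the clean-up step that needs an owning team.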
00:20:16.760 During database migrations, it’s essential to prevent users from sending requests to your servers. Instead of displaying a 500 error, provide a friendly message asking users to return to the website in a few minutes; this can improve user experience. Some downtime is acceptable, and it is usually less costly than maintaining zero downtime.
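One way to show a friendly message during a migration window is a small Rack-style middleware in front of the application. This is a sketch under assumptions: the class name, the `MAINTENANCE_MODE` environment variable, and the message text are all hypothetical.

```ruby
# Rack-style middleware: responds to `call(env)` and returns [status, headers, body].
class MaintenanceMode
  MESSAGE = "We are updating the store. Please come back in a few minutes."

  # `enabled` is a callable so the switch can be flipped without a redeploy
  # (here a hypothetical environment variable).
  def initialize(app, enabled: -> { ENV["MAINTENANCE_MODE"] == "1" }, retry_after: 300)
    @app = app
    @enabled = enabled
    @retry_after = retry_after
  end

  def call(env)
    return @app.call(env) unless @enabled.call

    headers = {
      "content-type" => "text/plain; charset=utf-8",
      "retry-after" => @retry_after.to_s
    }
    # 503 tells clients and crawlers that the outage is temporary.
    [503, headers, [MESSAGE]]
  end
end
```

In a Rails application this could be registered with `config.middleware.insert_before 0, MaintenanceMode` so that it runs before the rest of the stack.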
00:21:06.440 You can deprecate all API endpoints for mobile apps or remove errors from older app versions that spam your error tracking systems. Additionally, you can prompt users to download a new software version. Soft updates should notify users to update without forcing them, while hard updates are necessary in situations like app hacking or significant vulnerabilities.
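The soft versus hard update decision can be sketched as a small version check on the API side. The class name and the version thresholds below are hypothetical; `Gem::Version` from RubyGems is used only for correct version comparison.

```ruby
# Decides what the mobile client should do based on its reported version.
class UpdatePolicy
  def initialize(minimum_supported:, latest:)
    @minimum_supported = Gem::Version.new(minimum_supported)
    @latest = Gem::Version.new(latest)
  end

  # :hard -> block the app until it is updated (e.g. a serious vulnerability)
  # :soft -> show a dismissible "update available" prompt
  # :none -> the client is up to date
  def check(client_version)
    version = Gem::Version.new(client_version)
    return :hard if version < @minimum_supported
    return :soft if version < @latest

    :none
  end
end

policy = UpdatePolicy.new(minimum_supported: "2.0.0", latest: "2.10.0")
```

Raising `minimum_supported` is also what allows old API endpoints to be retired, since no supported client can still call them.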
00:22:00.000 I recommend books such as Peopleware and The Staff Engineer's Path for this stage of the application. The fourth stage focuses on medium-sized companies, emphasizing the need to scale the teams effectively. Companies tend to separate their infrastructure into multiple domains.
00:22:56.520 In many cases, companies will have between three to five domains, with teams comprising 16 to 50 developers. At our store, we have too many teams and people for one tech lead to manage, necessitating an additional level of management for each domain or tribe. A tribe is similar to a separate isolated company within the larger organization.
00:23:47.440 For instance, we have a tech director who manages two tribes: the individual experience domain and the business experience domain. Each domain is managed by a tech lead or domain lead, who manages team leads that oversee developers. Each specific team in each domain operates independently.
00:24:38.360 For example, a mobile application team in the individual experience domain and a business API team within the business experience domain are each part of their respective domains.
00:25:36.120 At this stage, it would also be wise to start moving to a cloud provider like AWS, Azure, or Google Cloud. The first recommendation is to manage your infrastructure in a more scalable way.
00:26:08.560 It is best to have your project architecture written down and documented in code, such as using Terraform. If you have multiple services and many databases, this makes it simpler to manage. Alternatives like Pulumi exist if you prefer to write infrastructure using the same language as your application.
00:27:31.680 If you have millions of users, you cannot solve this with a single application; you will need to mirror and split the load. During the day, you can set different peak hours based on when users are accessing the application. By using auto-scaling rules, you can define scenarios for scaling—during periods of low usage, reduce instances down to one, but during times of heavy use, scale up to three or even ten instances.
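In practice the cloud provider evaluates auto-scaling rules, but the logic behind such a rule can be sketched in a few lines. The class name and the figure of 100 requests per second per instance are assumptions for illustration; the one-to-ten instance range follows the example above.

```ruby
# Sizes the fleet from current load, clamped between a minimum and a maximum.
class AutoScalingRule
  def initialize(min_instances: 1, max_instances: 10, rps_per_instance: 100)
    @min_instances = min_instances
    @max_instances = max_instances
    @rps_per_instance = rps_per_instance
  end

  # `current_rps` is the observed requests per second across the application.
  def desired_instances(current_rps)
    needed = (current_rps.to_f / @rps_per_instance).ceil
    needed.clamp(@min_instances, @max_instances)
  end
end

rule = AutoScalingRule.new
quiet_night = rule.desired_instances(0)     # clamps up to the minimum
busy_day    = rule.desired_instances(250)   # 250 / 100 rounded up
peak_sale   = rule.desired_instances(5_000) # clamps down to the maximum
```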
00:29:15.639 Kubernetes is a popular orchestration technology that major cloud providers such as AWS and Azure support. However, consider using Kubernetes without relying on a specific cloud provider, as it will facilitate provider changes when prices fluctuate. Despite its benefits, Kubernetes can be challenging without proper knowledge; consider alternatives like AWS Fargate or Azure Container Apps, which support orchestration and auto-scaling. These alternatives are easier for non-DevOps personnel.
00:30:34.760 When your monolith grows large, consider introducing new services. Ensure these new services are independent of each other, avoiding reliance on synchronous communication like HTTP requests. Instead, service calls can be decoupled using event-driven architectures.
00:31:32.880 To manage communication between services, avoid making them dependent on each other's responses. Here are some examples of message brokers you can consider for your architecture. We can discuss different technologies and their advantages, but the best approach is to try them yourself.
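To make the decoupling concrete, here is a minimal in-process stand-in for a message broker. It is not how Kafka, RabbitMQ, or EventBridge are used from Ruby; the `EventBus` class and the `order.created` topic are hypothetical and only show that the publisher never waits for a consumer's response.

```ruby
# In-memory sketch of publish/subscribe with deferred delivery.
class EventBus
  def initialize
    @subscribers = Hash.new { |hash, topic| hash[topic] = [] }
    @queue = []
  end

  def subscribe(topic, &handler)
    @subscribers[topic] << handler
  end

  # Publishing only enqueues the event and returns immediately.
  def publish(topic, payload)
    @queue << [topic, payload]
    nil
  end

  # Delivery happens separately, as a broker's consumers would do.
  def deliver_pending
    until @queue.empty?
      topic, payload = @queue.shift
      @subscribers[topic].each { |handler| handler.call(payload) }
    end
  end
end

bus = EventBus.new
notified = []
counted = []

# Two independent consumers; the orders service knows about neither.
bus.subscribe("order.created") { |event| notified << event[:user_id] }
bus.subscribe("order.created") { |event| counted << event[:order_id] }

bus.publish("order.created", { order_id: 42, user_id: 7 })
```

Adding a third consumer later requires no change to the publisher, which is the independence the paragraph above asks for.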
00:32:37.760 For instance, you might favor Kafka, which is written in Java and Scala, or choose RabbitMQ, written in Erlang. If you are locked into AWS, consider using EventBridge. When it comes to database scaling, before you engage in significant scaling efforts, first identify and resolve slow queries by adding proper indexing to the specific fields of your tables.
00:33:58.000 Vertical scaling of databases has its limits. To reduce the load on your main database, utilize read replicas. Each read replica serves as a full copy of the main database but be mindful that it can double the expenses.
00:35:31.640 Database splitting is crucial to segmenting data based on domains. This often involves separate databases for high-loaded parts of your application. Additionally, tables with different namespaces must not have direct access to one another.
00:36:57.760 While different domains sometimes need to access each other, this can be achieved through the use of reference tables. These tables consist of duplicated data and limited fields and should only be updated via synchronization of events from the original domain.
00:37:49.919 The left image illustrates a conventional database without reference tables, where domains directly access one another's tables. In contrast, the right image showcases a database with reference tables implemented. You can start adding reference tables within a single database, and once the data is fully isolated within that database, you can then split it into two.
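The slides are not part of this transcript, so here is a small sketch of the reference-table idea. It assumes a hypothetical `seller.updated` event from the business domain and a hypothetical `SellerReferenceTable` in the individual domain; a hash stands in for the database table.

```ruby
# Reference table owned by the consuming domain: duplicated data, limited
# fields, and written only by applying events from the owning domain.
class SellerReferenceTable
  FIELDS = %i[id display_name].freeze

  def initialize
    @rows = {}
  end

  # Handler for a hypothetical "seller.updated" event.
  def apply(event)
    row = event.slice(*FIELDS)
    @rows[row.fetch(:id)] = row.freeze
  end

  # Reads never touch the owning domain's tables.
  def find(id)
    @rows[id]
  end
end

sellers = SellerReferenceTable.new
sellers.apply({ id: 7, display_name: "Acme Store", tax_number: "hidden", email: "owner@example.com" })
sellers.apply({ id: 7, display_name: "Acme Superstore", tax_number: "hidden", email: "owner@example.com" })
```

Because the table carries only the fields the consuming domain needs, sensitive columns never leave the owning domain, and the two sides can later move to separate databases without code changes.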
00:38:44.440 Consider incorporating non-relational databases alongside your relational database, as they tend to be significantly more cost-effective and can efficiently scale horizontally. For example, if you have a table containing millions of analytical data rows, shifting that data to a NoSQL database can result in substantial cost savings.
00:39:43.280 Here are some examples of non-relational databases you might find useful: Elasticsearch can enhance full-text search capabilities; Neo4j is perfect for building graph-based applications, particularly in social networking or anti-fraud systems; and DynamoDB allows for storing terabytes of data at an affordable cost. Redis serves as an efficient caching system and can be utilized as a vector database for your generative AI applications.
00:40:50.960 Remember to prepare for scaling by simulating a high load on your system while maintaining a robust security posture to mitigate potential hacking attacks. Startups and small companies can be challenging for beginners; however, if your company has sound processes in place, consider hiring and training junior developers.
00:41:57.280 Junior developers are often highly engaged individuals who can grow alongside the project and acquire knowledge about every aspect of the application. Senior developers can share insights with junior team members, fostering mutual growth. You might find quality junior developers through Ruby tools.
00:42:54.480 I recommend the strong_migrations gem, which is akin to RuboCop but for your database migrations. This can help prevent locks and downtime, which are critical concerns if you have millions of rows in your database. Additionally, consider utilizing Packwerk, which will help to isolate code within your monolith before transitioning to a microservices architecture.
00:43:55.440 For books, I suggest focusing on Learning Domain-Driven Design, which helps with data-intensive applications and transitioning from monolithic systems to microservices.
00:44:50.760 Moving on to the fifth and final stage of scaling applications, which is the big company stage.
00:45:06.960 This phase represents success; you've built a significant company. Now, you can implement additional strategies by scaling vertically through management layers and horizontally by forming new teams and tribes.
00:46:08.960 In conclusion, the principle of Dividing and Conquering is vital for scaling. Without it, you will encounter limitations.
00:47:16.000 There are many discussions around scaling, and this presentation has only touched on a few points. The most crucial advice is to avoid premature optimization, which can damage your business. Instead, invest in people, as this is often more beneficial than merely adding new technologies for developers.
00:48:04.560 If you have any questions or feedback, please reach out to me. You can find this presentation by scanning the QR code and through the provided link. Thank you!