Avoiding Disaster: Practical Strategies for Error Handling

A talk from RubyConfTH 2023, held in Bangkok, Thailand on October 6-7, 2023.
Find out more and register for updates for our next conference at https://rubyconfth.com/

00:00:07.200 Hello, my name is Huy Du. I am a software engineer at Rakuten Viki.

00:00:10.400 I attended this Ruby conference last year and was inspired by many awesome speakers and the Ruby community. Consequently, I submitted my proposal this year and it got accepted.

00:00:19.800 You can find my details on Twitter and GitHub, and my full name is Huy Du. My personal website is exponentde.com.

00:00:29.679 So, what is programming? Programming involves solving problems by providing instructions to a computer. If you attended the talk yesterday about delegation, you might understand that we delegate our tasks and solutions to the computer for execution.

00:00:40.640 I am sure everyone here is familiar with the basic concepts of mathematics. If we represent programming in mathematical terms, we can say that f(x) = y, where f represents the solution to a given problem. Here, x is the input to the problem and y is the expected outcome. If we run the function f with the given value of x and get some other value, like z, instead of y, the outcome does not match our expected result.

00:01:02.960 This brings us to the topic of errors. Software errors are unexpected outcomes from computer programs. They occur when a program executes in a way that was not intended. Generally, we have two types of errors: expected errors and unexpected errors.

00:01:18.799 An expected error is a part of program execution that we, as software engineers, have to anticipate: we know it can happen while the program runs, or can at least predict that it may occur in the future. For example, if we create a web application to sell tickets for this Ruby conference, one of the expected errors could be that tickets have been sold out. Another could be that ticket purchases cannot proceed after midnight. These errors may also stem from external dependencies, such as using a library incorrectly or encountering network issues.

00:01:41.720 On the other hand, unexpected errors are not anticipated and can interrupt our program's execution. These are also known as defects. By definition, an unexpected error is one that the developer did not fully anticipate, so we cannot predict it with 100% certainty. It can stem from developer mistakes, such as implementing the wrong logic or failing to handle specific behaviors that lead to errors.

00:02:06.040 Some errors might be beyond our control, such as database corruption or memory-related issues, which relate to infrastructure. However, we can detect these situations earlier and limit their impact by improving the monitoring and observability of our software.

00:02:30.200 In this talk, we will not cover unexpected errors; we will focus on expected errors and how to handle them effectively. If we manage expected errors well, it will be easier for us to handle unexpected errors, helping us to avoid disasters and prevent incidents from happening in our production environment.

00:02:45.159 Let's quickly look at exceptions in Ruby. An exception is a special object that represents an error that occurs in a Ruby program. It carries information about the error, mainly its type (the exception class) and an error message, along with a backtrace. I'll provide a piece of code to visualize how exceptions work in Ruby.

00:03:05.680 For instance, consider a method that divides x by y. If the division raises an error, I catch that error and print it out. If the division results in an invalid action, I will raise an exception for that. I will intentionally divide 10 by zero to see what happens.

00:03:29.720 In the console, we can see the exception is raised and contains information about the type, message, and backtrace. Since the result was invalid, an error is generated. When an exception is raised in Ruby, it propagates up the call stack until it is handled. If nothing handles it, our Ruby program terminates.
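The slide code is not part of this transcript, so the following is only a minimal sketch of the behaviour described above. The method names (divide, inner, outer) are assumptions made for illustration.

```ruby
# Hypothetical method names; a sketch of the behaviour described in the talk.
def divide(x, y)
  x / y
rescue ZeroDivisionError => e
  # The exception object exposes its class, message, and backtrace.
  puts "#{e.class}: #{e.message}"
  puts e.backtrace.first
  nil
end

# No rescue here, so the exception leaves this method...
def inner
  10 / 0
end

# ...and travels up the call stack until a caller rescues it.
def outer
  inner
rescue ZeroDivisionError => e
  "handled in outer: #{e.message}"
end

divide(10, 2) # => 5
divide(10, 0) # prints "ZeroDivisionError: divided by 0", returns nil
outer         # => "handled in outer: divided by 0"
```

If neither method rescued the error, it would reach the top level and the script would stop with the backtrace printed.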

00:03:47.160 With this basic information, let's delve into an example problem. Suppose I want to build an application based on the following user story: a user wants to buy a ticket to attend the Ruby conference. Since we are in the Ruby community and the most successful stories involve creating web applications with Ruby on Rails, I will focus on delivering a web application.

00:04:05.920 Initially, we need to come up with a high-level system design that is very simple. Let's assume we have a front-end, an API server, and a database. The user interacts with the user interface in the front end to purchase a ticket. The front end sends a request with the user ID and ticket ID to create an order in our backend system, which interacts with the database to manipulate data. We also use a third-party payment gateway to manage payment transactions.

00:04:26.960 This presents a straightforward flow to represent the development of this feature. Now let’s consider the basic architecture for our application. We will use the MVC pattern but focus solely on the backend, without concerning ourselves with the front end for now. The model interacts with the database, while the controller handles interactions with the payment gateway.

00:04:48.640 I will propose a simple implementation where I wrap everything within the controller. Do not be afraid to read the code; I will explain it step by step. For buying a ticket, we need to submit the order with the user ID and ticket details. We identify the user and ticket based on the request parameters. If there are no issues, we will create an order with the provided user and ticket.

00:05:07.200 Next, we check if the user's balance is sufficient for making a purchase. If it is, we submit the payment to the payment gateway and return a response to the user once the payment succeeds. This describes the happy path. Now, concerning error handling, if the user's balance is insufficient, we need to return an error.

00:05:27.640 Moving on to error handling: since the methods we use for creating orders and processing payments raise exceptions whenever a parameter is invalid, we catch these exceptions and return the relevant error messages to the user. The same applies to any errors arising from payment processing. This is a naive implementation, and it has its flaws: it is simplistic and leads to poor extensibility, readability, and reusability.
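A rough sketch of what such a naive, controller-only implementation might look like follows. It is written in plain Ruby so that it runs without Rails. Every name in it (RecordInvalid, PaymentError, PaymentGateway, USERS, TICKETS, buy_ticket) is a hypothetical stand-in, not the code shown in the talk.

```ruby
# Hypothetical stand-ins for the models and the payment gateway client.
class RecordInvalid < StandardError; end
class PaymentError < StandardError; end

User   = Struct.new(:id, :balance)
Ticket = Struct.new(:id, :price)
Order  = Struct.new(:user, :ticket) do
  # Mimics a model method that raises on invalid input.
  def self.create!(user:, ticket:)
    raise RecordInvalid, "user and ticket are required" if user.nil? || ticket.nil?
    new(user, ticket)
  end
end

module PaymentGateway
  # Pretend gateway: declines any charge above 1_000.
  def self.charge!(order)
    raise PaymentError, "card declined" if order.ticket.price > 1_000
    true
  end
end

USERS   = { 1 => User.new(1, 100), 2 => User.new(2, 5) }
TICKETS = { 1 => Ticket.new(1, 50) }

# Naive "controller action": lookup, balance check, payment, and
# error handling all live in one method.
def buy_ticket(params)
  user   = USERS[params[:user_id]]
  ticket = TICKETS[params[:ticket_id]]
  order  = Order.create!(user: user, ticket: ticket)

  if user.balance < ticket.price
    return { status: 422, error: "Insufficient balance" }
  end

  PaymentGateway.charge!(order)
  { status: 200, order: order }
rescue RecordInvalid => e
  { status: 422, error: e.message }
rescue PaymentError => e
  { status: 402, error: e.message }
end

buy_ticket({ user_id: 1, ticket_id: 1 })  # happy path, status 200
buy_ticket({ user_id: 2, ticket_id: 1 })  # insufficient balance, status 422
buy_ticket({ user_id: 99, ticket_id: 1 }) # unknown user, status 422
```

Even at this size, the action mixes three concerns, which is the flaw the speaker goes on to address.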

00:06:11.520 However, we can improve upon this by encapsulating the logic within services rather than expressing everything in the controller. Therefore, we can create a business logic layer while the model continues to interact with the database. This approach helps us keep our business logic localized, which simplifies future changes.

00:06:32.600 This is the implementation of the controller after refactoring. I am breaking down the logic into smaller pieces. I create a create order service, and I wrap the payment submission logic within a checkout order service. I also catch exceptions in these services and return any errors appropriately.

00:06:49.760 For the checkout order, I indicate two types of special values for signaling errors: insufficient balance and payment failure. In case of a payment error, the service returns an error message to indicate the issue.

00:07:09.600 Now we apply this refactoring back into the controller. We can see that in the controller, there is no need to catch exceptions explicitly anymore, as we already manage those in the services. The challenge here is that we're left with too many error types to indicate different errors within the controller.

00:07:32.600 This situation leads to inconsistent error values and a lack of standardization. This aspect is vital because delivering software isn't a solo endeavor; it involves collaboration within a team. Each team member may have a different approach to solving problems, such as returning errors as strings, symbols, or null values, which can lead to conflict.

00:07:49.920 Additionally, we expose too many error types within the controller as we still need to handle many errors. Understanding the logic within the checkout order service also complicates matters, making it feel less reusable.
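To make the problem concrete, here is a hedged sketch of services that signal errors through special return values. The class names, the decline flag on the gateway stub, and the specific return values (nil, a Symbol, a String) are assumptions chosen to illustrate the inconsistency, not the speaker's actual code.

```ruby
User   = Struct.new(:id, :balance)
Ticket = Struct.new(:id, :price)
Order  = Struct.new(:user, :ticket)
class PaymentError < StandardError; end

# Hypothetical gateway stub: declines when asked to.
module PaymentGateway
  def self.charge!(order, decline: false)
    raise PaymentError, "card declined" if decline
    true
  end
end

class CreateOrderService
  # Returns an Order on success, or nil when the input is invalid.
  def call(user, ticket)
    return nil if user.nil? || ticket.nil?
    Order.new(user, ticket)
  end
end

class CheckoutOrderService
  # Returns true on success, a Symbol for insufficient balance,
  # or a String message for a payment failure.
  def call(order, decline: false)
    return :insufficient_balance if order.user.balance < order.ticket.price
    PaymentGateway.charge!(order, decline: decline)
    true
  rescue PaymentError => e
    "Payment failed: #{e.message}"
  end
end

# The controller no longer rescues anything, but it must know
# every special value each service can return.
def buy_ticket(user, ticket, decline: false)
  order = CreateOrderService.new.call(user, ticket)
  return { status: 422, error: "Invalid order" } if order.nil?

  result = CheckoutOrderService.new.call(order, decline: decline)
  case result
  when true                  then { status: 200, order: order }
  when :insufficient_balance then { status: 422, error: "Insufficient balance" }
  when String                then { status: 402, error: result }
  end
end
```

The caller has to distinguish nil, true, a Symbol, and a String, and nothing stops a teammate from adding a fifth convention.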

00:08:03.760 Despite the refactoring leading to better code flow and readability, we still face poor encapsulation issues, leading to a lack of reusability. To resolve this, we can return a monad as a result. A monad is a structure that wraps the return value of a function, acting as a container that holds a value and defines operations to manipulate that value.

00:08:16.679 Simply put, we can view it as either a success or a failure, effectively separating our error handling into two categories. This structure allows us to handle errors in a more straightforward manner. So, how do we apply this concept to Ruby?

00:08:39.440 We need to understand the ‘monad result.’ A monad result is an object that contains either a success value or a failure error. In Ruby, we can build the monad result with an open struct. The struct has two attributes, success and failure, which can be used to check the outcome of the service.

00:09:12.080 If the service is successful, the result attribute will contain the service's result; if it fails, the error attribute will capture the reason for the failure. Now, let's apply monadic handling to our create order and checkout order services.
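One possible shape for such a result object is sketched below. The talk mentions an open struct; this sketch swaps in a plain Struct with keyword arguments. The constant name Result and the predicate methods success? and failure? are assumptions.

```ruby
# A minimal result object. Uses Struct rather than OpenStruct (an assumption)
# so that the set of attributes is fixed.
Result = Struct.new(:success, :result, :error, keyword_init: true) do
  def success?
    success == true
  end

  def failure?
    !success?
  end
end

ok  = Result.new(success: true, result: 42)
bad = Result.new(success: false, error: "Insufficient balance")

ok.success?  # => true
ok.result    # => 42
bad.failure? # => true
bad.error    # => "Insufficient balance"
```

Callers only ever check success? or failure? and then read result or error, regardless of which service produced the object.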

00:09:39.600 We begin by making create order a child of the monadic service and wrap it within the monad result. The same goes for the checkout order. This approach allows us to compare the two potential values: success or failure.

00:10:03.760 After applying these changes to our controller, we observe that it handles errors more clearly. Compared to the earlier naive implementation, the structure and flow of the code are significantly improved.
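The following is a sketch of how the two services and the controller might look once they return a result object. The MonadicService base class is named in the talk, but its interface here (a class-level call plus private success and failure helpers) is an assumption, as are the error messages. The payment gateway call is omitted.

```ruby
Result = Struct.new(:success, :result, :error, keyword_init: true) do
  def success?
    success == true
  end

  def failure?
    !success?
  end
end

User   = Struct.new(:id, :balance)
Ticket = Struct.new(:id, :price)
Order  = Struct.new(:user, :ticket)

# Hypothetical base class: subclasses implement #call and return
# either success(value) or failure(error).
class MonadicService
  def self.call(*args)
    new.call(*args)
  end

  private

  def success(value)
    Result.new(success: true, result: value)
  end

  def failure(error)
    Result.new(success: false, error: error)
  end
end

class CreateOrder < MonadicService
  def call(user, ticket)
    return failure("User and ticket are required") if user.nil? || ticket.nil?
    success(Order.new(user, ticket))
  end
end

class CheckoutOrder < MonadicService
  def call(order)
    return failure("Insufficient balance") if order.user.balance < order.ticket.price
    # A real implementation would call the payment gateway here and
    # turn any exception it raises into a failure result.
    success(order)
  end
end

# The controller handles every service the same way: check failure?, read error.
def buy_ticket(user, ticket)
  created = CreateOrder.call(user, ticket)
  return { status: 422, error: created.error } if created.failure?

  paid = CheckoutOrder.call(created.result)
  return { status: 422, error: paid.error } if paid.failure?

  { status: 200, order: paid.result }
end
```

Compared with special return values, the controller no longer needs to know how each service encodes its errors.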

00:10:23.400 Using monadic handling also provides a standardized and consistent way for error handling, allowing for explicit management of success and failure cases. This reduces the complexity of handling different scenarios, making the code more readable and maintainable.

00:10:41.760 Some may still wonder whether this approach is enough to integrate into their application, or when to use it for error handling. In general, we can manage errors by adhering to two key concepts.

00:11:00.000 First, think back to the initial formula we discussed: f(x) = y, where the input x must meet certain validity criteria before the function executes. If x is valid, we can then expect the outcome to equal y.

00:11:23.720 The two concepts are validity and execution. Validity refers to ensuring that inputs conform to the requirements the function expects. For instance, if our problem requires x to be a positive integer, our validity error handling will check this requirement.

00:11:46.800 If the input is invalid, we return early with an error message. So, we can map validity errors to the expected errors discussed earlier. Execution errors, in contrast, are issues that arise during the actual execution of the function after all validity checks pass.

00:12:14.760 So, if any unexpected errors occur during this execution, our error handling should catch and handle the exception gracefully, logging details for debugging and reporting the error to the developer.
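The split between validity and execution can be sketched as follows, using the positive-integer requirement mentioned above. The method name run, the hash-shaped return value, and the use of warn as a stand-in for real logging and error reporting are all assumptions.

```ruby
# Hypothetical wrapper around some function f, passed in as a block.
def run(x)
  # Validity: reject bad input early and return an error (an expected error).
  unless x.is_a?(Integer) && x.positive?
    return { ok: false, error: "x must be a positive integer" }
  end

  # Execution: input is valid, so run the function.
  { ok: true, value: yield(x) }
rescue StandardError => e
  # Execution failed unexpectedly: log details for debugging,
  # then return a generic error instead of crashing.
  warn "#{e.class}: #{e.message}" # stand-in for a logger or error tracker
  { ok: false, error: "Something went wrong" }
end

run(4)  { |x| x * 2 } # validity passes, execution succeeds
run(-1) { |x| x * 2 } # validity fails, the block never runs
run(4)  { |x| x / 0 } # validity passes, execution raises and is logged
```

Only the last case involves an exception. The invalid input is handled as ordinary control flow.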

00:12:37.040 In summary, it's important to question whether an error is truly unexpected before raising it, as unexpected errors can lead to significant disruptions in program flow and execution.

00:12:58.080 To wrap up, in this talk, we have explored the different types of errors in software applications, reviewed how exceptions work in Ruby, created a boundary for business logic layers, encapsulated them within service objects, and employed monadic error handling by combining service objects with monad results.

00:13:20.760 More importantly, you now have practical strategies for handling errors effectively. In the end, preventing your system from crashing requires more than just basic error handling.

00:13:43.360 Thank you for your attention.