00:00:05.000
Hi folks, welcome to my talk. Today, I'm going to be talking about designing APIs and why less data is more. I'm super glad I could present to you all today.
00:00:14.400
I'm sad that we're not all together at a nice location where we could meet up, but at the same time, it's great that we can all join this conference from across the globe.
00:00:26.640
Let me introduce myself. My name is Damir Svrtan, my pronouns are he/him, and I work as a Senior Software Engineer at Netflix where I spend most of my time building APIs. I'm recording this from San Francisco, which is where I'm currently based, having moved here about three years ago. Prior to that, I lived in Zagreb, the capital city of Croatia, a small country in Southeastern Europe. That's where I grew up and spent the first 28 years of my life. I used to organize a local Ruby Meetup called Ruby Zagreb, so a big shout-out to all the members of that community.
00:00:51.480
Now, why is this topic relevant to me? I've been building APIs for the last seven years and have seen a pattern where developers like to expose more data than is actually needed. I want to talk about avoiding overhead when designing APIs, and by overhead I mean bloated, overly flexible APIs with queries that nobody asks for, endpoints that nobody's using, and generally unused functionality, like extra fields and relationships.
00:01:11.940
All of these things often stem from developers trying to be speculative about what's going to be needed in the future. They build things up front that they may never need, which they end up having to maintain. What kind of APIs am I talking about? I'm referring to HTTP-based APIs, such as REST APIs, JSON APIs, even GraphQL, and so on. However, this talk will be API technology agnostic and applicable to all kinds of APIs.
00:01:43.440
Throughout this talk, we're going to be building a blogging platform, something similar to Medium or Dev.to. It will be fairly simple, featuring authors who can release posts, and each post can have comments. We'll discuss two principles for building APIs: the first being designing the minimal API surface, or how not to overexpose data on your APIs; and the second being designing from strict to loose—how to avoid building extra flexibility that nobody asked for.
00:02:07.200
So let's start with the first principle: designing the minimal API surface. I often see a pattern where developers try to be speculative about what's going to be needed in the future, so they overbuild their APIs. I'll break this down into three patterns of a bloated surface that I usually see: redundant fields, redundant relationships, and redundant input fields.
00:02:40.500
First, let's discuss avoiding redundant fields. Imagine we have a requirement from our product management that we need to show the author of the blog post. Let's say we're storing authors in a database and it includes fields such as ID, first name, last name, email, and an avatar URL. In a design drawing inspiration from Medium, we have fields like avatar, first name, and last name, but notice we're not exposing the email.
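A minimal sketch of that design, assuming a hypothetical `Author` record and an allow-list named `PUBLIC_AUTHOR_FIELDS` (neither name comes from the talk). The point is only that the response is built from what the design renders, not from everything the database stores:

```python
from dataclasses import dataclass


@dataclass
class Author:
    # Fields stored in the database, as listed in the talk.
    id: int
    first_name: str
    last_name: str
    email: str
    avatar_url: str


# Hypothetical allow-list: only the fields the current design renders.
PUBLIC_AUTHOR_FIELDS = ("first_name", "last_name", "avatar_url")


def serialize_author(author: Author) -> dict:
    """Build the API representation from the allow-list only."""
    return {name: getattr(author, name) for name in PUBLIC_AUTHOR_FIELDS}


author = Author(1, "Ada", "Lovelace", "ada@example.com", "https://example.com/ada.png")
print(serialize_author(author))  # the email never leaves the server
```

Adding a field later is a one-line change to the allow-list, whereas taking one away means coordinating with every client that might read it.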
00:03:14.400
The friendly API developer might suggest exposing the email because someone might find it useful in the future. They might think, "Why wouldn't we just expose that email field right away?" The reasoning is that this saves time for the business, since it seems easier to expose the field now than later. However, what if we later need to remove that field for privacy reasons, such as compliance with GDPR or California privacy laws?
00:03:44.760
If we have to deprecate the email field, we might need to go through a deprecation cycle that involves a lot of communication with clients. This could mean sending out emails to clients to inform them of the change and giving them lead time to adjust. For private APIs with only a handful of clients, this process can be manageable, but for platform APIs with many users, this can become overly complex and time-consuming.
00:04:09.600
Having an API where clients pick and choose which fields they want can help mitigate this issue, as with GraphQL or sparse fieldsets in the JSON:API specification, because the requests themselves show which fields are in use. However, without proper observability, you won't know whether a field is being used or not, and unnecessary exposure can lead to complications down the road.
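As a rough sketch of how field selection and observability can work together, the helper below returns only the requested fields and counts which ones were asked for. The names `select_fields` and `field_usage` are hypothetical, and the in-memory counter stands in for whatever metrics system a real service would use:

```python
from collections import Counter

# Hypothetical in-memory usage counter; a real service would emit metrics instead.
field_usage = Counter()

AUTHOR_FIELDS = {"first_name", "last_name", "avatar_url", "email"}


def select_fields(record: dict, requested: str) -> dict:
    """Return only the requested fields and record which ones were asked for.

    `requested` is a comma-separated list, in the style of a sparse fieldset
    query such as `fields[authors]=first_name,avatar_url`.
    """
    names = [name.strip() for name in requested.split(",") if name.strip()]
    unknown = set(names) - AUTHOR_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    field_usage.update(names)
    return {name: record[name] for name in names}


author_record = {
    "first_name": "Ada",
    "last_name": "Lovelace",
    "avatar_url": "https://example.com/ada.png",
    "email": "ada@example.com",
}
select_fields(author_record, "first_name,avatar_url")

# Fields that no client has requested so far are candidates for removal.
never_requested = AUTHOR_FIELDS - set(field_usage)
print(sorted(never_requested))
```

With usage data like this, a deprecation conversation can start from which clients actually read a field instead of from guesswork.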
00:04:42.300
The technical aspect of removing a field is straightforward, but the communication and coordination required to inform stakeholders and clients can be exhausting. Next, let's move on to the second part of the first principle: not exposing redundant relationships. This is similar to avoiding redundant fields but has some nuances.
00:05:17.580
Let's say we have a requirement to indicate whether a post has been reviewed. The friendly API developer might go further and expose a reviewer relationship for future use cases, such as showing who reviewed the post. However, just like the previous example, this can lead to unnecessary complexity and maintenance work.
00:05:45.480
What happens if we later decide that instead of one reviewer, we now need to show multiple reviewers? This would involve another round of deprecation communication, further complicating the client experience and possibly breaking existing implementations. Thus, it's critical to delay decisions where possible and avoid overengineering.
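One way to picture the stricter alternative, assuming a hypothetical `Post` record that stores a `reviewer_id` internally: the API exposes only a `reviewed` flag, so the storage can later move to several reviewers without any client noticing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Post:
    id: int
    title: str
    reviewer_id: Optional[int]  # internal storage detail (hypothetical), never exposed


def serialize_post(post: Post) -> dict:
    """Expose only what the requirement asks for: has this post been reviewed?"""
    return {
        "id": post.id,
        "title": post.title,
        "reviewed": post.reviewer_id is not None,
    }


print(serialize_post(Post(id=1, title="Less data is more", reviewer_id=7)))
print(serialize_post(Post(id=2, title="Draft", reviewer_id=None)))
```

If product later asks to show who reviewed a post, a `reviewers` relationship can be added at that point, already shaped for the real requirement.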
00:06:13.620
Finally, in this first principle, let’s talk about avoiding redundant input fields. This pertains particularly to the payload your API accepts when mutating or changing data, such as in a REST API. An example scenario could involve enabling readers to create and update comments.
00:06:38.160
In this situation, we might define an input for creating a comment that requires two fields: post ID and body. The friendly API developer might suggest mirroring this for the update comment input, which would lead to exposing a post ID field there as well, even though an update only needs the comment's ID and the body.
00:07:07.680
This approach unnecessarily complicates the API's schema, requiring clients to handle logic that shouldn't be their concern. Instead, we should ensure that the logic of our application remains on the server side rather than on the client's side.
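A small sketch of keeping the two inputs separate, assuming an update only needs the comment's own ID and the new body. The class names and the `parse_input` helper are hypothetical:

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CreateCommentInput:
    post_id: int  # which post the new comment belongs to
    body: str


@dataclass(frozen=True)
class UpdateCommentInput:
    id: int    # the comment being edited
    body: str  # the only thing a reader may change


def parse_input(input_type, payload: dict):
    """Build an input object, rejecting anything the input type does not declare."""
    allowed = {f.name for f in fields(input_type)}
    unexpected = set(payload) - allowed
    if unexpected:
        raise ValueError(f"unexpected fields: {sorted(unexpected)}")
    missing = allowed - set(payload)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return input_type(**payload)


print(parse_input(CreateCommentInput, {"post_id": 10, "body": "Great post!"}))
print(parse_input(UpdateCommentInput, {"id": 3, "body": "Great post, thanks!"}))
```

Because the update input has no `post_id`, the question of what it would mean to move a comment to another post never reaches the client or the server.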
00:07:47.880
It's far easier to add things in the future than to remove them. Starting from a strict definition leaves room to add flexibility later without complicating the client interaction. Conversely, if you start with flexibility and later need strictness, you're imposing breaking changes on your clients.
00:08:18.000
The key takeaway here is to avoid exposing redundant fields, relationships, and input fields to minimize bloat in your APIs. Now let's move on to the second principle: moving from strict to loose in API design. This principle is about understanding the balance between flexibility and stability in your API.
00:08:57.960
The first step is to avoid unnecessary flexibility. Your APIs should be designed with a clear understanding of the needs of your clients. This means that if an input is required, make it required rather than optional.
00:09:28.680
An example of this would be an endpoint designed to fetch comments on a post, which should ideally accept a post ID. The friendly API developer might make the post ID optional, thinking that clients might want to fetch all comments in the future without specifying a post. However, this strategy introduces unnecessary complexity to the application logic.
00:10:02.160
Every optional argument like this is another code path to maintain when changes arise. More code translates to a higher chance of bugs and performance issues. Therefore, it's crucial to keep the logic coherent and consistent, and to build only what actually serves your clients.
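To make the strict version concrete, here is a minimal sketch with a hypothetical in-memory `COMMENTS` list and a `comments_for_post` function whose post ID is mandatory, so there is no "all comments" mode to support:

```python
# Hypothetical in-memory data standing in for a database table.
COMMENTS = [
    {"id": 1, "post_id": 10, "body": "Great post!"},
    {"id": 2, "post_id": 10, "body": "Thanks for sharing."},
    {"id": 3, "post_id": 11, "body": "Interesting take."},
]


def comments_for_post(post_id: int) -> list:
    """Fetch comments for exactly one post; post_id is required, not optional."""
    if not isinstance(post_id, int):
        raise TypeError("post_id is required and must be an integer")
    return [comment for comment in COMMENTS if comment["post_id"] == post_id]


print(comments_for_post(10))
```

Loosening this later, for example by giving `post_id` a default, would not break existing callers, while going from optional to required would.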
00:10:40.680
Next, let’s talk about defensive programming. It’s essential to ensure your API is built to prevent abuse. For instance, if an API endpoint is designed to fetch comments, consider implementing pagination to avoid potential performance issues.
00:11:06.180
It's better to limit the number of comments returned per request than to push the server beyond its capacity. Establishing these limits early prevents larger scaling issues later on.
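A sketch of such a limit, using simple offset pagination. The `paginate` helper and the values of `MAX_PAGE_SIZE` and `DEFAULT_PAGE_SIZE` are assumptions chosen for illustration, not numbers from the talk:

```python
MAX_PAGE_SIZE = 50      # hypothetical hard cap per request
DEFAULT_PAGE_SIZE = 20  # hypothetical default when the client sends no limit


def paginate(items: list, offset: int = 0, limit: int = DEFAULT_PAGE_SIZE) -> dict:
    """Return one page of items and the offset of the next page, if any."""
    if offset < 0:
        raise ValueError("offset must be zero or greater")
    if not 1 <= limit <= MAX_PAGE_SIZE:
        raise ValueError(f"limit must be between 1 and {MAX_PAGE_SIZE}")
    end = offset + limit
    next_offset = end if end < len(items) else None
    return {"items": items[offset:end], "next_offset": next_offset}


comment_ids = list(range(1, 121))  # pretend a post has 120 comments
first_page = paginate(comment_ids)
print(len(first_page["items"]), first_page["next_offset"])
```

Raising the cap later is a non-breaking change, while introducing a cap after clients rely on unbounded responses is not.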
00:11:30.780
By focusing on these two principles, minimizing the exposure of unused data and starting strict before loosening only when there is a real need, you can ensure a smoother experience for both API developers and clients. Additionally, you can save time and resources in the long run.
00:12:12.120
In conclusion, I hope you take away the importance of avoiding redundancy in API design. Redundant structures slow down progress on more critical features. By prioritizing effective API documentation and streamlining queries, you'll optimize your API's performance.
00:12:36.000
Less is more when it comes to data exposure. Although we often think we are helping our clients by building unnecessary features, we may inadvertently hinder development. Thus, it's crucial to communicate effectively with your clients while maintaining control over API structures.
00:13:12.840
Thank you all for listening. If you want to discuss this topic further, feel free to reach out to me afterwards or connect with me on Twitter.