Optimizing for API Consumers with GraphQL

By Brooks Swinnerton

GraphQL is an exciting new query language that's transforming the way we think about APIs. Used in production by Facebook, GitHub, and Shopify, it challenges RESTful API design by empowering consumers to query for exactly the information they need. In this talk, I will give an introduction to the query language, how GitHub uses it internally with Ruby and Rails, and the lessons they learned launching their GraphQL API externally.

GoRuCo 2017

00:00:16.470 Hi everyone, my name is Brooks Swinnerton. I am a platform engineer at GitHub, where the team I work on focuses on our APIs and webhooks.

00:00:22.869 Today, in the theme of developer happiness, I'd like to talk about optimizing APIs for consumers with GraphQL. Before we dive into GraphQL, let's start with APIs, specifically REST APIs.

00:00:30.130 I'm curious, by a show of hands, how many of you have written code that consumes a REST API? Awesome! Cool! How many of you have written your own REST APIs? Dang, awesome. And how many of you, before today, have heard of GraphQL? Dang, that's amazing! Alright!

00:00:44.100 Throughout today's talk, I'd like to use a particular API as a point of reference. This is an API that I put together about a year ago, and it's a wrapper around an NYC open data set for health inspections of restaurants here in New York City. For anyone who is visiting New York City for the conference, whenever you walk into a restaurant, they are required to post their letter grade, which is the health rating from the last inspection they received.

00:01:06.790 When looking at a REST API, there are a few components to it. The first is an HTTP verb or method that specifies the operation we want to perform against the API. The second is an endpoint, which is an identifier that allows us to specify which part of the API we'd like to interact with. If we execute this request, we'll get back a response that looks like this. In this case, the API returns JSON composed of resources.

00:01:31.210 If we look at what one of those resources might look like, we can see that there are a few components to the response. The first is attributes, such as the name and address of the restaurant and the cuisine that it's categorized as. Sometimes there are also hypermedia components, which are machine-readable links to related information elsewhere in the API. For example, there may be an inspections URL that, if we follow the link, will provide more information about the inspections that the restaurant has received.

00:02:06.330 As we execute more requests, we can build a fuller picture of the data we need to create our application. However, if you imagine us as a desktop application, you can see that it takes numerous requests to gather the information required by the client. This is especially painful when working with a mobile client, which might be on a very slow connection.
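To make the round-trip cost concrete, here is a minimal Ruby sketch. It does not call the real API: the paths, field names, and data are hypothetical stand-ins held in memory, and the client simply counts how many requests it takes to follow the hypermedia links from a restaurant to its inspections and their violations.

```ruby
# Hypothetical in-memory stand-in for a REST API: each path maps to a response.
# The paths, field names, and values are assumptions made for illustration.
FAKE_REST_API = {
  "/restaurants/1" => {
    "name" => "Example Cafe",
    "inspections_url" => "/restaurants/1/inspections"
  },
  "/restaurants/1/inspections" => {
    "inspections" => [
      { "grade" => "A", "violations_url" => "/inspections/10/violations" }
    ]
  },
  "/inspections/10/violations" => {
    "violations" => [{ "description" => "Example violation" }]
  }
}.freeze

# Counts every simulated GET so we can see how many round trips are needed.
class CountingClient
  attr_reader :request_count

  def initialize(api)
    @api = api
    @request_count = 0
  end

  def get(path)
    @request_count += 1
    @api.fetch(path)
  end
end

client = CountingClient.new(FAKE_REST_API)
restaurant = client.get("/restaurants/1")
inspections = client.get(restaurant["inspections_url"])["inspections"]
violations = inspections.flat_map do |inspection|
  client.get(inspection["violations_url"])["violations"]
end

puts "#{client.request_count} requests to show #{restaurant['name']}"
```

Each extra inspection adds another request for its violations, so the count grows with the data rather than staying fixed.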

00:02:30.050 So, REST APIs are optimized around servers and not necessarily clients. They are modeled around the resources they return, which makes them general-purpose and reusable from client to client. Honestly, that's one of REST's biggest strengths. However, I pose a question to us as API designers: What if we wanted to put our clients first? What if we wanted to create a better developer experience for those consuming these APIs?

00:03:16.350 One pattern for achieving this is through the backends for frontends approach popularized by Netflix. The general idea is that each specific device accessing an API, like Netflix's API, has its own separate endpoint. However, this can become problematic when dealing with deep pagination or complex object relationships returned from a single endpoint. For example, you'd have to iterate through various restaurants, their inspections, and their violations. This can be especially difficult if a new device is released that needs to handle all of this new information.

00:04:02.770 This approach only really works if the API designer fully understands the client and its data requirements. What about public APIs, where you may not have insight into the client's data needs? That's where GraphQL comes in. It is a data query language for your API. Think of it like SQL, which is the lowest common denominator among various technologies like MySQL and PostgreSQL.

00:04:58.370 GraphQL is a query language specification that has been implemented in various languages, including Ruby, JavaScript, Python, and more. The language was originally created by Facebook to address the data access needs of their mobile clients. Here’s the "hello world" of GraphQL. You'll notice that it looks almost like JSON, with the keys but without the values. This is on purpose, and we will see why shortly.

00:05:50.080 There are a few different components to a GraphQL query. The first is what we call a selection set, which is the group of things we are asking for. Here the top-level selection is "me," which represents the currently authenticated user. For the currently authenticated user, I'd like to fetch the first name, last name, and email. We call these fields. If you execute this query against a GraphQL server, you will receive a response that looks like this, which is also JSON. The query defines the structure of the response, and that's a crucial aspect of GraphQL.
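The idea that the query defines the shape of the response can be sketched in a few lines of Ruby. This is a toy, not a GraphQL implementation: the selection is written by hand as a nested Hash instead of being parsed from the query string, the camelCase field names are an assumption about how the query was spelled, and the user record is made up.

```ruby
require "json"

# The hello world query described above. Field spelling is an assumption.
HELLO_WORLD_QUERY = <<~GRAPHQL
  {
    me {
      firstName
      lastName
      email
    }
  }
GRAPHQL

# Hypothetical server-side record; it holds more than the client asked for.
CURRENT_USER = {
  "firstName" => "Ada",
  "lastName" => "Lovelace",
  "email" => "ada@example.com",
  "createdAt" => "2017-01-01"
}.freeze

# Toy executor: walk the selection and copy only the requested fields.
# A nil sub-selection marks a leaf field (a scalar value).
def select_fields(selection, data)
  selection.each_with_object({}) do |(field, sub_selection), result|
    value = data.fetch(field)
    result[field] = sub_selection ? select_fields(sub_selection, value) : value
  end
end

# Hand-written equivalent of HELLO_WORLD_QUERY's selection set.
selection = { "me" => { "firstName" => nil, "lastName" => nil, "email" => nil } }
response = { "data" => select_fields(selection, { "me" => CURRENT_USER }) }

puts HELLO_WORLD_QUERY
puts JSON.pretty_generate(response)
```

The response contains the three requested fields and nothing else, even though the underlying record has a fourth.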

00:06:23.760 Returning to the NYC restaurant grades example, we still have an HTTP verb and an endpoint, but that's pretty much where the similarities with REST end. The reason we use an HTTP POST method is that the client must send information to the server—in this case, it's the query to be executed.
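As a rough sketch of what such a request looks like from Ruby, the snippet below builds the POST without sending it. The endpoint URL is a placeholder, and sending the query as JSON under a "query" key is a common convention that is assumed here rather than something stated in the talk.

```ruby
require "json"
require "net/http"
require "uri"

# Placeholder endpoint; the real URL of the API is not given in the talk.
endpoint = URI("https://example.com/graphql")

query = "{ me { firstName lastName email } }"

# Build (but do not send) the request: the query travels in the POST body.
request = Net::HTTP::Post.new(endpoint)
request["Content-Type"] = "application/json"
request.body = JSON.generate({ "query" => query })

puts "#{request.method} #{request.path}"
puts request.body

# Sending it would look like this (commented out so the sketch runs offline):
# response = Net::HTTP.start(endpoint.host, endpoint.port, use_ssl: true) do |http|
#   http.request(request)
# end
```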

00:07:12.570 If we look at an example for this particular API, you'll see that it resembles the hello world example, but now we're querying a restaurant, and there's a new concept in GraphQL called an argument. We're saying we’d like to fetch the name of a restaurant called "Cafe Jia." An argument is a way for us to pass relevant information to the server so that it can modify the response based on what we're searching for. Executing this again results in a structured response mirroring the query.

00:07:52.680 Let's explore some features of the query language. First, GraphQL is typed, and the type system is a product of your application code. Looking back at the original hello world example, there's actually a schema behind the scenes, and it is this schema that GraphQL queries run against. The notation used to write it down is known as an IDL (interface definition language). You can think of it as a database schema, except that instead of describing the structure of your database, we're describing the structure of what can be queried.

00:08:54.520 If we break down the query, we can see that each part has a corresponding entry in the schema. The first is "me," which returns a user type, defined below in the schema. The user type implements three fields: first name, last name, and email—all of which return a scalar value (a plain string). Returning to the NYC restaurant grades example, here's the corresponding schema for it, breaking down each part. It begins with the top-level restaurant, which accepts a name argument of type string.

00:09:53.150 One neat feature of having a type system in place with GraphQL is that you can immediately determine if a query is valid. If someone sends anything other than a string for this name argument, we can simply fail the request and say, "No, sorry, I don't support that." The restaurant type, defined below, returns both a name and a cuisine, both of which are strings. You'll notice there's an important distinction with the name field denoted by a bang, indicating that you will always receive a value for it—it's non-nullable.
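Here is a small Ruby sketch of that kind of up-front check. The IDL string restates the schema as described above; whether the name argument is itself non-null is not stated, so it is left nullable. The lookup table and the validate_arguments helper are hypothetical stand-ins for what a real GraphQL server derives from its schema.

```ruby
# The restaurant schema as described in the talk, written as IDL.
SCHEMA_IDL = <<~GRAPHQL
  type Query {
    restaurant(name: String): Restaurant
  }

  type Restaurant {
    name: String!
    cuisine: String
  }
GRAPHQL

# Hypothetical lookup table: which Ruby class each argument must be.
ARGUMENT_TYPES = { "restaurant" => { "name" => String } }.freeze

# Returns a list of error messages; an empty list means the arguments are valid.
def validate_arguments(field, arguments)
  expected = ARGUMENT_TYPES.fetch(field)
  errors = arguments.map do |name, value|
    type = expected[name]
    if type.nil?
      "Unknown argument '#{name}' on field '#{field}'"
    elsif !value.is_a?(type)
      "Argument '#{name}' expected #{type}, got #{value.class}"
    end
  end
  errors.compact
end

p validate_arguments("restaurant", { "name" => "Cafe Jia" })
p validate_arguments("restaurant", { "name" => 42 })
```

The first call returns no errors. The second is rejected before any data is fetched, which is the benefit of validating against the type system.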

00:10:32.640 What if we wanted to return a list of objects? In this case, we can utilize the plural "restaurants," which this particular API supports. You'll see that the argument here is different from the previous example: we pass a borough, and its value is written in uppercase without quotes. This query uses a GraphQL feature called an enum, where we specify a list of possible values. The corresponding schema illustrates this at the top level: the restaurants field takes a borough argument whose type is a restaurant borough enum. Using an enum here ensures that we can specify exactly what’s supported; it makes no sense for this application to accept any input other than these five boroughs.

00:11:39.710 Finally, we can query a list of restaurants, each of which has fields for name and cuisine. If we query two restaurants by their names, the expected response would look similar to this. However, you may notice a key conflict: both results come back under the same "restaurant" key. That can be technically legal JSON but could cause issues in programming languages like Ruby, where a hash keeps only one value per key. GraphQL allows us to alias field names, renaming a particular subset of the query to avoid conflicts.
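The enum check can be sketched in Ruby as a simple membership test. The uppercase style follows the talk, but the exact value names below are an assumption, and validate_borough is a hypothetical helper rather than part of any library.

```ruby
# Assumed enum values for the five boroughs, written in uppercase as in the talk.
BOROUGHS = %w[MANHATTAN BROOKLYN QUEENS BRONX STATEN_ISLAND].freeze

# Returns the value if it is a known borough, otherwise raises ArgumentError.
def validate_borough(value)
  return value if BOROUGHS.include?(value)

  raise ArgumentError, "Expected one of #{BOROUGHS.join(', ')}, got #{value.inspect}"
end

puts validate_borough("MANHATTAN")

begin
  validate_borough("NEW_JERSEY")
rescue ArgumentError => e
  puts e.message
end
```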
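A short Ruby sketch shows why the conflict matters and how aliases avoid it. The restaurant names, cuisines, and alias names are made up for illustration. The responses are written as key/value pairs and folded into a Hash, which is roughly what happens when a client turns the JSON into a Ruby object.

```ruby
# Two results reported under the same key, as they would be without aliases.
unaliased_pairs = [
  ["restaurant", { "name" => "First Place", "cuisine" => "Italian" }],
  ["restaurant", { "name" => "Second Place", "cuisine" => "Thai" }]
]
# A Ruby Hash keeps one value per key, so the first result is silently lost.
unaliased = unaliased_pairs.to_h

# The same query with aliases (`first` and `second` are hypothetical names).
aliased_query = <<~GRAPHQL
  {
    first: restaurant(name: "First Place") { name cuisine }
    second: restaurant(name: "Second Place") { name cuisine }
  }
GRAPHQL

# With aliases, each result gets its own key and both survive.
aliased = [
  ["first", { "name" => "First Place", "cuisine" => "Italian" }],
  ["second", { "name" => "Second Place", "cuisine" => "Thai" }]
].to_h

puts aliased_query
puts "Without aliases: #{unaliased.keys.inspect}"
puts "With aliases:    #{aliased.keys.inspect}"
```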

00:12:14.550 Additionally, there’s some duplication in the field names, which isn’t ideal. A useful feature of GraphQL is the ability to create a fragment. Below, we define a fragment called "restaurant info" specifying the fields we want: name and cuisine. We can reuse that fragment with a dot-dot-dot notation in the query to prevent duplication.
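One way to picture a fragment is as a named list of fields that gets substituted wherever it is spread. The Ruby sketch below does that with plain string replacement, which is only an illustration: a real server works on the parsed query, and the fragment and alias names here are assumptions.

```ruby
# The operation spreads the same fragment twice instead of repeating the fields.
operation = <<~GRAPHQL
  {
    first: restaurant(name: "First Place") { ...restaurantInfo }
    second: restaurant(name: "Second Place") { ...restaurantInfo }
  }
GRAPHQL

# Fragment name mapped to the fields it selects.
fragments = { "restaurantInfo" => %w[name cuisine] }

# Toy expansion: replace every `...fragmentName` with that fragment's fields.
def inline_fragments(operation, fragments)
  fragments.reduce(operation) do |text, (name, fields)|
    text.gsub("...#{name}", fields.join(" "))
  end
end

expanded = inline_fragments(operation, fragments)
puts expanded
```

After expansion, both selections read `{ name cuisine }`, which is the duplicated form the fragment let us avoid writing by hand.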

00:13:00.780 Furthermore, GraphQL is introspective, meaning that it can reveal all the different types of data it can return. This capability allows documentation and client generation to happen automatically. A tool that harnesses the power of this introspection is GraphiQL.
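As a rough illustration of what introspection makes possible, the Ruby sketch below answers two questions from a schema description: which types exist, and which fields on a type start with a given prefix. The query string is the standard introspection query for type names; the in-memory schema and both helper methods are hypothetical stand-ins, not how any particular tool is implemented.

```ruby
# Standard introspection query: ask the server for the name of every type.
INTROSPECTION_QUERY = "{ __schema { types { name } } }"

# Hypothetical in-memory description of the restaurant schema.
SCHEMA = {
  "Query" => { "restaurant" => "Restaurant", "restaurants" => "[Restaurant]" },
  "Restaurant" => { "name" => "String!", "cuisine" => "String" }
}.freeze

# Builds a response shaped like the answer to INTROSPECTION_QUERY.
def introspect_type_names(schema)
  types = schema.keys.map { |type_name| { "name" => type_name } }
  { "data" => { "__schema" => { "types" => types } } }
end

# What an editor could offer as completions while you type a field name.
def field_suggestions(schema, type_name, prefix)
  schema.fetch(type_name).keys.select { |field| field.start_with?(prefix) }
end

p introspect_type_names(SCHEMA)
p field_suggestions(SCHEMA, "Restaurant", "c")
```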

00:13:46.370 Now, during this part, I'm going to attempt a live demo. Let's see how this goes. This is GraphiQL, which consists of two main parts. On the left, we can craft a query, and on the right, we can preview what the response would look like. I've configured this to connect to a local Rails server, and let’s give it a shot.

00:14:54.000 Let's say we want to fetch a particular restaurant. As I start typing, you’ll see GraphiQL offers suggestions. This is due to its use of introspection, which allows it to be context-aware. So, I'll autocomplete my search for a restaurant name. Let’s say I want to fetch just the address as a starting point. When I execute the query, the response returns just the address.

00:15:36.700 However, I might want to add more fields. If I don’t know what’s available to me, GraphiQL has a built-in documentation explorer on the right-hand side, also leveraging GraphQL's introspection features. By diving into the root query, I can see all the data points available to me. Exploring the restaurant object shows the IDL structure we just discussed.

00:16:51.140 Let's say I want to fetch more information about the restaurants, such as their cuisines and addresses in Manhattan. Since we’re requesting restaurants in plural, I can expect to receive a list in return. Notice also the potential to dive into various inspection types which can provide detailed insight on each inspection. Let's execute a query to fetch all the inspections and their grades alongside other relevant data.

00:18:48.140 We can dive deeper and see violations during inspections, providing a comprehensive view. After reviewing my query’s structure, I see it returns structured data based on requests, allowing for versatile client-side application development.

00:19:26.420 Shifting back, one of the key aspects I wanted to highlight is the ability to fetch multiple resources in one roundtrip. In contrast to earlier examples where a desktop client had to make several requests to build its data picture, each device can now utilize a single GraphQL endpoint.

00:20:07.220 Now, let’s discuss what it takes to implement a GraphQL server. The NYC restaurant grades website is essentially a Rails app. A request made to /graphql follows a familiar path and ends up in a controller. The controller takes the query from the parameters (as a plain string) and passes it to our schema for execution.

00:20:36.300 Once we get back the result from the schema, we render it as JSON and respond to the client. To accomplish this, we need a GraphQL implementation. A commonly used gem in Ruby for this purpose is 'graphql-ruby,' created by Robert Mosolgo, which allows us to define our GraphQL schemas in Ruby.

00:21:35.300 Focusing again on the restaurant query, we can visualize its schema definition, starting with the root query. This comes with a restaurant field that takes an argument called "name," of type string. In Ruby, setting this up involves defining a GraphQL object type, which is fairly straightforward.
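The request flow can be sketched in plain Ruby without Rails. FakeSchema below is a stand-in that only mimics the shape of a schema object with an execute method taking the query string, and handle_graphql_request is a hypothetical name for what the controller action does: read the query from the parameters, execute it, and render the result as JSON.

```ruby
require "json"

# Stand-in for a real schema; it just echoes the query it was asked to run.
class FakeSchema
  def self.execute(query_string)
    { "data" => { "echoedQuery" => query_string } }
  end
end

# Plain-Ruby stand-in for the controller action described above.
def handle_graphql_request(params, schema)
  query_string = params.fetch("query") # the query arrives as a plain string
  result = schema.execute(query_string)
  JSON.generate(result)                # rendered back to the client as JSON
end

body = handle_graphql_request({ "query" => "{ me { email } }" }, FakeSchema)
puts body
```

In a Rails app the same three steps would live in a controller action, with the schema defined through the GraphQL gem instead of a fake.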

00:22:09.490 While I've skipped over documentation for brevity, it’s essential. Using the description DSL allows us to annotate our schema, and those annotations are what the documentation explorer displays. Now, moving on to the specific restaurant fields, we start by defining their names, types, and resolvers to return the expected data from the schema.

00:23:09.300 Every field definition has a resolver, which describes how to retrieve the information for that field when a query asks for it. Crucially, GraphQL functions as a facade, providing a clear interface to your database, cache, and services. In our era of microservices, GraphQL helps encapsulate an application's complexities behind a single interface.

00:24:01.320 One common misconception is that GraphQL requires a complete shift to a graph database, which is not the case. Instead, we build a query language on top of existing application code. At GitHub, this matters because our APIs live for a long time: they need to stay consistent as they age while still evolving incrementally to meet user needs.
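The facade idea can be sketched in Ruby as a table of resolvers, one per field, where each resolver is free to read from a different backend. Everything here is hypothetical: the data, the table layout, and the resolve helper are illustrations and not the API of the GraphQL gem.

```ruby
# Hypothetical data sources standing in for a database and a separate service.
RESTAURANT_TABLE = {
  "Example Cafe" => { name: "Example Cafe", cuisine: "Italian" }
}.freeze
INSPECTION_SERVICE = lambda do |restaurant_name|
  restaurant_name == "Example Cafe" ? [{ grade: "A" }] : []
end

# Each field maps to a resolver: a callable given the parent object and arguments.
RESOLVERS = {
  "Query" => {
    "restaurant" => ->(_parent, args) { RESTAURANT_TABLE[args["name"]] }
  },
  "Restaurant" => {
    "name" => ->(restaurant, _args) { restaurant[:name] },
    "cuisine" => ->(restaurant, _args) { restaurant[:cuisine] },
    "inspections" => ->(restaurant, _args) { INSPECTION_SERVICE.call(restaurant[:name]) }
  }
}.freeze

# Looks up and runs the resolver for one field on one type.
def resolve(type, field, parent, args = {})
  RESOLVERS.fetch(type).fetch(field).call(parent, args)
end

restaurant = resolve("Query", "restaurant", nil, { "name" => "Example Cafe" })
puts resolve("Restaurant", "cuisine", restaurant)
p resolve("Restaurant", "inspections", restaurant)
```

The client only sees fields on a restaurant. Whether a field is backed by a table lookup or a call to another service stays hidden behind the resolver.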

00:24:31.140 As GitHub celebrates nearly nine years, our v3 REST API has driven services for six of those years. While we introduce new resources with new features, we aim to limit disruptive changes, as they force users to modify their existing code.

00:24:57.540 A year ago, we started conversations about what the next API version could look like.

00:24:59.970 In March 2016, the original proposal to implement GraphQL at GitHub was submitted, leading to a proof-of-concept that allowed querying repositories and user information, some of the easiest-to-understand objects in GitHub's system.

00:25:23.370 After reviewing our proof-of-concept, we all saw the potential for tremendous development over the next year. A week later, we organized a new team, and eight months later we released GraphQL to the public at GitHub Universe, a conference held annually in San Francisco.

00:25:48.180 This event was our chance to engage with users, receiving vital feedback on the API we built. Currently, we manage over a hundred million GraphQL queries every day.

00:26:12.630 We use GraphQL not only externally but also internally. We leverage a gem called GraphQL client, which lets us exchange queries for easily readable Ruby objects. This allows us to co-locate queries directly within our Rails views. For those versed in React, this pattern may sound familiar.

00:26:42.640 Investing in tooling has proven fruitful. A gem named GraphQL Docs takes the results of an introspection query and creates clean and friendly HTML documentation, showcasing types and fields with descriptions people may find helpful.

00:27:08.160 We apply schema-driven development. Previously, we developed features starting with the UI's needs—working from models to views.

00:27:34.870 With the new model, every added feature utilizes GraphQL from the outset, forming a live connection with both product teams at GitHub and external users utilizing our API.

00:27:57.270 Returning to the earlier question about putting clients first, I believe GraphQL represents a reliable approach toward achieving that goal. Thank you. If you'd like to find more information about GraphQL, visit NYCrestaurantgrades.com and GraphQL.org. Thank you!