00:00:24.680
Hello everyone, I'm Robert Mosolgo.
00:00:26.880
I work from home as a Ruby developer for a web startup called GitHub.
00:00:30.570
I just started there, and I live in Charlottesville, Virginia.
00:00:36.480
I know I have one other Charlottesville person here.
00:00:41.460
Anyone else from Charlottesville? Sweet! So, there are three of us.
00:00:43.890
Peter and I are old neighbors, but we didn't know we were going to be here until now.
00:00:48.930
I wanted to talk a little bit about GraphQL, which we use a lot at GitHub.
00:00:52.289
I've been working on it a lot for the last two years. So, imagine, for a moment, that you are a Ruby on Rails developer.
00:01:02.250
You have an awesome website that people love using, and somebody says that you should build an API. Why? Why would you do that?
00:01:06.240
What do people do with an API? Why should you go through the trouble? You already have a great website that people like to use.
00:01:18.659
I think the reason you build an API is for other developers to build on your platform.
00:01:23.220
Maybe people want to keep the data from your system in sync with data from another system.
00:01:29.009
Or maybe they want to create their own views using the data from your system, combining things in a different way.
00:01:35.490
Historically, the traditional way to build an API has been a resource-based, or REST, API.
00:01:46.799
For every object in your system, there’s a URL that identifies it. The API allows people to go to that URL and get information representing that object.
00:01:54.899
What exactly represents the object is up to the server, and whenever you receive it, that’s when you find out.
00:02:03.659
By the way, my favorite hobby is playing Magic: The Gathering, so all of my examples are related to that.
00:02:19.019
So, in a perfect world, you have your API, other developers are interacting with your system, making great new things.
00:02:25.170
Everything is awesome until you check your performance metrics in a tool like New Relic and wonder what is going on at midnight every day.
00:02:34.769
Why does our system almost crash, or why is our database being so overworked? It’s because people are using our API.
00:02:45.810
They’re doing unusual things, sending unexpected inputs, and as the server developer, it’s a bad situation.
00:02:51.409
As a server developer, you can console yourself with the knowledge that the job of your API clients is also really difficult.
00:03:01.849
So, if you want to build a view, for example, a deck of Magic cards, you need to load a lot of objects to start.
00:03:17.310
First, you need the deck itself to get the name and some quantities, and then for each card, you need detailed data about it.
00:03:39.599
You end up having to fetch all of these resources, join them together, and keep track of what you have and don’t have.
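To make the client's job concrete, here is a minimal sketch in plain Ruby. It assumes a hypothetical resource-based API with `/decks/:id` and `/cards/:id` paths, faked with an in-memory hash; the paths, payloads, and class names are illustrative and not taken from any real service.

```ruby
# A fake resource-based API held in memory; paths and payloads are hypothetical.
FAKE_REST_API = {
  "/decks/2" => {
    "name" => "Example Deck",
    "slots" => [
      { "card_id" => 10, "quantity" => 4 },
      { "card_id" => 11, "quantity" => 2 }
    ]
  },
  "/cards/10" => { "name" => "Forest" },
  "/cards/11" => { "name" => "Llanowar Elves" }
}.freeze

# Stands in for an HTTP client and counts how many requests were made.
class CountingClient
  attr_reader :request_count

  def initialize(api)
    @api = api
    @request_count = 0
  end

  def get(path)
    @request_count += 1
    @api.fetch(path)
  end
end

# Builds the data for a deck view: one request for the deck, then one per card.
def load_deck_view(client, deck_id)
  deck = client.get("/decks/#{deck_id}")
  rows = deck["slots"].map do |slot|
    card = client.get("/cards/#{slot["card_id"]}")
    { name: card["name"], quantity: slot["quantity"] }
  end
  { deck_name: deck["name"], rows: rows }
end
```

Even in this small sketch the client makes one request per card on top of the request for the deck, and it has to join the results itself.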
00:03:48.900
I believe there must be a better way. In fact, Facebook was thinking this way about five years ago and invented GraphQL.
00:04:05.280
They’ve been using it for several years and open-sourced the idea a couple of years ago. Now, it's implemented in numerous applications.
00:04:27.750
Let’s take a look at what GraphQL is. This is a screenshot of GraphiQL, the GraphQL IDE. Here, you can write a GraphQL query.
00:04:35.610
You send it to the server and can see the response. I’ll describe GraphQL by giving an example. The first thing we will do is send a query.
00:04:53.160
A query means we are fetching data from the server; it’s like a read operation. The deck we were previously looking at has ID number two.
00:05:09.030
You can see the structure looks a lot like JSON. You have curly braces, and inside them are attributes that you can request by name.
00:05:12.449
In this query, we are looking for a deck by its ID number and asking for some properties: the name and the cards contained in it.
00:05:41.400
When you check the response, you will see that the structure matches the structure of the query. The first thing you receive is 'data', and if there are errors, there's a top-level key called 'errors'.
00:06:00.030
We asked for a deck's name, and for each card of that deck, we want the card name. Sure enough, that’s what we got.
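The query and response described here might look roughly like the following, held in Ruby strings so the shape can be inspected. The field names `deck`, `name`, and `cards` follow the talk; the deck and card names in the response are made up for illustration.

```ruby
require "json"

# The query: look up deck 2, ask for its name and the name of each card.
DECK_QUERY = <<~GRAPHQL
  query {
    deck(id: 2) {
      name
      cards {
        name
      }
    }
  }
GRAPHQL

# A hypothetical response body. Its nesting mirrors the query,
# wrapped in the top-level "data" key.
RESPONSE_JSON = <<~JSON
  {
    "data": {
      "deck": {
        "name": "Example Deck",
        "cards": [
          { "name": "Forest" },
          { "name": "Llanowar Elves" }
        ]
      }
    }
  }
JSON

# Walks the response along the same path the query describes.
def card_names_from(response_json)
  response = JSON.parse(response_json)
  response.fetch("data").fetch("deck").fetch("cards").map { |card| card.fetch("name") }
end
```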
00:06:30.620
Here’s a simple GraphQL query that fetches a little bit of data. Query is how you perform a read operation, while mutation is used to create, update, or destroy data.
00:06:40.800
Let’s write a bigger query. This query renders the view I showed previously. First, we need information about the deck since we need both the card and the quantity.
00:07:20.080
We might call these 'slots': your deck has slots for its cards. You may notice autocomplete is available; that's because the GraphQL system is strongly typed and self-documenting.
00:07:52.680
Every type in the system has built-in documentation known ahead of time for each of its fields, allowing a nice IDE experience.
00:08:04.080
Again, the response structure matches the same structure we requested. This was a fast introduction to GraphQL!
00:08:32.020
You can fetch these objects and make nested requests; for each object, you can reach through relationships to access other data.
00:08:54.590
Now, I want to show another feature of GraphQL called query variables. Currently, we have a hard-coded value, which we can replace with a variable.
00:09:01.300
Whenever we send the query, we can include the value for that variable like this. The query is somewhat like a function: it has a name, takes arguments, and returns a bunch of data.
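As a sketch of what that could look like, the query below replaces the hard-coded ID with a variable, and the value travels alongside the query in a JSON body with `query` and `variables` keys, which is the common convention for GraphQL over HTTP. The operation name `DeckById`, the variable name `$deckId`, and its `ID!` type are assumptions for illustration.

```ruby
require "json"

# A named query with one argument, much like a function signature.
DECK_BY_ID_QUERY = <<~GRAPHQL
  query DeckById($deckId: ID!) {
    deck(id: $deckId) {
      name
    }
  }
GRAPHQL

# Packs the query text and the variable values into one request body.
def build_request_body(query, variables)
  JSON.generate("query" => query, "variables" => variables)
end

REQUEST_BODY = build_request_body(DECK_BY_ID_QUERY, { "deckId" => 2 })
```

The query text stays the same from request to request; only the `variables` value changes.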
00:09:12.680
GraphQL allows for code reuse with fragments. A fragment is a set of fields applied to a certain type, allowing for consistent output.
00:09:22.320
If views require the same fields over and over, a fragment helps keep everything in sync.
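A fragment could be written along these lines. The fragment name `CardFields`, the aliases `first` and `second`, and the `expansion` selection are assumptions chosen for illustration; the query is held in a Ruby string.

```ruby
# One fragment, reused for the cards of two different decks.
CARD_FIELDS_QUERY = <<~GRAPHQL
  query {
    first: deck(id: 1) {
      cards {
        ...CardFields
      }
    }
    second: deck(id: 2) {
      cards {
        ...CardFields
      }
    }
  }

  fragment CardFields on Card {
    name
    expansion {
      name
    }
  }
GRAPHQL
```

If the card view later needs another field, it is added once inside the fragment and every place that spreads `...CardFields` picks it up.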
00:09:38.560
If you've read anything about GraphQL, you've probably heard about the comparison of traditional resource-based APIs and GraphQL.
00:09:58.150
In a resource-based API, every object has its own URL. You go to get it or maybe update it, and the server responds with some output.
00:10:07.460
With GraphQL, there’s only one endpoint. Every time you interact, whether you're creating or fetching, you go to the same URL.
00:10:21.320
The structure of the response is defined by the client. The objects are strongly typed, so the response stays consistent when you resend the same query.
00:10:30.160
Another difference is in how related resources are accessed. In a REST API, each resource has its own endpoint, while GraphQL allows related resources to be accessed through object properties or fields.
00:10:48.000
As a client developer, it’s easier because you only make one HTTP request and can gather all the necessary data at once.
00:11:14.410
However, we cannot overlook some of the downsides of GraphQL, especially if you're considering your own implementation. A malicious query can overload the system.
00:11:36.920
Here’s an example: a query like the one shown earlier that tries to fetch a massive amount of data in one go, which could crash or bog down your server.
00:12:16.240
The system's flexibility means that you as a server developer have to manage potential abuse, which is what I’ll discuss in the next few slides.
00:12:34.580
This slide may appear as a joke, but it highlights a real concern. Sometimes people hear about the flexibility of GraphQL and suggest just giving clients credentials to your database.
00:12:52.730
While it's an interesting perspective, there are several important differences with GraphQL. The logic for fetching data is implemented in Ruby.
00:13:07.610
Whatever models or business logic you already have in your app can be reused inside your GraphQL system.
00:13:27.160
Additionally, GraphQL is backend agnostic. This means that regardless of whether your data is in SQL databases or Redis or external APIs, everything is accessible through your GraphQL interface.
00:13:39.279
In GraphQL, even if you change your storage, you can maintain the same structure, which is a significant benefit.
00:13:55.429
I should mention that there is a Ruby implementation of GraphQL. A disclaimer here is that I'm biased because I worked on the implementation.
00:14:09.160
I’m aware of its flaws, so I'll focus on the positive aspects for now. GraphQL has an object type that lets you define structure in Ruby.
00:14:20.750
Each object in your application corresponds to a GraphQL type, which can be linked to an ActiveRecord model.
00:14:34.020
For example, a GraphQL type named 'Card' can mirror an ActiveRecord model of the same name, offering its own description.
00:14:47.060
For each field in the GraphQL type, you can define the kind of data it represents, such as ID, name, or other attributes.
00:14:53.000
These fields are linked to the methods on the ActiveRecord model, making it convenient to access data.
00:15:11.260
Similarly, objects can reference each other. In Magic: The Gathering, every card is part of a certain expansion with its own set of cards.
00:15:26.450
When you query a card, you can call its 'expansion' method to get details about the expansion it belongs to.
00:15:47.220
Occasionally, you'll require some custom logic, such as when fetching an image path for a card, which can involve navigating through relationships.
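The idea of types whose fields map onto model methods can be sketched in plain Ruby. This is a toy stand-in, not the API of the Ruby GraphQL library: `ToyObjectType`, the `Struct` models, and the image path format are all assumptions made up for illustration.

```ruby
# Toy stand-ins for ActiveRecord models (hypothetical attributes).
Expansion = Struct.new(:name, :code)
Card = Struct.new(:id, :name, :expansion)

# A minimal "type": a name plus a table of field resolvers.
class ToyObjectType
  attr_reader :name, :fields

  def initialize(name)
    @name = name
    @fields = {}
  end

  # Without a block, the field calls the same-named method on the object.
  # With a block, the block holds the custom logic for that field.
  def field(field_name, &resolver)
    @fields[field_name] = resolver || ->(object) { object.public_send(field_name) }
    self
  end

  def resolve(object, field_name)
    @fields.fetch(field_name).call(object)
  end
end

CARD_TYPE = ToyObjectType.new("Card")
  .field(:id)
  .field(:name)
  .field(:expansion)
  .field(:image_path) { |card| "/images/#{card.expansion.code}/#{card.id}.png" }
```

Most fields simply delegate to the model, while `image_path` shows the custom case: it reaches through the card's expansion to build its value.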
00:16:04.760
Now I’d like to discuss how we utilize GraphQL at GitHub. I mentioned earlier that having an API allows people to integrate with your system.
00:16:27.260
This integration is a significant part of how people interact with GitHub, whether through CI servers, project management tools, or various integrations.
00:16:35.490
Our goal is to simplify these interactions, which is why GraphQL is a preferred choice for many projects.
00:16:48.890
Interestingly, many of our Ruby on Rails views run GraphQL queries in the controller to fetch the data for those views.
00:17:08.660
This approach centralizes data access, allowing us to streamline database calls throughout our views.
00:17:21.780
By using GraphQL, we can monitor performance metrics and optimize slow queries more efficiently.
00:17:29.800
It's been a fascinating project to join, but giving users flexibility does come with some security trade-offs, which I want to address.
00:17:41.360
The first part of serving an API request is authentication. This involves confirming that someone accessing your system is who they say they are.
00:17:55.420
For GraphQL in Ruby on Rails, the authentication process is similar to that of other Rails controllers.
00:18:09.050
You begin by checking the user—maybe looking them up using a token. If they are authenticated, you proceed with the controller action.
00:18:27.510
After that, executing a GraphQL query involves taking the query string and running it, resulting in some JSON output.
00:18:39.720
There’s also a context that you can create once the user is logged in, which is then passed through the query.
00:18:48.850
This context object enables tracking what’s going on in queries, verifying permissions and access rights.
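The flow just described (authenticate, build a context, execute the query, return JSON) can be sketched without Rails. Everything here is an assumption for illustration: the token table, the `handle_graphql_request` name, and the `schema` argument, which is any callable standing in for real query execution.

```ruby
require "json"

# Hypothetical token table; a real app would look the user up in its database.
USERS_BY_TOKEN = { "token-abc" => "alice" }.freeze

def handle_graphql_request(token:, query_string:, schema:)
  # Step 1: authentication, as in any other controller action.
  current_user = USERS_BY_TOKEN[token]
  if current_user.nil?
    return { status: 401, body: JSON.generate("errors" => [{ "message" => "Not authenticated" }]) }
  end

  # Step 2: build the context that travels with the query.
  context = { current_user: current_user }

  # Step 3: run the query string and serialize the result as JSON.
  result = schema.call(query_string, context)
  { status: 200, body: JSON.generate(result) }
end
```

In the graphql-ruby gem, the analogous step is a call along the lines of `MySchema.execute(query_string, context: context)`, where `MySchema` is your own schema class.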
00:19:05.660
Authorization plays a key role here. Once the user is logged in, we need to check whether they can access the requested data.
00:19:19.490
This involves ensuring that the fetching process is scoped correctly. If a user is searching their own cards, we align the search to return only their cards.
00:19:37.100
Authorization can also involve a last-minute check after query execution to confirm access rights before sending data back.
00:19:53.020
You might raise an error or log the event if unauthorized data access happens, creating a system for tracking permissions.
00:20:04.960
An interesting approach involves wrapping an object, like a card, along with the user in a proxy object that includes both.
00:20:17.520
Every time you call a method on that object, it automatically checks authorization, acting as an authorized proxy.
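One way such a proxy could be built in plain Ruby is with `method_missing`. This is a sketch under assumptions: the `OwnedCard` struct, the ownership rule, and the class names are hypothetical, and it only guards methods that are not already defined on every Ruby object.

```ruby
# A toy card that knows its owner (hypothetical attributes).
OwnedCard = Struct.new(:name, :owner)

# Wraps an object together with a user and checks the policy on every call.
class AuthorizedProxy
  class NotAuthorized < StandardError; end

  def initialize(object, user, &policy)
    @object = object
    @user = user
    @policy = policy
  end

  def method_missing(method_name, *args, &block)
    return super unless @object.respond_to?(method_name)
    raise NotAuthorized, "#{@user} may not call #{method_name}" unless @policy.call(@user, @object)

    @object.public_send(method_name, *args, &block)
  end

  def respond_to_missing?(method_name, include_private = false)
    @object.respond_to?(method_name, include_private) || super
  end
end

# Example policy: users may only read their own cards.
def authorized_card(card, user)
  AuthorizedProxy.new(card, user) { |current_user, object| object.owner == current_user }
end
```

Calling `authorized_card(card, user).name` returns the name for the owner and raises `AuthorizedProxy::NotAuthorized` for anyone else.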
00:20:33.640
Another challenge faced at GitHub and any GraphQL service is preventing abuse of the system. With a REST API, you can increment a counter for each request.
00:20:47.990
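With GraphQL, however, a single request can ask for very little or a great deal, so counting requests isn't enough. Instead, we have a concept called nodes. Every object returned in the response is considered a node, allowing us to estimate the total response nodes upfront.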
00:21:09.800
For instance, when querying a specific deck, the maximum possible nodes returned can be calculated based on the query structure.
00:21:30.560
If you ask for a deck, its cards, and their properties, you can determine the total count of nodes to manage rate-limiting effectively.
00:21:49.290
For each query that comes in, we count the nodes it could return and apply rate-limiting measures based on that information.
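A rough sketch of counting nodes before running a query follows. It is not GitHub's actual formula: the nested-hash representation of a query, the `limit` values, and the budget check are assumptions chosen to show the idea that each level's maximum is multiplied by the number of parent objects above it.

```ruby
# Each entry maps a field name to:
#   limit:    the most objects that field can return per parent object
#   children: nested object fields selected on it
def max_nodes(selections, parents = 1)
  selections.sum do |_field, spec|
    count = parents * spec.fetch(:limit, 1)
    count + max_nodes(spec.fetch(:children, {}), count)
  end
end

# One deck, up to 60 cards in it, and one expansion per card.
QUERY_SHAPE = {
  deck: {
    limit: 1,
    children: {
      cards: {
        limit: 60,
        children: { expansion: { limit: 1 } }
      }
    }
  }
}.freeze

# Rejects a query whose worst case would exceed the remaining budget.
def within_budget?(selections, remaining_nodes)
  max_nodes(selections) <= remaining_nodes
end
```

For this shape the worst case is 1 deck + 60 cards + 60 expansions = 121 nodes, which is known before any data is fetched.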
00:22:03.880
I see I have a couple more points left that I didn't get to cover in the talk, so I will list them here.
00:22:16.470
One concern is how to ensure efficient database access while executing complicated queries. Shopify has a library for that.
00:22:34.210
Additionally, you can analyze incoming queries to ensure that users have the correct scopes to access specific data.
00:22:52.440
Another method of securing the system involves configuring a timeout for queries that take too long, which is a last-ditch effort.
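One possible shape for such a timeout is a deadline that is checked before each field is resolved, so a slow query stops doing new work and reports an error instead. The function name, the resolver hash, and the injectable `clock` are assumptions for illustration, not the behavior of any particular library.

```ruby
# Resolves fields in order, but stops starting new ones once the deadline passes.
# `clock` is injectable so the behavior can be exercised without real waiting.
def resolve_with_deadline(resolvers, max_seconds:,
                          clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
  deadline = clock.call + max_seconds
  data = {}
  errors = []

  resolvers.each do |field_name, resolver|
    if clock.call > deadline
      errors << "Timed out before resolving #{field_name}"
      next
    end
    data[field_name] = resolver.call
  end

  { data: data, errors: errors }
end
```

Fields resolved before the deadline are still returned, and the skipped ones show up under `errors`, so the client can tell that the response is partial.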
00:23:07.060
To keep everything in check, you can add instrumentation during field resolution to include additional logic or checks.
00:23:17.750
If you're interested in GraphQL and want to talk more about future strategies, feel free to approach me afterwards.
00:23:32.680
I work from home, so I welcome the opportunity to discuss these topics. Before I end, I’ve provided some URLs for further reading on GraphQL.
00:23:53.690
Lastly, I brought some stickers with the GraphQL logo on them. If you'd like one, just come up and ask. You don’t even need to talk to me!
00:24:05.790
Thank you so much for your time and attention!