00:00:15.020
Thanks for being here, either in person or watching online. My name is Gui Vieira, but you can call me Gui. I'm a software developer at Shopify, working with the API patterns team in Vancouver, Canada.
00:00:20.460
I'll make some assumptions that may or may not be related to your experience with GraphQL. GraphQL could be something new to you, and you may want to learn more about it. I'll break down some GraphQL queries, especially in the first half of the presentation, so hopefully, everyone can enjoy it. Maybe you're considering using GraphQL in a project and want to visualize how it might look, but you haven't implemented it yet. Or you may already be maintaining a GraphQL project and want to enhance it.
00:01:01.379
I've been in both of these positions, and there's a common question that arises. This question can come from yourself, a teammate, a manager, or from the internet: How does GraphQL compare to REST? I've read and discussed how GraphQL compares to REST multiple times. Is it a fair comparison? Are we comparing apples to apples? Is GraphQL just a replacement for REST? So, let's explore what GraphQL can do beyond HTTP APIs, and we can revisit this question at the end.
00:02:00.540
Let's view that API now and explore it. I present to you the Space Trips API, which is a fictional API for the space tourism industry. With this API, websites can book space flight tickets, manage passengers (who are the customers), and companies operating space flights can manage their flights and change schedules.
00:02:06.000
Let’s explore some queries we can perform here. We can start with a discovery query that will return the scheduled flights, showing the upcoming flights. We initiate this with the flight field and pass the schedule status, which will simply remove flights that were canceled or that have already occurred. This field returns a connection, which in GraphQL is a specification that's very common in GraphQL APIs. It helps manage arrays of objects and features like pagination and cursors. Within that connection, we have the nodes object that represents what we're returning—an array of flights.
00:02:44.879
For each flight, we want information such as the company operating the flight, ticket availability, and departure as well as arrival times. So, we are ready to make our first GraphQL request to our API. The response returns two flights: one from Space Y and another from Blue Origin. These are our space flight companies.
00:03:06.119
Let’s gather more information about the Space Y flight. The significant detail here is that the response structure mirrors the query structure. This means that if the query starts with flights, the response starts with flights too, followed by nodes and the respective fields for each flight. The predictable structure of a GraphQL response is crucial because it allows developers to write confident code without having to check for various scenarios.
00:03:23.040
Now, we will perform another GraphQL query to return the passengers for the Space Y flight. We start with the flight ID and call the flight field, passing the ID for this particular flight. We want to retrieve the passengers, which is also a connection field. For each passenger, we want to return their name, email, and text. This structure not only aids clarity but also enhances reusability in the queries.
00:04:00.540
This is nice, but can we make this query more reusable? Yes, we can use variables and extract hard-coded values into variables. Instead of modifying the entire query, we simply change the variables. We wrap the query in an operation, which can be named anything as it is user-provided, allowing clients and API consumers to organize their queries effectively. Next, we declare the variable name and its type; for example, flight ID is the variable name, and its type is ID. After replacing the hard-coded value with the variable name, we pass the variable in a Json object.
00:04:41.520
Now we are ready to run this query. The response includes two passengers for display. But what is happening behind the scenes? The GraphQL client takes your query and converts it into an intra-Json payload, making a POST request to your GraphQL endpoint. The Json payload has a query key with a string of your query and a variables key with an object containing the variable names and values.
00:05:01.200
On the server side, we have a controller that calls the schema to execute the query. For execution, we need to provide the query name, the variables, and the context. The context is particularly significant as it’s determined by the application. We can include any information there, typically something useful for executing the query, such as passing the current user for authorization purposes.
00:05:21.000
Both the query and variables come from the user, while the context is provided by the application. This gives us total control over the context. After execution, the result is returned as Json to create a Json response. Consider this: all values are simply literals—query is a string variable, variables is a hash, and the response is also a hash that we can encode to Json for the response.
00:05:56.220
There’s nothing here related to HTTP. There are no HTTP methods, paths, query strings, headers—nothing at all necessitating HTTP. Therefore, GraphQL doesn't depend on HTTP; we can execute it in different contexts. We can say that GraphQL is a transparent layer, agnostic to the transport method.
00:06:14.140
You can execute GraphQL in a controller, in a background job, or even execute GraphQL queries stored in the database. This opens various possibilities for using GraphQL. Since GraphQL has a schema and context, we have total control over authorization for anything we need. The responses are very predictable, and GraphQL requires very simple data structures; thus, there's a lot of potential here for various uses beyond executing from a controller.
00:06:39.000
Next, we have webhooks. Let's consider a fictional company called 'Orbital,' which sells space flight tickets. Orbital uses the Space Trips API to book these tickets and manages their customers. They wish to improve how they manage customers, which are the passengers. Whenever a flight ends, they want to add the astronaut tag to each passenger booked for that flight.
00:07:03.540
There’s an event that triggers this workflow, and for these events, they need to subscribe to a webhook, which is essentially a GraphQL mutation that subscribes to the 'flight finished' topic. So, every time a flight finishes, it triggers a webhook, and they must set up a web server to receive these webhooks. This involves exposing an endpoint and maintaining this server to ensure they don’t miss any webhooks.
00:07:30.260
When the flight ends, they receive their first webhook, which contains a lot of information about the flight but no information about the passengers. Without the passenger data, they cannot add tags. They need to make an API request to get the data they require. So, in this case, they would get the flight ID and create a new query using that flight ID to query for the passenger IDs.
00:08:04.580
Now, the response contains the data they need. Finally, they can add the astronaut tags to those passengers. It’s a lot of work, and they receive a webhook that doesn’t give them the information they need. This whole workflow consists of receiving a webhook, making an API request to get the passenger IDs, and then adding the tags that they need.
00:08:33.480
Can we improve this workflow? Is it possible to let API clients choose the webhook payload? Yes, we can achieve this with GraphQL because we define queries on every hook subscription, creating a graph-based webhook, and the client provides a GraphQL query with the information they need.
00:09:07.040
In this case, the client provides the query, and the server supplies the variables like the flight ID variable. They subscribe to the webhook, and on the server side, we can have a background job that previously just serialized a simple webhook. Now, our background job executes GraphQL, retrieves the information from the webhook subscription, and runs a GraphQL query in the background job.
00:09:29.000
At the end of this process, it sends the results of that query back to Orbital. With the new webhook, Orbital receives only the information they need, nothing more, nothing less. This improvement eliminates the need for the additional request they previously required.
00:09:52.380
In this entire process, they don't require any extra requests anymore; they now have all the information they need in the webhook payload, making it easier to add the astronaut tags to the passengers. This approach is beneficial because instead of a one-size-fits-all model, it allows clients to define precisely what information they want, improving value, efficiency, and ease of maintenance.
00:10:26.080
Now, let’s shift to something different. Orbital’s initial workflow is functional, but they still have to maintain a web server just to receive the webhooks. This demands resources for security updates, monitoring, network issues, and handling missing webhooks in the event of an outage. Can we move the logic of receiving the webhook and making the API request to the server?
00:10:46.080
Yes, we can leverage WebAssembly! WebAssembly allows us to write code in various languages, compile it, and run it in a secure environment. Using WebAssembly, we eliminate the networking overhead between Orbital’s systems and the server, meaning Orbital will no longer need to maintain a separate web server.
00:11:13.020
This allows for straightforward logic to be executed on Orbital's side. Moreover, several languages compile to WebAssembly, and as WebAssembly support is coming to Ruby soon, Orbital can utilize Ruby for these scripts. They can define a method called 'perform' that receives the webhook, allowing for predictable arguments since that’s the GraphQL response.
00:11:39.120
This method extracts the passenger IDs and runs the 'add tag' mutation passing in the passenger IDs. Once Orbital compiles and uploads this WebAssembly code to Space Trips, they’ll receive an ID for this script. For the new webhook, instead of delivering to an HTTP endpoint, Orbital will pass the ID of the script to be executed.
00:12:06.760
So, under server-side execution, this allows the background job for webhooks to execute the necessary query, loading in the WebAssembly script while providing context crucial for creating a secure environment to run this script. It’ll add the current user context needed for authorization.
00:12:43.580
The execution process will be similar to how execution with controllers functions; in this case, the background job executes the provided query, loads the WebAssembly script, and ensures appropriate context. Now, the entire flow occurs within the Space Trips servers without the back and forth network requests. An added benefit is that Orbital no longer needs to maintain a server just for receiving webhooks.
00:13:06.540
Another interesting use case is bulk operations. Orbital now has a new requirement where they need to export all their passenger data to synchronize with external systems. With a growing customer base of 5,000 that continues to expand, they require an efficient system.
00:13:39.240
This is a common use case as many API clients want to synchronize data. Often, they iterate through pages of records for processes like marketing, accounting, data analytics, and more. However, servers need to ensure stability and efficiency in response times, often creating pagination limits and rate-limiting strategies.
00:14:04.440
In typical use cases, queries use cursors that return a boolean indicating whether there's a next page. This approach can be inefficient, as Orbital’s 5,000 customers with pagination limits of 50 may need to perform 100 requests to retrieve all passenger data, and this number may further grow.
00:14:30.600
The situation worsens when exporting detailed data, such as flights for each passenger, leading to complex nested connections and paginations that rapidly become unsustainable. So, is there a way to export large amounts of data while keeping the server stable, maintaining good responses without overwhelming the server, and doing this efficiently?
00:14:52.920
For this, we can utilize bulk operations. Bulk operations are queries processed as background jobs, allowing us to split a request into smaller jobs running in parallel, whenever possible. The server handles the pagination, and Orbital can create discovery mutations as bulk operations, passing their intended query to run.
00:15:15.660
Now, this query runs as a background job on the server. The server parses the query, detects connections, and divides the job into smaller segments that will eventually be assembled and returned to the user. Since these jobs run as background processes, they can proceed in parallel, providing greater control over pacing without affecting overall API performance.
00:15:41.280
Upon completion of the bulk operation, the client—in this case, Orbital—can download the query result as a Json file. This improvement eliminates complexity, streamlining the process of exporting large data sets from the client to the server, making it more efficient for both parties.
00:16:04.680
Shopify employs bulk operations as well. If you're interested, you can explore the admin API documentation to experiment with these bulk operations. Overall, we've successfully covered various use cases of GraphQL beyond HTTP APIs, including webhooks, WebAssembly, and bulk operations.
00:16:28.479
But potential use cases extend further into realms like WebSockets and event messaging. Circling back to our initial query: GraphQL vs REST—GraphQL isn't merely a replacement for REST. The possibilities and capabilities of GraphQL reach far beyond what REST offers. When comparing the two, it’s essential to recognize that REST represents only the tip of the GraphQL iceberg.
00:17:03.960
There's much more to explore within GraphQL, and I encourage you to delve deeper into it. There’s a wealth of features and capabilities to discover, opening up avenues for innovative solutions. Thank you all for joining this presentation!