RailsConf 2022

GraphQL and Rails beyond HTTP APIs

GraphQL and Rails beyond HTTP APIs

by Gui Vieira

In the presentation titled "GraphQL and Rails beyond HTTP APIs," Gui Vieira, a software developer at Shopify, explores the versatile applications of GraphQL, extending well beyond traditional HTTP APIs. The session, held at RailsConf 2022, delves into how GraphQL can serve as a secure and structured data layer in Rails projects, discussing several innovative use cases and implementation strategies.

Key Points Discussed:

- Introduction to GraphQL: Vieira begins by addressing the common question of how GraphQL compares to REST. He notes that while it's often seen as a replacement, GraphQL offers much more functionality that transcends the limitations of HTTP APIs.

- Case Study - Space Trips API: Using a fictional API for space tourism, Vieira demonstrates various queries to book flights and manage passengers, emphasizing how GraphQL queries return predictable structures that enhance developer confidence.

- Execution Contexts: He explains that GraphQL is agnostic to HTTP, allowing it to be executed in diverse environments like background jobs or even from database-stored queries, enabling a wide range of use cases.

- Webhooks: The presentation discusses how organizations can improve event-driven workflows using GraphQL webhooks. For example, an organization called "Orbital" automates passenger tagging processes post-flight, showcasing how GraphQL allows clients to specify the exact data they need in webhook responses, ultimately simplifying operations.

- WebAssembly Integration: Vieira introduces WebAssembly to replace the need for maintaining server endpoints for webhooks, facilitating function execution in a secure environment without excessive networking overhead.

- Bulk Operations for Data Export: The topic extends to handling large-scale data exports through GraphQL's bulk operations feature. This allows requests to be processed as background jobs, significantly reducing the strain on both the client and server.

- Comparison of GraphQL and REST: The conclusion emphasizes that GraphQL is not just an HTTP API alternative; it presents a wealth of capabilities, making it vital to explore its full potential.

Conclusions and Takeaways:

- GraphQL provides a flexible framework for various data handling needs in Rails applications.

- Beyond HTTP, GraphQL supports real-time data communication, minimizes unnecessary API calls, and allows for effective data query customization.

- Developers are encouraged to dive deeper into GraphQL to discover its extensive functionalities and adapt them to their project requirements.

00:00:15.020 Thanks for being here, either in person or watching online. My name is Gui Vieira, but you can call me Gui. I'm a software developer at Shopify, working with the API patterns team in Vancouver, Canada.
00:00:20.460 I'll make some assumptions that may or may not be related to your experience with GraphQL. GraphQL could be something new to you, and you may want to learn more about it. I'll break down some GraphQL queries, especially in the first half of the presentation, so hopefully, everyone can enjoy it. Maybe you're considering using GraphQL in a project and want to visualize how it might look, but you haven't implemented it yet. Or you may already be maintaining a GraphQL project and want to enhance it.
00:01:01.379 I've been in both of these positions, and there's a common question that arises. This question can come from yourself, a teammate, a manager, or from the internet: How does GraphQL compare to REST? I've read and discussed how GraphQL compares to REST multiple times. Is it a fair comparison? Are we comparing apples to apples? Is GraphQL just a replacement for REST? So, let's explore what GraphQL can do beyond HTTP APIs, and we can revisit this question at the end.
00:02:00.540 Let's view that API now and explore it. I present to you the Space Trips API, which is a fictional API for the space tourism industry. With this API, websites can book space flight tickets, manage passengers (who are the customers), and companies operating space flights can manage their flights and change schedules.
00:02:06.000 Let’s explore some queries we can perform here. We can start with a discovery query that will return the scheduled flights, showing the upcoming flights. We initiate this with the flight field and pass the schedule status, which will simply remove flights that were canceled or that have already occurred. This field returns a connection, which in GraphQL is a specification that's very common in GraphQL APIs. It helps manage arrays of objects and features like pagination and cursors. Within that connection, we have the nodes object that represents what we're returning—an array of flights.
00:02:44.879 For each flight, we want information such as the company operating the flight, ticket availability, and departure as well as arrival times. So, we are ready to make our first GraphQL request to our API. The response returns two flights: one from Space Y and another from Blue Origin. These are our space flight companies.
00:03:06.119 Let’s gather more information about the Space Y flight. The significant detail here is that the response structure mirrors the query structure. This means that if the query starts with flights, the response starts with flights too, followed by nodes and the respective fields for each flight. The predictable structure of a GraphQL response is crucial because it allows developers to write confident code without having to check for various scenarios.
00:03:23.040 Now, we will perform another GraphQL query to return the passengers for the Space Y flight. We start with the flight ID and call the flight field, passing the ID for this particular flight. We want to retrieve the passengers, which is also a connection field. For each passenger, we want to return their name, email, and text. This structure not only aids clarity but also enhances reusability in the queries.
00:04:00.540 This is nice, but can we make this query more reusable? Yes, we can use variables and extract hard-coded values into variables. Instead of modifying the entire query, we simply change the variables. We wrap the query in an operation, which can be named anything as it is user-provided, allowing clients and API consumers to organize their queries effectively. Next, we declare the variable name and its type; for example, flight ID is the variable name, and its type is ID. After replacing the hard-coded value with the variable name, we pass the variable in a Json object.
00:04:41.520 Now we are ready to run this query. The response includes two passengers for display. But what is happening behind the scenes? The GraphQL client takes your query and converts it into an intra-Json payload, making a POST request to your GraphQL endpoint. The Json payload has a query key with a string of your query and a variables key with an object containing the variable names and values.
00:05:01.200 On the server side, we have a controller that calls the schema to execute the query. For execution, we need to provide the query name, the variables, and the context. The context is particularly significant as it’s determined by the application. We can include any information there, typically something useful for executing the query, such as passing the current user for authorization purposes.
00:05:21.000 Both the query and variables come from the user, while the context is provided by the application. This gives us total control over the context. After execution, the result is returned as Json to create a Json response. Consider this: all values are simply literals—query is a string variable, variables is a hash, and the response is also a hash that we can encode to Json for the response.
00:05:56.220 There’s nothing here related to HTTP. There are no HTTP methods, paths, query strings, headers—nothing at all necessitating HTTP. Therefore, GraphQL doesn't depend on HTTP; we can execute it in different contexts. We can say that GraphQL is a transparent layer, agnostic to the transport method.
00:06:14.140 You can execute GraphQL in a controller, in a background job, or even execute GraphQL queries stored in the database. This opens various possibilities for using GraphQL. Since GraphQL has a schema and context, we have total control over authorization for anything we need. The responses are very predictable, and GraphQL requires very simple data structures; thus, there's a lot of potential here for various uses beyond executing from a controller.
00:06:39.000 Next, we have webhooks. Let's consider a fictional company called 'Orbital,' which sells space flight tickets. Orbital uses the Space Trips API to book these tickets and manages their customers. They wish to improve how they manage customers, which are the passengers. Whenever a flight ends, they want to add the astronaut tag to each passenger booked for that flight.
00:07:03.540 There’s an event that triggers this workflow, and for these events, they need to subscribe to a webhook, which is essentially a GraphQL mutation that subscribes to the 'flight finished' topic. So, every time a flight finishes, it triggers a webhook, and they must set up a web server to receive these webhooks. This involves exposing an endpoint and maintaining this server to ensure they don’t miss any webhooks.
00:07:30.260 When the flight ends, they receive their first webhook, which contains a lot of information about the flight but no information about the passengers. Without the passenger data, they cannot add tags. They need to make an API request to get the data they require. So, in this case, they would get the flight ID and create a new query using that flight ID to query for the passenger IDs.
00:08:04.580 Now, the response contains the data they need. Finally, they can add the astronaut tags to those passengers. It’s a lot of work, and they receive a webhook that doesn’t give them the information they need. This whole workflow consists of receiving a webhook, making an API request to get the passenger IDs, and then adding the tags that they need.
00:08:33.480 Can we improve this workflow? Is it possible to let API clients choose the webhook payload? Yes, we can achieve this with GraphQL because we define queries on every hook subscription, creating a graph-based webhook, and the client provides a GraphQL query with the information they need.
00:09:07.040 In this case, the client provides the query, and the server supplies the variables like the flight ID variable. They subscribe to the webhook, and on the server side, we can have a background job that previously just serialized a simple webhook. Now, our background job executes GraphQL, retrieves the information from the webhook subscription, and runs a GraphQL query in the background job.
00:09:29.000 At the end of this process, it sends the results of that query back to Orbital. With the new webhook, Orbital receives only the information they need, nothing more, nothing less. This improvement eliminates the need for the additional request they previously required.
00:09:52.380 In this entire process, they don't require any extra requests anymore; they now have all the information they need in the webhook payload, making it easier to add the astronaut tags to the passengers. This approach is beneficial because instead of a one-size-fits-all model, it allows clients to define precisely what information they want, improving value, efficiency, and ease of maintenance.
00:10:26.080 Now, let’s shift to something different. Orbital’s initial workflow is functional, but they still have to maintain a web server just to receive the webhooks. This demands resources for security updates, monitoring, network issues, and handling missing webhooks in the event of an outage. Can we move the logic of receiving the webhook and making the API request to the server?
00:10:46.080 Yes, we can leverage WebAssembly! WebAssembly allows us to write code in various languages, compile it, and run it in a secure environment. Using WebAssembly, we eliminate the networking overhead between Orbital’s systems and the server, meaning Orbital will no longer need to maintain a separate web server.
00:11:13.020 This allows for straightforward logic to be executed on Orbital's side. Moreover, several languages compile to WebAssembly, and as WebAssembly support is coming to Ruby soon, Orbital can utilize Ruby for these scripts. They can define a method called 'perform' that receives the webhook, allowing for predictable arguments since that’s the GraphQL response.
00:11:39.120 This method extracts the passenger IDs and runs the 'add tag' mutation passing in the passenger IDs. Once Orbital compiles and uploads this WebAssembly code to Space Trips, they’ll receive an ID for this script. For the new webhook, instead of delivering to an HTTP endpoint, Orbital will pass the ID of the script to be executed.
00:12:06.760 So, under server-side execution, this allows the background job for webhooks to execute the necessary query, loading in the WebAssembly script while providing context crucial for creating a secure environment to run this script. It’ll add the current user context needed for authorization.
00:12:43.580 The execution process will be similar to how execution with controllers functions; in this case, the background job executes the provided query, loads the WebAssembly script, and ensures appropriate context. Now, the entire flow occurs within the Space Trips servers without the back and forth network requests. An added benefit is that Orbital no longer needs to maintain a server just for receiving webhooks.
00:13:06.540 Another interesting use case is bulk operations. Orbital now has a new requirement where they need to export all their passenger data to synchronize with external systems. With a growing customer base of 5,000 that continues to expand, they require an efficient system.
00:13:39.240 This is a common use case as many API clients want to synchronize data. Often, they iterate through pages of records for processes like marketing, accounting, data analytics, and more. However, servers need to ensure stability and efficiency in response times, often creating pagination limits and rate-limiting strategies.
00:14:04.440 In typical use cases, queries use cursors that return a boolean indicating whether there's a next page. This approach can be inefficient, as Orbital’s 5,000 customers with pagination limits of 50 may need to perform 100 requests to retrieve all passenger data, and this number may further grow.
00:14:30.600 The situation worsens when exporting detailed data, such as flights for each passenger, leading to complex nested connections and paginations that rapidly become unsustainable. So, is there a way to export large amounts of data while keeping the server stable, maintaining good responses without overwhelming the server, and doing this efficiently?
00:14:52.920 For this, we can utilize bulk operations. Bulk operations are queries processed as background jobs, allowing us to split a request into smaller jobs running in parallel, whenever possible. The server handles the pagination, and Orbital can create discovery mutations as bulk operations, passing their intended query to run.
00:15:15.660 Now, this query runs as a background job on the server. The server parses the query, detects connections, and divides the job into smaller segments that will eventually be assembled and returned to the user. Since these jobs run as background processes, they can proceed in parallel, providing greater control over pacing without affecting overall API performance.
00:15:41.280 Upon completion of the bulk operation, the client—in this case, Orbital—can download the query result as a Json file. This improvement eliminates complexity, streamlining the process of exporting large data sets from the client to the server, making it more efficient for both parties.
00:16:04.680 Shopify employs bulk operations as well. If you're interested, you can explore the admin API documentation to experiment with these bulk operations. Overall, we've successfully covered various use cases of GraphQL beyond HTTP APIs, including webhooks, WebAssembly, and bulk operations.
00:16:28.479 But potential use cases extend further into realms like WebSockets and event messaging. Circling back to our initial query: GraphQL vs REST—GraphQL isn't merely a replacement for REST. The possibilities and capabilities of GraphQL reach far beyond what REST offers. When comparing the two, it’s essential to recognize that REST represents only the tip of the GraphQL iceberg.
00:17:03.960 There's much more to explore within GraphQL, and I encourage you to delve deeper into it. There’s a wealth of features and capabilities to discover, opening up avenues for innovative solutions. Thank you all for joining this presentation!