Rapidly Mapping JSON/XML API Schemas in Ruby

by Adam Cuppy

The video titled 'Rapidly Mapping JSON/XML API Schemas in Ruby' features Adam Cuppy at RubyConf 2014, focusing on the integration of JSON API services into Ruby applications. Cuppy discusses the necessity of building adapters to map APIs effectively, emphasizing an approach that prioritizes flexibility and adaptability in application architecture.

Key Points Discussed:
- Introduction to the Speaker and Company: Adam Cuppy introduces himself as part of Zeal, a software company, and invites attendees to connect with him on social media. He mentions the importance of the content, which is available on GitHub.
- Focus on JSON and XML: While primarily focusing on JSON, Cuppy also addresses XML and YAML, explaining that the core principle is the adaptability of data structures into Ruby objects.
- Assumed Knowledge Level: Cuppy assumes the audience has a basic understanding of Ruby’s standard library and programming patterns like decorators.
- Building API Adapters: He highlights the importance of creating a connection class that handles data retrieval, using the Google Civic API as a practical example. This connection manages requests using HTTP methods while keeping the application insulated from underlying details.
- Separation of Responsibilities: Cuppy stresses the need to separate the connection from the client object, with the client functioning like a controller in Rails. The client is responsible for the logic surrounding requests and responses between the application and API connection.
- Data Handling and Coercion: After fetching the data, he discusses transforming it for usability in the application, advocating for an instance-oriented architecture. He proposes the use of OpenStruct for handling representations of data within Ruby, promoting flexibility to accommodate changes in API structures.
- Adapting to API Changes: The discussion includes strategies on how to accommodate potential changes within the API without major disruptions to the existing architecture. This approach involves creating structured classes that reflect API responses.
- Flexibility in Representations: For applications dealing with multiple data formats, Cuppy mentions gems like Representable, supporting the adaptability to render data in formats like XML or YAML.
- Conclusion: Cuppy concludes by asserting that an effective architecture is instrumental in quickly adapting to changes and debugging, thereby enhancing the usability of services like APIs. He encourages ongoing engagement with the material and resource links provided.

Overall, the talk emphasizes the principles of designing adaptable Ruby applications capable of efficiently communicating with varying API schemas.

00:00:17.920 My name is Adam Cuppy, and I am a zealot, let it be known. I am part of a company called Zeal, which is actually located in southern Oregon. However, come January, we are opening an office down here in the San Diego area of Southern California. If any of you Ruby enthusiasts are in the area, please talk to me.

00:00:22.960 You can find us on our website, codingzeal.com. Of course, my Twitter handle is @adamcuppy, so please follow me.

00:00:28.080 Feel free to heckle me there, just not here. You know what I'm saying.

00:00:35.840 I don't entirely encourage you to get this information now, but all the content, including a lot of code that I will be presenting today, is located at this URL on GitHub. You can access it now.

00:00:43.040 There are some modifications, and they are intentional, so please be kind and acknowledge that there's a little bit of a difference. I subscribe to the Sandi Metz mentality of using really small classes. In this talk, I'm going to be presenting things consolidated into one large class, but I will break it apart into its smaller parts.

00:00:54.079 Again, if you didn't hear me the first time, please follow me on Twitter. By all means, heckle me there—it's perfectly acceptable. Now, moving on, this talk is part of the system architecture track.

00:01:06.080 In this talk, I'm focusing on rapidly mapping APIs. In the talk description, I mentioned JSON and XML. I will primarily focus on JSON, but I include XML because one of the core fundamental ideas is that you should be able to adapt any data structure into the appropriate Ruby objects and start working with them.

00:01:12.159 So again, while I will focus a lot on JSON, just know that XML and YAML will also be addressed later in the talk. I will provide some great examples of tools you can utilize to interchange that schema into whatever you need.

00:01:18.720 Oh, I forgot something! This is really important—I need to do a presenter selfie. I need you all to scoot in closer.

00:01:23.920 Okay, ready? Oh wait, I need to do this... presenter selfie! Woo! Come on, give me some energy! Okay, that's all I needed, thank you!

00:01:29.439 Moving on. I want to make a couple of assumptions about your skill level as a Ruby developer: you should be familiar with a good amount of the standard library, specifically with decorators and the Forwardable module.

00:01:35.840 I won't be going into these topics too heavily, but just know they are major constructs. If you're not familiar with the decorator programming pattern, I recommend looking it up online. It's quite popular, as it allows you to assign additional attributes onto an object at runtime.
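
For readers who have not met the decorator pattern, here is a minimal sketch built on the standard library's Forwardable module. The Election struct, the ElectionDecorator class, and the headline method are hypothetical names chosen for illustration and do not come from the talk.

```ruby
require "forwardable"

# Hypothetical plain object standing in for any model.
Election = Struct.new(:name, :election_day)

# Decorator: wraps an Election and adds behavior at runtime
# without modifying the Election class itself.
class ElectionDecorator
  extend Forwardable

  # Forward the readers we want to expose to the wrapped object.
  def_delegators :@election, :name, :election_day

  def initialize(election)
    @election = election
  end

  # Behavior that exists only on the decorated object.
  def headline
    "#{name} (#{election_day})"
  end
end

election  = Election.new("Primary", "2014-06-03")
decorated = ElectionDecorator.new(election)

decorated.name     # forwarded to the wrapped Election
decorated.headline # added by the decorator
```

The wrapped object stays untouched, so other parts of an application can keep using it without the extra methods.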

00:01:42.880 Now, let's start this off by discussing the three C's, one of which particularly relates to mapping an API: we need to build adapters.

00:01:48.079 The first step is to consume some kind of dataset. If you're familiar with standard databases like PostgreSQL, you'd have something like ActiveRecord to hit the endpoint, pull in data, and build that into objects. Then you interact with those objects.

00:01:54.399 In this talk, we'll go through the process of building a small adapter to connect to the resource itself. The goal of this adapter is to consume the data. Let's first talk about the connection.

00:02:05.360 Think of the connection as the relationship manager, a lock and key. It provides strong separation between our resource and our application. The furthest our application reaches toward the resource is the connection.

00:02:11.840 This connection holds the translation table, so our application does not need to know how it gets the data or where it's coming from. It's completely isolated and separated.

00:02:18.160 Now, let's look at a simple diagram of our system. At the bottom, there is a resource, and at the top, we have our connection.

00:02:24.160 The primary function of the connection module is to handle communication between the two, primarily via HTTP, using GET and POST requests.

00:02:29.360 The connection should manage the translation table, knowing how to connect and what to do when it connects. The rest of your application should be unaware of these details.

00:02:34.879 We're going to use the Google Civic API as our use case. Here's a very simple endpoint. The opening component indicates that we have a scheme of some kind.

00:02:40.000 Our protocol will be a secure connection using HTTPS. We will hit the Google API's domain, specifically targeting civic info. This setup is very common with APIs.

00:02:47.280 Whenever our consultancy builds out an API, we encourage versioning for obvious reasons. The connection needs to be aware of the versioning to route properly.

00:02:54.079 In this example, we are currently working with version one. To build out the connection, we're going to start with a very simple connection class.

00:03:01.599 I recommend using the HTTParty library, which will handle the basic HTTP connection. While it's very simple, there are many other gems out there, like Faraday, which can be more configurable.

00:03:07.040 But in this situation, we're going to create a basic connection class and include HTTParty. The connection's primary awareness will be the version it is on and the root route for the data.

00:03:14.000 This means that the rest of the application does not need to concern itself with those details. It simply requests the information.
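
A connection along these lines might look like the sketch below. To stay dependency-free it swaps in Ruby's standard Net::HTTP for the HTTP gem used in the talk. The base URL, the uri_for helper, and the keyword argument name are assumptions made for illustration.

```ruby
require "net/http"
require "uri"
require "json"

# Minimal connection sketch: it alone knows the root URL and the version.
class Connection
  # Assumed root for the Google Civic API.
  BASE = "https://www.googleapis.com/civicinfo"

  attr_reader :api_version

  def initialize(api_version: "v1")
    @api_version = api_version
  end

  # Builds the full URI for a relative path such as "/elections".
  def uri_for(path)
    URI("#{BASE}/#{api_version}#{path}")
  end

  # Performs a real GET request and parses the JSON body into a Hash.
  def get(path)
    JSON.parse(Net::HTTP.get(uri_for(path)))
  end
end

connection = Connection.new
connection.uri_for("/elections") # no network call; only builds the URI
```

Callers ask for "/elections" and never see the scheme, host, or version, which is the isolation described above.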

00:03:20.320 With that established, let’s extend our connection class. We can introduce a default query, which will be an empty parameters hash.

00:03:28.160 In the initializer, if a query is passed, we will use it; otherwise, we'll apply defaults. This allows us to bind additional attributes into the query and persist the API key.

00:03:35.120 Our first implementation will look something like this. We instantiate a connection object, passing in an API version. Because the version has a default (it may default to two), we can omit it if we choose.
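
The default query and the initializer could be sketched as follows. The constant names, the "v2" default, the placeholder API key, and the query_string helper are all assumptions for illustration, not the code shown on the slides.

```ruby
require "uri"

class Connection
  DEFAULT_API_VERSION = "v2"      # assumed default version
  DEFAULT_QUERY       = {}.freeze # empty parameters hash

  attr_reader :api_version, :query

  def initialize(api_version: DEFAULT_API_VERSION, query: {})
    @api_version = api_version
    # Values passed here (such as the API key) persist for every request.
    @query = DEFAULT_QUERY.merge(query)
  end

  # Merges per-request parameters over the persisted query
  # and encodes the result as a URL query string.
  def query_string(params = {})
    URI.encode_www_form(query.merge(params))
  end
end

# Explicit version plus a persisted (placeholder) API key.
connection = Connection.new(api_version: "v1", query: { key: "MY_API_KEY" })
connection.query_string(address: "San Diego")

# The version can be left out to fall back on the default.
default_connection = Connection.new
```

Keeping the key inside the connection means no other object has to pass it around.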

00:03:43.840 With the connection object, we can hold all values necessary to manage the request. Next, we need a routing and translation table between our application and the data that is retrieved.

00:03:51.680 The client object often gets mistaken for the connection object, and while they're generally combined, I believe that separating them is highly beneficial.

00:03:58.320 The client should contain all the logic that knows how to take a request, make it to the connection, retrieve the results, and return it to the application.

00:04:04.480 Think of the client as functioning similarly to the controller in a Rails application.

00:04:11.040 The client's primary role is to route and collect data between the application and the connection. The application will make a request, and the client will communicate with the connection to retrieve the data.

00:04:18.560 To bind these two together, we'll set up a basic client object, passing in the connection as well as a routing table.

00:04:25.040 The routing table will define the scope of the endpoint, including HTTP methods and paths. The connection does not care about what the application wants to do with the data.

00:04:30.800 Its responsibility lies in fetching that data based on the requests. As an example, our routing table can be structured simply.

00:04:37.440 If we define a key for elections with a GET request method and the specific path, we have the basis of our routing.
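
As a rough sketch, such a routing table can be a plain hash. The ROUTES constant name and the second entry are illustrative assumptions; only the elections route is described in the talk.

```ruby
# Routing table: maps a name the application uses to the
# HTTP method and path the connection should request.
ROUTES = {
  elections:       { method: :get, path: "/elections" },
  representatives: { method: :get, path: "/representatives" } # illustrative extra route
}.freeze

# Hash#fetch raises KeyError for an unknown route instead of returning nil.
route = ROUTES.fetch(:elections)
route[:method] # => :get
route[:path]   # => "/elections"
```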

00:04:44.000 Now that we have established our connection and defined routes, we can tell our client what we want and how we want that information returned.

00:04:50.639 We would expect to call a method on our client, for example, elections, which should return the desired data.

00:04:56.800 The simple answer would be to define an elections object directly on the client, but this approach lacks flexibility.

00:05:03.840 To keep things DRY, we can define method_missing on the client. It can look up the key that was requested, such as elections.

00:05:11.200 The routing map will then extract the HTTP method and path, and we can call on our connection object to retrieve the results.

00:05:19.040 By doing this, our application does not need to know about versioning or base properties; it simply requests the data it wants.
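
The dispatch described here can be sketched like this. The Client and FakeConnection classes are assumptions for illustration; FakeConnection stands in for a real connection so the example runs without network access.

```ruby
# Client sketch: routes named requests to the connection.
class Client
  def initialize(connection:, routes:)
    @connection = connection
    @routes     = routes
  end

  # Any method name found in the routing table becomes a request.
  def method_missing(name, *args, &block)
    route = @routes[name]
    return super unless route

    @connection.public_send(route[:method], route[:path], *args)
  end

  # Keeps respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    @routes.key?(name) || super
  end
end

# Stand-in connection that echoes what it was asked to fetch.
class FakeConnection
  def get(path, params = {})
    { "path" => path, "params" => params }
  end
end

routes = { elections: { method: :get, path: "/elections" } }
client = Client.new(connection: FakeConnection.new, routes: routes)

client.elections # dispatched as a GET to "/elections"
```

Adding a new endpoint then means adding one entry to the routing table rather than a new method on the client.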

00:05:28.320 Now, when we call the elections method, it will return a JSON payload. HTTParty will parse this payload into a Ruby hash.

00:05:35.200 This hash may have attributes like an array of elections and corresponding attributes such as id, name, and election date.

00:05:42.240 Once we have this data back as a Ruby construct, we can start interacting with it.

00:05:49.520 However, by using a raw hash, we can encounter limitations. For example, if we try to access a key that does not exist, Ruby will return nil, which can lead to complications.

00:06:03.680 Many developers dislike dealing with nil, which complicates safety and validation in our code.
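
A small sketch of the problem, using a made-up payload (the key names and values are assumptions, not real API output):

```ruby
# Hypothetical parsed payload, shaped like the hash described above.
payload = {
  "elections" => [
    { "id" => "2000", "name" => "VIP Test Election", "electionDay" => "2015-06-06" }
  ]
}

first = payload["elections"].first

first["name"] # => "VIP Test Election"

# A mistyped key fails silently: Hash#[] returns nil.
wrong = first["election_day"]

# The nil only surfaces later, far from the original mistake.
later_error =
  begin
    wrong.upcase
  rescue NoMethodError => e
    e.class
  end

# Hash#fetch fails immediately, at the point of the mistake.
fetch_error =
  begin
    first.fetch("election_day")
  rescue KeyError => e
    e.class
  end
```

Hash#fetch is one way to fail early, but it still leaves the application tied to the raw key names of the payload.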

00:06:10.080 Thus, the goal should be to extract the raw data structure into a more usable format, allowing for additional methods and potential validations.

00:06:17.919 Monkey patching or decorating a hash is often suggested, but there are better ways to interact with the data.

00:06:24.640 The next step in this process is coercion, which involves transforming data into a format understood by our application.

00:06:34.640 We need to maintain separation between our resource data, our connection, and our application to ensure that changes in data structures do not impact other components.

00:06:42.080 The idea is to create an instance-oriented architecture, pulling data out of its primitive constructs into a structure that our application can easily manipulate.

00:06:50.080 To do this, we will create a representation which acts as an entity mapping. These representations will handle hash instances and correspond with our client.

00:06:58.080 For example, our representations can interchange between Ruby instances and their JSON counterparts.

00:07:05.679 A standard library called OpenStruct allows this mapping functionality with minimal overhead. While not always the fastest, it's a simple solution.

00:07:13.440 It is crucial for our application to focus on representation, avoiding dependency on how data is structured at the resource level.

00:07:20.159 We want to be able to request data without worrying about any underlying structure changes.

00:07:28.080 Let’s implement our representation class, inheriting from OpenStruct, allowing us to build on top of that.

00:07:33.840 For instance, we can pass a representation parameter and a parent parameter which will default to nil.

00:07:40.000 By running the initializer, we will set up the necessary structures that allow our data to be effectively utilized.

00:07:47.360 Our method for representing children will allow us to handle our hash representations smoothly.
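
One possible shape for such a class is sketched below. The represent helper and the way parent is stored are assumptions for illustration; the talk's own code is in the GitHub repository mentioned earlier and may differ.

```ruby
require "ostruct"

# Representation sketch: turns a parsed hash into nested objects.
class Representation < OpenStruct
  attr_reader :parent

  def initialize(representation = {}, parent = nil)
    @parent = parent
    super()
    # Wrap each value so nested hashes become child representations.
    representation.each { |key, value| self[key] = represent(value) }
  end

  private

  # Hashes become child representations, arrays are mapped element
  # by element, and everything else is stored unchanged.
  def represent(value)
    case value
    when Hash  then Representation.new(value, self)
    when Array then value.map { |item| represent(item) }
    else value
    end
  end
end

# Hypothetical payload for demonstration.
payload = { "elections" => [{ "id" => "2000", "name" => "VIP Test Election" }] }
root    = Representation.new(payload)

root.elections.first.name   # => "VIP Test Election"
root.elections.first.parent # the root representation
```

Each child keeps a reference to its parent, so code working with an election can walk back up to the response it came from.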

00:07:54.239 In this rapidly evolving mapping, we can effectively convert our data into structured classes that can easily adapt.

00:08:01.600 The core part of this revolves around defining types and being able to coerce data into recognizable formats for our application.

00:08:08.640 Once we define a class, we can instantiate it as a representation, allowing the data to follow a recognized structure.

00:08:15.919 This process not only maps relationships but creates functionality that effectively mirrors the API's response.
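
Coercion into a typed class might be sketched as follows. The Election class, its from_hash constructor, the upcoming? method, and the payload key names are assumptions for illustration.

```ruby
require "date"

# Typed representation: raw strings are coerced into Ruby types once,
# at the boundary, so the application never handles raw payload values.
class Election
  attr_reader :id, :name, :election_day

  # Builds an Election from a parsed payload hash (assumed key names).
  def self.from_hash(hash)
    new(
      id:           Integer(hash.fetch("id")),
      name:         hash.fetch("name"),
      election_day: Date.iso8601(hash.fetch("electionDay"))
    )
  end

  def initialize(id:, name:, election_day:)
    @id           = id
    @name         = name
    @election_day = election_day
  end

  # Behavior that a raw hash could not offer.
  def upcoming?(today = Date.today)
    election_day >= today
  end
end

election = Election.from_hash(
  "id" => "2000", "name" => "VIP Test Election", "electionDay" => "2015-06-06"
)

election.election_day                   # a Date, not a String
election.upcoming?(Date.new(2015, 1, 1)) # => true
```

If the API renames a key, only from_hash has to change; callers of election_day and upcoming? are unaffected.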

00:08:22.080 As we begin to map and work with these classes, we must also consider the potential for changes to the API and how it impacts our system.

00:08:30.560 For instance, if we add an attribute like locations, adapting our code must be straightforward within the representation.

00:08:38.240 With representations, we can easily add new attributes without extensive modifications to the underlying architecture.
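
To illustrate, the sketch below parses two made-up payloads straight into OpenStruct objects using the standard JSON library's object_class option. The payload contents are assumptions; the point is that a new locations attribute becomes readable without any class changes.

```ruby
require "json"
require "ostruct"

# Hypothetical responses before and after the API adds "locations".
old_payload = '{"id":"2000","name":"VIP Test Election"}'
new_payload = '{"id":"2000","name":"VIP Test Election","locations":["San Diego"]}'

# object_class tells the parser to build OpenStructs instead of Hashes.
before = JSON.parse(old_payload, object_class: OpenStruct)
after  = JSON.parse(new_payload, object_class: OpenStruct)

before.name      # => "VIP Test Election"
before.locations # => nil (the attribute did not exist yet)
after.locations  # => ["San Diego"]
```

Code written against the old payload keeps working, and code that needs locations can start using it as soon as the API returns it.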

00:08:45.040 Now that we have classes and object interactions, we can add more libraries or functionality, enhancing our data handling capabilities.

00:08:52.000 As for responding to XML or YAML data, there are gems such as Representable that can help by decorating our models to render in various formats.

00:09:00.960 This flexibility benefits our architecture by allowing diverse representations without altering the application logic.
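
The Representable gem is not shown here. As a rough stand-in, the sketch below renders one object as JSON and as YAML using only the standard library, to illustrate the idea of multiple output formats from a single representation. The election object and its values are made up.

```ruby
require "ostruct"
require "json"
require "yaml"

# Hypothetical representation of one election.
election = OpenStruct.new(id: "2000", name: "VIP Test Election")

# OpenStruct#to_h yields symbol keys; string keys serialize more cleanly.
attributes = election.to_h.transform_keys(&:to_s)

as_json = JSON.generate(attributes)
as_yaml = YAML.dump(attributes)

as_json # => "{\"id\":\"2000\",\"name\":\"VIP Test Election\"}"
```

The application logic only ever touches the election object; the choice of output format is made at the edge.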

00:09:07.840 To summarize, effective architecture adapts quickly to changes while enclosing data in appropriate constructs.

00:09:14.240 Quick adaptations allow us to catch bugs and deploy solutions rapidly—in essence, establishing a system that can absorb change effortlessly.

00:09:22.000 Thank you all for your time and attention! Please feel free to reach out to me or connect online.

00:09:27.840 If you have any questions or interest in the slides or code, visit the provided link where everything is available.

00:09:34.080 I appreciate your engagement and hope you have a wonderful rest of your conference!

00:09:41.120 Thank you again!