RailsConf 2018

“API?” – How LendingHome Approaches “Legacy” Technologies

by Sam Aarons

In this talk titled "API?" – How LendingHome Approaches "Legacy" Technologies, Sam Aarons discusses LendingHome's mission to simplify the mortgage process through technological innovation. The primary theme revolves around integrating modern systems with legacy technologies, particularly the challenges presented by third-party vendors utilizing outdated protocols like FTP and fixed-width files.

Key points discussed include:

  • Understanding the Landscape: Aarons introduces LendingHome as an online mortgage platform that has eliminated traditional paperwork by utilizing various technological solutions. He highlights their use of a Ruby on Rails application and numerous vendor integrations.

  • Integration Challenges: A major issue for LendingHome is dealing with legacy APIs from vendors. Aarons emphasizes the importance of recognizing where these vendors are coming from, acknowledging that their definitions of modern technologies might differ significantly from the expectations of a tech-savvy company like LendingHome.

  • Lessons Learned: Aarons outlines three critical lessons from their experiences:

    • Build the Interface You Want: Instead of directly integrating with a vendor’s legacy API, build an internal API that abstracts the complexities for your team. This helps mitigate difficulties associated with unfamiliar or cumbersome vendor interfaces.
    • Plan for Failure: Acknowledge the high likelihood of failures when working with legacy systems. The talk stresses the need for robust logging practices and maintaining copies of sent and received files to troubleshoot issues better.
    • Be Efficient When Reading and Writing: Aarons advocates treating files as files: use streaming parsers and line-by-line reads to avoid exhausting memory on large files, since many legacy systems exchange XML, fixed-width, and CSV files.
  • Case Studies: Aarons shares anecdotes about creating microservices, such as "Rainmaker" and "Grand Central," which efficiently manage interactions with various FTP vendors while adhering to the principles discussed. These systems allow for streamlined file transfers, minimizing manual intervention and errors.

  • Conclusion: Aarons closes by stressing the importance of compassion and understanding in technology integrations. He encourages considering the human elements of vendor relationships and emphasizes that efficiency not only optimizes system performance but also fosters better collaboration between teams.

Overall, the talk serves as a practical guide for modern tech companies trying to integrate with older systems, illustrating the combined value of technical solutions and empathetic practices in overcoming integration hurdles.

00:00:10.610 All right, I think we'll get started. I just want to thank everyone for coming today. For those of you who don't know me, my name is Sam Aarons. I'm a senior software engineer at LendingHome.
00:00:16.830 Quick plug for LendingHome: we're hiring! If you're in San Francisco, Pittsburgh, or even remote, we're looking for workers for all our offices. If you like some of the topics discussed in this talk, come by and talk to me afterward, and I can point you in the right direction.
00:00:34.770 The title of this talk is "API? How LendingHome Approaches Legacy Technologies." Before I dive into what that means, let me explain a bit about what LendingHome is.
00:00:46.649 LendingHome is rethinking the mortgage process from the ground up using technology. Essentially, it's a 100% online, simple, and elegant way to get a mortgage. All document uploads are handled through the website—no more faxing someone, no more sending a FedEx envelope full of documents, and no more needing wet signatures and going back and forth with your bank while not knowing where you stand.
00:01:11.280 LendingHome essentially takes the entire mortgage process online. We lend off our own balance sheet, meaning you receive the money much quicker than with other platforms, like crowdfunding websites. It's really a technology-first solution.
00:01:29.400 We've built this using a vast array of technologies and vendor integrations. This is only a small sample of our integrations and the various technology solutions we employ. We're a Ruby on Rails application, which is why we’re at RailsConf. You can see a lot of the technologies here, alongside a few of the vendors we integrate with.
00:01:46.530 In reality, LendingHome has hundreds of vendor integrations. We have partnerships for ordering appraisals on homes; there's even a literal API where you send a request and someone leaves their office, takes pictures of a house, writes up a report, and sends it back to you.
00:02:06.450 When I say we've seen everything, I really mean it. A majority of our API and vendor integrations are through standard RESTful JSON APIs. Many of you are likely familiar with this—you submit a JSON request and receive a JSON response. You may even have some webhooks if it's an asynchronous call. We've encountered this format quite frequently.
00:02:29.939 However, when I say we've seen everything, I mean it in the technological sense. This is an anonymous employee at a vendor highlighting a common response we get back from some of the vendors we work with. It's easy to poke fun at our vendors for this, but it’s important to understand where they're coming from. In their world, an FTP file transfer API is their definition of modern technology to serve their clients.
00:03:10.799 It's crucial to recognize this as the standard in the financial services industry and acknowledge this when integrating with them. Criticizing vendors or speaking negatively about them won't build an integration faster, nor will it help them develop an API. We must meet them where they are, with compassion and understanding. This is the difficult aspect of LendingHome's work—integrating with legacy APIs provided by these vendors.
00:03:42.000 This talk will cover what we have learned over the years from integrating with a wide variety of legacy API vendors. If I had to break it down into three major points, I would say: 1. Build the interface you want. 2. Plan for failure. 3. Be efficient when reading and writing.
00:04:01.349 Let’s start with the first point: Build the interface you want. The emphasis here is on you—you as an individual, or if you're working in a larger company, you as your team. It's crucial to build the interface you wish your vendor had created. This is important for several reasons. You don't want to train your teammates to build directly to a vendor's API.
00:04:48.050 Instead, it’s much simpler to develop an internal API or some kind of abstraction layer that simplifies the task of generating files and vendor-specific formats. At LendingHome, we follow certain best practices every time we integrate with a vendor. Some key points include: clean interfaces benefit people, and machines don't really care how you integrate—machines will integrate based on your instructions.
00:05:35.599 The clean interface created by an internal API primarily benefits you and your teammates. By constructing a clear internal API, you can help your colleagues interact with the vendor without needing to comprehend the specific file formats and other technical details required.
00:06:12.110 It's important to foster empathy for your teammates who may have never heard of a particular vendor or how to integrate with them. To aid them, especially if a vendor will be widely used, make it as simple as possible by building the interface you wish your vendor had provided.
00:06:41.969 Additionally, ensure you use technologies that align with the vendor's synchronous or asynchronous interfaces. This is particularly important when internal APIs are deployed as microservices. Many file-based legacy vendors operate asynchronously by nature; you might drop a file into their FTP server, and they may respond anywhere from five minutes to an hour, or even the next day.
00:07:28.469 You wouldn't want to map a synchronous HTTP microservice interface over an inherently asynchronous one. Understanding the system and using technologies that match its operational flow are vital. At LendingHome, we use AWS, where SQS provides a great asynchronous messaging solution, and many of our microservices that integrate with FTP vendors utilize SQS for communication.
00:08:02.729 Around 2015, LendingHome was funding a lot of loans—we were processing vast amounts of money, sending and receiving funds from borrowers while doing it all manually. We logged into the banking portal, manually entered numbers, and verified their accuracy. However, that approach is not scalable and is unreliable.
00:08:58.350 When we reached a certain scale, we needed programmatic interactions. We approached one of our banks and requested to set this up, and they agreed, but the interface they offered was over FTP and they did not have an API.
00:09:40.530 At LendingHome and many other companies, we had to accept this reality. We built a microservice called Rainmaker, which exposes a straightforward internal API for the platform—just a few fields indicating the amount, the direction (credit or debit), the source, and the destination of the funds.
00:10:31.540 Internally, Rainmaker transforms this into a proprietary XML file that we drop into the bank’s FTP folder. On a set schedule, we receive responses, and Rainmaker parses those files and sends messages back to the platform.
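To illustrate the idea of a small internal API that hides the vendor file format, here is a minimal Ruby sketch. It is not Rainmaker's actual code: the `TransferRequest` struct, the `to_vendor_xml` helper, and every XML element name are assumptions made for illustration, since the bank's proprietary format is not shown in the talk.

```ruby
require "cgi"

VALID_DIRECTIONS = %w[credit debit].freeze

# The small internal request described in the talk: amount, direction,
# source, and destination. Field names here are hypothetical.
TransferRequest = Struct.new(:amount_cents, :direction, :source, :destination, keyword_init: true) do
  def validate!
    unless amount_cents.is_a?(Integer) && amount_cents.positive?
      raise ArgumentError, "amount_cents must be a positive integer"
    end
    raise ArgumentError, "direction must be credit or debit" unless VALID_DIRECTIONS.include?(direction)
    raise ArgumentError, "source and destination are required" if source.to_s.empty? || destination.to_s.empty?

    self
  end
end

# Renders the request as a vendor file. The element names are invented;
# a real integration would follow the vendor's specification.
def to_vendor_xml(request)
  request.validate!
  amount = format("%d.%02d", *request.amount_cents.divmod(100))
  <<~XML
    <?xml version="1.0" encoding="UTF-8"?>
    <payment>
      <amount>#{amount}</amount>
      <direction>#{request.direction}</direction>
      <source>#{CGI.escapeHTML(request.source)}</source>
      <destination>#{CGI.escapeHTML(request.destination)}</destination>
    </payment>
  XML
end
```

Callers only ever build a `TransferRequest`; the file format stays in one place, so a change on the vendor side touches a single method.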
00:11:00.110 This setup works well for us. We've built the interface we wanted: an asynchronous, SQS-based microservice connecting directly to our vendor. However, this design does not scale once there are multiple FTP vendors. We needed a system that could work with various FTP vendors without duplicating connection logic in multiple microservices or the main platform.
00:11:43.530 This prompted us to ask what kind of interface we desired. We iterated and devised a better solution, creating a distinct microservice named Grand Central. This service pulls files from specific folders in S3 and processes them by sending them to the appropriate FTP vendor integration.
00:12:33.670 Simultaneously, it pulls from all FTP vendor integrations, processes new files, and deposits them in the S3 bucket that the platform monitors. This interface makes the most sense for us by maintaining a file-based structure in line with expected integration technologies.
00:13:10.300 Additionally, Rainmaker no longer needs to learn how to connect directly to FTP, nor does the platform; we have one abstraction point for FTP and the underlying file connection details. This allows the platform, Rainmaker, or any future microservices to focus on the specific vendor format.
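The routing idea behind Grand Central can be sketched roughly as follows. This is an in-memory illustration under stated assumptions, not the real service: a Hash stands in for the S3 bucket, `FakeVendorEndpoint` stands in for an FTP connection, and the `outbound/<vendor>/` and `inbound/<vendor>/` folder names are hypothetical.

```ruby
# In-memory stand-in for one vendor's FTP/SFTP endpoint (hypothetical).
class FakeVendorEndpoint
  attr_reader :uploaded

  def initialize(outbox = {})
    @uploaded = {}    # files we have sent to the vendor
    @outbox = outbox  # files the vendor has left for us to collect
  end

  def upload(name, body)
    @uploaded[name] = body
  end

  # Returns the waiting files and empties the vendor's outbox.
  def download_all
    files = @outbox.dup
    @outbox.clear
    files
  end
end

# Moves files between a shared store and the vendor endpoints,
# using the folder prefix to pick the vendor.
class FileRouter
  def initialize(store, endpoints)
    @store = store          # e.g. { "outbound/bank/pay.xml" => "<payment/>" }
    @endpoints = endpoints  # e.g. { "bank" => FakeVendorEndpoint.new }
  end

  # Send everything under outbound/<vendor>/ to that vendor.
  def push_outbound
    @store.keys.grep(%r{\Aoutbound/}).each do |key|
      _, vendor, name = key.split("/", 3)
      endpoint = @endpoints.fetch(vendor) { raise KeyError, "no integration for #{vendor}" }
      endpoint.upload(name, @store.fetch(key))
      @store.delete(key) # remove only after the upload succeeded
    end
  end

  # Pull new vendor files into inbound/<vendor>/ for the platform to pick up.
  def pull_inbound
    @endpoints.each do |vendor, endpoint|
      endpoint.download_all.each { |name, body| @store["inbound/#{vendor}/#{name}"] = body }
    end
  end
end
```

The point of the design is that producers and consumers only read and write files in a shared store; connection details live in one place.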
00:13:55.300 In keeping with our theme of building the interface you want, remember that UI and UX constitute interfaces as well. It's important to build clean interfaces, not just from a technical standpoint but also from a design perspective, ensuring user focus and usability.
00:14:24.250 This is a screenshot from Grand Central. This demonstrates the page where we create a file transfer, which is simply an FTP integration. You can see how straightforward it is; we can now establish this in just five minutes. LendingHome has 15 distinct FTP vendor integrations, all of which can be set up in a matter of minutes with basic information.
00:15:12.240 You just provide a name, specify the protocol, a host, port, username, and process direction (whether it’s an inbound file connection or an export). This efficiency has significantly reduced the time required to create a new FTP vendor integration. Now, when vendors approach us saying they only have an FTP file integration, our engineers no longer dread the implication of heavy lifting; building the right abstractions simplifies the process immensely.
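The handful of fields on that form could be modeled along these lines. This is only a sketch: the `FileTransferConfig` struct, the allowed protocol and direction values, and the validation rules are assumptions, not Grand Central's actual schema.

```ruby
SUPPORTED_PROTOCOLS = %w[ftp ftps sftp].freeze
SUPPORTED_DIRECTIONS = %w[inbound outbound].freeze

# One file-transfer definition, using the fields named in the talk.
FileTransferConfig = Struct.new(:name, :protocol, :host, :port, :username, :direction, keyword_init: true) do
  # Returns a list of human-readable problems; empty when the config is usable.
  def errors
    problems = []
    problems << "name is required" if name.to_s.strip.empty?
    problems << "host is required" if host.to_s.strip.empty?
    problems << "username is required" if username.to_s.strip.empty?
    problems << "unsupported protocol: #{protocol.inspect}" unless SUPPORTED_PROTOCOLS.include?(protocol)
    problems << "unsupported direction: #{direction.inspect}" unless SUPPORTED_DIRECTIONS.include?(direction)
    problems << "port must be between 1 and 65535" unless port.is_a?(Integer) && port.between?(1, 65_535)
    problems
  end

  def valid?
    errors.empty?
  end
end
```

Validating up front keeps a typo in the form from surfacing later as a mysterious connection failure.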
00:16:29.660 The next point I'd like to discuss is planning for failure, which can manifest in several ways. One illustrative example involves the Nationwide Multistate Licensing System (NMLS), which regulates mortgage companies. Their website has specific office hours, and the hours differ on weekends.
00:17:22.340 This slide exemplifies a reality LendingHome faces—vital systems can go offline at unpredictable times. If the database or server you rely on is temporarily unavailable, you need to account for these failure modes. Leadership must understand that file-based transfer APIs are prone to interruptions; sometimes, vendors literally shut down every night.
00:18:08.880 For example, there is a particular vendor where our integration relies on the fact that every morning when they arrive at work, they power on the server at their desks and shut it down at the end of the day. If you need to send files, you must do so before they leave for the evening.
00:18:56.220 It’s important to think about scheduling and be aware that this is the environment we are working with. It’s essential to have empathy for vendors like this as it is quite standard in their operations, and it’s crucial that we build abstractions to compensate for their perceived deficiencies.
00:19:35.820 To summarize this point: everything will go wrong, so make your system robust. At LendingHome, we follow several key strategies to simplify integration with file-based APIs. First is to log everything. Every time you connect to an FTP server, log that interaction. Every connection and disconnection must be recorded, along with the time and date of any files sent.
00:20:47.260 This practice enhances debugging capabilities and allows quick identification of problematic periods versus stable periods in the system. Logging is not expensive, and it is critical when dealing with systems that may not function as expected.
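One way to make "log everything" hard to forget is to wrap every vendor session in a helper that records the connect, each file sent, any error, and the disconnect. The sketch below is an assumption-laden illustration: `TransferLog` and `with_logged_session` are hypothetical names, and an Array stands in for wherever the rows would actually be stored.

```ruby
require "time"

# Collects one row per event. The Array stands in for a real log store.
class TransferLog
  attr_reader :rows

  def initialize(clock: -> { Time.now.utc })
    @rows = []
    @clock = clock
  end

  def record(vendor:, event:, filename: nil, error: nil)
    @rows << { at: @clock.call.iso8601, vendor: vendor, event: event, filename: filename, error: error }
  end
end

# Wraps a vendor session so connect, sends, failures, and disconnect
# are always logged, even when the block raises.
def with_logged_session(log, vendor)
  log.record(vendor: vendor, event: "connect")
  yield ->(filename) { log.record(vendor: vendor, event: "sent", filename: filename) }
rescue StandardError => e
  log.record(vendor: vendor, event: "error", error: "#{e.class}: #{e.message}")
  raise
ensure
  log.record(vendor: vendor, event: "disconnect")
end
```

A caller would do its uploads inside the block and call the yielded lambda after each file, for example `with_logged_session(log, "bank") { |sent| sent.call("pay.xml") }`.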
00:21:36.720 Next, save a copy of every file or message sent and received; don't rely on your vendor to keep an archive of the files they receive. Store copies securely, such as in an S3 bucket or equivalent. This facilitates debugging significantly, as you can recreate scenarios with precise details and prove that the file was sent.
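Keeping a copy can be as simple as the following sketch, which stores each file under a timestamped, checksummed name before it is sent or processed. A local directory stands in for the S3 bucket, and the `archive_copy` helper and its naming scheme are assumptions for illustration.

```ruby
require "digest"
require "fileutils"

# Copies a file into an archive directory and returns details that can be
# logged alongside the transfer. A local directory stands in for S3 here.
def archive_copy(path, archive_root:, vendor:, direction:, now: Time.now.utc)
  checksum = Digest::SHA256.file(path).hexdigest # hashes the file in chunks
  stamp = now.strftime("%Y%m%dT%H%M%SZ")
  dir = File.join(archive_root, vendor, direction)
  FileUtils.mkdir_p(dir)
  target = File.join(dir, "#{stamp}-#{checksum[0, 12]}-#{File.basename(path)}")
  FileUtils.cp(path, target)
  { archived_to: target, sha256: checksum, bytes: File.size(target) }
end
```

Recording the checksum next to the copy makes it possible to show later exactly which bytes were sent and when.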
00:22:40.970 Additionally, many of these FTP vendor integrations require a static IP address. Most use FTPS or SFTP for secure connections rather than plain FTP, which is unencrypted.
00:23:50.320 It’s essential to utilize a provider or setup that ensures your system maintains a consistent static IP address for all file transfers.
00:24:31.120 Understand the exact transport system required; whether it be FTP, FTPS, SFTP, or any other proprietary protocol. Ask vendors directly about the specific port, host requirements, and whether you need to connect via a proxy. Clarifying these details up front can save hours of frustration.
00:25:16.840 Use reliable libraries and confirm their efficacy. Ruby's FTP libraries may not always be dependable and can fail mysteriously, particularly when it comes to socket timeouts and connection issues. At LendingHome, we discovered the Java FTP libraries were highly effective, so Grand Central is built as a JRuby application utilizing these libraries.
00:26:03.590 Remove files after they have been read, or ensure your vendor handles that task. Each vendor manages this differently. Sometimes, they automatically delete files after processing, while in some cases, the sender is required to delete them. Misunderstanding these protocols can lead to the same file being processed twice.
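As a defensive complement to deleting files, the reader can also remember what it has already handled, so that a file left behind on the vendor's server is skipped on the next poll. The sketch below is one possible approach, not something described in the talk: `ProcessedFiles` is a hypothetical class, and a Set stands in for persistent storage.

```ruby
require "digest"
require "set"

# Remembers which inbound files were already handled. The key combines the
# file name with a content checksum, so a re-sent file with new content
# under the same name is still processed.
class ProcessedFiles
  def initialize
    @seen = Set.new
  end

  # Yields the path only the first time this name/content pair is seen.
  # Returns true if the block ran, false if the file was a duplicate.
  def process_once(path)
    key = "#{File.basename(path)}:#{Digest::SHA256.file(path).hexdigest}"
    return false if @seen.include?(key)

    yield path
    @seen << key # mark as done only after the block succeeds
    true
  end
end
```

Because the file is marked only after the block returns, a failed run is retried on the next poll instead of being silently dropped.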
00:26:53.490 Don't load files into memory. This point will come up in detail in the third section of this presentation, but it’s crucial to manage files properly—especially to avoid loading excessively large files into memory to prevent crashes. Treat the files appropriately; process them efficiently without causing unnecessary memory use.
00:27:48.110 Also, remember that soft skills are essential when dealing with failures, not just technical abilities. I once locked us out from an FTP vendor on a Friday evening when most of our staff had left for the day, leaving a scramble to rectify the issue. I reached out to all the vendor contacts, apologized profusely, and working collaboratively led us to re-enable the integration within an hour.
00:28:30.149 Soft skills are invaluable when navigating these vendor relationships. Finally, let's talk about being efficient when reading and writing. This focus is critical for file transfer APIs since they commonly interact with large files in specific formats. Reading entire files into memory can lead to inefficiencies.
00:29:54.030 Computers are adept at handling files, and there's no need to read an entire file into memory. Many legacy vendors send files formatted in XML, fixed-width format, or sometimes CSV.
00:30:50.300 When working with XML, use streaming parsers and writers. Instead of reading an entire XML file into memory, parse it token by token. `Nokogiri` provides streaming XML parsers (its SAX and Reader interfaces) that can handle large files efficiently.
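A minimal sketch of event-based XML parsing follows. To keep it free of extra gems it swaps in REXML's stream parser, which is bundled with standard Ruby installs, rather than Nokogiri; Nokogiri's SAX interface follows the same callback pattern. The `<amount>` element and the `each_amount` helper are hypothetical.

```ruby
require "rexml/document"
require "rexml/streamlistener"

# Receives parser events and hands each <amount> value to a callback,
# so the document is never built as a tree in memory.
class AmountListener
  include REXML::StreamListener

  def initialize(&on_amount)
    @on_amount = on_amount
    @inside_amount = false
    @buffer = +""
  end

  def tag_start(name, _attrs)
    return unless name == "amount"

    @inside_amount = true
    @buffer = +""
  end

  def text(value)
    @buffer << value if @inside_amount
  end

  def tag_end(name)
    return unless name == "amount"

    @on_amount.call(@buffer.strip)
    @inside_amount = false
  end
end

# Accepts any IO (for example an open File) and yields each amount as it is read.
def each_amount(io, &block)
  REXML::Document.parse_stream(io, AmountListener.new(&block))
end
```

For example, `File.open("payments.xml") { |f| each_amount(f) { |amount| puts amount } }` handles one value at a time instead of loading the whole document.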
00:31:49.330 For fixed-width files, read line by line; this drastically reduces memory allocation when handling large files.
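Reading a fixed-width file one line at a time might look like the sketch below. The record layout (`FIXED_WIDTH_LAYOUT`) is invented for illustration; a real vendor would supply the offsets and widths in its file specification.

```ruby
# Hypothetical record layout: field name => [offset, width].
FIXED_WIDTH_LAYOUT = {
  account: [0, 10],
  amount_cents: [10, 8],
  status: [18, 2]
}.freeze

# Slices one line into named fields and trims the padding.
def parse_fixed_width_line(line)
  FIXED_WIDTH_LAYOUT.transform_values { |(offset, width)| line[offset, width].to_s.strip }
end

# Yields one parsed record at a time; only the current line is held in memory.
def each_fixed_width_record(path)
  return enum_for(:each_fixed_width_record, path) unless block_given?

  File.foreach(path, chomp: true) do |line|
    next if line.strip.empty? # skip blank lines, such as a trailing newline

    yield parse_fixed_width_line(line)
  end
end
```

`File.foreach` reads the file lazily, so memory use stays roughly constant no matter how large the file is.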
00:32:48.540 The efficiency gained from these practices ultimately leads to smoother operations. Before wrapping up, I’d like to emphasize the key takeaway: building compassion and understanding for others—be it your colleagues or vendors—can vastly enhance the user experience. By following the principles discussed today, you'll be better equipped to tackle the intricacies of API integrations.
00:33:58.020 Again, LendingHome is hiring: San Francisco, Pittsburgh, and remote opportunities available. If you have any questions regarding the content I’ve covered today, I am more than happy to answer them. Thank you very much.
00:34:22.680 So the question was whether to log to a database or directly to your standard logging interface. When we built Grand Central, we opted to log to a database.
00:34:38.680 Each interaction with the FTP vendors is logged as a database row, providing a standardized format and allowing for easy access and analysis.
00:34:50.300 Using a database schema makes it simpler to present human-readable logs of activities.
00:35:05.600 So, if possible, I would recommend logging to a database to facilitate easier querying and access.
00:35:43.080 Another question was regarding testing in environments where specifications are poorly defined. When setting up Grand Central, we connected to a live server, and because of the differing work schedules between our team and the vendor team, we spun up our own FTPS server for testing.
00:36:02.460 Creating mock instances on your own is an effective approach to address poorly defined specifications.
00:36:19.770 Whenever we experience failures, we not only log them but send detailed alerts directly to Slack.
00:36:37.360 We utilize error tracking through Sentry, then route alerts regarding specific issues to a dedicated Slack room for immediate attention.
00:37:07.600 Regular alerts ensure that high-priority failures receive the attention they need in real time. Thank you all!