Building LLM powered applications in Ruby

by Andrei Bondarev

In the presentation titled 'Building LLM powered applications in Ruby,' delivered by Andrei Bondarev at the wroc_love.rb 2024 conference, the focus is on the integration of generative AI and large language models (LLMs) in Ruby applications. Bondarev, with over a decade of experience in developing Rails applications, delves into the transformative impact of generative AI on software development.

Key points include:
- Definition of Generative AI: Generative AI refers to artificial intelligence systems capable of generating text, audio, video, and more, with a specific focus on text in this talk. LLMs are highlighted as powerful tools based on deep learning and transformer architecture.
- Evolution of Development Timelines: Historically, building and deploying AI applications took several months. With the advent of APIs and LLMs, this process can now condense into days, allowing for rapid prototyping and deployment.
- Capabilities of LLMs: The talk outlines various tasks suitable for LLMs, including data structuring, summarization, classification, translation, content generation, and chatbot functionalities.
- Vision for AI in Tech Stacks: Bondarev envisions generative AI becoming a fundamental component of all technology stacks, similar to databases and caching. He proposes a shift in application architecture where AI plays a central role, enhancing the flexibility and efficiency of decision-making processes in applications.
- AI Agents: The presentation discusses AI agents, which are autonomous or semi-autonomous programs that use LLMs to execute tasks. They can interact with APIs and automate business operations.
- Reliability of AI Agents: Bondarev addresses the challenge of reliability in AI agents, noting that as the scope of their tasks narrows, their reliability increases. A narrowly focused agent performs better and is more likely to be ready for a proof of concept.

In conclusion, the presentation emphasizes the potential for LLMs and generative AI to revolutionize the development landscape in Ruby and other programming environments. By incorporating AI effectively into the stack, developers can focus on writing business logic while the AI manages more complex decision-making and integrations, heralding a future where AI and software development are inextricably linked.

Overall, Bondarev invites developers to embrace this evolution and consider the broader implications of integrating AI into their work processes.

00:00:13.679 Today, we'll talk about building LLM-powered applications in Ruby. I started my career 13 years ago building Rails applications for the U.S. Department of Health and Human Services, creating public-facing applications for millions of consumers. Then I moved on to continue building Rails applications at USA Today. Does anyone remember Spree? Sean Schofield, who was the original author, hired me, and I worked with one of the most amazing Rails teams at the time.

00:00:26.960 So, we're going to talk about generative AI. What is generative AI? It is a type of artificial intelligence that generates text, audio, video, etc. For the purposes of this presentation, we'll focus on text. Large language models (LLMs) are deep learning artificial neural networks with general-purpose language understanding and generation. They exploded in popularity after the 2017 Google paper, "Attention Is All You Need," and the underlying architecture is called the Transformer.

00:00:58.760 The impact of generative AI is that we are suddenly able to deploy commercially viable systems in just a few days. Previously, it would take about a month to collect data, three months to train the models, and another three months to deploy and optimize those models to run on hardware. Now, you can use APIs to get something into a proof of concept (POC) state or production state within a couple of days.

00:01:37.560 LLMs excel in various tasks. Some of the tasks they are particularly good at include structuring data, which means taking unstructured data and converting it into a structured format, such as returning JSON or YAML that developers can consume. They are also proficient in summarizing text; you could send them a short narrative and get a summary back. Other tasks include classifying data, translating languages, generating content, and answering questions in a chatbot style.
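The data-structuring task can be sketched in a few lines of Ruby. This is a minimal illustration, not code from the talk: `FAKE_LLM` is a hypothetical stand-in that returns a canned reply so the example runs offline, and `extract_contact` is a name invented here. A real application would replace the lambda with a call to an LLM provider.

```ruby
require "json"

# Hypothetical stand-in for a real LLM API call. It ignores the prompt and
# returns a canned reply so the sketch runs without network access.
FAKE_LLM = ->(_prompt) { '{"name": "Ada Lovelace", "email": "ada@example.com"}' }

# Asks the model for JSON only, then parses the reply into a Hash.
# Returns nil when the reply is not valid JSON, since model output
# is not guaranteed to be well formed.
def extract_contact(text, llm: FAKE_LLM)
  prompt = <<~PROMPT
    Extract the contact from the text below.
    Reply with JSON only, using the keys "name" and "email".

    Text: #{text}
  PROMPT

  JSON.parse(llm.call(prompt))
rescue JSON::ParserError
  nil
end

contact = extract_contact("Hi, this is Ada Lovelace, reach me at ada@example.com")
puts contact["name"]
```

Parsing inside a `rescue` is a deliberate choice here: the caller gets either a Hash or `nil`, and can decide whether to retry or fall back.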

00:02:38.840 I would like to present a vision of where I think this technology is headed. I don't think anyone truly knows where it's going, so I encourage you to consider it holistically, without getting tied up in implementation details. I believe that generative AI will become a part of every tech stack, just like databases, cache, encryption, queues, Lambda functions, storage, and protocols. Every tech stack is going to have a running process that will rely on AI for certain functions.

00:03:06.960 The firm The2 presents this vision where AI becomes a crucial part of application architecture, with developers building applications on top of these AI systems. For example, in every single project, we have "if-else" statements that require us to enumerate the different options. There is a finite number of options, so each input leads to a predetermined output. With AI, we have the flexibility to account for more ambiguity. For any given input, AI can choose the best-suited candidate among various outcomes, even for inputs it has never encountered before.
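To make the contrast concrete, here is a rough Ruby sketch of the two approaches side by side. The route names, the regular expressions, and the `fake_llm` lambda are all assumptions made for illustration; the lambda simply returns a fixed label in place of a real model call.

```ruby
# Fixed set of outcomes the application knows how to handle (hypothetical).
ROUTES = %w[billing shipping returns other].freeze

# Traditional approach: every recognised input has to be enumerated up front.
def route_with_rules(message)
  case message.downcase
  when /refund|return/ then "returns"
  when /invoice|charge/ then "billing"
  when /tracking|deliver/ then "shipping"
  else "other"
  end
end

# LLM approach: the model picks the best-suited candidate from the same list.
# The reply is validated against ROUTES, so an unexpected answer cannot leak
# into the rest of the application.
def route_with_llm(message, llm:)
  prompt = "Classify the message into exactly one of: #{ROUTES.join(', ')}.\n" \
           "Reply with the label only.\n\nMessage: #{message}"
  answer = llm.call(prompt).strip.downcase
  ROUTES.include?(answer) ? answer : "other"
end

# Hypothetical stand-in for a model call; a real model would read the prompt.
fake_llm = ->(_prompt) { "Returns\n" }

message = "The shoes arrived two sizes too small, what now?"
puts route_with_rules(message)               # no keyword matches this phrasing
puts route_with_llm(message, llm: fake_llm)  # model-chosen label, validated
```

The outcomes stay finite in both versions. What changes is that the mapping from input to outcome no longer has to be written out by hand.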

00:03:50.640 For example, an AI agent could write a file, approve a document, or text my partner. This has always been the promise of Rails to business owners: Rails teams should concentrate on writing business logic rather than reinventing engineering solutions. In the old world before AI, business logic was often scattered throughout fat models or service objects. In the new world, I propose that some business logic will reside in prompts.

00:04:29.880 In this example, imagine a simulated e-commerce store, where an AI agent orchestrates business logic and communicates between different systems the e-commerce store might connect with. Almost every commerce store connects to a payment gateway via an API, an inventory management system via another API, accounting systems, and shipping services. I propose that you could write out your standard operating procedures for your business, include that in the prompt, and have the AI orchestrate the execution of handling orders, processing returns, and managing the customer loyalty program.
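One way to picture "standard operating procedures in the prompt" is a small Ruby helper that turns written procedures into a system prompt. Everything below is a sketch under assumptions: the procedures, the 30-day window, and the `build_system_prompt` name are invented for illustration and do not come from the talk.

```ruby
# Hypothetical standard operating procedures, written in plain language.
SOPS = {
  "Handling orders" => [
    "Charge the payment gateway.",
    "Reserve stock in the inventory system.",
    "Book a shipment and email the tracking number."
  ],
  "Processing returns" => [
    "Confirm the order is within the 30-day return window.",
    "Issue a refund through the payment gateway.",
    "Restock the item in the inventory system."
  ]
}.freeze

# Renders each procedure as a titled, numbered list and joins them into
# one system prompt that an agent could be given.
def build_system_prompt(sops)
  sections = sops.map do |title, steps|
    numbered = steps.each_with_index.map { |step, i| "#{i + 1}. #{step}" }
    "#{title}:\n#{numbered.join("\n")}"
  end

  "You are the operations agent for an online store. " \
    "Follow these procedures:\n\n#{sections.join("\n\n")}"
end

puts build_system_prompt(SOPS)
```

Keeping the procedures in a data structure means the business can edit them without touching the orchestration code, which is the sense in which the business logic lives in the prompt.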

00:05:05.920 This takes us to AI agents. AI agents are autonomous or semi-autonomous general-purpose LLM-powered programs that can use tools, essentially APIs and integrations, through function calling. They work best with powerful LLMs, and OpenAI's GPT-4 is currently the leader in this space, although that changes frequently. These agents can be used to automate workflows in business processes.
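The tool-use loop behind such an agent can be sketched in plain Ruby. This is an illustration only: the tool names, the reply shape (`"tool"`, `"arguments"`, `"content"`), and the `ScriptedLLM` class are assumptions, and the scripted class replays fixed replies where a real agent would call a model that supports function calling.

```ruby
require "json"

# Hypothetical tools: plain lambdas standing in for real API integrations.
TOOLS = {
  "check_inventory" => ->(args) { { "sku" => args["sku"], "in_stock" => 3 } },
  "create_shipment" => ->(args) { { "sku" => args["sku"], "status" => "label_created" } }
}.freeze

# Scripted stand-in for an LLM: it replays a fixed list of replies in order.
class ScriptedLLM
  def initialize(replies)
    @replies = replies.dup
  end

  def call(_messages)
    @replies.shift
  end
end

# Asks the model what to do, runs the requested tool, feeds the result back,
# and repeats until the model answers with plain content.
# max_steps bounds the loop so a confused model cannot run forever.
def run_agent(llm, user_message, max_steps: 5)
  messages = [{ role: "user", content: user_message }]

  max_steps.times do
    reply = llm.call(messages)
    return reply["content"] unless reply["tool"]

    result = TOOLS.fetch(reply["tool"]).call(reply["arguments"])
    messages << { role: "tool", name: reply["tool"], content: JSON.generate(result) }
  end

  raise "agent did not finish within #{max_steps} steps"
end

llm = ScriptedLLM.new([
  { "tool" => "check_inventory", "arguments" => { "sku" => "TSHIRT-M" } },
  { "tool" => "create_shipment", "arguments" => { "sku" => "TSHIRT-M" } },
  { "content" => "Order shipped." }
])

puts run_agent(llm, "Ship one TSHIRT-M to the customer")
```

`TOOLS.fetch` raises a `KeyError` if the model asks for a tool that does not exist, so an invented tool name fails loudly instead of being silently ignored.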

00:05:44.160 Anyone who has ever attempted to build these agents knows they can be pretty unreliable. I think about this reliability in terms of the upper triangular shape: as the number of tasks or responsibilities assigned to these agents shrinks, their reliability increases. To make an agent more reliable, you can narrow down the set of responsibilities it will execute or the decision tree it follows.

00:06:19.600 Looking at the chart, the x-axis represents a focused agent—perhaps a single or dual task agent—on the left-hand side, while a general-purpose agent appears on the right. The upper left quadrant and bottom right quadrant could be at least proof of concept (POC) ready. If you achieve a general-purpose reliable agent, well, that would be Artificial General Intelligence (AGI).