00:00:20.810
Hi everybody! My name is Ben Bleything, and my pronouns are he/him. I'm here to talk to you today about what I think is absolutely the most exciting aspect of modern application development: background processing.
00:00:33.210
Thank you! I also want to talk about serverless technology and how you can combine it with background processing to achieve interesting results. A little bit about me: I've been involved with Ruby for a long time, almost 15 years. I've worked on some pretty interesting applications, including GitHub, LivingSocial, and White Pages.
00:00:46.589
I've also been involved with smaller but equally fascinating projects in animation, financial services, and indie music licensing. Throughout my career, I've focused mostly on infrastructure, operations, and architecture.
00:01:03.270
Currently, I am a Developer Advocate at Google, where my job includes thinking about how to modernize our systems and architectures to make them better for both developers and users by adopting new technologies. By 'new,' I don’t necessarily mean emerging technologies, but rather those that are new to us. My goal is to help improve your development experience.
00:01:46.800
Before we dive in, I want to make a few disclaimers. First of all, this is not a sales pitch. I'm not here to convince you to adopt any specific technology or to use Google Cloud Platform (GCP). You can use whatever works for you, and if you already have a cloud provider, that's great! Ultimately, I don't want you to feel pressured to switch to anything if you're already happy with what you have.
00:02:21.720
The truth is that you probably don't need serverless technologies. Still, I think they're exciting and hold potential for innovation, and exploring them as a community will reveal interesting ways to use them to enhance our applications.
00:03:00.280
My aim today is to share insights gained during my research for this talk, which may inspire you to experiment on your own or save you some time. With that introductory material out of the way, let’s discuss background processing. I suspect that most of you are familiar with this concept.
00:03:31.100
This is when you have tasks that take too long during a request cycle. You want to move those tasks out of the main processing flow to avoid timing out for your users or creating a poor experience. For example, think of building a new YouTube-like platform where users upload videos. You certainly don't want users to wait through the transcoding process.
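A minimal sketch of that idea, using only Ruby's standard library rather than any particular job framework: the request path enqueues work and returns at once, while a worker thread does the slow part. The `BackgroundQueue` class and the fake transcoding job are hypothetical names made up for this illustration.

```ruby
# Hypothetical in-process job queue: one worker thread pulls jobs off a Queue.
class BackgroundQueue
  def initialize(&job)
    @jobs = Queue.new
    @results = Queue.new
    @worker = Thread.new do
      # Queue#pop blocks until a job arrives; a nil sentinel ends the loop.
      while (item = @jobs.pop)
        @results << job.call(item)
      end
    end
  end

  # Called from the request path: returns immediately, the work happens later.
  def enqueue(item)
    @jobs << item
    :accepted
  end

  # Stop the worker after pending jobs finish and collect their results.
  def drain
    @jobs << nil
    @worker.join
    finished = []
    finished << @results.pop until @results.empty?
    finished
  end
end

# Stand-in for slow work such as transcoding (the sleep simulates the delay).
transcoder = BackgroundQueue.new do |video_id|
  sleep 0.05
  "video-#{video_id}.mp4"
end

puts transcoder.enqueue(42) # the "request" is done before transcoding finishes
puts transcoder.drain.inspect
```

A real job system would persist jobs outside the web process so they survive restarts; this sketch only shows the shape of the hand-off.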
00:04:10.910
I apologize; I developed a cough this morning, so I'll be hitting the water a lot and potentially muting myself to cough. Anyway, with background processing, most folks are likely familiar with tools like Sidekiq or Resque, and there are many others like Delayed Job, Sneakers, Backburner, and Sucker Punch. It's a common practice in modern application development.
00:04:59.990
Just out of curiosity, how many of you have used something like that before? Good, I thought so. Is anyone using what they would consider serverless solutions for background processing right now? I'd love to talk to you later, as it's possible I might say something you disagree with, and I want to learn from you.
00:06:02.090
What is serverless? I came into this discussion with a somewhat vague understanding, having been in this field for a while. I asked my friends for a tweet-sized definition of serverless, and I received some enlightening responses. Jason Watkins, an experienced Rubyist from Portland, said, 'Function as a service works fine for me,' and I believe that's a common sentiment.
00:06:55.889
Asha mentioned that serverless typically operates by charging for every unit of compute or instance used, and you don’t pay when nothing is being executed. At Google, we refer to this as 'scale to zero.'
00:07:42.769
Before cloud services emerged, you had to buy hardware in advance, which often required significant planning. For example, a year and a half ago, I had to order a quarter-million-dollar setup for a client, which took us nine months to obtain due to a global SSD shortage.
00:08:16.859
But the advent of cloud services like EC2 changed that, allowing resources to be provisioned on-demand. Serverless takes this a step further; rather than paying for unused capacity, you only pay for what you actually use.
00:09:11.009
While VMs require you to specify how many CPU cores, how much RAM, and storage beforehand, serverless means that if no one is using the function, you’re not charged. You can dynamically scale to meet demand and ignore the complexity of capacity planning.
00:09:31.699
My colleague Sandeep said serverless implies no manual scaling or provisioning. We’ve had auto-scaling for a while at VM and Kubernetes levels, but serverless reduces your operational workload significantly. Although I come from an operations background, it’s worth noting that while operations responsibilities might theoretically decrease, they’re often just pushed to the cloud provider.
00:10:05.490
As a result, while your teams won't handle certain operational tasks, you're outsourcing them to providers that take care of managing functions for you. Serverless can indeed reduce overhead, and I think it can benefit development teams at various levels.
00:10:52.160
To clarify, the concept of Functions as a Service (FaaS) is where you take a small chunk of code and run it on a serverless framework, which can receive HTTP requests or respond to events triggered by your application. Most providers offer an array of event-driven triggers. This is an area with immense potential.
00:11:43.880
Now let's consider a basic example of a webhook handler. Imagine you decide to create a continuous integration (CI) tool that interacts with GitHub webhooks to build artifacts. In a simple flow, GitHub sends a webhook to your Rails application, which queues that job in Sidekiq.
00:12:16.750
However, if you want to make this serverless, you could use the aforementioned HTTP method and leverage AWS API Gateway to manage incoming requests and trigger functions running on AWS Lambda. The idea of an API Gateway is that it helps to standardize incoming webhooks, allowing you to route the request directly to a function.
00:13:02.540
This means you don’t need a full Rails app to handle webhooks; just a function that responds to events. HTTP has become the standard transport in our industry, so if you need to get into serverless, using an API Gateway is a low barrier to entry.
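As a rough sketch of what such a function might look like in Ruby: the `handler(event:, context:)` shape follows the convention AWS Lambda uses for Ruby, and the event is assumed to carry `headers` and `body` keys as in an API Gateway proxy request. The `start_build` helper, the `WEBHOOK_SECRET` variable, and the choice of status codes are hypothetical.

```ruby
require "json"
require "openssl"

# Hypothetical shared secret; in practice it would come from configuration.
WEBHOOK_SECRET = ENV.fetch("WEBHOOK_SECRET", "dev-secret")

# Compare two strings without stopping at the first differing byte.
def secure_compare(a, b)
  return false unless a.bytesize == b.bytesize
  a.bytes.zip(b.bytes).reduce(0) { |diff, (x, y)| diff | (x ^ y) }.zero?
end

# GitHub signs the raw body with HMAC-SHA256 and sends "sha256=<hex digest>".
def valid_signature?(body, signature)
  expected = "sha256=" + OpenSSL::HMAC.hexdigest("SHA256", WEBHOOK_SECRET, body)
  secure_compare(signature.to_s, expected)
end

# Hypothetical stand-in for kicking off a CI build.
def start_build(payload)
  { "repo" => payload.dig("repository", "full_name"), "ref" => payload["ref"] }
end

def handler(event:, context: nil)
  # Header casing varies between gateways, so normalize before lookup.
  headers = (event["headers"] || {}).transform_keys(&:downcase)
  body = event["body"].to_s

  unless valid_signature?(body, headers["x-hub-signature-256"])
    return { statusCode: 401, body: "invalid signature" }
  end
  # Acknowledge but ignore events this sketch does not build on.
  return { statusCode: 204, body: "" } unless headers["x-github-event"] == "push"

  { statusCode: 202, body: JSON.generate(start_build(JSON.parse(body))) }
end
```

Checking the signature first keeps unauthenticated callers from triggering builds, which matters more once the endpoint is a public function URL rather than a route inside your app.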
00:13:43.800
Now, you might wonder what the real advantages are for this setup. Using serverless can simplify your codebase, and there are also benefits to separating this work from your current infrastructure.
00:14:10.350
In addition, handling background tasks serverlessly enables you to scale more easily. For instance, if your popularity spikes and you notice increased demand, serverless solutions can automatically scale to match your usage.
00:14:56.690
Moreover, decoupling your processing code can lead to a cleaner architecture, especially if you choose to integrate it with diverse tech stacks. For instance, if your main application is built with Rails but you need to run machine learning tasks, you might want to create Python functions instead.
00:15:27.430
However, you should be mindful of potential downsides. Serverless systems can sometimes be opaque, and with increased complexity comes additional moving parts that require careful monitoring. When you're not running the core processing in your application, it's essential to ensure everything is functioning correctly.
00:15:57.120
Now, let’s move to an example involving text processing, such as a new social network. Users will post updates, so you’ll need to index those for searchability, implement spam filtering, and analytics to track user interactions. When you receive those updates, you must decide whether to handle them synchronously during the request cycle or offload them to background processing.
00:16:30.030
If you opt for background processing, you could use tools to simultaneously handle database interactions and save data in Elasticsearch. Although synchronous processing can work for low-volume applications, scaling these operations can become challenging.
00:17:15.770
If your Rails app triggers background processing, it can pull data from the database, process it, and write back to your database or Elasticsearch, significantly speeding up your request times.
00:17:59.090
So how about serverless processing for these types of tasks? You could take a message queue approach, where the Rails app pushes a message to a queue that subsequently triggers functions.
00:18:42.500
For example, Google Cloud offers Pub/Sub that can manage this messaging. Your processing tasks such as spam filtering, indexing, or performing analytics can run simultaneously in the cloud. This decoupling through messaging can help systems scale effectively in case of high user demand.
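To make the fan-out concrete, here is a small in-memory sketch. It is not the Pub/Sub client API; the `Topic` class is a hypothetical stand-in that delivers each published message to every subscriber, with each subscriber playing the role of one function.

```ruby
require "json"

# Hypothetical in-memory topic: every subscriber receives every message.
class Topic
  def initialize
    @subscribers = {}
  end

  def subscribe(name, &handler)
    @subscribers[name] = handler
  end

  # Serialize once, then hand each subscriber its own parsed copy in its own
  # thread, roughly as separate functions would each receive the message.
  def publish(message)
    data = JSON.generate(message)
    threads = @subscribers.map do |name, handler|
      Thread.new { [name, handler.call(JSON.parse(data))] }
    end
    threads.map(&:value).to_h
  end
end

posts = Topic.new
posts.subscribe(:spam_filter) { |post| post["text"].match?(/free money/i) ? "spam" : "ok" }
posts.subscribe(:indexer)     { |post| post["text"].downcase.scan(/\w+/).uniq }
posts.subscribe(:analytics)   { |post| { "author" => post["author"], "length" => post["text"].length } }

# The Rails app only publishes; it does not know who is listening.
results = posts.publish("author" => "ben", "text" => "Hello hello world")
puts results.inspect
```

The point of the design is that adding a fourth consumer means adding a subscriber, not changing the publishing code.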
00:19:21.960
Moreover, if you suddenly gain millions of users who generate significant traffic, serverless infrastructures can handle this demand automatically. You can focus on improving your core application without worrying about capacities.
00:20:04.300
Yet it’s essential to remain aware of the limitations and intricacies of your setup for proper management. If you are utilizing messaging queues, either for processing or integrating with external systems, the value is derived from reduced complexity.
00:20:42.330
Let's take a scenario involving uploading a video to a Rails application, such as a new YouTube. Initially, the uploaded file could go into an object store like S3 or Google Cloud Storage. While this handles basic file upload management, transcoding the file into a suitable format afterwards relates well to background tasks.
00:21:29.110
Using background processing allows you to extract metadata, use event-driven functions to trigger video processing workflows, and manage these tasks without hindering the user experience.
00:22:09.510
For instance, when a file upload completes to a cloud storage service, it can signal your functions to execute transcoding, notify users, or perform additional tasks, while still following best practices to avoid loops and duplicative processing.
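One way the loop problem can show up: if the function writes its output back to the same bucket, that write fires the trigger again. A hedged sketch of a guard, assuming the event carries the object's name under a `"name"` key and that transcoded output goes under a hypothetical `transcoded/` prefix:

```ruby
# Hypothetical prefix where this function writes its own output.
OUTPUT_PREFIX = "transcoded/"

# Decide what to do with a storage "object finalized" style event.
def handle_upload_event(event)
  name = event.fetch("name")

  # Our own output landing in the bucket would otherwise retrigger us forever.
  return { action: :skipped, reason: "already transcoded" } if name.start_with?(OUTPUT_PREFIX)
  return { action: :skipped, reason: "not a video" } unless name.match?(/\.(mov|mp4|avi)\z/i)

  output = OUTPUT_PREFIX + File.basename(name, ".*") + ".mp4"
  { action: :transcode, input: name, output: output }
end

puts handle_upload_event("name" => "uploads/cat.mov").inspect
puts handle_upload_event("name" => "transcoded/cat.mp4").inspect
```

Writing output to a separate bucket avoids the problem entirely; the prefix check is the fallback when a single bucket is used.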
00:22:59.160
Video transcoding can be computationally intensive, which justifies the use of serverless due to elastic capabilities. Ensuring you have access to sufficient resources is critical to maintain performance.
00:23:36.900
To clarify, serverless solutions enable you to leverage cloud-managed services, event architectures, and third-party integrations exposed to messaging infrastructures, helping you focus on your user-centric features.
00:24:14.170
As previously referenced through text processing and video transcoding examples, serverless technology can provide significant advantages like streamlined operations over traditional setups, all while scaling easily.
00:25:05.420
I want to remind you about important considerations. For instance, while the cloud providers allow automatic scaling, they do impose limits on simultaneous executions. This prevents one malicious user from overwhelming the system and incurring immense costs.
00:25:53.380
Engaging in capacity planning exercises will unveil aspects of operational design that developers may not consider regularly. Part of this initiative involves understanding the nuances and semantics of the systems and services in use.
00:26:40.290
Speaking of messaging systems, another aspect to keep in mind is potential duplicate messages. Understand that most cloud providers offer at-least-once delivery, and they do not guarantee order. So it might be beneficial to implement deduplication logic once you gain familiarity.
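A minimal sketch of deduplication, assuming each message carries a unique `:id`. The `IdempotentConsumer` class is hypothetical, and the in-memory `Set` stands in for what would need to be a shared store (a database or cache) once more than one function instance is running.

```ruby
require "set"

# Hypothetical consumer that ignores message IDs it has already handled.
class IdempotentConsumer
  def initialize(&handler)
    @handler = handler
    @seen = Set.new
    @lock = Mutex.new
  end

  def receive(message)
    # Set#add? returns nil when the ID was already present.
    first_time = @lock.synchronize { @seen.add?(message.fetch(:id)) }
    return :duplicate unless first_time

    @handler.call(message)
    :processed
  end
end

sent = []
consumer = IdempotentConsumer.new { |message| sent << message[:body] }

# At-least-once delivery: the same message may arrive twice.
deliveries = [
  { id: "a1", body: "welcome" },
  { id: "b2", body: "digest" },
  { id: "a1", body: "welcome" }
]
outcomes = deliveries.map { |message| consumer.receive(message) }
puts outcomes.inspect
puts sent.inspect
```

This version records the ID before running the handler, so a handler that fails will not be retried; whether to mark before or after the work is a trade-off between duplicate effects and lost messages.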
00:27:29.220
Being cautious will help prevent repeating the same messages or potentially unintended actions in your applications, ensuring functional correctness.
00:28:17.979
Moreover, added complexity brings a need for monitoring. From logging to alerting and operational dashboards, all systems should be monitored closely so you can confirm everything operates as intended and address issues promptly.
00:29:00.739
As we wrap up, if you want to learn more about serverless technology, I recommend exploring the product landing pages of major providers like AWS, Azure, and Google Cloud Platform. They contain a wealth of information on their offerings.
00:29:50.379
On the left side, you'll discover the main products, namely AWS Lambda, Azure Functions, and Google Cloud Functions. On the right, several open-source projects provide similar functionality, including Apache OpenWhisk and Oracle’s Fn.
00:30:40.650
Additionally, Fission.io serves as one of numerous frameworks that put Functions as a Service on top of Kubernetes.
00:31:38.309
Thank you all so much for attending! You can find me online; I’m Bleything everywhere. Feel free to reach out via GitHub, Twitter, or my website. I’ll be at the Google booth in the exhibition hall for the rest of the day, so please drop by to chat.
00:32:00.000
I especially want to hear from you if you are working with some of these technologies, and I’d love to know where I’m getting it wrong. Thanks again!