
Summarized using AI

Ruby Batteries Included

Daniel Huckstep • January 28, 2020 • Earth

In the talk "Ruby Batteries Included," given at MountainWest RubyConf 2013, Daniel Huckstep explores the capabilities of Ruby's standard library. The focus is on using built-in functionality to reduce reliance on external gems and so simplify application development. Huckstep outlines the following key points:

  • Understanding Gems: Many applications pull in numerous gems, leading to unwieldy Gemfiles. Huckstep suggests cleaning up dependencies by using Ruby's standard library, which can reduce the number of gems an application needs.
  • Core vs. Standard Library: Ruby ships with the core (things like Enumerable and String, which are built into the language) and the standard library (libraries that must be explicitly required). A rough count puts the standard library at about 552 classes.
  • Essential Tools: The presentation highlights several useful components:

    • Set: A data structure that ensures uniqueness and offers quick inclusion checks.
    • Enumerable: An essential module that provides a suite of powerful iteration methods, simplifying operations on collections.
    • Lazy Evaluation: Discussed with examples of building lazy sequences with Enumerator, and of the built-in lazy enumeration that arrived in Ruby 2.0.
    • Delegators: SimpleDelegator wraps an object and forwards every method to it, while Forwardable exposes only the methods you explicitly delegate.
  • Performance Enhancements: Huckstep emphasizes the importance of benchmarking to measure the efficiency of code. Tools within the standard library allow for effective benchmarking, aiding in identifying performance bottlenecks.
  • Random Number Generation: SecureRandom is presented as a better alternative to the ordinary random number methods, especially in security contexts.
  • Testing with MiniTest: Ruby includes MiniTest for testing, so developers can validate code with minimal external dependencies.

Huckstep concludes by reinforcing that while RubyGems and external packages are valuable, often the standard library contains powerful tools that can suffice for building robust applications without the overhead of additional dependencies. His message encourages programmers to explore and utilize the rich functionalities already provided within Ruby, thus fostering better coding practices and application performance.


The Ruby standard library is full of great code. It's also full of dragons. I'll show you some of the fun parts, parts that you may not be using and may not even know about. I'll show you that you don't have to install everything on GitHub to build your application. I'll also look at some of the nasty parts, and how to put a training collar on some of those dragons.


MountainWest RubyConf 2013

00:00:26.250 And also to the fine folks at Confreaks who are recording all of this. I gave my mom the URL, so hi mom! Now I'm going to talk about Ruby's standard library, or as I like to call it, Ruby Batteries Included.
00:00:46.540 So, how many gems does it take to build your application? I know Fletcher down here is involved with Blue Box. How many gems does Blue Box use on a regular basis? Does anybody know how many gems are in your Gemfile? You open up your Gemfile and it's just pages and pages long. Oh, that's a lot of gems!
00:01:11.500 Well, you could look at the gems in the Gemfile that you explicitly depend on, and then there are the gems that they depend on, creating a big chain of dependencies. So this is a quick and dirty way to look through your Gemfile, and this is our app, Yardstick.
00:01:22.119 135 gems! There are like six for Resque, right? There's Resque, Resque namespace, retry, the scheduler, and like, you know, unique jobs and all this fun stuff. So do we really need 135 gems? Yeah, you need to connect to a database, and maybe you really don’t like ActiveRecord, but okay. We need to do some CoffeeScript, but there are probably a few that we can clean out. The standard library can actually do a pretty good job of getting you there. Maybe you have to write a little bit more code, but hey, we’re all programmers. Let’s write some code.
00:02:04.360 We’re going to look at the standard library, which is actually broken into two sections: there's the core, which includes things like Enumerable and String that are built into the language, and then there's the standard library, which includes things that you have to require; you don’t have to use them, but they’re packaged as part of the language distribution.
00:02:41.050 To tell them apart, we can use these handy images in the bottom right corner: the core gets The Core, a fantastic movie, and the standard library gets Ghostbusters. A quick and dirty way to figure out how many classes are in the standard library (I’m not sure how accurate it is) comes out to 552. Let’s get going; we’ve got a lot to cover.
00:03:06.360 We’re not going to cover all of them, but I’ll go through some basic things that you should probably be using on a day-to-day basis, maybe performance measurements to help improve performance, and then we’ll go beyond that into everything beyond the basics. Not all of this works in Ruby 1.8, so don’t be like Brian and say, 'Just use Ruby 1.8!' Where's Matz? He's happy about Ruby 1.9, so we should all use Ruby 1.9.
00:03:36.069 Let’s start with the basics. How many of you have used the Set data type? Everybody's probably used an array, shoved some values in, just called uniq on it, and thought, 'Yeah, we want unique values,' right? But we can use a set.
00:03:55.270 This comes with Ruby; it’s free! You can shovel things in like an array, iterate over it, but of course, it’s a set, so it enforces unique value constraints. If you try to add something that’s already in the set, it won’t go in twice. We can create a new set and I tried to put one item in four times, and of course it only went in once. Another advantage of a set is that inclusion checks—like if you check if an element is included—are very fast.
00:04:20.040 On the other hand, with an array, you've got to iterate through the whole thing. Has anybody used Set? I’ve used it a few times. Good stuff!
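A minimal sketch of the Set behaviour described above. It is not the speaker's slide; the method name `unique_visitors` and the sample values are hypothetical.

```ruby
require "set"

# Hypothetical helper: collects ids into a Set, so repeats are dropped.
def unique_visitors(ids)
  visitors = Set.new
  ids.each { |id| visitors << id } # shovel in like an array
  visitors
end

visitors = unique_visitors([1, 1, 1, 1, 2])
puts visitors.size        # 2: the four 1s went in only once
puts visitors.include?(2) # true, found by hash lookup rather than a scan
```

An array would need `uniq` to get the same contents, and `Array#include?` walks the elements one by one.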
00:04:35.070 The only caveat with Set is that you cannot rely on the iteration order. Starting from Ruby 1.9 and above, it preserves the hash insertion order, but I wouldn’t count on it. Speaking of iteration, I’m sure most of you are familiar with Enumerable—some of you are probably typing .each right now! Enumerable gives you all of those methods: .each, .map, .each_with_index, .reduce, .collect, and the list goes on.
00:05:16.180 The fun part is that you can implement Enumerable, or you can use the Enumerable module yourself. Here’s an example: it doesn’t do anything terribly exciting; imagine that it would change files, but basically, we include Enumerable. Some other stuff occurs, we shovel things into our files array, then I define the each method where I just iterate over the files and yield some stuff.
00:05:36.449 Then we can do other things with it. We can call each_with_index, wait a minute, and we can call .map. I would call .reduce too, but I didn’t define any of those methods; I just defined the each method and included the Enumerable module. All Enumerable needs is the each method, and it will implement all the other methods in terms of each.
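The idea can be sketched as follows. The class name `ChangedFiles` and the file names are assumptions made for illustration; only `each` is written by hand, and everything else comes from Enumerable.

```ruby
# Hypothetical collection of file names that gains Enumerable's methods.
class ChangedFiles
  include Enumerable

  def initialize
    @files = []
  end

  # Shovel a file name in; returns self so calls can be chained.
  def <<(file)
    @files << file
    self
  end

  # The only method Enumerable needs from us.
  def each
    return enum_for(:each) unless block_given?

    @files.each { |file| yield file }
  end
end

files = ChangedFiles.new
files << "README.md" << "lib/app.rb"

files.each_with_index { |file, i| puts "#{i}: #{file}" }
puts files.map(&:upcase).inspect
puts files.reduce(0) { |sum, file| sum + file.length } # 19
```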
00:06:03.520 This unsurprisingly works. I just reviewed a pull request in our codebase, Yardstick, where someone included Enumerable because they needed each_with_index. We get all this stuff for free. Sure, you could implement each_with_index, but why do that when Enumerable gives it to you for free? In Ruby 2.0, we got lazy enumeration.
00:06:29.550 You could implement the lazy stuff in Ruby 1.8 with an Enumerator. In fact, we have a file called lazy.rb in our app because it’s Ruby 1.8 and Rails 2.3. With an Enumerator, you can build your own enumerators on the fly. You can also create an enumerator from something that is enumerable, and it’ll work the same way.
00:06:53.090 In this example, we’re going to build all the natural numbers. We have a new enumerator, and the yielder is the thing that does the yielding. We can shovel things into that, and we’re going to use the naturals to shovel out just the odds. We’re going to do a really naive prime check and get all the primes. Then we can take the primes.
00:07:11.200 Because the enumerator is just an enumerable, I can take the first 10 primes without needing a big list of primes; it will generate them on the fly. So if you want to do something lazy, like you’ve got a big array of stuff, maybe don’t use .map and all that, because it’ll build an array each time. You can do it lazily.
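A sketch of the chain of enumerators described above, reconstructed from the description rather than from the slide. The method names are assumptions, and 2 is yielded separately because the primes are drawn from the odd numbers.

```ruby
# Infinite enumerator: values are produced only when something asks for them.
def naturals
  Enumerator.new do |yielder|
    n = 1
    loop do
      yielder << n
      n += 1
    end
  end
end

def odds
  Enumerator.new do |yielder|
    naturals.each { |n| yielder << n if n.odd? }
  end
end

# Really naive prime check: trial division up to the square root.
def naive_prime?(n)
  n > 1 && (2..Math.sqrt(n)).none? { |d| (n % d).zero? }
end

def primes
  Enumerator.new do |yielder|
    yielder << 2
    odds.each { |n| yielder << n if naive_prime?(n) }
  end
end

puts primes.take(10).inspect # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

# On Ruby 2.0 and later the same thing can be written with Enumerator::Lazy.
puts (1..Float::INFINITY).lazy.select { |n| naive_prime?(n) }.first(10).inspect
```

`take(10)` stops pulling from the enumerator after ten values, so the infinite loop inside `naturals` never runs away.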
00:07:30.629 In Ruby 2.0, it’s straightforward to do lazy evaluation, but if you’re still using an older Ruby, you can use an enumerator. Next, I want to mention SimpleDelegator. Everybody has probably read 7,000 blog posts about DCI, right? Data Context Interaction is where things don’t have the methods you want until they need to do something.
00:08:06.239 The problem with DCI is that when you use extend, you can blow the method cache, and everything becomes slow. SimpleDelegator can help you with that. You require 'delegate' because it’s in the standard library, as opposed to the core library. I made a fancy string, which just has another method on it that checks if it is, in fact, fancy. So now I have a fancy string with the string 'fancy', and of course, it responds to fancy?.
00:08:27.520 It also responds to length because it delegates all other methods down to the underlying object. You can treat it like a string; it also has this other method or whatever other methods you want on top of it.
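A minimal sketch of the fancy string described above. The class body is an assumption; the talk only says that the extra method checks whether the string is fancy.

```ruby
require "delegate"

# Wraps a String and adds one method; everything else is forwarded.
class FancyString < SimpleDelegator
  def fancy?
    __getobj__.include?("fancy") # __getobj__ is the wrapped object
  end
end

text = FancyString.new("fancy")
puts text.fancy? # true
puts text.length # 5, delegated to the underlying String
puts text.upcase # FANCY, also delegated
```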
00:09:05.500 The next is Forwardable, which goes the opposite way. We can require 'forwardable', and here, I’m going to use a struct. This will take a string, extend Forwardable, and then use def_delegator and def_delegators. def_delegator will delegate a single method while renaming it, allowing us to call fancy_length while keeping the original length method intact.
00:09:38.370 With def_delegators, we delegate multiple methods without renaming them. So we can make a new fancy string again; include? is going to work because we delegated it. fancy_length works too, but bytesize will not work, as we didn’t explicitly delegate it. It hides methods that you don’t want to expose.
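A sketch of the Forwardable version, assuming a struct named `FancyText` with a single `string` member. Which methods to delegate beyond `length` and `include?` is a guess.

```ruby
require "forwardable"

# Hypothetical struct that exposes only selected String methods.
FancyText = Struct.new(:string) do
  extend Forwardable

  # Delegate one method under a new name.
  def_delegator :string, :length, :fancy_length
  # Delegate several methods under their own names.
  def_delegators :string, :include?, :upcase
end

text = FancyText.new("fancy")
puts text.fancy_length             # 5, String#length under a new name
puts text.length                   # 1, Struct#length (member count) is untouched
puts text.include?("fan")          # true, delegated
puts text.respond_to?(:bytesize)   # false, never delegated
```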
00:10:11.670 For anybody who has used SimpleDelegator and Forwardable, I find it pretty useful. If you want something similar to Draper, you can quickly throw something together with SimpleDelegator just to add some sugar on top of your ActiveRecord models or something like that.
00:10:38.300 So that's the end of the basic stuff. These are all things I think you can use regularly. Like I say, everybody uses Enumerable, Set is pretty popular, and delegating with Forwardable is handy too.
00:11:16.850 Now, let's look at the performance stuff. This is the kind of stuff you can do if you build your own web framework. Benchmarking, everybody loves to benchmark their code, right? This package probably starts more arguments on Hacker News than Bitcoin.
00:11:52.910 This is in the standard library, as the Ghostbusters image down there shows. We can require 'benchmark' and do a simple measure that times a block of code. I puts the result, and it comes out as a string, but it's actually a result object.
00:12:10.110 You can call methods to get CPU time, system time, and user time. The important one is the wall clock time. It defaults to a string format similar to the output of the UNIX time command.
00:12:37.410 You can also benchmark separate blocks of code in a report format. If you want to compare two pieces of code, you can benchmark them. Standard is a little boring, so I’ll include some labels to enhance readability.
00:12:53.500 You probably should be doing a warm-up by running the code and then running it again. The warm-up ensures that the CPU cache is happy and that everything settles down. If you're profiling code, consider using New Relic or benchmarking for simple cases.
00:13:04.260 Maybe read some statistics books on how to benchmark code properly. With the bmbm method, you get a rehearsal run as a warm-up followed by the real report, and everything will look beautiful.
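A small sketch of the three Benchmark calls mentioned above. The workload (sorting a shuffled array) and the labels are placeholders, not the code from the talk.

```ruby
require "benchmark"

data = (1..50_000).to_a.shuffle

# measure returns a Benchmark::Tms object; printing it gives the
# user / system / total / (real) layout.
timing = Benchmark.measure { data.sort }
puts timing
puts timing.real # wall clock seconds as a Float

# bmbm runs a rehearsal pass first and then the reported pass.
Benchmark.bmbm(10) do |x|
  x.report("sort:")    { data.sort }
  x.report("sort_by:") { data.sort_by { |n| n } }
end
```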
00:13:25.360 You’ve found a method, and you discover that your Fibonacci method runs slowly because your app has a Fibonacci method. It’s that common issue—every app has a slow Fibonacci method.
00:13:44.370 You might write a standard, tail-recursive Fibonacci method. Ruby does not optimize tail recursion by default, so if you run it, it will explode the stack and you’ll be sad.
00:14:12.770 There’s a blog post suggesting how to fix this using RubyVM::InstructionSequence, available from Ruby 1.9 and up. You can set a compile option that turns on tail call optimization.
00:14:43.120 If you do this correctly, it will run almost instantly on your machine. You get the results you want without blowing up the stack.
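A hedged sketch of that approach, assuming MRI (RubyVM does not exist on other implementations) and a reasonably recent version. The method names are illustrative, and the blog post the speaker refers to may do it differently.

```ruby
# Turn the option on before compiling; it only affects code compiled afterwards.
RubyVM::InstructionSequence.compile_option = { tailcall_optimization: true }

# The method is compiled from a string so that the option applies to it.
RubyVM::InstructionSequence.new(<<~RUBY).eval
  def fib_tail(n, a, b)
    n.zero? ? a : fib_tail(n - 1, b, a + b)
  end

  def fib(n)
    fib_tail(n, 0, 1)
  end
RUBY

# Put the default back so the rest of the program compiles as usual.
RubyVM::InstructionSequence.compile_option = { tailcall_optimization: false }

puts fib(10)                 # 55
puts fib(50_000).to_s.length # finishes without a SystemStackError
```

Some older 1.9 and 2.x examples also pass `trace_instruction: false` alongside the tail call option.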
00:15:08.500 If you've optimized your Ruby code but it’s still not fast enough, you can link to a C library. You can write a C extension or use RubyInline.
00:15:29.940 If you just want to call a single function from a dynamic library, you can use Fiddle. It’s part of Ruby, so you can require 'fiddle' and keep your code simple.
00:15:50.930 You dynamically open a library, and if you’re on a Mac, you handle it differently than on Linux. You set up your function as a double and set up its arguments.
00:16:15.300 Using this method, you can call C functions directly from Ruby code, which is an easy way to accelerate specific tasks.
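A minimal sketch of calling a C function through Fiddle. It assumes a Unix-like system where the C math library is already loaded into the Ruby process, so the default handle can resolve `sqrt` without a platform-specific library path; the talk's example opens a named library instead.

```ruby
require "fiddle"

# Assumption: sqrt is resolvable through the process-wide default handle.
C_SQRT = Fiddle::Function.new(
  Fiddle::Handle::DEFAULT["sqrt"],
  [Fiddle::TYPE_DOUBLE], # argument types
  Fiddle::TYPE_DOUBLE    # return type
)

puts C_SQRT.call(16.0) # 4.0
```

To load a specific library, `Fiddle.dlopen` takes its file name, which is where the Mac and Linux differences mentioned above come in.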
00:16:41.900 Now, if your CPU usage is down and everything is running fast, but you’re heavy on memory, you might want to consider weak references. WeakRef has been pretty much fixed as of Ruby 2.0. It's especially useful if you're using The Ruby Racer.
00:17:07.970 With weakref, you can wrap a variable so that it can be garbage collected at will. You can use this alongside HTML content or temporarily heavy objects.
00:17:27.600 Just like this, we can check whether the object is still alive after a garbage collection run. If it's gone, you rebuild it, so you can treat this as a simple cache.
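One way to sketch such a cache, assuming a hypothetical `WeakCache` class that holds a single value. When the garbage collector actually reclaims the object is not predictable, so the example only shows the alive path.

```ruby
require "weakref"

# Hypothetical one-slot cache: keeps a value only as long as the GC allows.
class WeakCache
  def initialize
    @ref = nil
  end

  # Returns the cached object if it is still alive, otherwise rebuilds it
  # by calling the block.
  def fetch
    if @ref && @ref.weakref_alive?
      begin
        return @ref.__getobj__
      rescue WeakRef::RefError
        # Collected between the check and the use; fall through and rebuild.
      end
    end

    value = yield
    @ref = WeakRef.new(value)
    value
  end
end

cache = WeakCache.new
page = cache.fetch { "<html>big rendered page</html>" }
puts cache.fetch { "rebuilt" }.equal?(page) # true while `page` is still referenced
```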
00:17:49.390 That’s it for performance. Does any of that sound interesting? Everybody has used Benchmark, right? Don’t lie; you were posting on Hacker News yesterday!
00:18:11.600 Now let’s talk about building an application that needs random numbers. Ruby comes with SecureRandom, which is much better than plain rand, where you get predictable outputs.
00:18:41.370 Don’t try to build an array of A to Z and then take a random sample from it. Instead, use SecureRandom, which gives you random bytes, numbers, hex strings, and even UUIDs.
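A short sketch of the SecureRandom calls that cover those cases. The sizes are arbitrary, and `alphanumeric` needs Ruby 2.5 or later, so it postdates the talk.

```ruby
require "securerandom"

puts SecureRandom.hex(8)                      # 16 hexadecimal characters
puts SecureRandom.uuid                        # a version 4 UUID
puts SecureRandom.random_number(100)          # an Integer from 0 to 99
puts SecureRandom.random_bytes(16).bytesize   # 16
puts SecureRandom.alphanumeric(12)            # 12 characters from A-Z, a-z, 0-9
```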
00:19:07.300 Next, let’s make a random number service. We are not doing it over HTTP; instead, we’ll use TCP with GServer. It’s threaded and handles connections for us.
00:19:22.780 Just inherit from GServer and define a serve method to handle I/O objects. You read a few characters from the socket, check the command, and respond accordingly.
00:19:48.670 Now we can start this service. After running the server on a specific port, you can connect through telnet to get a hex value or generate a UUID.
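The `serve` method can be sketched independently of the server class. GServer has shipped as a separate gem since Ruby 2.2, so this example is not a GServer subclass: it is a plain object whose `serve` takes any IO-like object, exercised here with a StringIO standing in for the socket. The class name and the `hex` and `uuid` commands are assumptions.

```ruby
require "securerandom"
require "stringio"

# Hypothetical handler; in the talk this logic lives in a GServer subclass.
class RandomService
  # Reads one command line from the connection and writes one reply line.
  def serve(io)
    case io.gets.to_s.strip.downcase
    when "hex"  then io.puts SecureRandom.hex(16)
    when "uuid" then io.puts SecureRandom.uuid
    else             io.puts "unknown command"
    end
  end
end

connection = StringIO.new(+"uuid\n") # stand-in for a client socket
RandomService.new.serve(connection)
puts connection.string.lines.last
```

With the gserver gem installed, the same method body could go into a class that inherits from GServer, which then handles threads and connections.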
00:20:13.680 Now someone returns from a conference and says to integrate some Rust-generated randomness into our Ruby application. We can’t load Rust directly but use it through the shell.
00:20:35.170 Using Kernel.spawn will allow us to capture the IO of the command and retain high levels of control, which gives us great flexibility in building our service.
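A sketch of capturing a child process's output with spawn and a pipe. The helper name `capture_stdout` is hypothetical, and the Ruby interpreter itself stands in for the external random number program, which is not shown in the source.

```ruby
require "rbconfig"

# Runs a command given as separate arguments (no shell involved) and
# returns whatever it wrote to standard output.
def capture_stdout(*argv)
  reader, writer = IO.pipe
  pid = Process.spawn(*argv, out: writer)
  writer.close # the parent keeps only the read end
  output = reader.read
  Process.wait(pid)
  output
ensure
  reader&.close
end

puts capture_stdout(RbConfig.ruby, "-e", "print 6 * 7") # 42
```

Other keys in the options hash, such as `in:` and `err:`, redirect the remaining streams in the same way.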
00:21:07.200 If you shell out a lot in your app, you might want to look into posix-spawn.
00:21:37.960 We figured out ways to secure random generation based on already secure numbers. In some cases, you might need user input to pass data down.
00:21:59.930 Using Shellwords helps you handle user input safely, so the input can be passed along on a command line without introducing vulnerabilities.
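A minimal sketch of escaping user input with Shellwords. The command name `generate-random` and its `--bytes` flag are made up for illustration.

```ruby
require "shellwords"

# Builds a command line for a hypothetical external generator.
def random_command(user_input)
  "generate-random --bytes #{Shellwords.escape(user_input)}"
end

hostile = "8; rm -rf /"
command = random_command(hostile)
puts command

# The shell would see the input as one argument, not as a second command.
puts Shellwords.split(command).inspect
```

Passing the arguments to spawn as separate strings avoids the shell altogether, which sidesteps the problem when a full command line is not needed.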
00:22:27.420 We now have a TCP server serving random numbers generated by Rust while taking user input to ensure proper functionality.
00:23:10.340 We need to audit this service and store the generated numbers to avoid duplicates. For that, we will use 'PStore', which is thread-safe and easy to use.
00:23:33.190 PStore uses Marshal to serialize data and writes it to disk safely. Transactions are also supported, which makes operations atomic.
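A sketch of an audit log on top of PStore. The class name `AuditLog`, the `:numbers` key, and the file name are assumptions; the second argument to `PStore.new` turns on its thread-safe mode.

```ruby
require "pstore"
require "tmpdir"

# Hypothetical audit log that refuses to record the same value twice.
class AuditLog
  def initialize(path)
    @store = PStore.new(path, true) # true = thread-safe
  end

  # Returns true if the number was new and has been recorded.
  def record(number)
    @store.transaction do
      seen = (@store[:numbers] ||= [])
      next false if seen.include?(number)

      seen << number
      true
    end
  end

  # Read-only transaction: nothing is written back to disk.
  def all
    @store.transaction(true) { @store.fetch(:numbers, []) }
  end
end

Dir.mktmpdir do |dir|
  log = AuditLog.new(File.join(dir, "audit.pstore"))
  puts log.record("a1b2") # true
  puts log.record("a1b2") # false, already seen
  puts log.all.inspect    # ["a1b2"]
end
```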
00:24:03.070 Now, after testing and building, you can ensure that your app runs smoothly and doesn’t require external dependencies.
00:24:30.730 We haven’t even talked about testing! Don’t introduce too many dependencies that will require installations to run tests.
00:25:01.830 Fear not! MiniTest is included with Ruby and has a lot of functionality without external burdens. Use a simple 'rake test' command to check your code.
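A minimal test file in the style described, using the current `Minitest::Test` base class (at the time of the talk it was `MiniTest::Unit::TestCase`). The class name and the assertions are illustrative.

```ruby
require "minitest/autorun"
require "securerandom"

# Hypothetical tests for the random hex values the service hands out.
class RandomHexTest < Minitest::Test
  def test_hex_has_requested_length
    assert_equal 32, SecureRandom.hex(16).length
  end

  def test_hex_is_hexadecimal
    assert_match(/\A\h+\z/, SecureRandom.hex(16))
  end
end
```

Requiring `minitest/autorun` runs the tests when the file exits, so running the file with `ruby` is enough; a Rake task can wrap the same files.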
00:25:29.330 Now, let’s review the Beyond section. We've built a random number generating service with a NoSQL store, and our application is audited and tested.
00:25:56.620 In the standard library, we also have various other things such as thread-safe data structures, option parsers, and modules for handling HTTP and email.
00:26:25.830 While RubyGems and all the other package managers are great, sometimes you might find that the standard library has all the tools you need.
00:26:47.030 If you can write 20 lines of code to accomplish something instead of pulling in an external gem, maybe that’s the better path forward!
00:27:09.390 Ultimately, remember that with Ruby, you have access to bundles of powerful functionality. You don’t have to rely on external dependencies for everything. Thank you!