Ruby for Home-Ec

00:00:15.000 Okay, let's get started. Is everybody doing well?

00:00:21.250 It's very bright on stage, so I can't see any of you. I will have plenty of time left for questions at the end, so I encourage everyone to take notes and ask a question if you have one.

00:00:27.610 I'm Adam Forsyth, a software engineer and community lead at Braintree. This talk is titled 'Ruby for Home-Ec.' After the talk, I'll be posting these slides, including my speaker notes, on GitHub.

00:00:34.810 So, let's talk about what Home-Ec is for a second. This is not my picture of Home-Ec; it's Wikipedia's picture. It's a bit of a stereotype—a 1950s kitchen scene. Home-Ec is defined as the profession and field of study that deals with the management of the home and community.

00:00:52.540 I'm going to walk through two problems that show you how to handle tasks around the house more effectively, and hopefully remind you that programming can help you find better solutions to everyday problems.

00:01:07.810 This is Jonathan—he is my coworker. In addition to being a software engineer, he's also a farmer and a do-it-yourselfer. He inspired this talk, as both of the examples I will provide today are based on real problems he faced around the house and farm last fall.

00:01:39.310 The first problem I'll discuss arose when he attempted to create a countertop. This is a real picture of the countertop, not a staged one. Butcher block countertops like this are made up of many small pieces of wood laid alongside each other. This particular countertop is L-shaped and joined along a diagonal, meaning that every single piece is a different length.

00:02:02.619 Jonathan's challenge was to cut pieces of the right lengths out of the scrap wood he already had, doing so as efficiently as possible without waste. We will refer to the initial pieces of wood as 'boards,' and the final desired pieces as 'cuts.' In this example, we have a seven-foot board and a five-foot board, from which we need to make 24-foot cuts, a three-foot cut, and a one-foot cut.

00:02:29.770 If we make the three-foot cut first, we'll only have enough space left for one of the 24-foot cuts, resulting in one leftover cut that we cannot make. However, if we make the four-foot cuts first and organize them correctly, we're able to make all the required cuts from those two boards. Let's figure out an algorithm to do this effectively, especially considering that Jonathan had many more cuts to make.

00:03:06.850 Our goal is to find places for the largest cuts first, as these are the hardest to accommodate. Thus, we will start from the biggest cut and attempt to fit each cut into the available boards. For example, we could make one of the four-foot cuts first, then another, followed by the three-foot cut, and perceptively, this leaves enough space for the one-foot cut.

00:03:24.610 Returning to the countertop example, instead of just four cuts, there were actually 64 cuts to be made. Jonathan had 35 boards of three different lengths. The number of different lengths is significant, and I'll explain why later. An important detail is that every time a cut is made, we lose some length due to the saw blade's width, known as the kerf. We need to account for this loss every time we cut.

00:03:39.310 These details about the number of boards, the different lengths, and the kerf are why we can't solve this problem by hand—we need some code to assist us. This is the code we'll be using; it consists of a little more than 20 lines. Please don't try to absorb it all at once; we will walk through it piece by piece. Even then, don't worry about catching every detail—you can always return to the video or experiment with the code on GitHub later.

00:04:04.810 We will call this method 'greedy approximation' because we want to make cuts as efficiently as possible, being greedy with our cuts. For instance, we made the four-foot cuts first before moving on to the shorter ones. This method takes as input the lengths of cuts we want, the lengths of the boards we have, and the amount we lose at each cut—the kerf. We sort the cuts in reverse order to prioritize the longest ones.

00:04:57.910 We then add some metadata for the cuts, noting whether we found a place to make that particular cut. For the boards, we track how much of the board we have used so far and the specific cuts we were able to make from each board. This will ultimately give us a list indicating which cuts to make from which boards so that we can proceed to our saw and execute them.

00:05:27.940 As we process the cuts from longest to shortest, we try to make each cut from all available boards. For the first cut, we don't need to account for the kerf since no other cuts have been made yet. For subsequent cuts, we must include the kerf, so if any length has already been used from the board, the kerf is added to the total length required for the cut. If sufficient length remains on the board, we will make the cut and record that we found a viable spot.

00:06:11.260 We also add the used length (including kerf if necessary) to the account of how much of the board has been utilized and record the specific length of the cut made from this board. Once we find a place for the cut, we break out of the inner loop; there's no need to search for additional boards for this cut. We've found a suitable spot, so we proceed to the next cut.

00:06:48.120 You may have noticed that I skipped a detail here—we sorted the cuts, but we didn't sort the boards. The order of the boards affects our search, which matters significantly in this instance. For our real input set, if you sort the boards from smallest to largest, you'll end up with 17 cuts that cannot be made. Conversely, sorting them from largest to smallest allows you to make all the cuts except for one. Thus, that specific ordering isn't intrinsic to the problem—it's merely a feature of the particular lengths in our input set.

00:07:27.490 We're dealing with a real-world problem here, and it's okay if the solution isn't perfect. It's acceptable for us to create one of the countertop pieces from two smaller cuts because, as I said, one cut could not be made as desired. Although we can fit most cuts with these boards, we only have three distinct board lengths, making it relatively simple to evaluate all conceivable orders of these lengths.

00:08:03.340 We tried the shortest, middle, and longest combinations, but this still left 17 cuts unmade. We then reversed the order, starting with the longest, followed by the middle and shortest, and encountered just one unmade cut again. Eventually, if you start with the least common lengths, then the most common, you can find a combination that allows you to make all cuts—a good reminder that sometimes manual adjustments and understanding the structure of a problem can lead to effective solutions.

00:08:37.000 Now, let's discuss the theory behind this. This problem reflects a classic computer science challenge known as the knapsack problem. In this scenario, you have a bag that can hold a particular weight and you want to determine which items, having different weights and values, can best fit into the bag without exceeding this limit. The goal is to maximize the value packed within the weight constraint.

00:09:12.730 If we were to adopt our specific algorithm for this problem, we would encode both the value and weight as a ratio. This allows us to apply the greedy approximation algorithm, which would yield results generally around 50% efficiency. While this method doesn't always provide the best answer, it's a quick and easy method for approximating a knapsack problem and often results in a sufficiently good solution.

00:09:58.820 There are various versions of the knapsack problem; one well-known variant is the subset sum problem. In this case, the values and weights correspond with each other, and rather than fitting everything in, you must hit a specific target amount. In a comic strip example, a group of people wishes to order exactly fifteen dollars and five cents worth of appetizers, and the waiter must decide how to select food items to meet that exact price. This subproblem also presents a potential oversight in the comic, as there are actually two distinct ways to achieve the target sum.

00:10:43.850 When multiple knapsacks with equivalent value and weight need to be addressed, this is referred to as the stock cutting problem. Typically visualized in two dimensions, this could involve a piece of paper or sheet metal from which you wish to cut smaller pieces. That said, this problem can also exist one-dimensionally, similar to our board scenario, demonstrating that many practical implications of cutting stock are often addressed by industry while striving for maximum efficiency with proprietary software tools.

00:11:44.730 A noteworthy derivation of this problem is the guillotine stock cutting problem, where every cut has to extend across the material entirely. This introduces additional constraints, making the problem more complex, but also simplifying the machines used in solving it. So why does this matter to us as software developers? Let's imagine again the previous scenario—this is the same illustration I showed earlier, now labeled as servers and VMs instead of boards and cuts.

00:12:34.750 Consider a situation where you possess multiple physical servers, each with a specific amount of RAM. You also have a collection of VMs to run on these servers, with varying RAM requirements for their tasks. Efficiently allocating these VMs to servers ensures that you maximize server capacity while minimizing wasted resources, leading to more effective usage of your hardware. This scenario aligns closely with the stock cutting problem, albeit presented in multidimensional aspects: CPU, RAM, hard disk space, input/output speed requirements, and more.

00:13:10.650 If you wish to explore more about these problems, I know it might sound cliché, but Wikipedia is an outstanding resource. It's accurate, easy to navigate, and provides numerous references to the various subtypes of these problems. So that's our first example. The second problem Jonathan encountered involved his pantry. Unfortunately, unlike the countertop image, this is not a representation of his pantry.

00:13:56.610 However, reflecting reality, there are many different kinds of jars in this image, with varying quantities of each type. Jonathan, being a farmer, preserves various products and his family consumes them all winter and spring until fresh produce returns. Due to the repetitiveness of his family's meals, he wishes to sort these jars to ensure maximum variety throughout the winter months.

00:14:44.670 Though this seems relatively simple, it becomes more complex when you consider the actual distributions. If you have about the same number of each type of jar, the problem is straightforward. Let’s say you have three red jars, three blue jars, and three green jars; you can easily space them out evenly for variety. However, if you have an abundance of one type of jar, say red jars, and you attempt to arrange them equally, you may find several common jars together, leading to complaints from your kids about eating pickles two days in a row.

00:15:32.720 To avoid this scenario where common jars are bunched together, we need a solution that places the most frequently occurring jars as distantly from each other as possible throughout the winter. By spacing out the red jars evenly, we can ensure that no color appears consecutively; while blue jars might not be perfectly arranged, they can still achieve a reasonable distance.

00:16:04.540 Initially, I considered spacing the most common jars equally apart, then inserting the next most common jar type into the gaps, and finally filling in the less common ones. However, this approach often leads to the least common jars being clustered together in leftover spaces, ultimately resulting in an unsatisfactory assortment.

00:16:50.230 So, how can we resolve this? Instead of placing jars where they should go and filling in gaps, what if we started by arranging the jars on the shelf and then spaced them out incrementally? If we begin with the most common jars, they must all move the same distance uniformly, or else the gaps will be uneven. This inconsistency results in the most common jar being poorly spaced.

00:17:13.420 Conversely, if we reverse our approach and start with the least common jars and move to the most common, we can ensure that the most common jars remain evenly spaced, as we place them last. Thus, when the least common jars are arranged, the layout might appear slightly biased but will overall maintain reasonable spacing, leading to a better variety.

00:17:59.880 Let’s look at the code. This code is even shorter than in the previous problem. The main challenge was not writing it but conceptualizing the solution. We'll start with Jonathan's actual pantry input, represented as a dictionary mapping jar names to quantities.

00:18:18.110 We begin by transforming the dictionary into an array sorted by the quantity of jars, placing the least common first and most common last. Next, we create an empty array representing our shelf. As we begin placing jars on the shelf, we decide how many jars there will be between each jar added.

00:18:51.830 If, for example, we have ten jars on the shelf, that yields nine gaps between them. For each new jar added, we examine how many jars are on the shelf. Starting from zero, they will be closely clustered together, but as we add more jars, they will spread further apart due to the increasing number of jars.

00:19:35.450 Next, we will place the jars on the shelf. If we have ten jars, we repeat this process ten times. At each step, we calculate the shelf position for the jar, multiplying the spacing for its index, making sure to round to an integer since our spacing will often yield fractions.

00:19:53.150 Additionally, we must factor in how many jars we have already placed—when insertions occur in an array, all items to the right will shift accordingly. Neglecting this would misplace the subsequent jars, so it’s crucial to keep track of indexes.

00:20:23.740 You might have also noted a challenge here, similar to prior instances: in Ruby, dividing two integers will only yield integer results. For example, dividing 5 by 2 returns 2 instead of 2.5. This could introduce rounding errors in our spacing. To resolve this, Ruby includes a library called MathN that changes how mathematical functions operate.

00:20:47.660 By utilizing the MathN library, we maintain enough precision in our calculations so that when we perform multiplication later on, rounding errors are minimized, allowing for accurate spacing as we required. However, this solution works well for a small volume of jars; if we were to scale up significantly—say to 100,000 or 10 million jars—the performance would greatly degrade.

00:21:35.350 With a linked list, we could mitigate insertion times, but considering we only have approximately 100 items in our pantry, this isn't a critical concern. Let's step back and analyze this type of problem generally. We aim to optimize spacing for jars to maintain variety, giving priority to the more common jars, which leads to approximation methods like the one used.

00:21:58.910 However, this specific method is not universally applicable across all optimization problems—it serves only this type or closely akin scenarios. There are more generalized approximation techniques available, and I am going to discuss one of those today—simulated annealing.

00:22:36.370 Simulated annealing involves heating a material, such as metal or glass, and then cooling it down at specific rates and temperatures to achieve a more uniform structure and minimize energy levels. The key concept is that by heating up the material, it can explore a wider variety of structures, then as it cools, it settles into a stable, low-energy state.

00:23:13.920 The accompanying animation illustrates this process. On the y-axis, you see uniformity. The top point represents the most uniform structure at a low energy state. As the temperature is high, the metal structure remains variable, but as the temperature drops, fewer structural changes occur, leading to greater uniformity and eventual lower energy levels.

00:23:50.300 While many approximation methods may only reach a local maxima or minima, this approach can enable finding a global optimum due to the randomized movement throughout the solution space. Simulated annealing can effectively help us space our jars evenly, dissipating energy through their arrangement, which can also relate to hardware design.

00:24:26.210 When designing a circuit board, for instance, you'd seek to minimize the board's area while keeping the wire lengths as short as possible. You may integrate these factors into your criteria, using simulated annealing to optimize the design. Recently, I came across an intriguing software example utilizing simulated annealing. This project, which uses public domain images, involves a JavaScript application that shuffles image columns randomly.

00:25:09.370 Through simulated annealing, it then successfully rearranges them back. Please check it out; it's a fun interactive experience that allows adjustment of parameters like iterations and starting temperatures to observe impacts on the process.

00:25:55.170 The animation demonstrates that after shuffling, the columns begin to reorder, gradually bringing back the original image's recognizable elements. This example showcases how simulated annealing can be employed effectively in areas like image processing and data organization.

00:26:39.180 Considering the vast data available, brute force is impractical, but with approximation methods like this, satisfactory outcomes can be obtained efficiently. Regardless of the situations, one could take the final outputs of the shuffled image and manually arrange the strips back into the original order, appreciating how such methods can expedite tedious tasks.

00:27:06.140 Returning to our discussion on resources, I encourage everyone to explore Wikipedia for more insight into simulated annealing. It offers a solid overview of applying it to various problems, providing a great foundation for your learning.

00:27:40.160 Through these examples, I hope to have illustrated how programming can tackle everyday tasks, like optimizing food storage or designing items in your kitchen, while also imparting knowledge on problem-solving and algorithm creation.

00:28:15.880 For some, devising algorithms may be routine, but for many, it lies outside the norm. Relating these ordinary situations back to theoretical problems and approximation methods highlights the opportunity for us to learn and apply such techniques in our daily work.

00:28:51.640 Now, I will switch to the code and execute these programs to demonstrate their practical output and show that we can apply these concepts in reality. One moment, please.

00:29:07.100 Here’s our first problem. On the left, you can observe our regular method—the greedy approximation—along with a main method that invokes it with our specific dataset. We have our cuts, organized by proportion, and our kerf values established.

00:29:35.250 When we run the application, we retrieve the required board lengths and cut locations. In this instance, all cuts were successfully made. However, if we alter the sequence to prioritize the longest boards first...

00:30:05.700 ...we discover that one cut remains unfulfilled—a shorter cut. This example emphasizes that order plays a crucial role, and the solution remains efficient for the given problem size.

00:30:25.360 Now, let’s revisit the second problem. Here’s Jonathan's actual input set. Note the inclusion of MathN to achieve the desired division. We sort the pantry and execute our algorithm, producing an ordered list of jars to place on the shelf.

00:30:54.790 As you can see, salsa verde is strategically positioned at both the end and the top, effectively spaced farthest apart to ensure optimal variety. Thank you!

00:31:03.780 Does anyone have questions? Yes, how do I develop a mathematical algorithm for a problem at home?

00:31:15.070 I tried to convey my thought process regarding the jars; I literally used graph paper, drew circles, and sought to figure out solutions. Ultimately, I translated those into an algorithm. It's mainly about tweaking the math to ensure it works in your favor while tackling rounding during integer conversions. Recognizing patterns from previous experiences will aid the process, particularly for those familiar with algorithms.

00:32:10.450 Create rough prototypes where you work with three or four items to derive the optimal solution before extrapolating to larger sets. This exploratory method is the most effective way to approach such discrete problems. Great, I’m happy to answer any more questions afterward. Thank you very much!