From Excel to Rails: A Path to Enlightened Internal Software

Summarized using AI

Nick Reavill • May 25, 2016 • Kansas City, MO

In the talk titled 'From Excel to Rails: A Path to Enlightened Internal Software,' Nick Reavill, Director of Engineering at Stitch Fix, discusses the shifting of internal software development from tools like Excel to the Ruby on Rails framework. The presentation underscores the relevance of internal software in enhancing workplace efficiency and employee satisfaction. Reavill shares insights based on his personal experience with internal software, highlighting the following key points:

  • Importance of Usability: Reavill challenges the notion that internal software can sacrifice usability. He argues that enhancing usability leads to happier employees and improved efficiency, ultimately lowering staffing needs by streamlining processes.
  • Excel’s Dominance: Although Excel is a powerful, flexible tool for prototyping, it has significant limitations: no version control, weak data validation, and poor scalability. Even large companies like Macy's rely heavily on Excel, underscoring how ubiquitous it is in internal processes.
  • Transition from Excel to Rails: Reavill provides a case study on how Stitch Fix transitioned from using Excel spreadsheets for purchasing management to a custom-built Rails application. He describes Excel spreadsheets functioning as prototypes that reveal the needs and processes of the team, allowing for a well-informed development of the software.
  • Database Structuring: The necessity of organizing data into a relational database for clarity and reduction of errors is emphasized. By transitioning to a Rails application, users can have a single source of truth, and the software becomes capable of enforcing data uniqueness and validation.
  • UI and User Experience Improvements: The presentation highlights the importance of intuitive user interfaces that handle large data sets effectively, for example by using Elasticsearch for robust querying, which removes the cumbersome experience of searching Excel spreadsheets.
  • Iterative Development and Feedback: Reavill concludes by stressing the importance of user feedback in refining the software, noting how ongoing improvements help in gaining user trust and easing transitions from Excel-style workflows to more structured applications.

In conclusion, the talk effectively illustrates how Ruby on Rails can provide solutions for internal business needs by enhancing data integrity, usability, scalability, and overall employee value. It advocates for a thoughtful approach to developing internal software that prioritizes user needs as opposed to merely replicating existing workflows.

Rails is the ideal framework for creating software to run successful new businesses.

RailsConf 2016

00:00:11.870 Alright, hello everyone and welcome to 'From Excel to Rails: A Path to Enlightened Internal Software.' When I originally gave the title of this talk to the people running the conference, it was called 'The Path to Enlightened Internal Software,' but they told me it was two characters too long. So that's why it's now called 'A Path,' but I feel that this probably reflects my ambitions more accurately.
00:00:18.539 My name is Nick Reavill, and I'm a Director of Engineering at Stitch Fix. Stitch Fix is an online styling service for men's and women's clothing. People fill out a style profile with us, and we use a combination of data science and human styling to send them five options of clothing in a box. They try on everything at home, keep what they like, and return what they don't. The reason we are here is that we build a lot of Rails software. Our customer-facing software is all Ruby on Rails, but we also build all our internal software. Everything that runs our five warehouses, everything used by our stylists, and everything that our buyers use to purchase the clothes we sell is built on Rails.
00:01:01.379 We’re also one of the sponsors here, so hopefully, you can come and see us in the exhibition hall. We have some really amazing socks that I'm sure you will all enjoy, as well as cold brew coffee. We did have t-shirts, but we ran out; however, we should have more t-shirts for tomorrow. Please come and talk to us about jobs.
00:01:20.729 I'm here to talk to you about internal software. This is a candid photograph of a colleague of mine in our office. Don't worry about the watermarks on the picture; he's called Chad or Trent or something like that, and he's very excited about the internal software we've created for him. This is really the goal for us when we write internal software.
00:01:27.650 I've been working on internal software—that is, software that is not consumer-facing—for most of my career. When I first started writing it as a developer, I was working with technology that probably almost no one has heard of called Lasso, with a FileMaker Pro database back-end, which was kind of terrifying. But then we upgraded to PHP and MySQL, which was great, and then finally, I moved on to Rails. I spent most of my career writing what you might call expert-use software, and most of that has actually been internal software.
00:02:01.560 When I first started as a developer, like a lot of people, I was reading many blogs and books, trying to learn about the industry. I came across an influential blog called 'Joel on Software,' written by Joel Spolsky, who is now more famous as the founder of Trello and Stack Overflow. Back then, he was mostly known for his blog, which later became a book. I found a lot of good content in it.
00:02:59.740 There’s a picture of Joel Spolsky here. If you do a Google image search for him, you'll see lots of pictures of him looking like the pleasant human being he actually is, but I chose this one where he looks like a homicidal Dan Aykroyd because it suits the purpose of my narrative. I was reading one of his articles while writing internal software, and I really enjoyed my job and got a lot of value out of it. In one article, Spolsky identified five different kinds of software that people might write, including shrink-wrap software, embedded software, games, and throwaway software. The last one he identified was internal software, and he was quite dismissive of it, which upset me. It felt like being friends with the cool kid at school when they tell you that, for example, Def Leppard is lame, and you think, 'Come on, that's not fair!'
00:04:50.200 This has stuck with me for like eleven years. I remembered this while writing this presentation, and I decided to use it as a chance to completely repudiate everything he said about internal software based on one sentence. The first thing he said was, 'In internal software, usability is a lower priority.' This is definitely false. The more usable the software you write, the happier your employees will be.
00:05:59.290 It's not going to be the only thing that makes them stay at the company, but it contributes to how they feel about their job. It also means they can be more efficient, which reduces the number of people you need to hire because fewer individuals are needed to do the same work. While it's true that no internal software will have the same number of users as successful consumer-facing software, the difference is that internal software might be used all day, every day, as a huge part of its users' jobs.
00:06:42.310 We write software for our warehouse employees, who pack the boxes and receive all the goods we have. We have five warehouses around the country, all located next to other warehouses where people do similar jobs. If our employees want to leave because they don't like what they are doing with us, they can go and work for Amazon or someone else. In fact, people say that the software we write for them is one of the reasons they enjoy working for us. Another misconception is that they're forced to use it, which is not true, as they can go find other work.
00:07:16.630 They have alternatives like Google Spreadsheets or Excel, and if they feel the software isn’t doing a good job, they'll create their own solutions, often causing bigger issues for the company down the line. Excel is a remarkable piece of software used by most businesses, including lots of small businesses and startups. In fact, Stitch Fix was founded by our CEO, Katrina Lake, who largely used a combination of Excel and Google Spreadsheets.
00:08:05.770 She ran the business at a small scale and figured out the necessary processes. By the time it came to write real software, she had a great idea of what was needed, which allowed us to build the right thing the first time. It's not perfect, but it was close. Even large companies like Macy's and Nordstrom still use Excel to run their merchandising business.
00:08:39.639 You might be surprised to learn that these companies, like Macy's with a revenue of around $17 billion a year, rely on Excel for planning their businesses, which is crazy. So, why is Excel so popular? A big reason is that Joel Spolsky worked on version 6 of Excel, and he will tell you this repeatedly. Also, it requires no specialized expertise. You need to learn how to use Excel, but you don't have to be a software developer to create those spreadsheets, allowing for significant ownership and control over the processes being modeled.
00:09:18.428 Excel is super flexible, which makes it excellent for prototyping. For instance, Katrina's use of Excel was great for trying things out because it had low overhead for changing parts of the system she was creating. Most importantly, one of the critical features for heavy Excel users is the presence of both vertical and horizontal scrolling, which many people find essential.
00:09:45.360 If Excel is so great, then why do we build internal software at all? Several reasons. First, there's no version control; people share copies of important spreadsheets over email, leading to multiple versions of the same spreadsheet, which creates confusion regarding which one is the valid source. Users also frequently copy and paste data throughout Excel because referring to the same piece of data in multiple places is challenging without that, leading to unintentional errors.
00:10:29.180 Moreover, Excel does not adequately handle data validations. While it's possible to fudge it somewhat, as Rails developers, we know true validations are challenging to implement. Whether the data is in one place or not, you can't be sure it's correct. Excel has scalability issues, too; the Mac version has a row limit of just under 1 million rows, which isn't sufficient for any serious or successful business.
00:10:51.250 There are also logistical challenges when multiple people want to work on the same spreadsheet. You have to ask others if they can use it, creating a situation that is simply untenable. Additionally, Excel can make poorly formatted nonsense look professional, which we also strive to avoid.
00:11:31.800 Internal software solves all these problems. With a relational database, you have a single source of truth. There is one record for everything that you care about, and it can be referred to in various parts of the system. You can implement validations at both the database and application levels to ensure that the data going into your system is sensible and accurate.
00:11:55.660 It's also scalable. A million rows in a PostgreSQL database is nothing, and even if you do hit limits, there are well-known established solutions to scalability issues. Lastly, internal software protects the users of the software, allowing them to focus on their jobs—making informed decisions—rather than being stuck building spreadsheets or performing tedious manual tasks.
00:12:50.180 You can build internal software with various frameworks and technologies, but Rails is particularly well-suited. Ruby is excellent for writing readable and maintainable code, which is essential for all software. In Rails, if you follow the conventions, you can produce high-quality software quickly, which is especially critical in the early stages of a startup.
00:13:31.090 There are many things to cover, and you want to replace a significant number of those Excel spreadsheets with something more effective. Rails gives you that ability, ultimately resulting in maintainable code. If you follow conventions, it is also easy to onboard new developers because they will know where to look for the key pieces of code.
00:14:22.570 Now, I want to provide a case study of how we transitioned from Excel to a Rails app at Stitch Fix. What I find fascinating about this type of process is that we can treat the Excel spreadsheets that people have created and the surrounding processes as prototypes of what we aim to build. Our business partners have already done much of the work for us.
00:14:53.230 They've seen different things that can be done with Excel, but one common issue we frequently encounter is when people use Excel to manage vast amounts of tabular data for buying. Our buying teams organize their purchases on a monthly basis, planning three or four months in advance. They list every single item they will buy, whether they've already placed an order with a vendor or not.
00:15:59.690 This spreadsheet does not look bad at first glance; you might think it's manageable. However, in reality, the actual spreadsheet is ten times longer than what I've shown you. There are also multiple teams, each with their own spreadsheet trying to accomplish the same thing, which makes it even more complicated.
00:16:52.500 Senior management wants to see an aggregation of this data across the entire organization, but aggregating all those spreadsheets is a complex task, and manual work creates a chance for errors due to repeated data copying and pasting. Moreover, we already have a system in place for creating purchase orders that uses this data, but the buying teams have to copy the information into our system to generate an order, presenting yet another opportunity for mistakes.
00:17:30.950 They also often want to compare, for example, July of this year to July of last year. Comparing spreadsheets over time is complicated because the older spreadsheets may not line up with the current ones. Furthermore, there are numerous tabs within the spreadsheet, such as reference data that contains the permissible values for specific columns and guidance based on targets to help buying teams.
00:18:36.280 Turning this information into internal software means the first step is to view the Excel spreadsheets as prototypes. By analyzing what people have built and what they're trying to achieve, we recognize that the data they're using is often in a denormalized view, making it easy for teams to filter and search. Thus, we need to pull this apart and create a normalized view of this data.
00:19:50.290 This requires engaging with the business partners and looking at the data itself. For instance, the purchase order number is repeated in every row while other pieces of data vary, so we need to confirm with the people entering the data which fields belong at the purchase order level and which do not. We can then create a structure where one vendor record corresponds to multiple purchase orders and each purchase order has multiple buying lines.
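
The normalization step described here can be sketched in plain Ruby. This is only an illustration under stated assumptions: the struct names, column names, and sample rows are hypothetical and are not Stitch Fix's actual schema, and a real Rails app would express the same shape with ActiveRecord models and associations rather than structs.

```ruby
# Hypothetical records: one vendor has many purchase orders,
# and each purchase order has many buying lines.
Vendor        = Struct.new(:name, :purchase_orders)
PurchaseOrder = Struct.new(:number, :ship_month, :buying_lines)
BuyingLine    = Struct.new(:style, :quantity)

# Each hash mimics one spreadsheet row. Vendor and purchase order
# data are repeated on every line, as in the denormalized sheet.
SPREADSHEET_ROWS = [
  { vendor: "Acme Apparel", po_number: "PO-1001", ship_month: "2016-07", style: "Blouse A", quantity: 120 },
  { vendor: "Acme Apparel", po_number: "PO-1001", ship_month: "2016-07", style: "Blouse B", quantity: 80 },
  { vendor: "Birch Denim",  po_number: "PO-2001", ship_month: "2016-08", style: "Jean C",   quantity: 60 }
].freeze

# Groups the flat rows by vendor, then by purchase order number,
# so repeated values are stored once at the level they belong to.
def normalize(rows)
  rows.group_by { |row| row[:vendor] }.map do |vendor_name, vendor_rows|
    orders = vendor_rows.group_by { |row| row[:po_number] }.map do |number, po_rows|
      lines = po_rows.map { |row| BuyingLine.new(row[:style], row[:quantity]) }
      PurchaseOrder.new(number, po_rows.first[:ship_month], lines)
    end
    Vendor.new(vendor_name, orders)
  end
end
```

Calling `normalize(SPREADSHEET_ROWS)` yields two vendors; the first holds a single purchase order with two buying lines, so the order number and ship month that appeared on two spreadsheet rows are now stored once.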
00:20:56.180 This normalization can be reflected in your database structure and ActiveRecord models, helping you construct what your Rails application will ultimately look like. The data collection and organization process can be managed similarly to the spreadsheets.
00:21:39.530 Another aspect we can leverage is observing users as they work in their Excel spreadsheets. By noticing times of frustration—like when building a pivot table takes countless keystrokes or data needs to be copied and pasted repetitively—we can determine areas where we can provide significant value with minimal effort.
00:22:11.660 Rails can help us enforce the uniqueness of purchase order numbers across the system; this could be a full-time job in Excel, but with Postgres and Rails validations, it becomes straightforward. We can also constrain values in certain columns to ensure only specified values are allowable.
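
To make the two rules concrete, here is a minimal plain-Ruby sketch of a uniqueness check and an allowed-values check. The class name, the department list, and the error messages are assumptions made for illustration. In a Rails app the rough equivalent would be `validates :number, uniqueness: true` backed by a unique database index, plus `validates :department, inclusion: { in: ... }`.

```ruby
require "set"

# Hypothetical list of permitted values, like a reference tab in the spreadsheet.
ALLOWED_DEPARTMENTS = %w[blouses denim dresses].freeze

# Keeps track of purchase order numbers already accepted.
class PurchaseOrderRegistry
  def initialize
    @numbers = Set.new
  end

  # Returns a list of error messages; an empty list means the entry was accepted.
  def add(number:, department:)
    errors = []
    errors << "number can't be blank" if number.to_s.strip.empty?
    errors << "number has already been taken" if @numbers.include?(number)
    errors << "department is not in the allowed list" unless ALLOWED_DEPARTMENTS.include?(department)
    @numbers << number if errors.empty?
    errors
  end
end
```

The point of the design is that the check runs every time data enters the system, rather than relying on someone scanning a column for duplicates by eye.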
00:22:59.080 Additionally, internal software allows for a single source of truth in our application, such that if we need to change a vendor address, it’s updated in one place and reflected throughout the software. Handling derived values within Rails is also simpler; while Excel has formulas, they can become convoluted when applied across multiple spreadsheets with changes needing to be made in several places.
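
As a small illustration of derived values living in one place, the sketch below defines them as methods on a record instead of as formulas copied across sheets. The struct, its fields, and the use of integer cents are all assumptions for the example, not details from the talk.

```ruby
# Hypothetical buying line with prices stored as integer cents
# to avoid floating-point rounding in money arithmetic.
PricedLine = Struct.new(:quantity, :unit_cost_cents, :retail_price_cents) do
  # Total cost of the line, derived from quantity and unit cost.
  def total_cost_cents
    quantity * unit_cost_cents
  end

  # Margin as a percentage of the retail price, rounded to one decimal place.
  def margin_percent
    ((retail_price_cents - unit_cost_cents) * 100.0 / retail_price_cents).round(1)
  end
end
```

If the definition of margin changes, it changes in this one method, and every screen or report that calls it picks up the new rule.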
00:23:44.150 In Rails, we define proper data types and derive business logic from them, which brings clarity. Finally, we can visually associate semantic meaning with color coding. Instead of the buyers having to remember what a particular color signifies, we can encode that meaning within our business logic.
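
One way to read "encoding the meaning of a color in business logic" is sketched below. The statuses, the date fields, and the CSS class names are hypothetical; the talk does not say which colors or rules Stitch Fix uses.

```ruby
require "date"

# Hypothetical buying line that knows its own status.
TrackedLine = Struct.new(:ordered_on, :due_on, :received_on) do
  # Derives the status from the dates instead of from a hand-applied cell color.
  def status(today = Date.today)
    return :received    if received_on
    return :not_ordered unless ordered_on
    due_on && due_on < today ? :late : :on_order
  end
end

# The view maps each status to a color; the meaning lives in the status itself.
STATUS_CSS_CLASS = {
  not_ordered: "status-grey",
  on_order:    "status-yellow",
  late:        "status-red",
  received:    "status-green"
}.freeze
```

With this split, buyers no longer have to remember what a red cell means, because the rule that produces red is written down once and applied to every row.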
00:24:53.110 After creating this normalized database, we need to provide our buyers with an efficient way to search this database. This is particularly challenging in Excel, especially when dealing with complex relationships and derived values. To facilitate this, we can build an interface that interacts smoothly with the database.
00:25:41.670 In our system, we may denormalize and load information into Elasticsearch for easier searches across a huge number of rows. This is how we constructed our 'buy sheet' tool for the team, retaining a tabular format yet offering additional features like adding photos easily.
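
The denormalizing step can be sketched as flattening nested records back into one flat document per buying line, which is the shape a search index typically wants. The field names and nested-hash layout are assumptions for this example, and the call that would actually send the documents to Elasticsearch is deliberately left out.

```ruby
# Hypothetical normalized data: vendors contain purchase orders, which contain lines.
NORMALIZED_VENDORS = [
  { name: "Acme Apparel",
    purchase_orders: [
      { number: "PO-1001", ship_month: "2016-07",
        lines: [{ style: "Blouse A", quantity: 120 },
                { style: "Blouse B", quantity: 80 }] }
    ] }
].freeze

# Builds one flat hash per buying line, copying vendor and purchase order
# fields onto each document so a search never has to join.
def search_documents(vendors)
  vendors.flat_map do |vendor|
    vendor[:purchase_orders].flat_map do |po|
      po[:lines].map do |line|
        { vendor_name: vendor[:name],
          po_number:   po[:number],
          ship_month:  po[:ship_month],
          style:       line[:style],
          quantity:    line[:quantity] }
      end
    end
  end
end
```

The database stays normalized as the source of truth, while the flattened documents are a disposable copy that can be rebuilt whenever the underlying records change.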
00:26:12.290 This screenshot displays the view for one team, the blouses team, for one month. While you might question the absence of pagination as a UI necessity, for expert users, some UI rules don’t apply. They are already familiar with the data, and they can easily search through it in their browser.
00:26:59.330 Instead of just providing one buy sheet, our system allows for multiple formats, enabling users to view data across varying time periods or groupings. This flexibility means that grouping can easily change to meet specific needs, allowing them to visualize the data differently.
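
A rough sketch of switchable grouping: the same lines can be totalled by month, by vendor, or by any other field simply by passing a different key. The method name and the field names are hypothetical.

```ruby
# Sums quantities per distinct value of the chosen key.
def units_by(lines, key)
  lines.group_by { |line| line[key] }
       .transform_values { |group| group.sum { |line| line[:quantity] } }
end

# Hypothetical flat buying lines.
SAMPLE_LINES = [
  { vendor: "Acme Apparel", ship_month: "2016-07", quantity: 120 },
  { vendor: "Acme Apparel", ship_month: "2016-08", quantity: 80 },
  { vendor: "Birch Denim",  ship_month: "2016-07", quantity: 60 }
].freeze
```

Here `units_by(SAMPLE_LINES, :ship_month)` totals by month and `units_by(SAMPLE_LINES, :vendor)` totals by vendor, with no change to the data itself.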
00:27:48.800 We also avoid horizontal scrolling by incorporating useful rollovers. Instead of needing to scroll extensively left and right, users can pin particular columns they want to explore while traveling down long data sets. Summary information now appears on the same page, loaded through AJAX, enabling a more cohesive experience.
00:28:50.000 This new Rails application provides greater process consistency, better data accuracy, and scalability as our company continues growing rapidly. While people will always desire access to Excel, typically for functions not programmed into our system, we are sure to offer features like exporting to CSV to accommodate those needs.
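
A CSV export of this kind can be quite small. The sketch below uses Ruby's standard `csv` library; the column list and the method name are assumptions for illustration.

```ruby
require "csv"

# Hypothetical columns to export, in the order they should appear.
EXPORT_COLUMNS = %i[po_number style quantity].freeze

# Turns an array of row hashes into CSV text with a header line.
def buy_sheet_csv(rows)
  CSV.generate do |csv|
    csv << EXPORT_COLUMNS.map(&:to_s)
    rows.each { |row| csv << row.values_at(*EXPORT_COLUMNS) }
  end
end
```

The resulting text opens directly in Excel, so users can keep doing ad hoc analysis there while the application remains the source of truth.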
00:29:12.450 Now, regarding our testing process—yes, managing all those columns does require a thorough testing approach. The challenge is that Elasticsearch setup can be somewhat awkward since you'll need to create records in your database with a factory, push these into Elasticsearch, search through them, and determine what results you are getting back, leading to slower tests rather than difficult ones.
00:30:05.550 Overall, we collected feedback from users about how much easier the new system has made their lives. They can grow their operations effectively without increasing headcount. It’s vital to emphasize where human beings add value—removing laborious tasks allows them to focus on the more impactful aspects of their roles.
00:31:35.860 Turning to implementation, adoption of the system is generally simple due to its role in their job; people are incentivized to use it. Still, there may be apprehensions out of fear of change, which we address iteratively by refining the software and responding to user feedback, making sure to include vital features like multi-row editing.
00:32:02.490 As for favorite gems or libraries, we tend to create internal gems, but on a broader scale, we use typical gems applicable to any Rails application. So while there’s always common foundational components, I would emphasize the importance of keeping things tailored to our internal processes.
00:32:32.570 Lastly, regarding admin frameworks, we experimented with Active Admin initially, but ultimately found such tools to be somewhat limiting. Instead, we focused on building a more bespoke system that handles complex processes beyond simple CRUD functionalities.
00:32:59.790 Thank you very much for your attention and I'm happy to answer any further questions.