00:00:13.100
Thank you for coming, everyone, and welcome to Day One of employee orientation at Ivory Tower Innovation Technology. We're super happy to have you here, and we all think you made a great choice joining the team.
00:00:24.539
You've probably heard a lot of people in the media call us ITIT. You'll come to appreciate that we don't care much for redundancy; we refer to ourselves as IT.
00:00:37.170
I'm going to talk to you a little bit about our engineering culture. What we like to say is that our code has a philosophy we all buy into. We even have T-shirts that embody this philosophy. To us, that means we follow the old adage: when writing code, we make it work.
00:00:55.440
And making it work also means ensuring we have appropriate test coverage. To us, appropriate test coverage is full test coverage; we have 100% test coverage on every line of code that's written in all of our applications. So congratulations! No longer do you have to worry about whether you’ll introduce a regression when making a change to the system because the tests will have your back.
00:01:12.960
But it’s not enough to just make it work. We also make it right. We follow all the latest and greatest best practices, and one that’s near and dear to our hearts is writing DRY code. We aggressively abstract away any duplication that exists in our system, so you only have to change it in one place at one time.
00:01:31.620
We don’t just make it right; we also make it fast. Performance is a first-class citizen here at Ivory Tower. We ensure that writing code is done in the most performant way possible from the get-go. We don’t wait for a performance problem to introduce itself; we address it before it even becomes a concern.
00:01:59.130
So, as you leave orientation, make sure to remember to make it work, make it right, and make it fast. Thank you all for being here. I think you've made a really great choice joining the team; give yourselves a round of applause.
00:02:19.130
You really think you made a great choice? Great! Now I can show everyone at home that everyone clapped after my presentation, and they won’t know why.
00:02:32.710
But I'm not going to continue this charade for another 37 minutes and 30 seconds. My name is Kevin Murphy. I work at the NAR Company, a software consultancy in Boston, Massachusetts. Today, we're going to talk about what it might be like to work at Ivory Tower or a company like it.
00:02:48.920
I don't work there; I presume you don't either, because it's a place I made up. Nothing I said at the beginning sounds bad; in fact, those are all things that I like and value in my codebases.
00:03:01.070
However, that sort of blind adherence to some of those structures can have downsides or unintended consequences that maybe we don't talk about as much. So today, we're going to discuss those issues.
00:03:19.340
If we continue on our Ivory Tower journey and wrap up orientation, we go back, and we meet our team. Everyone's super friendly. Our product owner says to us, 'I've got a great first problem for you to work on.'
00:03:36.830
'We have a sign-up flow, and on that sign-up page there are some testimonials that show up to tell everyone how great our product is, which encourages them to sign up. The number of testimonials changes based on different factors, and I noticed today there should be four, but we're only showing three. Could you take a look at that?'
00:03:52.290
So we say, 'Sure, of course.' What else are you going to say? You clone the repo, get everything set up, and you're excited because during your interview process, everyone was talking about how Ivory Tower only uses the latest and greatest technologies.
00:04:06.800
So you're wondering what sort of crazy AI or machine learning you’re going to get to throw at some data set to figure out how many testimonials to show. You finally find a piece of code that does it, and you’re right to be excited; Ivory Tower is using the latest and greatest in predictive analysis.
00:04:25.820
But when you check the code, it turns out it's one big IF test. Looking at the bug, the report says there should be four, and the code returns four when Mercury's in retrograde. So why are we only seeing three?
00:04:40.730
Like any developer, I have a stack of resources that I carry with me from desk to desk, and so I pull off the most important one: the Farmers' Almanac helps me every day. It turns out Mercury actually is in retrograde, and today is the last day of the calendar year that we're going to see Mercury in retrograde.
00:05:04.370
So we have some urgency here; we really should fix this today to ensure it actually works. Great, more pressure on Day One.
00:05:19.460
But before actually starting to solve this problem, let's think back to orientation, right? Don't we have full test coverage? How is this even a problem?
00:05:31.400
Maybe the coverage tool is broken? After looking at the code coverage run, we find that this class is fully covered, and we even have a test for this exact scenario.
00:05:45.640
But it turns out that test is stubbing a collaborator, some Mercury class, so that it reports it's in retrograde. Then we instantiate the unit under test and, yes, it says four. Okay, that's all well and good, but what we really need to look at is that Mercury class.
00:06:09.680
It also has 100% test coverage with only three lines. So let’s skip the test and look at the implementation; it's one class that can tell you if it's in retrograde given a date, and it's always false.
00:06:22.690
We have some tests that execute that, and we have full test coverage, but we still have a bug because unfortunately, coverage doesn’t give you the whole story.
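A minimal Ruby sketch of the situation just described may make it concrete. The class name `Mercury` comes from the talk; the method name `retrograde?`, the helper name, and the date used are assumptions for illustration, not the code shown on the slides.

```ruby
require "date"

# Hypothetical reconstruction: answers whether Mercury is in retrograde
# on a given date, but never actually looks at the date.
class Mercury
  def retrograde?(date)
    false
  end
end

# A check like this executes every line of Mercury, so a coverage tool
# would report the class as fully covered. It never asks about a date on
# which the answer should be true, so the bug goes unnoticed.
def mercury_reports_not_in_retrograde?
  Mercury.new.retrograde?(Date.new(2018, 1, 1)) == false
end

puts mercury_reports_not_in_retrograde?
```

Every line runs, yet the one behavior the business cares about (reporting true on the right dates) is never exercised.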
00:06:34.600
Coverage can tell you if you executed every line of code, but it can't tell you if you executed every use case or business requirement that you need to satisfy in your system.
00:06:47.520
Now, you might be looking at this and thinking this is a ridiculous example. It is, and that's part of the fun of making conference talks. You might think this would never happen at your company, right? Of course not; you have a strong TDD culture.
00:07:11.639
You would have immediately written that test to ensure Mercury can report it's in retrograde, and you wouldn't have shipped this unless it passed the test.
00:07:19.139
We do pair programming, and all of my coworkers are thoughtful and intelligent folks, and they would have noticed this. We would have solved it together.
00:07:33.569
You might even run mutation testing on all of your changes, and you would have noticed that the 'day' parameter isn't used in that method, so it wouldn't have passed mutation testing.
00:07:46.649
We have a strong code review culture; again, similar to pair programming, someone would have caught that. Well, it just happened.
00:08:03.710
Maybe the last stopgap is your QA team, and they are really good at their job. A part of their job is identifying all the test cases that your code can go through.
00:08:17.370
Certainly, they would have thought to test if Mercury can be in retrograde! Congratulations; those are all great processes to have, and I'm not against any of them.
00:08:34.539
But if you're just looking at coverage to tell you if you have sufficiently tested your system, you're missing something.
00:08:47.230
So, let's look at something else that can occur when you're writing code in service of test coverage. We're starting to make changes to our system so it can actually tell you when Mercury is in retrograde.
00:09:03.100
We're running the test locally, and one of them is failing some of the time. It’s not directly related to the changes we’re making, but it’s in the same class.
00:09:19.660
So we’re kind of nervous or concerned that we just don’t understand the system, but we're not sure what’s going on. We give an awkward wave to the person sitting next to us.
00:09:32.990
They take their headphones off and come over; they’re super nice. You explain the situation to them, and they go, 'Oh yeah, that one just fails sometimes. Don't worry about it.'
00:09:45.930
You can take them at their word, but that's kind of bugging you. So you look at the test: it's checking for zero or one testimonial, but the numbers zero and one don't show up anywhere in this method.
00:10:05.629
I guess we're in this coin flip scenario that just generates a random number between zero and one. Cool, let’s look at the test that’s giving us a hard time.
00:10:24.360
The description seems great; we’re making some date and have some results array and we create the unit under test. We then move to that date, which is neither a Tuesday, a date where there's a full moon, nor a time when Mercury's in retrograde.
00:10:38.790
So we fall into the ELSE condition, and twice we're asking for the number of testimonials, making sure we get 0 and 1 back. Some of you are laughing because you probably understand probability, and this isn't going to be the case all the time.
00:10:56.999
So now you can fix this, right? You could say, 'The problem is we're running it twice; let’s just run it 200 times.'
00:11:13.010
And I mean, you didn't fix the problem; you made it less likely. But maybe you don't care about executing and seeing both zero and one. Let's just make sure it returns only 0 or 1, and if the regression suite ever throws a 3, then we'll fix it then.
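The odds being joked about here can be worked out directly. The sketch below is an illustration only: `coin_flip`, `saw_both?`, and `pass_probability` are hypothetical names, and the coin flip is assumed to be fair.

```ruby
# Hypothetical stand-in for the else branch: a fair coin flip returning 0 or 1.
def coin_flip(random: Random.new)
  random.rand(2)
end

# Mirrors the flaky test: ask `attempts` times and require both 0 and 1 to appear.
def saw_both?(attempts, random: Random.new)
  results = Array.new(attempts) { coin_flip(random: random) }
  results.include?(0) && results.include?(1)
end

# Chance that n fair flips contain both values: 1 - 2 * (1/2)^n.
# The only failing outcomes are "all zeroes" and "all ones".
def pass_probability(attempts)
  1 - 2 * (0.5**attempts)
end

puts pass_probability(2)   # 0.5
puts pass_probability(200)
```

With two calls the test passes half the time. With 200 calls the failure chance shrinks to 2 in 2^200, which is tiny but still not zero, so the test remains non-deterministic.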
00:11:30.769
That’s fine, but the issue is that the test existed in the way that it did. There are many ways you could solve this problem, and this isn’t a talk on fixing flaky or non-deterministic tests.
00:11:45.070
The point is you had to do something because this test was written, and the reason it was written was to get coverage; all it really does is test Ruby's own rand method.
00:12:02.240
I may have been a little unfair to code coverage, but let’s dig deeper: Is it something that really matters? Is it just a vanity metric that companies with high code coverage like to boast about to make others feel bad?
00:12:21.460
Like, Ivory Tower has 100% test coverage and they still have bugs, so why do we care? Well, coverage is a signal. Unfortunately, we can't just throw it away; it gives you information.
00:12:38.560
But to be able to do something with that information, you’ve got to dig a little deeper than just looking at the number.
00:12:53.640
Let’s look at an example of some code coverage for a Rails application. If you're able to do so, you’ll see that the numbers are all in red. Red is bad.
00:13:09.530
So obviously every file here has bad code coverage, but let's dig a little deeper. The first two files here are for Action Cable, a Rails framework feature; what it does doesn't really matter here.
00:13:25.390
If your application isn’t using Action Cable, I hope this number is 0. Why are you testing something that you’re not even using?
00:13:41.490
You may have a good reason; I'd love to talk afterwards if you do. You could say, 'Yeah, we should just tell our code coverage tool to ignore these files, or we should rip Action Cable out of our application.' Those are reasonable choices.
00:13:59.790
We could also just do nothing, and nothing bad will happen, I promise. The same story goes for the next two files; they are other framework features for background jobs and sending emails.
00:14:16.500
It may be the same story here, but if you just introduced email into your system, seeing this big 0 here is giving you information. Maybe you want to test that mailer.
00:14:30.920
This last one here is at 80%, and that's better than zero, but it’s still red so we need to do something about it. We don’t know why it’s wrong or bad, so let’s dig into what the actual file is.
00:14:47.360
You can see there are two lines that aren’t covered here: a getter and a setter. Now if you have a lot of confidence in your test suite, you can argue this shows you dead code because this is never executed.
00:15:05.930
Therefore, you could remove those methods; you have less code to maintain, your code coverage goes up, and everyone wins. That may be the case.
00:15:22.750
But, you may also not live in that situation or maybe you know that part of your application uses this method and for whatever reason, just doesn't test for it.
00:15:39.350
It's not exactly a high cost to write a test for either of these methods, but you could make the argument that you're pretty confident in Ruby's ability to store a piece of data and hand it back.
00:15:56.600
If it didn't, there's probably a bigger problem in the world than your application. It's an argument you could make. I'm not just interested in looking at code coverage when I think about how my tests are written.
00:16:14.850
I’m more interested in what the total cost of ownership of that test suite is. Code coverage can help explain how far along you are in the path, but it can't tell you if all your tests are pulling their weight.
00:16:31.800
One simple trick to increasing your code coverage is to write more tests. If you write more tests, you have more code: code that needs to be maintained, updated, and kept working.
00:16:49.440
That is all great as long as it continues to be valuable. Unfortunately, that's a bit of a judgment call. If you write more tests, your entire test suite takes longer to run. You may experience longer feedback cycles both on CI and locally.
00:17:09.720
If you write more tests, you may write a flaky test that fails sometimes. You need to make a choice: you can either fix it and spend time figuring it out, or just ignore it.
00:17:26.700
Or you could delete it. Maybe that affects your code coverage, but it could make your life a lot better, because it no longer takes twice as long to deploy to production while you hit the rebuild button.
00:17:41.580
But talking about all of this isn’t about getting answers; it's about giving justifications for not doing work.
00:17:55.490
Let’s talk about another example at Ivory Tower. I have now been working here for a couple of weeks, and things have been going really great.
00:18:10.370
Ivory Tower is not a remote shop, and we have an open office plan—because is there even another type of office anymore? You’re diligently working at your desk.
00:18:24.890
Off in a corner, some people begin cheering and celebrating. I've mentioned it's an open office, so immediately you're distracted and annoyed. You don't really know what's going on.
00:18:40.640
So you give the same curious wave to your neighbor, and they give you a knowing glance: 'I'm so sorry; I should have invited you to the Slack channel. Let me invite you right now.'
00:18:55.180
It turns out there's actually some great news in there: Ivory Tower has decided to acquire one of their main competitors, Dark Dungeon. Now we're going to be working together.
00:19:09.240
We're going to be using the Ivory Tower codebase because obviously it is the best. Both Ivory Tower and Dark Dungeon have an API-based product.
00:19:26.610
They secure their APIs with an access key, and between Dark Dungeon and Ivory Tower, the structure of those access keys looks a little different.
00:19:40.210
For whatever reason, decisions well beyond your pay grade have determined that as Dark Dungeon customers start using the Ivory Tower codebase, we’ll make new access keys for them.
00:19:56.520
However, they’re going to keep it looking like a Dark Dungeon access key. Sure, we can do that! But we’ve never worked on the part of the codebase that generates access keys.
00:20:09.790
So we start by looking at some of the tests to get a sense of what this thing does. We find a test that says it doesn’t make a company key. Sure, okay, cool.
00:20:25.360
So we’ve got some class here that generates an access key, and here’s how you make an access key: just call this method and give it a couple of parameters.
00:20:42.490
Then I need to make sure that what I get back for a key doesn’t look like this company regex thing. I'm assuming the company regex is a regex because it's called that, and I'm passing it to the match method.
00:20:57.110
However, I really can’t comprehend what I’m reading. The reason is that this is just one test in a big test file that has many tests, and the company regex is defined at the top.
00:21:12.630
This is good because the company regex is used in a couple of places in this test file, and we've chosen to not repeat ourselves.
00:21:28.380
However, when I write tests, I'm a bit more concerned about the readability and understanding of the individual tests.
00:21:46.310
Instead, I prefer to write DAMP code, which stands for Descriptive And Meaningful Phrases. I did not coin this term, nor could I find out who did, so if you're in the audience, please take a bow.
00:22:02.960
Writing DRY code doesn't mean that your code isn't DAMP. However, if you write code purely in service of being DRY, it may not be as DAMP as it could be.
00:22:29.050
This test is one such example: we could define the regex in the test itself and use it right there. Then all we need to do is read a regex, and that's hard enough.
00:22:42.620
It’s really hard when it's 100 lines above. We’ve put all the context in this test. Yes, we have some duplication in our code.
00:22:55.710
Yes, if the company regex changes, we have to change it in three places. However, you gain a lot more readability here.
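To make the trade-off concrete, here is a rough Ruby sketch of the two styles. Everything in it is hypothetical: the `AccessKeyGenerator` body, the `CO`/`US` key formats, and the check helpers are invented for illustration and are not the code from the slides.

```ruby
require "securerandom"

# Hypothetical generator; the key formats are made up for this example.
class AccessKeyGenerator
  def self.generate(owner_type, owner_id)
    prefix = owner_type == :company ? "CO" : "US"
    "#{prefix}-#{owner_id}-#{SecureRandom.hex(8)}"
  end
end

# DRY style: the pattern is defined once, far away at the top of the file.
COMPANY_REGEX = /\ACO-\d+-\h{16}\z/

def dry_check
  key = AccessKeyGenerator.generate(:user, 42)
  !key.match?(COMPANY_REGEX)
end

# DAMP style: the test carries its own context, at the cost of duplication.
def damp_check
  company_key_format = /\ACO-\d+-\h{16}\z/
  key = AccessKeyGenerator.generate(:user, 42)
  !key.match?(company_key_format)
end

puts dry_check
puts damp_check
```

Both checks assert the same thing. The second one can be read without scrolling, and the price is that a format change has to be made in more than one place.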
00:23:09.790
At this point, we need to get back to work; we’ve got to make those access keys look right. As we find the actual implementation file, we look at it.
00:23:22.660
Okay, it is what it is; great! But we can solve this quickly. We’ll just add another parameter to the method signature to say if it's for the acquired company.
00:23:37.410
That should default to nil so all our existing call sites will continue to work. Then we’ll just squish in another IF test here.
00:23:54.010
If it's for an acquired company, we’ll give them the Dark Dungeon format; otherwise, we’ll just keep doing what we've been doing.
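The change being described might look roughly like the following. This is a sketch under assumptions: the parameter name `for_acquired_company`, the existing branches, and all three key formats are invented, since the talk does not spell them out.

```ruby
require "securerandom"

# Hypothetical shape of the shared generator after the quick fix.
class AccessKeyGenerator
  # The new third parameter defaults to nil, so existing call sites keep working.
  def self.generate(owner_type, owner_id, for_acquired_company = nil)
    token = SecureRandom.hex(8)
    if for_acquired_company
      "DD:#{token}:#{owner_id}"    # invented "Dark Dungeon" style format
    elsif owner_type == :company
      "CO-#{owner_id}-#{token}"    # invented company format
    else
      "US-#{owner_id}-#{token}"    # invented user format
    end
  end
end

puts AccessKeyGenerator.generate(:user, 1)
puts AccessKeyGenerator.generate(:company, 1, true)
```

Each individual change like this is small, which is exactly why the method keeps accumulating parameters and branches.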
00:24:08.310
You may be thinking, 'Hold on, what does this have to do with DRY code? Nothing in here is repeated.' But this class exists because something was repeated once, and repetition looked like a good reason to abstract.
00:24:24.490
To address this, let’s take a little walk down memory lane or look through the commit history for how this class came to be and how it evolved over time.
00:24:40.000
In the beginning, there was a user class—and there still is, because it’s a web app, and we need users to use it! Users needed to have access keys, so they needed a method to generate an access key.
00:24:56.270
Everything worked great. Then later on, Ivory Tower started selling to some enterprise clients, and they had unique needs. They didn't want their access keys tied to particular users because users could leave the companies.
00:25:09.080
Since they didn't want their access keys tied to actual people, Ivory Tower could have made a shared account for them or something. But for whatever reason, they decided to generate the keys from the existing company model.
00:25:23.630
So they coded this up, put it up for review, and one enterprising developer that looks a lot like me—but isn’t—made a comment that we should extract this into one place.
00:25:40.740
The thought was that we only needed to worry about it in one place, and both users and companies could use the access key generator class.
00:25:55.210
Over time, how you made an access key started to become a little more complex, leading to situations like this. But with each individual change, it was kind of decided it wasn’t that bad.
00:26:10.150
Because it was just adding one parameter to the method signature and one if test there, it was quick and simple, so we honored that abstraction as it existed.
00:26:27.430
But instead, we could have embraced writing some WET code. You can write everything twice! Don't get too hung up on the number; you can write everything thrice and still adhere to the acronym.
00:26:39.670
However, be willing to accept some level of duplication in your system. When that other developer suggested, 'Hey, let's abstract this out,' the original author could have said, 'That’s a great suggestion, but I’m not comfortable yet.'
00:26:54.230
Since they recognize that the two are similar, but they're not sure how they might change over time, it would be acceptable to reject that proposal and say, 'Yes, users and companies can both generate access keys, and right now they look exactly the same.'
00:27:12.060
As the requirements change, the way a user generates an access key can differ from a company's access key, and they don’t have to concern themselves with each other.
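As a sketch of what accepting that duplication could look like, the two models below each own their key generation. The class bodies, the `generate_access_key` method name, and the key format are assumptions for illustration.

```ruby
require "securerandom"

# Each model has its own copy of the logic. Today the bodies are identical.
class User
  def initialize(id)
    @id = id
  end

  def generate_access_key
    "#{@id}-#{SecureRandom.hex(8)}"
  end
end

class Company
  def initialize(id)
    @id = id
  end

  # Duplicated on purpose: this can change for enterprise needs
  # without touching how users get their keys.
  def generate_access_key
    "#{@id}-#{SecureRandom.hex(8)}"
  end
end

puts User.new(7).generate_access_key
puts Company.new(7).generate_access_key
```

If company keys later need a different shape, only `Company#generate_access_key` changes, and no extra parameter or branch leaks into the user path.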
00:27:28.920
This may not be the best way to generate access keys for these two things, but at least it's a different way.
00:27:41.630
Your ability to agree or disagree with these changes probably depends on your ability to predict the future. I'm not great at that.
00:27:58.540
What I’m saying is that this is hard to get right, and it's also hard to know in the moment if you're doing it right.
00:28:13.870
When I’m interested in figuring out if I really want to DRY this up or not, I like to think in terms of flexibility—both in terms of gaining or losing flexibility in your system and how much flexibility you do or don't need.
00:28:29.600
There are situations where parts of code should work the same every time. An abstraction is a wonderful way to model that, so pull that out and use the same piece every time. It's great.
00:28:44.660
There are also situations where things simply look the same but aren’t; it's an implementation detail subject to change over time, as is all code.
00:28:56.420
In those cases, having made that abstraction may make your life a little more difficult. Consider what it costs to maintain it or to get rid of it, because inertia is a hard force to work against.
00:29:12.360
But just because it’s hard doesn’t mean you can’t change it. We could back out of any of these decisions, but we must consider how we would do that.
00:29:26.920
It's essential to keep this in mind when introducing changes to your system—whether that be for DRY code or otherwise.
00:29:42.920
One last example to share with you. We’ve merged that part of the code, and Dark Dungeon customers are starting to use our product; everyone is excited.
00:29:58.250
Then we receive a bug report from one of those Dark Dungeon customers saying, 'Hey, I passed in 15 to your API, but I got one number back.'
00:30:12.620
'In the Dark Dungeon system, I got a different value, and I went off and did the math by hand, and I think Dark Dungeon is right. Could you take a look at this?'
00:30:28.190
Sure! But before I take a look, let's talk a little about performant code. I mentioned at orientation that Ivory Tower ensures all of its code is as performant as possible.
00:30:43.430
This part of the system is the core of their application—it’s the entire business value they offer to the world, so they need to make sure it works right and fast.
00:30:56.830
Everyone is going to use this product, and it needs to fly. Ruby’s great, but they're kind of worried about it, so they decided to write this part of the system in Rust.
00:31:11.790
It's a systems language, so it’s obviously going to be faster. When I say we wrote it in Rust, I really mean Alice wrote it in Rust, and Alice is a wonderful developer, but she just left for vacation yesterday.
00:31:25.890
No one else in the company has written Rust before, so there are organizational issues at play here. That is not part of this talk, but if that's the situation you are in, someone must figure it out.
00:31:40.050
We decide to be a little adventurous; we've never looked at Rust before, so let's take a look. We start rooting through places we're familiar with, and we find in our Gemfile this Helix thing that handles some interop between Ruby and Rust.
00:31:55.920
So we read the documentation on that, and through this, we find where the Rust file is. We see that we have some generate method or function—I don’t know what Rust calls it.
00:32:09.920
I guess that takes some sort of hash or something, and then we point it to a vector of strings. If you’ve done some Java before, you might think this is similar to the collections API.
00:32:24.700
That’s cool; this is probably making a variable. Although I still have no idea what a 'Vec' is, let’s go with it. There's a for loop; I know what a for loop is, and I'm feeling good.
00:32:38.820
But then 'bound' shows up, and it's declared at the top. Rust is a statically typed language, and 'bound' is the name of the parameter you're passing into your method.
00:32:54.080
It’s an i32, which is a type of integer. Great! Now I'm feeling really good; I’m iterating through that...
00:33:08.930
Oh no! Confidence lost! I’m calling some push behavior on that 'Vec' and passing it the result of the match function/macro/method keyword. I have no idea what 'match' does.
00:33:22.420
I believe it does some math with the number that we're iterating through, and then it calls a closure. But I'll be honest, I don't really know what I'm reading.
00:33:39.970
I forgot to mention during orientation what Ivory Tower actually does, so here's the scoop: it's one of the premier vendors for FizzBuzz as a Service.
00:33:54.390
That's right, they identified a core market need: developers worldwide have been asked to write FizzBuzz, and none of us want to do it.
00:34:09.260
They said, 'You don't have to worry about it anymore; we'll take care of it for you.' Moreover, you can write FizzBuzz in any language you like; all you need to do is make an HTTP call in that language.
00:34:24.310
If you’re familiar with FizzBuzz, you might be looking at this and thinking, 'Well, this isn’t a big deal.' It’s just this case here that doesn’t belong in the problem set; we can just delete it, and the program works correctly.
00:34:42.390
If you’re not familiar with FizzBuzz, I’m not going to explain it to you, and congratulations! We’re doing something right as an industry.
00:34:56.930
Even though this change was easy to implement and solve, it was a lot harder than it needed to be. Why? Because we are all Rubyists here.
00:35:13.170
If we wrote this in Ruby, we would have been a little more familiar with the problem set and could have sussed it out easier.
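For comparison, a Ruby version of the same routine might look something like this. It is only a sketch: the method name `fizzbuzz`, the inclusive range starting at 1, and returning an array of strings are assumptions modeled loosely on the 'bound' parameter and vector of strings described above.

```ruby
# Returns the FizzBuzz sequence from 1 up to and including `bound`,
# as an array of strings.
def fizzbuzz(bound)
  (1..bound).map do |n|
    if (n % 15).zero?
      "FizzBuzz"
    elsif (n % 3).zero?
      "Fizz"
    elsif (n % 5).zero?
      "Buzz"
    else
      n.to_s
    end
  end
end

puts fizzbuzz(15).last   # FizzBuzz
```

In a form like this, a stray extra branch would stand out to a team of Rubyists much sooner than it did in an unfamiliar language.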
00:35:30.550
The reason it was written in Rust was because it was faster. But I don’t know that it’s faster because I didn’t benchmark it.
00:35:46.320
It could be the case that Rust is 70 times faster at FizzBuzz than Ruby, or it could be that Ruby is twice as fast as Rust, or maybe the difference is negligible.
00:36:01.370
But Ivory Tower sure didn’t figure that out; they just read something on Hacker News that said Rust is fast, so they decided to go with Rust. While performant code is great, it’s essential to ensure you're doing it for a reason.
00:36:17.150
When you know the reason behind your choice because you have some data points that indicate this needs addressing, you can use that information to decide if what you’re doing provides a performance benefit.
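Gathering those data points does not need much machinery. The sketch below times two hypothetical Ruby implementations against each other with a monotonic clock; the method names, the iteration count, and the choice of implementations are all assumptions, and real numbers will vary by machine.

```ruby
# Two hypothetical implementations to compare.
def fizzbuzz_branching(bound)
  (1..bound).map do |n|
    if (n % 15).zero?
      "FizzBuzz"
    elsif (n % 3).zero?
      "Fizz"
    elsif (n % 5).zero?
      "Buzz"
    else
      n.to_s
    end
  end
end

def fizzbuzz_concat(bound)
  (1..bound).map do |n|
    word = +""
    word << "Fizz" if (n % 3).zero?
    word << "Buzz" if (n % 5).zero?
    word.empty? ? n.to_s : word
  end
end

# Wall-clock seconds for `iterations` runs of the block, using a monotonic clock.
def seconds_for(iterations)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Only compare speed once both versions agree on the answer.
raise "implementations disagree" unless fizzbuzz_branching(100) == fizzbuzz_concat(100)

branching = seconds_for(1_000) { fizzbuzz_branching(100) }
concat    = seconds_for(1_000) { fizzbuzz_concat(100) }
puts format("branching: %.4fs, concat: %.4fs", branching, concat)
```

The same idea applies to a Ruby versus Rust comparison: measure both on a realistic workload before paying the cost of a second language.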
00:36:34.870
I’ve written code that I thought was more performant but later discovered that it made things worse.
00:36:49.890
Had I simply eyeballed it, I could have thought I solved the problem, but I sure didn't. Because they didn’t have all this information, Ivory Tower doesn’t know if they're in a better spot or not.
00:37:05.590
But they sure have all the costs associated with it! Now they need to have experts in both Ruby and Rust and continue ensuring all the bits and pieces talk to one another.
00:37:23.150
Again, that may have been warranted; however, they didn’t figure out whether they needed it.
00:37:39.290
So when you leave here, go back to your working profession or just your daily life writing code.
00:37:54.570
Definitely make sure to make it work, make it right, and make it fast. When you're writing tests to ensure it works, make sure you're doing that to give yourself confidence in the changes you're making.
00:38:09.160
Do it so that you're confident the system is going to work, not just because some number on a screen tells you to do it.
00:38:25.690
Best practices are important—they're the best! It's right in the name! So please use them. But the unfortunate truth is that not all best practices apply equally to all situations.
00:38:42.150
The hard part is knowing when to use them. As I mentioned, definitely write performant code. I love using systems that are highly performant.
00:38:56.960
I sometimes enjoy writing performant code, but when I do, I ensure it's for a reason and that I have benchmarks to tell me I’ve achieved my goal.
00:39:13.370
Perhaps that's just me; I might not trust myself very much.
00:39:26.370
If you're interested in learning more about the NAR Company or you’d like a narrative description of this presentation or want a copy of these slides, you can find that at tariko.org/rubyconf.
00:39:40.570
I also have a Rails application on my personal GitHub account, Kevin-J-M. The repository, cleverly named Ivory Tower, contains a copy of the slides, along with code examples that didn't make it into this presentation.
00:39:54.580
I’m happy to take any questions afterwards on an individual basis, so feel free to chat with me throughout the rest of the conference.
00:40:10.600
I also have some stickers that look like this, so if you want one or both of them, feel free to come by and get a sticker.
00:40:27.560
Otherwise, thank you all very much for your time; you've been a great audience. Enjoy the rest of your conference.