MountainWest RubyConf 2010
I was wrong about ruport

by Gregory Brown

In the presentation "I was wrong about Ruport," Gregory Brown discusses his evolution as a programmer and his experiences with Ruport, the reporting library he created. He humorously likens his early view of his own code to Don Quixote charging at windmills that he took for giants. Brown reflects on how the complex code he was once proud of turned out to be needlessly convoluted.

Key Points:

  • Introduction and Background:

    • Brown starts with a comedic introduction, showing an image of Don Quixote which symbolizes programmers battling their misconceptions.
  • Lessons from Bad Code:

    • The talk conveys how he once believed poorly structured code was impressive simply because it was difficult to write.
    • It took him five years to realize how poor that code actually was.
  • Concept of Ruport:

    • Ruport was a solution aimed at simplifying reporting tasks by consolidating multiple reporting hacks used in earlier projects in languages like Perl.
    • Although Ruport's 1.0 version was useful, Brown acknowledges the outdated nature of its implementation at the time of his talk.
  • Good vs. Bad Code:

    • Brown contrasts a simple, elegant coding approach he currently upholds against the convoluted Ruport code he wrote in 2006.
    • He highlights the importance of clarity and simplicity over unnecessary complexity, which can lead to bugs and maintainability issues.
  • Code Reading and Improvement:

    • Brown urges developers to read and understand both their code and others’ code to improve their programming skills.
    • He reads code by posing guiding questions (for example, how a format plugin is registered with its engine) and following them down through the implementation to see how each design decision affects the result.
  • Final Thoughts:

    • Brown ends with his strong belief that the 'giants' faced in coding today will often be less daunting in hindsight. He encourages programmers to learn from their mistakes and view their evolving understanding of code as an asset over time.

Conclusions:

  • The journey from confusion to clarity in coding practices is gradual and often painful, but it is an essential part of a programmer's growth. Brown motivates programmers to embrace their past mistakes and recognize that clearer, simpler code leads to better functionality and easier maintenance.
00:00:15.400 Um, I'm Gregory Brown, and that's a picture of me from Christmas. That's pretty much my introduction. The picture is worth probably 50 or 60,000 words.
00:00:21.400 Alright, so this is a picture of a guy on a horse attacking a windmill. And why is he doing that?
00:00:28.920 Anyone? What's that? He thinks it's a dragon, yeah, a dragon or some giant or some terrifying beast.
00:00:36.800 The reason why I decided to pick this picture to start off this talk is because that's pretty much how I feel all the time as a programmer.
00:00:43.039 So today, what I want to show you is some really bad code, but it's bad code that at the time I thought was really great. The reason why I thought it was great is because it was so hard for me to write.
00:00:56.600 It was complicated, it was confusing, and it was difficult. And then when I was done with it, I was proud of it. It took me five years to realize what total crap it was, and now I'd like to share it with you.
00:01:07.759 Ruport is a reporting library that I started as a fun side project, in addition to the work I was doing. By its 1.0 release, we had some smart people working on this, and I had learned some stuff. Ruport can be useful for reporting tasks, so if you're doing reporting work, you could check it out.
00:01:30.200 In 2010, it may seem a little outdated. We weren't really up to date with the latest Ruby idioms and things like that, simply because we stopped developing it at that point. It was stable; it had run its course.
00:01:49.640 But just keep this in mind: Anything that you see in this talk that's terrible is my fault. So really, what I'd like to talk to you about is sort of the Holy Grail of what I was trying to do with Ruport.
00:02:10.240 I came from a Perl background, and I was doing some reporting stuff in Perl, and I was forced into doing some .NET stuff at one point.
00:02:18.120 When I was doing things like reports, I tended to just write these one-off hacks. As those hacks grew, we started off with something simple like, 'Oh, we just need a CSV file.'
00:02:30.680 'Oh, now we need an Excel file; now we need a PDF; now we need HTML,' etc. What I wanted was a way to take all of those little hacks and bring them together into a system that would work.
00:02:44.200 Now, I wasn't looking for something to do the job for me; I was just looking for something that would help me make sure that I don't hurt myself.
00:02:51.920 In order to do that, I wanted to build something really simple and low-level. That was my goal.
00:03:00.320 Unfortunately, to make that goal make any sense to you, I'm going to have to show you some good code first because if we look at the bad code right away, I'll get so confused that I won’t be able to explain it to you.
00:03:12.319 So I'm going to give you a very rough taste of how I might approach this problem in 2010, given what I know now. So here's the idea, and the reason why I'm using this example is because it maps to an example that I wrote in 2006 for Ruport.
00:03:24.799 It's a little bit convoluted, but at the same time, it's representative of the sort of things you might want to do. You've got some raw data; in this case, it's a bunch of points, and it's a nested array structure.
00:03:38.720 They form lines, and you want to output them in a couple of formats. Here, we're going to look at SVG and text output. The idea is that we want to just be able to change a word to get the different output.
00:03:52.760 Now, of course, by making the interfaces the same, you can do that, but we want to encourage it further down the stack, finding a nice place for the common code to lie and finding a nice way to hook in new formats while taking advantage of the existing API.
00:04:03.520 Now, I want to point out that I'm showing you the most basic part about this idea. It goes on to other things, but I think we should be able to figure it out.
00:04:17.239 So we had the SVG and the text, and the only thing we changed was the symbol that we were using to tell it what to render. So, starting from a modern outlook on this, I was thinking, okay, well, I could have a line plotter class.
00:04:28.040 It could be a subclass of a formatter that provides some things for me, and a text format class, and it's a subclass of a format that provides some things for me.
00:04:35.520 And then each of these are going to have some render methods, and all they do is deal with the parameters that are being passed in and return strings. Pretty straightforward stuff.
00:04:50.960 But the line plotter is going to need to know what formats it supports, so we can just store them in a hash. That's sort of the rough idea of where I would start with this, very low ceremony, very simple.
00:05:01.600 So this is the implementation of that. You see that formats is just a class method that returns a hash.
00:05:06.720 Render basically will look up a format in the format class in that hash, instantiate it, assign the parameters so that they're there, and then call render.
00:05:17.360 Now, format is basically just a container sort of abstract class for now. It's got params fed to it by the formatter, and it's got this abstract method that you need to override called render.
00:05:25.200 That can do whatever you want so long as it returns a string. As of right now, there's nothing special about this; it's just sort of ordinary Ruby development.
00:05:36.720 With just that code, you get something like this. Now, I've ignored what you would do inside that render because it doesn't really matter for the purposes of what we're talking about.
00:05:43.760 Just keep in mind, whatever we've passed in this hash is available in that format object, so you can do whatever you want with it as long as it returns a string.
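The explicit design described so far can be sketched roughly as follows. This is a hypothetical reconstruction, not the code from the slides: the names `Format`, `Formatter`, `LinePlotter`, `Text`, `SVG`, and the `:lines` parameter are all assumptions, and the render bodies are invented purely for illustration.

```ruby
# Hypothetical reconstruction; every name below is an assumption.
class Format
  # Filled in by the formatter before render is called.
  attr_accessor :params

  # Abstract: subclasses override this and return a String.
  def render
    raise NotImplementedError, "#{self.class} must implement render"
  end
end

class Formatter
  # Each formatter class keeps its own hash of format name => Format subclass.
  def self.formats
    @formats ||= {}
  end

  # Look up the format class, instantiate it, assign params, call render.
  def self.render(format, params = {})
    format_obj = formats.fetch(format).new
    format_obj.params = params
    format_obj.render
  end
end

class LinePlotter < Formatter
  class Text < Format
    def render
      params[:lines].map { |(x1, y1), (x2, y2)|
        "(#{x1},#{y1}) -> (#{x2},#{y2})"
      }.join("\n")
    end
  end

  class SVG < Format
    def render
      body = params[:lines].map { |(x1, y1), (x2, y2)|
        %(<line x1="#{x1}" y1="#{y1}" x2="#{x2}" y2="#{y2}" stroke="black"/>)
      }.join
      %(<svg xmlns="http://www.w3.org/2000/svg">#{body}</svg>)
    end
  end

  formats[:text] = Text
  formats[:svg]  = SVG
end

data = [[[0, 0], [10, 10]], [[10, 10], [20, 0]]]
puts LinePlotter.render(:text, lines: data)
puts LinePlotter.render(:svg,  lines: data)
```

Switching output formats only requires changing the symbol passed to `render`, which is the property the talk is after.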
00:05:50.560 Okay, so what I'd like to do is lift that up a little bit because it's tedious to keep writing those definitions over and over again.
00:06:00.559 So I'd like to be able to write something like this, which is totally in the style of Sinatra or something like that.
00:06:07.559 Totally, completely stolen straight from there, but the idea is that we'd like to be able to register and define these steps all at the same time.
00:06:14.560 To do that is way easier than you might think. You've got the format method which takes the format you want to work with and a block, and it uses that block to create an anonymous subclass of format.
00:06:22.239 Then it just sticks it in the hash so that it knows how to look it up later.
00:06:28.320 All that actually does is give you the same end result as our sort of raw code.
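A minimal sketch of that Sinatra-style `format` method, assuming the same hypothetical `Format` and `Formatter` classes as before (none of these names come from the slides). `Class.new(Format, &block)` builds the anonymous subclass and evaluates the block as its class body.

```ruby
# Hypothetical names throughout; a sketch of the idea, not the talk's code.
class Format
  attr_accessor :params

  def render
    raise NotImplementedError, "#{self.class} must implement render"
  end
end

class Formatter
  def self.formats
    @formats ||= {}
  end

  # Register and define a format in one step: the block becomes the body
  # of an anonymous Format subclass, which is stored in the hash.
  def self.format(name, &block)
    formats[name] = Class.new(Format, &block)
  end

  def self.render(name, params = {})
    format_obj = formats.fetch(name).new
    format_obj.params = params
    format_obj.render
  end
end

class LinePlotter < Formatter
  format :text do
    def render
      params[:lines].map { |from, to| "#{from.inspect} -> #{to.inspect}" }.join("\n")
    end
  end
end

puts LinePlotter.render(:text, lines: [[[0, 0], [1, 1]]])
```

Because the registry only stores classes, formats defined with the block form and formats defined as ordinary named subclasses can live side by side in the same hash.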
00:06:43.520 Now, we want to talk about just going one step higher, and that's if there's some common code between two different formats.
00:06:56.840 Since both of those formats have access to the same source data, if there's something you're going to be doing for both of them, it would be nice to be able to share some common code.
00:07:10.560 So, of course, Ruby gives us something to do that in modules. The most simple thing that we could possibly do is just create a module.
00:07:17.120 Say, okay, we have these raw arrays; it would really be nice to work with something a little bit more structured, so we made these line structs.
00:07:24.480 Then, inside of the text formatter, we include those helpers, and now we can call the lines method directly and do whatever we want with it.
00:07:31.440 This shows you a little example of how you would generate that stuff.
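The manual version of that sharing might look like the sketch below. The `Line` struct, the `LineHelpers` module, and the `TextFormat` class are illustrative assumptions; only the overall shape (a module with a `lines` method, included into a format) comes from the talk.

```ruby
# Hypothetical names; the struct fields are an assumption.
Line = Struct.new(:x1, :y1, :x2, :y2)

module LineHelpers
  # Wrap the raw nested arrays in something more structured.
  def lines
    params[:lines].map { |(x1, y1), (x2, y2)| Line.new(x1, y1, x2, y2) }
  end
end

class TextFormat
  include LineHelpers
  attr_accessor :params

  def render
    lines.map { |l| "(#{l.x1},#{l.y1}) -> (#{l.x2},#{l.y2})" }.join("\n")
  end
end

text_format = TextFormat.new
text_format.params = { lines: [[[0, 0], [10, 10]]] }
puts text_format.render
```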
00:07:39.399 But again, what we'd like to do is lift this up because if we're doing it again and again, you don’t really want to type that include helpers into every single format that you're working with.
00:07:49.560 So, stealing again straight from Sinatra, we make something that allows you to have helpers, and you define your helper module through a simple block form.
00:08:01.300 Now, every format that you create in the context of this formatter will have access to those methods.
00:08:09.600 The implementation here, again, is simple. All you do is create a class method called helpers that takes a block.
00:08:17.360 It creates an anonymous module, and in essence, all it's doing is what you did manually, but dynamically.
00:08:25.200 Now render only needs one line change. We don't need to include it in the whole class necessarily; we could just extend it into an instance.
00:08:32.479 That works fine; these are now Singleton methods on the format instances.
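Here is one way the `helpers` block could be wired up, again with hypothetical names. `Module.new(&block)` creates the anonymous module, and `render` extends each format instance with it, so the helper methods become singleton methods on that instance.

```ruby
# Hypothetical sketch; names and struct fields are assumptions.
Line = Struct.new(:x1, :y1, :x2, :y2)

class Format
  attr_accessor :params
end

class Formatter
  def self.formats
    @formats ||= {}
  end

  def self.format(name, &block)
    formats[name] = Class.new(Format, &block)
  end

  # Build an anonymous module from the block and remember it.
  def self.helpers(&block)
    @helpers = Module.new(&block)
  end

  def self.render(name, params = {})
    format_obj = formats.fetch(name).new
    format_obj.extend(@helpers) if @helpers # the one-line change
    format_obj.params = params
    format_obj.render
  end
end

class LinePlotter < Formatter
  helpers do
    def lines
      params[:lines].map { |(x1, y1), (x2, y2)| Line.new(x1, y1, x2, y2) }
    end
  end

  format :text do
    def render
      lines.map { |l| "(#{l.x1},#{l.y1}) -> (#{l.x2},#{l.y2})" }.join("\n")
    end
  end
end

puts LinePlotter.render(:text, lines: [[[0, 0], [10, 10]]])
```

Extending the instance rather than including the module into the class keeps the format classes themselves untouched, at the cost of doing the extension on every render call.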
00:08:40.359 With those changes, you've got something in which you can pick and choose. You can use the explicit interface or the implicit, sort of pretty sugar.
00:08:46.760 Both work, and in the end, you end up with something like this. It may be hard to see, but it's just all the code that you've been looking at so far. You've got your common code at the top and then two separate formats.
00:08:55.040 That gives you what we wanted in the first place: a nice way to abstract common code out and then have a standard interface between two different formats, or as many as you want.
00:09:04.840 Now, I've been experimenting with these ideas. There's a lot of things you would want to do in addition to that, if you were going to do something interesting. Parameter validation, for example, or things like more complex pre-processing.
00:09:19.440 Allowing you to use different base classes. Say, for example, I make a PDF format, and I include it as an extension in Prawn, and then you could just pull it in and drop your format from that.
00:09:32.920 I've been playing around with things like that, but I didn't want to get into that. I want to just go to the basics. Ruport will make this significantly confusing enough with just these features, I promise you.
00:09:47.640 So, the idea here is that this code is nice, but it's not that exciting, kind of like a windmill. Your work is the thing that really needs to be exciting.
00:10:00.360 If you happen to be writing exciting code, that's great, but if you're trying to write exciting code, then unless you're doing it just for the fun of it—which I think is awesome—you can learn so much.
00:10:11.640 How many people like things like golfing? You know the sort of... yeah, not the sport but shortening the code or just convoluted code contests, things like that.
00:10:18.480 You can learn so much from that stuff, but if you're actually focusing on being a producer of things that people use, then your work is the thing that matters.
00:10:25.200 But we are hackers, and that means that we love a good challenge. When I was done with this, I didn't feel like Don Quixote. I felt like David versus Goliath.
00:10:37.120 You know, not with this first code, but with this Ruport code that we're going to look at. Because that's the day that I thought I killed the giant.
00:10:48.560 Alright, I am going to go directly into the Ruport code. This was inspired from... before I do that, are there any questions about the stuff that I showed so far?
00:10:57.920 Okay, cool. I have to warn you, I'm going to get confused, and so are you. This is going to be terrible, but it's going to be a good exercise in code reading.
00:11:06.880 Just to give you the analogies that you need, we're working with the same exact example, but a format in the context of Ruport in 2006 is a format plug-in.
00:11:18.080 A formatter is a format engine, and I'm not a TextMate user, so I'm only using this because it's easy to click at.
00:11:29.440 And I'm afraid to use my own workflow, so if I do a really bad job using this editor, just forgive me.
00:11:39.200 Anyway, I've got a bunch of questions, and when I said this would be a good exercise in code reading, I definitely meant that.
00:11:51.880 How many people go out and read code on a weekly basis? Okay, other people's code? Nice! That's really good.
00:12:04.640 How about code that's not related to your work? Awesome! Okay, perfect. So that's a really good thing.
00:12:14.960 That's probably because we're at MountainWest, you know, and this is a hackers conference, and that's awesome.
00:12:24.440 But I mean, the thing is that when I sit down to read code, I try to come up with some questions to sort of guide me.
00:12:30.160 Then I just follow those questions wherever they may lead me. So let's walk through this example first and just take a quick look through it.
00:12:39.920 At the top, can everybody see this? Good size? Yeah, okay, good. Alright, so at the top, you see something that's sort of like a Twilight Zone version of what we did in the first place.
00:12:48.960 But this is because this is the actual code example from 2006. You can see that it's pretty similar.
00:12:59.680 You're telling it what format you're using and what data you have, and then it just does it for you.
00:13:09.120 If we had a text plugin or an XML plugin or just straight XML, whatever it is, it would just look like that. And that's the goal.
00:13:17.680 When we look at the SVG plugin, the content for this looks pretty much the same way that it would look using the other approach that I had mentioned.
00:13:28.480 We've got this idea of defining a renderer. Instead of just doing def render, Ruport gave you a little helper for that. Not a big deal.
00:13:35.760 And then it registers itself on the line plotting engine. So the line plotter is the engine, and you can see this code again looks sort of similar.
00:13:48.880 The interesting thing here is not in the interface so much, even though there's some little things we could talk about, it's in the implementation.
00:13:57.760 So we're going to drill down into that implementation and decide whether or not this is a good idea to be working this way.
00:14:09.520 When I read code, if I'm looking for a particular problem, I tend to go through wherever I happen to enter it, you know, from the outside API and then work my way down.
00:14:19.440 But if I'm reading for explanation, I tend to go from the bottom up just because it helps me work with the most simple objects first.
00:14:29.760 Try and understand what they are and then see how they interact with other things. So that's what we're going to do.
00:14:37.920 So a good first question might be, how is the format plugin registered with the engine?
00:14:44.640 We see this line, register on. Alright, so let's go take a look for that.
00:14:51.680 Yes, okay, so immediately I see that it does an unnecessary type check. Awesome!
00:15:01.680 Then it calls the brilliantly named method, format engine.engine klasses. Why is there a K on klass in the argument that's being passed to this method?
00:15:10.560 Right, class is a keyword, but why is there a K on engine klasses? Because I'm an idiot, alright?
00:15:17.840 And this is clear that...what's that? It's consistent! Okay, so classic example of cargo culting. I knew that I had to use klass because class wouldn't parse, but I didn't get why, so I just made it consistent.
00:15:32.560 I think novices are amazingly good at being consistent with their mistakes. And at this time, I would definitely be a Ruby novice.
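To make the point concrete: `class` is a reserved word, so it cannot be used as a parameter or local variable name, but nothing stops it from appearing inside a method name. The `describe` method and `EngineRegistry` class below are made-up illustrations, not Ruport code.

```ruby
# `class` is reserved, so this would be a syntax error:
#   def describe(class) ... end
# The conventional workaround is to misspell the variable name.
def describe(klass)
  klass.name
end

# A method name is not a bare keyword, so the misspelling is unnecessary here.
class EngineRegistry
  def self.engine_classes
    @engine_classes ||= {}
  end
end

EngineRegistry.engine_classes[:string] = String
puts describe(EngineRegistry.engine_classes[:string])
```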
00:15:43.840 Although, definitely a Ruby novice, okay? But now let's play around with this.
00:15:50.960 Okay, so klass right now is some format engine class, that's what I'm reading from this, and it accepts a format plugin by injecting itself.
00:15:59.840 Alright, well now let's go over to engine.
00:16:09.640 Accept format plugin. Okay, great. Now we have to go back to the plugin and get its name.
00:16:17.760 So we've got a little circle going on here which means that clearly one of these things or the other should have been doing the job.
00:16:25.760 But at the end of the day, this is doing something that looks almost the same as what the code that I showed that I said was good does.
00:16:33.600 Which is it's storing these format classes in a hash so that it could look it up later.
00:16:43.920 Now we saw before that you could have just done that on just the formatter, but hey, I didn't know it at the time.
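The round trip being described can be sketched like this. It is a simplified illustration under assumed names (`Engine`, `SVGPlugin`, `DirectEngine`), not the 2006 Ruport source: the plugin hands itself to the engine, and the engine immediately calls back into the plugin to ask for its name.

```ruby
# Simplified illustration; class and method names are assumptions.
class Engine
  def self.plugins
    @plugins ||= {}
  end

  # Step 2: the engine calls back into the plugin to learn its name.
  def self.accept_format_plugin(plugin)
    plugins[plugin.format_name] = plugin
  end
end

class SVGPlugin
  def self.format_name
    :svg
  end

  # Step 1: the plugin injects itself into the engine.
  def self.register_on(engine)
    engine.accept_format_plugin(self)
  end
end

SVGPlugin.register_on(Engine)

# The direct alternative: one object owns the hash and one line fills it.
class DirectEngine
  def self.formats
    @formats ||= {}
  end
end

DirectEngine.formats[:svg] = SVGPlugin
```

Both versions end with the same hash entry; the first just takes two extra method calls across two classes to get there.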
00:16:55.920 So let's see, where are my... okay, so how does alias engine work?
00:17:07.840 Oh wait, we had one more thing to look at before we did that. Okay, so let's see.
00:17:15.760 Follow it back up: accept format is klass format name. Now if we go in there, we see something sort of interesting.
00:17:25.040 It really didn't ask you to explicitly define your format with a symbol, like we did before. It infers it from the name of the class.
00:17:33.560 Which, in theory, is great because it's convention over configuration, but in practice, it means you’re doing a completely unnecessary step.
00:17:41.480 Because you have to say the word SVG or text at some point anyway. So that great little bit of code right here does that.
00:17:51.600 That's how it matches SVG to the SVG plugin. It introduces all sorts of interesting edge cases and bugs and things like that.
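A rough sketch of name inference and the kind of edge cases it invites. The `inferred_format_name` helper and the `Plugin` suffix convention are assumptions for illustration; this is not Ruport's actual implementation.

```ruby
# Hypothetical helper: derive a format symbol from a class name.
def inferred_format_name(klass)
  klass.name.split("::").last.sub(/Plugin\z/, "").downcase.to_sym
end

class SVGPlugin; end

module Reports
  class TextPlugin; end
end

module Invoices
  class TextPlugin; end
end

p inferred_format_name(SVGPlugin)

# Edge case 1: two unrelated classes map to the same name.
p inferred_format_name(Reports::TextPlugin) == inferred_format_name(Invoices::TextPlugin)

# Edge case 2: an anonymous class has no name at all, so inference blows up.
begin
  inferred_format_name(Class.new)
rescue NoMethodError => e
  puts "anonymous class: #{e.class}"
end
```

Passing the symbol explicitly, as in the newer design, avoids both cases and costs one extra word at the call site.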
00:18:02.800 That's exactly the sort of stuff that, when I started to feel like I just knew a tiny little bit about programming, I wanted to get out of my life.
00:18:16.160 You know, convenience at the cost of clarity is only useful if you're just trying to spike on something and get it done right then and there.
00:18:30.040 Okay, so now we've gone through that and the next question I had was how does format engine alias engine work?
00:18:42.880 Okay, so we've got this register on line plotting engine. It doesn't explicitly refer to any of the classes or anything like that.
00:18:56.600 But we see that Alias Engine, the line plotter, line plotting engine is here, so that links a name of a class with a symbolic name.
00:19:09.760 So let's look for that.
00:19:19.760 Yes, okay, thankfully, this one is straightforward.
00:19:27.840 So again, cargo culting with the classes thing, but it's just storing these things in a hash.
00:19:37.480 Again, we saw before that you don't need anything like this. This is sort of rework; we're doing the same thing more or less twice.
00:19:47.680 Now, we get how that works. Alright, now let's see format plug-in renderer.
00:19:58.720 How are we at for time?
00:20:05.920 Okay, great. So format plug-in renderer.
00:20:14.440 Alright, let's go look for that one.
00:20:22.680 Oh, okay, so exciting again.
00:20:31.760 So let's go back to our original code, and we've got render plot.
00:20:40.160 So it's renderer and then some name for the kind of renderer that it is.
00:20:47.440 And let's see, so that's here. This would be a symbol plot.
00:20:56.120 It does some more awesome cargo culting which is that it takes this string and converts it to a symbol.
00:21:02.960 We can ignore this line because it's not relevant to our example, but then it does something great.
00:21:10.240 It uses that to define a method, but that method is being defined on the class, which is a little bit confusing.
00:21:18.040 And we'll mention that before, but here's the cargo cult I mentioned earlier.
00:21:28.120 This is why... why did I convert it to a symbol? No reason at all. Perfect!
00:21:38.120 Thank you. Alright, but has anyone else ever seen people do that? Has anyone else themselves done that?
00:21:47.880 Oh yeah, yeah, you don't need to do that. Maybe learn the APIs instead of copying and pasting from blog posts. I still do it to some extent.
00:21:57.320 But this is a classic example of that. Alright, so seven, okay, that's alright.
00:22:06.480 So things are going on at the class level. This here is just because we're at the class level and we want to define a class method.
00:22:16.360 This here is because we're doing something a little bit weird, so we have to use send.
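For comparison, a small sketch of defining a class-level renderer without the extra conversion. The `Plugin` and `SVGPlugin` classes and the `renderer` helper are hypothetical. `define_method` and `define_singleton_method` both accept a String or a Symbol, so the `to_sym` call adds nothing. The `send` was needed in older Rubies because `define_method` was private; it has been public since Ruby 2.5.

```ruby
# Hypothetical sketch, not the Ruport source.
class Plugin
  # Defines a class-level method such as render_plot from the block.
  # The string name is passed straight through; no to_sym required.
  def self.renderer(name, &block)
    define_singleton_method("render_#{name}", &block)
  end
end

class SVGPlugin < Plugin
  renderer(:plot) { "<svg/>" }
end

puts SVGPlugin.render_plot
```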
00:22:26.080 So be it; whatever, not that interesting. If I've got seven minutes left, I want to bring you to the cool stuff.
00:22:36.080 Alright, so how does engine renderer work?
00:22:46.320 Oh, we didn't actually say how it worked; I just ripped on it.
00:22:56.080 So, the plug-in renderer defines a method called something like render after plot.
00:23:04.960 And that makes some sense because you can see that that's being called in the plugin.
00:23:13.919 So when you do this, it makes render something and then you call it from this.
00:23:21.840 Format engine renderer: they both got the same name. You would think they do something quasi similar.
00:23:32.920 Am I miss...? Yes, alright, whoever is saying 'up' is the best. Thank you all!
00:23:44.560 So, did I lie to you again? Okay, let's see.
00:23:52.080 Send to inline method render. Okay, so it is doing the same thing. Good! It's basically defining a method called render.
00:24:02.240 Which then gets used, but this is where things start to get interesting.
00:24:12.000 Let's see where does the magic happen?
00:24:22.360 This, okay, sorry. Okay.
00:24:30.160 So now, the big question is: what does this format build interface for line plotter thing do?
00:24:37.920 Okay, so format build interface for. Great. So again, on every single line, I'm reopening classes and doing all of this awesome stuff.
00:24:49.440 Apparently, it's defining a method that has a lambda around it that passes options into something called Simple Interface.
00:25:02.800 Which sounds simple, right? So then name object is just... it's defining a method called name object.
00:25:11.760 When I do that, this is just because we want to have both a way to get back an object and to do a direct render.
00:25:19.480 We're only interested in the direct render, but that drags us down to this crazy stuff.
00:25:29.080 Okay, my engine equals engine dup. That is a shallow copy of a class object.
00:25:38.760 Okay, so my engine send plugin equals options plugin. Okay, okay.
00:25:48.720 So now we're dealing with both engines and plugins that are both classes, and we don't subclass them.
00:25:58.240 We don't instantiate them; we don't do anything. We make shallow copies of them all over the place.
00:26:06.880 And now, let's see.
00:26:14.560 Then we can ignore this option stuff; it's not relevant. And that's pretty much how all this crap works.
00:26:26.560 So what's going on? Let's see. When you see active plugin here, this is a duped plugin class, and you're setting stuff on it.
00:26:36.480 That means that, because at the time, plugin didn't have any sort of safe way of doing a copy, it didn't do this right.
00:26:45.760 If you had any state that you happened to put on your subclass or something like that, it’s being shared with everything else.
00:26:54.720 The class itself creates a new duped copy of itself.
00:27:02.640 Okay, so I didn't use anonymous subclasses; I didn't use instantiation. I chose the best possible way of object-oriented design.
00:27:11.360 Classes in Ruby are objects. You can make copies of them; that's how you make more objects.
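The state-sharing problem with duped classes is easy to reproduce. In this sketch (the `Plugin` and `InstancePlugin` classes and the `@options` hash are assumptions), `dup` copies the class object's instance variables shallowly, so the copy and the original end up pointing at the same hash. Ordinary instantiation gives each object its own state.

```ruby
# Hypothetical classes for illustration.
class Plugin
  @options = { title: "default" } # class-level instance variable

  def self.options
    @options
  end
end

copy = Plugin.dup # shallow copy of the class object
copy.instance_variable_get(:@options)[:title] = "changed on the copy"
puts Plugin.options[:title] # the original sees the change too

# The conventional alternative: per-instance state.
class InstancePlugin
  attr_reader :options

  def initialize
    @options = { title: "default" }
  end
end

a = InstancePlugin.new
b = InstancePlugin.new
a.options[:title] = "changed on a"
puts b.options[:title]
```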
00:27:20.080 Yes! Alright, so just to give you a sense, all of this discussion has been about how to do what I can show you right here.
00:27:27.680 That does the same thing, and I explained it in just a couple minutes in the beginning.
00:27:34.880 But I did it the hard way. And why did I do that? Because once I started down a path of expecting something to be hard, insurmountable, scary, difficult, impossible...
00:27:44.920 My mindset was that it was going to be impossibly hard, and the pain that I suffered in doing it was just the cost of being brave.
00:27:58.240 As it turns out, it was just the cost of being ignorant.
00:28:07.280 So read other people's code, learn from it, and go back and read your own code.
00:28:14.720 How many people have read code that they wrote more than two years ago in the last month or so?
00:28:21.760 How many people love the programmer who wrote that code?
00:28:27.680 Alright, so am I pretty much out of time? Two minutes? Alright, so I'm going to close on that, and if you have any questions, I'd be happy to answer.
00:28:34.560 Oh for, yeah exactly! So you know the Don Quixote story of attacking a windmill because you think it's a dragon or a giant or something like that.
00:28:40.360 Whatever things that you think are giants right now, in a couple years introduce yourself to that windmill.
00:28:47.480 You know, that's pretty much the idea: the giants of today are the windmills of tomorrow.
00:28:54.640 Do you know? We are profoundly more blind to our own mistakes than we are to others'.
00:29:02.080 We tend to do a good job. Although, if you think, okay so for example, I wrote this book, Ruby Best Practices.
00:29:10.600 And this is my code from just a couple of years ago. This is trash that would make you want to return this book
00:29:16.840 if you maybe buy it three years from now. The good news is that it's open source and everyone can fix it now.
00:29:25.040 At least it's all of our faults if this isn't good three years from now.
00:29:34.640 Okay, anyone else?
00:29:40.239 Questions? Reass?
00:29:44.560 Unless a block has been given. Hey, there you go, there's another horrible thing that I did.
00:29:53.520 If you're going to use block given, the keyword that Ruby provides, use it with yield.
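A short sketch of the pairing being recommended, using made-up method names. `block_given?` (strictly a Kernel method rather than a keyword) goes with an implicit block and `yield`; if the method captures the block explicitly with `&block`, testing the `block` variable itself is enough.

```ruby
# Implicit block: pair block_given? with yield.
def render_rows(rows)
  return rows.map(&:to_s) unless block_given?
  rows.map { |row| yield row }
end

# Explicit block: the captured Proc is nil when no block was passed.
def render_rows_explicit(rows, &block)
  return rows.map(&:to_s) unless block
  rows.map(&block)
end

p render_rows([1, 2])
p render_rows([1, 2]) { |row| row * 2 }
```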
00:30:00.880 This only happens, Matt. Why does this work?
00:30:07.600 Alright, so I don't...
00:30:23.520 I see, okay, so yeah that's basically it.
00:30:30.640 No, no, no, no, no, no, no.
00:30:35.680 Sorry, sorry, data is a... it is an instance variable of the object.
00:30:45.200 If you realize that all of the data was being passed through copies of the class object.
00:30:52.560 So it's all class data. Everything's a class.
00:30:59.920 Alright, so I guess we'll wrap it there. Thank you very much.