
Summarized using AI

Lightning Talks

Carolina Karklis, Ferdous Nasri, Philipp Tessenow, Ben Fritsch, Uli Ramminger, Andrei Beliankou, Andy Schoenen, Jakub Malina, and Amr Abdelwahab • February 22, 2019 • Tegernsee, Germany

The video 'Lightning Talks' from the Ruby on Ice 2019 event features a series of brief discussions by various speakers on topics related to Ruby programming and its community. The main theme revolves around the vibrant and evolving Ruby community, focusing on diversity, coding philosophy, and technical skills. Key points include:

  • Code Curious Initiative: Ferdous Nasri introduced the rebranding of Rails Girls to Code Curious, emphasizing inclusivity for all genders and a broader focus on coding beyond Ruby on Rails. They reported significant community impact with over 1,500 participants across 40 workshops and invited the audience to support their upcoming events.

  • Streaming CSV Downloads: Philipp Tessenow shared methods for efficiently generating CSV files without timeouts by streaming data directly to users while processing, enhancing application performance with techniques like using the Postgres COPY command.

  • Embracing Code Reviews: Andy Schoenen discussed the importance of code reviews in improving code quality and fostering collaboration. He provided best practices for both a pull request author and reviewers to create a supportive and constructive environment.

  • Postgres Data Types: Ben Fritsch presented how improper use of JSON fields led to performance issues in a large database. He demonstrated that transitioning to structured fields significantly improved query execution times.

  • General Data Structure (GDS): Uli Ramminger introduced GDS, an innovative configuration and data definition language aimed at simplifying data structure management in Ruby applications, showcasing its flexibility with hash-based structures.

  • Ruby Beyond Rails: Andrei Beliankou highlighted Ruby's potential in various applications beyond web development, illustrating its utility in data science and automation with practical examples.

  • Machine Hacking with Ruby on Rails: Jakub Malina discussed controlling the machines on his company's food production line with a Ruby on Rails web app, emphasizing rapid development and reliability.

  • Critique of Ruby: Amr Abdelwahab closed the session with a dialogue on the criticisms levied against Ruby, emphasizing the need for constructive discussions about its limitations and the importance of distinguishing Ruby from Rails. He invited contributions to a research project aimed at uncovering Ruby's underlying issues.

Overall, the talks promote both technical knowledge and a strong sense of community within the Ruby ecosystem, advocating for diversity and continuous learning.


Carolina – Ruby Community in Brazil (https://twitter.com/carolkarklis)
Ferdous – Coming Out (https://twitter.com/ferbsx)
How to Stream CSV Downloads (https://twitter.com/philipptessenow)
ActionController::Metal for JSON imports (https://twitter.com/olliprater)
Know your Postgres data types (https://twitter.com/beanieboi)
Uli – GDS, a new configuration and data definition language
Ruby without Rails (https://twitter.com/_arbox_)
Dance like nobody is watching (https://twitter.com/andysoiron)
Jakub – HW & Machine Hacking with Ruby & Rails
What's Wrong with Ruby? (https://twitter.com/amrAbdelwahab)

Ruby on Ice 2019

00:00:12.540 Ferdous, talking about coming out. Welcome! Is it on? No, no, it's on. Hello, everybody! Can you hear me? Yay, hello! So, I'm Ferdous, and this is Kaya. Come over here so I can see you. Yes, I think it's time to come out. Who here knows Rails Girls? I know that question was asked, but I couldn't see you all, okay? For those who don't know, this was an initiative started in Finland by Linda Liukas in 2010. It's inspired by RailsBridge from the US and it focuses on beginner workshops for women – free programming workshops. We started our own chapter in Berlin. Uta, who's an organizer, are you here? I can't see her. Yes, she's there! She was one of the organizers and founders, and she's amazing. We've been doing this for seven years. We've had more than 1,500 students and about 40 workshops, eight project groups have formed, and around 300 coaches have volunteered to help us. But we've grown up, and we're slowly thinking of coming out.
00:02:07.260 So, we went on a retreat about a year and a half ago where we finally had time to think about ourselves. We realized, "Hey, we are not girls; we're grown women, and we're also not just focused on Rails, but on coding in general too!" We went through a big transition process, criticizing our own name and considering whether it was time to grow up and reveal a new name that better reflects our identity. We've had trans and intersex individuals approaching us, wondering if they were welcome at our workshops, which is horrible. Of course, they are welcome!
00:02:29.970 So now we want to present to you our new name and new identity: Code Curious! This name says a bit more about who we are: curious individuals who want to spread this curiosity about coding to others, no matter where they come from or what they did before in their lives. We also have a really cool logo, I think... a curious squirrel! This is our coming out. We’re going to throw a party to celebrate and show you what we mean by this. You’re all invited! It's going to be in Berlin on March 14th – save the date! Please contact us if you know a good location; our previous venue pulled out due to a double booking!
00:03:16.360 And if you want to sponsor us or if your company would like to help, please come and talk to us! We need your support! We would love to have help organizing and coaching. We want to take Code Curious further and ideally establish three tracks: a back-end track, possibly with Ruby on Rails, a front-end track, and a DevOps track. In a perfect world where we have enough organizers and coaches, we would love to hold monthly workshops. There are so many women with a 'w' who need help and would love to learn. For every single workshop, we have three times as many applicants as we can accommodate, so we really need help!
00:04:09.000 We want to thank Ruby on Ice for everything, for all the organization, and for all the help. I've sent them so many emails, and they’ve responded immediately. They’re just so amazing! We also want to thank all the diversity ticket sponsors – each one of you who donated a little bit – because that's why I'm here, that's why Kaya’s here, and that's how our Brazilian Rails Girls are here. You are all amazing, and we want to give you all our love. Thank you!
00:05:51.870 So now I can switch to our next speaker, Philipp, who will talk about streaming CSV downloads. Thank you! Also, thanks for resetting the time, so the next five minutes will be a little less world-changing than Code Curious, but it was born out of curiosity about code. It all started with this little law of nature: everywhere you go, you need some kind of CSV or JSON download. It seems easy, right? You can just Google it, find the docs, or find a Stack Overflow post.
00:06:34.590 So, in some users controller, you can just respond to the CSV format, and in the render CSV method, you can call send_data with the whole CSV string. Since we are fancy and maintainable and test everything, as I know you do, we use service objects. The service object will provide us with the whole CSV string that we send to the user. This is great! However, as the app eventually gets more users, generating the CSV will take more time. After a year or so, you will receive bug reports saying that generating those CSVs takes longer than your HTTP server is willing to wait. This is where you need to put your engineering hat on and start implementing background jobs that produce the CSV files, which are then stored in a temporary file. You have to clean that file up after the download and notify the user that their CSV is ready. This used to be so easy, but now it’s become quite cumbersome.
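The "whole string" approach described here can be sketched as follows. This is a minimal illustration, not the speaker's code: the `User` struct, the `UsersCsv` service object, and the column names are all hypothetical, and the Rails controller call is shown only as a comment.

```ruby
require "csv"

# Hypothetical stand-in for a model; a real app would use its own records.
User = Struct.new(:id, :name, :email)

# Hypothetical service object that builds the complete CSV in memory.
class UsersCsv
  HEADER = %w[id name email].freeze

  def initialize(users)
    @users = users
  end

  # Returns the whole CSV document as a single string.
  def to_csv
    CSV.generate do |csv|
      csv << HEADER
      @users.each { |user| csv << [user.id, user.name, user.email] }
    end
  end
end

# In a Rails controller action, the string would be handed to send_data:
#   send_data UsersCsv.new(users).to_csv, filename: "users.csv", type: "text/csv"
```

Nothing is sent until `to_csv` has finished, which is why this version starts timing out once the data set grows.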
00:08:30.930 I googled and thought, maybe we can give the HTTP server something weaker. Maybe we can stream the CSV data while generating it, so it doesn’t time out, and the user can immediately receive some CSV data. This is actually possible with Rack. Not only Rails, but all Rack apps can implement this. You can set the response body to an enumerator. In this case, we iterate through all the data as a stream, and it just works and doesn’t time out.
00:09:43.730 When we rewrite our user service object and refactor it a little bit to return not the whole CSV string, but only a single line of CSV at a time, we can use this enumerator trick to stream to the user. We can take as long as we want, within reason, without it timing out, which is great! I tried it, but then I got curious: instead of generating this in Ruby, isn't a database faster at processing a lot of data?
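The enumerator trick can be sketched in plain Ruby without loading Rack itself, since a Rack response body only has to respond to `each`. The `User` struct, the `UsersCsvLines` class, and the `csv_app` helper are hypothetical names used for illustration; this is a sketch of the idea, not the code shown in the talk.

```ruby
require "csv"

# Hypothetical stand-in for a model.
User = Struct.new(:id, :name, :email)

# Hypothetical service object that produces one CSV line at a time.
class UsersCsvLines
  HEADER = %w[id name email].freeze

  def initialize(users)
    @users = users
  end

  # Yields each CSV-encoded line; returns an Enumerator when no block is given.
  def each_line
    return enum_for(:each_line) unless block_given?

    yield CSV.generate_line(HEADER)
    @users.each do |user|
      yield CSV.generate_line([user.id, user.name, user.email])
    end
  end
end

# Builds a Rack-style app: the body is an Enumerator, so the server can
# write each line to the client as soon as it has been generated.
def csv_app(users)
  lambda do |_env|
    headers = {
      "content-type" => "text/csv",
      "content-disposition" => 'attachment; filename="users.csv"'
    }
    [200, headers, UsersCsvLines.new(users).each_line]
  end
end
```

Whether the lines really reach the client incrementally also depends on the server and on any middleware that buffers the body, so this is worth checking in the actual stack.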
00:10:30.970 I googled again and found the Postgres COPY command, which is a SQL feature. I know, it's not that hard. Just wrap your SELECT statement in 'COPY (...) TO STDOUT' and add 'FORMAT CSV'. You get your CSV output, and there’s a little trick I can’t explain right now whereby Postgres can stream this directly to a Ruby app. You can then stream it out to the client easily. In the end, it looks like this. People have extracted this into a little gem, so you can easily render CSV from SQL; for example, be it an Echo or an SQL operation.
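A rough sketch of how the COPY approach could look from Ruby, assuming a connection object that behaves like `PG::Connection` from the pg gem (which provides `copy_data` and `get_copy_data`). The helper names `copy_csv_sql` and `csv_chunks` are hypothetical, and this is not the gem mentioned in the talk.

```ruby
# Wraps a SELECT in COPY ... TO STDOUT so that Postgres formats the CSV itself.
# The SELECT is interpolated as-is, so it must come from trusted code.
def copy_csv_sql(select_sql)
  "COPY (#{select_sql.strip.chomp(';')}) TO STDOUT WITH (FORMAT CSV, HEADER)"
end

# Returns an Enumerator over the CSV rows produced by Postgres.
# `conn` is assumed to respond to copy_data and get_copy_data like PG::Connection.
def csv_chunks(conn, select_sql)
  Enumerator.new do |yielder|
    conn.copy_data(copy_csv_sql(select_sql)) do
      # get_copy_data returns one row per call and nil once the copy is done.
      while (chunk = conn.get_copy_data)
        yielder << chunk
      end
    end
  end
end

# Possible usage with a real connection (requires the pg gem and a database):
#   conn = PG.connect(dbname: "app_development")
#   csv_chunks(conn, "SELECT id, email FROM users").each { |row| print row }
```

Because the result is an Enumerator, it can be used as a streaming response body in the same way as a Ruby-generated one.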
00:11:56.020 This will stream your response while also automatically gzipping it to save some bytes on the wire. To be honest, I haven’t used this in production yet because, frankly, I just created this on the train ride to this conference, but I would be happy to talk about it. This is just a very small code base, basically three files of Ruby, and we can enormously over-engineer it if we want. We can also look at the headers, and we can have fun with that. So, please feel free to reach out to me on Twitter or GitHub!
00:12:51.150 Our next speaker is Andy, who will talk about dancing like nobody's watching. Yes! If the title is not too misleading, my name is Andy, and there's a story behind that picture. Feel free to ask me about it later. I work as a back-end engineer and when I arrived here on Friday, I was given a little fortune cookie that said, "Dance like nobody's watching, code like everybody is." What a true statement! It reminded me of the early days of my career, where I was introduced to code reviews, and it was super scary for me. I felt that everyone was watching my code and judging it.
00:13:25.850 However, I later realized that code reviews are actually quite helpful. They bring a lot of good outcomes. Here are some reasons why we should embrace code reviews. Firstly, they improve my code. Knowing that people will review it makes me more aware and conscious. Additionally, they spread knowledge across the team, which can help discover bugs early and make more people responsible. If I do something very stupid and someone approves it, then they are also part of that mistake.
00:14:19.020 This isn’t about blame; rather, if there's no culture of blaming, it fosters a culture of learning from failures. Nonetheless, getting code reviews right can be challenging. I want to share a few points based on my experiences. There are two sides to consider: the author of the pull request and the reviewer. Authors should adequately describe their pull requests, giving context on why changes are needed and ensuring the request is as small as possible. If some changes go beyond the scope of the pull request, those can be addressed in another request. If there are complex problems, grab a teammate and pair with them so the communication and understanding can be smoother.
00:15:39.440 However, it’s even harder to get it right as a reviewer because written communication can often be misunderstood. Reviewers should be aware of their responsibilities; if you review code, it’s also your code now, so be careful and read it thoroughly. If something is difficult to understand, feel free to ask for clarification. Furthermore, suggest changes constructively rather than dictating them. For example, instead of saying 'please change this to that,' you could say, 'how about this instead?' This invites a healthy conversation rather than putting the author on the defensive.
00:16:43.620 Also, avoid rhetorical questions like 'why didn’t you just do this?' as they can make the author feel defensive. Instead, frame it positively. If you have a suggestion, offer an example or code block to clarify your point. Always remain supportive. GitHub introduced the feature of requesting changes, and while it may seem harsh initially, remember that it doesn’t mean everything is wrong – it’s merely an invitation for some adjustments.
00:17:32.240 It’s about improving your code. So, let's embrace code reviews as a critical aspect of our workflow. This talk was inspired by my fortune cookie and Derek Prior, who has a great talk about implementing a strong code review culture. Please go watch it!
00:18:42.240 Hello, everybody! I can't see anyone, but sorry for my cracking voice; you’ve pushed it with everyone. It's really nice but quite taxing. I want to talk about data types in your database. This talk is based on some work I've been doing at Heroku over the last couple of weeks. Let's set up some context. We have one big table with a couple of hundred million records, and as always, we have some slow queries. But looking at average execution times, they’re around 10 milliseconds, which seems fine.
00:19:25.810 However, I discovered we have some crazy outliers, with execution times up to 1.2 seconds for the same query. So let’s find out what went wrong. Our table has a large JSON field. For context, Postgres has supported JSON for the last four or five years. For example, if you want to extract some data from this JSON blob, you’d use the double arrow notation (->>) to retrieve the URL out of it. But let's look at our actual query.
00:20:05.930 In our query, selecting the URL from the table with certain conditions resulted in those execution times. When I first examined it, I was perplexed that it could take 1.2 seconds. I started with EXPLAIN ANALYZE, which provides internal metrics about how a slow query is executed. Recently, I discovered the BUFFERS option, as in 'EXPLAIN (ANALYZE, BUFFERS)', which gives stats about the I/O subsystem in Postgres. For this particular case, it revealed that Postgres was reading many blocks out of the cache: over 60 megabytes of data just for a single URL, which is quite excessive!
00:21:06.300 So how do we fix it? I converted this single JSON field into three separate text fields, and the query execution time dropped from 1.2 seconds to a regular 10 milliseconds! The main takeaway is that JSON wasn't necessary for our use case. When we started this project three or four years ago, it was the best choice given the unstructured data we had. What changed is that we now have significantly more information on how we use the system. This allowed me to conclude that JSON was not a suitable data structure in our case.
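To make the before and after concrete, here is a sketch of the kind of SQL involved, held in Ruby strings. Every name is hypothetical (the table `events`, the JSON column `payload`, the keys `url`, `method`, `status`, and the filter on `id`); the talk does not show the real schema.

```ruby
# Before: the URL is extracted from a large JSON column with ->>.
SLOW_QUERY = <<~SQL
  SELECT payload->>'url' AS url
  FROM events
  WHERE id = 42
SQL

# EXPLAIN (ANALYZE, BUFFERS) runs the query and reports, per plan node,
# how many buffer pages were hit in cache or read from disk.
def explain_with_buffers(sql)
  "EXPLAIN (ANALYZE, BUFFERS)\n#{sql}"
end

# The change: three plain text columns filled from the JSON blob.
# On a table with hundreds of millions of rows the UPDATE would
# normally be run in batches rather than as one statement.
MIGRATION = <<~SQL
  ALTER TABLE events
    ADD COLUMN url text,
    ADD COLUMN http_method text,
    ADD COLUMN status text;

  UPDATE events
  SET url = payload->>'url',
      http_method = payload->>'method',
      status = payload->>'status';
SQL

# After: the same lookup reads a small text column instead of the blob.
FAST_QUERY = <<~SQL
  SELECT url
  FROM events
  WHERE id = 42
SQL

puts explain_with_buffers(SLOW_QUERY)
```

Comparing the buffer counts reported for the two queries is one way to confirm that the large JSON value is no longer being read.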
00:22:29.130 It turns out we've been inserting the same JSON structure for years, so there was no real point in using a data structure better suited for unstructured data. We had been misusing JSON and reading large chunks of data, which was not an efficient use of our time. Therefore, I encourage everyone to revisit their database schema. What may have made sense at the time might not be appropriate anymore. Evaluate whether the data types you're using still make sense, but don’t change merely for the sake of change. As long as it works, it works!
00:23:25.479 However, if you face issues, it may be time to make adjustments. Thanks! Next, I’ll be speaking about GDS, a new configuration and data definition language. Hi, everybody! Today, I want to discuss an idea I had for a new data definition and configuration language, which I named GDS, for General Data Structure. This aims to be a universal and composable data structure for use in storing any kind of data or information. Typical use cases would be defining configuration specifications and data sets.
00:24:42.230 The concept of GDS uses hashes in Ruby to construct these data structures with basic values as well as nested hashes. The keys of these hashes are always symbols, meaning that no other key types are allowed. For instance, you might define a simple hash, or an array, along with composed structures. On top of this basic concept, I’ve defined a special DSL specifically for GDS. It’s designed to be succinct and indentation-sensitive, using indentation to create a hierarchical structure and avoiding the usage of curly braces and square brackets.
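As a plain-Ruby illustration of the underlying idea (nested hashes and arrays with basic values, where every hash key is a symbol), here is a small check. It does not use the GDS DSL or its library; the `symbol_keys_only?` helper and the sample data are hypothetical.

```ruby
# Returns true when every hash key in the (possibly nested) structure is a Symbol.
def symbol_keys_only?(value)
  case value
  when Hash
    value.all? { |key, nested| key.is_a?(Symbol) && symbol_keys_only?(nested) }
  when Array
    value.all? { |item| symbol_keys_only?(item) }
  else
    true # basic values such as strings, numbers, true, false, nil
  end
end

# A composed structure: a hash containing a nested hash and an array of hashes.
config = {
  app: { name: "demo", ports: [8080, 8081] },
  authors: [{ name: "Ada" }, { name: "Grace" }]
}

puts symbol_keys_only?(config)               # true
puts symbol_keys_only?({ "name" => "demo" }) # false, the key is a String
```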
00:26:02.820 Let me give you a quick overview of the syntax: you use a colon to define a hash, commas to create arrays, and vertical pipe symbols to separate multiple values or key-value pairs on the same line. Indentation always uses two space characters, and both inline and block comments are allowed. In this simple example, the comma at the top defines an array with one hash element, which contains key-value pairs, and the second item is an array containing hashes.
00:27:11.210 The default structure is a hash, so you don’t need to define a colon at the top level. Just start typing out the keys and values. There’s also a method where you can create this structure. The core of GDS can support various data types including integers, floats, symbols, string literals, and keywords like true, false, and nil. Any other type will be treated as a default string literal. Variables can also be used as placeholders for basic values, allowing for substitutions in strings.
00:28:14.610 GDS provides schema specifiers to streamline the definition of datasets, allowing you to define keys and values separately. Furthermore, references are available to manage various relationships, defined using an ampersand, a dollar sign, and an asterisk, which help with merging hashes into others. These references can help define relations between datasets, like one-to-one, many-to-one, and one-to-many, all in a single block.
00:29:31.990 Here is an example of a simple dataset where we define all the relations in one block for various relationships between authors and books. This would further facilitate creating models and saving data into database files. And lastly, I’d be happy to receive any feedback if you’re interested!
00:30:40.160 For the next talk, Andrei will discuss Ruby without Rails. Hi, I’m Andrei, and I work for Boudreaux, focusing on automating dark separations and accounting. Ruby's ecosystem is very closely tied to Rails, which is excellent. However, we also see applications outside of web development, for example in DevOps with Chef, Puppet, and others. Today, I’ll highlight some examples of machine learning applications and data explorations using the Ruby stack.
00:31:53.240 If you want to become a data scientist, it depends! In 1992, you might have wanted to learn Ruby. As the years progressed, with the introduction of new libraries like Pandas for Python in 2008, people began flocking to Python as the go-to for data science. In 2019, Ruby started to be on the radar for data science as well. However, we must not forget that we stand on the shoulders of giants, so let’s not overlook the contributions made to other languages. Today, I’ll show how we can use Ruby for data science.
00:32:57.560 First, I often open a Jupyter Notebook, or similar tools if you plan to use Ruby. I have Ruby 2.5 installed on my machine, and we can install dependencies suited for the OS. We use a well-known dataset, ensuring we don’t waste time cleaning. The dataset I use is known for time series data. Once loaded, we can present it in a tabular format before moving toward graphical exploration, and we can integrate charts from sources like Google Charts into our applications.
00:34:06.180 In the next day’s lesson, we learn to make predictions. Taking a dataset like the Iris dataset, for example, we can train a linear support vector machine model using a Ruby library similar to Scikit-learn in Python. Just note that N-dimensional arrays in Ruby operate similarly to arrays in Python. If you want more information, I suggest looking at libraries such as PyCall for transparent connections to Python. Please check out works by the community or look into lists of machine learning resources for Ruby!
00:35:20.400 Our next speaker is Jakub, who will share insights about machine hacking with Ruby on Rails. Hello everyone! Can you hear me? I’m really enjoying this Ruby conference! My name is Jakub, which is the Czech variant of the name Jacob. I work for a food company where we automate food production, particularly muesli and porridge. I’m here to discuss a problem I was solving on the food production line used at our company, which is based in Prague.
00:36:17.820 Our company produces mixes, fitness bars, and more, and our best feature allows customers to create their own mixtures. Most of our applications are internal, optimizing product assembly and shipping to customers. My role involves ensuring the monitor display on the machine is functioning correctly. In 2014, we acquired the first part of our machine and started thinking about optimization because the machinery was expensive and not very smart. Initially, I considered various options for programming; however, I decided to rewrite the project as a web app using Ruby on Rails.
00:37:39.570 I believed that since I knew Ruby, I could build a more reliable web app to control the machines. Web apps maintain continuous functionality without being hindered by local machine updates. It’s easier to develop and maintain since issues that took me significant amounts of time in Python and Qt took only a fraction of that time in Ruby. Our structure features several web applications with grouped orders for production control. This is the actual application we use for controlling our machines. I regret that I don’t have time to share more code, but you’ll see it is clean and operates smoothly. We've been using this Ruby application successfully in production for three years.
00:43:09.140 Now, let’s welcome our final speaker discussing what's wrong with Ruby. What’s wrong? A lot of things! First off, a shout-out to Monika, who is the first emcee to say my name without irony! I know you're all hungry and want lunch, but I want to kick off my research project titled 'What is wrong with Ruby.' Over the past few years, there’s been a Ruby-bashing hype in many conferences, with talks discussing the notion that Ruby is dead.
00:43:40.000 Is this hype factual? Often, there are real problems at the start leading to this hype. I believe seniors, who get paid as such, owe it to beginners to explain why we criticize Ruby. It's crucial we provide solid reasoning and not just anecdotal evidence to substantiate our claims! I started qualitative interviews with individuals who maintain larger Ruby projects, aiming to gather insights on their views on Ruby.
00:44:50.000 I’ve noticed several focal points in the recurring criticisms. One is the utilization of multiple CPU cores; concurrency and parallelism in Ruby aren't straightforward. Another point I see people criticize is how hard it is to contribute. While the language has potential, the existing implementation might not be welcoming to contributions. Please remember, Ruby is more than just the MRI implementation; it also encompasses others like JRuby or TruffleRuby.
00:46:27.000 Additionally, I worry about dangling associations between Ruby and Rails; people often perceive Ruby solely as a web language. Is Rails the only framework available? Is it the best? Let’s discuss these distinctions. Lastly, I invite anyone who is interested in joining this conversation or contributing to this project to reach out to me.
00:47:23.000 Thank you all and enjoy your lunch!