Discover Machine Learning in Ruby

by Justin Bowen

Summary of "Discover Machine Learning in Ruby"

In the talk "Discover Machine Learning in Ruby" presented by Justin Bowen at RubyConf 2022, the speaker explores the intersection of Ruby programming and machine learning, a field not commonly associated with Ruby. The presentation emphasizes the capabilities of Ruby to incorporate machine learning and showcases the contributions of Andrew Kane, a notable figure in the Ruby community.

Key Points Discussed:

  • Introduction to Speaker's Background: Justin Bowen discusses his experience programming professionally in Ruby since around 2008 and using Python for computer vision since about 2016.
  • Overview of Computer Vision: He describes his enthusiasm for computer vision and the exciting opportunities it presents for interactive visual feedback in programming.
  • Importance of Andrew Kane: Bowen highlights Andrew Kane as a significant contributor to Ruby’s machine learning community, crediting him with multiple popular Ruby gems such as StrongMigrations, Searchkick, and Chartkick that enhance Ruby’s capabilities.
  • Ruby Gems for Machine Learning: The talk details several notable gems:
    • StrongMigrations: A popular gem, with 23 million downloads, that adds safety checks for running migrations.
    • Searchkick: Enables full-text search backed by Elasticsearch and supports re-indexing in the background without downtime.
    • Chartkick: Simplifies the generation of charts in Ruby.
    • Other Noteworthy Gems: PgHero (PostgreSQL dashboard), Lockbox (field-level encryption), and ThunderSVM (support vector machines), illustrating the breadth of Kane's work, which extends into machine learning.
  • Potential of Ruby in Machine Learning: Bowen argues that, with collective community interest and effort, Ruby can be a viable option for machine learning.
  • Encouragement for Contributions: He encourages developers to explore and contribute to the Ruby ecosystem, emphasizing that participation doesn't require advanced expertise in data science.
  • Community and Collaboration: The importance of a curious and collaborative community in developing advanced machine learning solutions using Ruby.

Conclusion:

Bowen wraps up by reinforcing the message that machine learning is possible in Ruby through community efforts and the utilization of available resources. He encourages attendees to experiment with the Ruby ML stack and consider contributing to this evolving area.

In summary, this talk highlights the exciting possibilities of integrating machine learning into Ruby and acknowledges the contributions of key community figures, calling for greater exploration and collaboration in this field.

00:00:00.000 Ready for takeoff.
00:00:17.300 Thank you for coming! I'm glad everyone's excited to be at RubyConf.
00:00:20.520 If you were looking forward to more Star Trek memes like at my RailsConf talk, unfortunately, you will be disappointed. Though, I do really like the theme through the RubyConf swag this year with the NASA-themed space stuff. So, I guess I missed an opportunity on my part.
00:00:36.600 But today, we will talk about discovering machine learning in Ruby. You could say we're exploring where few Rubyists have gone before; that will probably be my only Star Trek reference—maybe. I'm Justin Bowen, and I am tons of fun! You can find me on Twitter and Instagram as @tonsoffun and on GitHub as justtonsoffun.
00:01:00.120 Currently, I work as a CTO consultant for Silicon Valley Software Group. Additionally, I am a director of engineering at Insight Surgical AI, where we're doing computer vision for surgical objects in operating rooms. It's really exciting stuff. I've been working with Ruby professionally since around 2008 and have been using Python for computer vision since about 2016.
00:01:31.439 For me, computer vision is tons of fun, and I derive a lot of joy from running code with visual inputs and getting visual feedback and outputs that you can chart and highlight. Again, I hope you have tons of fun during this talk. Just a bit of background: I've built a number of applications, such as drone mapping.
00:01:50.460 In 2016, I worked with an Irish ag tech company to highlight different parts of the field using OpenCV to help farmers see problem areas or sections ready for harvest. I've done things ranging from monitoring cow health—something I worked on for a few years at the same company—to real-time computer vision with cameras.
00:02:19.020 In the operating room, we currently have eight cameras monitoring at all times at 60 frames per second. Each camera has its dedicated GPU, and we run our own projects. This is a small demonstration highlighting various surgical instruments such as hemostats, scissors, needles, and sponges using a YOLOv7 model trained with PyTorch and running through OpenCV, all in Python.
00:02:40.620 But since we're at RubyConf, you might wonder why I'm talking about Python. In my RailsConf talk, I discussed using Python for computer vision with Rails applications. All the examples I showed had Rails backends; the computer vision code was done in Python, primarily because most machine learning libraries are built with C++ with Python bindings.
00:03:02.040 There are some Ruby machine learning and computer vision gems, but they aren't as feature-complete. During the Q&A at that virtual conference, I was asked, 'Why not use Ruby?' My response was that, for example, Ruby OpenCV, which I used for a side project on leaf area and counting bud sites in cannabis plants, didn't have the functions I needed. Python is built alongside the C++ implementation of OpenCV, so I opted for Python.
00:03:33.600 This made me curious about what options exist for Ruby gems. Interestingly, I've discovered some hidden gems—literally. This is the 'hidden gems' track; I meant to mention that in the first slide. The discoveries included an individual doing a lot of work with machine learning in Ruby.
00:04:00.399 Now, bear with me; we will return to machine learning and a little bit of computer vision, but first, I want to talk about Andrew Kane. Some of you may have heard of him: the man, the myth, the legend. I've been using gems made by Andrew Kane for about ten years since I first found Searchkick, which is an Elasticsearch gem.
00:04:30.840 Although all indications point to him being a real person who works at Instacart, there are rumors he might not actually exist! Just look at his contribution graph; it's very impressive. Please don't look at mine; I told you my handle; it's been a tough year.
00:05:09.600 Some say he might just be an artificial intelligence, while others speculate it's a group of people working under a pseudonym. At least it seems he took a two-week vacation, so that looks pretty human to me. One theory I heard yesterday is that Andrew Kane might not exist.
00:05:22.680 Seriously though, Andrew Kane is a hidden gem in the community. I have yet to meet him or anyone who knows him. If you do, please introduce us; I’d love to thank him. He has published more than a few gems, totaling 134, which is impressive.
00:05:57.300 Others in the community, like Aaron Patterson, have contributed over 190 gems, and Rafael Franca is right up there too. While many of those are core libraries like Rack and Active Support, Andrew Kane has achieved around 153 million downloads for projects that mostly appear to be his own contributions.
00:06:24.600 His top downloaded project has 23 million downloads. I don't have any gems; I have a few Python packages, but that’s not the significant part. He often tags his gems as 'battle-tested' at Instacart. One of his most popular gems, StrongMigrations, has 23 million downloads. While I've not personally used it, it seems very popular and has great safety features for running migrations. I've heard good things about it; it's definitely worth checking out.
00:06:53.880 As for another gem, Searchkick, I've used this one. You can do full-text search in Postgres, but Searchkick offers many other cool features using Elasticsearch, including the ability to re-index in the background without any downtime. This means users can continue making searches while the re-indexing happens. Its query structure is similar to SQL, making it user-friendly, while it’s adept at handling spaces, misspellings, and more. It's battle-tested at Instacart.
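To make "handling misspellings" concrete, here is a minimal plain-Ruby sketch of edit-distance matching, a common idea behind typo-tolerant search. It is not Searchkick's or Elasticsearch's implementation, and the `fuzzy_search` helper, the `CATALOG` list, and the `max_distance` parameter are hypothetical names chosen for illustration.

```ruby
# Plain-Ruby sketch of typo-tolerant matching via Levenshtein edit distance.
# Illustrative only: this is not how Searchkick or Elasticsearch is implemented.

# Number of single-character insertions, deletions, or substitutions
# needed to turn string a into string b.
def edit_distance(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Hypothetical helper: keep catalog entries within max_distance edits of the query.
def fuzzy_search(query, catalog, max_distance: 1)
  q = query.downcase
  catalog.select { |item| edit_distance(q, item.downcase) <= max_distance }
end

CATALOG = ["apples", "bananas", "zucchini"].freeze

p fuzzy_search("aples", CATALOG) # => ["apples"]
```

A real search engine indexes terms rather than scanning every entry, so treat this only as an illustration of why a query like "aples" can still find "apples".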
00:08:11.400 There are also gems like Chartkick, which provides nice JavaScript charts in a single line of Ruby. It works well with Blazer, allowing you to explore your SQL data, write queries, and generate reports. PgHero is another gem based on a Heroku blog post that gives you a visual dashboard to see common PostgreSQL issues. Dexter is an automatic indexing gem for Postgres that suggests indexes based on your queries.
00:09:30.960 Grouping temporal data can be accomplished with Groupdate, while ActiveMedian aids in median and percentile queries. Ahoy is another gem for first-party analytics, with Ahoy Email for email analytics and Field Test for A/B testing. Lockbox is another awesome gem for field-level encryption, predating Active Record's encryption.
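As a rough illustration of what grouping temporal data and median queries produce, here is a plain-Ruby sketch over in-memory records. Groupdate and ActiveMedian run these operations in the database; the `Order` struct, the sample data, and the `count_by_day` and `median` helpers below are hypothetical stand-ins, not either gem's API.

```ruby
require "date"

# Hypothetical sample data: order timestamps and amounts.
Order = Struct.new(:created_at, :amount)

ORDERS = [
  Order.new(Time.utc(2022, 11, 29, 9, 15), 12.0),
  Order.new(Time.utc(2022, 11, 29, 17, 40), 30.0),
  Order.new(Time.utc(2022, 11, 30, 8, 5), 7.5),
  Order.new(Time.utc(2022, 11, 30, 12, 0), 22.5),
  Order.new(Time.utc(2022, 11, 30, 23, 59), 18.0)
].freeze

# Count records per calendar day, the shape of result a day-level grouping yields.
def count_by_day(orders)
  orders.group_by { |o| o.created_at.to_date }.transform_values(&:size)
end

# Median: the middle value of the sorted list, or the mean of the two middle values.
def median(values)
  return nil if values.empty?

  sorted = values.sort
  mid = sorted.length / 2
  sorted.length.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

count_by_day(ORDERS).each { |day, count| puts "#{day}: #{count}" }
puts "median amount: #{median(ORDERS.map(&:amount))}"
```

Doing this in the database, as the gems do, avoids loading every row into Ruby, which matters once the table is large.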
00:10:00.600 To sum up, there's a considerable breadth of Ruby gem contributions that I think deserve recognition. ThunderSVM is another notable gem for support vector machines (SVMs), and together with the Ruby ML organization created by Andrew Kane, these projects highlight the potential for machine learning in Ruby.
00:10:41.760 Ruby ML consists of several repositories, including IRuby, a Ruby kernel that lets you run Ruby code in Jupyter (formerly IPython) notebooks. Vega is a visualization tool similar to Chartkick but even more powerful. Other gems include Rover and Prophet; Rover is akin to Pandas in Python for data frames, and Prophet is for forecasting in Ruby. Numo (Numo::NArray) serves as Ruby's equivalent to NumPy, crucial for array manipulation.
00:11:51.740 Torch.rb is yet another gem developed by Andrew Kane, bringing PyTorch's underlying library and concepts to Ruby. As you can see, the Ruby community has a lot to offer when it comes to machine learning.
00:12:43.120 Why Ruby, you might wonder? Well, as a community, we can make machine learning possible in Ruby! It all comes down to community interest. This could attract numerous new developers eager to explore machine learning and artificial intelligence. We learn through reading, codifying this knowledge through writing, and sharing our experiences.
00:13:53.999 Contributing to the community can start simply by being curious. You can download the ML stack, click on the repo, and open it in mybinder.org. Play around with it! You might be inspired to create something or ask questions. You do not need outstanding expertise in data science or computer science to contribute. I don’t hold a degree, but I'm grateful for opportunities to unify this work.
00:14:46.799 It's not about the number of gems you publish, nor do you need to get on stage to give a talk. Even small gestures can lead to valuable insights and engagement within our community. The question, 'Why not in Ruby?' comes to mind—I never traditionally considered it. Instead, I typically searched for whatever was easily accessible when clients needed products.
00:15:35.700 However, we can create excellent solutions in Ruby if we make a concerted effort. This brings me to my conclusion; I'm Justin Bowen, the Tons of Fun.
00:16:54.440 Unfortunately, we have three minutes left for questions.
00:17:00.560 Someone asked if I've explored text-to-image generation with Ruby. I haven't done that yet, but I see those services online where you can upload photos and get custom avatars generated. Many successful products leverage such tools, albeit perhaps not in Ruby without making the proper bindings.
00:17:22.060 ONNX exists for Ruby, and with the right runtime, it could potentially be used to create similar functionalities. The cool part about constructing products from available tools is that you don't always have to fully understand how they work; you can focus on having them perform efficiently and effectively.
00:18:53.200 I won't get into that specifically, but if you're considering leveraging models to produce something unique, don’t hesitate to explore. Even generating pet-themed images can be a fun project. As for questions, I'll keep my cat stories for another time!
00:19:13.720 Great, thank you for your inquiry! I really appreciate it. This concludes the talk, and now it's snack time!