Suggest modification to this talk
Description
In this talk we’ll review the state of data science tools in Ruby. Python is the “crown jewel” of data science languages today, but many of us work mostly with Ruby for business applications, and it is important to use the best tool for each job.
Summary
In this presentation, Rodrigo Urubatan explores the possibilities and challenges of using Ruby for data science. Python is the dominant language in this field, but the talk emphasizes that Ruby can also be a viable option, depending on specific needs. The discussion is structured around several key points:

- **Definition of Data Science**: The term is subjective and varies by company and context. It is commonly understood as extracting insights from data, or as applying statistics and machine learning to it.
- **Ruby's Capabilities**: Ruby has libraries for data science, including ones for integration with Python and R, data visualization, and statistics. Notable libraries include Kiba for data manipulation, PyCall for using Python in Ruby code, and Daru for creating DataFrames.
- **Tools and Libraries**: The talk highlights the diversity of Ruby libraries, such as rb-gsl for statistics and Matplot for data visualization. Even so, performance issues are noted in some libraries, such as NMatrix and NArray.
- **Current State of Ruby in Data Science**: Urubatan describes projects such as SciRuby and Ruby Numo that aim to facilitate data science tasks in Ruby. A newer project, Red Data Tools, is mentioned for its potential to improve interoperability among Ruby data science libraries.
- **Performance Comparisons**: Comparisons with Python libraries reveal a significant performance gap. For instance, summing numbers with Ruby's NMatrix is notably slower than with Python’s NumPy, which could present challenges in production environments.
- **Recommendation**: Ruby can handle data science tasks, but it is best suited for web and business applications. Integrating Ruby with Python for more intensive data tasks could improve performance and efficiency.

In conclusion, it is possible to use Ruby for data science, but developers must weigh the performance issues and consider integrating existing Python tools to make the most of their data science efforts. Urubatan aims to provide resources that help Ruby developers choose the best tools for their data projects, so they can work more efficiently and fix bugs effectively. Resources for further learning, including past talks and library documentation, were shared at the end of the presentation.