Description
Exploring Big Data with rubygems.org Download Data by Aja Hammerly

Many people strive to be armchair data scientists. Google BigQuery provides an easy way for anyone with basic SQL knowledge to dig into large data sets and just explore. Using the rubygems.org download data we'll see how the Ruby and SQL you already know can help you parse, upload, and analyze multiple gigabytes of data quickly and easily without any previous Big Data experience.

Help us caption & translate this video! http://amara.org/v/PsL8/
Summary
In this talk, Aja Hammerly explores big data using the download data from rubygems.org. The session, which took place at GoRuCo 2016, shows how novice data scientists can use Google BigQuery to analyze large datasets without prior big data experience. Aja notes that many Ruby developers avoid big data because its perceived complexity, particularly the statistics and machine learning, is intimidating.

The talk is structured around the following key points:

- **Introduction to Big Data**: Aja shares her background and explains the significance of data in technology, noting how storage costs have decreased dramatically over the years.
- **RubyGems Download Data**: The core of her demonstration uses RubyGems data, focusing on tables such as 'RubyGems', 'Downloads', 'Dependencies', 'LinkSets', and 'Versions' to uncover insights into gem usage.
- **Formulating Questions**: Aja discusses the importance of posing relevant questions in order to analyze the dataset effectively. She suggests hypotheses about gem popularity and dependencies.
- **Utilizing BigQuery**: Aja introduces BigQuery as a powerful tool for handling large datasets, explaining its efficient SQL querying capabilities and how it can manage unstructured data.
- **Live Demonstration**: She runs a live query to showcase BigQuery's speed, using the Hacker News dataset to find the average scores of stories related to specific keywords.
- **Data Importing Methods**: Aja explains two primary methods for importing data into BigQuery: streaming for real-time needs and batch processing for less time-sensitive data. She describes the process of connecting Postgres to BigQuery and emphasizes automating it with Ruby.
- **Conclusion and Encouragement**: Aja closes by encouraging attendees to put data to work in their own applications, highlighting how understanding data patterns can drive insights and inform decision-making in development.

The session offers practical insight into data analysis, demystifying big data for Ruby developers and inspiring them to explore what their data can tell them.
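
The batch import path mentioned in the summary can be illustrated with a small Ruby sketch. The summary does not show the talk's actual code, so everything here is an assumption made for illustration: the rows stand in for the result of a Postgres query, the column names (`rubygem`, `version`, `downloads`, `recorded_at`) are hypothetical rather than the real rubygems.org schema, and the helpers `normalize_row` and `rows_to_ndjson` are invented names. The sketch only covers preparing a newline-delimited JSON file, one of the formats BigQuery accepts for batch loads; the upload itself is left out.

```ruby
require "json"
require "time"

# Hypothetical rows, shaped like what a Postgres query on a downloads
# table might return. Column names are assumptions for illustration only.
SAMPLE_ROWS = [
  { "rubygem" => "rails", "version" => "4.2.6", "downloads" => 1200,
    "recorded_at" => Time.utc(2016, 6, 1) },
  { "rubygem" => "rake", "version" => "11.2.2", "downloads" => 3400,
    "recorded_at" => Time.utc(2016, 6, 1, 12, 30, 0) }
].freeze

# Turn one row into a JSON-safe hash: keys become strings and Time
# values become ISO 8601 strings in UTC.
def normalize_row(row)
  row.each_with_object({}) do |(column, value), out|
    out[column.to_s] = value.is_a?(Time) ? value.utc.iso8601 : value
  end
end

# Serialize rows as newline-delimited JSON: one JSON object per line,
# each line terminated by "\n". An empty input yields an empty string.
def rows_to_ndjson(rows)
  rows.map { |row| JSON.generate(normalize_row(row)) + "\n" }.join
end
```

Writing the result to disk is then a single call such as `File.write("downloads.ndjson", rows_to_ndjson(SAMPLE_ROWS))`, where the file name is again only a placeholder. One object per line keeps the export easy to generate in chunks, which suits a periodic batch job better than row-by-row streaming when the data is not needed in real time.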