Git: The NoSQL Database

In "Git: The NoSQL Database," Brandon Keepers discusses using Git as a data store rather than only as a home for code. The talk covers the flexibility gained from Git's schema-less structure for storing data, as well as the challenges that come with it.

Key Points:

Introduction to Git as a Database:
- Brandon introduces himself as a developer at GitHub and shares his interest in using Git as an unconventional database.
- The concept is not new, but he has wanted to experiment with it.
Understanding Git's Structure:
- Git is described as a content tracker, which supports the idea of using it as a data storage solution.
- A basic understanding of how to initialize a Git repository and store JSON data is provided to illustrate its functionality as a database.
Functions of Git in Data Storage:
- Git stores data as blobs. Each blob is identified by the SHA-1 checksum of its content, so identical content always maps to the same object.
- The process of updating data and the immutability of blobs are highlighted, distinguishing Git from traditional databases.
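
The blob mechanics above can be sketched in a few lines of Python. This is an illustration, not code from the talk: `blob_id` reproduces how Git derives a blob's ID (SHA-1 over a `blob <size>\0` header followed by the content), while `BlobStore` and the sample JSON records are hypothetical stand-ins for Git's object database.

```python
import hashlib
import json
import zlib


def blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


class BlobStore:
    """Toy in-memory object store (hypothetical; real Git writes files under .git/objects)."""

    def __init__(self):
        self._objects = {}

    def write(self, content: bytes) -> str:
        oid = blob_id(content)
        # An existing object is never overwritten: blobs are immutable.
        self._objects.setdefault(oid, zlib.compress(content))
        return oid

    def read(self, oid: str) -> bytes:
        return zlib.decompress(self._objects[oid])


def encode(record: dict) -> bytes:
    """Serialize a record deterministically so equal records hash equally."""
    return json.dumps(record, sort_keys=True).encode("utf-8")


store = BlobStore()
v1 = store.write(encode({"title": "First issue", "state": "open"}))
# "Updating" the record does not modify v1; it creates a second blob.
v2 = store.write(encode({"title": "First issue", "state": "closed"}))
```

Both versions remain readable by ID after the update, which is the main difference from a traditional database row that is changed in place.
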
Usage of Trees and Commits:
- Brandon explains that trees function much like directories: a tree holds blobs or other trees, providing a structured way to reference multiple data items.
- The role of commits in marking specific states over time is discussed, further solidifying Git's capability to manage changes.
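
As a rough sketch of how trees and commits build on blobs, the Python below serializes both object types in Git's format and hashes them. The file layout (`issues/1.json`), the author line, and the commit messages are assumptions made for illustration; they do not come from the talk.

```python
import hashlib


def object_id(kind: str, body: bytes) -> str:
    """Hash a Git object: SHA-1 over '<kind> <size>' + NUL + body."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries: dict) -> bytes:
    """Serialize {name: (mode, hex_id)} into Git's tree format."""

    def sort_key(item):
        name, (mode, _) = item
        # Git orders subtrees as if their names ended with '/'.
        return (name + "/" if mode == "40000" else name).encode("utf-8")

    body = b""
    for name, (mode, oid) in sorted(entries.items(), key=sort_key):
        body += f"{mode} {name}".encode("utf-8") + b"\0" + bytes.fromhex(oid)
    return body


def commit_body(tree, parents, message,
                who="Example Author <author@example.com> 1349654400 +0000"):
    """Serialize a commit; `who` is a made-up identity and timestamp."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {who}", f"committer {who}", "", message]
    return ("\n".join(lines) + "\n").encode("utf-8")


# One JSON record stored at issues/1.json, then committed.
blob = object_id("blob", b'{"state": "open", "title": "First issue"}\n')
issues_tree = object_id("tree", tree_body({"1.json": ("100644", blob)}))
root_tree = object_id("tree", tree_body({"issues": ("40000", issues_tree)}))
first = object_id("commit", commit_body(root_tree, [], "Open first issue"))

# Changing the record yields a new blob, new trees, and a commit whose
# parent is the previous commit, so the earlier state stays reachable.
blob2 = object_id("blob", b'{"state": "closed", "title": "First issue"}\n')
issues_tree2 = object_id("tree", tree_body({"1.json": ("100644", blob2)}))
root_tree2 = object_id("tree", tree_body({"issues": ("40000", issues_tree2)}))
second = object_id("commit", commit_body(root_tree2, [first], "Close first issue"))
```

Because each tree embeds the IDs of its children, a change to one record ripples up to a new root tree, and the commit ID identifies the entire state of the data at that point in time.
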
Practical Application:
- Brandon built an example application, "Gaskit," which keeps issue tracking data inside the code repository itself, as a practical demonstration of the idea.
Challenges in Scaling:
- The speaker addresses potential issues like concurrent access and load distribution when using Git as a data store.
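
One common way to handle concurrent writers, not necessarily the approach taken in the talk, is optimistic locking on the branch pointer: a writer moves the branch only if it still points at the commit the writer started from. Git's `git update-ref <ref> <newvalue> <oldvalue>` performs a comparable check. The sketch below models that idea in memory; `RefStore`, `append_commit`, and the commit names are hypothetical.

```python
import threading


class RefStore:
    """Toy branch table with compare-and-swap updates (hypothetical helper)."""

    def __init__(self):
        self._refs = {}
        self._lock = threading.Lock()

    def get(self, name):
        with self._lock:
            return self._refs.get(name)

    def compare_and_swap(self, name, expected, new):
        """Point `name` at `new` only if it still points at `expected`."""
        with self._lock:
            if self._refs.get(name) != expected:
                return False
            self._refs[name] = new
            return True


def append_commit(refs, name, make_commit, max_retries=10):
    """Build a commit on the current tip, retrying if another writer won."""
    for _ in range(max_retries):
        parent = refs.get(name)
        new = make_commit(parent)
        if refs.compare_and_swap(name, parent, new):
            return new
    raise RuntimeError("too much contention on " + name)


refs = RefStore()
first = append_commit(refs, "refs/heads/master", lambda parent: "commit-1")

# A writer that still believes the branch is empty is rejected
# instead of silently overwriting commit-1.
stale_write = refs.compare_and_swap("refs/heads/master", None, "commit-2")
```

A rejected writer re-reads the branch, rebuilds (or merges) its change on top of the new tip, and tries again, which is what `append_commit` does in its loop.
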

Conclusion:

Brandon concludes that using Git as a database offers considerable flexibility, chiefly through versioning and branching of data, but that challenges such as concurrent access and scaling should not be overlooked. Further experimentation along these lines could lead to new practices in data management.

The audience leaves with an understanding of the mechanics of Git as a database and the complexities involved in such a paradigm shift.

Brandon Keepers • October 08, 2012 • Earth

We all know that Git is amazing for storing code. It is fast, reliable, flexible, and it keeps our project history nuzzled safely in its object database while we sleep soundly at night.

But what about storing more than code? Why not data? Much flexibility is gained by ditching traditional databases, but at what cost?

In this talk, I will explore the idea of using Git as a data store. I will look at the benefits of using a schema-less data store, the incredible opportunity opened up by having every change to every model versioned, and the crazy things that could be done with branching and merging changes to data.

I will also explore the challenges posed by using and scaling Git as a data store, including concurrent access and distributing load.

Aloha RubyConf 2012