Euruko 2017

Introducing Tensorflow Ruby API

by Arafat Khan

The video titled "Introducing Tensorflow Ruby API" features Arafat Khan, an undergraduate from the Indian Institute of Technology, Kharagpur, discussing the TensorFlow Ruby API at Euruko 2017. The talk covers the functionality, development timeline, and significance of the TensorFlow library, particularly focusing on its recent Ruby API release.

Key points include:

  • Introduction to TensorFlow: TensorFlow is a widely used machine learning library that employs data flow graphs for numerical computation, developed by the Google Brain team and open-sourced in 2015.
  • TensorFlow’s Popularity: It has achieved immense popularity, ranking first in machine learning libraries on GitHub.
  • Audience Engagement: Arafat encourages participation from students and newcomers to machine learning, affirming that TensorFlow is accessible to all, regardless of expertise level.
  • Development Timeline: The video outlines significant milestones, including version releases and the addition of language bindings.
  • Ruby API Launch: Arafat explains the background leading to the Ruby API, emphasizing discussions that shaped its development and its official acceptance on June 16, 2017.
  • Key Dependencies: Important dependencies for using TensorFlow effectively are highlighted, such as SWIG (for integrating C/C++ with high-level languages) and Google Protocol Buffers (for data serialization).
  • Core TensorFlow Classes: Arafat introduces essential classes such as Tensor, OpSpec, Graph, and Session, which facilitate building and executing machine learning models.
  • Practical Machine Learning: He illustrates the practical application of machine learning through an example of image recognition using the Inception v3 model, demonstrating the power of TensorFlow through real-world scenarios.
  • Visualization with TensorBoard: The importance of TensorBoard for visualizing data and tracking model performance is discussed.
  • Community Support: Arafat expresses gratitude to the Ruby community and highlights the significance of open-source collaboration, encouraging developers to seek help from the community.

In conclusion, Arafat emphasizes the importance of the Ruby API for expanding TensorFlow's accessibility and the collaborative spirit of the open-source community, urging all developers to explore and contribute to this evolving landscape.

00:00:07.930 Okay, so hi there! I am Arafat Khan, an undergraduate at the Indian Institute of Technology, Kharagpur. I just recently came here from India, and today I'm going to be talking about TensorFlow Ruby API. Before I start this talk, I want to take a moment to express my gratitude to all of you. Although I've only been in Hungary for a very short time, I feel that you are all very welcoming and sweet. Thank you so much!
00:00:37.899 I have always loved solving interesting problems, one of which is this project. I started this project in the summer of 2016, with funding from Somatic IO. After that, I was awarded the prestigious Ruby grant from the Ruby Association. In the summer of 2017, I worked at the Medical Image Analysis Lab at Simon Fraser University, Canada, focusing on machine learning in the field of medical imaging. We were solving many fundamental problems using TensorFlow, and the medical imaging community is really interested in addressing machine learning problems in areas like retinal imaging and diseases such as diabetic retinopathy. So, for me, it's exciting to see real-world examples in action.
00:01:41.890 Now, the question is: what is TensorFlow? Show of hands—how many people here use TensorFlow in their work or projects? Even if it's just for fun or hobby projects, if you've seen the tutorial, please raise your hands. Okay, that's great! I expected there would be more people, but that's okay.
00:02:10.800 TensorFlow is the world's largest machine learning library for numerical computation using data flow graphs. In such a graph, the nodes represent mathematical operations (softmax, matrix multiplication, and so on), while the graph edges represent tensors, which are basically multi-dimensional arrays communicated among the nodes. The TensorFlow project was open-sourced by Google around twenty-one months ago, and it has become so popular that it now has roughly seventy thousand stars on GitHub, ranking it number one in the machine learning category on the platform. Watching its growth from zero has been an amazing journey.
00:02:54.420 TensorFlow was initially developed by researchers and engineers at the Google Brain team and the Google Machine Intelligence Research organization primarily for neural networks and deep neural networks research. At first, it was not intended to be open-sourced for everyone's use; rather, it was just for researchers. However, they soon realized that many people would find it useful for various applications, which led them to open-source it in 2015.
00:03:24.930 Before delving deeper into TensorFlow, I just want to ask: how many students are here? That is, college-going students who are new to open source and machine learning? It's okay if you're a newbie; everyone starts somewhere. I've come from a university known for its strong open-source culture. Indian Institute of Technology, Kharagpur, ranks third in the number of selected students for Google's Summer of Code program, and we have a solid culture for GSoC, Rails Girls Summer of Code, and other outreach programs. If you'd like to talk about open source, grants, or student programs, feel free to approach me.
00:04:13.430 Now, let's go back to TensorFlow. It's not just for experienced developers or researchers; it's literally for everyone. If you're a student or a beginner in Python, you can go through TensorFlow and understand most of it, even if you don't grasp everything completely. It's open source, flexible, and production-ready, making it easy to transition from research to production.
00:04:55.310 Let's take a brief look at the timeline of TensorFlow development. The initial release was in November 2015, and afterwards they rolled out several features such as Python 3 support, TensorFlow Serving (a flexible, high-performance serving system for machine learning models), and compatibility with mobile platforms, along with specific features such as GPU acceleration. Another significant milestone was TensorFlow version 1.0, released in February of this year and accompanied by a large conference, the TensorFlow Developer Summit. There are remarkable videos available from the core developers on the Google Brain team, which you should check out even if you're not well-versed in machine learning.
00:05:49.410 The TensorFlow developers have emphasized the importance of supporting multiple platforms. Mobile platforms are very relevant, and TensorFlow runs efficiently not just on CPUs and GPUs but also on custom accelerators like TPUs. Companies like Google and many startups worldwide are using it across platforms such as Android, iOS, and Raspberry Pi. For language bindings, TensorFlow supports Go, Python, C++, Java, and of course Ruby.
00:06:31.590 With the release of TensorFlow 1.0 came the corresponding Ruby API. All of the language bindings for TensorFlow work in fundamentally the same way: they connect to the TensorFlow C API and make function calls through it. Following the C API's release, many developers worldwide, including myself, began building language bindings and had extensive discussions that contributed to this effort.
00:07:16.240 Here is the issues page for the Go API. As you can see, there was a proposal for how the Go API would work, and I'm very thankful to everyone involved in that discussion. While Go is quite different from Ruby, the ideas generated there were very helpful for Ruby and other language bindings. The discussions for the Java API and the Ruby API were equally important. The Ruby API was officially accepted on June 16, 2017, marking a significant milestone.
00:08:49.280 Before I go into more depth about the TensorFlow API and present examples, I want to highlight a few key dependencies of TensorFlow. Many of you here are working on interesting problems that aren't limited to just machine learning; utilizing these dependencies can enhance your projects significantly.
00:09:10.220 The first dependency is SWIG (Simplified Wrapper and Interface Generator). SWIG connects programs written in C and C++ with high-level programming languages like JavaScript, PHP, Ruby, and so on. It lets you access STL (Standard Template Library) containers from other languages easily. For example, suppose you have a C++ function that expects a vector of floats as input. With SWIG, instead of writing a typemap for the vector of floats by hand, you can use SWIG's bundled STL vector support, which streamlines the integration.
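As a rough illustration of what SWIG's STL support buys you on the Ruby side, here is a minimal sketch. It assumes a hypothetical C++ function float mean(const std::vector<float>&) wrapped in a module named Stats via an interface file that includes std_vector.i and instantiates %template(FloatVector) std::vector<float>; the exact proxy-class behavior depends on the SWIG version.

```ruby
# Hypothetical SWIG-wrapped extension "stats" exposing
#   float mean(const std::vector<float>& values);
# The interface file includes "std_vector.i" and instantiates
#   %template(FloatVector) std::vector<float>;
require 'stats'

# SWIG generates an Array-like Ruby proxy class for the instantiated template.
values = Stats::FloatVector.new
[1.5, 2.5, 4.0].each { |x| values << x }
puts Stats.mean(values)

# The std_vector typemaps also accept a plain Ruby Array for
# const-reference and by-value vector parameters.
puts Stats.mean([1.5, 2.5, 4.0])
```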
00:10:17.680 Another important dependency is Google Protocol Buffers. They provide a language-neutral, platform-neutral, extensible mechanism for serializing structured data. It is based on an interface description language that describes your data structure, from which you can generate source code in various languages, including Ruby, C++, and Python, to load, access, and manipulate the data efficiently. Compared to XML, protocol buffers are not only simpler to understand but also considerably smaller when serialized, which makes storage and transmission more efficient.
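For a sense of what this looks like from Ruby, here is a small sketch using the google-protobuf gem with a hypothetical TrainingExample message. The inline DescriptorPool DSL shown was the pattern emitted by protoc's Ruby generator around this time; newer gem versions expect messages generated from a .proto file instead.

```ruby
require 'google/protobuf'

# Hypothetical message, roughly equivalent to:
#   message TrainingExample { string label = 1; repeated float pixels = 2; }
Google::Protobuf::DescriptorPool.generated_pool.build do
  add_message "demo.TrainingExample" do
    optional :label,  :string, 1
    repeated :pixels, :float,  2
  end
end
TrainingExample =
  Google::Protobuf::DescriptorPool.generated_pool.lookup("demo.TrainingExample").msgclass

example = TrainingExample.new(label: "seven", pixels: [0.0, 0.5, 1.0])
bytes   = TrainingExample.encode(example)   # compact binary serialization
decoded = TrainingExample.decode(bytes)
puts decoded.label                          # => "seven"
```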
00:11:40.960 Now, let’s introduce some key TensorFlow classes: Tensor, OpSpec, Graph, and Session. Tensors can be thought of as N-dimensional matrices. While conventional matrices are represented as rows and columns, tensors extend this concept across multiple dimensions. The OpSpec class defines the specification of an operation, where each operation expects input in the form of a placeholder. A Graph holds the set of operations and the tensors that flow between them. A Session is responsible for executing TensorFlow operations; think of it as the environment in which the entire computation graph runs.
00:12:30.389 For a simple example, we'll define a graph and two tensors. Here’s a straightforward representation of adding these two tensors through defined placeholders and operations. You can create a hash that maps the placeholders to their corresponding tensors and run the session for result evaluation. Through these operations, we can utilize the flexibility of TensorFlow, allowing us to explore and manipulate data efficiently.
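A sketch of the kind of example described here, in the spirit of the tensorflow.rb README: the class and method names below are illustrative approximations rather than the gem's exact API, so check the project documentation for the precise calls.

```ruby
# Illustrative sketch only; names approximate the tensorflow.rb API.
require 'tensorflow'

graph = Tensorflow::Graph.new

# Two input tensors (multi-dimensional arrays).
tensor_a = Tensorflow::Tensor.new([[1, 2], [3, 4]])
tensor_b = Tensorflow::Tensor.new([[5, 6], [7, 8]])

# Placeholders stand in for the inputs inside the graph.
input_a = graph.placeholder('input_a', :int64)
input_b = graph.placeholder('input_b', :int64)

# An OpSpec describes the operation to add to the graph, here an "Add" node.
add_op = graph.add_operation(Tensorflow::OpSpec.new('sum', 'Add', [input_a, input_b]))

# The Session executes the graph; the hash maps placeholders to actual tensors.
session = Tensorflow::Session.new(graph)
feed    = { input_a => tensor_a, input_b => tensor_b }
result  = session.run(feed, [add_op.output(0)])
puts result.inspect   # element-wise sum of the two matrices
```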
00:13:17.470 The code snippets shown may look complicated, but TensorFlow takes care of the hard parts of the mathematical operations, allowing you to focus on solving intricate problems. For instance, the result of adding two matrices may look trivial, but matrix inversion, determinant calculation, or solving systems of linear equations can become significantly more complex. If you're familiar with NumPy and other scientific libraries for matrix operations, you'll find that Ruby's standard Matrix library serves a similar purpose.
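To make the comparison concrete, here is a small example with Ruby's standard Matrix class, covering the operations just mentioned (addition, determinant, inverse, and solving a linear system):

```ruby
require 'matrix'

a = Matrix[[3.0, 1.0], [1.0, 2.0]]
b = Matrix[[1.0, 0.0], [0.0, 1.0]]

puts (a + b).to_a.inspect    # element-wise matrix addition
puts a.determinant           # => 5.0
puts a.inverse.to_a.inspect  # matrix inverse

# Solving the linear system a * x = v via the inverse (fine at this scale).
v = Vector[9.0, 8.0]
x = a.inverse * v
puts x.to_a.inspect          # => [2.0, 3.0]
```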
00:14:10.700 In addition, I want to return to protocol buffers briefly. When handling the TensorFlow graph described earlier, you can easily save it in protocol buffer format, allowing you to read and analyze the graph's definition across various platforms. This functionality enhances TensorFlow’s versatility, making it possible to gather useful insights into models across multiple devices while maintaining compatibility.
00:14:48.500 It's remarkable that such operations can be executed across various platforms with identical protocol buffer descriptions. For example, defining a model in Python and leveraging it on Raspberry Pi or Android maintains coherence and efficiency, showcasing the real power of TensorFlow in enhancing model usability.
00:15:39.890 The reason language bindings beyond Python haven't exploded in popularity can be attributed to limitations of the underlying C API. Though TensorFlow supports executing predefined graphs and functions across multiple languages, certain advanced functionality remains unavailable outside Python. However, you can still define models in Python, transfer them to Ruby, and execute them there, which emphasizes cross-language compatibility.
00:17:37.030 Let’s take a step back and explore an engaging problem in machine learning: image recognition. Over the past few years, researchers have made remarkable progress, particularly through convolutional neural networks (CNNs) in tackling image recognition issues. The ImageNet dataset is standard; it's associated with an annual competition aimed at improving accuracy. Google contributed their open-source implementation of the Inception v3 model, which, although complex, can perform efficiently even on non-GPU PCs.
00:19:25.130 Now let's see how we can utilize this model in practical scenarios. You can load the serialized representation of this pre-trained model into a TensorFlow graph and run its operations on any given image to classify it. For example, when I ran the model on an image of a cute puppy, it predicted with 85% confidence that it was a golden retriever. Once the model is set up, trying it on new pictures becomes an exciting exercise!
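A sketch of that flow, again with illustrative tensorflow.rb-style calls (the exact method names may differ). The graph-definition file is Google's released Inception v3 protocol buffer, and 'DecodeJpeg/contents' and 'softmax' are the input and output node names used in that released graph:

```ruby
# Illustrative sketch only; names approximate the tensorflow.rb API.
require 'tensorflow'

# Load the serialized, pre-trained Inception v3 graph definition.
graph = Tensorflow::Graph.new
graph.read_file('classify_image_graph_def.pb')

session = Tensorflow::Session.new(graph)

# Feed the raw JPEG bytes to the graph's decode node and fetch the class scores.
image_bytes = File.binread('puppy.jpg')
feed   = { graph.operation('DecodeJpeg/contents').output(0) => Tensorflow::Tensor.new(image_bytes) }
scores = session.run(feed, [graph.operation('softmax').output(0)])

# The highest-scoring class index maps to a label such as "golden retriever".
```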
00:20:34.110 Earlier today, while exploring the beautiful city, I had the idea of running the model on a different image, specifically the Chain Bridge, and it identified it as a suspension bridge with 62% confidence. Just to give you an idea, the Inception v3 model covers around a thousand classes. If you give it an image from one of those classes, it works remarkably well. I encourage everyone to try out the model by taking pictures with their mobile phones and testing them to see the results.
00:21:33.300 Now, let’s talk about TensorBoard! It's an essential tool designed to improve model understanding. TensorBoard visually represents your data, making it easier to track metrics such as model accuracy and compare several models with each other. One of its features allows you to view how certain metrics improve through multiple training iterations, helping you analyze which models perform best.
00:23:04.800 For example, the data flow graph we discussed earlier clarifies the connections between tensors while providing insights into how these connections work together. Additionally, TensorBoard captures quantitative metrics, enabling developers to visualize data and the evolution of their algorithms. All these visual aids serve to make the complexities of machine learning more accessible.
00:24:30.800 Moving on, I want to illustrate how clustering can be understood through visualizations. The MNIST dataset of handwritten digits, for instance, can be explored using dimensionality reduction techniques. By mapping similar handwritten digits into distinct clusters on a 2D chart, you gain a different perspective on classification and can clearly pinpoint misclassifications with the help of color coding.
00:25:20.000 As we reach the end of our discussion on TensorBoard, I’d like to remind you that a simple example will be available on the TensorFlow Ruby webpage, along with detailed tutorials on the official TensorFlow site. Moreover, the TensorFlow Dev Summit featured a complete ten-minute talk dedicated entirely to TensorBoard, accessible through a quick search.
00:26:34.000 I would also like to express my sincere gratitude to the Ruby community for an incredible amount of support during this project. Our GitHub project led to trending discussions on platforms like Reddit, where many developers lent their insights and encouragement. So, if you haven’t used Reddit yet, I highly recommend it—there’s a vibrant developer community that provides helpful resources across various programming topics.
00:28:39.980 Finally, I want to thank everyone for your time and attention. I started this project in the summer of 2016, assisted by Somatic IO and mentorship from Jason Toy and Dr. Soon Him Ko from IBM Research Tokyo. Their ongoing support was crucial when obstacles arose—it’s a testament to how valuable the open-source community is! Remember, don’t hesitate to approach more experienced developers when seeking help because you’ll find most are incredibly welcoming and eager to assist. Thank you once again to the Ruby community and Euruko for this opportunity!