Playing Nice with Others - Tools For Mixed Language Environments

Jeremy Hinegardner • August 27, 2009 • Earth

In his presentation titled "Playing Nice with Others - Tools For Mixed Language Environments" at the LoneStarRuby Conf 2009, Jeremy Hinegardner discusses the challenges and tools available for effectively managing communication between multiple programming languages in development projects. He highlights the significant growth of technologies used in mixed-language infrastructures, with Ruby being a central focus alongside Java, C++, and other languages.

Key Points Discussed:

- Importance of Interoperability: Hinegardner emphasizes the need for programming languages to communicate with each other effectively, using examples where programs written in different languages need to exchange information.

- Communication Methods: He explores various tools for data exchange beyond traditional relational databases, mentioning options like SOAP, JSON, and XML.

- Persistence: He categorizes persistence into three types: none, snapshot, and lifetime, explaining how this affects system design.

- Tools for Mixed Language Projects:

  - Tokyo Cabinet: A local library for data storage supporting various data structures.

  - Tokyo Tyrant: Extends Tokyo Cabinet to network server capabilities, including RESTful API and replication options.

  - Redis: A versatile data structure server that supports asynchronous snapshot persistence with various data structures like lists and sets.

  - Libjlog: A library for publish-subscribe messaging, enhancing inter-process communication.

  - Beanstalkd: A job queue system aimed at job processing efficiency.

  - ZeroMQ: A messaging library that allows for custom messaging models.

  - MongoDB and NoSQL: Discussed in the context of modern data handling and flexibility.

- Demos: Hinegardner showcases functional demos of Tokyo Cabinet, Tokyo Tyrant, and Redis, illustrating their practical applications in real-world scenarios.

In conclusion, the presentation highlights the essential tools and strategies for achieving interoperability in mixed programming environments, underscoring the importance of communication protocols, data structures, and persistence strategies. As developers increasingly encounter projects that integrate multiple languages, the insights from Hinegardner’s discussion are invaluable for fostering collaboration and innovation across diverse technology stacks.

LoneStarRuby Conf 2009

00:00:20.960 Hello, everyone. Thank you for coming. I'm going to talk about how to play nice with others. This presentation mainly focuses on tools that you can use in mixed language environments.
00:00:26.880 My name is Jeremy Hinegardner. You can find me on Twitter at @copiousfreetime or at my websites jeremyhinegardner.org or copiousfreetime.org. I work for a company called Collective Intellect.
00:00:39.360 While that part isn’t too important, the fun part is the various technologies we use in the production of the products we create. Last year, I gave a talk about building a Ruby infrastructure. How many of you were here last year? Did anyone attend my talk?
00:00:52.239 Great! Since then, our Ruby infrastructure has grown quite a bit. We have added several new systems to our total infrastructure, including Java services, a couple of C++ libraries, a Groovy application, and around 20 micro Rails apps, along with some Sinatra apps and a wide range of gems.
00:01:10.240 In this multi-language environment, we need technologies that can interact harmoniously with one another. It’s essential for Java applications to use some of the same resources as Ruby applications. Similarly, Groovy applications and C++ libraries need to communicate effectively.
00:01:24.000 So, what tools can we use, aside from the ever-popular relational database, which sometimes may not be the best option for the task?
00:01:30.400 To kick things off, let’s do a little survey. Raise your hand if you have a favorite programming language. Keep those hands up! It’s great to see so many diverse languages represented here.
00:01:47.600 Now, I’ll call out some languages, and if I mention yours, put your hand down: Ruby, Smalltalk, Java, C#, C++, Lisp, Fortran, PHP, JavaScript, Perl. We’ve got a lot of languages in the room today. Ruby is obviously a favorite at a Ruby conference, but it’s fantastic to see others as well.
00:02:22.080 All these languages need to communicate with each other somehow. You might have a program written in Fortran that needs to interact with another program written in Java, Ruby, or Smalltalk. The question is, how do they exchange information?
00:02:34.480 The simplest method might be through plain files, but I started looking into commonalities between these various coding languages. What can we learn about how programming languages can communicate?
00:02:47.920 How many people here have a computer science degree or a background in computer science? What are some of the things you learned in your studies that may not necessarily relate directly to any specific language?
00:03:11.519 For example, big O notation, computational complexity, and in-depth knowledge about data structures. Who has that big white book with the blue sweep on it, authored by Rivest and others? It covers all these essential concepts about data structures.
00:03:45.360 Data structures are crucial because every language has an implementation of various data structures. In Ruby, for example, we have integers, floats, rationals, and even imaginary types, which can all be considered data structures.
00:04:11.760 In addition to data structures, another commonality across programming languages is the method of communication between them.
00:04:27.440 So, let’s do a quick survey. What does everyone currently use to communicate data structures between different applications?
00:04:33.600 Options include SOAP, Sagan, CORBA, JSON, HTTP, delimited files, Marshal, YAML, and others. The interesting challenge is ensuring these tools can effectively communicate across different languages.
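As an illustration of the options listed above (not shown in the talk), here is a minimal sketch that round-trips one nested structure through JSON, YAML, and Marshal using only Ruby's standard library. The `record` contents are made up for the example.

```ruby
require "json"
require "yaml"

# A nested structure built only from types most languages share:
# strings, numbers, arrays, and maps. The contents are illustrative.
record = {
  "name"   => "example",
  "scores" => [1, 2.5, 3],
  "tags"   => { "lang" => "ruby" }
}

# JSON and YAML are text formats that other languages can parse.
json_text = JSON.generate(record)
yaml_text = YAML.dump(record)

from_json = JSON.parse(json_text)
from_yaml = YAML.safe_load(yaml_text)

# Marshal also round-trips the data, but its byte format is specific to Ruby,
# so it only helps when both ends are Ruby processes.
from_marshal = Marshal.load(Marshal.dump(record))
```

All three copies compare equal to `record`, but only the JSON and YAML text could be handed to a Java or C++ process, which is why the choice of format matters in a mixed-language setup.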
00:04:58.560 I break down communications into two realms: network-based communication, which covers the exchange of data between different physical machines, and local communication, which refers to interactions within the same system without relying on a network API.
00:05:14.639 Additionally, we need to consider the aspect of persistence. I define persistence roughly in three ways: none, where there’s no persistence at all; snapshot persistence, which involves taking a snapshot of the data structure and saving it to a medium like a disk; and lifetime persistence, where data remains useful for a certain duration.
00:05:40.800 Let’s see if anyone has been paying attention. Using persistence, communication, and data structures, can you guess what tools I'm describing? The first tool has network communication, no persistence, and utilizes a hash data structure.
00:06:06.240 Can anyone take a guess? Yes, it’s memcached! It has network communication, no persistence, and relies on a hash data structure.
00:06:14.639 The next tool has network communication, lifetime persistence, and operates using a struct data structure.
00:06:28.639 Any ideas? Yes, it’s any database you prefer. We’re categorizing tools using the taxonomy of communication, persistence, and data structure.
00:06:39.680 To be considered a cross-language tool for communicating these types of data structures, it should ideally support at least three languages.
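The taxonomy of communication, persistence, and data structure can be sketched as a small lookup table. This is only an illustration: the `Tool` struct and `find_tools` helper are hypothetical names, and the Tokyo Cabinet and Redis rows use the classifications given later in the talk.

```ruby
# Each tool is described along the three axes used in the talk.
Tool = Struct.new(:name, :communication, :persistence, :data_structures)

tools = [
  Tool.new("memcached",     :network, :none,     [:hash]),
  Tool.new("database",      :network, :lifetime, [:struct]),
  Tool.new("Tokyo Cabinet", :local,   :lifetime, [:array, :hash, :struct]),
  Tool.new("Redis",         :network, :snapshot, [:list, :hash, :set, :key_value])
]

# Return every tool that matches all three criteria.
def find_tools(tools, communication:, persistence:, data_structure:)
  tools.select do |tool|
    tool.communication == communication &&
      tool.persistence == persistence &&
      tool.data_structures.include?(data_structure)
  end
end

first_guess  = find_tools(tools, communication: :network, persistence: :none,     data_structure: :hash)
second_guess = find_tools(tools, communication: :network, persistence: :lifetime, data_structure: :struct)
```

With this table, the two quiz questions resolve to memcached and the database respectively.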
00:06:53.000 Let’s have a quick show of hands. How many here are working on a project that involves more than one programming language? Great! More than two? Fantastic. And even more? That’s awesome!
00:07:44.400 Through my experience, the average number of languages in a project tends to be around three. Even in a standard Ruby on Rails project, you’re likely to encounter Ruby, JavaScript, and SQL for database interactions.
00:08:02.640 Now, I would like to talk about a few different tools that I enjoy working with and how they fit into this mixed-language context.
00:08:20.800 First off is Tokyo Cabinet. Who's familiar with these products? Not many? Interesting! Over the past year, I've noticed a surge of simple tools that offer versatile features across many different languages.
00:08:28.800 Tokyo Cabinet and its other products have been prominent in this context. In fact, I think there were a few talks on Tokyo products at RubyKaigi and the recent conference in Toronto.
00:08:45.760 I'm currently using Tokyo Cabinet in production for Tyrant. In terms of data structures, Tokyo Cabinet supports arrays, hashes, and structs. It has several file formats: hash, B+ tree, table, and a fixed-length array.
00:09:01.440 In terms of communication, it is local, meaning it acts as a straight library. For persistence, it offers lifetime storage, allowing access by another process.
00:09:13.440 Tokyo Cabinet ships with bindings for a variety of languages such as C, Perl, Ruby, Java, Lua, and Python.
00:09:23.840 Next, we've got Tokyo Tyrant, which converts any Tokyo Cabinet database into a network server. It supports the same data structures as Tokyo Cabinet.
00:09:30.560 Tokyo Tyrant offers several cool features, including compression for key-value stores, with automatic compression and decompression using zlib.
00:09:48.960 Another valuable feature is that it fully understands the memcached protocol. This allows you to persist your memcached data to disk easily.
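To give a feel for what speaking the memcached protocol means, here is a minimal sketch that builds the plain-text `set` and `get` commands of that protocol as strings. The helper names are hypothetical, and nothing is sent over the network here.

```ruby
# Build the text-protocol commands a memcached-compatible server expects.
# Storage command format: "set <key> <flags> <exptime> <bytes>\r\n<data>\r\n"
def memcached_set(key, value, flags: 0, exptime: 0)
  data = value.to_s
  "set #{key} #{flags} #{exptime} #{data.bytesize}\r\n#{data}\r\n"
end

# Retrieval command format: "get <key>\r\n"
def memcached_get(key)
  "get #{key}\r\n"
end

set_command = memcached_set("greeting", "hello")
get_command = memcached_get("greeting")
```

Because the commands are just text over a socket, any language with a memcached client (or a TCP socket) could talk to such a server, assuming the server accepts these commands as described in the talk.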
00:10:05.920 Additionally, Tokyo Tyrant has a full RESTful API that makes it simple to interact with, allowing for GET and PUT requests to store and retrieve values.
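A rough sketch of the GET and PUT interaction using Ruby's `net/http`. It assumes the key is the URL path and the value is the request body. The helper names, the `localhost` host, and port 1978 are assumptions for a local test setup, not details from the talk.

```ruby
require "net/http"
require "erb"

# Build (but do not send) the HTTP request that stores a value.
# Assumption: the key becomes the URL path and the value the request body.
def tyrant_put_request(key, value)
  request = Net::HTTP::Put.new("/" + ERB::Util.url_encode(key))
  request.body = value.to_s
  request
end

# Build the HTTP request that fetches the value stored under a key.
def tyrant_get_request(key)
  Net::HTTP::Get.new("/" + ERB::Util.url_encode(key))
end

# Send a request to a running server. Host and port are assumed defaults.
def send_to_tyrant(request, host: "localhost", port: 1978)
  Net::HTTP.start(host, port) { |http| http.request(request) }
end

put_request = tyrant_put_request("first name", "Jeremy")
get_request = tyrant_get_request("first name")
```

With a server running, `send_to_tyrant(put_request)` followed by `send_to_tyrant(get_request).body` would be expected to return the stored value. Since this is plain HTTP, a client in any language can do the same.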
00:10:17.920 Moreover, it has Lua extensions for executing various functions as required and offers replication options such as master-master and master-slave setups.
00:10:32.960 Now, let’s move on to Redis. Has anyone heard of Redis? More familiar faces! Great!
00:10:39.680 Redis is a data structure server that can hold different types of data structures. It provides lists, hashes, sets, and standard key-value pairs.
00:10:56.320 The outstanding feature of Redis is its ability to process list and set data structures. It operates on a network level with its own protocol and supports asynchronous snapshot persistence.
00:11:16.640 With Redis, data is saved in the background, but if your server dies unexpectedly, you may lose data between the last save and the time of the crash.
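The data-loss window of snapshot persistence can be shown with a toy in-memory store. This is not how Redis is implemented; the `SnapshotStore` class is hypothetical, and it saves synchronously to a JSON file, whereas Redis saves in the background.

```ruby
require "json"
require "tmpdir"

# An in-memory key-value store that only reaches disk when snapshot is called.
class SnapshotStore
  def initialize(path)
    @path = path
    # Reload the last snapshot if one exists; otherwise start empty.
    @data = File.exist?(path) ? JSON.parse(File.read(path)) : {}
  end

  def set(key, value)
    @data[key] = value
  end

  def get(key)
    @data[key]
  end

  # Write the whole data set to disk in one go.
  def snapshot
    File.write(@path, JSON.generate(@data))
  end
end

path = File.join(Dir.mktmpdir, "store.json")

store = SnapshotStore.new(path)
store.set("a", 1)
store.snapshot      # "a" is now on disk
store.set("b", 2)   # "b" exists only in memory

# Simulate a crash: discard the in-memory state and restart from the file.
restarted = SnapshotStore.new(path)
```

After the simulated crash, `restarted.get("a")` still returns the saved value, while `"b"` is gone because it was written after the last snapshot.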
00:11:31.680 Redis supports various languages including Ruby, Python, PHP, Erlang, Tcl, Perl, Lua, and Java, making it quite versatile.
00:11:47.920 Furthermore, Redis provides replication capabilities, allowing for both master-master and master-slave setups, which facilitate data streaming effectively.
00:12:11.840 The fun part is the in-server set operations. For example, you can easily find intersections between sets stored within Redis.
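As an in-process stand-in for a server-side intersection, the following sketch uses Ruby's `Set`. In Redis itself, members would be added with `SADD` and intersected with `SINTER`; the names below are illustrative and not taken from the census data used in the demo.

```ruby
require "set"

# Illustrative data only.
male_names   = Set.new(%w[James John Jordan Taylor])
female_names = Set.new(%w[Mary Jordan Taylor Linda])

# Names that appear in both sets, the same result a server-side
# set intersection would return.
shared = male_names & female_names
```

The difference with Redis is where the work happens: the server computes the intersection and returns only the result, so the client never has to download both sets.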
00:12:27.680 Next, let's touch on Libjlog. How many of you have heard of it? It’s an excellent tool that acts as a library for publish-subscribe messaging between processes.
00:12:41.920 While I haven’t yet used it in production, I find its concept of enabling communication between two processes to be very intriguing.
00:12:57.920 Currently, it has support for C, Perl, and PHP, and I'm working on integrating it into Ruby as well.
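The talk does not go into how Libjlog works internally, so the following is only a generic sketch of publish-subscribe over an append-only log, with one read position per subscriber. The `MessageLog` class and its methods are hypothetical and are not Libjlog's API.

```ruby
# A toy append-only message log with one read position per subscriber.
class MessageLog
  def initialize
    @messages = []
    @checkpoints = {} # subscriber name => index of the next unread message
  end

  # A new subscriber starts at the current end of the log.
  def subscribe(name)
    @checkpoints[name] = @messages.length
  end

  def publish(message)
    @messages << message
  end

  # Return everything the subscriber has not seen yet and advance its position.
  def read(name)
    start = @checkpoints.fetch(name)
    unread = @messages.drop(start)
    @checkpoints[name] = @messages.length
    unread
  end
end

log = MessageLog.new
log.subscribe("a")
log.publish("m1")
log.subscribe("b")
log.publish("m2")

read_by_a = log.read("a")
read_by_b = log.read("b")
```

Each subscriber consumes at its own pace: `"a"` sees both messages, while `"b"`, which subscribed later, sees only the second.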
00:13:17.920 After that, we have Beanstalkd. Have you heard of it? Many of you, excellent! This is another library that I really appreciate.
00:13:29.040 Beanstalkd uses a simple job queue structure and does not offer persistence yet; however, persistence is planned for the next minor version.
00:13:45.760 With Beanstalkd, you can push jobs onto a queue and process them with multiple workers. It’s straightforward: once a worker reserves a job, that worker is the only one that can work on it.
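The reserve behaviour of a job queue with several workers can be modelled in-process with Ruby's thread-safe `Queue`. This is a stand-in for the idea, not the Beanstalkd client API; the job names and worker count are arbitrary.

```ruby
jobs = Queue.new
10.times { |i| jobs << "job-#{i}" }
jobs.close # no more jobs; pop returns nil once the queue is drained

handled = Queue.new # thread-safe record of [worker_id, job] pairs

workers = 3.times.map do |worker_id|
  Thread.new do
    # pop hands each job to exactly one worker, much like reserving it.
    while (job = jobs.pop)
      handled << [worker_id, job]
    end
  end
end
workers.each(&:join)

results = []
results << handled.pop until handled.empty?
processed_jobs = results.map(&:last)
```

Every job ends up in `processed_jobs` exactly once, no matter which worker picked it up. With Beanstalkd, the queue lives in a separate server process, so the producers and workers can be written in different languages.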
00:14:05.920 Lastly, let's touch upon ZeroMQ, which has the potential for being a plumbing component for any message system you wish to develop.
00:14:20.000 ZeroMQ provides high throughput messaging, and its latest version supports lifetime persistence based on the usefulness of messages.
00:14:36.159 It’s a very flexible messaging library that allows you to implement your own messaging models as you see fit.
00:14:52.080 Now, let’s move on to MongoDB, which is a network database that has been gaining traction in the field.
00:15:06.160 There’s been a historic shift toward NoSQL technologies, allowing for various flexible data structures.
00:15:29.680 Additionally, there are newer technologies like Flare, which adds sharding and scales automatically.
00:15:46.080 Other interesting technologies include Cassandra and CouchDB—both of which have made significant impacts as well. Their ability to handle large-scale data processing has seen immense growth in the last couple of years.
00:16:00.480 And we cannot forget about Solr. Solr represents a powerful search platform that effortlessly integrates with various programming languages.
00:16:16.240 Please share any of your favorite tools that I haven't mentioned. I'm starting to compile a growing list of technologies, and I'd love your input.
00:16:38.720 Let's transition into some demos. I’ll start with Tokyo Cabinet, using sample data from the US Census for a variety of names.
00:17:01.520 The dataset includes first names and last names, which I’ll read in to illustrate the functionalities.
00:17:18.080 Tokyo Cabinet's table file format is often overlooked: it is a simple key-value store with hashes as the values.
00:17:33.680 I will demonstrate how creating an index on the name field allows for super-fast lookups to return names in a matter of seconds.
00:17:52.880 Overall, we managed to insert a multitude of names and observed how quickly the process can be done.
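The table format and the effect of an index can be modelled with plain Ruby hashes. This is a sketch of the idea only, not the Tokyo Cabinet API; the records, column names, and helper names are made up for illustration.

```ruby
# Primary key => record, where each record is itself a hash of columns.
table = {
  "1" => { "name" => "Smith",   "kind" => "last"  },
  "2" => { "name" => "Mary",    "kind" => "first" },
  "3" => { "name" => "Johnson", "kind" => "last"  },
  "4" => { "name" => "Smith",   "kind" => "first" }
}

# Without an index, a lookup by name scans every record.
def scan_by_name(table, name)
  table.select { |_key, record| record["name"] == name }.keys
end

# An index maps each name to the primary keys that hold it,
# so a lookup becomes a single hash access.
def build_name_index(table)
  index = Hash.new { |hash, name| hash[name] = [] }
  table.each { |key, record| index[record["name"]] << key }
  index
end

name_index = build_name_index(table)
keys_via_index = name_index["Smith"]
keys_via_scan  = scan_by_name(table, "Smith")
```

Both lookups return the same keys, but the scan touches every record while the index goes straight to the matches, which is the trade-off the indexed demo illustrates.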
00:18:16.720 Next, let’s execute the same demo using Tokyo Tyrant, which operates as a network server instead.
00:18:29.760 So, let’s do that now.
00:18:37.600 In the case of Tokyo Tyrant, there’s a slight difference in speed, but it’s a very efficient option overall.
00:18:51.680 We will also explore using Lua inside Tokyo Tyrant for some advanced functionality.
00:19:06.720 Next up is Redis. I’ll be running through a quick demo that showcases storing male and female first names with Redis to find sets and intersections.
00:19:22.080 This should also demonstrate how Redis moves operations such as set intersections onto the server.
00:19:34.240 Lastly, I'm excited to conclude with Beanstalkd, which makes job processing in queues exceedingly simple.
00:19:57.360 I think we have time left for one more, or would anyone prefer to wrap up? Thank you all for your attention and participation.
00:20:19.360 Are there any final questions? It was great discussing this with you all!
00:20:34.160 Thank you!