00:00:00.000
Hi everybody. I'm Fiona, and I'm a developer at a company called CipherStash. They've sponsored me to speak here, and my talk is directly related to what I work on every day.
00:00:10.559
Ruby was the first language that I learned, and as most boot camp attendees would know, that's generally the case. However, until recently, I hadn't had the opportunity to work with Ruby professionally. It's been a lot of fun coming back to Ruby after all these years. The main language that I have worked with is Elixir.
00:00:36.600
Also, prior to a year ago, I didn't know much about cryptography. How encryption worked was a bit of a mystery to me. I’m not a cryptographer. But today's talk is about sharing what I've learned about something called application-level encryption.
00:01:04.199
The original title didn’t really work as a conference title and didn’t have the same ring to it, so I decided to call my talk the "Encrypted Search Party." More specifically, I'll be discussing an encryption scheme called Order Revealing Encryption, or ORE, throughout this talk.
00:01:15.840
What we'll cover today is a brief introduction to some concepts that will help us understand what encryption is, what we mean by application-level encryption, and why querying encrypted data can be challenging. To help explain how ORE works, I've built a toy ORE library that I'll demo at the end of the talk. This library is available as a gem, so you can experiment with ORE on your own.
00:02:04.680
One of the first things I learned about encrypted data is that it can limit the usability of the data. For example, being able to query or search that data can become quite challenging. Consider this users table with an email and date of birth column containing some personally identifiable information. We might use an SQL statement to select all users where the date of birth is greater than January 1, 1990, and normally we would get records back.
00:04:34.800
However, if we choose to secure that data by encrypting it and then attempt to run the same query again, we may get nothing returned. Why is that? Let's break this down starting with application-level encryption. Encryption is the process of converting human-readable text, which we will refer to as plaintext, into incomprehensible text called ciphertext using a key or keys.
00:05:01.860
This data can be decrypted by someone who has access to those keys. The encryption can yield what's called non-deterministic or deterministic output. Non-deterministic means that given the same plaintext, a different ciphertext will be generated each time, often using something called an IV or initialization vector to introduce randomness.
00:05:25.380
Deterministic means that given the same plaintext, the same ciphertext is generated every time. Both methods have their pros and cons, which we will explore further. Now, let's understand what we mean by application-level encryption.
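To make the difference concrete, here is a minimal Ruby sketch using the standard `openssl` library. The helper names `encrypt_random` and `encrypt_deterministic` are hypothetical, and deriving the IV from an HMAC of the plaintext is just one assumed way to get deterministic output. It is an illustration only, not the scheme used by any particular library.

```ruby
require "openssl"

# Demo key only; a real application would load its key from a secret store.
KEY = OpenSSL::Random.random_bytes(32)

# Non-deterministic: a fresh random IV is generated for every call,
# so the same plaintext encrypts to a different ciphertext each time.
def encrypt_random(plaintext, key)
  cipher = OpenSSL::Cipher.new("aes-256-gcm")
  cipher.encrypt
  cipher.key = key
  iv = cipher.random_iv # sets a random IV on the cipher and returns it
  ciphertext = cipher.update(plaintext) + cipher.final
  iv + ciphertext + cipher.auth_tag
end

# Deterministic (assumed approach): the IV is derived from the key and the
# plaintext, so the same plaintext always encrypts to the same ciphertext.
# Reusing one key for both HMAC and AES keeps the sketch short; it is not
# a recommendation.
def encrypt_deterministic(plaintext, key)
  cipher = OpenSSL::Cipher.new("aes-256-gcm")
  cipher.encrypt
  cipher.key = key
  iv = OpenSSL::HMAC.digest("SHA256", key, plaintext)[0, 12]
  cipher.iv = iv
  ciphertext = cipher.update(plaintext) + cipher.final
  iv + ciphertext + cipher.auth_tag
end

email = "fiona@example.com"
puts encrypt_random(email, KEY) == encrypt_random(email, KEY)               # false, the IVs differ
puts encrypt_deterministic(email, KEY) == encrypt_deterministic(email, KEY) # true
```

The deterministic variant can be matched with an equality check, which is exactly why it also reveals when two records hold the same value.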
00:06:07.380
This basic diagram illustrates the data lifecycle between a client (where our application is) and an RDS instance in a cloud provider like AWS. One way we can encrypt our data is at rest, which protects our data from physical attacks on the underlying storage.
00:06:39.600
But what happens when our data leaves our RDS instance? We can encrypt it using SSL or TLS, known as encryption in transit. However, there are still gaps where an attacker could gain access to our sensitive data.
00:06:48.600
So, what is application-level encryption? It means the client is in control of encrypting and decrypting the data. This ensures that our sensitive data is encrypted as close to the client as possible during all stages of its lifecycle, including at any location where that data is stored.
00:07:24.240
To demonstrate application-level encryption alongside deterministic and non-deterministic encryption, I’ll show you a demo of a Rails app. As part of Rails 7, Active Record now supports application-level encryption, and there are a few basic steps to set this up.
00:07:55.860
The first step is to generate some secret keys. In a production environment, you would store these sensitive keys in your Rails credentials file, but for the purposes of this talk, I'll display them as they are not being used anywhere. The demo app we'll use has a basic users table with an email field, and in our model, we will declare that we want to encrypt our email attribute.
00:08:47.520
As you will see in the recording, when we try to create a user record, it inserts successfully. However, when we attempt to query that record by email, we don't get anything returned. The stored email is now ciphertext, and because a fresh ciphertext is generated every time we encrypt the value in the query, it never matches the stored one, meaning we can’t use a simple where clause.
00:09:33.780
Active Record does have a deterministic mode that we can switch on. In this example, when we create a record, the ciphertext produced is exactly the same when we query the record. This means we can successfully retrieve the record.
00:10:26.760
However, every time we create a user record with the same email, it will produce the same ciphertext. This could be problematic because if an attacker determines the plaintext value of that ciphertext, they know the underlying data for all instances of that ciphertext.
00:10:56.280
This indicates that non-deterministic encryption might be the preferable option, but the challenge is how to make that queryable. This is where order-revealing encryption comes in.
00:11:23.760
Research on ORE was conducted by two researchers at Stanford University, Dr. David Wu and Dr. Kevin Lewi. Although I do not know how to read the paper published in 2016, I've been learning about ORE from individuals who do understand this research and have collaborated with them.
00:12:13.860
In simple terms, ORE allows us to compare two ciphertexts such that we can determine the order of their corresponding plaintexts without revealing the plaintext itself. For instance, if we encrypt a plaintext of 'a,' ORE generates a ciphertext consisting of a left ciphertext and a right ciphertext.
00:12:37.560
By comparing these ciphertexts, we can ascertain whether the plaintext of one is less than, equal to, or greater than the other. In our simplified example, we will encrypt just the letter 'a' and analyze how each plaintext relates to the potential values in a defined domain.
00:13:39.560
We have values from zero to three, which are mapped to the ASCII characters 'a', 'b', 'c', and 'd'. We will then encrypt the letter 'a', comparing its value against the possibilities in our domain. For example, when comparing against zero, one, two, and three, we will store the results of those comparisons.
00:14:27.840
We also maintain an offset, which indicates the position in the domain where the domain value is equal to the plaintext value. This tells us later which of the stored comparisons to look up within our ciphertext.
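As a rough illustration of the comparisons and the offset before any encryption is applied, here is a small Ruby sketch. The names `DOMAIN` and `comparisons_for` are hypothetical, and the direction of the comparison (each domain value compared with the plaintext) is an assumption made for this sketch.

```ruby
# Toy domain from the talk: the values 0..3 mapped to 'a'..'d'.
DOMAIN = { 0 => "a", 1 => "b", 2 => "c", 3 => "d" }.freeze

# For a plaintext letter, compare every domain value with it
# (-1 = less than, 0 = equal, 1 = greater than) and record the
# offset, i.e. the position where the domain value equals the plaintext.
def comparisons_for(letter)
  value = DOMAIN.key(letter)
  raise ArgumentError, "#{letter.inspect} is outside the domain" if value.nil?

  results = DOMAIN.keys.map { |domain_value| domain_value <=> value }
  { comparisons: results, offset: results.index(0) }
end

p comparisons_for("a") # comparisons [0, 1, 1, 1], offset 0
p comparisons_for("c") # comparisons [-1, -1, 0, 1], offset 2
```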
00:15:06.840
Now, when looking at how the plaintext values map to their respective ciphertexts, we can apply the same method to any plaintext we wish to compare. As we compare each letter to our domain values of zero, one, two, and three, the results tell us whether one is less than, equal to, or greater than the other.
00:16:01.260
In ORE, we utilize two keys: a pseudo-random function (PRF) key, referred to as a hash key, and a pseudo-random permutation (PRP) key, referred to as a shuffle key. These mathematical concepts help facilitate our encryption and comparison results in a secure manner.
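A toy sketch of the two building blocks in Ruby, assuming HMAC-SHA256 as the PRF and a keyed sort as a stand-in for the PRP. The function names `prf` and `permutation` are hypothetical, and the keyed sort is only reasonable for a tiny demo domain.

```ruby
require "openssl"

# PRF ("hash key"): a keyed hash. The same key and input always give the
# same output, which looks random to anyone without the key.
def prf(hash_key, input)
  OpenSSL::HMAC.digest("SHA256", hash_key, input.to_s)
end

# PRP ("shuffle key"): a keyed shuffle of the domain. This toy version
# orders the domain values by their keyed hash. The returned array maps
# each shuffled slot to the domain value stored there.
def permutation(shuffle_key, domain_size)
  (0...domain_size).sort_by { |value| OpenSSL::HMAC.digest("SHA256", shuffle_key, value.to_s) }
end

hash_key = OpenSSL::Random.random_bytes(32)
shuffle_key = OpenSSL::Random.random_bytes(32)

slots = permutation(shuffle_key, 4)
puts slots.inspect                  # some ordering of 0, 1, 2, 3 that depends on the key
puts prf(hash_key, 2).unpack1("H*") # 64 hex characters, stable for this key and input
```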
00:16:45.180
When we encrypt comparisons, we generate a key to encrypt each outcome and store an IV for those comparisons to enhance the security of our output. The results will appear jumbled, hiding the comparison outcomes while still allowing a left ciphertext to be compared with the previously encrypted right ciphertext.
00:18:04.680
The left ciphertext will have the initial offset location where the plaintext corresponds to a domain value, while the right ciphertext stores the necessary comparisons. Therefore, both ciphertexts together facilitate querying without exposing sensitive data as plaintext.
00:18:43.920
Let’s discuss what happens in our database as we attempt queries. The client generates a left ciphertext when it performs the query, and this left ciphertext is then sent to our database that already contains the respective right ciphertexts. A function in the database compares the left ciphertext with the right ciphertext and returns the comparison results.
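The whole flow can be sketched in a few lines of Ruby. This is a simplified toy under stated assumptions, not the toy-ore gem's API and not a secure implementation: the class name `ToyORESketch`, its methods, the HMAC-based masking, and the in-memory `rows` array standing in for a database table are all hypothetical.

```ruby
require "openssl"

class ToyORESketch
  Left  = Struct.new(:offset, :key)   # sent with the query
  Right = Struct.new(:nonce, :slots)  # stored in the database

  def initialize(domain_size:)
    @domain_size = domain_size
    @hash_key = OpenSSL::Random.random_bytes(32)
    shuffle_key = OpenSSL::Random.random_bytes(32)
    # Toy PRP: @slot_to_value[i] is the domain value held at shuffled slot i.
    @slot_to_value = (0...domain_size).sort_by { |v| hmac(shuffle_key, "shuffle:#{v}") }
  end

  # Left ciphertext: the shuffled offset of x plus the key for that slot.
  def encrypt_left(x)
    check!(x)
    offset = @slot_to_value.index(x)
    Left.new(offset, hmac(@hash_key, "slot:#{offset}"))
  end

  # Right ciphertext: one masked comparison per slot, under a fresh nonce,
  # so encrypting the same value twice gives different output.
  def encrypt_right(y)
    check!(y)
    nonce = OpenSSL::Random.random_bytes(16)
    slots = @slot_to_value.each_with_index.map do |value, slot|
      slot_key = hmac(@hash_key, "slot:#{slot}")
      ((value <=> y) + self.class.mask(slot_key, nonce)) % 3
    end
    Right.new(nonce, slots)
  end

  # Needs no secret keys, so it could run inside the database.
  # Returns -1, 0 or 1 for left < right, left == right, left > right.
  def self.compare(left, right)
    result = (right.slots[left.offset] - mask(left.key, right.nonce)) % 3
    result == 2 ? -1 : result
  end

  # Toy mask in 0..2 (slightly biased, which is acceptable for a sketch).
  def self.mask(slot_key, nonce)
    OpenSSL::HMAC.digest("SHA256", slot_key, nonce).getbyte(0) % 3
  end

  private

  def hmac(key, data)
    OpenSSL::HMAC.digest("SHA256", key, data)
  end

  def check!(value)
    raise ArgumentError, "#{value} is outside the domain" unless (0...@domain_size).cover?(value)
  end
end

ore = ToyORESketch.new(domain_size: 4)

# "Database" rows hold only right ciphertexts (labels kept for readability).
rows = { "a" => 0, "b" => 1, "c" => 2, "d" => 3 }.map do |label, value|
  { label: label, right: ore.encrypt_right(value) }
end

# The client encrypts the query value 'b' (1) as a left ciphertext.
query = ore.encrypt_left(1)
greater = rows.select { |row| ToyORESketch.compare(query, row[:right]) == -1 }
puts greater.map { |row| row[:label] }.inspect # ["c", "d"]
```

Only `encrypt_left` and `encrypt_right` touch the keys; `compare` works from the two ciphertexts alone, which is the separation the talk describes.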
00:19:25.260
This illustrates the effectiveness of separating the encryption keys from the queries, allowing our database to operate with the encrypted data without ever needing to understand the underlying plaintext. Essentially, it safeguards sensitive data while still enabling functional queries.
00:20:00.360
For the sake of clarity, I'd like to reiterate that while many aspects of this encryption methodology might seem confusing now, you're more than welcome to approach me after the talk for further discussion. To finish, I'd like to showcase a demo using a toy ORE library that I've built. We'll walk through some code and hopefully, you'll recognize some of the concepts I've just mentioned.
00:20:54.540
The method encrypts our data based on the IV, the key, and the comparison result. We'll also review the class representations of the left and right ciphertexts, which house the necessary attributes for conducting comparisons in ORE.
00:21:46.920
This demo will illustrate how we can initialize our ORE scheme with a domain size, allowing us to create ciphertexts for the values zero to three. After creating a ciphertext for each value, we can then demonstrate how comparisons return results according to their order relationships.
00:22:39.840
In summary, these comparisons will reveal how the letters rank against one another, which showcases the usefulness of ORE in applications requiring secured data while still allowing effective data querying.
00:23:26.160
The ORE scheme is available on RubyGems as "toy-ore" and features some documentation along with well-commented code, which I hope will help get you started. I'll share the links to my talks along with my GitHub profiles.
00:23:53.320
Thank you all for listening and please feel free to reach out if you’d like to dive deeper into this topic.