RailsConf 2017

A Deep Dive Into Sessions

by Justin Weiss

In the RailsConf 2017 talk titled "A Deep Dive Into Sessions," speaker Justin Weiss explores the crucial concept of sessions in Ruby on Rails applications. Sessions are essential for maintaining user state and data continuity, allowing developers to store user-specific information across multiple requests. The presentation covers the mechanisms behind session management in Rails and the importance and nuances of cookies in this context.

Key Points Discussed:

  • Understanding HTTP Statelessness: Weiss emphasizes that without sessions, applications cannot recognize returning users or maintain state. Unlike functional programming, where you would pass data explicitly with every request, Rails uses sessions to simplify this process.
  • What Are Sessions?: Sessions store user data, such as preferences or identifiers, allowing applications to remember user interactions without unnecessary complexity.
  • Mechanisms of Sessions: Weiss elaborates on how Rails leverages cookies for session management. When a user interacts with a Rails app, a cookie stores session data on the client side, which the server retrieves on subsequent requests.
  • Cookies Basics: Cookies contain both data (what the server wants to remember) and metadata (instructions for the browser on when to send the cookie). Weiss explains the difference between session cookies (which expire when the browser closes) and permanent cookies.
  • Security Considerations: Weiss discusses potential vulnerabilities, such as cookie theft over insecure networks and the importance of using HTTPS to protect cookie data. He also highlights Rails' built-in encryption and signing features that enhance cookie security.
  • Session Storage Strategies: Different strategies for storing session data include:
    • Cookie Store: Simple and requires no backend setup but is limited in size.
    • Cache Store: Utilizes existing caching mechanisms but may face expiration issues.
    • Database Store: More permanent but requires management of outdated session data.
  • Best Practices: Weiss provides strategies for effective session use, such as programming defensively to handle missing session data, avoiding storing large or complex objects, and keeping data clear and concise.

Conclusion:

Weiss concludes that despite the complexities surrounding sessions, they are fundamentally straightforward constructs built on simple key-value pairs and metadata. Understanding session management helps developers mitigate frustrations and build robust applications. He encourages developers to embrace learning from their challenges with sessions, emphasizing that knowledge gained during debugging can significantly enhance their programming skills.

The talk also includes an open invitation for further discussion, highlighting Weiss's willingness to connect with fellow developers.

00:00:11.660 All right, we could probably get started. Hey everybody, I hope your day is going well so far. Thank you for coming by. I'm Justin Weiss.
00:00:19.260 I work at Avvo, where we help people find the legal help they need. I help our software developers get better code into production more quickly.
00:00:25.019 I also write articles and guides to help others become better Rails developers on my site, justinweiss.com. Additionally, I wrote a book titled "Practicing Rails" which can help you learn Rails without feeling overwhelmed.
00:00:36.230 Now, I want you to imagine if your experience on the web was like this: What if your site couldn't recognize the same person visiting at different times?
00:00:43.559 What if all the information you had about a user vanished as soon as you returned that first piece of HTML?
00:00:48.690 While this might be fine for a website that only serves generic information, most of us don’t live in that world. We need to know about our users and keep track of some data about them.
00:01:01.170 This could include a user ID, preferred language, or whether they prefer the mobile or desktop version of your site. You could try solving this the functional programmer way by passing all the necessary data with every single request, but since we're using Rails, this problem is easy to solve.
00:01:25.740 We place data in the session hash, and it magically returns to us on the next request. With that, we never have to worry about which user is accessing our site ever again. And that's my talk: an introduction to sessions.
00:01:37.770 Thank you all for coming. But wait a second: how does this work? How does the data stick around? I mean, I thought HTTP was stateless.
00:01:48.810 Rails makes using sessions really easy, which is great, but it can also be a little dangerous. For a long time, I treated sessions like a database that didn't require any setup. I didn't quite understand how sessions worked, but I didn't really need to because they were a magic hash I could always depend on, until I couldn't.
00:02:06.799 That unexpected flakiness meant that I often hated using sessions. How many of you have felt unreasonably frustrated when sessions don’t work correctly? It’s frustrating to see session exceptions or bugs caused by missing data for a user.
00:02:24.280 In my first web programming job, many users didn’t have accounts, so we used sessions extensively. I encountered numerous problems with them; exceptions related to sessions showed up far more frequently than any others.
00:02:30.990 Several times, I got reprimanded for issues I caused because of session errors. Since I didn’t understand them, I tried working harder: I wrote more code and added null checks everywhere, and when those didn’t work, I thought about avoiding sessions altogether.
00:02:48.430 I considered making users log in on every page, thinking that would be great. But then I matured a bit and realized that this problem wouldn’t just go away. Instead, I spent time to really understand sessions at a deep level.
00:03:01.170 After some time, I learned to write code that avoided many of these issues in the first place. The nice thing is, to understand sessions, we really don’t need to get too complicated. This is what we want: we want to know about our users securely so that nothing can mess with that data.
00:03:33.400 We want to keep track of that information until they leave our site because once they stop using it, we don’t need to have that data around anymore. All we need is a way for the user’s browser to coordinate with our Rails app and connect everything up.
00:03:54.909 The issue of not being able to track users was acknowledged early on, especially as more people began using the web to buy things, which caused developers to seek ways to track data such as shopping carts and user preferences.
00:04:06.760 Earlier, I jokingly mentioned passing all the necessary data through URLs, but that’s quite close to what we typically do. If those parameters are in the URL, it's easy to see, lose, or fake them. However, there’s another option: when a browser makes a request to a server, it sends some headers.
00:04:30.740 So, what if browsers included user data in the headers? The server could see it, modify it, and send new data back to the browser, which would then send this modified information back to the server. They could essentially ping-pong information back and forth.
00:04:51.320 This idea of a header that the browser sends automatically originated at Netscape in the mid-90s, which called this special header a 'cookie'. About a year later, it was supported in Internet Explorer.
00:05:03.710 The nice thing about this is that the server doesn’t manage this data; it’s stored and managed by the browser. So, how do cookies work? What do they look like? Imagine making a request to Google through your browser.
00:05:17.060 When Google sends you a page, it also sends HTTP headers that contain metadata about the response. Focus on this line: it’s a cookie header. When your browser sees it, it stores the cookie alongside the server's details.
00:05:29.180 That way, the next time you request a page from Google, your browser will send headers along with the exact same cookie, establishing a shared connection for ongoing conversations.
00:05:35.890 All cookies consist of a few parts: they hold data—the information your server wants to remember—and metadata, which the browser uses to determine how and when that cookie should be sent to the server.
00:05:46.080 Before the semicolon, you have the data. When you use the session object in Rails, it stores data in this part and reads data from it. The rest contains metadata. For example, cookies can have expiration dates. After that date, the browser won’t send that cookie anymore, and the server won’t have access to that data.
00:06:06.030 If no expiration date is set, the cookie usually disappears as soon as the browser is closed. These are known as session cookies. They last for one browser session and are deleted upon closing the browser. Cookies that do have an expiration date are called permanent cookies, and they last until the specified date.
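The split between data and metadata can be sketched in a few lines of plain Ruby. This is a simplified illustration, not how any browser or Rails actually parses cookies; the helper names `parse_set_cookie` and `session_cookie?` and the example header values are assumptions made for this sketch.

```ruby
require "time"

# Split a Set-Cookie header value into its data (the name=value pair before
# the first semicolon) and its metadata (the attributes after it).
def parse_set_cookie(header_value)
  data, *attributes = header_value.split(";").map(&:strip)
  name, value = data.split("=", 2)

  metadata = attributes.each_with_object({}) do |attribute, result|
    key, attribute_value = attribute.split("=", 2)
    # Flag attributes such as Secure or HttpOnly carry no value.
    result[key.downcase] = attribute_value.nil? ? true : attribute_value
  end

  { name: name, value: value, metadata: metadata }
end

# A cookie with no Expires or Max-Age attribute is a session cookie.
def session_cookie?(cookie)
  !cookie[:metadata].key?("expires") && !cookie[:metadata].key?("max-age")
end

permanent = parse_set_cookie("lang=en; Expires=Wed, 01 Jan 2020 00:00:00 GMT; Domain=.example.com; Path=/")
temporary = parse_set_cookie("theme=dark; Path=/; HttpOnly")

puts permanent[:value]                               # the data part
puts Time.httpdate(permanent[:metadata]["expires"])  # the expiry as a Time
puts session_cookie?(temporary)
```

The first cookie carries an explicit expiry, so a browser would keep it until that date; the second has none, so it would be dropped when the browser closes.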
00:06:39.970 Sites cannot read each other’s cookie data. If I were visiting goggles.com instead of google.com, that cookie wouldn’t be sent. Subdomains are handled through the domain attribute: a leading period before a domain name indicates that the cookie applies to subdomains as well.
00:06:52.230 Setting a domain this way means the cookie is valid for the main domain and its subdomains, like Drive or Docs. There are also extra attributes you can assign to cookies, such as HttpOnly, which we’ll discuss later.
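As a rough sketch of the domain rule described above, the check below decides whether a cookie scoped to a domain would be sent to a given host. The method name `cookie_applies_to?` is made up for this example, and the logic is deliberately simplified: real browsers also consider the path, the scheme, and a public suffix list, and treat cookies without a domain attribute as host-only.

```ruby
# Decide whether a cookie whose Domain attribute is cookie_domain would be
# sent to host. Simplified on purpose; see the caveats in the text above.
def cookie_applies_to?(cookie_domain, host)
  domain = cookie_domain.sub(/\A\./, "").downcase
  host = host.downcase
  # Match the domain itself, or any host that ends with ".domain".
  host == domain || host.end_with?(".#{domain}")
end

puts cookie_applies_to?(".google.com", "drive.google.com")
puts cookie_applies_to?(".google.com", "goggles.com")
```

Matching on `".#{domain}"` rather than on the bare suffix is what keeps a host like notgoogle.com from receiving google.com cookies.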
00:07:03.310 You didn’t come here just to learn about cookies and get hungry before lunch; what’s all this got to do with sessions? Well, sessions are built using cookies since they’re a dependable way to track users without needing to remember all those parameters.
00:07:17.390 The better you understand cookies, the easier it will be to understand sessions and how they operate. Just like a hash, you can construct many more complicated systems on top of cookies.
00:07:32.560 However, out of the box, they’re pretty limited; a single cookie can hold only a single value, necessitating separate cookies for every key-value pair. Furthermore, since all information is stored on the user's browser, you can't necessarily trust what they provide. Imagine storing a username in a cookie—the server sends it to your browser, which you can then tamper with.
00:08:10.940 If the server relies on this cookie, it trusts the user’s provided data, which is problematic. This isn’t much safer than placing data in the URL, as previously discussed. How does Rails navigate these pitfalls? Let’s dive into an example.
00:08:41.580 Consider a simple controller action that takes the name from the params and places it into the session under session[:name]. This way, we can create session data and observe what we receive.
00:08:53.050 For instance, if we set the name to Justin, the controller processes that name, puts it into the session, and returns it. Rails stores the whole session in a single cookie under one key, here '_my_app_session'. If you check the initializers, you’ll find this key configured in the 'session_store.rb' file.
00:09:03.420 If you change that option, it will change the key under which your session data is stored, potentially breaking all your old sessions in the process—so be careful! The data stored in this cookie is usually unreadable, meaning that attempts to tamper with it could be challenging.
00:09:23.260 The rationale behind this is straightforward: Rails session cookies are signed. If tampered with, they become invalid. They’re also encrypted, preventing anyone from viewing the data inside.
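To see why a signed value becomes invalid when tampered with, here is a minimal sketch of signing with an HMAC. It is not Rails's actual cookie format; the secret, the `--` separator, and the helper names `sign`, `verify`, and `secure_equal?` are all assumptions for illustration.

```ruby
require "openssl"

# Hypothetical secret for illustration; a real app keeps its secret private.
DEMO_SECRET = "not-a-real-secret"

# Compare two strings without stopping at the first differing byte.
def secure_equal?(a, b)
  return false unless a.bytesize == b.bytesize
  a.bytes.zip(b.bytes).reduce(0) { |diff, (x, y)| diff | (x ^ y) }.zero?
end

# Append an HMAC of the encoded value so changes can be detected.
def sign(value, secret = DEMO_SECRET)
  encoded = [value].pack("m0") # Base64 without line breaks
  digest = OpenSSL::HMAC.hexdigest("SHA256", secret, encoded)
  "#{encoded}--#{digest}"
end

# Return the original value, or nil if the signature does not match.
def verify(signed_value, secret = DEMO_SECRET)
  encoded, digest = signed_value.to_s.split("--", 2)
  return nil if encoded.nil? || digest.nil?
  expected = OpenSSL::HMAC.hexdigest("SHA256", secret, encoded)
  secure_equal?(expected, digest) ? encoded.unpack1("m0") : nil
end

cookie = sign("name=Justin")
forged = cookie.sub(/\A[^-]+/, ["name=admin"].pack("m0"))

puts verify(cookie).inspect # the original value comes back
puts verify(forged).inspect # the forged value is rejected
```

Signing alone only detects changes; the value is still readable by anyone who Base64-decodes it, which is why encryption is layered on as well.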
00:09:31.340 All Rails applications have a secret key that gets regularly leaked on GitHub. This key is pivotal for encrypting and signing cookies. Rails generates default keys for the development and test environments, but you should create a new one for production using 'rake secret'.
00:09:46.430 When you boot your Rails application, it places the secret key into the key generator. You don’t need to worry about copying it down; I’ll have the code and links for you at the end.
00:10:03.690 The key generator at the top derives some secrets, which we then use to create an encryptor object. This is the same object we use for both encrypting and decrypting cookies. Let’s see what happens when we try to access our cookie data.
00:10:12.720 If we take that sizable encrypted text and pass it to our encryptor, we should see the session as JSON. So why is it JSON? JSON is the default serializer, and you can configure a different one in an initializer.
00:10:37.430 You could also use Marshal if you prefer, but most people opt for JSON as it’s the default and beneficial for sharing cookies between applications written in other languages.
00:10:58.300 Now, to summarize what we understand: Rails stores all session data in a single cookie. It does this by converting it into JSON, allowing multiple keys and values to exist in just one cookie. Additionally, Rails signs and encrypts the cookie to prevent tampering.
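The summary above (serialize the whole session to JSON, then encrypt it so it fits in one opaque cookie value) can be sketched with Ruby's standard library. This is an illustration under stated assumptions, not Rails's real implementation or cookie format: the cipher choice (AES-256-GCM), the salt, the iteration count, and the helper names `derive_key`, `encrypt_session`, and `decrypt_session` are all picked for the example.

```ruby
require "openssl"
require "json"

# Derive a 32-byte key from an application secret, loosely analogous to the
# key generator step described in the talk.
def derive_key(secret, salt)
  OpenSSL::PKCS5.pbkdf2_hmac(secret, salt, 1000, 32, OpenSSL::Digest.new("SHA256"))
end

# Serialize the session hash to JSON and encrypt it into one cookie value.
def encrypt_session(session_hash, key)
  cipher = OpenSSL::Cipher.new("aes-256-gcm")
  cipher.encrypt
  cipher.key = key
  iv = cipher.random_iv
  ciphertext = cipher.update(JSON.generate(session_hash)) + cipher.final
  [iv, ciphertext, cipher.auth_tag].map { |part| [part].pack("m0") }.join("--")
end

# Reverse the process; return nil if the value was altered or the key is wrong.
def decrypt_session(cookie_value, key)
  parts = cookie_value.split("--")
  return nil unless parts.size == 3
  iv, ciphertext, tag = parts.map { |part| part.unpack1("m0") }
  cipher = OpenSSL::Cipher.new("aes-256-gcm")
  cipher.decrypt
  cipher.key = key
  cipher.iv = iv
  cipher.auth_tag = tag
  JSON.parse(cipher.update(ciphertext) + cipher.final)
rescue OpenSSL::Cipher::CipherError, ArgumentError, JSON::ParserError
  nil
end

key = derive_key("hypothetical app secret", "hypothetical cookie salt")
cookie = encrypt_session({ "name" => "Justin", "user_id" => 42 }, key)

puts cookie                               # opaque text, safe to hand to the browser
puts decrypt_session(cookie, key).inspect # the session hash again
```

Because several keys live inside one JSON document, a single cookie can carry the whole session, and the authenticated cipher mode rejects any value that was modified in transit.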
00:11:13.810 Both the session key and the serializer can be configured if needed. Now we know that the cookie contains the 'name' parameter we initially sent, and if we stop sending that parameter, we should still be able to retrieve it from the session data.
00:11:35.490 Let's explore this. If we drop the parameter from the URL but send the cookie back to Rails, we should still know who we are without needing to pass in a parameter. This is how a browser retains data over multiple requests.
00:11:47.690 Put simply, your browser hits the server, the app stores data in the session, Rails converts that session data to JSON, encrypts and signs it, and sends the encrypted cookie back to the browser within the 'Set-Cookie' header.
00:12:05.600 The browser then keeps this alongside information about its origin, so next time you interface with your Rails app, the browser will send that cookie back, Rails verifies and decrypts it, turning it into that session hash.
00:12:22.360 It acts like parameters passed on every page but is managed automatically, so you don’t need to think about it. Rails can then update the data and send it back to the browser, replacing the previous cookie, allowing continuous data exchange.
00:12:36.160 However, if simply passing cookies back and forth was all there was to sessions, there would be no reason to call them sessions. I mean, you could just say, "Hey, here’s your session cookie; it's just a simple cookie." But cookies aren't all that reliable.
00:13:01.640 Consider what happens when you start storing significant data in that cookie. If you were to store 4 MB of data in a cookie or maybe the entire text of Moby Dick for some reason, each request to your server would incur that large data size, even if it doesn’t utilize it.
00:13:20.960 Cookies have limits: you can only put 4 KB of data in there, and exceeding this will trigger an 'ActionDispatch::Cookies::CookieOverflow' exception—the most delicious of all Rails exceptions!
00:13:37.360 Even 4 KB is considerably larger than most HTTP requests since typical requests are only a couple of hundred bytes. If you’re concerned about performance, you likely want to stay far beneath that 4 KB limit.
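A size check along these lines can be sketched as follows. The constant and the `CookieOverflow` class here are stand-ins defined for this example (the real exception is the Rails one named in the talk), and the sketch measures only the serialized JSON, whereas an actual cookie also grows when it is encrypted and encoded.

```ruby
require "json"

MAX_COOKIE_SIZE = 4096 # bytes; the 4 KB limit mentioned in the talk

# Stand-in for the Rails exception, defined here so the sketch is self-contained.
class CookieOverflow < StandardError; end

# Serialize a session hash, refusing anything too large for one cookie.
def serialize_for_cookie(session_hash)
  payload = JSON.generate(session_hash)
  if payload.bytesize > MAX_COOKIE_SIZE
    raise CookieOverflow, "session is #{payload.bytesize} bytes, limit is #{MAX_COOKIE_SIZE}"
  end
  payload
end

puts serialize_for_cookie({ "user_id" => 42 }).bytesize

begin
  serialize_for_cookie({ "novel" => "Call me Ishmael. " * 500 })
rescue CookieOverflow => error
  puts error.message
end
```

Storing just an identifier keeps the payload at a few bytes, while anything resembling a document quickly exceeds the limit.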
00:14:01.690 Now, if you need to store more data than what can fit in that cookie, how do you keep your cookies small while allowing large sessions? If you’re using cookies, you’re probably simply storing a user ID there; you wouldn’t store their email address, full name, or a list of cart items.
00:14:40.300 You’d simply record their user ID and later look up other information in your database. But what if users don’t have an account and, thus, no user ID?
00:14:53.600 In this case, you could generate a random session ID and store that in the cookie, allowing you to later use that ID to retrieve information from your database.
00:15:04.300 Now we have two options for persistently storing data across multiple requests: either store the data directly in the cookie or store a reference to that data in the cookie while keeping the actual data somewhere else, such as a database.
00:15:20.450 What would this second option look like? For instance, if we want to use Active Record to store our session data, we’d generate a random session ID and convert the session hash into a string. Then we could save both the ID and the data to a row in our database.
00:15:50.690 Later, we’d return the session ID with 'Set-Cookie' so that when the user returns to our site, the browser can use that ID to look up the session data, retrieving the session hash.
00:16:08.890 First, let’s change our session store to Active Record and add some data to the session using Curl. When we pass that name parameter, the application takes it out, puts it in the session, and now we get a short string instead of that large blob of encrypted signed data.
00:16:21.630 You’ll notice that the session ID and the string returned to the browser align, so we use that ID to look up the session data later. When your browser sends cookie data back to your site, it once again identifies who we are without needing to submit a parameter.
00:16:40.590 The process involves Rails extracting the session ID from the cookie, looking it up in the database, pulling the data associated with that ID, and transforming it back into the session hash. Session data can also be stored in Memcached, Redis, MongoDB, or other places.
00:17:05.390 These methods generally follow the same process: the cookie now functions merely as a session ID, while the app uses this ID to retrieve the rest of the information. You can even create your custom session store.
00:17:23.940 You simply need to inform Rack how to find sessions, create new sessions, write session data, and delete sessions by implementing a few methods. Rails includes a simple cache store that uses your Rails cache for session storage, which is an excellent example to follow.
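The four responsibilities listed above (find, create, write, delete) can be sketched as a small in-memory store. The class and method names are illustrative and only loosely modeled on that description; they are not Rack's actual interface, so consult Rack's documentation before writing a real store. A plain Hash stands in for the database table or cache.

```ruby
require "securerandom"
require "json"

# A toy server-side session store: the cookie would hold only the session ID,
# while the serialized session data lives here.
class MemorySessionStore
  def initialize
    @rows = {} # session ID => serialized session data
  end

  # Create an empty session and return its random ID.
  def create_session
    session_id = SecureRandom.hex(16)
    @rows[session_id] = JSON.generate({})
    session_id
  end

  # Look up a session by ID; nil when it is unknown or already deleted.
  def find_session(session_id)
    serialized = @rows[session_id]
    serialized && JSON.parse(serialized)
  end

  # Serialize the session hash to a string and save it under its ID.
  def write_session(session_id, session_hash)
    @rows[session_id] = JSON.generate(session_hash)
    session_id
  end

  def delete_session(session_id)
    @rows.delete(session_id)
    nil
  end
end

store = MemorySessionStore.new
session_id = store.create_session
store.write_session(session_id, { "name" => "Justin" })

puts session_id                             # what would go in the cookie
puts store.find_session(session_id).inspect # what the app reads back
```

Swapping the Hash for a database table, Redis, or a cache changes where the data lives but not the shape of the exchange with the browser.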
00:17:46.480 That’s essentially how Rails stores sessions. Two strategies are employed: the cookie store strategy and everything else strategy. Regardless of storage method, some data must be retained in the cookie to maintain that connection.
00:18:02.600 While the cookie store keeps all the data in the cookie itself, the alternative stores keep only a reference in the cookie and put the data wherever you like, whether in a database or in memory.
00:18:18.200 However, a choice remains: you can select from various session storage options, which is a crucial decision since altering session stores is complex. So, should you choose the cookie store, the cache store built into Rails, or the database store?
00:18:39.490 Using cookies for session data is the simplest option, requiring no additional infrastructure setup; it works out of the box. Another benefit is that it aligns well with your user lifecycle.
00:19:06.480 When users are active on your site, their session data persists, and when they stop visiting, you don’t have to perform any cleanup as the cookies remain on the browser side instead of the server side.
00:19:29.520 No other methods can guarantee this convenience. However, cookie storage is limited; it allows only 4 KB of data and has vulnerabilities to certain attacks.
00:19:38.770 If the cookie store is unsuitable, your two choices are to save sessions in a database or use Rails cache. If you’re already using Memcache for caching partials or API responses, using it for session data is convenient and requires minimal effort.
00:19:56.260 This alleviates worries about sessions becoming too large since most caches evict older items when new data comes in, thereby remaining fast. But it isn’t perfect; session data must compete for space in the cache.
00:20:22.370 If memory is insufficient, you risk facing early cache misses and expiring sessions. If you need to reset your cache due to a significant change, doing so will also wipe your sessions clean.
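The trade-off described above can be illustrated with a toy bounded cache in which sessions share space with other cached items. `TinyCache` is a made-up class for this sketch, and it evicts the oldest written entry, which is simpler than the eviction policies real caches use.

```ruby
# A toy cache with a fixed capacity. When it is full, the oldest entry is
# evicted, whether it is a cached page fragment or someone's session.
class TinyCache
  def initialize(max_entries)
    @max_entries = max_entries
    @entries = {} # Ruby hashes remember insertion order
  end

  def write(key, value)
    @entries.delete(key) # re-writing a key makes it the newest entry
    @entries[key] = value
    @entries.shift while @entries.size > @max_entries # evict the oldest
    value
  end

  def read(key)
    @entries[key]
  end

  # Resetting the cache also wipes every session stored in it.
  def clear
    @entries.clear
  end
end

cache = TinyCache.new(2)
cache.write("session:abc123", '{"name":"Justin"}')
cache.write("views/home", "<html>...</html>")
cache.write("api/response", "[]")

puts cache.read("session:abc123").inspect # evicted to make room
```

The session was lost not because it expired but because unrelated cache traffic pushed it out, which is the early-expiration risk the talk describes.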
00:20:36.810 Nevertheless, this is how we commonly store data in our main Rails applications, and it has performed well with those considerations. If you want to have sessions persist until they legitimately expire, keeping them in some form of database is advisable.
00:20:56.190 However, utilizing a database to store sessions presents its challenges. Sessions won’t naturally clean up, requiring you to delete old sessions manually.
00:21:22.061 You'll also need to be mindful of how your database copes with a load of session data. For example, if you’re using a Redis session store, are you keeping all session data in memory? Does your server have enough memory to hold it?
00:21:47.210 Or will it start swapping so heavily that you can’t even SSH into it to fix the issue? You’ll also need to be careful about when you create session data; if you trigger session creation on every request while Googlebot crawls, it could generate a multitude of unnecessary sessions.
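Manual cleanup of a database-backed store usually amounts to deleting rows that have not been touched for a while. The sketch below shows the idea on an in-memory array; `SessionRow`, `purge_stale_sessions`, and the one-hour cutoff are assumptions for illustration, and a real application would run the equivalent delete against its sessions table on a schedule.

```ruby
# Stand-in for a row in a sessions table.
SessionRow = Struct.new(:session_id, :data, :updated_at)

# Keep only the sessions touched within the last max_age_seconds.
def purge_stale_sessions(rows, now:, max_age_seconds:)
  cutoff = now - max_age_seconds
  rows.reject { |row| row.updated_at < cutoff }
end

now = Time.at(1_500_000_000)
rows = [
  SessionRow.new("active", "{}", now - 60),        # touched a minute ago
  SessionRow.new("abandoned", "{}", now - 86_400)  # untouched for a day
]

remaining = purge_stale_sessions(rows, now: now, max_age_seconds: 3600)
puts remaining.map(&:session_id).inspect
```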
00:22:00.000 Most of these problems don’t arise often, but they require careful consideration when temporarily storing session data. However, if you're confident that the cookie store limitations won’t impact you, I recommend that approach.
00:22:22.730 It’s simple, requires no setup, and is headache-free from a maintenance perspective.
00:22:42.990 For cache versus database, it’s more about how much maintenance you're willing to take on versus the risk of early session expiration. My personal preference is to start with the cookie store, followed by the cache store, and then the database store.
00:22:56.700 In this talk, we've used sessions mainly to identify users, making this an attractive target for attackers. This dynamic introduces additional considerations beyond simple key-value pairs.
00:23:18.210 We need to keep our cookies secure. The Rails server has to trust cookies because it has nothing else to rely on. If someone gains access to your session cookie, there’s no way for Rails to recognize that they're not you.
00:23:32.600 On many public Wi-Fi networks, it's easy to snoop on other users' network traffic. Therefore, if you send cookies unencrypted over insecure networks, someone could grab your cookies and impersonate you.
00:23:47.410 This vulnerability became notable a few years ago with a proof of concept called Firesheep, which demonstrated how easily this could happen. Just a double-click could instantly log you in as someone else, which is quite frightening.
00:24:06.000 The remedy for this is to run your site over HTTPS, ensuring that all cookie and session data is secured as part of your internet traffic. You can enable this quickly in Rails.
00:24:21.490 It does require some additional infrastructure, but once that is in place, setting config.force_ssl to true in your production configuration makes Rails enforce HTTPS.
00:24:36.400 With free SSL options like Let’s Encrypt and several companies supporting it, there’s hardly an excuse to avoid SSL. Once you implement SSL in Rails, it'll mark session cookies as secure.
00:24:49.680 This ensures that cookies only get sent over HTTPS. It works much like the domain restriction: just as the browser won’t send a cookie to a different domain, it won’t send a secure cookie over plain HTTP.
00:25:04.160 However, sniffing traffic on Wi-Fi isn’t the only way cookie data can be exposed; JavaScript can also read cookies. So, if you’re on Google, a script running on the page can use document.cookie to read its cookies.
00:25:22.230 Anyone capable of executing JavaScript can access those cookies and relay them elsewhere. Myspace serves as an illustrative example of this vulnerability. They once had many scripting flaws, permitting attackers to easily embed JS or Flash on user profiles.
00:25:40.890 This ability allowed them to harvest profile information for anyone visiting their pages, such as names and account IDs. Arguably, they may have even accessed login credentials, but that part eludes my knowledge.
00:26:00.010 Rails safeguards against several of these attacks by escaping HTML by default, and it automatically marks session cookies as HttpOnly. A cookie marked HttpOnly is only accessible to the server and not to any JavaScript.
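On the wire, both protections are just extra attributes on the cookie header. The sketch below builds such a header and models the browser's decision for the secure flag; `build_set_cookie` and `send_cookie?` are hypothetical helpers written for this example, not Rails or browser APIs, and the cookie name and value are placeholders.

```ruby
# Build a Set-Cookie header value with the two security flags discussed above.
def build_set_cookie(name, value, secure: true, http_only: true, path: "/")
  parts = ["#{name}=#{value}", "Path=#{path}"]
  parts << "Secure" if secure      # only send over HTTPS
  parts << "HttpOnly" if http_only # hide from JavaScript (document.cookie)
  parts.join("; ")
end

# Model of the browser-side rule: a Secure cookie is withheld on plain HTTP.
def send_cookie?(set_cookie_header, scheme)
  secure = set_cookie_header.split(";").map(&:strip).any? { |attribute| attribute.casecmp?("secure") }
  !secure || scheme == "https"
end

header = build_set_cookie("_my_app_session", "opaque-encrypted-value")

puts header
puts send_cookie?(header, "https")
puts send_cookie?(header, "http")
```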
00:26:20.210 Thus, it helps counter many of these vulnerabilities. If that’s insufficient, note that you cannot fully trust users. Consider running a music store where users earn credits to purchase songs.
00:26:40.730 Suppose you eliminate signups to make purchasing smoother and keep the credit balance in the session. Everything seems fine until a user sends this cookie with 400 credits.
00:26:56.460 After a purchase, if you return a cookie with a new balance of 300 credits, there’s nothing preventing the user from disregarding that and reusing their old cookie with 400 credits, granting them infinite credits.
00:27:19.190 This scenario doesn’t lend itself to straightforward solutions; you could put a unique number in each session so an old cookie can’t be reused, which isn’t the most appealing approach, or shift to a database store.
00:27:46.470 A better alternative is just not to store such data in the cookie in the first place—this is what databases are meant for. These issues represent some fascinating attack vectors, but I recommend the Rails security guide for additional exploration.
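The replay problem is worth seeing concretely: signing proves a cookie came from the server, but not that it is the latest one. In this minimal sketch the secret, the `credits=...` payload, and the helper names are assumptions; the comparison is a plain `==` for brevity, where production code would use a constant-time check.

```ruby
require "openssl"

# Hypothetical secret for illustration only.
REPLAY_DEMO_SECRET = "not-a-real-secret"

def sign(value)
  digest = OpenSSL::HMAC.hexdigest("SHA256", REPLAY_DEMO_SECRET, value)
  "#{value}--#{digest}"
end

def valid?(cookie)
  value, _separator, digest = cookie.rpartition("--")
  digest == OpenSSL::HMAC.hexdigest("SHA256", REPLAY_DEMO_SECRET, value)
end

old_cookie = sign("credits=400") # issued before the purchase
new_cookie = sign("credits=300") # issued after spending 100 credits

puts valid?(new_cookie)
puts valid?(old_cookie) # the old cookie is still genuine, so it is still accepted

# Keeping the balance on the server avoids the problem: the cookie holds only
# an ID, and the server decides what the current balance is.
balances = { "session-abc" => 400 }
balances["session-abc"] -= 100
puts balances["session-abc"]
```

Replaying the old cookie changes nothing in the second approach, because the balance was never taken from the browser in the first place.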
00:28:06.390 It may seem overwhelming to navigate cookies and sessions, yet I can share a few rules of thumb to minimize issues. First, always prepare for sessions to disappear unexpectedly. Sessions are on the user’s device.
00:28:25.120 This poses an issue as you have no control over when users clear their cookies or change devices. Always keep in mind that sessions might not be there anymore when you use them.
00:28:44.050 Program defensively, because it will happen. The second rule: never store actual objects in the session. For example, if you store a cart item object in the session and later change the structure of that class, reading old sessions may fail.
00:29:03.490 Such discrepancies have caused me to take down large portions of a site before. You usually have two undesirable options: reverse your changes or reset all session data.
00:29:12.360 Unless you absolutely need to retain specifics in the session, store only references to objects, not the objects themselves. Finally, use sessions with intention.
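The first two rules can be sketched together: read session data as if it might be missing, and keep only IDs in the session while the real objects stay in the database. Here a plain Hash stands in for the session, `PRODUCTS` stands in for a database table, and the key names are invented for the example.

```ruby
Product = Struct.new(:id, :name)

# Stand-in for a products table.
PRODUCTS = {
  1 => Product.new(1, "Album"),
  2 => Product.new(2, "Single")
}.freeze

# The session holds product IDs, not Product objects. A missing key yields an
# empty cart, and IDs that no longer exist are skipped instead of raising.
def cart_products(session)
  ids = Array(session["cart_product_ids"])
  ids.map { |id| PRODUCTS[id] }.compact
end

# Fall back to a default when the session has no value for a key.
def preferred_language(session)
  session.fetch("language", "en")
end

puts cart_products({}).inspect # a brand-new or cleared session
puts cart_products({ "cart_product_ids" => [1, 99] }).map(&:name).inspect
puts preferred_language({})
```

Because only IDs are stored, changing the structure of `Product` later does not invalidate sessions that already exist in users' browsers.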
00:29:28.960 It's easy to overpopulate sessions with miscellaneous data, which can create havoc. One troubling bug surfaced after deploying a change, showcasing exceptions that seemed random.
00:29:52.780 A thorough investigation revealed conflicting data in the session stemming from old code. The session was convenient, and we neglected to design a new database table, which ultimately led to major issues.
00:30:11.880 When you forgo using sessions for specific data, you eliminate potential problems nested within them. Even when following best practices, issues can arise.
00:30:31.570 When debugging unexpected outcomes, isolate the problem area. Is a function receiving the correct input? Is it delivering the expected output? You can keep narrowing down until you pinpoint the issue.
00:30:52.000 The best tools for debugging session issues are those that display what your server sends and what it receives. Who here uses tools like Curl, Postman, or Paw in web development? These are invaluable for troubleshooting session issues.
00:31:08.860 These tools allow you to see session data that your server sends and let you submit arbitrary sessions to observe server responses. If those tools confirm your server works correctly, you can assume a browser-related anomaly.
00:31:24.780 For debugging network-related issues, mitmproxy is my preferred tool. It’s a small proxy server that sits between the browser and the app, displaying all network connections and letting you investigate requests and responses interactively.
00:31:43.580 Last week, I used it to analyze a session race condition caused by conflicting Ajax requests. With mitmproxy, I was able to create an actual timeline of the requests.
00:32:01.560 If your browser isn’t successfully sending cookies, it’s often a sign that domain settings are misconfigured. This is easily overlooked and can complicate development.
00:32:10.720 To assist in resolving these issues, I’ve made a gem for use in development mode that can decrypt cookie strings using the Rails key generator.
00:32:30.130 All this reinforces that sessions are vital to the modern web, albeit a modern web that dates back to 1995! When issues with session data arise, they might appear intricate and perplexing.
00:32:48.640 However, session data isn’t too complicated at its core; sessions hinge on a basic primitive: a single key-value pair and some metadata.
00:33:06.910 Upon this foundation, you gradually build more features, serializing data to fit it into one cookie or using the cookie value as a reference to stored data elsewhere.
00:33:22.200 While sessions can become extensive and intricate, at their core, they are relatively simple components that can be combined together.
00:33:35.470 This, to me, is one of the fascinating aspects of software development—it’s all just code. Concepts that seem complex often have foundations crafted around specific needs or problems.
00:33:46.960 And as you become familiar with these concepts, they begin to feel much more understandable. Whenever you encounter frustrating bugs, instead of enduring them passively, treat them as challenges.
00:34:06.350 Dig into the components of the issue until you grasp them. Transform those moments of confusion into enlightening lessons that you can leverage for your programming career.
00:34:22.700 If you’ve recently discovered something intriguing or want to discuss programming—or really anything—feel free to reach out. My email address is up here, and I cherish receiving emails; I read and respond to them all.
00:34:45.280 And if you’re ever in Seattle, I’d love to grab coffee. The final link I’d like to share is the resources for this talk, with slides, the gem to decode encrypted sessions, and several other relevant links.
00:35:02.400 It looks like I have a minute or so remaining for questions, if anyone has them. I’m happy to answer or creatively deflect them. Thank you so much again for your time!