00:00:09.570
Good afternoon, everybody! How was lunch? Good? Good.
00:00:16.890
Awesome! This is the Junior Developer track, and I'm curious about everyone's backgrounds.
00:00:22.990
How many of you came to Rails from some kind of boot camp? Okay, how long was that boot camp? Eight weeks? Six weeks? Twelve weeks? Six months? Seven months? That's a long time, that's basically a degree.
00:00:34.870
How many people have a traditional computer science degree? Okay, cool.
00:00:49.390
And how many of you are learning by doing things, like watching internet videos and so forth? Well, just so you know, that never changes. I've been developing for a long time, and internet videos and just trying things out is pretty much how you have to roll.
00:00:57.309
Behind almost every Rails app, and almost every app in general, is a powerful language called SQL. I got the idea for this talk when I was working with a junior developer who was trying to build a dashboard page to provide some stats on user activity.
00:01:09.640
He was using the tools that are built into Ruby, because that was what he was familiar with. The problem with that is that when you put that code in production, all of a sudden instead of downloading just a couple dozen rows you have in your development system, you end up asking Ruby to pull down millions of rows for calculations.
00:01:25.210
This is a situation where clearly the database is the right place to be doing the heavy lifting. When we started pairing on this task, he explained the problem and I immediately jumped into a SQL prompt and started writing a SQL query that gave us the answers we were looking for for the dashboard.
00:01:39.159
The thing he said to me was, 'Wait, how did you do that?' This is a key moment for me because I've found that any time I say, 'Wait, how did you do that?' I'm about to learn something. It became a learning moment for both of us. I realized not everyone has the same SQL background, and he learned a lot about the tools available behind the scenes.
00:01:58.659
We spent the whole hour writing SQL queries, and I think we both came away learning a few things. SQL, to put it another way, is the hero that Rails deserves.
00:02:11.110
Let's address the pronunciation. Is it SQL or is it Sequel? I go back and forth between both; it doesn't really matter.
00:02:15.700
SQL is officially the correct term, but if you say Sequel and someone corrects you, that's totally fine. It used to be called Sequel and changed to SQL due to trademark disputes.
00:02:28.489
So, today we're going to talk a bit about the backstory of SQL, review some great examples, and discuss where to go from here.
00:02:42.560
A little about me – my name is Ryan DeLucas. I didn't sneeze through my name; it's actually pronounced that way. Anyone else here Polish? All right, we’ll find some barbecue pierogi after this.
00:02:56.000
I've been programming for about twenty years, with fifteen of those being professional work. I live in Cincinnati, Ohio with my family.
00:03:05.290
You can find me on Twitter, I'm @Lebowski. So, I don't really need to tell you my favorite movie.
00:03:15.420
I'm a consultant, which pretty much means I say yes to things before I actually know how to do them. Learning as you go is a skill that you'll develop as a developer.
00:03:24.430
Recently, I have started working as an instructor for Girl Development, which is an organization that helps adult women launch into technology careers. If that's something that interests you, talk to me afterward.
00:03:36.220
Now, let’s discuss the world before ORMs.
00:03:47.920
ORM stands for Object-Relational Mapping, which is an all-encompassing term for things like Active Record.
00:03:54.940
In the late '90s, we were building web applications, but we weren't using tools like Active Record.
00:04:01.000
Instead, we were putting the SQL directly in line with our code, using programming languages like Perl, ColdFusion, and PHP. PHP is still in use today, and more recently, the structure of PHP apps is much better than it used to be.
00:04:15.260
We would embed these SQL strings in the code and it got messy, complicated, and frustrating. Let me give you an example.
00:04:31.080
Say we had an application and wanted to display a list of all the people in our system. We could write a very basic SQL query to select the name and some parameters from our people table.
00:04:42.210
That's great; we have a query, but now we must execute it against our database and get back a bunch of data collected into an array of arrays.
00:04:54.510
We have all this data, but now to use it in our application in an object-oriented way, we need to create those people objects and collect them into an array.
00:05:07.439
We'll iterate through each row of the results to create a new person object and manually assign the data elements into the corresponding attributes of our objects.
00:05:27.400
Additionally, we must consider things like translating values, such as how in the database, it's often stored as 0 or 1 but in our object, we require boolean types.
00:05:38.700
This process of creating objects gets repetitive, monotonous, and error-prone. The code also becomes fragile.
00:05:52.560
If you change something in the query structure, it's easy to break that mapping. As developers, we often try to avoid repetitive tasks by looking for ways to automate.
00:06:02.060
I must admit something hilariously naive from my first paid application, back in the late '90s, it was a PHP app, a standard CMS-type of application for a small business.
00:06:15.800
Many of the forms looked pretty basic; we collect a few fields and submit them. But take a look at the top line where it submits to something called save_data.php.
00:06:28.000
My workaround was to pass a hidden field called table_name, and on the backend I would take that and dynamically build SQL queries to insert, update, or delete data from the database.
00:06:44.900
This way of coding is incredibly unsafe. If you don't understand why this approach is a bad idea, I suggest you attend some application security courses this week; you'll find them very insightful.
00:06:56.760
My solution went something like, 'Please don’t hack me on submit'. Although this was behind an admin interface, it’s still not the way to code.
00:07:09.170
Fast-forward to the mid-2000s, and ORMs were starting to gain popularity. In the Java space, we had Hibernate and JPA, while in Rails, we have Active Record.
00:07:25.080
With Active Record, we can take a giant mess of code and replace it with a simple one-liner.
00:07:37.020
If Active Record can do all that, why should I even bother learning SQL? One reason is because SQL is everywhere.
00:07:49.580
SQLite, which ships with Rails for development, is among the top five most popular pieces of software in the world.
00:08:00.340
SQLite is installed on every Android phone, every iPhone, every Mac, every Windows 10 PC, across all Linux distributions, and in many cars, airplanes—it's literally everywhere.
00:08:11.740
But even with that, why learn SQL if you can interact with it through Active Record and ORM tools? The truth is, ORMs come and go.
00:08:26.260
I've used three or four different ORMs, each with different APIs that change every time. Once you finally learn how to do something beyond the basics, you're looking everything up in documentation.
00:08:39.750
Three or four years from now, you might be using a different language and ORM, and you won’t be able to apply what you've learned about Active Record because the structures change.
00:08:51.570
However, the one constant language is SQL; it will always be the same.
00:08:59.100
So, apps, frameworks, and databases may come and go, but SQL will endure.
00:09:10.780
If you think this narrative is cheesy, you can thank my friend Chris Nelson from the Cincinnati user group for it.
00:09:17.470
The main point is that you want to learn the native language of whatever you are working on.
00:09:29.700
Richard Hipp, the creator of SQLite, mentioned in a podcast that being a tourist can get you by with pointers and a few keywords, but to succeed, you must learn the language.
00:09:38.750
By learning SQL, the native language of databases, you'll transition from a junior programmer to a seasoned pro who understands the right tool for the job.
00:09:56.170
Today, we'll look at some examples to demonstrate this.
00:10:04.310
To do this, we need to create a basic schema for authors and books using Active Record, as it's great for this kind of thing.
00:10:22.730
In our schema, an author has many books, while a book belongs to an author.
00:10:30.160
We will start by defining fields for authors, such as name, country, email, and birthdate.
00:10:40.070
For books, we’ll have fields like titles, pages, publication date, topics, ISBN number, and most importantly, an author ID.
00:10:47.340
This helps establish the relationships in the database where books are tied back to their corresponding authors.
00:10:56.140
Let’s assume you’re on a new project, or perhaps it's an existing one, and your boss or client wants information about the data you have.
00:11:04.480
For instance, they might want to know how many books you have. The SQL for this is pretty straightforward, and similarly, here's the Active Record call to achieve it.
00:11:25.360
To count how many authors we have, the SQL statement would just be a simple select to count the IDs from the authors table.
00:11:31.660
In Active Record, you would just call Author.count.
00:11:41.650
Now, let's see how we would handle this in a database terminal. It depends on your database, but if you're using Postgres, which I suggest, you can launch it against your database.
00:11:54.150
You will find yourself in another prompt where you can type commands to interact directly with the database. You would input your query and receive the results.
00:12:03.270
For example, we could discover that we have about four thousand books. Similarly, we can check the authors' table query and find that we have around five hundred authors.
00:12:11.030
That’s all very basic, but let’s dive deeper. We could ask how many books we have in each topic, which adds an interesting layer.
00:12:19.760
To do this, we would use a tool we haven't yet discussed called GROUP BY.
00:12:26.080
We would select our topic from the books table, count the occurrences, and group them by that topic.
00:12:38.160
This gives us a set of results with one line for each topic, along with the count of books associated with each topic.
00:12:48.630
Does that make sense to everyone? You can also execute this in Active Record. However, both are straightforward queries.
00:13:00.510
As you get more complicated in your queries, like the one we're about to create, you may find that writing the Active Record code takes as long as writing it directly in SQL.
00:13:14.060
Let's explore a more complex query—we want to find out how many books we have in each topic while also finding the minimum, maximum, and average page count.
00:13:23.410
To achieve this, we would start our SQL query like this: select topic, count(topic), minimum pages, maximum pages, average pages.
00:13:34.060
We will call minimum as a function the database provides, along with maximum and average calculations.”
00:13:43.930
After aggregating, we will group by topic. When viewing the results, we will have our topics listed along with their respective counts and our calculated page statistics.
00:13:56.360
While you may not explicitly calculate pages count, you might track user activity behind the scenes, requiring you to analyze metrics scattered across various tables.
00:14:05.660
Allowing the database to do the heavy lifting is far more efficient than trying to pull extensive data sets into your Rails application.
00:14:16.020
Now, let's discuss how to join data together.
00:14:29.390
How can we answer the question, 'Which authors have written five or more books on a given topic?' This question is relevant for outreach opportunities.
00:14:41.180
A join combines two tables based on criteria; in this case, an author's ID and a book’s author ID.
00:14:53.700
For our SQL, we'd start with selecting the author's name and the count of their books.
00:15:05.950
We'll select from the authors' table and join it with the books table linked by our criteria, grouping by both the author's name and the topic.
00:15:17.750
Next, we will want only those authors who have written more than four books, so we'll use a HAVING clause to limit our result set.
00:15:29.020
This gives us the data needed to see which of our 500 authors have actually written five or more books.
00:15:36.530
While reviewing the results, we might notice that topics like Ruby and Rails didn’t show up, but that’s just the randomness of our test data.
00:15:46.600
You can build upon these queries and nest them as needed. If you had sales data, for example, you could refine your search to include how many authors sold over a thousand copies.
00:15:54.890
Joining data is essential. We just covered a basic join, but there are multiple types of joins to explore.
00:16:09.570
Suffice it to say, there are ways to retrieve unmatched data alongside matched data.
00:16:19.470
Let's discuss a tool called EXPLAIN, which shows how the database is working behind the scenes.
00:16:36.300
This is useful for diagnosing performance issues. When you run into a situation where your local application runs well but the production version does not, this tool can help.
00:16:47.350
You can easily run EXPLAIN by prefixing any query. It will analyze the execution and tell you how it plans to find the information.
00:17:03.000
You can also use EXPLAIN ANALYZE, which runs the query and provides actual execution details.
00:17:12.020
EXPLAIN outputs can look dense and complicated, but a key takeaway is that if you see a 'sequential scan', that's usually a red flag.
00:17:23.050
A sequential scan means the database has to look through every single row, which is very slow. You can add an index to your database to help with this.
00:17:34.890
There is definitely an art to creating effective indexes, but the first step is running EXPLAIN on any slow queries or even just as a habit.
00:17:43.700
You can also check your development logs used by Rails; they log actual SQL queries, letting you investigate further.
00:17:50.720
By understanding the database's workings, you can learn to optimize.
00:18:01.840
Now, we have to discuss direct SQL calls from within a Rails application.
00:18:10.570
I want to emphasize that executing raw SQL directly can be dangerous; there are many edge cases.
00:18:21.440
You might use it in migration when Rails does not know how to create certain column types, such as geography type columns.
00:18:36.950
In this case, you would acquire a connection from the Active Record connection base to execute your SQL.
00:18:45.300
Another method can be using find_by_sql, but you must be cautious not to include unsanitized user input.
00:18:57.630
Using find_by_sql would return instantiated objects of the model type you've invoked.
00:19:04.480
For any data not specified in your parameters within your SQL, they return as virtual attributes on your model.
00:19:14.550
If you don’t know how to sanitize inputs, be careful: remember Bobby Tables?
00:19:27.050
You never know when little Bobby tables will show up in your application, so always sanitize your user inputs.
00:19:41.010
So, where do you go from here? There is obviously a lot to learn about SQL.
00:19:52.510
If you don’t have a book on databases and SQL at your desk, pick one up. SQL doesn’t change much, so a good book will serve you well.
00:20:02.610
Check the documentation for your specific database to learn about features and aggregation functions.
00:20:15.390
Pair programming is the best way for a junior developer to level up quickly; working alongside senior developers exposes you to real-world problems.
00:20:31.260
If you don’t have an opportunity to pair in person, there are many online platforms to connect with others.
00:20:41.800
Remember to scour your logs for database queries, understand them, and run EXPLAIN to get insight into your application's performance.
00:20:55.370
The takeaway is to learn the native language of your database.
00:21:02.460
Understanding what Active Record does for you behind the scenes is crucial.
00:21:10.420
Use Active Record as a great tool, but know when to jump in and use the database directly.
00:21:18.190
You have many tools at your disposal; knowing when and how to use them is key.
00:21:29.380
Thank you very much! If you have questions, I’d be happy to take them; we have a couple of minutes.
00:21:57.170
Yes?
00:22:02.460
My recommendation is don't execute raw SQL from within your Rails app without knowing what you're doing.
00:22:15.160
There are a few different ways to do it. You can get a connection from Active Record’s connection base and execute raw SQL.
00:22:22.950
This is useful inside migrations, especially when you need to create specific types of columns.
00:22:31.850
Using find_by_sql is another way to incorporate raw SQL, but be careful to avoid user data.
00:22:41.950
Using find_by_sql allows for returning instantiated objects.
00:22:48.600
These objects will have virtual attributes based on what you specify in your SQL.
00:22:56.370
Are there any other questions?
00:23:04.880
I don’t have a specific example for when table rewrites are necessary.
00:23:12.400
With any migration, you should be cautious to ensure your application remains stable.
00:23:20.670
For example, in Postgres, creating indexes concurrently is one way since Rails doesn’t usually support this.
00:23:31.260
You can build your migration by hand that utilizes this feature.
00:23:36.350
Does anyone have any additional questions?
00:23:45.550
It's not necessarily faster to write SQL directly; Active Record usually does the right thing.
00:23:55.450
You can observe this when looking at how your application operates.
00:24:06.250
Active Record translates your commands directly into SQL queries executed against the database.
00:24:15.730
In some scenarios, stepping in with raw SQL will provide better efficiency.
00:24:24.790
The key is to observe what Active Record does and learn from it. Thank you all!