00:00:14.240
Hello! So, are you hungry? I have good news for you.
00:00:20.840
After my session, you can have lunch. First, I want to thank the organizers for the conference.
00:00:28.000
As I mentioned in yesterday's panel discussion, to become a senior developer, we need solid and broad knowledge.
00:00:39.840
This year, Ruby on Rails has turned 10 years old and remains one of the best frameworks in the world.
00:00:46.320
The Rails core team keeps adding new features and best practices into the framework.
00:00:54.160
However, nowadays, there are more voices claiming that Rails is not for beginners anymore. This statement is both true and not true.
00:01:05.560
It is true because there is so much background knowledge required besides Rails itself. It's not true because Rails has done a good job of hiding those details.
00:01:20.280
Regardless, if we want to become senior developers or architects, we must learn those foundational concepts.
00:01:26.479
Today's topic is about databases in Rails. Let me introduce myself.
00:01:33.720
I'm Ding Ding Ye from mainland China. You can find me on Twitter and Facebook.
00:01:40.399
I am the founder of a simple and productive CL tool for product teams, and I'm also the organizer of Rails China.
00:01:45.439
We will hold the sixth Rails China conference on November 1st and 2nd, so welcome everyone!
00:01:57.000
I also run a podcast called Tiawa, which targets engineers in China.
00:02:04.240
Now, let’s start my topic. We all know that real-world applications are complex, especially Enterprise applications.
00:02:11.440
Our goal is to have a low load but a low response time. This is the architecture most commonly used in real-world applications.
00:02:16.920
We have the front-end web server handling requests from browsers or clients and then passing them to applications.
00:02:29.519
The application server accepts these requests and interacts with the database server for data exchange.
00:02:43.000
Many studies test where the performance bottlenecks are among these components. Based on appearances, we see that 44% of performance bottlenecks occur in the database server, 32% in the application server, 14% in the web server, and 10% on the network.
00:03:05.360
This result suggests that nearly 76% of performance bottlenecks appear in the application and database servers.
00:03:12.159
Identifying performance bottlenecks is crucial for optimizing application performance.
00:03:20.000
As we can see, database performance is critical for overall application performance. There are many factors that contribute to poor database performance.
00:03:34.239
These include inefficient SQL statements, poor data model design, inadequate database configurations, missing or inefficient indexes, inefficient data caching, and long concurrent transactions.
00:03:48.959
To avoid these problems, let's explore some best practices.
00:04:01.239
Every application starts with data modeling. This is the job of the database designer who transforms user requirements into application data models.
00:04:15.120
Let's look at a simple example. Here, everyone should be familiar with this code snippet. It creates a table called 'users' with attributes such as user ID, name, email, confirmation status, and account ID.
00:04:28.520
We define several constraints in our schema, such as 'name' should not be null, and 'confirmed' should be true by default. This ensures data integrity.
00:04:40.880
Data integrity constraints protect against accidental damage to the database.
00:04:52.840
For our example, we also create a primary key called 'ID' for the user table, which is an auto-incrementing field.
00:05:04.240
The second type of constraint is 'NOT NULL' because we mark the 'name' of the user as required. The third type is 'UNIQUE' to ensure that the 'uid' field is unique across the table.
00:05:19.199
The fourth type is the 'REFERENTIAL' constraint, ensuring that the 'account ID' of the user must also appear in the 'accounts' table.
00:05:39.039
However, database integrity is managed within your application models rather than solely in the database itself. That's why we use validations to enforce data integrity.
00:05:52.479
The benefit of handling validations at the application level is that we can easily customize error messages when a constraint is violated.
00:06:05.680
Let's elaborate a bit more on referential constraints. Before we create a user record, we want to ensure that the corresponding account record exists.
00:06:18.319
If we delete the account without any control, the user record will remain in a dirty state because it cannot find the account anymore, violating the referential constraint and causing bugs in our application.
00:06:39.240
That’s why we use dependent options to control what happens to associated objects when the account is destroyed.
00:06:53.000
There are several options available, such as 'delete', which destroys associated user records when the account is destroyed, or 'nullify', which sets the user’s account ID to null.
00:07:21.160
The 'restrict' option raises an error if there are associated users when attempting to delete the account.
00:07:28.720
Nonetheless, it is still possible to violate referential constraints if we use statements such as 'account.delete', because no callbacks will be triggered.
00:07:40.800
If we require real foreign key support, we should use the 'foreign key' option that is included in the Rails 4.2 framework by default.
00:08:07.599
Let’s discuss another topic: views. This is not the view in MVC but rather a database view.
00:08:24.000
Database views are not widely used in real applications but can be very useful in certain situations.
00:08:35.440
A view is essentially a virtual table based on the result set of a SQL query. For example, we might want to create a view to list user memberships in projects.
00:08:53.280
This can be complex with multiple joins, but by creating a view, we simplify the query.
00:09:05.600
In essence, by defining a view, we can streamline our database interactions, avoiding unnecessarily complex queries.
00:09:27.400
However, remember that the view is not a real table and does not store data itself.
00:09:34.680
Views can often be read-only and cannot be directly inserted or updated.
00:09:41.560
Another type of view is the materialized view, which should be used when the result of a view does not change frequently.
00:09:54.560
Materialized views will store the result in the database, allowing for faster query results at the cost of additional maintenance.
00:10:06.360
Now, let's talk about indexing. Recall that nearly 10% of database performance problems arise from missing, ineffective, or inefficient indexes.
00:10:20.000
Indexes are the primary mechanism to improve performance in a database.
00:10:32.480
When working with a specific table, it’s essential to know which indexes to create and the trade-offs involved.
00:10:49.600
Index types vary based on their algorithms, with B+ trees being the most widely used in nearly all databases.
00:11:11.440
B+ tree structures allow efficient searching, improving retrieval times over linear scanning.
00:11:27.720
When searching for a record, the index helps to efficiently locate the desired data without scanning every record.
00:11:42.240
In conclusion, while indexes can significantly boost performance, they come with trade-offs regarding storage space and maintenance overhead.
00:12:03.360
Understanding these factors will help in determining whether to create an index.
00:12:15.520
Finally, let's discuss transactions, which are crucial for maintaining data integrity within a database.
00:12:30.280
A transaction is a sequence of operations performed as a single logical unit of work.
00:12:46.840
For example, in Rails, when updating a user’s name, if a validation fails, the transaction can be rolled back.
00:13:05.440
These transactions help ensure that the database remains consistent.
00:13:19.080
Understanding how transactions are wrapped around operations will help avoid confusion surrounding callbacks.
00:13:33.440
The ACID properties (Atomicity, Consistency, Isolation, Durability) of transactions ensure that data integrity is maintained.
00:13:51.080
For example, during concurrent transactions, isolation ensures that transactions do not interfere with each other.
00:14:08.160
In practice, there are different isolation levels which determine how transactions interact.
00:14:23.600
Understanding these isolation levels is important to prevent unexpected results.
00:14:35.760
Transactions are a fundamental part of working with databases and knowing how they work will make you a better developer.