Documentation Tradeoffs and Why Good Commits Matter

by Greggory Rothmeier

In the talk titled "Documentation Tradeoffs and Why Good Commits Matter" at RubyConf 2018, Greggory Rothmeier discusses the importance of git commit messages as a form of documentation in software development. He illustrates how effective commit messages can bridge the gap in the documentation process, particularly in complex codebases.

Key points discussed in the video include:

- The Role of Documentation: Rothmeier acknowledges the common challenges faced in understanding legacy code and the significance of comprehensive documentation. He emphasizes that a solid understanding of prior decisions enhances future development work.

- Metrics for Evaluation: He introduces two critical metrics for evaluating documentation:

  - Accessibility: Ensures information is easy to find. Rothmeier notes that while wikis can be good for onboarding, they often become outdated and difficult to navigate.

  - Accuracy: Refers to whether the documentation is current and reflects the actual state of the codebase. He points out that comments and issue trackers, while accessible, can also suffer from accuracy issues.

- The Importance of Git Commit Messages: Rothmeier argues that git is a powerful tool for documentation, as commit messages directly link changes to the modified code. He provides an example of using the git blame feature to uncover historical context behind code changes.

- Creating Useful Commit Messages: He discusses best practices for writing commit messages, highlighting the necessity of a descriptive title and contextual information that could help future developers understand the intentions behind the changes.

- Managing Commit History: Rothmeier explains how techniques like rebasing can keep project histories clean by letting developers refine commits, squashing multiple updates into a single commit while preserving contextual messages.

- Promoting Good Practices: Rothmeier encourages teams to prioritize clear and informative commit messages, which can significantly enhance project clarity and facilitate better collaboration.

In conclusion, effective commit messages serve as critical documentation tools that improve understanding and collaboration in development environments. By mastering the art of commit messaging alongside utilizing git features, developers can create a clearer historical narrative within their projects, greatly benefiting future work on the codebase.

00:00:15.410 Thanks everyone for joining me today.
00:00:18.270 In this discussion, I want to talk about git commit messages and some unexpected value that they can provide as a form of documentation.
00:00:21.420 Starting out, I want to address complexity.
00:00:24.930 I started at Stitch Fix a little over a year ago, and we have pretty good onboarding documentation. I read the wiki, went through the new hire processes, set up my computer, and felt like I had a good sense of the teams and the apps.
00:00:35.610 However, when I got my first story, which involved working on an application that's been around for about six years, I realized that it was a complex system.
00:00:41.940 When I began addressing a bug fix, I felt overwhelmed by the various interrelated components of the system. It wasn't clear why or how they were connected, which led me to ponder the role of documentation and where to find the answers I needed.
00:01:00.059 Two metrics stood out to me in evaluating documentation: accessibility and accuracy. Accessibility means that I don't need to work too hard to find the information. If it takes me a minute to find an answer and I’m doing that multiple times, that's inefficient.
00:01:05.570 On the other hand, accuracy relates to whether the information is current or if the code has evolved separately from the documentation. One place to look for answers is a wiki.
00:01:19.920 Wikis are great for onboarding and high-level guidance, but they can quickly become outdated and are not very accessible. Searching through a wiki rarely seems effective in my experience, and often, the information can lack accuracy.
00:01:46.690 Next, we have issue trackers. For instance, at Stitch Fix, we use JIRA. While I can glance at recent issues to understand changes, these are even less accessible than wikis. Searching through issue trackers can be challenging, but they can be more timely and accurate.
00:02:06.410 GitHub is another option because it generally offers better search capabilities and gives us timely information on recent issues and pull requests.
00:02:20.910 Comments often get a bad rap, understandably so. They are highly accessible, which is why we tend to rely on them for documentation. However, their accuracy may suffer as the code evolves but the comments do not.
00:02:41.420 This brings us back to git and the value of commit messages. Git offers the best of both worlds. I'll show you some ways it's extremely accessible through text editor integrations, and it's very accurate—as you modify the code, you write a commit message that describes the change.
00:03:04.040 This message is tied to the changed code. For instance, if you're rewriting a method, the new commit message is automatically associated with the new implementation.
00:03:29.740 Here’s an example I encountered. While working on generating purchase order numbers, I noticed the comment indicating that the application should wrap back to one after reaching 99999.
00:03:52.210 My first instinct was to question whether reusing P.O. numbers was the right approach. As I reviewed the code, I could follow how it functioned, but the original reasoning was elusive. By checking git blame, I found an old commit message, written four years ago, that only marked the change as a work in progress.
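The wraparound rule described in that comment can be sketched as follows. This is a hypothetical illustration, not the code from the talk: the function name `next_po_number` and the constant `MAX_PO_NUMBER` are assumptions, and only the 99999 limit and the wrap back to one come from the transcript.

```python
# Hypothetical sketch of the purchase order wraparound described above.
# Only the 99999 limit and the wrap back to 1 come from the talk.
MAX_PO_NUMBER = 99999


def next_po_number(current: int) -> int:
    """Return the number that follows `current`, wrapping to 1 after the maximum."""
    if not 1 <= current <= MAX_PO_NUMBER:
        raise ValueError(f"P.O. number out of range: {current}")
    # After the maximum, numbering starts over, so old numbers get reused.
    return 1 if current == MAX_PO_NUMBER else current + 1


print(next_po_number(41))
print(next_po_number(MAX_PO_NUMBER))
```

The code shows what happens but not why reuse is acceptable, which is exactly the kind of question a commit message can answer.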
00:04:37.410 In that scenario, the commit history matters. The work I’m doing now will become historical context for someone else. For example, recently, I was changing CSS for a page by adapting it to a new framework. It's not prioritized in our sprints, so I tackle small parts when I have time.
00:05:04.700 That practice led to numerous work-in-progress commits. These half-finished efforts don't provide adequate context to my co-workers who might review the commit history.
00:05:43.030 One way to explore our commits is through git log. Run on its own, it produces verbose output that isn't always useful, but combined with command-line options it becomes an effective way to navigate the commit history.
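The transcript does not say which options were shown, so the following is only a sketch of how a few common `git log` flags can be combined. The helper name `git_log_command` and the file path are hypothetical. The flags themselves (`--oneline`, `--max-count`, `--grep`, `--follow`) are standard `git log` options.

```python
from typing import List, Optional


def git_log_command(
    path: Optional[str] = None,
    grep: Optional[str] = None,
    max_count: Optional[int] = None,
) -> List[str]:
    """Build a `git log` argument list that narrows the history."""
    args = ["git", "log", "--oneline"]  # one line per commit: short SHA and title
    if max_count is not None:
        args.append(f"--max-count={max_count}")  # limit how many commits are shown
    if grep is not None:
        args.append(f"--grep={grep}")  # only commits whose message matches
    if path is not None:
        # --follow keeps tracking a single file across renames
        args += ["--follow", "--", path]
    return args


# Hypothetical file path, used only for illustration.
print(" ".join(git_log_command(path="app/models/purchase_order.rb", max_count=5)))
```

Inside a repository, the resulting list could be passed to `subprocess.run(args, capture_output=True, text=True)` to capture the output.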
00:06:05.730 Many people prefer using GUIs to explore git history, which can also be a practical way to interface with commit data.
00:06:30.830 Integrating git blame into your development environment can provide useful annotations alongside each line of code. This shows who authored those lines and when, further streamlining development.
00:07:13.610 Yet writing good commit messages is crucial for effective documentation. We need more than just the date of a commit; we need context around what the changes entail.
00:07:37.150 An essential part of a good commit message begins with a descriptive title, often starting with a capitalized verb. For example, you might write 'Add blog post' or 'Update CSS.' This is a practice I learned from Tim Pope's blog post.
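As a rough sketch, the title conventions can be checked mechanically. The 50-character title and 72-character body limits below come from Tim Pope's post rather than from this transcript, and the function name `check_commit_message` is hypothetical.

```python
from typing import List


def check_commit_message(message: str) -> List[str]:
    """Return a list of style problems found in a commit message."""
    lines = message.splitlines()
    if not lines or not lines[0].strip():
        return ["missing title"]

    problems = []
    title = lines[0]
    if not title[0].isupper():
        problems.append("title should start with a capital letter")
    if len(title) > 50:
        problems.append("title is longer than 50 characters")
    # A blank line separates the title from the body.
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line should be blank")
    for number, line in enumerate(lines[2:], start=3):
        if len(line) > 72:
            problems.append(f"line {number} is longer than 72 characters")
    return problems


print(check_commit_message("Add blog post"))
print(check_commit_message("wip"))
```

A check like this only covers form. Whether the title is actually descriptive still needs a human reader.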
00:08:02.170 Additionally, developers should take a moment to consider what questions they or their peers might have in the future when reviewing the code. Including potential inquiries in the commit message can be highly beneficial.
00:08:26.230 The use of issue trackers like JIRA or GitHub Issues is also recommended. By referencing an issue tracker in the commit message, others can find the original issues discussed and understand the business context.
00:08:56.690 For instance, if I started a commit message with 'Add posts,' I should take the time to flesh out the details of that task. That includes mentioning whether associating posts with users is planned for a future story.
00:09:27.910 So even if a commit is simple, a good title and some context can help others maintain clarity when looking back at the history.
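To make this concrete, here is a sketch of what such a message might look like, with a small helper that pulls out the issue references so tooling can link back to the tracker. The message wording, the JIRA key `BLOG-123`, the GitHub issue number, and the helper name `find_issue_references` are all invented for illustration.

```python
import re
from typing import List

# Hypothetical commit message; the issue identifiers are made up.
EXAMPLE_MESSAGE = """Add posts

Adds a Post model and a basic index page.

Posts are not yet associated with users; that is planned
for a future story.

See BLOG-123 and #45 for the original discussion.
"""

# JIRA-style keys such as BLOG-123, or GitHub-style references such as #45.
ISSUE_PATTERN = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b|#\d+")


def find_issue_references(message: str) -> List[str]:
    """Return issue references in the order they appear in the message."""
    return ISSUE_PATTERN.findall(message)


print(find_issue_references(EXAMPLE_MESSAGE))
```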
00:10:06.920 Now, what about bad commits? If we look at our messy history, what can be done? Tools exist to manage git history effectively, like rebasing.
00:10:29.340 Rebasing confuses a lot of people. In the past, I heard dire warnings against rebasing for fear of losing history. But once I grasped how it worked, it improved my workflow immensely.
00:11:05.410 For instance, rebasing allows you to take a series of commits and reapply them on top of another branch, thus maintaining a clean history. In a straightforward case, we can illustrate this with a commit that creates a new feature.
00:11:49.060 If I were to work on adding a post model and make some initial commits, it's essential to ensure those commits convey coherent, cohesive messages. If I've merged changes from another feature branch, rebasing allows me to update the commits without losing context.
00:12:33.500 As an example, after a feature branch has diverged from master, I can switch back to my feature branch and run 'git rebase master' to take my work and reapply it on top of the latest changes in the master branch.
00:13:00.860 After rebasing, my commits sit on top of the most recent changes. Because they have been reapplied on a new base, those commits carry new SHAs, which is crucial to understand when collaborating with others.
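Why a rebase produces new SHAs can be shown with a toy model. A commit's identifier is a hash of its contents, and those contents include the parent's identifier, so moving a commit onto a different parent changes its hash. The sketch below is a simplification: real git also hashes the tree, author, committer, and timestamps, and the helper names are hypothetical.

```python
import hashlib
from typing import List, Optional


def commit_id(parent: Optional[str], message: str) -> str:
    """Toy commit ID: a SHA-1 over the parent ID and the message only."""
    payload = f"parent {parent}\n\n{message}".encode()
    return hashlib.sha1(payload).hexdigest()


def build_chain(base: Optional[str], messages: List[str]) -> List[str]:
    """Apply commits one after another on top of `base`, returning their IDs."""
    ids = []
    parent = base
    for message in messages:
        parent = commit_id(parent, message)
        ids.append(parent)
    return ids


root = commit_id(None, "Initial commit")
feature_messages = ["Add Post model", "Add posts index page"]

# The feature branch starts from the root commit.
feature = build_chain(root, feature_messages)

# Meanwhile master gains a commit, so the branches have diverged.
master_tip = commit_id(root, "Update README")

# "Rebasing" reapplies the same messages on top of the new master tip.
rebased = build_chain(master_tip, feature_messages)

for old, new in zip(feature, rebased):
    print(old[:7], "->", new[:7])
```

The messages are unchanged, yet every identifier differs. This is why rebasing commits that others have already pulled causes trouble.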
00:13:59.060 Perhaps I've made three commits to my feature branch, and I want to tidy up its commit history and avoid leaving in-progress work. In this case, I can run an interactive rebase where I pick, squash, or fixup commits based on my preference.
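The effect of those three actions on commit messages can be modeled in a few lines. This is a toy simulation, not how git implements interactive rebase: it tracks messages only, and the name `apply_todo` and the sample commits are hypothetical. In real git, `squash` opens an editor with the combined messages, while `fixup` discards the later commit's message.

```python
from typing import List, Tuple


def apply_todo(todo: List[Tuple[str, str]]) -> List[str]:
    """Apply (action, message) pairs, oldest first, and return the resulting messages."""
    result: List[str] = []
    for action, message in todo:
        if action == "pick":
            # Keep the commit as its own entry in the history.
            result.append(message)
        elif action in ("squash", "fixup"):
            if not result:
                raise ValueError(f"cannot {action} without a previous commit")
            if action == "squash":
                # Fold into the previous commit and keep both messages.
                result[-1] = result[-1] + "\n\n" + message
            # fixup: fold into the previous commit and drop this message.
        else:
            raise ValueError(f"unknown action: {action}")
    return result


todo = [
    ("pick", "Add Post model"),
    ("fixup", "WIP"),
    ("squash", "Validate presence of post title"),
]
print(apply_todo(todo))
```

Three commits become one, and the work-in-progress message disappears from the history.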
00:14:48.150 Once I've finalized modifications, I can merge changes back into the master branch, ensuring only relevant, contextual commit messages are recorded. This enhances the clarity of our project's history.
00:15:24.410 Another option for cleaning up a messy commit history before merging into master is to use the squash option available during a merge.
00:15:41.890 This allows me to take a cluster of commits and combine them into a single commit while ensuring I still provide valuable context through a thorough commit message.
00:16:09.270 Returning to our earlier discussion on accuracy and accessibility, promoting good commit messages within your team can yield significant benefits.
00:16:35.270 Thank you for being here today. I work at Stitch Fix, and although I'm not a huge Twitter person, you can reach out to me on that platform or via email.
00:16:52.470 Additionally, I will publish my slides after this talk. Please check out some of the resources I’ve referenced today, including articles on commit messages and using git effectively.
00:17:14.670 If there are any questions, I’m happy to address them now.