Maintaining a big open source project: lessons learned

by Leonardo Tegon

In the video titled "Maintaining a Big Open Source Project: Lessons Learned" presented by Leonardo Tegon at RailsConf 2019, the speaker shares his experiences and insights from maintaining the Devise Ruby gem. He emphasizes that anyone can contribute to open source projects, regardless of their prior experience, and offers practical steps on how to get started.

Key Points Discussed:
- Beginnings of Open Source Involvement: Leonardo began maintaining Devise after joining Plataformatec, highlighting the challenges faced with inactive open source projects and the heavy lifting required to reactivate them.
- Importance of Issue Triage: He emphasizes the value of triaging issues to prioritize which problems are most critical, suggesting that new contributors can start by requesting more information from users who report issues.
- Common Contributions: Leonardo outlines various ways to contribute without writing code, such as reviewing pull requests, sharing solutions, writing documentation, and engaging in issue triage.
- Learning Through Mistakes: He discusses significant lessons learned through mistakes, particularly a case where a seemingly simple code change led to widespread issues. The incident reinforced the need for careful consideration of backward compatibility and proper documentation of decisions made.
- Communication and Community: The importance of effective communication within the open source community is stressed. He advises maintaining a respectful atmosphere and following a code of conduct to foster positive interactions.
- Self-Care and Collaboration: Leonardo encourages maintainers to take care of themselves and seek help when needed, stressing that one cannot do everything alone in open source work.
- Encouraging Company Support: The talk also suggests ways to advocate for institutional support for open source contributions, such as highlighting the professional development benefits and enhanced company branding associated with open source work.

Conclusions and Takeaways:
- Everyone is capable of contributing to open source projects, and starting with small, manageable tasks can build confidence.
- Documenting decisions and coding practices is essential for maintaining the project in the long term.
- Positive and respectful communication fosters a welcoming environment for all contributors.
- It’s crucial to ask for help because the community is there to support each other, enabling personal and collaborative growth in coding practices and project management.

00:00:20.840 Hi! So, two years ago, I joined Plataformatec, and this is the most recent picture of our team.
00:00:25.890 If you zoom in a little bit, you're gonna see me right there. Don't ask me why they look so satisfied in this picture, because I have no idea.
00:00:31.320 Anyway, this is how you can find me on the Internet. By looking at the previous slide, you can guess that I post a lot of silly stuff on Twitter.
00:00:36.900 If you want to follow me there, do it at your own risk. But also, if you want to give feedback on this talk, that would be really appreciated.
00:00:42.210 You might have heard of us because of our open-source work: the Plataformatec team is known for Ruby gems like Devise and Simple Form. The opportunity to work on those gems was one of the reasons that made me want to join the company in the first place.
00:00:54.840 However, when I got there, most people maintaining those projects had either left or moved on to a new language. The truth is, we get paid by clients to deliver projects, so most contributors did open-source work in their spare time. This meant that once those contributors left, the projects became inactive.
00:01:12.600 Issues and pull requests would sit there for weeks without any response, and I wanted to help, but I didn't know where to start.
00:01:24.479 That changed at RubyConf Brazil in 2017, where one of the speakers gave a keynote about the future of the Ruby community and talked about how Ruby is not the cool new thing anymore. Everyone now seems to want to talk about Go, Elixir, or Rust.
00:01:51.119 So, the question was, what's the future? He stated that the future of the Ruby community is brilliant, but it depends on us. Basically, he was saying that everyone can help! We can do open-source work, write blog posts about cool things we've learned, organize and attend local meetups, or simply talk to our colleagues and share something cool at work.
00:02:35.790 These are all ways to keep the language interesting for developers and to give back to the community. So, after that, I made the decision to maintain the Devise gem. Just to give you an idea, at that time, we had about 150 issues and 51 pull requests.
00:03:10.440 Now, this is how things look, which I believe is significantly better. Thank you. The aim here today is to share my journey with you, the challenges I faced, the mistakes I made, and the lessons I learned from them.
00:03:28.000 I want to tell you that if you've ever wanted to contribute to open source but felt you're not good enough, hopefully, my story can convince you otherwise. I genuinely believe that everyone can do open source.
00:03:51.410 I know it's hard to start, and part of that is because we think we need to code to help. We feel we have to comprehend the entire codebase before we can contribute, but I think the quickest way to start is through issue triage.
00:04:07.890 What does that even mean? Triage is the process of examining problems to decide which ones are most serious and need to be addressed first. For open-source, this means ensuring that issues follow the recommendations in the project's contributing guide.
00:04:30.179 Some projects have a file like this one, which contains guidelines on how to contribute and report issues. For example, in Rails, they ask you to send feature requests to a mailing list, but we accept feature requests in Devise, so these guidelines can differ among projects.
00:04:48.750 I recommend reading that file before you do anything else. If the project doesn’t have one of these documents, you could create an issue asking for it; that would be a great way to start contributing.
00:05:00.180 Here is what we ask of contributors in Devise when reporting issues: include a title and a clear description, provide as much relevant information as possible, and attach either a test case or a simple Rails app that replicates the issue. This is essential because it's hard to fix bugs without that information.
00:05:16.920 If someone tells us, 'Our site isn't working,' one of the first things we ask is 'Which browser are you using?' In Devise, we also request a failing test case because we want people to try reproducing the issue in isolation, without any other gems—just Devise and Rails.
00:05:38.460 We ask this because the most common cases are expected to work, and we have tests in place for them, or at least we hope so. Unless a lot of people are experiencing the same issue, chances are it's caused by a very specific combination of things in their application setup.
00:05:56.559 Now, this doesn’t mean it can't also be a bug in the library, but by doing this triage, we can find out what's wrong with the combination of things causing the issue. This provides us with a better direction for where to look.
00:06:21.050 So, the first thing I did was look for issues that were missing information and ask the author for it. I would then add a 'needs more information' label to those issues.
00:06:37.260 I want to give a side note here: you probably see I've replaced this person's handle with a GitHub ghost user profile. If you want, you can probably find out who they are, but I ask you not to, as I don't want to expose anyone in this talk; it's easier for me to discuss situations and not individuals.
00:06:58.560 Around two weeks after I asked for that information, I'd look for issues with the 'needs more information' label to see if they received any response. If they didn't get back to me, I'd close them.
00:07:23.100 I know two weeks is an arbitrary value, but it seems like a reasonable time for people to respond. If they haven't, it could mean the issue isn’t a priority right now or they couldn't reproduce it in isolation.
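The two-week follow-up described above can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions: the `Issue` struct, its field names, and the sample data are hypothetical and do not come from Devise or from any GitHub API. Only the label text and the 14-day window are taken from the talk.

```ruby
require "date"

# Hypothetical, simplified issue record; a real tracker exposes many more fields.
Issue = Struct.new(:number, :labels, :asked_on, :author_replied_on, keyword_init: true)

NEEDS_INFO_LABEL = "needs more information"
GRACE_PERIOD_DAYS = 14 # the admittedly arbitrary two-week window from the talk

# Returns the issues that carry the label, were asked for details at least
# two weeks ago, and have had no reply from the author since that request.
def stale_issues(issues, today: Date.today)
  issues.select do |issue|
    next false unless issue.labels.include?(NEEDS_INFO_LABEL)
    next false if issue.author_replied_on && issue.author_replied_on > issue.asked_on

    (today - issue.asked_on) >= GRACE_PERIOD_DAYS
  end
end

today = Date.new(2019, 5, 1)
issues = [
  # No reply for three weeks: a candidate for closing.
  Issue.new(number: 101, labels: [NEEDS_INFO_LABEL], asked_on: Date.new(2019, 4, 10)),
  # The author answered, so the issue stays open.
  Issue.new(number: 102, labels: [NEEDS_INFO_LABEL], asked_on: Date.new(2019, 4, 10),
            author_replied_on: Date.new(2019, 4, 12)),
  # Asked less than two weeks ago: too early to close.
  Issue.new(number: 103, labels: [NEEDS_INFO_LABEL], asked_on: Date.new(2019, 4, 25)),
  # A different label: not part of this follow-up.
  Issue.new(number: 104, labels: ["bug"], asked_on: Date.new(2019, 3, 1))
]

puts stale_issues(issues, today: today).map(&:number).inspect
```

With this sample data only issue 101 is selected. The point is that the rule is mechanical enough for anyone to apply, with or without push access.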
00:07:42.120 Another thing the Devise contributing guide suggests is to avoid opening issues to ask questions. Please go through the project's documentation and source code first, or try to ask your question on Stack Overflow.
00:08:02.300 I want to emphasize that it is incredibly important to ask questions, but the issue tracker may not be the best place for them. This is because the only people receiving notifications there are the maintainers and the watchers—typically, a small group of people.
00:08:25.290 If you open an issue asking a question, it might take a while for me to answer, so it’s usually better to ask on public forums like Stack Overflow. Again, this will depend on the project, as some projects may accept questions in their issue trackers.
00:08:43.320 That's one of the things I focused on during issue triage—constantly looking for problems that were being pushed through without sufficient explanations and trying to provide additional guidance.
00:08:57.290 I did a lot of this in the beginning, but I've been trying to do something different lately. When I realize that a person might be a beginner, I try to give them a more comprehensive answer.
00:09:18.180 It's easier, after a few years of experience, to forget what it was like when we started. I remember being afraid to ask questions at work—I worried I would sound stupid.
00:09:39.310 I think it takes a lot of courage to ask a question, especially in a public forum like GitHub. I don’t want someone's first experience to feel daunting or discouraging, so when I can, I provide them with thoughtful responses.
00:09:56.360 For example, once someone opened an issue saying they were not seeing the flash messages in Devise. I explained to them that Rails doesn't show those flash messages by default; you have to render them yourself in your layout.
00:10:15.220 I wish I could do this more often, but sometimes I simply can’t because time constraints arise, and some help requests can be more complex than this one.
00:10:32.070 What I'm saying here is that doing issue triage is a great way to start contributing. It made me feel productive very quickly, and it doesn't require having intimate knowledge of the codebase.
00:10:48.470 After doing it for a while, I gained a lot of context about the project and the current problems it was facing. Through this process, you'll get to know how the maintainers work and the common queries they encounter.
00:11:07.560 I highly recommend reading Steve Klabnik's blog post, in which he explains how he started doing issue triage in Rails. An interesting fact from his article is that he dedicated just 15 minutes per day to it, either before work or after lunch.
00:11:26.030 I often do issue triage while waiting for continuous integration to finish running. Now you know how to start doing this without having to spend a lot of time.
00:11:44.570 But you might be thinking, 'But I'm not a maintainer!' I need to tell you that the only thing you can't do is click the merge button. Seriously, even if you don't have push access to the repository, you can still contribute by doing triage in the same manner that I do.
00:12:05.390 In fact, some people are already doing this in Devise, like one person who requested a sample application and another who asked questions to clarify the issue.
00:12:22.700 These are just examples of how you can begin contributing. There are many other ways to pitch in, like reviewing pull requests.
00:12:38.410 You can look for possible issues in pull requests that lack tests. Honestly, if I see a pull request that doesn’t have tests, that’s the first thing I look to address.
00:12:57.600 Another important contribution is to verify any reported issue. If you experience the same issue, and a pull request is available, just test it. Run your application with that code to see if it works.
00:13:17.900 Provide feedback on the pull request by sharing what you found during testing. This greatly aids maintainers by confirming that other users experience the issue.
00:13:38.350 You can also assist by discussing what you have found on the issue page or by sharing your solution. This could involve writing blog posts, answering questions on Stack Overflow, or simply pitching in on the GitHub issue tracker.
00:14:04.790 For instance, if you know the answer like this person did, don’t hesitate to share your knowledge with the community.
00:14:25.609 Lastly, you can contribute by writing documentation. When I began, I struggled a lot with the Devise test suite. I spent a good amount of time figuring out how to run tests on specific Rails and Ruby versions.
00:14:46.670 After I managed to get it working, I wrote a piece of documentation about it, which I hope helps future contributors get through the same setup more easily.
00:15:04.060 Those are just a few examples of how you can contribute without having to write code. For me, open source is about helping people, and there are many ways to do this without facing the barriers of writing code.
00:15:23.750 But you may wonder, 'What happens when they send us the required information?' This is when it's time to reproduce the issue.
00:15:40.880 You can run a failing test case or follow the reproduction steps to confirm whether the issue is occurring. If it is, we start digging in to understand the root cause.
00:16:01.680 The next step depends on the nature of the issue. If it's a regression, meaning it was working in a previous version, we can use the 'git bisect' command to identify which commit introduced the regression.
00:16:23.070 I won't dive deep into how to use 'git bisect' here, but I recommend reading Jason Draper's article, as he explains it very well.
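The idea behind 'git bisect' is a binary search over the commit history. The Ruby sketch below shows only that idea, not git's implementation: the commit names are made up, and the block passed to the hypothetical `first_bad_commit` helper stands in for "check out this commit and run the failing test". It assumes the last commit is known to be bad and that every commit after the first bad one is also bad.

```ruby
# Commits ordered oldest to newest. The names are made up for illustration.
COMMITS = %w[a1 b2 c3 d4 e5 f6 g7 h8].freeze

# Finds the first commit for which the block returns true ("bad").
# Returns the commit together with the number of commits that had to be checked.
def first_bad_commit(commits)
  low = 0                    # earliest commit that could be the first bad one
  high = commits.length - 1  # known to be bad
  steps = 0

  while low < high
    mid = (low + high) / 2
    steps += 1
    if yield(commits[mid])
      high = mid     # the bug is already present: look at earlier commits
    else
      low = mid + 1  # still good: the bug was introduced later
    end
  end

  [commits[low], steps]
end

# Stand-in for running the failing test: everything from "f6" onward is bad.
bad_from = COMMITS.index("f6")
culprit, steps = first_bad_commit(COMMITS) { |sha| COMMITS.index(sha) >= bad_from }
puts "first bad commit: #{culprit} (found in #{steps} checks)"
```

Here eight commits are narrowed down to `f6` in three checks. That is why bisecting stays practical even when the regression range covers hundreds of commits.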
00:16:46.140 Sometimes, we deal with issues that never worked before, and finding the root issue can be challenging. It’s like being given a ticket at work without any context, forcing us to dive into the code to see what's happening.
00:17:05.490 For this reason, a resource that's been very helpful for me is Pry. Many of you may already be familiar with it, but I want to share a couple of articles to help others get started as well.
00:17:25.750 One is 'Debugging Rails with Pry', and the other is GitLab's 'Pry Debugging Guide.' Both provide insight into effectively using Pry for debugging purposes.
00:17:43.640 Another recommendation is a piece titled 'Reading Your Framework's Source Code' by Alicia Keys. It explains techniques that help us better understand and get to know the codebase, like looking at unit tests, which can serve as documentation.
00:18:05.460 Now, let's discuss what happens when we locate the code that's causing the issue. I believe it's always challenging to make sense of code we are not familiar with.
00:18:29.310 As my colleague Hanji noted, there's a steep learning curve when we read unfamiliar code, leading us to wonder why specific decisions were made rather than considering alternatives.
00:18:43.570 The fact is we lack context about the code; the only information we have is the commit date and the author—it doesn’t inform us about the intentions behind those decisions.
00:19:02.890 Let me share a situation to illustrate this. Someone reported an issue showing that it was possible to create records in the database during signup even when validations failed. The cause was a Devise feature that tracks certain statistics about users.
00:19:27.230 It turns out that when Devise saves those statistics, it skips validations. Someone submitted a pull request with a one-line change that seemed like an obvious, minor fix to us.
00:19:49.710 After merging it, we didn’t anticipate that many applications depended on this behavior, and once we made that change, things started to break. We encountered multiple issues afterward.
00:20:10.590 This made me realize our limited vision: we often make assumptions based on past experiences. In all the Rails applications I had worked on in the past, having valid models was routine.
00:20:26.290 So, I thought it was uncommon to have invalid users in the database; however, based on the number of reported issues, it was evident that it wasn’t so rare after all.
00:20:42.910 This led me to reconsider our changes when we encountered another issue where a person called an external service within a validation callback.
00:20:59.350 Since the server charged them per request, when we started validating the model in places we hadn’t before, it resulted in substantial unexpected costs for them.
00:21:16.240 I felt terrible about this situation and began to question my decisions because clearly, we lacked the capacity to handle this properly and, as a result, someone ended up spending unnecessarily.
00:21:31.560 In these scenarios, it's easy to feel overwhelmed, so it’s crucial to reach out to others for advice. I consulted with some colleagues and other open-source maintainers.
00:21:49.660 They all agreed that it would be best to roll back that change, since we could not have predicted how many applications relied on this behavior, and a change like that belongs in a major version.
00:22:06.290 So I fixed the original issue in a different way that did not run the validations. I also encouraged that person to move those external requests out of validation callbacks.
00:22:24.960 This is because they could be triggered in contexts we might not expect, like debugging in production or while performing batch data operations.
00:22:41.890 Moreover, new team members might introduce code that triggers those validations without being aware of potential side effects.
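To make the cost problem concrete, here is a minimal Ruby sketch of a validation that calls a billed external service. Everything in it is hypothetical: `PaidAddressService`, the simplified `User` class, and `track_sign_in!` are stand-ins written for this illustration, not Devise or ActiveRecord code. The `save(validate:)` signature only mimics the shape of the Rails call.

```ruby
# Hypothetical client for a paid API; it counts how many billable calls were made.
class PaidAddressService
  attr_reader :requests

  def initialize
    @requests = 0
  end

  def valid_address?(address)
    @requests += 1 # every call would show up on the invoice
    !address.to_s.strip.empty?
  end
end

# Simplified stand-in for a model. It is not ActiveRecord.
class User
  attr_accessor :address, :sign_in_count

  def initialize(address:, service:)
    @address = address
    @service = service
    @sign_in_count = 0
  end

  # The validation hits the external service every time it runs.
  def valid?
    @service.valid_address?(address)
  end

  # Mimics the difference between saving with and without validations.
  def save(validate: true)
    return false if validate && !valid?

    true
  end

  # Stand-in for updating tracked statistics on every sign-in.
  def track_sign_in!(validate:)
    self.sign_in_count += 1
    save(validate: validate)
  end
end

service = PaidAddressService.new
user = User.new(address: "Rua Exemplo, 1", service: service)

3.times { user.track_sign_in!(validate: false) }
puts "requests while validations are skipped: #{service.requests}"

3.times { user.track_sign_in!(validate: true) }
puts "requests once validations run on every sign-in: #{service.requests}"
```

The first line reports 0 requests and the second reports 3. A library change that starts validating in a new code path multiplies that number by every sign-in in production, which is why keeping such requests in an explicit step, outside validations, is the safer design.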
00:23:00.790 Testing and code reviews don’t always prevent such issues. This brings us to my next lesson: it's okay to make mistakes.
00:23:16.250 It's okay because we can learn from them. When we make mistakes, we should focus on the future instead of dwelling on the past. This means asking ourselves, 'What can I do from now on?' instead of 'What should I have done?'
00:23:31.430 After this incident, I became much more cautious about changes that could disrupt backward compatibility. Now I think about ways in which the code could disrupt existing applications more than I had before.
00:23:49.060 This mistake fostered growth for me, both as a developer and as an open-source maintainer.
00:24:03.400 Another lesson I learned after that incident was to investigate the commit message that introduced the faulty code—to understand why it was included in the first place.
00:24:16.400 While it wasn’t very helpful, it made me realize just how old the code was, being nearly a decade old when we accepted the pull request.
00:24:32.670 This brings us to the next lesson: document your decisions. We cannot solely rely on our memory.
00:24:48.600 It would be wonderful to ask others whenever we have questions about their code, but even when we can ask them, they may not remember either. Can you recall the reasoning behind any code you wrote a decade ago? I certainly can't.
00:25:04.950 This is why it’s crucial to document decisions: explaining why we did something and listing the alternatives we considered. The code itself doesn't tell the complete story.
00:25:21.200 One of the best ways to document your decisions is in commit messages. If you haven't yet, I strongly recommend reading Caleb Thompson's article, which offers valuable tips on writing good commit messages.
00:25:38.820 Messages should include answers to fundamental questions—for example, 'Why is this change necessary?' and 'How does it resolve the issue?'
00:25:54.780 Another article I greatly like is 'How to Write a Good Commit Message' by Chris Beams. This piece focuses more on the structure of messages and proposes guidelines for improving commit messages.
00:26:08.640 To summarize, a good commit message gives a brief overview of what changed, followed by details that explain why the change was necessary, so that a future reader can tell whether the code can later be changed or removed without breaking anything.
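The structure described here can be checked mechanically. The Ruby sketch below is a hypothetical helper, `commit_message_problems`, that tests a message against a few widely used conventions: a subject of at most 50 characters with no trailing period, a blank line after the subject, and a body wrapped at 72 characters. It is an illustration, not a tool used by Devise, and the sample messages are invented.

```ruby
# Returns a list of problems found in a commit message. An empty list means
# the message follows the handful of conventions checked here.
def commit_message_problems(message)
  lines = message.lines.map(&:chomp)
  subject = lines.first.to_s
  problems = []

  problems << "subject is empty" if subject.strip.empty?
  problems << "subject is longer than 50 characters" if subject.length > 50
  problems << "subject ends with a period" if subject.end_with?(".")
  problems << "no blank line between subject and body" if lines.length > 1 && !lines[1].empty?

  # Everything after the subject that is not blank counts as the body.
  body = lines.drop(1).reject(&:empty?)
  problems << "body is missing, so the reason for the change is not recorded" if body.empty?
  problems << "body has a line longer than 72 characters" if body.any? { |line| line.length > 72 }

  problems
end

good = <<~MSG
  Cache the parsed configuration file

  Parsing the file on every request added noticeable latency.
  Caching the result keeps behavior the same and removes the
  repeated work. An alternative was to parse at boot, but that
  breaks reloading in development.
MSG

puts commit_message_problems(good).inspect
puts commit_message_problems("fixed stuff.").inspect
```

The first message passes every check. The second is flagged for its trailing period and for having no body. A script cannot judge whether the body really explains why the change was made, so this complements review and does not replace it.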
00:26:23.420 From the validation issue and others we encountered, I learned that code is expensive. A single line in a pull request might seem harmless, but it can cause a lot of trouble.
00:26:41.470 For open-source libraries, once we have a public API, we can only remove it in a major version, and we want to avoid creating a nightmare for users during upgrades.
00:26:58.300 Therefore, I now only merge feature requests under specific circumstances. If you have a great idea, that’s fine, but I’ve begun asking more questions about its usability.
00:27:09.390 Is it reusable for other applications? Is it an edge case? If it is, maybe it’s not worth adding to the library. Is it flexible, or is it too complex?
00:27:28.320 All of these questions help me understand whether the cost associated with the code is justified or not.
00:27:44.660 The next point I want to make is to be nice when doing open source. Many people have had bad experiences; some individuals can be rude or disrespectful.
00:28:03.500 It’s important to have a code of conduct in the repository to ensure everyone has a harassment-free experience. As a maintainer, you should act when this code of conduct is violated.
00:28:20.650 Additionally, I recommend reading the book 'Nonviolent Communication' by Marshall Rosenberg. His strategies for communication and conflict resolution are simply remarkable.
00:28:38.790 In practice, I strive to respect the opinions of others and not mock them just because I disagree. I ask questions and seek out diverse perspectives before jumping to conclusions.
00:28:59.800 Furthermore, I try to have empathy, as the challenges someone faces can be contributing factors to their behavior. For example, tight deadlines and production outages can put people under a lot of stress.
00:29:20.120 There was a time when someone commented on an issue, saying: 'It’s a bug, but it seems like unless it affects many people, no one cares.' All I could see was the 'no one cares' part, and it truly upset me.
00:29:39.050 To handle situations like that, I follow the advice of Jason Fried, who suggests taking a five-minute pause before responding. This works exceptionally well for me when I feel angry about something.
00:29:55.590 Waiting gives me time to let my emotions settle, so I can respond more thoughtfully. After addressing the earlier comment, I explained that the issue was closed, and I may not keep track of every notification.
00:30:11.780 I suggested opening a new issue so it would definitely catch my attention.
00:30:26.000 So, my final point is, if you're going to do this, please take care of yourself. There will always be another issue to work on or another pull request to review.
00:30:41.560 Don’t feel guilty if you can’t address everything. Also, utilize automated tools like issue templates to provide guidance to contributors on the information they need to supply.
00:30:58.020 These tools can alleviate your need for extensive issue triage. In addition, I rely heavily on GitHub's saved replies to improve efficiency as handling numerous issues can be tedious.
00:31:13.450 I also use GitHub's feature that automatically closes issues after being inactive for a while. If someone comments on a closed issue, it reopens, which is quite handy.
00:31:31.320 Lastly, don't hesitate to ask for help. It’s easy to assume we should know everything as maintainers, but that’s not the case; in fact, it is not something we can realistically do.
00:31:49.500 Seek advice from fellow open-source maintainers, ask your colleagues, and engage the community for assistance. It took me a while to understand this.
00:32:07.490 For example, when a user discovered a security issue in Devise, I created a patch, and we merged it. However, someone asked if a CVE was created for this issue.
00:32:30.470 Although I had seen CVEs before, I didn’t know what they were or how to create one. I was too insecure to admit this in the issue, so I did a lot of research and responded a month later.
00:32:50.740 But it turns out the person asking was Reed Loden, who works at HackerOne and was available to help us get the CVE.
00:33:09.090 So, if I had openly stated my lack of knowledge early on, we could have resolved that much sooner.
00:33:26.090 Thank you, Reed, for your assistance with this! Another important lesson to remember is to show your company why open source is significant.
00:33:41.930 It can be quite demanding to work on open source in our spare time, and we don’t want to keep doing it that way indefinitely.
00:33:57.740 You can explain to them that participating in open source helps us learn so many skills. This gives us opportunities to write blog posts and present talks, all contributing to the company’s branding.
00:34:12.920 These efforts can attract more clients, and allowing developers to engage in open source part-time also enhances motivation, as repetitive tasks become dull.
00:34:30.440 Furthermore, open source helps us learn about Rails internals, documentation writing, and much more, all critical for our professional development.
00:34:45.800 Because of these benefits, some companies have initiated 'OSS Fridays' to allow employees to work on open source without doing so in their free time.
00:35:02.330 Other companies let developers work on open source during working hours, which improves morale because they get paid for contributing.
00:35:18.600 In conclusion, this has been my story. Whatever you take from it, I want you to know that you don't need to write code to help out; there are many ways to contribute.
00:35:35.390 However, when you do contribute, document your decisions, as other contributors and future you will be grateful for that.
00:35:52.150 Be kind, be respectful, and remember to ask for help. Don’t attempt to resolve everything alone—the community is there to assist you.
00:36:08.140 Lastly, if you're looking to start contributing and need help, feel free to approach me as I was fortunate to receive assistance in my journey.
00:36:23.010 Rosie helped me begin by showing me how to do issue triage, and Raquel reviewed many pull requests while answering challenging questions.
00:36:38.510 Also, my former colleague Felipe has been an invaluable partner in maintaining the Simple Form gem; he truly encouraged me to start with Devise and give this talk.
00:36:57.050 Without him, I probably wouldn’t have accomplished any of this.
00:37:05.780 So, I want to express my gratitude to all these individuals, to everyone who contributes to Devise, and to you for attending this talk. Thank you!