Sinatra

Helping Redistrict California with Ruby

by Jeremy Evans

In this presentation titled "Helping Redistrict California with Ruby," Jeremy Evans discusses the redistricting process in California, focusing on the systems he developed using Ruby to manage the application process for the Citizens Redistricting Commission.

  • Introduction to Redistricting: Redistricting is the process of redrawing electoral district boundaries to reflect population changes, ensuring roughly equal representation. In California, it was historically conducted by elected officials until a 2008 proposition established the Citizens Redistricting Commission, with the California State Auditor responsible for soliciting applications and identifying the most qualified candidates.
  • Initial System Development: In 2009, with limited time and budget, Evans created an automated system to handle the application process for the Citizens Redistricting Commission. He explained the two-tier application process, consisting of a quick initial application and a more comprehensive supplemental application, which included extensive background information.
  • Challenges and Solutions: He outlined the challenges of developing the system on obsolete hardware and using outdated libraries. The system ultimately utilized Ruby with Sinatra as the web framework. Audit logging emerged as a critical feature, which improved accountability and transparency.
  • 2010 vs 2020 Systems: The 2010 system was considered a success, but its ad hoc development caused problems, so it was treated as a prototype for the 2020 system. The latter automated previously manual tasks, handled demographic data in more detail, and improved security, including three-factor authentication for the internal systems.
  • Lessons Learned: A key takeaway was the emphasis on thorough testing and documentation in the development process. The 2020 system demonstrated improvements in accessibility and security measures compared to its predecessor. Additionally, the aim for continuous maintenance and updates ensures that the system evolves with each new redistricting cycle.
  • Future Directions: Evans concluded with a brief overview of upcoming enhancements for the 2030 redistricting process, focusing on further increasing security measures and streamlining operations based on insights gained from the earlier systems.

Overall, Evans illustrated how Ruby has facilitated and improved the redistricting application processes in California, underscoring the importance of technological advancements in public sector transparency and efficiency.

00:00:00.000 Ready for takeoff.
00:00:17.180 All right, well hello everyone. In this presentation, I'm going to talk about redistricting in California and the systems I built using Ruby to assist California's redistricting process.
00:00:19.800 My name is Jeremy Evans. I'm a Ruby committer who focuses on fixing bugs in Ruby. I'm also the author of "Polished Ruby Programming," which was published last year. This book is aimed at intermediate Ruby programmers and focuses on teaching principles of Ruby programming as well as trade-offs to consider when making implementation decisions.
00:00:49.739 So first, what is redistricting? In order to answer that question, I'm going to give a brief civics lesson. In this context, a district is an area of land where people residing in that area vote for a person to represent them. The United States is a representative democracy with a bicameral legislature known as Congress. The lower chamber of Congress is the House of Representatives. Each state has a variable number of representatives in the House based on the state's population. Representatives in the House are supposed to represent the citizens in their particular district to ensure that each citizen has a local representative in Congress. These districts, called congressional districts, are an example of the type of district we are discussing. Congressional districts are supposed to be roughly equal in terms of population to ensure that each citizen in the United States has roughly equal representation in the House of Representatives.
00:02:10.080 Because population changes over time, ensuring roughly equal local representation requires modifying the district boundaries. The process of modifying the district boundaries to account for changes in population is referred to as redistricting. Historically, the process of modifying the district boundaries in California was often performed by elected officials. This allowed the people currently in power to modify the district lines in a way that kept themselves in power, which is an obvious conflict of interest.
00:02:28.080 In 2008, California citizens voted in favor of a proposition to change the redistricting process so that it was performed by an independent group of citizens. This group is named the Citizens Redistricting Commission. Using an independent group of citizens avoids the conflict of interest issues that previously existed. However, how could citizens of California be sure that the members of the Citizens Redistricting Commission would be the most qualified citizens to perform the redistricting work?
00:02:55.260 The responsibility for soliciting applications to become a member of the Citizens Redistricting Commission, as well as determining the most qualified candidates, was given to the California State Auditor. That's how I became involved in this process. At the time that the 2008 redistricting proposition was passed, I was the sole programmer and lead systems administrator at the California State Auditor's Office.
00:03:18.300 Now in July 2009, with less than five months until launch, the team handling the redistricting project requested that I develop an automated system to handle accepting and reviewing applications to be a member of the Citizens Redistricting Commission. So now that you have that background, the rest of the presentation is going to focus on the design and implementation of the systems that I built to handle the application process for redistricting commissioners. We'll start with the design and implementation of the system for the 2010 redistricting process. Naturally, the first part of any systems design is to gather requirements.
00:03:48.740 In terms of the initial requirements, beyond the basic authentication requirements that most systems have, the most important part of the system is its ability to accept applications to be a member of the commission. There are actually two applications: an initial application that takes about five minutes to fill out, and then a supplemental application with four essay questions and a requirement to list all close family members, previous addresses, largest financial contributions, and a full education, employment, and criminal history.
00:04:40.020 So the supplemental application took most applicants many hours to fill out. Only about ten percent of the citizens that submitted an initial application also submitted a supplemental application. All these applications have to be reviewed by our staff to make sure they don’t contain any information that is offensive or confidential. All qualified applications are posted publicly for all citizens to review to ensure transparency in the selection process. An audit log is kept of all changes made in the system with the ability for administrators to search and review the logs.
00:05:04.680 This audit logging feature ended up being critically important during the process, and one of the most important lessons I learned during the 2010 redistricting process was the importance of audit logging. So all production systems that I maintain today have at least a basic audit log showing changes made in the system. The initial design used three separate systems for handling the first three requirements: there was a public system that allowed citizens to log in and submit applications, an internal system for our staff to review applications, which required being physically present in our office and was not accessible from the internet, and a static site generator to display the applications.
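As a rough illustration of the searchable audit log described above, here is a minimal in-memory sketch. The `AuditLog` and `AuditEntry` names, the fields, and the sample values are assumptions made for this example; the talk does not show how the real log was stored or queried.

```ruby
# Hypothetical, in-memory audit log; a real system would persist entries in a database.
AuditEntry = Struct.new(:at, :user, :action, :record, keyword_init: true)

class AuditLog
  def initialize
    @entries = []
  end

  # Append an entry. Entries are frozen and never edited or removed afterwards.
  def add(user:, action:, record:, at: Time.now)
    entry = AuditEntry.new(at: at, user: user, action: action, record: record).freeze
    @entries << entry
    entry
  end

  # Return the entries that match every given field, oldest first.
  def search(**criteria)
    @entries.select do |entry|
      criteria.all? { |field, value| entry[field] == value }
    end
  end
end

log = AuditLog.new
log.add(user: "reviewer1", action: "redact", record: "application 12")
log.add(user: "reviewer2", action: "publish", record: "application 12")
puts log.search(record: "application 12").map(&:action).inspect
```

Keeping the log append-only means an administrator reviewing it can rely on the entries reflecting what happened at the time.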
00:06:30.540 Now given the limited time I had until launch and a budget of zero, I decided to run the system on our existing infrastructure. This infrastructure consisted of a single server that we had purchased in 2002. This server had dual 1.4 GHz Pentium 3 CPUs, a single gigabyte of RAM, and an 18 GB 10K hard drive. The server already ran our other internal applications, so not all of the RAM was available for the redistricting process to use.
00:07:06.840 The server ran OpenBSD and used PostgreSQL as a database and Ruby as the programming language for the existing applications. One of the first decisions I had to make when starting to develop this application was what libraries I would use to build it, starting with the library for database access. By mid-2009, I had already been maintaining Sequel for over a year and all internal development had already switched to Sequel, so Sequel was the natural choice. I added a nested attributes plugin to Sequel at the start of this process, which I used to implement the system's supplemental application.
00:07:40.560 Then I had to decide on which web framework to use. At the time, all the other production applications I maintained used Rails. However, I had already become disenchanted with Rails by 2009. By then, I had developed personal projects in Sinatra and observed how much faster it was to develop applications using it as well as how Sinatra was quicker at runtime.
00:08:13.740 Thus, I decided to use Sinatra as the web framework for all the systems. In terms of authentication, there wasn't a good authentication library at the time development started, so I designed a custom authentication system. Like most applications, we needed a job system to reduce the amount of time spent during web requests.
00:08:42.900 And unlike most applications, we used standard Unix Cron for this purpose. Most of the jobs we had were not very time sensitive, with the most frequent jobs running every five minutes. Given the limited time I had for systems development, I decided to perform only integration testing and skipped model testing completely.
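A cron-driven job of the kind described above can be a plain Ruby script. The sketch below is an assumption about how such a script might guard against overlapping runs when a job takes longer than its five-minute interval; the lock-file approach, the `run_exclusively` helper, and the script path in the crontab comment are all hypothetical and not taken from the talk.

```ruby
require "tmpdir"

# Hypothetical crontab entry that would run this script every five minutes:
#   */5 * * * * /usr/local/bin/ruby /path/to/process_pending.rb

# Run the block only if no other run currently holds the lock file.
# Returns true if the block ran, false if another run was still active.
def run_exclusively(lock_path)
  File.open(lock_path, File::RDWR | File::CREAT, 0o600) do |lock|
    # LOCK_NB makes flock return false immediately instead of waiting.
    return false unless lock.flock(File::LOCK_EX | File::LOCK_NB)
    yield
    true
  end
end

ran = run_exclusively(File.join(Dir.tmpdir, "process_pending.lock")) do
  # Placeholder for the real work, such as sending queued notification emails.
  puts "processing pending work"
end
puts "previous run still active; skipping" unless ran
```

The lock is released when the file is closed, so a crashed run does not leave the job permanently blocked.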
00:09:00.840 I also did not perform any coverage testing during the 2010 process. In order to make submission of the supplemental application easier, the application used JavaScript if it was available; however, no part of the application required JavaScript. The integration tests did not use JavaScript, and they still passed. We also did not do any automated testing of our JavaScript; all of our JavaScript was tested manually.
00:09:21.720 Before launching the system, we performed end-to-end load testing to ensure that it could handle the expected load. During testing, I found that we could process about one supplemental application per second per CPU, for a total of two applications per second. I informed the project manager that we could handle 120 applications per minute, which was a much more respectable number.
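The throughput figure is simple arithmetic on the numbers quoted above. The short sketch below just restates it; the method name is illustrative.

```ruby
# Convert a measured per-CPU rate into a per-minute figure for the whole server.
def applications_per_minute(per_cpu_per_second, cpu_count)
  per_cpu_per_second * cpu_count * 60
end

# One supplemental application per second per CPU, on a dual-CPU server.
puts applications_per_minute(1, 2)
```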
00:10:04.080 On launch day, the initial application looked very basic, and I was surprised they didn't have our visual design team enhance its appearance. The project manager expressed that it was obvious to stakeholders that we did not waste any of our budget on visual design.
00:10:51.120 When the 2010 systems launched, they had 44 routes, and by the end of the 2010 redistricting process, the system had 131 routes, about three times the number of routes at launch. There were 30 database migrations during this period, many of which implemented entirely new subsystems. We launched without a full understanding of how big the system needed to be and did not anticipate how many subsystems would require automation.
00:11:30.379 Using a just-in-time development approach, I quickly developed systems to manage any bottlenecks that emerged during the process, resulting in rapid deployment of solutions. Overall, the 2010 redistricting system was considered a great success for a government IT project, especially one with essentially zero budget.
00:12:16.920 While the 2010 system was a success, the ad hoc development approach led to challenges; therefore, for the 2020 redistricting process, we considered the 2010 system as a prototype.
00:12:50.760 In mid-2018, I started designing the 2020 system, which would handle everything that the 2010 system handled and automate previously manual tasks. One of the new features in the 2020 system was much more extensive audit logging, tracking previous values for all changed columns to give administrators a clear audit trail.
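Tracking previous values for changed columns amounts to comparing a row before and after an update. The following is a minimal sketch of that comparison using plain hashes; the `changed_columns` helper and the column names are assumptions for illustration, and the talk does not describe how the real system computed or stored these differences.

```ruby
# Compare two versions of a row (as hashes) and return only the columns
# whose values differ, with the previous and new value for each.
def changed_columns(before, after)
  (before.keys | after.keys).each_with_object({}) do |column, changes|
    old_value = before[column]
    new_value = after[column]
    changes[column] = { from: old_value, to: new_value } unless old_value == new_value
  end
end

before = { status: "submitted", city: "Sacramento" }
after  = { status: "reviewed",  city: "Sacramento" }
puts changed_columns(before, after).inspect
```

Storing the result alongside who made the change and when gives administrators enough information to reconstruct earlier states of a record.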
00:13:42.900 The 2020 system emphasized improved demographic data handling. In 2010, there were only two options for gender and seven options for ethnicity, while by 2020 the system supported three options for gender and 23 options for ethnicity.
00:14:04.140 In developing the 2020 system, I aimed for it to be continuously maintained and utilized for future redistricting processes. The operating system, database, and programming language for the 2020 system remained the same as in 2010, although newer versions were implemented.
00:14:25.260 After using Sinatra for the initial 2010 system, I continued to rely on it for subsequent systems development. However, I ran into duplicated code because Sinatra lacks good support for sharing code per routing branch. To address this, I created Roda in 2014.
00:14:51.040 Roda allows for more efficient code sharing among routing branches, producing more maintainable code. Thus, Roda was used for the 2020 redistricting system. In 2015, I designed an authentication library named Rodauth, implementing a more secure sign-in system that used hashed passwords.
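To make the idea of sharing code per routing branch concrete, here is a toy routing tree written with the standard library only. It is deliberately not Roda's API: the `TinyRouter` class, its `on` and `param` methods, and the routes are all invented for this sketch. The point is only that code placed inside a branch runs once for every route beneath it.

```ruby
# Toy routing tree (NOT Roda's API): each branch consumes one path segment,
# and code inside a branch block is shared by every route under that branch.
class TinyRouter
  def initialize(path)
    @segments = path.split("/").reject(&:empty?)
  end

  # Enter the block only if the next segment matches; the block's result
  # becomes the response.
  def on(segment)
    return unless @segments.first == segment
    @segments.shift
    throw :halt, yield
  end

  # Capture the next segment as a parameter, if there is one.
  def param
    return if @segments.empty?
    throw :halt, yield(@segments.shift)
  end
end

def route(path)
  r = TinyRouter.new(path)
  catch(:halt) do
    r.on("applications") do
      r.param do |id|
        # Shared by every route under /applications/<id>: load it once here.
        application = "application ##{id}"
        r.on("review")  { "reviewing #{application}" }
        r.on("publish") { "publishing #{application}" }
        "showing #{application}"
      end
      "listing applications"
    end
    "not found"
  end
end

puts route("/applications/7/review")
puts route("/applications")
```

With a flat list of routes, the lookup of the application would have to be repeated in, or hooked into, each of the three handlers separately.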
00:15:21.480 The 2020 redistricting system employed passwordless login through email and integrated two-factor authentication for added security. The internal systems relied on three-factor authentication, which included requiring physical access via access cards.
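Passwordless login through email generally works by mailing the user a link that contains a single-use token. The sketch below shows one way such tokens could be issued and redeemed using only the standard library. It is not Rodauth's implementation: the `EmailLoginTokens` class, the 15-minute lifetime, and the in-memory storage are assumptions for illustration.

```ruby
require "digest"
require "securerandom"

# Hypothetical single-use email login tokens, kept in memory for the example.
class EmailLoginTokens
  def initialize(ttl_seconds: 900)
    @ttl_seconds = ttl_seconds
    @pending = {} # SHA-256 digest of token => [email, expiry time]
  end

  # Create a token to embed in the emailed link. Only its digest is stored,
  # so reading the stored data is not enough to log in.
  def issue(email, now: Time.now)
    token = SecureRandom.urlsafe_base64(32)
    @pending[Digest::SHA256.hexdigest(token)] = [email, now + @ttl_seconds]
    token
  end

  # Return the email for a valid, unexpired token and consume the token;
  # return nil for unknown, reused, or expired tokens.
  def redeem(token, now: Time.now)
    email, expires_at = @pending.delete(Digest::SHA256.hexdigest(token))
    return nil if email.nil? || now > expires_at
    email
  end
end

tokens = EmailLoginTokens.new
token = tokens.issue("applicant@example.com")
puts tokens.redeem(token).inspect
puts tokens.redeem(token).inspect
```

A second factor, as mentioned in the talk, would be checked after the token is redeemed and before the session is treated as fully authenticated.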
00:16:18.900 The job system, again, used Unix cron, and with additional programming support, we developed features significantly more efficiently than during the previous process. We set a goal of 100 percent code coverage, and testing was a priority throughout the development of the 2020 system.
00:16:56.760 Accessibility testing was also performed with an external vendor, allowing us to rectify issues before formal launch. The application interface improved slightly compared to its 2010 counterpart, adding enhanced functionalities.
00:17:35.700 Upon launch, the 2020 system had a total of 181 routes, only a slight increase over the previous system, but it reflected a more stable and efficient delivery of services. During this process, we encountered challenges, but changes were made to increase effectiveness moving forward.
00:18:37.740 The 2030 redistricting plan aims to build on the successes and lessons learned from the 2020 process, addressing unhandled exceptions and further improving the application experience.
00:19:25.800 For the 2030 system, I plan to implement additional security measures using unveil rather than chroot, allowing for more granular control over filesystem access without requiring the application to run as root. By continuing to improve the 2020 system, I believe the 2030 system will meet the criteria for being a successful application.
00:20:14.940 I hope you had fun learning about redistricting in California and how Ruby has helped with the redistricting process in the past and will continue to help in the future.
00:20:41.640 If you enjoyed this presentation and want to read more of my thoughts on Ruby programming, please consider picking up a copy of "Polished Ruby Programming." That concludes my presentation. I'd like to thank you all for listening to me. I think I have about one minute if anyone has questions.
00:21:19.080 Questions?
00:21:20.580 Thank you.