Helping Redistrict California with Ruby

Every 10 years, after the federal census, California and most other states redraw the lines of various electoral districts to attempt to ensure the districts are fair and have roughly equal population. California uses a system written in Ruby for citizens to apply to become redistricting commissioners, and for review of the submitted applications. Come learn about redistricting and the unique design of the California redistricting commissioner application system, with 12 separate web server process types, isolated networks, 3-factor authentication, and other security features.

RubyConf 2022

00:00:00.000 ready for takeoff
00:00:17.180 all right well hello everyone in this
00:00:19.800 presentation I'm going to talk about
00:00:21.420 redistricting in California and the
00:00:23.880 systems I built using Ruby to assist
00:00:26.279 California's redistricting process
00:00:28.859 my name is Jeremy Evans I'm a Ruby
00:00:31.380 committer who focuses on fixing bugs in
00:00:34.260 Ruby I'm also the author of Polished
00:00:36.780 Ruby Programming which was published
00:00:38.399 last year this book is aimed at
00:00:40.680 intermediate Ruby programmers and
00:00:42.600 focuses on teaching principles of Ruby
00:00:44.579 programming as well as trade-offs to
00:00:46.620 consider when making implementation
00:00:48.300 decisions
00:00:49.739 so first what is redistricting in order
00:00:53.039 to answer that question I'm going to
00:00:54.840 give a brief civics lesson so what is a
00:00:57.600 district
00:00:58.320 well in this context a district is an
00:01:00.719 area of land where people residing on
00:01:03.000 that land vote for a person to represent
00:01:05.939 them
00:01:06.720 the United States is a representative
00:01:08.580 democracy with a bicameral legislature
00:01:10.860 known as Congress the lower chamber of
00:01:13.320 Congress is the House of Representatives
00:01:15.500 each state has a variable number of
00:01:18.299 representatives in the house based on
00:01:20.939 the state's population
00:01:22.340 representatives in the house are
00:01:24.479 supposed to represent the citizens that
00:01:27.119 are in their particular District to
00:01:29.220 ensure that each citizen has a local
00:01:31.500 representative in Congress
00:01:33.720 these districts called congressional
00:01:35.820 districts are an example of the type of
00:01:38.280 District that we are discussing
00:01:39.960 congressional districts are supposed to
00:01:42.060 be roughly equal in terms of population
00:01:44.759 to ensure that each citizen in the
00:01:46.740 United States has roughly equal
00:01:48.780 representation in the House of
00:01:51.000 Representatives
00:01:52.500 because population changes over time
00:01:55.280 ensuring roughly equal local
00:01:57.780 representation requires modifying the
00:02:00.720 district boundaries and the process of
00:02:03.060 modifying the district boundaries to
00:02:04.979 account for changes in population is
00:02:07.320 referred to as redistricting now
00:02:10.080 historically the process of modifying
00:02:12.720 the district boundaries in California
00:02:14.780 was often performed by elected officials
00:02:18.360 and this allowed the people that were
00:02:19.920 currently in power to modify the
00:02:22.200 district lines in a way to keep
00:02:23.940 themselves in power which is an obvious
00:02:26.160 conflict of interest
00:02:28.080 so in 2008 California citizens voted in
00:02:32.160 favor of a proposition to change the
00:02:34.920 redistricting process so that it was
00:02:37.140 performed by an independent group of
00:02:39.660 citizens and this group is named the
00:02:42.360 Citizens Redistricting Commission
00:02:45.180 now using an independent group of
00:02:47.519 citizens avoids the conflict of interest
00:02:49.920 issues that previously existed
00:02:53.160 however how could citizens of California
00:02:55.260 be sure that the members of the Citizens
00:02:58.019 Redistricting Commission would be the
00:02:59.940 most qualified citizens to perform the
00:03:02.400 redistricting work
00:03:04.140 well the responsibility for soliciting
00:03:06.180 applications to be a member of the
00:03:08.040 Citizens Redistricting Commission as
00:03:10.019 well as determining the most qualified
00:03:11.580 candidates was given to the California
00:03:13.860 State Auditor and that's how I became
00:03:16.319 involved in this process
00:03:18.300 at the time that the 2008 redistricting
00:03:20.879 proposition was passed I was the sole
00:03:23.159 programmer and lead systems
00:03:24.420 administrator at the California State
00:03:26.280 Auditor's office
00:03:27.840 now in July 2009 with less than five
00:03:30.659 months until launch the team handling
00:03:32.819 the redistricting project requested that
00:03:35.220 I develop an automated system to handle
00:03:37.739 accepting and reviewing applications to
00:03:40.260 be a member of the Citizens
00:03:41.400 Redistricting Commission
00:03:43.140 so now that you have that background the
00:03:45.540 rest of the presentation is going to
00:03:46.739 focus on the design and implementation
00:03:48.540 of the systems that I built to handle
00:03:51.180 the application process for
00:03:53.280 redistricting commissioners
00:03:55.080 so we'll start with the design and
00:03:56.640 implementation of the system for the
00:03:58.319 2010 redistricting process
00:04:01.200 naturally the first part of any systems
00:04:03.239 design is to gather requirements in
00:04:05.940 terms of the initial requirements beyond
00:04:07.920 the basic authentication requirements
00:04:09.659 that most systems have the most
00:04:11.879 important part of the system is the
00:04:13.860 ability for the system to accept
00:04:15.540 applications to be a member of the
00:04:17.880 commission and there are actually two
00:04:19.620 applications an initial application that
00:04:22.320 takes about five minutes to fill out and
00:04:24.660 then a supplemental application with
00:04:26.639 four essay questions and a requirement
00:04:29.100 to list all close family members
00:04:30.840 previous addresses large
00:04:33.660 financial contributions and a full
00:04:35.880 education employment and criminal
00:04:37.320 history
00:04:38.280 so the supplemental application took
00:04:40.259 most applicants many hours to fill out
00:04:42.540 only about 10 percent of the citizens
00:04:45.360 that submitted an initial application
00:04:47.040 also submitted a supplemental
00:04:48.900 application
00:04:50.400 all these applications have to be
00:04:52.139 reviewed by our staff to make sure they
00:04:54.240 don't contain any information that is
00:04:56.100 offensive or confidential and all
00:04:58.620 qualified applications are posted
00:05:00.300 publicly for all citizens to review to
00:05:02.820 ensure transparency in the selection
00:05:04.680 process
00:05:05.940 an audit log is kept of all changes made
00:05:08.699 in the system with the ability for
00:05:10.320 administrators to search and review the
00:05:12.360 logs now this audit logging feature ended
00:05:14.820 up being critically important during the
00:05:16.680 process and one of the most important
00:05:18.600 lessons that I learned during the 2010
00:05:21.360 redistricting process was the importance of
00:05:23.699 audit logging so all production systems
00:05:25.860 that I maintain today have at least a
00:05:27.840 basic audit log showing changes made in
00:05:30.479 the system
00:05:31.620 now the initial design used three
00:05:33.900 separate systems for handling the first
00:05:36.120 three requirements there was a system
00:05:38.160 called the public system that allowed
00:05:40.020 the public citizens to log in and submit
00:05:42.840 applications
00:05:44.100 there was a system called the internal
00:05:45.780 system that allowed our staff to log in
00:05:48.000 and review the applications and access
00:05:50.520 to the internal system required being
00:05:52.680 physically present in our office it was
00:05:54.539 not accessible from the internet
00:05:57.360 now I assumed when designing the system
00:05:59.100 that 99% of the web requests would be for
00:06:02.280 the public viewing the applications and
00:06:04.740 since viewing the applications does not
00:06:06.360 require a login and applications could
00:06:08.699 not generally be modified after being
00:06:10.680 submitted I decided the easiest approach
00:06:13.139 would be to generate a static site for
00:06:15.360 the applications so the third system was
00:06:17.340 a static site generator
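The static site idea can be sketched in a few lines of Ruby. This is only an illustration under assumptions, not the code used in 2010: the template, the field names (`id`, `name`, `essay`), and the `generate_static_site` method are all hypothetical, and it uses only the standard library's ERB.

```ruby
require "erb"
require "fileutils"

# Hypothetical page template; applicant-supplied text is HTML-escaped.
PAGE = ERB.new(<<~HTML)
  <!DOCTYPE html>
  <title>Application <%= id %></title>
  <h1><%= ERB::Util.html_escape(name) %></h1>
  <p><%= ERB::Util.html_escape(essay) %></p>
HTML

# Write one static HTML file per application into output_dir and
# return the list of paths that were written.
def generate_static_site(applications, output_dir)
  FileUtils.mkdir_p(output_dir)
  applications.map do |app|
    path = File.join(output_dir, "#{app[:id]}.html")
    File.write(path, PAGE.result_with_hash(app))
    path
  end
end
```

Because the output is plain files, the public-facing web server never needs a login or a database connection to serve them.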
00:06:19.320 now given the limited time I had until
00:06:21.539 launch and a budget of zero I decided to
00:06:24.360 run the system on our existing
00:06:25.740 infrastructure
00:06:26.940 so this infrastructure consisted of a
00:06:28.680 single server that we had purchased in
00:06:30.360 2002 this server had dual 1.4 gigahertz
00:06:33.960 Pentium 3 CPUs a single gigabyte of RAM
00:06:36.720 and an 18 gigabyte 10K hard drive
00:06:39.600 so the server already ran our other
00:06:41.220 internal applications so not all of the
00:06:43.440 RAM was available for the redistricting
00:06:45.180 process to use
00:06:47.039 the server ran OpenBSD and used
00:06:49.979 PostgreSQL as a database and Ruby as the
00:06:52.139 programming language for the existing
00:06:53.699 applications
00:06:55.319 one of the first decisions I had to make
00:06:57.240 when starting to develop this
00:06:58.440 application was what libraries I would
00:07:00.419 use to build this starting with the
00:07:01.800 library for database access
00:07:03.840 so by mid-2009 I had already been
00:07:06.300 maintaining Sequel for over a year and all
00:07:08.699 internal development had already
00:07:10.319 switched to Sequel so Sequel was the
00:07:12.240 natural choice I added a nested
00:07:14.400 attributes plugin to Sequel at the start of
00:07:16.319 this process which I used to implement
00:07:18.660 the system's supplemental application
00:07:22.139 then I had to decide on which web
00:07:23.699 framework to use so at the time all the
00:07:26.460 other production applications I
00:07:28.020 maintained used Rails however by 2009 I
00:07:31.740 had already become disenchanted with Rails
00:07:34.319 as I mentioned there were separate
00:07:36.060 public internal and static site
00:07:38.220 applications all of which used the same
00:07:40.740 models which is something that Rails
00:07:42.900 still does not support well
00:07:45.360 I had some experience developing
00:07:46.860 personal projects in Sinatra and I saw
00:07:49.500 how much faster it was to develop
00:07:51.000 applications in Sinatra as well as how
00:07:53.280 Sinatra was faster at runtime so I
00:07:55.860 decided to use Sinatra as the web
00:07:57.660 framework for all the systems
00:08:00.120 in terms of authentication there wasn't
00:08:02.340 a good authentication library at the
00:08:03.840 time development started so I designed a
00:08:06.120 custom authentication system
00:08:08.039 like most applications we needed a job
00:08:10.740 system to reduce the amount of time
00:08:12.419 spent during web requests and unlike
00:08:14.759 most applications we used standard Unix
00:08:17.039 cron for this purpose
00:08:18.660 most of the jobs that we had were not
00:08:20.160 very time sensitive with our most
00:08:22.020 frequent jobs running every five minutes
00:08:25.080 in terms of testing due to the limited
00:08:27.599 time I had for systems development I
00:08:30.060 decided to perform only integration
00:08:31.800 testing and I skipped model testing
00:08:33.419 completely
00:08:34.560 I also did not perform any coverage
00:08:36.300 testing during the 2010 process
00:08:39.120 now to make submission of the
00:08:40.680 supplemental application easier the
00:08:42.120 supplemental application used JavaScript
00:08:44.039 if it was available however no part of
00:08:46.440 the application required JavaScript the
00:08:48.839 integration tests did not use JavaScript
00:08:50.700 and they still passed
00:08:52.560 we also did not do any automated testing
00:08:54.240 of our JavaScript all of our JavaScript
00:08:55.920 was tested manually
00:08:58.080 now before launching the system in order
00:09:00.120 to ensure that the system could handle
00:09:01.800 the expected load we performed some
00:09:04.080 end-to-end load testing
00:09:06.120 and remember when I discussed the
00:09:07.440 infrastructure with the single server
00:09:09.060 with the dual core Pentium 3 well
00:09:11.519 during end-to-end load testing I found
00:09:13.440 that we could process about one
00:09:15.600 supplemental application per second per
00:09:17.459 CPU for a total of two applications per
00:09:20.580 second
00:09:21.540 well that sounds really bad right I mean
00:09:24.000 you cannot possibly report numbers that
00:09:26.519 low so I told the project manager we
00:09:28.860 could handle 120 applications per minute
00:09:32.580 and that's a much more respectable
00:09:34.260 number
00:09:35.040 and the project manager agreed that we
00:09:37.260 would be fine using the existing
00:09:39.120 infrastructure
00:09:40.680 now we only received a total of 4,800
00:09:43.260 supplemental applications
00:09:45.540 during the 2010 redistricting process
00:09:47.880 and the existing infrastructure did work
00:09:49.800 fine
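The capacity arithmetic above can be written out as a short sketch. The method names are hypothetical; the inputs (one application per second per CPU, two CPUs, 4,800 supplemental applications) are the figures from the talk. The last figure is only a back-of-the-envelope bound: the applications arrived over the whole process, not back to back.

```ruby
# Applications per minute, given per-CPU throughput and CPU count.
def applications_per_minute(per_second_per_cpu:, cpus:)
  per_second_per_cpu * cpus * 60
end

# Minutes of sustained processing needed for a given number of applications.
def minutes_to_process(total, per_minute)
  total.fdiv(per_minute)
end

per_minute = applications_per_minute(per_second_per_cpu: 1, cpus: 2) # 120
minutes = minutes_to_process(4_800, per_minute)                      # 40.0
puts "#{per_minute} applications/minute, #{minutes} minutes for 4,800"
```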
00:09:50.940 now here's what the initial application
00:09:52.440 looked like in 2010 and as you can see
00:09:54.899 we did not spend anything on visual
00:09:57.300 design
00:09:58.320 I was kind of surprised they didn't have
00:10:00.180 our visual design team make this look
00:10:01.560 prettier but the project manager told me
00:10:03.779 that he liked that it was obvious to
00:10:05.339 stakeholders that we did not waste any
00:10:06.899 of our budget on visual design
00:10:10.920 so when the 2010 systems launched the
00:10:13.140 systems had 44 routes
00:10:15.480 and by the end of the 2010 redistricting
00:10:17.640 process the systems had 131 routes three
00:10:21.180 times the number that the systems had at
00:10:22.980 launch
00:10:24.060 so between system launch and the end of
00:10:26.100 the process there were 30 database
00:10:27.959 migrations many of which implemented
00:10:30.120 entirely new subsystems in the
00:10:31.920 application
00:10:33.000 when we launched we really had no idea
00:10:35.339 how big the system actually needed to be
00:10:37.440 there were many subsystems that we did
00:10:39.600 not anticipate needing an automated
00:10:41.519 system for when we launched so let me
00:10:43.620 give some examples
00:10:45.360 when we launched we had no idea how many
00:10:47.640 full applications we would receive we
00:10:49.920 ended up receiving many more than we
00:10:51.420 expected to receive and we needed to
00:10:53.399 build a system to review supplemental
00:10:55.680 applications to determine which were the
00:10:57.899 most qualified
00:10:59.579 when we launched we knew we would have
00:11:01.440 to accept public comment on applications
00:11:03.660 but we didn't know what form such public
00:11:05.940 comment would take we ended up having to
00:11:08.160 build a system to accept review and post
00:11:10.560 public comments
00:11:12.360 then we determined we needed a system so
00:11:14.519 that applicants could respond to those
00:11:16.079 public comments and for us to review and
00:11:18.480 post their responses
00:11:20.459 there was an appeals process for
00:11:22.440 applicants who were disqualified and
00:11:24.540 enough applicants requested
00:11:25.800 reconsideration of their applications
00:11:27.480 that we had to design a system for that
00:11:30.420 people being human make mistakes as it
00:11:33.540 turns out a lot of mistakes we had to
00:11:36.600 build a system for handling submitting
00:11:38.700 reviewing and posting amendments to
00:11:40.860 applications
00:11:42.240 now the redistricting team and I used a
00:11:44.339 just-in-time development approach as
00:11:46.380 soon as the team had a bottleneck and
00:11:47.880 they needed something automated they
00:11:49.620 would contact me and I develop a system
00:11:51.180 to handle it usually with one or two
00:11:53.640 days between request and production
00:11:55.500 deployment
00:11:56.760 in general the 2010 redistricting system
00:11:59.579 was considered a huge success for a
00:12:01.320 government IT project especially one
00:12:03.240 that had pretty much zero budget
00:12:05.459 and while the 2010 system was a success
00:12:07.860 and I still consider it a major
00:12:09.420 accomplishment given the constraints the
00:12:11.640 just-in-time development approach
00:12:13.440 resulted in a system developed in an ad
00:12:15.720 hoc manner
00:12:17.040 so when it came time to develop the 2020
00:12:19.320 system I considered the 2010 system a
00:12:21.720 prototype for one the 2010 system
00:12:24.959 was not maintained after the 2010
00:12:27.120 redistricting process ended
00:12:29.519 now in most of my systems design I use
00:12:31.980 an evolutionary prototype approach
00:12:33.660 constantly refining the Prototype until
00:12:36.240 it becomes fully production ready
00:12:38.100 and that's usually because I'm not time
00:12:39.779 constrained when I'm developing
00:12:40.980 prototypes this is one case where I
00:12:43.800 decided to throw away the Prototype and
00:12:46.019 develop a completely new system and we
00:12:48.300 didn't throw away the Prototype
00:12:49.320 completely we did keep it for reference
00:12:50.820 but the 2020 system shared no code with
00:12:54.000 the 2010 system
00:12:56.579 now in mid-2018 I started designing the
00:12:59.339 2020 system now the 2020 system would
00:13:01.860 handle everything that 2010 system
00:13:03.899 handled and in addition it would
00:13:05.639 automate some tasks that were not
00:13:07.380 automated in 2010.
00:13:09.480 one new feature in the 2020 system was
00:13:12.060 much more extensive audit logging so
00:13:14.519 instead of just logging what type of
00:13:16.200 change was made the 2020 audit log kept
00:13:19.019 previous values for all changed columns
00:13:21.180 so administrators could view the basic
00:13:23.339 audit log and they could drill down for
00:13:25.740 any entry to see the previous column
00:13:27.899 values for all rows updated in the same
00:13:30.420 transaction this feature was implemented
00:13:32.700 using database triggers and a JSONB
00:13:34.920 column to store the previous values
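The "previous values for changed columns" idea can be illustrated in plain Ruby. This is an in-memory sketch only: the real system did this inside the database with triggers, and the method names, the row shape, and the table name below are hypothetical.

```ruby
require "json"

# Return the previous values of only those columns whose value changed.
def previous_values(old_row, new_row)
  old_row.select { |column, value| new_row[column] != value }
end

# Build an audit entry; the previous values are serialized as JSON,
# standing in for what a JSONB column would hold.
def audit_entry(table, old_row, new_row)
  { table: table, previous: JSON.generate(previous_values(old_row, new_row)) }
end

old_row = { id: 1, status: "submitted", name: "Pat" }
new_row = { id: 1, status: "unsubmitted", name: "Pat" }
entry = audit_entry("applications", old_row, new_row)
puts entry[:previous]
```

Storing only the changed columns keeps each entry small while still letting an administrator reconstruct what a row looked like before the change.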
00:13:37.440 another improvement was a change in
00:13:39.300 which demographic options the
00:13:40.800 application system supported so the
00:13:42.959 system recorded and displayed
00:13:44.459 demographic information for all
00:13:46.019 applicants as required by law now in
00:13:49.139 2010 there were only two options for
00:13:51.120 gender and seven options for ethnicity
00:13:52.860 in 2020 the systems had three options
00:13:55.200 for gender and 23 options for ethnicity
00:13:58.620 and when designing the 2020 system I had
00:14:01.019 the expectation up front that it would
00:14:02.880 not be a prototype I designed the 2020
00:14:05.579 system with the expectation that it
00:14:07.139 would be continuously maintained after
00:14:09.360 the 2020 redistricting process and
00:14:11.760 eventually used for the 2030 and future
00:14:13.800 redistricting processes
00:14:15.779 so in terms of the 2020 system design
00:14:17.700 the operating system database and
00:14:19.860 programming language remain the same as
00:14:21.779 the 2010 system just newer versions of
00:14:23.820 each Sequel was still the natural choice
00:14:26.220 for database access
00:14:27.899 now the 2010 system used Sinatra and I
00:14:31.019 ended up using Sinatra for all new
00:14:33.120 systems development between 2009 and
00:14:35.459 2013 while still maintaining older
00:14:37.980 applications that were developed in
00:14:39.540 Rails
00:14:40.560 and one issue I ended up having in all
00:14:42.480 of my Sinatra applications is that many
00:14:44.940 of the routes would have duplicated code
00:14:46.800 because Sinatra didn't and still doesn't
00:14:49.440 have good support for sharing code on a
00:14:52.920 per routing Branch basis I still prefer
00:14:55.620 Sinatra to Rails as in general my
00:14:57.720 Sinatra applications were easier to
00:14:59.579 maintain and understand
00:15:01.800 to address the issues I had with Sinatra
00:15:03.959 I ended up creating Roda in 2014 and
00:15:07.019 one advantage of Roda is you can more
00:15:08.699 easily share code between routing
00:15:10.500 branches resulting in code that is even
00:15:12.360 easier to maintain than Sinatra Roda was
00:15:15.240 used as the web framework for the 2020
00:15:17.160 redistricting system
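To make "sharing code between routing branches" concrete, here is a toy routing tree in plain Ruby. It is deliberately NOT Roda's API: `ToyRequest`, `on`, `on_id`, `route`, and the `APPLICATIONS` data are all made up for illustration. The point is that the application lookup is written once and every route nested below it reuses the result.

```ruby
# Toy routing tree (not Roda): each matcher consumes a path segment and,
# when it matches, the value of its block becomes the response.
class ToyRequest
  def initialize(path)
    @segments = path.split("/").reject(&:empty?)
  end

  # Match a literal segment, then run the nested branch.
  def on(segment)
    return unless @segments.first == segment
    @segments.shift
    throw :halt, yield
  end

  # Match a numeric segment and pass it to the branch as an Integer.
  def on_id
    return unless @segments.first&.match?(/\A\d+\z/)
    throw :halt, yield(@segments.shift.to_i)
  end
end

APPLICATIONS = { 1 => "application 1" }.freeze # hypothetical data

def route(path)
  r = ToyRequest.new(path)
  response = catch(:halt) do
    r.on "applications" do
      r.on_id do |id|
        # Shared by every route below: look up the record once.
        application = APPLICATIONS[id]
        throw :halt, nil unless application
        r.on("review") { "reviewing #{application}" }
        r.on("comments") { "comments on #{application}" }
        "showing #{application}"
      end
    end
    nil
  end
  response || "not found"
end
```

In a flat list of routes, the lookup and its not-found handling would have to be repeated (or hidden in a filter) for each of the three paths.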
00:15:18.779 now in 2015 I had a bunch of
00:15:20.639 applications that all used custom
00:15:22.380 authentication designs using a more
00:15:24.720 secure approach than most applications
00:15:26.699 where the database user for the
00:15:28.560 application does not have direct access
00:15:30.660 to the password hashes now there was not
00:15:33.420 an existing authentication library that
00:15:35.220 supported this approach and the only
00:15:37.199 other Ruby authentication libraries that
00:15:38.880 existed required Rails
00:15:41.279 so in 2015 I designed a new
00:15:43.440 authentication library named Rodauth
00:15:45.060 that used this more secure approach
00:15:47.000 Rodauth was used for authentication in
00:15:49.500 the 2020 system
00:15:50.880 Rodauth is now Ruby's most advanced
00:15:52.800 authentication library it's built on top
00:15:54.839 of Sequel and Roda but it's usable as a
00:15:57.300 rack middleware to handle authentication
00:15:59.220 for any Ruby web application
00:16:02.279 one of the big problems that we had
00:16:03.779 during the 2010 process was that users
00:16:06.120 could not remember their passwords
00:16:09.120 so the Project Lead wanted to default to
00:16:11.760 a passwordless authentication system in
00:16:14.100 the 2020 system so early on in the
00:16:16.440 development of the 2020 system I added a
00:16:18.720 feature to Rodauth to support
00:16:20.160 passwordless authentication using email
00:16:22.100 users could still choose to create a
00:16:24.120 password if they wanted to
00:16:26.339 to increase security of the internal
00:16:28.079 systems at the application level
00:16:29.959 two-factor authentication was used with
00:16:32.339 a password and TOTP code required to log
00:16:35.160 in
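For readers unfamiliar with TOTP, the code below is a minimal standard-library sketch of the algorithm from RFC 6238 (HMAC-SHA1 over a 30-second time counter, then dynamic truncation). It is not how the authentication library in the talk implements it; the `totp` method and its parameter names are assumptions made for this example.

```ruby
require "openssl"

# Compute a time-based one-time password (RFC 6238, HMAC-SHA1).
# secret is the raw shared key; time is seconds since the Unix epoch.
def totp(secret, time: Time.now.to_i, step: 30, digits: 6)
  counter = [time / step].pack("Q>") # 8-byte big-endian counter
  hmac = OpenSSL::HMAC.digest(OpenSSL::Digest.new("SHA1"), secret, counter)
  offset = hmac.bytes.last & 0x0f    # dynamic truncation offset
  code = hmac.byteslice(offset, 4).unpack1("N") & 0x7fffffff
  format("%0#{digits}d", code % (10**digits))
end

# RFC 6238 test vector: ASCII secret "12345678901234567890" at time 59.
puts totp("12345678901234567890", time: 59, digits: 8) # 94287082
```

A server verifying a code would typically compare the submitted value against the codes for the current and adjacent time steps to tolerate clock drift.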
00:16:36.000 there was an additional factor that was
00:16:37.860 needed to access the internal systems so
00:16:40.019 the internal systems had a total of
00:16:41.339 three-factor authentication I'll discuss
00:16:43.680 the third factor in a bit when I go over
00:16:45.420 the system's advanced security features
00:16:47.940 now the use of cron as a job system
00:16:49.740 worked well for the 2010 system so we
00:16:52.079 used it again for the 2020 system
00:16:54.660 we had about twice as much time to
00:16:56.940 develop the 2020 system starting about
00:16:58.620 nine months before launch
00:17:00.720 in addition there were two programmers
00:17:02.220 working on the system and not just me
00:17:03.660 and due to libraries such as Roda and
00:17:05.819 Rodauth in the 2020 system development of
00:17:08.280 features was significantly faster
00:17:10.679 as we had sufficient time I had a goal
00:17:12.600 of 100% line coverage for the 2020 system
00:17:15.179 we tested coverage on a regular basis
00:17:17.339 adding any tests needed to get to 100%
00:17:19.740 coverage
00:17:21.480 one process that we undertook in 2020
00:17:24.000 that we did not undertake in 2010 was
00:17:27.000 having the system accessibility tested
00:17:29.160 by an external vendor and this vendor
00:17:31.380 found many accessibility issues it was
00:17:33.960 very interesting for me to work side by
00:17:35.700 side with one of the accessibility
00:17:37.080 testers who was blind and was filling
00:17:39.419 out the applications using an iPhone
00:17:41.160 with text to speech I got to see
00:17:43.559 firsthand how accessibility issues made
00:17:46.559 filling out the applications much more
00:17:48.120 difficult
00:17:49.020 pretty much all the accessibility issues
00:17:50.880 they found were easy to fix and we were
00:17:52.980 able to ensure that blind and low vision
00:17:55.140 users were both able to successfully
00:17:57.600 complete the application process using
00:17:59.640 the system
00:18:00.960 now here's what the initial application
00:18:02.400 looked like in 2020 it does look
00:18:04.620 slightly nicer than it did in 2010 since
00:18:06.720 we were provided a couple of graphics
00:18:08.220 for the top however the rest of the
00:18:10.080 visual design remained very plain just
00:18:12.240 black and white still had very much just
00:18:14.400 the facts vibe
00:18:16.380 now at launch the 2020 systems had 167
00:18:19.320 routes and that doesn't count any routes
00:18:21.240 related to authentication as those were
00:18:23.460 handled by Rodauth at the end of the 2020
00:18:26.400 redistricting process the systems had 181
00:18:29.100 routes less than 10 percent more than it
00:18:31.380 had at the start and quite a bit
00:18:33.000 different than the three times increase
00:18:34.440 that we had during the 2010 process
00:18:37.679 during the 2020 process there were nine
00:18:40.020 database migrations most of which were
00:18:42.240 small and none of which added any new
00:18:44.280 subsystems the most significant changes
00:18:46.440 were focused on decreasing the amount of
00:18:48.780 time it took to review applications
00:18:50.460 public comments and other things that
00:18:52.440 the system dealt with
00:18:54.419 the most significant change in 2020 in
00:18:56.820 terms of the system design was a focus
00:18:58.500 on security
00:18:59.940 so during the 2010 system design
00:19:01.860 the focus was on getting something that
00:19:03.419 worked when I met with the project lead
00:19:05.580 for the 2020 system he specifically
00:19:07.679 tasked me with making the system as
00:19:09.240 secure as I could possibly make it
00:19:11.400 so the 2020 system contains many
00:19:14.039 security features that you do not see in
00:19:15.960 typical Ruby web applications
00:19:18.059 one of the largest changes is that the
00:19:20.100 internal system was split into separate
00:19:22.620 applications based on what type of
00:19:24.299 database access was needed in addition
00:19:26.580 separate operating system users are used
00:19:28.740 based on how much access the process
00:19:31.020 needs to the file system
00:19:33.120 so there are actually 12 separate
00:19:34.500 processes used in the 2020 system
00:19:37.559 there was still a public system that was
00:19:39.600 used by citizens to submit applications
00:19:41.640 and public comments and other things
00:19:43.140 however in 2020 this system is locked
00:19:45.720 down for example the database user for
00:19:48.480 this process could create new
00:19:50.460 applications and public comments but
00:19:52.440 could not modify existing ones
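One way to picture "can create but cannot modify" is as a table of per-role database privileges that is turned into GRANT statements. This is a sketch under assumptions: the role name `public_app`, the table names, and the `grant_statements` method are hypothetical, and the talk does not describe how the real permissions were managed.

```ruby
# Hypothetical privilege map: the public-facing role may INSERT new rows
# but is never granted UPDATE or DELETE on them.
PRIVILEGES = {
  "public_app" => {
    "applications"    => %w[SELECT INSERT],
    "public_comments" => %w[INSERT]
  }
}.freeze

# Turn the privilege map into SQL GRANT statements.
def grant_statements(privileges)
  privileges.flat_map do |role, tables|
    tables.map do |table, privs|
      "GRANT #{privs.join(', ')} ON #{table} TO #{role};"
    end
  end
end

puts grant_statements(PRIVILEGES)
```

With permissions enforced by the database, a compromised web process is still unable to alter rows that were already submitted.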
00:19:55.140 there was a staff system for staff
00:19:56.760 performing the initial review of
00:19:58.200 applications flagging applications that
00:20:00.419 needed higher level review there was a
00:20:03.120 public comment system for staff
00:20:04.620 performing review of public comments
00:20:06.299 which require additional scrutiny as
00:20:08.640 they're often antagonistic
00:20:11.160 there was an administrative system for
00:20:13.080 higher level staff to manage users and
00:20:15.480 to perform secondary reviews of
00:20:17.340 applications and public comments that
00:20:19.320 had been flagged during the initial
00:20:20.880 reviews
00:20:22.320 there was a single system supporting
00:20:23.940 file uploads no other systems supported
00:20:26.160 uploading files for security we did not
00:20:28.260 accept uploaded files from the public
00:20:30.780 some of the other internal systems could
00:20:33.000 manage files that had been uploaded into
00:20:35.039 the upload system
00:20:36.660 there was a system to handle removing
00:20:38.640 applicants and this was the most
00:20:40.080 sensitive system which only a couple of
00:20:41.940 staff had access to
00:20:43.980 the staff that reviewed applications to
00:20:46.020 determine which were the most qualified
00:20:47.640 were called ARP members ARP being short
00:20:49.919 for applicant review panel and by law
00:20:52.440 there were three ARP members and none of
00:20:54.840 them were allowed to see what the other
00:20:56.400 ARP members were working on they could
00:20:58.559 only have discussions about applicants
00:21:00.059 during public meetings so each ARP
00:21:02.520 member has their own system and the
00:21:04.380 database user for each system only
00:21:06.059 allows access to their own reviews it
00:21:08.039 does not allow them access to see the
00:21:09.780 reviews for other ARP members
00:21:12.000 all the ARP members have an assistant
00:21:13.440 who assists them with their reviews
00:21:15.020 these people referred to as ARP helpers
00:21:17.520 had less access than ARP members they
00:21:20.039 could add notes to the applications that
00:21:22.260 were viewable by the ARP member they
00:21:23.700 were assisting but they could not modify
00:21:25.559 the ARP member's decision in the system
00:21:28.140 so in addition to these systems there
00:21:29.760 was also a static site generator since
00:21:31.620 the 2020 system used basically the same
00:21:33.659 static site approach as the 2010 system
00:21:36.000 to allow the public to view submitted
00:21:38.340 applications
00:21:39.659 with the exception of the public and the
00:21:41.640 upload systems each of the other systems
00:21:43.799 was only accessible on a per system
00:21:45.840 specific VLAN
00:21:47.640 for example if you want to access the
00:21:49.320 public comment system there was only a
00:21:51.240 single workstation connected to the
00:21:53.220 public comment VLAN so you had to be
00:21:55.440 physically present at that workstation
00:21:57.360 to access the public comment system
00:21:59.640 each of the VLANs were isolated and had
00:22:01.799 no internet access so the only way to
00:22:04.320 access these internal systems was to
00:22:06.360 have specific physical access to
00:22:08.039 workstations in our office and access to
00:22:10.440 those workstations was limited to
00:22:12.419 specific staff with physical access
00:22:13.860 cards so I mentioned earlier that Rodauth
00:22:16.200 was using two-factor authentication
00:22:17.820 in the application requiring both
00:22:19.500 password and TOTP code to log in
00:22:22.679 by separating all these other systems
00:22:24.539 into their own VLANs each requiring
00:22:26.400 their own physical access card all
00:22:28.500 internal systems had three-factor
00:22:30.120 authentication with the third factor
00:22:31.919 being access to a specific physical
00:22:33.780 location
00:22:35.580 I mentioned earlier that each of the 12
00:22:37.080 systems used separate database users with
00:22:39.780 reduced privileges in some cases the
00:22:42.780 systems needed limited access to tables
00:22:45.059 that were recorded in other systems but
00:22:47.820 could not have full access to the tables
00:22:50.340 so the systems use security definer
00:22:52.440 database functions to make changes that
00:22:54.539 would not normally be allowed by their
00:22:55.980 database permissions for example no
00:22:58.260 internal system was allowed to change
00:23:00.299 the contents of a submitted application
00:23:03.000 however if applicants made a mistake
00:23:04.980 when submitting their application they
00:23:06.840 could request that their application be
00:23:08.220 unsubmitted in order for it to be fixed
00:23:10.260 for this case there was a security
00:23:12.120 definer database function that modified
00:23:14.159 the application to unsubmit it
00:23:16.919 all the systems were restricted by
00:23:18.960 ingress egress and loopback firewall
00:23:21.240 rules most of the systems as I mentioned
00:23:23.340 were completely isolated and had no
00:23:24.960 internet access at all the public system
00:23:27.299 was limited to receiving HTTPS
00:23:29.159 connections from two front-end servers
00:23:31.620 that were located in one of the state's
00:23:33.240 data centers
00:23:34.860 to mitigate arbitrary file access and
00:23:37.260 remote code execution vulnerabilities
00:23:38.940 all systems ran chrooted the
00:23:41.520 applications would start as root and
00:23:43.320 after they loaded but before they
00:23:45.360 started accepting connections they would
00:23:47.340 chroot to the working directory of the
00:23:49.020 application and then they would drop
00:23:50.640 privileges to the application's operating
00:23:52.679 system user file system permissions were
00:23:55.200 used inside the folder so that attackers
00:23:57.600 could not read configuration files with
00:23:59.460 sensitive information
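The chroot-then-drop-privileges sequence described above can be sketched in Ruby roughly as follows. This is only an illustration under stated assumptions, not the code used in the actual systems: the method name `chroot_and_drop_privileges` and its arguments are hypothetical, and the method only works when the process is started as root.

```ruby
require "etc"

# Hypothetical helper. It must be called as root, after the application
# code has been loaded and before the server starts accepting connections.
def chroot_and_drop_privileges(app_dir, username)
  # Look up the account first: /etc/passwd is not visible after chroot.
  user = Etc.getpwnam(username)

  # Set supplementary groups while the group database is still readable.
  Process.initgroups(username, user.gid)

  # Confine file system access to the application directory.
  Dir.chroot(app_dir)
  Dir.chdir("/")

  # Drop the group before the user, because once the uid has changed
  # the process no longer has permission to change its gid.
  Process::GID.change_privilege(user.gid)
  Process::UID.change_privilege(user.uid)
end
```

A call such as `chroot_and_drop_privileges(Dir.pwd, "app_user")` (with a hypothetical `app_user` account) would go right before the server begins listening, so that everything the application needs has already been loaded from outside the chroot.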
00:24:01.260 to make exploitation more difficult and
00:24:03.659 to reduce the kernel attack surface for
00:24:05.520 privilege escalation all systems limited
00:24:08.520 the allowed set of kernel system calls
00:24:10.320 to the minimum that the system needed to
00:24:12.000 function using an OpenBSD-specific API
00:24:14.760 called pledge after the systems were
00:24:17.340 initialized none of the systems were
00:24:19.380 allowed to fork exec or send signals to
00:24:22.020 any other processes if the system did
00:24:24.179 not need to accept or manage uploaded
00:24:26.220 files it was not allowed to create or
00:24:28.260 modify any files
00:24:30.480 to make it more difficult to execute
00:24:32.280 blind return-oriented programming
00:24:33.960 attacks that are based on exploiting
00:24:35.640 consistent memory layouts each process
00:24:37.980 handling requests would exec after
00:24:40.380 forking and before loading the system so
00:24:43.020 that all processes had unique memory
00:24:45.360 layouts now this has a fairly high
00:24:47.700 memory cost but we were still able to do
00:24:50.039 this and remain within our memory budget
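The exec-after-fork pattern described above can be sketched in Ruby roughly as follows. This is a minimal illustration, not the actual server code: `spawn_worker` and `WORKER_SCRIPT` are hypothetical names, and a real worker would load the application and accept connections instead of exiting immediately. The benefit assumes an operating system with address space layout randomization, where a forked child shares its parent's layout but a freshly exec'd process gets a new one.

```ruby
require "rbconfig"

# Hypothetical helper: fork a child, then immediately exec a fresh Ruby
# interpreter in it. Because exec replaces the process image, the worker
# does not inherit the parent's memory layout.
def spawn_worker(script)
  fork do
    exec(RbConfig.ruby, "-e", script)
  end
end

# Placeholder worker body; a real worker would load the application here,
# which is why this approach costs more memory than plain forking.
WORKER_SCRIPT = "exit 0"

pids = Array.new(2) { spawn_worker(WORKER_SCRIPT) }
statuses = pids.map { |pid| Process.wait2(pid).last }
puts statuses.all?(&:success?)
```

The memory cost mentioned in the talk follows from this design: since each worker loads the system after exec, nothing is shared copy-on-write with the parent.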
00:24:52.740 in the 2020 process I requested a budget
00:24:55.200 for a new server so I could run the
00:24:56.820 systems on isolated hardware and the
00:24:58.860 server was fairly modest by the
00:25:00.299 standards of the time with a 2.1
00:25:02.340 gigahertz 8 core processor and only
00:25:04.080 eight gigabytes of RAM
00:25:05.580 importantly it had a RAID 1 of solid
00:25:07.799 state disks for storage
00:25:09.659 now if you remember the 2010 system
00:25:11.520 could process about 120 supplemental
00:25:13.559 applications per minute
00:25:14.880 due to more modern hardware newer Ruby
00:25:17.280 and PostgreSQL versions and more
00:25:18.780 optimized Ruby libraries the 2020
00:25:21.000 process could handle 7,200 supplemental
00:25:23.700 applications per minute this proved to
00:25:25.620 be more than sufficient considering we
00:25:27.120 only received a total of 2200
00:25:29.159 supplemental applications during the
00:25:31.080 2020 redistricting process
00:25:33.419 so I mentioned earlier that I aimed for
00:25:35.159 100% line coverage in the
00:25:37.200 2020 system and we did meet that however
00:25:39.480 while less than 100% code coverage means
00:25:42.059 something 100% code coverage means
00:25:44.220 nothing and specifically 100% code
00:25:46.140 coverage does not mean that you have no
00:25:47.580 bugs
00:25:48.600 during the 2020 redistricting process we had
00:25:50.820 43 unhandled exceptions with the
00:25:53.340 majority in the public system
00:25:55.080 19 of these exceptions were traced to
00:25:57.059 five separate race conditions and in all
00:25:59.400 cases these race conditions were caught
00:26:00.840 by database constraints
00:26:02.760 19 of these exceptions were due to form
00:26:04.980 submissions with invalid child
00:26:06.539 associations in cases where Sequel's nested
00:26:09.360 attributes plugin did not yet support
00:26:11.580 handling the invalid association
00:26:13.380 automatically
00:26:14.760 three of the exceptions were due to
00:26:16.380 ASCII NUL bytes being submitted in form
00:26:18.419 submissions during login and two of the
00:26:20.520 exceptions were due to a deployment
00:26:21.779 issue
00:26:23.039 all told the 2020 redistricting process
00:26:25.080 went much smoother than the 2010 process
00:26:27.419 and that's mostly because we knew what
00:26:29.220 to expect and in general anticipated it
00:26:31.440 and planned for it however during the
00:26:33.539 process the team noticed many
00:26:35.100 possibilities for improving the process
00:26:36.779 now in critical cases we did make
00:26:38.760 changes to the 2020 system but in the
00:26:41.039 majority of cases we decided to delay
00:26:42.900 improvements until the 2030 system
00:26:45.600 so shortly after the end of the 2020
00:26:47.700 process we had a series of meetings
00:26:50.159 discussing various possible changes to
00:26:52.740 the system for 2030 and before the end
00:26:55.080 of 2020 I had already started making
00:26:56.880 some internal changes to fix the
00:26:59.159 unhandled exceptions that occurred
00:27:00.840 during the 2020 process
00:27:02.820 I added support to Sequel's nested
00:27:04.919 attributes plugin to handle the issues
00:27:06.779 exposed during the 2020 process so that
00:27:09.360 they would not raise exceptions
00:27:11.340 to handle the race conditions that we
00:27:13.320 experienced during the 2020 process I
00:27:15.240 added a code injection framework so that
00:27:17.340 I could simulate the race conditions
00:27:18.840 during the tests and then made changes
00:27:21.179 to avoid the race conditions in addition
00:27:23.580 to fixing the five separate race
00:27:24.659 conditions that resulted in exceptions
00:27:26.760 during the 2020 process I audited all
00:27:29.460 the other routes in the system for
00:27:30.779 similar conditions and found an
00:27:32.760 additional race condition which I also
00:27:34.380 fixed
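The talk does not describe how the code injection framework was designed, so the following Ruby sketch is only one hypothetical way to simulate a race condition in a test. All names here (`FakeTable`, `Registration`, `before_insert_hook`) are invented for illustration, and an in-memory set stands in for a database table with a unique constraint.

```ruby
require "set"

# Stand-in for a table with a unique constraint (assumption: the real
# systems rely on database constraints, not an in-memory set).
class UniqueViolation < StandardError; end

class FakeTable
  def initialize
    @emails = Set.new
  end

  def exists?(email)
    @emails.include?(email)
  end

  def insert(email)
    raise UniqueViolation, email unless @emails.add?(email)
  end
end

# Hypothetical injection point: a test can assign a callable that runs
# between the check and the insert, simulating a concurrent request.
class Registration
  class << self
    attr_accessor :before_insert_hook
  end

  def self.create(table, email)
    return :already_exists if table.exists?(email)
    before_insert_hook&.call
    table.insert(email)
    :created
  rescue UniqueViolation
    # The fix: treat a lost race like the ordinary duplicate case
    # instead of letting the constraint violation go unhandled.
    :already_exists
  end
end

table = FakeTable.new
Registration.before_insert_hook = -> { table.insert("a@example.com") }
result = Registration.create(table, "a@example.com")
Registration.before_insert_hook = nil
puts result
```

Without the `rescue` clause, the injected insert would make `create` raise, which mirrors how the constraint caught the race in production but surfaced it as an unhandled exception.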
00:27:35.460 now to prevent the unhandled exceptions
00:27:37.260 due to ASCII NUL bytes I modified Rodauth
00:27:39.600 to handle this case I also added
00:27:41.580 similar handling in Roda and Sequel
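As a rough illustration of what handling NUL bytes can look like, the Ruby sketch below rejects any submitted parameters that contain one. It does not show the actual Rodauth, Roda, or Sequel changes: `contains_nul?` and `status_for` are hypothetical helpers, and the choice of a 400 response is an assumption. The underlying problem is that PostgreSQL text values cannot contain NUL bytes, so passing such input through to a query raises an error.

```ruby
# Hypothetical helper: returns true if any submitted parameter
# (including keys and values in nested hashes and arrays) contains
# an ASCII NUL byte.
def contains_nul?(value)
  case value
  when String then value.include?("\0")
  when Hash   then value.any? { |k, v| contains_nul?(k) || contains_nul?(v) }
  when Array  then value.any? { |v| contains_nul?(v) }
  else false
  end
end

# Hypothetical handler: answer with 400 Bad Request instead of letting
# the value reach the database and raise an unhandled exception.
def status_for(params)
  contains_nul?(params) ? 400 : 200
end

puts status_for({ "login" => "user\0name" })
puts status_for({ "login" => "username" })
```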
00:27:44.279 my goal in the 2030 system is to have
00:27:46.320 100% line and branch coverage we started
00:27:49.200 off with a hundred uncovered branches
00:27:50.820 and over a two-week period we covered
00:27:52.919 all hundred branches finding and fixing
00:27:55.320 four previously undetected bugs in the
00:27:57.600 uncovered branches so I'll say it again for
00:27:59.820 emphasis 100% line and branch coverage
00:28:01.919 means nothing but less than 100 percent
00:28:04.260 that means something
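The difference between line coverage and branch coverage can be seen with Ruby's standard `coverage` library. The sketch below is only an illustration with a made-up `label` method, not code from the redistricting systems: it writes a tiny file whose only conditional is a one-line ternary, runs it once, and counts what was left uncovered.

```ruby
require "coverage"
require "tempfile"

# A one-line ternary: running the line once fully covers it by line
# coverage, but only one of its two branches is ever taken.
source = <<~RUBY
  def label(submitted)
    submitted ? "submitted" : "draft"
  end
  label(true)
RUBY

file = Tempfile.new(["coverage_demo", ".rb"])
file.write(source)
file.close

# Coverage only tracks files loaded after it has been started.
Coverage.start(lines: true, branches: true)
load file.path
_, data = Coverage.result.find do |path, _|
  File.basename(path) == File.basename(file.path)
end

# Lines that cannot be executed (such as `end`) are reported as nil.
uncovered_lines = data[:lines].compact.count(0)
uncovered_branches = data[:branches].values.flat_map(&:values).count(0)

puts "uncovered lines: #{uncovered_lines}"
puts "uncovered branches: #{uncovered_branches}"
```

A bug in the `"draft"` branch would be invisible to line coverage here, which is the kind of gap that branch coverage is meant to expose.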
00:28:06.360 another goal for 2030 which I have not
00:28:08.400 yet implemented is to get to 100% line and
00:28:10.799 branch coverage for all views in the
00:28:12.960 system
00:28:13.860 now the 2020 system used chroot and
00:28:16.980 privilege dropping to limit the file
00:28:18.960 system access to the application
00:28:20.880 directory
00:28:22.200 for the 2030 system I switched to using
00:28:24.960 unveil which is an OpenBSD-specific API
00:28:27.779 for more granular file system access
00:28:29.820 limiting this makes it so that only a
00:28:32.460 small portion of the application
00:28:34.020 directory is accessible at runtime
00:28:36.179 instead of the entire application
00:28:37.799 directory being accessible additionally
00:28:40.380 switching to unveil made it so
00:28:42.539 the application did not need to be
00:28:43.740 started as root so privilege dropping
00:28:45.900 was no longer needed
00:28:47.940 by continuing to maintain and improve on
00:28:50.640 the 2020 system I think the 2030 system
00:28:52.860 will be just as much of a success I hope
00:28:55.860 you had fun learning about
00:28:56.880 redistricting in California and how Ruby
00:28:59.039 has helped the redistricting process in
00:29:00.720 the past and will continue to help the
00:29:02.340 redistricting process in the future
00:29:04.980 if you enjoyed this presentation and
00:29:06.360 want to read more of my thoughts on Ruby
00:29:07.919 programming please consider picking up a
00:29:09.659 copy of Polished Ruby Programming and
00:29:12.000 that concludes my presentation I'd like
00:29:13.740 to thank you all for listening to me I
00:29:16.020 think I have about one minute if anyone
00:29:18.299 has questions
00:29:20.580 questions
00:29:24.140 thank you