Big Data and the Coming Golden Age of Humanity
Summarized using AI

by Byron Reese

In his talk "Big Data and the Coming Golden Age of Humanity," presented at LoneStarRuby Conf 2013, Byron Reese explores the transformative potential of big data driven by the proliferation of inexpensive sensors. The main theme is how this data collection could enhance humanity's decision-making by creating a collective memory that lets us learn from every cause and effect instead of learning and forgetting the same lessons over and over.

Key Points Discussed:
- Evolution of Data Handling: Reese outlines how the costs of storing, moving, and calculating data are rapidly approaching zero. Collecting data will also become nearly free as cheap sensors proliferate.
- Digital Echo: Everyday actions create a "digital echo," such as credit card transactions or phone call logs. This information could be logged comprehensively to build a detailed account of individual lives.
- Collective Memory: With pervasive data collection, humanity could establish a robust collective memory. This would allow for improved learning across generations, eliminating the need to relearn information over time.
- Interconnections in Data: Reese argues that unseen connections often exist within vast bodies of data. For example, the serendipitous repurposing of the antidepressant Wellbutrin as the smoking-cessation drug Zyban shows how users' observations can reveal effects the drug's makers never intended.
- Future Applications: The talk presents hypothetical scenarios, such as researching the effects of diet on health, where vast data pools enable breakthroughs in understanding complexities of life and improving well-being through personalized recommendations.
- Technological Assistance: Reese anticipates a future where insights from data guide our decisions, making us wiser than previous generations. He addresses the implications this would have on various sectors including medicine, education, and beyond, fundamentally changing our approach to solving societal problems.
- Trust and Privacy: Despite potential concerns, Reese is optimistic that if people feel their privacy is safeguarded, they will willingly share their digital information anonymously for the greater good.

In conclusion, Reese argues that soon we may reach a point where we can harness vast data to understand the intricate connections between actions and outcomes, ultimately leading to more informed decision-making and problem-solving strategies for global challenges. He believes this marks the dawning of a new phase in human history where knowledge becomes abundantly accessible and impactful.

00:00:16.160 Thank you so much for that introduction. I first heard of Ruby in 2007 when I was asked to become an adviser for a rapid development company that was using Ruby in the Bay Area. I was really impressed with the technology, but even more so with the people I discovered using it. As I got into that role, Ruby, with its emphasis on convention over configuration, struck me at the time as a seismic shift in development languages. I come from the world of programming tools; the older ones among you might remember. I was with a company called Rogue Wave, where we made C++ application frameworks and various tools. I kind of came out of that world and stumbled upon Ruby. In those days, back in the late '90s, I was concerned about two trends that seemed quite at odds with each other. One was that applications needed to become more robust and futuristic, while websites were trying to do more and more.
00:00:45.360 Individually, that's fine, but superimposed on that were massively shortened development cycles. I wondered which of those two imperatives would break first. It turns out neither did, because the languages and environments solved for both of them. In its soul, Ruby is a language for the web, for the cloud, and since it predates the cloud, we can view its arrival as prophetic. Many things that are ahead of their time never live to see their time; think of Da Vinci and his drawings of helicopters, for instance. But this was not the case with Ruby; even in the late '90s, you could see the future form and outline at that distance. I knew it would play a huge role in the epic drama that would unfold in the next decade.
00:01:16.320 However, more than being impressed by the language itself, I was struck by the people who were adopting it. They were not just your wide-eyed, suspender-wearing, optimistic theorists excited by some shiny new toy; they were individuals who had jobs to get done and who wanted to perform those jobs as best they could, with as little fuss as possible. Ruby's intentional promotion of programmer happiness indicated that it was a technology not optimized for machines, but for humans. If these individuals liked it, I had a good feeling about its prospects. Yet, they were not merely pragmatists; the Ruby programmers I knew half a decade ago had one eye squarely on the future.
00:01:42.040 In that spirit of looking ahead while staying grounded, they influenced the thoughts I wanted to share with you today. My comments will focus on the future, data, and how it will change the world. I want to give you a glimpse into a world I believe you will all live to see. I have faith that this audience, unlike any other technical crowd, will walk that tightrope with me — that balancing act of envisioning the future as big as it can be while never venturing into mere speculation or wishful thinking. I intend to show you a future that is both inexplicably impossible and yet inevitable, and I would like your feedback on whether I've achieved that.
00:02:47.479 There is a drug known as Wellbutrin that is prescribed for depression and has a long and successful clinical history. Some years ago, users taking Wellbutrin noticed that their cravings for cigarettes decreased. This led the makers of the drug to test it and find that it was, in fact, a highly effective smoking-cessation aid. They then repackaged the exact same drug under the brand name Zyban, specifically prescribed to help people quit smoking. Now, think about that for a moment: nobody ever contemplated Wellbutrin as an anti-smoking drug. It only became one because users of the medicine happened to notice this side effect. It raises a question: how many unseen connections exist in the universe, causes and effects that we cannot perceive because they're buried in the vast data populating our world? I believe they are limitless.
00:03:23.880 Consider another example: schools with fluorescent lighting. Children in schools lit by fluorescent lights have fewer cavities than those in schools with incandescent lighting. Why is this? It turns out that fluorescent lighting increases the body's production of saliva, which combats tooth decay. The subtle interplay of everything in the universe with everything else is dramatically more complex than we can comprehend. Our ability to find cause and effect within it is highly suspect. We often stumble upon causal associations almost by accident, much like someone finding the scientific equivalent of 'your chocolate is in my peanut butter.' Nevertheless, I believe this state of affairs will change due to the convergence of four trends.
00:04:02.720 The first three of these trends are well-known in this room: the cost of storing data will go to zero, the cost of moving data will go to zero, and the cost of performing calculations on data will go to zero. The fourth is a bit more subtle; it involves a reduction to essentially zero in the cost of collecting data in digital form. Presently, if you think about it, there is a minuscule amount of data available in digital form. I'm not referring to simply converting paper-based medical records into digital format; I'm talking about all the trillions of pieces of data created every second on this planet. Soon, all of this data will be collected passively due to the widespread proliferation of cheap sensors whose costs are also decreasing rapidly.
00:04:56.600 This covering of the planet with sensors is not being orchestrated by any central nefarious organization; it is already happening because of the tremendous improvements that data can bring to our lives. For example, a camera mounted on my house that scans the faces of everyone who comes and knocks on my door to check if they have a criminal record is inherently useful. Additionally, it passively collects data. Similarly, a skillet that detects botulism and remembers everything you've ever cooked in it is essentially useful by itself and generates data. Soon, the cost of collecting this data will drop to zero; before long, it will become ubiquitous. The impact of this, combined with the other three trends, will be profound.
00:05:56.600 You see, already, as you pass through modern life, you leave a digital echo of what you do—a picture of who you are, where you are, and what you are doing. More and more of your life leaves such an echo and will continue to do so. For example, you can see it today when you fill up your car at a gas station; when you pay with a credit card, that records where you were and what you were doing at that time. Even though abbreviated, it is revealing of your comings and goings. Your cell phone already logs every person you call and how long you talk to them. Take a moment to imagine if this practice is taken to its logical extreme: what if everything you do is recorded?
00:06:47.640 What if it were all logged? Let's explore this: imagine every word you said was written down and transcribed for your own personal reference. This would be incredibly useful; you would never have to try to remember what you promised to a client last Friday – you could simply look at the transcript. Next, imagine GPS accurately tracking everywhere you go, even within your own home. Then, imagine not just that you visited a certain address, but you went to a movie theater and, based on where you sat, you saw Episode Three of Star Wars. Consider watching TV, flipping through channels—every channel you pause on, every channel you watch, and every channel you return to would all be perfectly logged. In the kitchen, that pan logs everything you cook in it; every piece of silverware you will own in the future will recognize you as the person holding it and calculate the calories and nutritional value of every bite you consume. Your house would be aware of every room you are in and what you are doing in it.
00:07:55.840 Imagine everything being recorded: every single thing you buy, every meal you order, every restaurant you visit, every word you type, every book you read, the timing of your sprinkler, the last time you went to the dentist. Everything you see, everything your eyeball tracks, and not just what your eyeball tracks but how long you looked at it and your physiological response: did your muscles tighten? Did you smile? Picture if it were all recorded: every breath you take, every bite you eat, every heartbeat. The concept of a real digital echo of your life may seem daunting; love it or hate it, this is the inevitable direction that technology is taking us. This logging is a byproduct of all the countless things we want computers to assist us with.
00:09:05.440 It is profound because if we can actually collect all the data our lives generate, we will have created a collective memory for the planet—a record of every cause and effect. This would be a monumental accomplishment for humanity. This isn’t merely the end of forgetting; it means that for the first time, humanity can possess a robust collective memory where we don’t have to learn and forget, learn and forget, over and over again across eons. We could actually learn things and remember them, allowing everyone's life experiences—every action and its outcome—to become data that improves everyone else's life.
00:10:05.360 Now, let’s take a moment to discuss privacy. I am genuinely interested in the innovations we can develop on the internet that have no offline counterpart. These are the things that teach us about ourselves. I already knew people wanted to sell the stuff in their attic or send money to people before eBay or PayPal came along because the offline world had already created garage sales and Western Union. But services like Twitter reveal to us aspects of ourselves that we didn't know. Overall, the things we are building on the internet reflect positively on us. We are creating tools to connect with each other, share information, collaborate, and support one another.
00:10:28.880 Now, I recognize there are numerous nefarious uses of the internet. Nonetheless, on the whole, we are developing it for positive purposes, and one surprising revelation is how willing people are to expend time and energy helping total strangers—people they will never meet. You can see this when someone posts a personal problem online and, unexpectedly, people take the time to write extensive responses, all for a complete stranger. It is heartening to think that we have a natural desire to help one another. So, I believe if people feel their privacy is safeguarded, they will be willing to contribute their data to the greater good.
00:11:27.400 You already see this happening; when I buy a book on Amazon, I'm aware they use that information to anonymously suggest that book to others. Over time, people will realize they're contributing to a comprehensive knowledge base that records every action of every member of humanity, and this information will be used to enhance lives meaningfully. Let's assume most people do share their digital echoes—this complete experiential logging of their lives—on an anonymous basis, and that we have the ability to ask it questions that have never been asked before, receiving instant answers.
00:11:56.920 We could begin to uncover connections between seemingly unrelated data. For example, what if it turned out that people who drink a certain type of coffee have, on average, higher incomes and those who switch to that coffee see a rise in income? What if it emerged that another specific group of individuals participating in diverse activities report greater happiness than their peers? Or imagine if gasoline produced at a particular refinery burns cleaner than others. Questions could arise about why traffic jams are fewer in certain cities compared to comparable ones, or why certain schools have lower dropout rates than demographic matches elsewhere. Let’s dive a bit deeper into one of these hypothetical scenarios, say with medical implications.
00:12:49.840 What if everyone treated for breast cancer were willing to share their digital echoes—anonymously logged information about their genome and every aspect of their daily life, including everything they ate and every exercise they did? Breast cancer, like many diseases, has patients who improve and others who worsen, but we often lack clarity on why. Imagine a computer analyzing this vast and inconceivably large amount of data, pulling out patterns—finding, for instance, that individuals who eat radishes tend to recover more often than those who do not. Of course, we might not know if radishes actually contribute to better health, or if some underlying factor that drives the craving for radishes is the true cause.
00:14:36.199 Now, digging deeper, what if we found that this trend is localized to certain geographical regions? The data points to pockets where the effectiveness of radishes is significantly higher, while elsewhere, it's non-existent. Analyzing the data would reveal that most recovering patients sourced their radishes from stores supplied by specific farms. Further analysis could reveal that these farms utilized a certain pesticide containing a particular chemical component. In such cases, we would see that brilliant doctors strive to uncover patterns, leading drug companies to commission studies to determine whether that chemical in radish pesticides could potentially treat breast cancer.
00:15:54.000 These companies would run clinical trials and discover that the treatment is effective in a majority of cases, less so in a few, and determine that it works best for individuals with a specific genetic marker. What an extraordinary moment that would be! Suddenly, we gain knowledge about a treatment applicable under specific criteria, which could be transformative. However, findings like this remain unattainable today because our current ability to decipher the complex interactions between everything on the planet is limited. Currently, we rely on happenstance and luck, often stumbling upon insights that could guide us, making serendipitous discoveries.
00:17:07.320 This evolution can redefine our approach. Instead of formulating a hypothesis, constructing experiments, and retesting our ideas, we will treat all of history as an expansive experiment from which we can glean meaningful insights. The world is saturated with unexplainable anomalies. For example, why do individuals who win Academy Awards statistically live longer than those who are nominated but do not win? Why is it that first basemen tend to live statistically longer lives than other players on their team? Research suggests tall individuals have shorter lifespans; is this because they tend to fall more, or does their greater height play a role?
00:18:06.560 These questions might have latent explanations, and while they may not lead to actionable insights, they illustrate an underlying principle. Scientists will uncover subtle associations between diet, behavior, genetic dispositions, and other factors that will ultimately lead to insights into how to live longer, happier lives. I suspect many of us won't need to absorb all of this information manually; instead, systems will tailor recommendations for us based on our unique data and life experiences. If the system determines that individuals with my profile experience more happiness and health with certain nutrients, a custom vitamin formulation could emerge to monitor and optimize my well-being.
00:19:12.880 This technology offers an unprecedented level of awareness and optimization. It is the end of relying on anecdotal evidence, where one might say, 'Aunt Martha takes a teaspoon of honey every morning and hasn't been sick in years; I should do the same.' Instead, we could comprehensively understand outliers; for instance, why do some people reach the age of 113, often while having habits that seem antithetical to longevity? Are there unique genetic factors at play? Do certain mental exercises stem from exposure to a specific chemical through routine activities? Understanding these nuances may provide invaluable guidance.
00:20:08.480 We will learn how to align our lifestyles to maximize happiness and longevity, and while preferences will differ—some may opt for deep-fried butter on a stick—we will finally have the option to make data-driven choices. This future will unfold gradually, with more people participating, more data passively collected, and the collective repository growing. As we gather successes and insights, this momentum will accelerate further data collection. Once this ball starts rolling, it will quickly gain speed, bringing about a scientific revolution that can be accessed not only by trained scientists but by anyone.
00:21:38.840 Individuals will develop hypotheses, consult the machine for answers, and refine their inquiries until they uncover causal relationships. This approach allows us to view the whole world as a vast experiment, where increasingly intelligent specialists will perform even more complex tasks, resulting in massive discoveries—not only in health but in fields such as crime patterns. For instance, what makes some homes more prone to burglary? Why are certain schools achieving higher graduation rates than others? I find comfort in knowing that the answers to society's most challenging issues can be traced in the data.
00:22:57.680 Most of our hardest problems can be framed as technical problems: hunger, disease, and poverty are all inherently solvable through technical means. Thus, I believe we will soon be wiser than anyone who has ever lived, and we will tackle what I refer to as the 'lasagna problem.' Before explaining that, I want to briefly discuss our history with knowledge and information and the profound costs associated with mistakes. Take Archduke Franz Ferdinand, who was assassinated by a 19-year-old named Gavrilo Princip. The driver of the Archduke's car made a wrong turn, and that one wrong turn directly instigated World War I, which in turn led to the Great Depression and a global rise in fascism, eventually culminating in the Cold War and the atomic bomb's deployment.
00:24:06.039 Throughout this time, it is estimated that around 80 million people died—a staggering death toll owing to one man's single wrong turn. This exemplifies the critical nature of knowledge; historically, knowledge has often been fragile. From a past perspective, the totality of knowledge was confined to your village—perhaps 300 or 400 people. If someone in a neighboring village held valuable information, it didn't have much bearing on you since it was effectively non-existent.
00:25:16.960 For humanity, the sum of knowledge was bound to perhaps a few hundred individuals. Moreover, knowledge was exceedingly fragile; it died with the person who held the information unless they shared it, which could lead to misremembering. We learned things, and then upon someone's passing, those learnings would be lost. Progress was made as the world developed, leading to a significant leap forward with the invention of the modern book. This innovation, paired with its widespread dissemination at relatively low costs, made ideas persistent; they began to outlive their creators and could travel, making them more accessible and widespread.
00:26:24.280 However, this also birthed problems. For instance, it led to the establishment of libraries, which highlighted a challenge I call 'the truth is out there' problem. Essentially, if the answer to your question exists within a library but you can't find it, it becomes as good as non-existent. This conundrum beckons us to examine the origins of search engines.
00:26:56.240 The first search engine dates back to around 500 BC in Greece, where an Oracle resided. The famous Oracle at Delphi had many followers, including King Croesus of Lydia. He wanted to test the Oracle’s legitimacy, so he selected seven trusted individuals and sent them to seven different oracles. On a designated date, they were instructed to ask, 'What is King Croesus doing right now?' Because he had told no one what he would be doing, this served as a test of the Oracle's capabilities.
00:28:04.720 Interestingly, according to history, the Oracle at Delphi was able to describe Croesus at that very moment, noting he was in the process of preparing his favorite goat and turtle stew. This remarkable accuracy brought proliferation and funding to the Oracle’s practices. Unfortunately, there were some drawbacks; the Oracle was only operational in spring and summer, which, by our standards today, signifies uptime difficulties. Moreover, seekers often had to undertake significant journeys to consult the Oracle, and its responses were frequently cryptic, leading to potential misinterpretations.
00:29:07.040 For example, when Croesus asked whether he should engage the Persians by crossing a river, the Oracle ambiguously responded that crossing the river would result in a mighty empire's downfall. It wasn't until after the crossing that he realized it was his own empire that would fall. Fast forward to the future; within the Star Trek universe, we’re introduced to a computer that serves as an Oracle on the Enterprise, answering questions directly without hesitation or ambiguity.
00:30:10.080 In this future-centric narrative, our notion of oracles and knowledge acquisition evolves. No longer are we seeking advice; instead, we ask for data and factual information. This leads me to discuss what I refer to as the 'lasagna problem.' Imagine asking for recommendations for Italian restaurants while in an unfamiliar city. In the past, your only resource would have been a Yellow Pages listing, offering minimal guidance on quality or personal preference because you had no real knowledge about the options available.
00:31:47.720 Then, the internet emerged, providing not just data but also knowledge through platforms like Yelp. While this made decision-making more informed, it was still based on anecdotal evidence and personal bias. Today, with developing technologies, we can automatically apply unique values to our choices. Imagine if an advanced system analyzed all your friends' restaurant selections, checking which places were frequented and enjoyed, filtering through complex variables you might never even consider.
00:32:54.680 As a result, this system would be capable of making sophisticated recommendations that precisely reflect your preferences and historical choices. The next day, it would evaluate whether you enjoyed that recommended restaurant and refine its suggestions. Over time, the algorithms behind this process will learn from your interactions, leading to increasingly accurate recommendations. Of course, the extent of the system’s influence depends on your willingness to trust and act on its suggestions.
00:34:03.680 Doctors and other professionals experience a similar trust-based relationship, relying on the accumulated knowledge of their field. Choosing a dining establishment may not seem as monumental as deciding on a college, but like that restaurant choice, educational decisions will be guided by the collective experiences of countless individuals. This type of data will illuminate trends, allowing one to better assess which colleges yielded desirable outcomes and what variables contributed to those results.
00:34:42.700 The ability to analyze vast amounts of complex data will facilitate well-informed decisions. Even though knowing what to do doesn't guarantee action, it can provide insights that overshadow the risks of ignorance—like a mere wrong turn leading to the death of millions. This future, one where we embrace these technologies and the exponential gathering of information, is not just imminent; it is unfolding before us. The process is already beginning to gather momentum as we passively collect data about ourselves.
00:36:03.760 Now, I appreciate that it might be challenging to envision a world transformed by such seemingly insignificant changes, but consider the Industrial Revolution. Someone once proposed that breaking tasks into smaller components could lead to increased efficiency and innovative manufacturing; a notion hard to perceive at the time but proven revolutionary in its impact. Similarly, our upcoming advancements in data utilization promise to yield profound change. You will witness the day the world evolves. Indeed, we live at a pivotal time in civilization, and you will be able to tell future generations that you were present at this moment of transformation.
00:36:30.920 Thank you very much.