We have been fragilizing the economy, our health, political life, education, almost everything… by suppressing randomness and volatility. Just as spending a month in bed (preferably with an unabridged version of War and Peace and access to The Sopranos’ entire eighty-six episodes) leads to muscle atrophy, complex systems are weakened, even killed when deprived of stressors.
-Nassim Taleb, Antifragility (draft of prologue)
Such protectionist policies enforce stability at the cost of stifling both resilience and progress. They eliminate the checking process essential to trial-and-error learning, the way by which we identify the “failures” that new forms might correct.
-Virginia Postrel, The Future and Its Enemies
Google’s server architecture is very robust against failures. The quality of the company’s products, and their bottom line, depend on their ability to process enormous amounts of data without interruption and with a low risk of losing any of it. The danger is not hypothetical–companies have been wiped out because some freak accident they were unprepared for destroyed a large fraction of the data they relied on.
Steven Levy’s book on Google makes it clear that they were forced to become robust by their circumstances. Most companies at the time would pay for expensive, high-end servers that had a very low rate of failure. Google did the opposite–they went for inexpensive servers with an extremely high rate of failure. In order to survive, they had to create software for their servers that would preserve their data and keep their workflow from being interrupted even as servers failed left and right.
Google owes their resilient infrastructure to the fragility of their early servers.
In an active quest for resilient infrastructure, Netflix imposed disorder by design upon their servers.
Imagine getting a flat tire. Even if you have a spare tire in your trunk, do you know if it is inflated? Do you have the tools to change it? And, most importantly, do you remember how to do it right? One way to make sure you can deal with a flat tire on the freeway, in the rain, in the middle of the night is to poke a hole in your tire once a week in your driveway on a Sunday afternoon and go through the drill of replacing it. This is expensive and time-consuming in the real world, but can be (almost) free and automated in the cloud.
This was our philosophy when we built Chaos Monkey, a tool that randomly disables our production instances to make sure we can survive this common type of failure without any customer impact. The name comes from the idea of unleashing a wild monkey with a weapon in your data center (or cloud region) to randomly shoot down instances and chew through cables — all the while we continue serving our customers without interruption. By running Chaos Monkey in the middle of a business day, in a carefully monitored environment with engineers standing by to address any problems, we can still learn the lessons about the weaknesses of our system, and build automatic recovery mechanisms to deal with them. So next time an instance fails at 3 am on a Sunday, we won’t even notice.
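To make the idea concrete, here is a minimal sketch in Python of the kind of drill the quote describes. It is a toy simulation, not Netflix's actual tool: the function names, the kill probability, the minimum number of healthy instances, and the assumption that failed instances are replaced before the next round are all hypothetical choices made for illustration.

```python
import random


def chaos_round(instances, kill_probability, rng):
    """Return the instances that survive one round of random termination.

    Each instance is disabled independently with probability
    `kill_probability` (a hypothetical parameter for this sketch).
    """
    return {name for name in instances if rng.random() >= kill_probability}


def service_available(survivors, min_healthy):
    """The service counts as up if at least `min_healthy` instances remain."""
    return len(survivors) >= min_healthy


def run_drill(instances, kill_probability, min_healthy, rounds, seed=0):
    """Run several chaos rounds and count how many would have caused an outage.

    Assumption: every failed instance is replaced before the next round,
    so each round starts from the full fleet.
    """
    rng = random.Random(seed)
    outages = 0
    for _ in range(rounds):
        survivors = chaos_round(instances, kill_probability, rng)
        if not service_available(survivors, min_healthy):
            outages += 1
    return outages


if __name__ == "__main__":
    fleet = {f"instance-{i}" for i in range(10)}
    outages = run_drill(fleet, kill_probability=0.2, min_healthy=5, rounds=1000)
    print(f"{outages} of 1000 simulated rounds would have taken the service down")
```

The count of outages is the feedback. If it is above zero, the fleet is too small or too tightly coupled for the failure rate being simulated, and it is better to learn that in a drill than at 3 am on a Sunday.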
Netflix understands that failure is feedback. Until something goes wrong, they won’t be able to figure out what problems exist in their ability to cope with failure. So rather than resting on their laurels, they put themselves through a constant trial by fire to force themselves to be ready and improve their system. It is no different than getting small doses of a disease or poison in order to build an immunity, or working your body out above and beyond the demands your life makes on it in order to increase its fitness. There are many things in human life where stressors are a prerequisite for improvement–or simple maintenance.
Yet stressors are precisely what we seek to hide from in the world of policy. It is my contention that we are too terrified of short term risk and volatility in this country. Rather than embracing Chaos Monkeys of our own, we simply keep a spare in the back of the car and assume everything will go well if we ever have a flat. The only way to grow stronger, wealthier, and more resilient in the long run is to expose ourselves to a lot more risk and volatility than we have lately shown a willingness to cope with.
Deafening Ourselves
It’s not my purpose to single out the environmental movement, but that does embody a certain mentality about risk that has become so tied up in intellectual knots that it has the net long term effect of making things more risky. It is my thesis that a small number of people have to be willing to shoulder greater risks in order to create changes that eventually reduce risk for civilization as a whole.
Stephenson’s point about risk is part of his larger argument that innovation in this country has stagnated, a view he shares with Tyler Cowen and Peter Thiel, among others. Putting his general conclusion to the side, I think the importance he places on at least some subset of the population needing to shoulder more short term risk to reduce overall long term risk is absolutely true.
Instead, we take measures to “manage risk”, deafening ourselves to feedback in the process.
For example, there are risks associated with allowing people to build what they want on the property that they own. They could introduce something that disrupts the neighborhood, either by taking up all the parking, or making noise, or both. So we have zoning laws, building permits, and various business licenses. As a result, real estate supply cannot respond to the massive demand for city living, and prices skyrocket.
Moreover, fewer business experiments are possible when everything has to fit a cookie-cutter business license. In Fairfax County, Virginia, a small theater had to wait nearly a year to open because the county had never had a theater before and wasn’t sure how to license one. That’s an enormous opportunity cost to impose on an operation of that size.
The political process through which license or zoning categories can be changed, and permits are issued, is extremely slow to respond to changes on the ground. While a more open system would hear the demand for denser development as loud as a scream, we’re so busy protecting ourselves from short term disruptions that we have essentially left ourselves deaf to it, and to all the potential beneficial innovations that could have happened.
This is no academic point; the toll of this aversion can be measured in wealth as well as lives. Nothing is more emblematic of our attitudes towards risk than the 12-year, multimillion-dollar process that new drugs must go through before the FDA allows them to go to market. This lag has led to countless unnecessary deaths, not to mention making new drugs enormously more expensive once they finally do reach the market. And the ability of the FDA trials to even truly keep us safe is questionable–the data are not really random, and any effect that might seem small for a sample of thousands might nevertheless affect a huge number of people once it hits a market of millions.
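The gap between a sample of thousands and a market of millions can be put in rough numbers. The sketch below is back-of-the-envelope arithmetic in Python, not a model of any real trial: the side-effect rate, the trial size, and the market size are hypothetical figures chosen only to show the scale of the problem, and it assumes each patient is affected independently at the same rate.

```python
def prob_no_cases(rate, sample_size):
    """Probability that a trial of `sample_size` patients shows zero cases
    of a side effect that strikes each patient independently with
    probability `rate`."""
    return (1.0 - rate) ** sample_size


def expected_cases(rate, population):
    """Expected number of affected people once `population` take the drug."""
    return rate * population


if __name__ == "__main__":
    rate = 1 / 10_000        # hypothetical: 1 patient in 10,000 is harmed
    trial_size = 3_000       # hypothetical trial enrollment
    market_size = 5_000_000  # hypothetical number of eventual patients

    miss = prob_no_cases(rate, trial_size)
    harmed = expected_cases(rate, market_size)
    print(f"Chance the trial sees no cases at all: {miss:.2f}")
    print(f"Expected cases across the market: {harmed:.0f}")
```

Under these assumed numbers, the trial would most likely never see the side effect at all, yet hundreds of people would be expected to experience it once the drug is widely used.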
The bottom line is that there are things that cannot really be known until you take the drug to market. Doctors should have to perform their due diligence of informing patients of the risks and unknowns, but delaying entry by over a decade and piling on enormous costs accomplishes very little. Unless your goal is to drastically reduce the number of new treatments we are capable of discovering per year.
We put off the short term risks and increase our long run costs.
Ditch Stability
The economy, politics, and job market of the future will host many unexpected shocks. In this sense, the world of tomorrow will be more like the Silicon Valley of today: constant change and chaos. So does that mean you should try to avoid those shocks by going into low-volatility careers like health care or teaching? Not necessarily. The way to intelligently manage risk is to make yourself resilient to these shocks by pursuing those opportunities with some volatility baked in. Taleb argues— furthering an argument popularized by ecologists who study resilience— that the less volatile the environment, the more destructive a black swan will be when it comes. Nonvolatile environments give only an illusion of stability
-Reid Hoffman and Ben Casnocha, The Start-up of You
We need more risk and volatility, and we need to give up our fruitless quest to hide from them.
In many ways this quest reflects a lack of historical perspective. We bail out the US automakers again and again because they were once the symbol of American greatness, and we think that once they are gone we will never shine again. Yet we forget that at the turn of the 20th century, 41 percent of our labor force was employed in agriculture, and at the end of it, it was down to less than 2 percent. We have undergone massive sectoral shifts before. There is no guarantee that it will go as well this time, but there’s also no reason to think that it won’t.
We restrict immigration and imports because they pose an immediate risk to specific workers and businesses in the short run. Yet we forget that during periods of far more open immigration and trade, we experienced historically unprecedented levels of growth. Moreover, opening these channels opens us to feedback–from the ideas, new business models, and scientific and technological breakthroughs that are occurring worldwide, and that might occur here if we allowed more people to come.
We should not be focusing our efforts on fighting risk and volatility, but on fighting fragility. We should fight for feedback.
It is only in the face of volatility that we are able to innovate and grow resilient.