Science can follow more than one path
If Albert Einstein had never been born, would others have discovered relativity in his stead? Or was his contribution so singular that it would take a very long time for others to do the same?
I’ll leave the question of Einstein to a historian, but today I want to think about the more general question of contingency in science. How much does any particular scientist or group matter? On the one hand, nature is nature, and perhaps it is inevitable that any group of people dedicated to learning her secrets would find them, perhaps in roughly the same order. On the other hand, there is quite a lot of nature out there to learn about; maybe different people will never really end up exploring the same questions.
One strand of evidence on this comes from the history of multiple simultaneous invention. I discuss that literature in some detail here. My basic conclusion there is that the chances of multiple independent discovery are reasonably high for any discovery that people believe, in advance, will be important, since these ideas end up getting a lot of “shots on goal.” But for all the rest, the chances of multiple independent discovery are pretty low: if you have an idea but don’t end up getting around to it, the probability someone else will do so isn’t very high. Taken together, there is some level of redundancy in the big important ideas, but only if there is a consensus in advance on what those will be. All the other details and the ideas that we see are important only after the fact probably don’t have much redundancy.
But there are two reasons to be cautious about leaning too heavily on evidence from multiple independent discovery.
First, rather than reflecting low redundancy in innovation, low levels of simultaneous discovery might be the outcome of scientists/inventors dividing up the intellectual landscape to avoid incursions into the “territory” of their rivals. In practice, scientists are aware of the different specializations and interests of other labs, and may well eschew work in those areas to avoid being scooped. But that doesn’t mean they are incapable of discovering ideas in those areas; they just choose to avoid them. It may be this avoidance that drives low rates of simultaneous discovery, but low rates of simultaneous discovery don’t actually imply low redundancy in innovation.
Second, even if this isn’t the case, calculations that extrapolate from simultaneous discovery assume the probability of getting scooped is constant over time. Maybe that’s wrong. Maybe, as more knowledge around an area gets filled in, it gets increasingly likely we’ll make a discovery we might otherwise have missed. On the other hand, maybe the opposite is true; as science and technology move on, it might become increasingly unlikely that we’ll make a missed discovery.
Fortunately, we can get some complementary evidence from alternative literatures that essentially look at “divergent paths in the history of science.” To me, these also suggest contingency is important in science.
Suppose scientists proactively choose to avoid topics they believe have been informally claimed by others, even though they are perfectly capable of making the same discoveries in that topic. This has quite different implications for the contingency of science. For example, suppose a scientist has informal ownership of a specific topic - they were among the first to publish in the area, and everyone knows they have very good data and skills for continuing to do excellent work there. During this scientist’s life, others avoid work in the area, leading to a very low rate of simultaneous discovery. But if the scientist dies, we might expect a new scientist to take over the topic and make the same discoveries the deceased would have made.
It turns out there is work that looks at something quite like this example.
Azoulay, Graff Zivin, and Wang (2010) and Azoulay, Fons-Rosen, and Graff Zivin (2019) both study a sample of roughly 100-500 eminent life scientists who died in the midst of an active research career. When these scientists died, did others step into the gap and make the discoveries they would have made, had they lived? Of course, we can’t know that (absent access to a multiverse). But we can do the next best thing and match each of these deceased life scientists to another set of eminent life scientists and follow the trajectory of science across these individuals. For example, if the death of a scientist is associated with observable changes in the kind of research that is performed, relative to what we see among those who live, then that suggests those who follow in the footsteps of the deceased are not merely replicating what the deceased would have done.
Azoulay, Graff Zivin, and Wang (2010) focuses on what happens to the collaborators of eminent life scientists when they pass. One theory we might have about redundancy in science is that when an eminent life scientist passes away, collaborators who work on closely related ideas will pick up the baton and continue the work, at least after a period of grief and mourning. But if that’s the case, it doesn’t show up clearly in the data.
Azoulay, Graff Zivin, and Wang show that when you compare the publications of collaborators working with eminent life scientists who live and eminent life scientists who die, those working with the deceased publish steadily less work over time, as much as 15 years after. Moreover, this publication penalty is actually more severe for collaborators whose topics are most similar to those of the eminent life scientist (as judged by the overlap in topics they work on). That all suggests the collaborators of an eminent life scientist are not easily able to “replace” the discoveries that would have been made if the scientist had lived.
What about non-collaborators?
Azoulay, Fons-Rosen, and Graff Zivin (2019) uses the “related articles” algorithm in PubMed to define thousands of little microfields, each consisting of dozens of closely related articles. In some of these microfields an eminent life scientist working in it died amidst an active research career and in others an otherwise similar eminent life scientist lived. Azoulay, Fons-Rosen, and Graff Zivin then look to see how these microfields evolve from that point forward.
Consistent with the notion that scientists do respect informal claims to a topic, when a scientist dies new people move into the field. The figure below plots the extra publications published by non-collaborators in fields where an eminent life scientist dies, as compared to fields where an eminent life scientist does not die. That suggests one reason the probability of simultaneous discovery is not higher is that people avoid working in areas where they know prominent scientists are active.
However, there are several indicators that the ideas these people pursue in this field differ from the ideas that would have been pursued by the deceased. The citation profile of publications changes when an eminent scientist dies; there is an increase in very highly cited new publications, relative to fields where the eminent scientist lives. And the new publications also look different in terms of what they cite themselves: there are fewer citations to the work of the eminent life scientist and fewer citations to pre-existing work in the field. Lastly, the topics under study in this field change. The biomedical sciences use a standardized lexicon of keywords for classifying articles, and the keywords attached to articles in fields where a scientist dies tend to be younger and to feature more new combinations of keywords. It all suggests people are not interchangeable; when one person exits a field, those who come after do not seem to do the same thing.
This is some micro-level evidence on “paths not taken” in science. But we also have some more macro-evidence.
Normally, one of the challenges in studying science is that we don’t have multiple scientific ecosystems, which might allow us to see how different characteristics are correlated with different variables of interest. That’s because modern science is usually best thought of as a single global endeavor where new discoveries are rapidly communicated across the world and people readily move between different organizations and countries. But geopolitics gives us a small number of cases where communication of discoveries is significantly impeded for a long time. These cases broadly confirm that it is possible for an idea to be discoverable without being discovered for a long time.
Iaria, Schwarz, and Waldinger (2018) study disruptions to international science in the wake of World War I. As described in more detail here, World War I had the effect of cleaving the international scientific community into two comparatively isolated communities. Iaria, Schwarz, and Waldinger show, for example, that delivery of scientific journals published in enemy countries faced delays in excess of a year after the onset of war, and international conferences featuring speakers from different sides largely ceased until well after the war. Evidence that the war split the scientific community into two groups can also be seen in citations to the work of scientists from the other side. After the onset of war, the share of citations to papers from the other side fell 85%, relative to the share of such papers cited before the war.
While this wasn’t a complete separation, we can see signs that scientists on the two sides quickly begin to work on different things, rather than independently discovering the same things in parallel. To get at this, Iaria, Schwarz, and Waldinger use the titles of published academic work, after being translated into a common language (English). They condense these words to their roots, so that each title is now associated with a list of word stems. Using latent semantic analysis, they can infer the similarity of different article titles, based on whether they include word stems that belong to similar topics. In the figure below, they plot the average similarity of a paper title to the 5 most similar titles from one of two groups of countries. In blue, the similarity to papers published in other countries which are on the same side of the war. In red, the similarity to papers published in other countries on the opposite side.
To interpret this figure, imagine you’ve got some random paper published in the USA. To be concrete, let’s assume it’s a paper on electricity. Now imagine there are three stacks of papers in front of you, with five papers in each stack. One stack is the set of five papers, published in the USA, whose titles are most similar to the title of your paper on electricity. Another stack is the set of five papers, published by allied powers, but not in the USA, whose titles are closest to your paper on electricity. The last stack is the set of five papers, published in Germany and the other Central Powers, whose titles are closest to your paper on electricity. For each of these stacks, compute how similar your title is, on average, to the titles in the stack.
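To make the three-stack comparison concrete, here is a minimal Python sketch of the "average similarity to the five closest titles" calculation. It is not the authors' pipeline: plain cosine similarity on word counts stands in for latent semantic analysis, simple lowercasing stands in for stemming, and the function names and example titles are invented for illustration.

```python
import math
from collections import Counter


def word_counts(title):
    """Lowercase a title and count its words (a crude stand-in for stemming)."""
    return Counter(title.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_top_k_similarity(title, candidates, k=5):
    """Average similarity between `title` and its k most similar candidates."""
    target = word_counts(title)
    scores = sorted(
        (cosine(target, word_counts(c)) for c in candidates), reverse=True
    )
    top = scores[:k]
    return sum(top) / len(top) if top else 0.0


# Hypothetical titles, made up purely for illustration.
paper = "measurement of electric current in copper wire"
stacks = {
    "USA": [
        "electric current in copper wire under load",
        "measurement of resistance in copper wire",
    ],
    "Allies": [
        "electric discharge in gases",
        "measurement of magnetic fields",
    ],
    "Central Powers": [
        "theory of quantum radiation",
        "chemical synthesis of dyes",
    ],
}

# One number per stack: the mean similarity to the closest titles in it.
for name, titles in stacks.items():
    print(name, round(mean_top_k_similarity(paper, titles, k=2), 3))
```

Repeating this for every paper and every year, then averaging within each group of countries, would give one point per year on each line of a figure like the one described.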
Prior to 1914, the similarity between your electricity paper and the USA stack was not really different from the similarity between your electricity paper and the Central Powers stack. This corresponds to a period when science was international and so ideas flowed relatively freely around the globe. People in both sets of countries worked on similar stuff. (Note that there was a clear difference between the title of this paper and the titles in the allied papers stack, so it seems the USA worked on different kinds of science from its allies, even prior to the war.)
Anyway, in 1914 war breaks out. From that point forward, if you repeat this exercise each year, the similarity between your paper and the Central Powers papers steadily declines, relative to the similarity with the USA papers or even the allied papers, until the end of the war. What’s the upshot? During the years of most severe disruption, the course of science diverges more and more each year. After WWI, journal communication is restored and we no longer see a continued divergence, though neither do we see complete convergence in topics (relations remained frosty and in-person conferences between the sides were rare, so a partial split of science remained).
For an even longer period of separation, we can look at the Cold War. During this period, science and technology on either side of the iron curtain developed under some degree of isolation. This fact has been exploited by a few papers which use the unexpected collapse of the USSR as a way to learn how new knowledge spreads and what its impact is (see here and here).
An early paper in this literature is Borjas and Doran (2012), which looks at the impact of the collapse of the USSR on labor market outcomes for American mathematicians. For our purposes today, what is interesting is Borjas and Doran’s documentation of the extent to which Soviet and US math diverged during the Cold War. A few quotes suggest it was not at all uncommon for discoveries on one side of the iron curtain to remain undiscovered on the other side for decades:
In the Soviet Union, for example, the mathematical genius Andrei Kolmogorov developed important results in the area of probability and stochastic processes beginning in the 1930s. In a scenario common throughout Soviet mathematical history, he established a “school” at Moscow State University, attracting some of the best young minds over the next four decades, such as the teenage prodigy Vladimir Arnold in the 1950s. Arnold himself quickly solved Hilbert’s famous “Thirteenth Problem” and initiated the field of symplectic topology… Because the United States did not have the unique Kolmogorov-Arnold combination, the amount of work done by American mathematicians in these subfields was far less than would have been expected given the size and breadth of the American mathematics community.
Later, Borjas and Doran quote a New York Times article written in 1990, after the iron curtain fell and ideas began to flow more freely:
Persi Diaconis, a mathematician at Harvard, said: “It’s been fantastic. You just have a totally fresh set of insights and results.” Dr. Diaconis said he recently asked [Soviet mathematician] Dr. Reshetikhin for help with a problem that had stumped him for 20 years. “I had asked everyone in America who had any chance of knowing” how to solve a problem… No one could help. But… Soviet scientists had done a lot of work on such problems. “It was a whole new world I had access to,” Dr. Diaconis said.
Reinforcing the anecdotal evidence that American scientists found much that was valuable but unknown to them in decades of Soviet research, the share of citations to Soviet papers by American mathematicians also rose sharply after their isolation from each other ended.
Compared to the evidence from simultaneous discovery, it is difficult to quantify how different all these paths are. This is in principle possible: if you take the 100 most highly cited Soviet mathematics papers that were not communicated outside the USSR, how many of them have a comparable discovery made independently on the other side of the iron curtain? But it would take a lot of specialized knowledge to do this.
Despite this, we can say that noticeable differences quickly emerge when scientists die or research communities are separated. That is consistent with the evidence from the probability of independent discovery that redundancy in science is low.
Azoulay, Pierre, Joshua S. Graff Zivin, and Jialan Wang. 2010. Superstar Extinction. The Quarterly Journal of Economics 125(2): 549-589. https://doi.org/10.1162/qjec.2010.125.2.549
Azoulay, Pierre, Christian Fons-Rosen, and Joshua S. Graff Zivin. 2019. Does Science Advance One Funeral at a Time? American Economic Review 109(8): 2889-2920. https://doi.org/10.1257/aer.20161574
Iaria, Alessandro, Carlo Schwarz, and Fabian Waldinger. 2018. Frontier Knowledge and Scientific Production: Evidence from the Collapse of International Science. The Quarterly Journal of Economics 133(2): 927-991. https://doi.org/10.1093/qje/qjx046
Borjas, George J., and Kirk B. Doran. 2012. The Collapse of the Soviet Union and the Productivity of American Mathematicians. The Quarterly Journal of Economics 127(3): 1143-1203. https://doi.org/10.1093/qje/qjs015