Evidence that discoveries are getting smaller on average
One of the most famous recent papers in the economics of innovation is “Are Ideas Getting Harder to Find?” by Bloom, Jones, Van Reenen, and Webb. It showed that more and more R&D effort is necessary to sustain the present rates of technological progress, whether we are talking about Moore’s law, agricultural crop yields, healthcare, or other proxies for progress. Other papers that look into this issue have found similar results. While it is ambiguous whether the rate of technological progress is actually slowing down, it certainly seems to be getting harder and harder to keep up the pace.
What about in science?
A basket of indicators all seem to document a trend similar to what we see with technology. Even as the number of scientists and publications rises substantially, we do not appear to be seeing a concomitant rise in new discoveries that supplant older ones. Science is getting harder.
Before diving into these indicators, I want to head off one potential misunderstanding. My claim is that science is getting harder, in some sense, not that science is ending or that we are on the verge of running out of ideas. Instead, the claim is that discoveries of a given “size” are harder to bring about than in the past. Importantly, this does not mean scientific progress has necessarily slowed down: science may be getting harder, but we’re also working harder at it (for example, by spending more on science) and the net effect on the rate of progress is ambiguous.
We’ll actually start with an indicator that shows no evidence of a slowdown though. Since scientists primarily communicate their discoveries via papers, the first place to look for evidence that discoveries are getting harder to make is in the number of papers scientists publish annually. The figure below, drawn from Dashun Wang and Albert-László Barabási’s (free!) book on the Science of Science, compares publications to authors over the last century.
At left, we can see the numbers of papers and authors per year have increased basically in lockstep over the twentieth century. Note that the axis is a log scale, so a straight line indicates exponential growth. Meanwhile, at right, the blue dashed line shows that the number of papers per author has hovered around 2 for a century and, rather than falling, is actually on the rise in recent decades. (As an aside, the solid red line at right is strong evidence for the rise of teams in science, discussed more here.)
So absolutely no evidence that scientists are struggling to find stuff worth writing up. But that’s not definitive evidence, because scientists are strongly incentivized to publish and what constitutes a publishable discovery is whatever editors and peer reviewers think is publishable. If fewer big discoveries are made, scientists may just publish more papers on small discoveries. So let’s take a more critical look at the papers that get published and see if there are any indicators that they contain smaller discoveries than in the past.
Let’s start by looking at some discoveries whose importance is universally acknowledged. The Nobel prize for discoveries in physics, chemistry, and medicine is one of the most prestigious scientific prizes and has a history long enough for us to see any long-run trends. Using a publicly available database on Nobel laureates by Li et al. (2019), we can identify the papers describing research that is eventually awarded a Nobel prize, and the year those papers were published. Note that several papers might be associated with any given award. For each award year, we can then ask what share of the prize-linked papers were published in the preceding twenty years. The results are presented below, though I smooth the data by taking a ten-year moving average.
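(Before looking at the results, here’s a minimal sketch in Python of the kind of calculation involved; the award-year/publication-year pair format and the function names are my own illustration, not the structure of the Li et al. 2019 dataset.)

```python
from collections import defaultdict

def recent_share_by_award_year(papers, window=20):
    """papers: iterable of (award_year, publication_year) pairs, one per
    prize-linked paper. Returns {award_year: share of that award year's
    papers published within `window` years of the award}."""
    counts = defaultdict(lambda: [0, 0])  # award_year -> [recent, total]
    for award_year, pub_year in papers:
        counts[award_year][0] += int(award_year - pub_year <= window)
        counts[award_year][1] += 1
    return {year: recent / total
            for year, (recent, total) in sorted(counts.items())}

def trailing_moving_average(series, width=10):
    """Smooth a {year: value} dict with a trailing moving average."""
    years = sorted(series)
    out = {}
    for i, year in enumerate(years):
        window = [series[y] for y in years[max(0, i - width + 1): i + 1]]
        out[year] = sum(window) / len(window)
    return out
```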
Prior to the 1970s, on average about 90% of prize-linked papers had been published in the twenty years preceding the award. But by 2015, the ten-year moving average was closer to 50%.
So recent discoveries seem to have a harder time getting recognized as Nobel-worthy, relative to a few decades ago. We can also compare the importance of different discoveries that won Nobel prizes. In 2018, Patrick Collison and Michael Nielsen asked physicists, chemists, and life scientists to pick the more important discovery (in their field) from sets of two Nobel prize winning discoveries. For example, they might ask a physicist to say which is more important, the discovery of Giant Magnetoresistance (awarded the Nobel in 2007) or the discovery of the Compton effect (awarded in 1927). For each decade, they look at the probability a randomly selected discovery made in that decade would be picked by their survey respondents over a randomly selected discovery made in another decade. The results are below:1
A few points are notable from this exercise. First, physicists seem to think the quantum revolution of the 1910s-1930s was the best era for physics and it’s been broadly downhill since then. That’s certainly consistent with discoveries today being in a sense smaller than the ones of the past, at least for physics.
In contrast, for chemistry and physiology/medicine, the second half of the twentieth century has outperformed the first half. In the Nobel prize data, within the second half of the century, there is no obvious trend up or down for chemistry and medicine. While that’s better than physics, it remains consistent with the notion that science might be getting harder. As we can see in the first figure here, the number of papers and scientists rose substantially between 1950 and 1980, which naively implies that the number of candidates for Nobel-prize winning discoveries should also have risen substantially. If we are selecting the most important discovery from a bigger pool of candidates, we should expect that discovery to be judged more important than discoveries picked from smaller pools. But that doesn’t seem to be the case.
So Nobel prize data is also consistent with the idea that discoveries today aren’t what they used to be. Whereas it used to be quite common for work published in the preceding twenty years to be recognized for a Nobel, that doesn’t happen nearly so much today. That said, an alternative explanation is that the Nobel committee is just trying to work through an enormous backlog of Nobel-worthy work which they want to recognize before the discoverers die. In this explanation, we’ll eventually see just as many awards for the work of today.
But it’s not clear to me this is how the committee is actually thinking: recent work still accounts for roughly half of prize-linked papers, and the committee does move quickly when it judges a discovery sufficiently important. For example, Jennifer Doudna and Emmanuelle Charpentier were awarded a Nobel for their work on CRISPR in 2020, less than a decade after the main discoveries. And when you look specifically at the work performed in the 1980s, it doesn’t seem particularly notable relative to work in the 40s, 50s, 60s, and 70s, despite the fact that many more papers were published in that decade.
Still, perhaps the Nobel prize is simply too idiosyncratic for us to learn much from. Next, let’s look at another indicator of big discoveries, one which shouldn’t be biased by the sort of factors peculiar to the Nobel: membership among the most highly cited papers in a given field.2 For example, if we look at the top 0.1% most highly cited papers of all time in a particular field, we could ask how easy it is for a new paper to join their ranks. If that has fallen over time, then that’s further evidence that today’s papers aren’t making the same contributions as yesterday’s.
On the other hand though, we might think it should get harder and harder to climb to the top 0.1%, even if discoveries are not getting smaller. After all, if discoveries are of constant size, earlier works have more time to get citations; it may not be possible for later papers to catch up, even if they are just as good. But there are also some factors that lean in the opposite direction. First, if work is only cited when relevant, then newer work should have an easier time being relevant to newer papers. Since the number of new papers grows over time, that gives one advantage to the new; they can be tailored to a bigger audience, in some sense (new papers also cite more papers than older ones). Second, the most esteemed papers of all time may actually stop being cited at high rates, because their contributions become part of common knowledge: it is no longer necessary to cite Newton when talking about gravity, or even Watson and Crick when asserting DNA has a double-helix shape.
So let’s proceed with seeing if there has been any change in how easy or hard it is to become a top cited paper, noting that won’t be the last piece of evidence we look at.
The closest paper I know of that looks into this is Chu and Evans (2021), which estimates the probability of a new paper ever becoming one of the top 0.1% most cited, even for just one year. But this paper does not plot this probability against time, like the previous charts: instead, it plots this probability against the size of a field, measured by the number of papers published per year. In the scatterplot below, each point corresponds to a field in a year. On the horizontal axis is the number of papers published in the field in that year, and on the vertical axis is the probability a paper in that field and year is ever among the top 0.1% most cited. The colored lines are trends for each of the ten fields. Note this figure only includes papers published in the year 2000 or earlier. Since the analysis is conducted with data from 2014, every paper has more than a decade to accrue citations and get into the top 0.1%.
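As a rough illustration of the kind of calculation involved, and not the exact Chu and Evans procedure, one could compute something like the following, where the array layout of cumulative citation counts is a simplifying assumption of mine:

```python
import numpy as np

def ever_top_share_by_cohort(cum_cites, pub_years, top_frac=0.001):
    """cum_cites: array of shape (n_papers, n_years) holding each paper's
    cumulative citation count at the end of each calendar year.
    pub_years: length-n_papers array of publication years.
    For every year, flag the papers in the top `top_frac` of the field's
    cumulative-citation distribution; then report, for each publication
    cohort, the share of its papers that are ever flagged."""
    n_papers, n_years = cum_cites.shape
    ever_top = np.zeros(n_papers, dtype=bool)
    for t in range(n_years):
        cutoff = np.quantile(cum_cites[:, t], 1 - top_frac)
        ever_top |= cum_cites[:, t] >= cutoff
    return {int(year): float(ever_top[pub_years == year].mean())
            for year in np.unique(pub_years)}
```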
The figure shows pretty clearly that as fields get bigger the probability of jumping to the top 0.1% shrinks. It used to be the case that papers had a greater than 0.1% chance of being in the top 0.1% at some point. That, itself, suggests some degree of turnover and dynamism at the top of the field. But for the largest fields, that isn’t the case anymore. Note, while this chart has the size of a field on the horizontal axis, since fields tend to get bigger every year, this also shows us trends over time: up until the year 2000, newer papers had successively lower chances of supplanting their rivals and becoming one of the top 0.1% most cited.
Another variant of this chart tells the same story. In the figure below, Chu and Evans find the top 50 most highly cited papers in each year. In red below, they then track the proportion of those papers that stay in the top 50 in the next year. As this moves up, that means fewer and fewer papers are supplanting the top 50 most cited. Again, they plot this against the size of the field rather than time, but since the two move together, it also shows what is happening over time.
(In blue is a related measure, the year-to-year correlation of the ranks of the top 50 most cited papers. In an appendix, they specifically show that this measure is correlated with time alone, so that there is less turnover in more recent years.)
If being a top-cited paper is an indicator of large scientific impact, then the above suggests it’s harder to have a big impact than in the past.
Other interpretations are also possible though, such as the factors mentioned earlier. Alternatively, perhaps as fields grow, a canon of select papers emerges and everyone frames their work in relation to this canon, so that earlier work is perpetually cited even though citations no longer accurately capture the “size” of a new discovery. As with the factors peculiar to the Nobel prize, these explanations can coexist with discoveries getting smaller. But in case there is something strange about how earlier work gets canonized, let’s now turn to some indicators that cover science more broadly.
Another approach we could take to evaluate the rate of churn in science more broadly is to compare the citations received by papers over time. If older papers made bigger discoveries than younger ones, then we might expect them to hold on longer and be more highly cited than new papers. One simple way to assess this is to look at the share of citations made in each year that go to recent papers. A standard measure of this is the Price index (named for Derek de Solla Price, not the cost of a good), which computes the share of citations made to papers published in the last 5 years (or 10 years in some variants).
Below, Larivière, Archambault, and Gingras (2007) compute the Price index for all papers on Thomson Scientific over the period 1900-2004. They compute two versions of the index. The 20-year index divides the number of citations to papers published in the preceding 5 years by the number of citations to papers published in the preceding 20 years. The 100-year index does the same thing, but divides by the number of citations made to papers published in the preceding century.
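The index itself is simple to express in code. The sketch below assumes citations come as (citing year, cited paper’s publication year) pairs, which is my own simplification rather than how the underlying data is organized:

```python
def price_index(citation_pairs, year, recent=5, horizon=20):
    """citation_pairs: iterable of (citing_year, cited_year) tuples.
    Returns the share of citations made in `year` to papers published in
    the preceding `recent` years, among citations made in `year` to papers
    published in the preceding `horizon` years."""
    recent_n, horizon_n = 0, 0
    for citing_year, cited_year in citation_pairs:
        if citing_year != year:
            continue
        age = citing_year - cited_year
        if 0 <= age <= horizon:
            horizon_n += 1
            if age <= recent:
                recent_n += 1
    return recent_n / horizon_n if horizon_n else float("nan")

# The 20-year and 100-year versions for citations made in, say, 1980:
# price_index(pairs, 1980, recent=5, horizon=20)
# price_index(pairs, 1980, recent=5, horizon=100)
```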
I think this figure is best understood as describing two periods. From 1900-1955 there was a general increase in the share of citations made to recent work, interrupted by the two world wars. Each world war imposed big disruptions on the production of new scientific work (see the first figure in this post), which had the secondary effect of reducing the share of citations to recent work. But since 1955, the share of citations made to new work has fallen dramatically, by an amount comparable to the distortions caused by the world wars (though spread out over many decades).
Cui, Wu, and Evans (2022) document that this trend has continued until 2014, and is not driven by any single field. It appears to be an almost universal phenomenon. In the figure below, they calculate the share of (all) citations made to work published in the preceding 10 years. It’s down across the board.
This looks pretty alarming, but there are explanations besides a sharp decline in size of discoveries. Whenever the share of something goes down, there are two possible causes: it could be the numerator goes down and/or the denominator goes up. And in this case, it’s mostly the latter. The raw number of citations to recent work doesn’t actually seem to have fallen by very much, at least, according to Larivière, Archambault, and Gingras (2007). But the total number of citations papers make has gone way up, and most of that increase has been citations to older work.
So the real question is less “why have researchers stopped citing new work” and more “why are researchers citing old work at such a high rate.” One explanation is that older work contains the bigger discoveries, and we’re still living in their shadow. Another explanation, put forward by Cui, Wu, and Evans, is that the scientific labor force is aging and older scientists prefer to work on older topics and are resistant to change. This age dynamic could also be a factor in the persistence of top-cited papers of the past. But whether this is about the size of discoveries or an aging scientific labor force’s preferences (and the two explanations are not mutually exclusive either), the more distant past is increasingly influential in contemporary science.
However, a more innocuous explanation is also possible. It may be that citation norms have simply evolved to be more deferential so that academics are more likely to cite older works out of a sense of politeness, even though these citations do not actually inform their work. This could be the case, for example, because aging peer reviewers may insist on older works being recognized, or because aging scientists are simply more familiar with relevant older works than the younger scientists of the past.
We can’t rule out this explanation, but a survey by Teplitskiy et al. (2022) suggests citations to older work are not meaningfully different than citations to younger work. Teplitskiy et al. (2022) surveyed thousands of academics about the citations they made in their own work, asking them to rate how influential a given cited paper was in informing the citing paper. Once you account for various factors about a paper, how old it was doesn’t seem to have any impact on whether authors rate the paper as highly influential or not, at least for papers 5, 10, or 15 years “old” at the time they were cited. That’s inconsistent with the notion that most citations to older work are merely added to satisfy peer reviewers and don’t meaningfully inform research. (For more on Teplitskiy et al. 2022, see Do academic citations measure the impact of new ideas?)
That said, we don’t have evidence on how the influence of an older citation has changed over time: if Teplitskiy and coauthors’ survey had been conducted in 1970, would it have found something different?
One issue with all the preceding evidence is that the quality of a research discovery is determined by how other scientists assess it: do they cite it, give it prizes, or tell us how important it is relative to alternatives. That might be the best we can do; a thorny problem in research is that evaluating the quality of research often requires the skills to do research, and most of the people with those skills are researchers. But it also means our assessment of the quality of research is tied up with any biases and blindspots a field might have about itself.
So let’s turn to a metric that isn’t based on the assessment of the scientists themselves. Rather than looking at the size of discoveries, we could instead try to chart the topics covered by a field over time. If a field is steadily spreading into new topics (from electrons to quarks to strings, for example), that suggests a field is learning new things and pushing out its frontier. On the other hand, if a field remains stuck on the same set of things (from strings to strings to strings, for example) that might be indicative that the field is struggling to make progress.
Milojević (2015) tries to get at this question by looking at the titles of published papers. Most people will probably only ever read a paper’s title, and so scientists usually try to broadcast what the paper is about with the title.3 Milojević uses the text of titles to define the topics a field is studying. She identifies a topic as a string of words in a title that lies between grammatical phrase delimiters (think .,:;) or common English words with non-technical meanings (think “about”, “since”, “using”). When a topic has more than three words, she uses only the last three to define it. As an example, using this technique the most cited paper of all time, “Protein measurement with the Folin phenol reagent,” would be construed to be about the topics “protein measurement” and “Folin phenol reagent.”
Now that Milojević has a way to define topics, she goes about counting how many distinct topics are mentioned in the titles of papers published in a given year. Analogous to the number of papers published per person, Milojević looks at the number of unique topics in every sample of 10,000 topic mentions. In other words, the algorithm reads random paper titles until it has collected 10,000 topic mentions, and then counts how many of those are unique.
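A stripped-down sketch of the procedure might look like the following; the punctuation and stopword lists here are illustrative stand-ins rather than Milojević’s actual lists:

```python
import random
import re

# Illustrative stand-ins; Milojević's actual delimiter and stopword lists differ.
COMMON_WORDS = {"about", "since", "using", "with", "the", "of", "a", "an",
                "and", "for", "in", "on", "to", "by", "from"}

def title_topics(title):
    """Split a title into topic phrases at punctuation and common
    non-technical words, keeping at most the last three words of each."""
    topics = []
    for chunk in re.split(r"[.,:;()]", title.lower()):
        phrase = []
        for word in chunk.split():
            if word in COMMON_WORDS:
                if phrase:
                    topics.append(" ".join(phrase[-3:]))
                phrase = []
            else:
                phrase.append(word)
        if phrase:
            topics.append(" ".join(phrase[-3:]))
    return topics

def cognitive_extent(titles, sample_size=10_000, seed=0):
    """Read titles in random order until `sample_size` topic mentions are
    collected, then count how many distinct topics appear among them."""
    rng = random.Random(seed)
    shuffled = list(titles)
    rng.shuffle(shuffled)
    mentions = []
    for title in shuffled:
        mentions.extend(title_topics(title))
        if len(mentions) >= sample_size:
            break
    return len(set(mentions[:sample_size]))
```

Run on the example above, title_topics("Protein measurement with the Folin phenol reagent") returns ["protein measurement", "folin phenol reagent"], matching the description of Milojević’s technique.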
In 2022, Milojević updated the data from her 2015 paper as part of an OECD workshop. Below is the number of unique topics studied in a random sample of 10,000 topic mentions across all of science, by year (which Milojević calls the cognitive extent).
Through the twentieth century, there was a general rise in the number of unique topics studied in a given sample of scientific titles. Over 1935-1975 this rise was a bumpy one, but it looks like we mostly reverted to trend, so the overall rate of change was steady over a long horizon. But sometime since the 1970s, this upward trend has very gradually slowed to a stop, and has even begun to reverse slightly. If counting topics is a good way to measure how successfully a field is growing, then this indicates fields are having a harder time growing today than in the past.
Is counting topics in this manner sensible? Two more papers taking different approaches to counting topics find supportive evidence.
Rather than attempting to identify discrete concepts in the titles of academic papers, Park, Leahey, and Funk (2023) look at individual words and pairs of words in the titles of 25 million different scientific publications. In the figure below on the left, they estimate the diversity of topics under investigation in each field by counting the number of unique words in titles and dividing by the total number of words (after a bit of processing to remove uninformative words like “the”). This measure has declined substantially since 1945, indicating more and more academics are writing papers with titles that use the same words as other academics.
The figure above on the right focuses on new combinations of words in titles. To generate this figure, Park, Leahey, and Funk make a list of every pair of words mentioned together in the title of an academic paper. They then exclude any pair which has been mentioned together in a previous year, and count the remaining pairs, which are new. The share of word-pairs which are novel has also been declining since 1945. Academics are increasingly likely to write papers recycling combinations of words that were used in earlier works.
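Both measures are easy to sketch in code. The version below works from raw title strings and skips the preprocessing of uninformative words that Park, Leahey, and Funk apply, so treat it as an illustration of the idea rather than a reproduction of their procedure:

```python
from itertools import combinations

def word_diversity(titles):
    """Unique words divided by total words across one year's titles."""
    words = [w for title in titles for w in title.lower().split()]
    return len(set(words)) / len(words) if words else float("nan")

def novel_pair_share(titles_by_year):
    """Share of word pairs in each year's titles never seen together in
    any earlier year. titles_by_year: {year: [title, ...]}."""
    seen = set()
    shares = {}
    for year in sorted(titles_by_year):
        pairs = set()
        for title in titles_by_year[year]:
            words = sorted(set(title.lower().split()))
            pairs.update(combinations(words, 2))
        shares[year] = len(pairs - seen) / len(pairs) if pairs else float("nan")
        seen |= pairs
    return shares
```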
Carayol, Lahatte, and Llopis (2019) identify the topics authors are choosing to study using a much more straightforward data source: the keywords authors supply to publishers to describe their own papers (some other elements of this paper are also discussed here and here). Unfortunately, this data is only available for 1999-2013. Fortunately, that is precisely the period when Milojević begins to observe the sharpest decline in unique topics. So do we see a similar thing when we look at the keywords authors use to describe their papers? Yes.
In the figure below, Carayol, Lahatte, and Llopis compare the growth rate of three different things, but we are most interested in the green dashed line and the blue line. The green dashed line shows growth in the number of publications over 1999-2013 by comparing annual publications to the number published in 1999. The blue line shows the growth in the total number of unique author-supplied keywords used to describe research, again by comparing the number in a given year to the number in 1999. This figure has a log scale, so that a straight line indicates constant exponential growth.
These two curves have begun to diverge. Whereas the growth in the total number of publications has been steady and exponential, growth in the total number of unique keywords has been slower than exponential. That seems broadly consistent with Milojević’s finding that the number of unique topics for a fixed sample size also stopped growing over this period. In other words, in terms of the number of topics being tackled by science, growth of fields is proceeding more slowly today than in the past.
On the other hand, Bentley et al. (2023) point out that a similar phenomenon is also at work in English outside of science: the number of unique words divided by the number of total words in Google books also exhibits a substantial long-run decline. That suggests what’s happening in science might not have anything to do with science per se: as they point out, a declining “unique words to total words” ratio is inevitable in any setting where the corpus for a relatively fixed language grows. I’m not sure how much weight to put on this objection. It seems reasonable to me to believe that if science is pushing into new fields and topics at a steady rate, then that should be reflected in a greater growth of new phrases and words than in non-scientific language; but I don’t have strong evidence to back up that intuition.
The preceding suggested a decline in the number of new topics under study by looking at the words associated with papers. We can try to infer whether a similar process is underway using the Consolidation-Disruption Index (CD index for short), which attempts to score papers on the extent to which they overturn received ideas and birth new fields of inquiry. Here though, I think the evidence is ambiguous at best.
To see the basic idea of the CD index, suppose we want to see how disruptive some particular paper x is. To compute paper x’s CD index, we would identify all the papers that cite paper x or the papers x itself cites. We would then look to see if the papers that cite x also tend to cite x’s references, or if they cite x alone. If every paper citing paper x also cites x’s own references, paper x has the minimum CD index score of -1. If some papers cite x and no papers cite any of paper x’s references, paper x has the maximum CD index score of +1. The intuition here is that if paper x overturned old ideas and made them obsolete, then we shouldn’t see people continuing to cite older work, at least in the same narrow research area. But if paper x is a mere incremental development, then future papers continue to cite older work alongside it.4
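Here’s a minimal sketch of that calculation, following the verbal description above (not any particular published implementation), assuming the citation network is available as a dictionary from each paper to the set of papers it cites:

```python
def cd_index(focal, cites):
    """cites: dict mapping each paper id to the set of paper ids it cites.
    Papers citing only `focal` push the score toward +1; papers citing
    both `focal` and its references push it toward -1; papers citing only
    the references count in the denominator but contribute 0."""
    refs = cites.get(focal, set())
    forward = [cited for paper, cited in cites.items()
               if paper != focal and (focal in cited or cited & refs)]
    if not forward:
        return float("nan")
    score = 0
    for cited in forward:
        f = focal in cited        # does this paper cite the focal paper?
        b = bool(cited & refs)    # does it cite the focal paper's references?
        score += -2 * f * b + f
    return score / len(forward)

# Toy example: A and B both cite X; A also cites one of X's two references.
cites = {"X": {"R1", "R2"}, "A": {"X", "R1"}, "B": {"X"}}
print(cd_index("X", cites))  # A contributes -1, B contributes +1: prints 0.0
```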
Park, Leahey, and Funk (2023) compute the CD index for a variety of different datasets of academic publications, encompassing many millions of papers. In the figure below, we have a representative result from 25 million papers drawn from the Web of Science. Across all major fields, the CD index has fallen substantially. That suggests, on average, papers are becoming more and more incremental, and less and less disruptive.
The trouble is, very similar results can also be generated with no underlying change in how disruptive papers are, so long as older papers are cited at a higher rate and the number of citations per paper increases. Both are true. We have already looked at the rising tendency to cite older literature, which might be evidence for science getting harder, but might also be driven by more innocuous factors. But it’s also true that the number of citations per paper has been rising. For example, according to Petersen, Arroyave, and Pammolli (2023), the average length of the reference list of a paper published in Science increased from 7 in 1970 to 51 by 2020, despite a relatively consistent paper length (well, excluding appendices). Brendel and Schweitzer (2019) show references in a set of math journals rose from 5 per paper in 1950 to more than 25 per paper after 2010. Schweitzer and Brendel (2021) show the average number of references made in articles in an economics paper database grew from around 15 to over 40 between 1970 and 2014.
What’s generating this increase? There are lots of suspects. Maybe it’s the increase in peer reviewers, who suggest or demand additional citations. Maybe it’s the falling cost of finding relevant papers thanks to Google Scholar and other digital search technology. Or maybe it’s because more papers are incremental tweaks on the same core topics, so there is a greater supply of relevant work to cite.
Whatever the case, if citations per paper rise, that will tend to generate falling disruption scores. To see why, let’s imagine a paper is published in 1950 with zero references. Because it has no references, by definition, no papers that cite it can cite any of its references, and it will score the maximum disruption score of +1. Now imagine an equally disruptive paper published in 2020. Between peer reviewers and Google Scholar, suppose this paper identifies 10 relevant references. If anyone cites any of these 10 references, then the 2020 version of the paper will get a lower disruption score than the 1950 version (even though the 1950 paper would have made similar citations if it could find them). And this problem isn’t unique to papers that make no citations; in general, as it gets easier to identify relevant prior work, the disruption index for all papers will tend to fall, because your list of references grows and more papers cite the works on it.
One way to assess the severity of this issue is to create placebo citation networks by randomly shuffling the actual citations papers make to other papers. So instead of paper y citing paper x, redirect the citation so that paper y now cites some other paper z, where z is published in the same year as x. This kind of reshuffling preserves the tendency over time of papers to cite more references and to increasingly cite older papers. But since the citations are now random, they should not convey any additional information about disruption. These placebo networks will exhibit a decline in disruption, for the reasons described above, but we can see if the actual decline in disruption exceeds this or not.
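A bare-bones version of the within-year rewiring might look like this; note the published analyses use degree-preserving rewiring that also holds fixed how often each paper is cited, which this simpler sketch does not:

```python
import random

def rewire_within_year(citations, pub_year, seed=0):
    """citations: list of (citing_id, cited_id) pairs.
    pub_year: dict mapping paper id to publication year.
    Returns a placebo citation list in which each cited paper is replaced
    by a randomly drawn paper published in the same year, preserving every
    paper's reference-list length and the age profile of its citations."""
    rng = random.Random(seed)
    papers_by_year = {}
    for cited in {c for _, c in citations}:
        papers_by_year.setdefault(pub_year[cited], []).append(cited)
    return [(citing, rng.choice(papers_by_year[pub_year[cited]]))
            for citing, cited in citations]
```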
Park, Leahey, and Funk (2023) and Holst et al. (2024) both do this exercise, though on different datasets. In the figure at left below, we can see the results from Holst et al. (2024); using a different dataset from Park, Leahey, and Funk, they replicate the decline in the average disruption of papers over time (solid brown line). However, they show their randomly rewired placebo citation networks exhibit essentially the same trend (dashed line). That suggests any citation network exhibiting the same general patterns of growth in citations and the same tendency to cite older work would generate a decline in disruption that is on par with what’s observed in the data. In fact, they point out that the decline in the actual disruption index is a bit less than the decline in the placebo (note the gap between the lines shrinks).
On the other hand, Park, Leahey, and Funk (2023) also perform this exercise and quantify the gap between the actual data and the placebo networks in a different way (see the figure above and to the right; Holst et al. 2024 produce a very similar figure). What this figure shows is the difference between the actual data and the average of the placebo networks, using a particular kind of measurement called a z-score. In this case, the z-score measures the difference between a paper’s true disruption score and the disruption score of the same paper when we randomly reshuffle the citation network. Unlike the figure at left, it indicates the gap between the true CD index and the one generated by random citation networks is growing.
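For concreteness, the z-score being described is of roughly this form (my paraphrase of the construction, not code from either paper):

```python
import statistics

def disruption_z_score(cd_true, cd_placebos):
    """cd_true: a paper's CD index in the real citation network.
    cd_placebos: the same paper's CD index recomputed in many randomly
    rewired networks. The gap is expressed in units of the placebo
    standard deviation, so a given raw gap counts for more when the
    placebo scores barely vary across rewirings."""
    mu = statistics.mean(cd_placebos)
    sigma = statistics.stdev(cd_placebos)
    return (cd_true - mu) / sigma if sigma else float("nan")
```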
How do we square these contrary results? It has to do with how each side measures the significance of a given gap. The graph at left holds constant its unit of measure. By using z-scores, the one on the right allows its unit of measure to change when the disruption score of a given paper varies by more across different random networks. The idea is kind of analogous to how winning an individual poker hand is less impressive than winning a whole poker game, which is less impressive than consistently winning across a poker career. In this case, it appears that the variation in disruption scores among randomly generated citation networks has fallen substantially over time, likely for the reasons we’ve discussed. The figure on the right is asserting that smaller gaps between the true disruption score and the average disruption score of placebo networks should count for “more” as time goes on, because as time goes on, the rise in citations means it is harder for any particular paper to achieve an anomalously low disruption score. If we take that into account, papers seem to be getting less disruptive over time, relative to what we would expect; if not, the reverse is true. Like I said at the outset of this section: ambiguous.
(I think it’s also worth pointing out that this entire simulation approach takes it for granted that papers will increasingly cite older work, and looks for a decline in disruption above and beyond what we would expect purely from that fact.)
Let’s close by turning away, once and for all, from the messy world of changing academic citation norms over time.
Instead, we’ll talk about the messy world of patent citations over time! Patents also cite academic papers, where relevant. While patents have their own issues, one virtue is that inventors face a different set of incentives than academics. Whereas academics might cite papers or choose to work on topics that they know will be viewed favorably by their older peers, an inventor doesn’t face these constraints. In principle, they might just rely on whatever science is most useful for the purposes of getting some technology to work. Moreover, we have some good evidence science really is a useful input to technology, and that this is reflected tolerably well in citations to science. Indeed, Kalyani (2022) shows citations to recent academic articles (published in the last 5 years) are a particularly good indicator of a higher quality patent: patents citing recent academic work are significantly more likely to mention new technical phrases, and to be more valuable by a host of metrics (discussed in more detail here). Citations to older academic work are not. So if inventors are not citing recent science as much as in the past, that’s another indicator that recent science may be struggling to make comparable discoveries to the papers of the past.
I have not been able to find any papers that look at the extent to which patents cite recent academic research (if you know of such work, please send it my way and I can update this). Fortunately, Marx and Fuegi have a publicly available database of patent citations to academic work, which I used to put together the following figure. In each year of the following figure, I pull all citations made to academic papers by US patents whose application was filed in that year, and that were eventually granted within the next five years. I then compute the share of these citations that go to papers published in the preceding five years. Essentially, this is the Price index, discussed in the previous section, but applied to the citations of patents. I started the figure in 1975, as citations to academic papers were quite rare before then.
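The construction amounts to something like the sketch below; the tuple layout is my own simplification, not the schema of the Marx and Fuegi database:

```python
def patent_price_index(citations, filing_year, recent=5, grant_lag=5):
    """citations: iterable of (filing_year, grant_year, cited_pub_year)
    tuples, one per patent-to-paper citation. Returns the share of
    citations by patents filed in `filing_year` and granted within
    `grant_lag` years that go to papers published in the preceding
    `recent` years."""
    total, recent_n = 0, 0
    for filed, granted, pub_year in citations:
        if filed != filing_year or granted - filed > grant_lag:
            continue
        total += 1
        if 0 <= filed - pub_year <= recent:
            recent_n += 1
    return recent_n / total if total else float("nan")
```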
This figure echoes what we see in the academic Price index: citations to recent work have become increasingly less common. Unlike the academic case though, this is not entirely a story of rising citations to older work and steady citations to new work. Instead, we are actually seeing a decline in the number of citations to recent work. Among the subset of patents that cite academic work, the average number of citations to papers published in the five years prior to the patent’s filing date dropped from about 4 around 2000 to under 3.5 by 2015.
Stepping back, I’m claiming that science is getting harder, in the sense that it is increasingly challenging to make discoveries with impact comparable to those of the past. Diverse groups (the Nobel committees, surveyed scientists, academics, and inventors) all seem to have an increasing preference for the work of the past, relative to the present. And growth in the number of topics covered by scientists also suggests it has become harder to make forward progress. To close, I’ll add two more arguments.
First, we should expect science to get harder because of the “burden of knowledge.” The basic idea is that, almost as a tautology, making new discoveries requires new knowledge; otherwise the discovery would likely already have been made. Whenever new knowledge is discovered, it opens the way for new discoveries, and it may also displace or make obsolete some older knowledge. But if new knowledge does not entirely displace old knowledge, then it may be that you need steadily more knowledge to make new discoveries. Unfortunately, I think we have quite good evidence that this is the case as a general (though probably not absolute) rule, much of it reviewed here. All else equal, if you need more and more knowledge to make a discovery of a given size, then you can probably expect discoveries of a given size to require more time or manpower to bring about. And the evidence of this post suggests that is what we have in fact observed.
Second, if it’s true that innovation in other domains has in fact gotten harder (and I think it has), then to the extent scientific discovery and other forms of innovation are a similar process, we shouldn’t be surprised when what applies to one also applies to the other. As noted at the outset of this post, if you pick some metric of technological progress, odds are that a constant rate of progress along that metric is accompanied by rising R&D effort. Why should science be so different?
Each of these pieces of evidence has holes in it. But I think they are not the same holes. Stack them all up, and I think you get an argument that can begin to hold water.
In the appendix, I list some additional possible explanations for why science might be harder.
A good overview of some of the newer metrics for quantifying science, which helped me in drafting this post, is Wu et al. (2021).
Bloom, Nicholas, Charles I. Jones, John Van Reenen, and Michael Webb. 2020. Are Ideas Getting Harder to Find? American Economic Review 110(4): 1104-1144. https://doi.org/10.1257/aer.20180338
Wang, Dashun and Albert-László Barabási. 2021. The Science of Science. Cambridge: Cambridge University Press. https://doi.org/10.1017/9781108610834
Li, Jichao, Yian Yin, Santo Fortunato, and Dashun Wang. 2019. A dataset of publication records for Nobel Laureates. Scientific Data 6: 33. https://doi.org/10.1038/s41597-019-0033-6
Collison, Patrick and Michael Nielsen. 2018. Science is Getting Less Bang for Its Buck. The Atlantic.
Chu, Johan S.G. and James A. Evans. 2021. Slowed canonical progress in large fields of science. PNAS 118(41): e2021636118. https://doi.org/10.1073/pnas.2021636118
Milojević, Staša. 2015. Quantifying the cognitive extent of science. Journal of Informetrics 9(4): 962-973. https://doi.org/10.1016/j.joi.2015.10.005
Park, Michael, Erin Leahey, and Russell J. Funk. 2023. Papers and Patents are Becoming Less Disruptive Over Time. Nature 613: 138-144. https://doi.org/10.1038/s41586-022-05543-x
Carayol, Nicolas, Agenor Lahatte, and Oscar Llopis. 2019. The Right Job and the Job Right: Novelty, Impact and Journal Stratification in Science. SSRN working paper. http://dx.doi.org/10.2139/ssrn.3347326
Bentley, Alexander R., Sergi Valverde, Joshua Borycz, Blai Videiella, Benjamin D. Horne, Salva Duran-Nebreda, and Michael J. O’Brien. 2023. Is disruption decreasing, or is it accelerating? Advances in Complex Systems 26(2) 2350006. https://doi.org/10.1142/S0219525923500066
Wu, Lingfei, Dashun Wang, and James A. Evans. 2019. Large teams develop and small teams disrupt science and technology. Nature 566: 378-382. https://doi.org/10.1038/s41586-019-0941-9
Petersen, Alexander Michael, Felber Arroyave, and Fabio Pammolli. 2023. The Disruption Index Suffers From Citation Inflation and Is Confounded by Shifts in Scholarly Citation Practice. SSRN working paper. http://dx.doi.org/10.2139/ssrn.4486421
Brendel, Jan and Sascha Schweitzer. 2019. The Burden of Knowledge in Mathematics. Open Economics 2(1): 139-149. https://doi.org/10.1515/openec-2019-0012
Schweitzer, Sascha and Jan Brendel. 2021. A burden of knowledge creation in academic research: evidence from publication data. Industry and Innovation 28(3): 283-306. https://doi.org/10.1080/13662716.2020.1716693
Holst, Vincent, Andres Algaba, Floriano Tori, Sylvia Wenmackers, and Vincent Ginis. 2024. Dataset Artefacts are the Hidden Drivers of the Declining Disruptiveness in Science. arXiv:2402.14583. https://doi.org/10.48550/arXiv.2402.14583
Larivière, Vincent, Éric Archambault, & Yves Gingras. 2007. Long-term patterns in the aging of the scientific literature, 1900–2004. Proceedings of ISSI 2007, ed. Daniel Torres-Salinas and Henk F. Moed. https://www.issi-society.org/publications/issi-conference-proceedings/proceedings-of-issi-2007/
Cui, Haochuan, Lingfei Wu, and James A. Evans. 2022. Aging scientists and slowed advance. arXiv 2202.04044. https://doi.org/10.48550/arXiv.2202.04044
Kalyani, Aakash. 2022. The Creativity Decline: Evidence from US Patents. SSRN working paper. https://dx.doi.org/10.2139/ssrn.4318158
Marx, Matt, and Aaron Fuegi. Reliance on Science: Worldwide Front-Page Patent Citations to Scientific Articles. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3331686
Wu, Lingfei, Aniket Kittur, Hyejin Youn, Staša Milojević, Erin Leahey, Stephen M. Fiore, and Yong Yeol Ahn. 2021. Metrics and Mechanisms: Measuring the Unmeasurable in the Science of Science. arXiv 2111.07250. https://doi.org/10.48550/arXiv.2111.07250
I’ve alluded to the burden of knowledge as a potential explanation for what we see here. If it takes more and more knowledge to push the frontier forward, then it may be that the frontier gets pushed forward at a slower rate, or at least that it takes more people pushing to keep the rate constant. In this appendix, I want to list very briefly some other possible candidate explanations.
Aging: Cui, Wu, and Evans (2022) present evidence that older scientists are more likely to cite older work and work on older topics, and are more critical of work by younger scientists. It could be that the aging of the scientific labor force is exacerbating some of the forces of conservatism in science.
Scale: Chu and Evans (2021) suggest the sheer scale of modern science poses new challenges. When attention is fragmented across too many papers, it may become harder for scholarly consensus to emerge on the quality of new research topics, which tends to solidify pre-existing consensuses and stymie the development of the new.
Bureaucratic sclerosis: It seems hard for organizations to maintain flexibility and speed as they scale. It may be that as science grew it became overly bureaucratic, and this discouraged the kinds of high-risk people and ideas that might have thrived in an earlier era.
Remote Collaboration: Science is increasingly conducted by teams, and those teams are increasingly composed of geographically distant people. It may be that proximity is the main way we form relationships with people with expertise quite different from our own. If the best ideas come from combinations of disparate ideas, that might bias science away from highly creative discoveries.
Rise of Teams: As noted in the first figure of this post, academic work is increasingly the work of ever larger teams of specialists (the burden of knowledge has been proposed as one explanation for this). There is some evidence teams are less likely to come up with disruptive new ideas, which might in turn lay the foundation for new and productive incremental work.
Premise is wrong: Perhaps there is no significant slowdown in scientific progress, because the relationship between progress and citations, prizes, and topic count has changed over time.