Why does patenting yesterday predict patenting today?
There’s this idea that technology is characterized by path dependency: once you start going down one technology trajectory, you kind of get locked in and it’s hard to switch to another, possibly better trajectory. That can happen for lots of reasons, but one possibility is that it’s something about the nature of knowledge itself. The more you know, the more you can learn: knowledge begets more knowledge. So whichever technology trajectory we start on becomes the one we know the most about, and therefore the one it makes most sense to stick with.
One line of evidence about this comes from dynamics of patenting. You know what’s a pretty good predictor of patent activity in the future? Patent activity in the recent past. In this post, I want to see what we can learn from a literature that directly or indirectly looks at this dynamic. But while I think this line of evidence is useful, I want to be up front that it also has significant limitations. Most importantly, in this literature we almost never get anything like an experiment. Instead, we’re reading the tea leaves in observational data.
Specifically, we’re going to look at the conceptual category of a “patent stock” (also frequently called a “knowledge stock”). To illustrate just what a patent stock is, let’s start with a practical implementation of the notion in a paper.
Aghion et al. (2016) is a paper interested in three kinds of technological progress among automobiles: clean tech (think electric cars, hydrogen fuel cells, and hybrid vehicles), fossil fuel technology (think internal combustion engine), and “grey” tech (think more fuel efficient fossil fuel technology). The paper measures innovation in these different technologies by counting valuable patents. There are about 2,500 companies and 1,000 individuals who hold one of these patents, and Aghion and coauthors construct three patent stocks for each of these inventive entities, one for each of these three flavors of technological progress. (Constructing patent stocks is hardly the only thing this paper does, but it’s what I’m focusing on today.)
Constructing a patent stock basically means adding up all the patents that an inventive entity has taken out in the past, giving more weight to the more recent patents. To be very explicit, suppose Ford Motor Company had a clean technology patent stock of 1,000 last year and obtains 250 new patents for clean technology this year. Then, to construct the patent stock for the current year, we take last year’s patent stock, multiply it by 80%, and then add to that the number of new patents. So this year’s patent stock is 0.8 x 1000 + 250 = 1050. Suppose we get 150 patents next year. Then the patent stock next year is 0.8 x 1050 + 150 = 990. The exact value of 80% isn’t that important and people use different numbers, though always less than or equal to 100%. The key idea is it’s telling us, in a single number, something about the prior patent activity of the Ford Motor Company. Note that last year’s patents are worth less than this year’s patents (in this case, 80% as much). And since we apply the 80% discount each year, patents from two years ago are worth even less (80% x 80% = 64%).
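The discount-and-add rule is easy to sketch in a few lines of Python. This is just an illustration of the arithmetic above: the function names are my own, and the 80% retention rate is simply the example value from the text, not a number any particular paper is committed to.

```python
def update_patent_stock(previous_stock, new_patents, retention=0.8):
    """Discount last year's stock, then add this year's new patents."""
    return retention * previous_stock + new_patents


def patent_stock_path(initial_stock, patents_by_year, retention=0.8):
    """Return the patent stock at the end of each year."""
    stocks = []
    stock = initial_stock
    for new_patents in patents_by_year:
        stock = update_patent_stock(stock, new_patents, retention)
        stocks.append(stock)
    return stocks


def weight_on_patent(age_in_years, retention=0.8):
    """Weight a patent carries in the stock once it is age_in_years old."""
    return retention ** age_in_years


# The Ford example: a stock of 1,000, then 250 and 150 new patents.
ford_stocks = patent_stock_path(1000, [250, 150])
print([round(s) for s in ford_stocks])
print(round(weight_on_patent(2), 2))
```

Run on the Ford numbers, this reproduces the 1,050 and 990 stocks and the 64% weight on two-year-old patents.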
For the data that Aghion et al. (2016) have, for clean technology, a 10% increase in last year’s clean tech patent stock is associated with a 3% increase in this year’s new clean tech patents (for that firm). For fossil fuel technology, the link is even better: if a firm has a 10% higher patent stock in fossil fuels last year, that’s associated with 5% more new fossil fuel patents this year.
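One way to read statements like “a 10% increase is associated with a 3% increase” is as a rough elasticity of patenting with respect to the patent stock. The sketch below makes that reading concrete. The constant-elasticity form (patents proportional to the stock raised to some power) is my own simplifying assumption for illustration; the papers estimate their own statistical models, and the function names here are hypothetical.

```python
def implied_elasticity(pct_change_in_stock, pct_change_in_patents):
    """Rough elasticity: percent change in patenting per percent change in stock."""
    return pct_change_in_patents / pct_change_in_stock


def predicted_pct_change(elasticity, pct_change_in_stock):
    """Percent change in patents, assuming patents = A * stock ** elasticity."""
    growth_factor = (1 + pct_change_in_stock / 100) ** elasticity
    return (growth_factor - 1) * 100


clean = implied_elasticity(10, 3)    # clean tech figures quoted above
fossil = implied_elasticity(10, 5)   # fossil fuel figures quoted above
print(clean, fossil)
print(predicted_pct_change(clean, 10), predicted_pct_change(fossil, 10))
```

Under that assumption the numbers in the text correspond to elasticities of roughly 0.3 and 0.5, and an elasticity of 1 is the point where a 10% larger stock goes with 10% more patenting.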
This is quite a robust (though not universal) finding. Rozendaal and Vollebergh (2021) look at the same context (clean and dirty innovation in automobiles), but with a more recent slice of data (2000-2016, instead of Aghion and coauthors’ 1986-2005). In their sample, they find an even stronger link: roughly speaking, a 10% increase in last year’s patent stock is associated with a 10% increase in patenting this year. Noailly and Smeets (2015) do something similar for innovation in renewable energy and fossil fuel energy (not automobiles). They also find that firms sought many more clean or fossil fuel patents when last year’s patent stock (of the appropriate type) was higher.
You can also go beyond firms and look at whole countries. Looking specifically at US patents, a famous 2002 paper by David Popp uses a slightly different approach to create patent stocks for 11 different technologies related to energy. He finds that a 10% increase in last year’s patent stock for a particular technology was associated with 7% more patents for that technology this year. And Porter and Stern (2000) compute patent stocks for 17 different OECD countries (looking at patents they seek in the USA, to hold consistent the definition of a patent). Again, a 10% increase in a country’s patent stock last year is associated with 8-11% more patents this year.
This isn’t a universal finding. Lazkano, Nøstbakken, and Pelli (2017) look at innovation in renewable energy, conventional energy, and storage (battery) technology. Unlike the work discussed above, they typically find the opposite result: a higher patent stock for energy storage technology last year is associated with less energy storage patenting today. Similarly for renewable energy. That said, this paper is the outlier (I’ll return to it briefly later). For now, let’s proceed with the understanding that a positive link between yesterday’s patent stock and today’s patenting is a pretty robust correlation. But before jumping from a correlation to a conclusion, we need to think a bit harder about what’s really going on here. Why is there this correlation?
(And if you feel like saying “patents don’t measure innovation!” I hear you, but bear with me for a bit)
One potential explanation is quite interesting. Suppose:
Patent stocks measure how much knowledge we have about a technology
When we have more knowledge, it’s easier to discover new things
The new knowledge we discover gets added to our existing knowledge
If this is what’s “really” going on, then it explains why we have a positive link between last year’s patent stock and this year’s patenting. Knowledge begets more knowledge! And this is especially true of more recent knowledge, for which we have not already wrung out all the possible implications (hence the higher weight on recent patents).
If we really believe this story, we can even use the estimated statistical models to make neat little forecasts of how technologies will develop. For every year, we can use last year’s patent stock to predict how many new patents will be discovered. We then use that to predict what next year’s patent stock will be. Rinse and repeat.
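Here is a minimal sketch of that “rinse and repeat” loop. It assumes new patents equal a scale factor times last year’s stock raised to an elasticity, which is a stand-in for whatever model has actually been estimated; the scale of 25, the elasticity of 0.3, and the starting stock of 1,000 are all made-up inputs.

```python
def forecast_patenting(initial_stock, years, scale, elasticity, retention=0.8):
    """Iterate: predict new patents from last year's stock, then update the stock.

    Assumes new_patents = scale * stock ** elasticity, a placeholder for
    an estimated statistical model.
    """
    stock = initial_stock
    path = []
    for year in range(1, years + 1):
        new_patents = scale * stock ** elasticity
        stock = retention * stock + new_patents
        path.append({"year": year, "new_patents": new_patents, "stock": stock})
    return path


# Hypothetical inputs: a stock of 1,000, elasticity 0.3, arbitrary scale.
for row in forecast_patenting(1000, years=5, scale=25, elasticity=0.3):
    print(row["year"], round(row["new_patents"], 1), round(row["stock"], 1))
```

Each pass through the loop uses only last year’s stock, so the forecast is only as good as the assumption that the stock is what is really driving patenting.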
There’s also a policy implication. If we can temporarily accelerate the accumulation of knowledge in a field, it can pay huge dividends. That’s because the benefits will compound, since they’ll enable more knowledge creation next year, which will enable more knowledge creation in the following year and so on.
Moreover, this model implies path dependence is a powerful force in technology. If one technology gets a minor head start, it might keep its lead for a very long time. In fact, if the relationship is strong enough (a 10% increase in the patent stock increases patenting by 10% or more), then all else equal a technology that is a bit behind can never catch up!
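To see why the 10% threshold matters, here is a toy comparison of a leader and a laggard that follow exactly the same rule. As a simplifying assumption of mine, new patents equal a scale factor times the stock raised to an elasticity; the starting stocks, the scale of 0.1, and the 20-year horizon are all arbitrary.

```python
def stock_ratio_after(years, elasticity, leader=1100.0, laggard=1000.0,
                      scale=0.1, retention=0.8):
    """Ratio of the leader's to the laggard's patent stock after some years.

    Both technologies follow the same assumed rule:
    new_patents = scale * stock ** elasticity.
    """
    for _ in range(years):
        leader = retention * leader + scale * leader ** elasticity
        laggard = retention * laggard + scale * laggard ** elasticity
    return leader / laggard


# The leader starts 10% ahead (a ratio of 1.1).
for elasticity in (0.5, 1.0, 1.1):
    print(elasticity, round(stock_ratio_after(20, elasticity), 3))
```

In this toy setup the gap shrinks when the elasticity is below 1, stays put when it is exactly 1, and widens when it is above 1.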
But before we get too far ahead of ourselves, we need to strongly consider some other potentially important explanations.
Let’s jettison the conceptual category “patent stock” for a minute. So far all we have really shown is that patenting in the recent past is correlated with patenting today. To think through why that might be the case, we need to think through what kinds of factors determine the number of patents in a given year (for a firm, technology, or country).
Since I’m an economist, I’m going to divide the potential factors into two categories: supply and demand. Supply factors are anything that affects the cost of getting patents (where cost is broadly construed). Demand factors are anything that affects the value of getting a patent. If patents get cheaper or more valuable, then we should expect inventors to seek more patents. And if the things that made them cheaper or more valuable persist over time, then patenting in the recent past predicts patenting today.
Yes, it’s true that knowledge is one kind of thing that makes it cheaper to discover new things and get patents. But many other non-knowledge factors might also be important. On the supply side, that might include things like more scientists/inventors; more physical capital (computers, laboratories, etc); improved access to financing; more patent lawyers; and so on. On the demand side that might include demand from consumers; new regulations favoring certain kinds of technology; or even just oddball idiosyncratic stuff like demand from a new CEO who thinks the firm should be patenting more of its existing inventions.
If we get any of these factors, we’ll get more patenting today and tomorrow (if the factor sticks around), which will deliver the correlation that patenting today predicts patenting tomorrow. But unlike the interesting “knowledge begets knowledge” theory, this doesn’t necessarily have the same policy implications, nor does it necessarily mean we have strong path dependency in technologies. If these other factors are driving the correlation, then if we increase patenting by hiring more scientists, subsidizing R&D, passing some new regulation, or whatever, the boost in patenting will last as long as the policy, but it won’t necessarily have any further knock-on effects.
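A small simulation shows how this can happen. In the toy world below, each firm’s patenting depends only on a persistent firm-level factor (think steady demand) plus noise; the patent stock never feeds back into patenting. All the numbers are invented, and the function names are my own.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_without_knowledge_effect(n_firms=200, years=10, retention=0.8, seed=0):
    """Patenting depends only on a persistent firm-level factor plus noise.

    Returns each firm's lagged patent stock and its final-year patenting.
    The stock never feeds back into patenting in this toy world.
    """
    rng = random.Random(seed)
    lagged_stocks, final_patents = [], []
    for _ in range(n_firms):
        persistent_factor = rng.uniform(50, 150)  # e.g. steady demand
        stock = 0.0
        for _ in range(years - 1):
            patents = persistent_factor + rng.gauss(0, 10)
            stock = retention * stock + patents
        lagged_stocks.append(stock)
        final_patents.append(persistent_factor + rng.gauss(0, 10))
    return lagged_stocks, final_patents


stocks, patents = simulate_without_knowledge_effect()
print(round(pearson(stocks, patents), 2))
```

Even with no “knowledge begets knowledge” channel at all, last year’s stock and this year’s patenting come out strongly positively correlated, because both reflect the same persistent factor.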
So I think it’s worth poking at this correlation to see how much of it can be explained by these various factors. And while our evidence base isn’t fantastic, I think a couple lines of evidence suggest the “knowledge begets knowledge” idea is a big part of the story.
The most straightforward way to parcel out the drivers of this patent stock/patenting correlation is to try and directly measure the plausible factors that might drive it and adjust for them wherever possible.
Let’s start with supply factors other than knowledge itself. For example, a firm that hires a lot of new R&D workers and invests in new labs might be able to crank out a lot of new (patented) inventions. As long as the firm has this elevated set of inputs, patenting might be elevated too, regardless of what’s happening to knowledge.
There certainly is a correlation between patent stocks and what we might call “inputs to invention.” Park and Park (2006) construct patent stocks for 23 different US manufacturing industries over 1976-1996 and compare them to the number of R&D workers in these sectors over the same time period, finding a correlation coefficient of about 0.3. Park and Park don’t have great measures of the non-labor inputs to research (lab equipment, computers, etc), but they do have total R&D spending, which should encompass that as well as the salaries of R&D workers. The correlation coefficient between patent stocks and R&D spending is higher, at 0.5.
So it’s true that patent stocks are partially but not totally explained by other supply side factors. Indeed, one reason many papers use patent stocks is precisely to proxy for these supply-side innovation factors, since it’s often easier to observe a firm’s patents than its employee headcount and R&D spend. But the correlation is a lot less than 1; we’re not explaining 100% of the variation just by looking at these supply factors.
Another strand of evidence comes from a paper that explicitly tries to adjust for the scientific labor force. Porter and Stern (2000) construct patent stocks for 17 different OECD countries and find a strong correlation between last year’s patent stock and this year’s patenting, even after adjusting for each country’s R&D workforce. Even within the same country, comparing two years where the scientific workforce is unchanged, if there is a 10% larger patent stock in one of those two years, they find that’s associated with 11% more patents in the following year.
So we might consider that some evidence that this isn’t entirely about supply side factors. What about demand?
We have pretty good evidence in the automobile market and in the life sciences that R&D is pretty responsive to changes in market demand. When fuel prices or more stringent emissions standards make conventional cars less attractive, carmakers invest in more clean technology innovation. And when a disease becomes more prevalent or more profitable to treat, private sector companies respond by increasing related research. There’s no reason to think cars and biomedical research are special in this regard, and so we should expect demand side drivers to matter for lots of technologies.
A lot of these papers are specifically concerned with the transition from fossil fuels to clean energy. In that context, the relevant demand-side factors are the price of fossil fuels and government regulations that make it more attractive to invent and use renewable energy. And lots of papers have specifically controlled for these demand side factors, and still find this patent stock / patenting correlation:
For 11 different energy technologies, Popp (2002) finds past patent stocks are strongly correlated with present patenting after controlling for energy prices
For renewable energy, Noailly and Smeets (2015) find the same after controlling for energy prices and the size of the energy market
For clean and dirty auto innovation, Aghion et al. (2016) find past patent stocks are strongly correlated with present patenting after controlling for fuel prices and some proxies of government regulation
Rozendaal and Vollebergh (2021) find the same using more recent data, fuel prices, and an improved measure of the stringency of government regulation
These measures of demand-side factors are always going to be incomplete and imperfect, but they seem like plausible candidates for the first-order drivers of demand for different kinds of energy-related innovation. So that should tilt us a bit to the side of thinking the correlation between a patent-stock and subsequent patenting isn’t just about unobserved demand.
So where do we end up? In this article, we’re interested in how much the correlation between patent stocks and subsequent patenting is driven by knowledge generating more knowledge. We’ve seen this correlation persists when we try to control for various confounders, like the number of scientists or changing demand for different kinds of innovation. But it’s always possible we’ve missed something. Moreover, it’s a bit frustrating that we have one set of papers that looked at supply and another set that looked at demand. It might be that if we could control for both supply and demand at once, the patent stock / patenting correlation would be much smaller, or even disappear. So to close, let’s turn to a few papers that try to look more directly for positive evidence on knowledge itself.
To begin, there are a few papers that have attempted to show patent stocks really capture something like “technological know-how”. As I noted at the beginning, a potential critique up to now might have been, “patents don’t measure innovation!” For sure lots of patented stuff is junk and lots of brilliant stuff is not patented. But I think there’s signal in the noise.
For example, Park and Park (2006) and Porter and Stern (2000) both look to see how well patent stocks predict not just patenting but also “total factor productivity.” Total factor productivity - hereafter TFP - is a common (albeit highly imperfect) measure of technological capability. The basic idea behind it is if you can squeeze more economic outputs out of the same number of inputs (for example, capital and labor), then that’s a reflection of better technology. The challenge is you can’t measure TFP directly; instead, you directly measure outputs and inputs and try to predict the outputs with the inputs. Something like the gap in your predictions is your measure of technology. It’s a nice concept because it’s a general purpose measure of technology, but it’s very noisy and can be affected by things other than technological progress.
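To make the “gap in your predictions” idea concrete, here is a minimal sketch. It assumes output is predicted from inputs with a Cobb-Douglas formula and a capital share of 0.3; both are common textbook choices that I am assuming for illustration, not necessarily what either paper does, and the industry numbers are made up.

```python
def tfp_residual(output, capital, labor, capital_share=0.3):
    """Output divided by the output 'predicted' from inputs alone.

    Assumes a Cobb-Douglas prediction, capital**a * labor**(1 - a),
    with a = capital_share. Both the form and the 0.3 are assumptions.
    """
    predicted_from_inputs = capital ** capital_share * labor ** (1 - capital_share)
    return output / predicted_from_inputs


# Hypothetical industry: same inputs in both years, 5% more output.
before = tfp_residual(output=1000, capital=400, labor=250)
after = tfp_residual(output=1050, capital=400, labor=250)
print(round(after / before - 1, 3))
```

Because the inputs are unchanged, the whole 5% rise in output shows up as a rise in measured TFP, whether or not better technology is actually the cause.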
Nonetheless, it’s one of the few broad measures of technology we have. Park and Park (2006) look at how correlated the growth of TFP in an industry is with its patent stock. The correlation is there, though a bit weak; a 10% increase in the patent stock is associated with a 1.5% increase in the TFP growth rate. And Porter and Stern find similar results at the level of the country; a 10% increase in the national patent stock is associated with a 0.5% increase in the national TFP growth rate. To the extent patent stocks are correlated with a completely different measure of technological know-how (that is computed with no reference to patents), that’s some evidence patent stocks are picking up information about technological know-how. That know-how apparently affects economic performance, and so it’s not such a leap to imagine it also affects subsequent technological progress. To the extent the link is weak though, that’s either evidence that the signal about knowledge from patent stocks is pretty small, or that TFP is a crappy measure of technology, or - most likely, in my opinion - a bit of both.
Finally, a few papers try to go beyond merely counting patents, to get at some proxies for the knowledge associated with patented inventions. To the extent these papers actually do a better job of measuring knowledge (which is debatable), they tend to show more knowledge is associated with more patenting.
One of these is the previously mentioned Popp (2002). Popp attempts to use patent citations to measure the value of the knowledge created by patents. The basic idea is that if patents in a particular year have an unusually high probability of being cited, that suggests the patents in that year are particularly valuable as sources of knowledge. And if your knowledge stock contains lots of these high-knowledge patents, then it should generate more patents than another patent stock with the same number of patents but fewer of the high-knowledge type.
This is a tricky exercise; patent citations are a pretty noisy indicator of knowledge flows, though I tend to think they have some signal. Moreover, you need to be careful not to get the causality backwards, so that a surge of patents in later years causes more citations to patents in earlier years, creating this spurious correlation between the number of highly cited patents in a year and subsequent patent activity. Popp avoids this by using the probability a patent gets cited, which takes into account the number of patents that might cite it.
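A stripped-down illustration of why a citation probability is safer than a raw citation count: divide citations received by the number of possible cited-citing pairs. This is only my simplified stand-in for the idea, not Popp’s actual estimation procedure, and the counts are invented.

```python
def citation_rate(citations_received, cited_cohort_size, citing_cohort_size):
    """Citations per possible (cited patent, citing patent) pair."""
    possible_pairs = cited_cohort_size * citing_cohort_size
    return citations_received / possible_pairs


# Hypothetical: 100 patents from an earlier year, cited by later patents.
quiet_year = citation_rate(citations_received=50, cited_cohort_size=100,
                           citing_cohort_size=1000)
# Later patenting doubles, and raw citations double along with it.
boom_year = citation_rate(citations_received=100, cited_cohort_size=100,
                          citing_cohort_size=2000)
print(quiet_year, boom_year)
```

The raw citation count doubles when later patenting doubles, but the rate per possible citing patent does not move, so a surge in later patenting does not by itself make earlier patents look more knowledge-rich.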
In any event, Popp shows yesterday’s value of this more sophisticated knowledge stock is more tightly correlated with today’s patenting than the standard knowledge stocks that make no such allowances.
Another paper that tries to measure the knowledge in patent stocks is one of my own, Clancy (2017). This paper is based on the idea that innovation is a combinatorial process - new ideas are all about finding useful combinations of older pre-existing ideas. The underlying intuition is that it’s not the number of patents in a technology field that matters, but the number of useful combinations of ideas that the field knows about.
The challenge is finding sensible proxies for these concepts, and then testing them correctly. Fortunately, the US Patent and Trademark Office (USPTO) has developed a highly granular system for classifying the technologies described in patents. When I was doing the research, the USPTO had 450 different major technological “classes” and about 15,000 “mainline subclasses” which provided a more detailed description of technology. Importantly, most patents are assigned more than one of these mainline subclasses, which is one indication that two distinct “ideas” (each of which is described by one of these subclasses) have been successfully combined in the single patented invention.
Clancy (2017) defines two different measures of technological know-how for each of the 450 broad technology “classes.” The first of these is just a count of the number of patents assigned to the class, which is basically the same as the patent stocks we’ve been discussing so far. The second is a measure based on the number of pairs of mainline subclasses that are assigned to patents in that class, adjusted in a way to try and capture inventors’ depth of experience with combining the mainline subclasses. The paper then uses both of these measures to predict subsequent patenting.
Here’s an example to illustrate the basic idea. Suppose we have two different technologies: solar energy and fusion energy. Suppose they each belong to a different technology class. Pretend each has roughly the same number of patents associated with it, so that they have similar patent stocks. But suppose solar energy patents combine dozens of different technological categories and the fusion patents only combine a few different categories again and again. That might imply it would be easier to come up with new solar energy patents than fusion patents, because we only know how to do a few different kinds of things with fusion, and those things have been largely explored, while with solar we have dozens of avenues left to explore.
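In code, the contrast between counting patents and counting combinations might look like the sketch below. The subclass labels and patents are entirely made up, and this only counts distinct pairs; the actual measure in Clancy (2017) also adjusts for inventors’ depth of experience with each pair, which I do not attempt here.

```python
from itertools import combinations


def distinct_subclass_pairs(patents):
    """Set of distinct subclass pairs that appear together on any patent."""
    pairs = set()
    for subclasses in patents:
        pairs.update(combinations(sorted(set(subclasses)), 2))
    return pairs


# Made-up subclass labels; each inner list is one patent.
solar = [["coating", "glass"], ["glass", "wiring"], ["wiring", "inverter"],
         ["coating", "inverter"], ["polymer", "glass"], ["polymer", "wiring"]]
fusion = [["magnet", "plasma"], ["magnet", "plasma"], ["plasma", "laser"],
          ["magnet", "plasma"], ["plasma", "laser"], ["magnet", "plasma"]]

for name, patents in (("solar", solar), ("fusion", fusion)):
    print(name, len(patents), len(distinct_subclass_pairs(patents)))
```

Both toy fields have the same number of patents, so a plain patent count cannot tell them apart, but the pair count can.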
And that’s basically what Clancy (2017) finds. Within a given technological class, a 10% jump in this measure of “knowledge about how to combine pairs of technology” is associated with a subsequent 9% increase in patenting in that field. But now a 10% jump in the number of patents - holding fixed our measure of how much the field knows about how to combine pairs of technologies - is associated with a 30% drop in patenting. The notion here is that new patents can be good for future innovation (if they show how to do new things like combine rarely paired technologies) or bad for future innovation (if they merely exhaust possibilities that are already known). Perhaps this kind of argument can help explain why we occasionally find a higher patent stock implies less patenting, such as in Lazkano, Nøstbakken, and Pelli (2017)?
(As far as I know, Popp (2002) and Clancy (2017) are some of the only papers that adopt the patent stock approach but also experiment with developing improved measures of the “knowledge” in them. I suspect you could now do a lot more to develop measures of knowledge in a field, for example, by using the text in patents to measure the diversity of ideas in a field, or looking at how much the field is citing new scientific research versus old research.)
So, to step back and appraise the situation, we’ve got this correlation that’s pretty general: yesterday’s patenting tends to predict today’s. One interesting hypothesis to explain this stylized fact is that “knowledge begets knowledge;” a policy that pumps up knowledge production today could have a long echo into the future, as new knowledge enables yet more discovery. Indeed, if this relationship is strong enough, so that a 10% increase in knowledge leads to at least a 10% increase in the production of new knowledge, then you obtain strong path dependency effects. Once you get a technological paradigm started, such as fossil fuel derived energy, innovation is self-propelling and it becomes very hard for rival paradigms to ever catch up (without active assistance).
But we should be careful about leaping to that interpretation. We don’t have anything like a nice experiment or even quasi-experiment here; just messy correlations in observational data. In particular, it’s quite likely part of this correlation is driven by factors that don’t necessarily have this self-propelling character. It might be that the correlation is driven by the fact that fields that do a lot of patenting in the recent past have a lot of scientists working in them, and so we should expect those scientists to generate a lot of patents in the future as well. But it looks like this isn’t the whole story.
At the same time, it might be that the correlation is driven by the fact that technologies that are in high demand yesterday attract a lot of inventive effort yesterday, and so long as that demand remains elevated then they will continue to attract inventive effort today. But again, at least for energy innovation, this patent stock to patenting correlation seems to exist even after taking into account those demand-side factors.
Lastly, when we do try to drill down and measure knowledge more directly, we do find some evidence consistent with this view that the knowledge is a big part of what matters. Patenting is correlated with some other measures we have of technological know-how, and improved measures of knowledge seem to predict future patenting better than unimproved measures.
Despite all that, nature is tricky and it could well be that all the confounders I’ve already mentioned, plus things we’ve overlooked, drive this correlation. But we also have a few additional suggestive lines of evidence that knowledge matters for innovation, which are discussed at length elsewhere on New Things Under the Sun. For example, more scientific knowledge also seems to facilitate innovation, and knowledge spillovers seem quite important for innovation. That suggests we should approach this line of evidence already pretty open to the notion that we’ll find knowledge is an important explanatory factor.
So here is my main take-away from all this. Some part of the correlation between yesterday’s patent stock and today’s patenting is probably driven by knowledge itself, but not all of it. At the same time, we rarely find a 10% increase in yesterday’s patent stock is associated with a greater than 10% increase in today’s patenting, and when we do, it’s usually not much more than 10%. More commonly, we find a 10% increase in yesterday’s patent stock is associated with a positive but less than 10% increase in today’s patenting. Since I think knowledge probably only accounts for part of this correlation, I suspect, on average a 10% increase in knowledge (if we could measure knowledge correctly) leads to a less than 10% increase in knowledge discovery.
In other words, knowledge begets more knowledge, but with diminishing returns. Innovation tends to get harder. Policies that pump up knowledge do echo into the future; but they don’t echo forever. And technologies that get a head start will probably keep it for a long time; but in general, not forever.
Pulling more fuel efficient cars into existence
Medicine and the limits of market driven innovation
Measuring knowledge spillovers: the trouble with patent citations
Combination innovation and technological progress in the very long run
The best new ideas combine disparate old ideas
More science leads to more innovation
Science is good at making useful knowledge
Science as a map of unfamiliar terrain
Knowledge spillovers are a big deal
Innovation (mostly) gets harder
Learning curves are tough to use
How common is independent discovery?
Aghion, Philippe, Antoine Dechezleprêtre, David Hemous, Ralf Martin, and John Van Reenen. 2016. Carbon taxes, path dependency, and directed technical change: Evidence from the auto industry. Journal of Political Economy 124(1): 1-51. https://doi.org/10.1086/684581
Rozendaal, Rik, and Herman R.J. Vollebergh. 2021. Policy-Induced Innovation in Clean Technologies: Evidence from the Car Market. CESifo working paper no. 9422. http://dx.doi.org/10.2139/ssrn.3969578
Noailly, Joëlle and Roger Smeets. 2015. Directing technical change from fossil-fuel to renewable energy innovation: An application using firm-level data. Journal of Environmental Economics and Management 72: 15-37. https://doi.org/10.1016/j.jeem.2015.03.004
Popp, David. 2002. Induced Innovation and Energy Prices. American Economic Review 92(1): 160-180. https://doi.org/10.1257/000282802760015658
Porter, Michael E., and Scott Stern. 2000. Measuring the “ideas” production function: evidence from international patent output. NBER Working Paper 7891. https://doi.org/10.3386/w7891
Lazkano, Itziar, Linda Nøstbakken, and Martino Pelli. 2017. From fossil fuels to renewables: the role of electricity storage. European Economic Review 99: 113-129. https://doi.org/10.1016/j.euroecorev.2017.03.013
Park, Gwangman, and Yongtae Park. 2006. On the measurement of patent stock as knowledge indicators. Technological Forecasting and Social Change 73(7): 793-812. https://doi.org/10.1016/j.techfore.2005.09.006
Clancy, Matthew S. 2017. Combinations of technology in US patents, 1926-2009: a weakening base for future innovation? Economics of Innovation and New Technology 27(8): 770-785. https://doi.org/10.1080/10438599.2017.1410007