Do studies based on patents get different results?

For the sample on New Things Under the Sun, not really

Published on Mar 29, 2024
Lots of social science research about innovation relies on patents as a way to measure innovation. But it’s not clear that patents are a great way to measure innovation. Probably only a relatively small share of inventions receive patent protection; moreover, while patenting does predict a lot of other measures of innovation, the linkage tends to be a pretty noisy one. Maybe the patent-based innovation literature is built on a foundation of sand?

One way to validate patents as a measure of innovation is to exploit the fact that tons of papers study the same phenomena with different datasets: some use patents, some don’t. Do they tend to arrive at different results? If so, that suggests the papers using patent data might be picking up something unique about patents, rather than something about innovation per se. On the other hand, if analyses built on patent and non-patent data tend to get similar results, that suggests patents are roughly as good a measure of innovation as the available alternatives.

I think New Things Under the Sun can itself be a useful data source on this particular question. At the time of writing (March 2024), New Things Under the Sun consists of 73 articles that synthesize multiple academic papers to examine various narrow claims about innovation. I count 37 articles that discuss studies built on both patent and non-patent data. Among these 37, how often do the patent-based analyses disagree with the non-patent analyses?

I looked them over to see.

My takeaway from this exercise is that studies relying on patent data tend to obtain similar results to those that don’t. In 31/37 (84%) of the claims I’ve looked at, I didn’t think there was meaningful disagreement between the patent and non-patent studies: regardless of which type of data is used for a problem, results were broadly consistent. In the other 6/37 (16%), I thought there was generally a mix of agreement and disagreement. The patent and non-patent data differed along some qualitatively important dimension, though even in these cases I didn’t find uniform disagreement. For example, in the article Are ideas getting harder to find because of the burden of knowledge?, non-patent data indicates first discoveries are being made at increasingly older ages, but patent data doesn’t show this. However, both the patent and non-patent data were consistent with team sizes increasing and specialization increasing. Nonetheless, because there was some disagreement, I classified this article as exhibiting some disagreement between the patent and non-patent evidence.

Actually, I’m not sure the differences I found between patent and non-patent data were any more severe than you would find if you were to compare studies of the same phenomena that use the same kind of data (for example, two papers looking at the same thing with data on journal articles). That said, note that my definitions of agreement and disagreement are kind of loose and subjective: I count results as agreeing if they are directionally the same, rather than numerically the same. Moreover, not all of the scope for agreement and disagreement was super substantive. Sometimes the bulk of the evidence comes almost exclusively from patent data or almost exclusively from non-patent data, and the data from the other source only covers a part of the overall claim. Even so, it’s a bit surprising to me there aren’t more disagreements, since in some cases there are important differences between the kinds of innovation studied with patent and non-patent data.

In the next section, I display how I classified these 37 articles, along with a short description of where I saw agreement or disagreement. Feel free to skip it for some further discussion about potential biases with this exercise, due to selection effects.

Classifications of New Things Articles

At least some disagreement

  1. Age and the impact of innovation: As scientists or inventors age, their work receives fewer citations, from a narrower set of inventors, and becomes less disruptive as measured by both papers and patents. But productivity over an academic lifecycle appears to remain high for a longer period of time (as measured by production of papers) than productivity of an inventor (as measured by patents).

  2. Are ideas getting harder to find because of the burden of knowledge? The age of first scientific discovery has steadily increased, while the age of first patent rose, but then fell. However, evidence from both patents and academic papers finds that team size and specialization are on the rise.

  3. How common is independent invention? Evidence from both patents and papers finds the incidence of simultaneous independent discovery is quite rare; but the rate implied by patent interference hearings is orders of magnitude lower than for papers. At the same time, evidence from both patents and papers suggests multiple independent discovery is more likely for more valuable research ideas.

  4. Innovation (mostly) gets harder: The same level of research effort yields fewer successively smaller improvements by most measures. This is not true for raw patent counts, but is true for one measure of particularly innovative patents. 

  5. Teaching innovative entrepreneurship: One study of two particular entrepreneurship training programs looked at many different indicators of successful entrepreneurship. Neither program had a statistically significant effect on patenting by participants. For one of the programs, this was consistent with it having no impact on any other measures; for the other, it had a positive effect on some measures of successful entrepreneurship, but not patents and a few other measures.

  6. The best new ideas combine disparate old ideas: Patents and papers that comprise unusual combinations of ideas are associated with higher impact. There is some evidence that the highest impact papers also make some more conventional combinations than patents.

No disagreement

  1. Adjacent knowledge is useful: Patent evidence from agricultural technology, and a variety of non-patent evidence, suggest knowledge spillovers tend to be most often pulled from fields that are not too “far” away.

  2. Age and the nature of innovation: Evidence from academia and patentees is consistent with older innovators relying on older ideas in their work. Measures of how disruptive a paper or patent is also decline with the age of the author.

  3. An example of high returns to publicly funded R&D: Comparing companies that barely win an SBIR grant to those that barely lose, the winners get more patents, but also do better on a variety of other measures of business success.

  4. Big firms have different incentives: Analyzing the text of patents indicates larger firms have more process patents; survey data also indicates larger firms spend a greater share of R&D on processes.

  5. Building a new research field: Scientists who pivot in their research topics are less likely to produce highly cited research; inventors who jump to working in a new technology class receive fewer citations to their patents.

  6. Do academic citations measure the impact of new ideas? Patents, like government policy papers, are disproportionately likely to cite academic research that is highly cited within academia.

  7. Entrepreneurship is contagious: People exposed to entrepreneurial peers are more likely to become entrepreneurs themselves, as measured by entrepreneurial activity. Postdocs with advisors who have patents are more likely to patent themselves as well.

  8. Free knowledge and innovation: Patents incorporate information available at local (physical) libraries. Similarly, academic articles in chemistry incorporate information available freely on Wikipedia.

  9. Gender and what gets researched: Evidence from both patents and academia finds that women are more likely to research medical problems related to their gender. There is also some evidence from both that as gender representation improves, men also become more likely to work on these topics.

  10. Geography and what gets researched: Evidence from both patents and academia finds that people are more likely to conduct innovation related to local problems and priorities.

  11. Highly cited innovation takes a team: Academic papers, patents, and software all receive more citations as the size of the team that created them rises. Comic books by bigger teams are also more valuable. Other related variables also correlate similarly with team size, across papers and patents.

  12. How long does it take to go from science to technology? The statistical correlation between funding for relatively basic science and subsequent productivity gains is strongest at around 20 years. The typical gap between when a patent application is filed and an academic article it cites is similarly long.

  13. How to impede technological progress: Policies that make the return on research effort less rewarding disproportionately impact marginal players, in both academic settings, where innovation is measured in papers, and in industry, where innovation is measured with patents or with new drug products.

  14. Importing knowledge: Evidence from both patents and academic paper citations shows that immigration seeds knowledge prevalent in the originating country among non-immigrants in the receiving country.

  15. Innovators who immigrate: When US or EU inventors emigrate, their patenting rises. Similarly, when scientists move to well-resourced places for science, their academic productivity rises (across many measures).

  16. Is technological progress slowing? The case of American agriculture: Patent data indicates agricultural invention substantially builds on knowledge discovered outside the agricultural sector; TFP data suggests agricultural productivity growth follows productivity growth in the rest of the economy with a long lag.

  17. Knowledge spillovers are a big deal: Data from patents, academic papers (and the grants that fund them), and R&D spending all suggest the quantitative impact of knowledge spillovers is large.

  18. More science leads to more innovation: A variety of patent data documents linkages between the supply of scientific research and subsequent technological progress. There is also a correlation between the supply of scientific publications and industrial productivity in related sectors, after a substantial lag.

  19. Publish or perish and the quality of science: Researchers working outside the academic system in structural biology tend to be higher quality, holding constant the citation potential of a protein. Patent evidence suggests industry prefers industry research to academic research, holding constant the nature of the discovery.

  20. Pulling more fuel efficient cars into existence: Rising fuel prices and fuel efficiency standards tend to improve the fuel efficiency of cars, whether measured by patents or the actual traits of vehicles.

  21. Remote breakthroughs: Innovators increasingly collaborate at a distance, whether we measure collaboration among patentees or coauthors on academic papers. More remote teams have typically been less disruptive/novel than colocated ones, but this effect has moderated or even reversed over time, whether measured by papers or patents.

  22. Science is getting harder: Both patents and academic papers have become progressively less likely to cite recent academic work for several decades.

  23. Science is good at making useful knowledge: Papers that are highly cited in one domain, tend to be more highly cited in other domains as well. Economics papers highly cited by economists are likely to be cited outside economics; academic work highly cited by other academics is likely to be cited by patents.

  24. Teacher influence and innovation: Various studies show students adopt the interests of their mentors, where interest is measured in several different ways, including interest in seeking a patent and other non-patent measures.

  25. The internet, the postal service, and access to distant ideas: When the cost of communicating via text between two geographically distant establishments of the same firm falls because they get internet access, they are more likely to cite each other’s patents or collaborate. When the cost of communicating via text fell in Great Britain, due to postal reforms, distant regions were more likely to cite each other’s scientific work.

  26. The size of firms and the nature of Innovation: As firms get larger they obtain fewer inventions per R&D dollar, whether inventions are measured with patents or alternatives. The inventions they do get also tend to be more incremental, again, whether we measure with patent-based proxies or others.

  27. Transportation and innovation: When regions are better connected by transit links, collaboration by inventors and scientists across those regions increases, as measured by either patents or papers.

  28. What does peer review know? One study looking at NIH peer review scores finds that grants with higher scores tend to lead to more publications, more citations, and more patents.

  29. When extreme necessity is the mother of invention: Covid-19 spurred a surge in new invention in technologies to mitigate its effects, whether medical treatments (measured by new clinical trials) or patent applications for remote work technology.

  30. When technology goes bad: A greater share of R&D is focused on health and safety, whether measured by the share of patents that correspond to medical technologies, or the share of publicly funded research spending on health and environment.

  31. Why proximity matters: who you know: Evidence from patent citations is consistent with a story where distance is not a strong impediment to sharing knowledge with people you have relationships with, but is an impediment to forming such relationships. This is consistent with evidence from academia.

Selection Bias?

The above finds broad agreement between innovation studies that use patent data and those that don’t, where they study closely related phenomena. But we might reasonably worry: is this just an artifact of selection?

Indeed, there are multiple possible layers of selection bias.

The first level of selection bias is that researchers decide when and when not to use patent data. In this post’s exercise, I’m only observing the cases where the researcher thought patents would be an appropriate measure of innovation and where I thought the paper was a good fit for New Things Under the Sun. And so the claim that “patent and non-patent data tend to arrive at similar conclusions” only applies to the set of claims where researchers thought patents were an appropriate dataset (and I thought the researchers wrote a nice paper).

To give a concrete example, I have a series of posts about publication bias in the sciences - the notion that the research record gives us a biased picture of evidence since only positive findings are publishable. Only one of those posts features any studies reliant on patent data (see #19 above in the “No disagreement” list). It makes sense that few researchers thought it was appropriate to study publication bias with patents, since publication bias is typically assumed to be an outcome of incentives that are peculiar to academia, not private sector invention. If someone did try to study publication bias in patents, they might get quite a different result than if they had studied it with data on journal articles.

The upshot is that this post’s analysis implies that if you think a paper is by a good researcher and it uses patent data, the results of the paper would probably agree with another paper on the same topic that didn’t use patent data. But, if you instead start with a specific research question, these results don’t imply you would get the same results whether you use patents or not. They instead imply that you would, if it’s the kind of research question that researchers think patents are appropriate for. If it’s not, then the results of this post don’t really apply. The claim is not that patents measure innovation well in all cases. The claim is that innovation researchers have done a decent job of restricting their attention to cases when patents do work well.

There is a second potential layer of selection bias though, above the researcher’s own decision about whether to use patents. Publication bias might actually be giving us a skewed perception of how reliable patent data itself is! Suppose that patents really are a bad measure of innovation, and accordingly they rarely deliver positive findings. It might be the case that we only observe the papers that do get positive results, since those are the only ones that are publishable. If this issue is serious, it would mean I’m overstating the extent to which research using patent data arrives at similar conclusions to research that does not. I think the popularity of patent data as a data source is some evidence against this concern - if the data had a reputation for leading disproportionately often to unpublishable null results, it probably wouldn’t be so popular. But it is something to bear in mind.

Lastly, there could be bias from the fact that my choice of topics on New Things Under the Sun isn’t random. I like writing about topics I think are important or where I think academic research can tell us something useful. The latter preference is potentially a serious source of bias. All else equal I feel less enthusiastic writing about a field where there is a muddle of different findings depending on which dataset you use (though I would still write a post if I thought the topic was important). That might mean my selection of topics is biased towards claims where patent and non-patent data obtain similar results, since those are the ones where I’m most confident social science research can tell us something.

There’s at least one way to evaluate how much of a concern this should be. New Things Under the Sun is a living literature review. There might well be a selection bias in how I choose which articles to write. But after the articles are written, there is a lot less bias in my choice about what articles to update. One of my goals for this project is for the posts to provide an honest account of the state of the literature. That means if new studies come out that contradict what I’ve already written, I do feel obliged to update the post to reflect this. That presents an opportunity to check for this last form of selection bias. If updates tend to find more disagreement between patent and non-patent data than original articles, that would suggest my choice of what to initially write about is overstating the extent to which patent and non-patent studies agree.

Going through my newsletter archive, I found 20 updates to existing articles that include patent and non-patent data. Of these updates, 3 have at least some disagreement between the patent and non-patent analyses. The other 17 do not have any meaningful disagreement, in my judgment. This is pretty close to the ratio I found in my original survey of 37 articles that examine both patent and non-patent data. About 15% (3/20) of the time, there is some disagreement between analyses reliant on patent data and those that do not, compared to 16% in my main analysis. See the appendix for how I classified each of these 20 updates, along with a short description of the nature of agreement or disagreement.
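As a rough check on that comparison, the two disagreement rates can be recomputed directly from the counts reported above (6 of 37 articles, 3 of 20 updates). The minimal Python sketch below does that, and also runs a two-sided Fisher exact test on the resulting 2x2 table. The test is an assumption added here for illustration, not part of the original analysis, and the function names `share` and `fisher_exact_two_sided` are hypothetical helpers written for this example.

```python
from math import comb


def share(disagree: int, total: int) -> float:
    """Fraction of items showing at least some disagreement."""
    return disagree / total


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table (with the same
    margins) that is no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k: int) -> float:
        # Probability that k of the col1 "disagreement" cases fall in row 1.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(
        prob(k) for k in range(k_min, k_max + 1) if prob(k) <= p_obs * (1 + 1e-9)
    )
    return min(1.0, total)


# Counts taken from the text: (disagree, total).
articles = (6, 37)
updates = (3, 20)

print(f"articles: {share(*articles):.1%} with some disagreement")
print(f"updates:  {share(*updates):.1%} with some disagreement")

# Rows: articles, updates. Columns: some disagreement, no disagreement.
p_value = fisher_exact_two_sided(
    articles[0], articles[1] - articles[0],
    updates[0], updates[1] - updates[0],
)
print(f"two-sided Fisher exact p-value: {p_value:.3f}")
```

With samples this small, a test like this can only rule out large differences between the two rates. It also says nothing about the subjective classification step that produced the counts in the first place.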

All in all, this exercise formalizes an intuition I’ve had for a long time. I’ve noticed that when I write about studies that use patent data, I often encounter some skepticism. For that very reason, I often go out of my way to try and find articles that do not rely on patent data, but which study the same phenomena as the patent-based papers I’m writing about. And in my experience, that exercise rarely leads me to substantively revise my original views. In the academic literature, if it’s possible and sensible to study a question with both patent data and non-patent data, in my experience results are subjectively similar.

New articles and updates to existing articles are typically added to this site every three weeks. To learn what’s new on New Things Under the Sun, subscribe to the newsletter.

Cited in the Above

(setting aside the list of 37 articles I classified)

How many inventions are patented?

Patents (weakly) predict innovation

Cites the Above

Can we learn about innovation from patent data?

Indexed at

Study of innovation


Appendix

In the following, I classify updates to existing articles to see whether the update disagreed with evidence in the original article.

At least some disagreement

  1. April 2022 Updates: This post discussed a major revision to one of the first articles written for New Things Under the Sun. This update introduced some of the nuances about the differential reliance on conventional combinations of ideas between papers and patents.

  2. December 2022 Updates: Expanded the post Innovation (mostly) gets harder to talk about patents. While many measures of innovation suggest a constant level of research effort has diminishing returns, raw patent counts do not display this trend; but a measure of particularly innovative patents does.

  3. September 2023 Updates: Survey evidence shows the chances a firm has at least one process innovation grows faster than the probability it has at least one product innovation as the firm gets larger, until the size of the firm hits 50 employees, at which point the relationship reverses. In contrast patent evidence shows that as firms get larger a bigger share of patents are processes. That said, this result is a bit ambiguous, since the survey data doesn’t tell us the share of innovations that are process or product, just whether firms have at least one.

No disagreement

  1. The Future of New Things Under the Sun: An update to “How important are knowledge spillovers?” (since renamed Knowledge spillovers are a big deal) adding a study using patent data agreed with earlier studies that knowledge spillovers are large in magnitude.

  2. Science as a map of unfamiliar terrain: This included an update to Are ideas getting harder to find because of the burden of knowledge. The update, using non-patent data, provided evidence that domains of math with a large influx of new ideas saw bigger teams and more specialization. Patents also display a trend towards larger teams and specialization.

  3. Money! Novelty! Coworkers! This included an update to Entrepreneurship is contagious that used non-patent data to show exposure to entrepreneurial coworkers leads to more entrepreneurship; the original post noted postdocs with advisors who have a patent are more likely to patent.

  4. January 2022 Updates (1): One update to Why proximity matters: who you know provides patent evidence that social connections help mediate knowledge transfer, in line with the non-patent data on this post.

  5. January 2022 Updates (2): An update to Free Knowledge and Innovation includes an updated analysis of how patent libraries facilitate access to knowledge that is reflected in new patents; the original post found Wikipedia also facilitated access to knowledge in chemistry papers.

  6. January 2022 Updates (3): An update to Are ideas getting harder to find because of the burden of knowledge found indirect evidence of rising specialization in academia (because it becomes less likely you can publish a highly cited paper when you switch topics), generally matching patent data that documents rising specialization.

  7. February 2022 Updates (1): An update to Importing knowledge used patent data to document that immigrants give locals access to new ideas from their originating country, matching some similar evidence from academia.

  8. February 2022 Updates (2): An update to Gender and what gets researched presented non-patent data from the discipline of history showing that women tend to study different topics than men, but that this difference has attenuated as the share of women in the field has risen. The original post presented evidence from patents with similar results.

  9. February 2022 Updates (3): An update to Adjacent knowledge is useful presented non-patent evidence that cross-field collaborations are most likely between researchers in fields not “too” far away. The original post used some data on agricultural patents to provide evidence ideas from fields not too far away were the most useful for agriculture.

  10. June 2022 Updates (1): Includes an update to Publish-or-perish and the quality of science which finds some small effects of incentives on the prevalence of mistakes in research; the original post looked at an article documenting a preference for research from industry as compared to research from academia, even on the same narrowly defined topic, using patent data.

  11. June 2022 Updates (2): Includes an update to The best new ideas combine disparate old ideas, which uses non-patent data that largely matches results on patents that weird combinations of ideas can turn out to be highly cited because they lay foundations for a flurry of new related discoveries.

  12. July 2022 Updates: Includes an update to The internet and access to distant ideas (now titled The internet, the postal service, and access to distant ideas) that shows falling communication costs contribute to citations from geographically distant scientists, largely matching modern patent evidence for a similar phenomenon.

  13. February 2023 Updates (1): An update to Planes, Trains, Automobiles, and Innovation (now renamed Transportation and innovation) included data on how the expansion of Chinese subway systems led to more collaboration on patents; earlier data showed similar results for academic publications.

  14. February 2023 Updates (2): An update to Remote Breakthroughs added discussion of a paper that showed, with patents, advantages to having collaborators who move to new places (and plausibly get exposure to new ideas). Another study, discussed in the original post, finds some similar evidence for academic papers.

  15. October 2023 Updates (1): The post discusses an update to Knowledge spillovers are a big deal, which uses non-patent data to document knowledge spillovers are substantial, something that the post also documents using some papers relying on patent data.

  16. October 2023 Updates (2): Another part of the post discusses an update to Age and the impact of innovation, which finds that as economists age, citations to their publications in a set of top journals decline substantially. The original post also shows that citations to patents decline substantially as the inventors age.

  17. January 2024 Updates: An update to the post Geography and What Gets Researched presents some evidence that management science scholars are more likely to study their own countries; patent evidence from the original post documented similar phenomena for the inventors of new agricultural technologies.
