
This repository contains the code used in the paper "Estimating Technology Performance Improvement Rates by Mining Patent Data", coauthored with Jeff Alstott and Chris Magee, to be published by Technological Forecasting and Social Change


jsph-onl/TechnologyPerformanceImprovementEstimates


TechnologyPerformanceImprovementEstimates

This repository contains the code used in the paper "Estimating Technology Performance Improvement Rates by Mining Patent Data", coauthored with Jeff Alstott and Chris Magee, published in Technological Forecasting and Social Change and freely available at https://www.sciencedirect.com/science/article/pii/S0040162520309264.

Patent data used by the code is available at http://dx.doi.org/10.17632/f4fj887y67.1. Processed patent citation data is available at https://zenodo.org/record/3902550#.Xu-DlWhKiUk. Raw data from the USPTO is available at https://www.patentsview.org/download/. The content of the data files and the methods used to create them are described in the article G. Triulzi, C.L. Magee, "Functional performance improvement data and patent sets for 30 technology domains with measurements of patent centrality and estimations of the improvement rate", published in Data in Brief and available at https://doi.org/10.1016/j.dib.2020.106257. If you use the data, please cite both articles.

The code is structured as a series of Jupyter Notebooks, which use two datasets available on Mendeley Data and Zenodo.

  • Notebook: Compute entropy of assignees in domains
  • Notebook: Compute Normalized Knowledge Obsolescence Index at patent and domain levels
  • Notebook: Monte Carlo Cross Validation (with paper figures)
    • What it does: for each candidate patent-based predictor of the technology performance improvement rate (TIR), the code computes the predictor using only data up to a given year (for each year from 1980 to 2015), randomly samples half of the technology domains, trains a regression to predict TIR on that half, and tests it on the remaining half. It then produces a figure showing, over time, the correlation between the observed log of the TIR (a.k.a. the parameter “K”) and the predictor, together with the predictor's coefficient and the intercept. Note that the notebooks “Compute entropy of assignees in domains” and “Compute Normalized Knowledge Obsolescence Index at patent and domain levels” should be run before “Monte Carlo Cross Validation (with paper figures)”, as the latter uses inputs produced by the former two.
    • Inputs:
    • Output:
      • DF_stability_prediction_over_time_MONTE_CARLO_COMPARISON.csv
      • Figure 3 and S6 from the paper.
  • Notebook: Estimate TIR for new domains
    • What it does: it runs a regression that uses the best predictor identified by the Monte Carlo cross-validation exercise to estimate TIRs for the 30 domains based on that predictor only. It then uses the estimated regression coefficients and the values of the predictor to estimate the TIR of 5 out-of-sample domains related to Bio-electronic Medicine, for which we have only patent data (no empirical TIR observations are available). This notebook can be used to estimate the yearly performance improvement rate for any new technology domain; you only need a list of US patent numbers for the new domain. If you input a patent dataset that contains more than one domain, it will also compute the likelihood that one domain is faster than the other(s).
    • Inputs:
    • Output:
      • REGRESSION_DATA.xlsx
      • TABLE_estimated_k_meanSPNPcited_1year_before_randomized_zscore_RPbyYear.xlsx
      • Figure “density_estimated_rate_new_domains” in PDF and TIFF formats
      • Figure “likelihood_domain_faster_than_other_domain” in PDF and TIFF formats
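
To illustrate the split-half procedure that the Monte Carlo cross-validation notebook is described as running, here is a minimal sketch for a single cutoff year. It is not the notebook's code: the function name `monte_carlo_cv`, its parameters, and the synthetic data are all hypothetical, and a plain one-variable least-squares fit stands in for whatever regression the notebook actually uses. In the notebook the same exercise is repeated for each cutoff year.

```python
import numpy as np


def monte_carlo_cv(predictor, log_tir, n_splits=200, seed=0):
    """Split-half Monte Carlo cross validation for one predictor.

    Hypothetical helper: for each random split, fit
    log_tir = intercept + slope * predictor on half of the domains and
    score the fit on the held-out half.
    Returns an array of shape (n_splits, 3): slope, intercept, test correlation.
    """
    rng = np.random.default_rng(seed)
    predictor = np.asarray(predictor, dtype=float)
    log_tir = np.asarray(log_tir, dtype=float)
    n_domains = len(predictor)
    half = n_domains // 2

    rows = []
    for _ in range(n_splits):
        # Randomly assign half of the domains to training, the rest to testing.
        order = rng.permutation(n_domains)
        train, test = order[:half], order[half:]

        # Ordinary least squares with one regressor (assumed model form).
        slope, intercept = np.polyfit(predictor[train], log_tir[train], deg=1)

        # Correlation between predicted and observed log TIR on held-out domains.
        predicted = intercept + slope * predictor[test]
        test_corr = np.corrcoef(predicted, log_tir[test])[0, 1]
        rows.append((slope, intercept, test_corr))
    return np.array(rows)


# Synthetic stand-in for 30 domains; these are NOT the paper's data.
data_rng = np.random.default_rng(42)
x = data_rng.normal(size=30)
y = -2.0 + 0.5 * x + data_rng.normal(scale=0.1, size=30)

results = monte_carlo_cv(x, y)
mean_slope, mean_intercept, mean_corr = results.mean(axis=0)
```

Averaging each column of `results` over the random splits gives one summary value per cutoff year; the spread across splits indicates how sensitive the fit is to which domains happen to be in the training half.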

Accompanying code for computing patent centrality indicators, normalized by randomizing the US patent citation network, is available on my co-author's page at: https://github.com/jeffalstott/patent_centralities.
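
As a rough illustration of what normalizing an indicator against randomized networks can look like, the sketch below computes a z-score of each patent's observed value relative to the values it takes across a set of randomized networks. This is an assumption about the general idea, suggested by the "randomized_zscore" part of the output file name above, and not a description of the linked repository's implementation. The function `randomized_zscore` and the toy numbers are hypothetical.

```python
import numpy as np


def randomized_zscore(observed, randomized):
    """Z-score of observed values against a randomized reference set.

    Hypothetical helper.
    observed:   shape (n_patents,), indicator measured on the real network.
    randomized: shape (n_randomizations, n_patents), the same indicator
                measured on each randomized network.
    """
    observed = np.asarray(observed, dtype=float)
    randomized = np.asarray(randomized, dtype=float)

    mean = randomized.mean(axis=0)
    std = randomized.std(axis=0, ddof=1)  # sample standard deviation

    # The z-score is undefined where the randomized values do not vary,
    # so those entries are returned as NaN.
    with np.errstate(divide="ignore", invalid="ignore"):
        z = (observed - mean) / std
    return np.where(std > 0, z, np.nan)


# Toy values for two patents across three randomizations (illustrative only).
observed_values = [3.0, 1.0]
randomized_values = [[1.0, 1.0], [2.0, 1.0], [3.0, 1.0]]
z_scores = randomized_zscore(observed_values, randomized_values)
```

A positive z-score means the patent scores higher on the real network than would be expected from the randomized ones.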
