#Georgetown Data Science Certificate
##Forecasting Peer-to-Peer Lending Risk
###Authors: Archange Giscard Destine, Steven L. Lerner, Erblin Mehmetaj, Hetal Shah
September 10, 2016
####Abstract
Peer-to-peer lending companies provide online platforms that quickly pair borrowers seeking a loan with investors willing to fund it at an attractive rate. Because these loans are unsecured and the companies creating the market generally do not invest their own capital, neither the borrowers nor the companies assume any risk; the entire credit risk is borne by investors. The literature shows that credit risk depends on borrower characteristics, loan terms, and regional macroeconomic factors. To help investors identify unsecured loans likely to be fully paid, a machine learning model was developed to forecast the probability of full payment and the probability of default. Training and input data consisted of historical loan data from Lending Club and state-level macroeconomic data from government and organizational sources. Logistic regression was shown to provide the best results, effectively isolating high-risk loans.
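
To illustrate the kind of output the abstract describes, the following is a minimal sketch of a logistic regression that returns a probability of full payment and a probability of default for a single loan. It is not the project's code: the two features (a scaled debt-to-income ratio and a scaled state unemployment rate), the synthetic data, and the function names `fit_logistic` and `predict_proba` are all assumptions made for illustration. The actual models and features are in the notebooks in this repo.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.5, epochs=1000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(w, b, x):
    """Return (P(fully paid), P(default)) for one loan."""
    p_paid = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
    return p_paid, 1.0 - p_paid

# Synthetic data (made up, NOT Lending Club data). Label 1 = fully paid.
random.seed(0)
X, y = [], []
for _ in range(200):
    dti = random.random()    # hypothetical scaled debt-to-income ratio
    unemp = random.random()  # hypothetical scaled state unemployment rate
    # Made-up rule: higher values lower the chance of full payment.
    p_true = sigmoid(2.0 - 3.0 * dti - 2.0 * unemp)
    X.append([dti, unemp])
    y.append(1 if random.random() < p_true else 0)

w, b = fit_logistic(X, y)
p_paid_low, p_default_low = predict_proba(w, b, [0.1, 0.1])     # low-risk profile
p_paid_high, p_default_high = predict_proba(w, b, [0.9, 0.9])   # high-risk profile
print(f"low-risk loan:  P(paid)={p_paid_low:.2f}  P(default)={p_default_low:.2f}")
print(f"high-risk loan: P(paid)={p_paid_high:.2f}  P(default)={p_default_high:.2f}")
```

An investor could then screen out loans whose predicted probability of default exceeds a chosen threshold; the threshold itself is a judgment call and is not specified here.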
####Repo Overview
The contents of the repo are described below.

- `data`: all the datasets used for this project, both raw and cleaned.
- `Data Wrangling -- Forecasting Peer-to-Peer Lending Risk.ipynb`: the code used to wrangle our data.
- `iclub-DefaultPrediction9.ipynb`: the code used to tune our models.
- `ForecastingPeer-to-PeerLendingRisk.ipynb`: all of our code.
- `ForecastingPeer-to-PeerLendingRisk.pdf`: our paper, which explains the project in detail.