## Table of Contents and Code Notebooks

Simply click on the `ipynb`/`nbviewer` links next to the chapter headlines to view the code examples (currently, the internal document links are only supported by the NbViewer version).

**Please note that these are just the code examples accompanying the book, which I uploaded for your convenience; be aware that these notebooks may not be useful without the formulae and descriptive text.**

1. Machine Learning - Giving Computers the Ability to Learn from Data [[dir](./ch01)] [[ipynb](./ch01/ch01.ipynb)] [[nbviewer](http://nbviewer.ipython.org/github/rasbt/python-machine-learning-book/blob/master/code/ch01/ch01.ipynb)]
2. Training Machine Learning Algorithms for Classification [[dir](./ch02)] [[ipynb](./ch02/ch02.ipynb)] [[nbviewer](http://nbviewer.ipython.org/github/rasbt/python-machine-learning-book/blob/master/code/ch02/ch02.ipynb)]
3. A Tour of Machine Learning Classifiers Using Scikit-Learn [[dir](./ch03)] [[ipynb](./ch03/ch03.ipynb)] [[nbviewer](http://nbviewer.ipython.org/github/rasbt/python-machine-learning-book/blob/master/code/ch03/ch03.ipynb)]
4. Building Good Training Sets – Data Pre-Processing [[dir](./ch04)] [[ipynb](./ch04/ch04.ipynb)] [[nbviewer](http://nbviewer.ipython.org/github/rasbt/python-machine-learning-book/blob/master/code/ch04/ch04.ipynb)]
5. Compressing Data via Dimensionality Reduction [[dir](./ch05)] [[ipynb](./ch05/ch05.ipynb)] [[nbviewer](http://nbviewer.ipython.org/github/rasbt/python-machine-learning-book/blob/master/code/ch05/ch05.ipynb)]
6. Learning Best Practices for Model Evaluation and Hyperparameter Optimization [[dir](./ch06)] [[ipynb](./ch06/ch06.ipynb)] [[nbviewer](http://nbviewer.ipython.org/github/rasbt/python-machine-learning-book/blob/master/code/ch06/ch06.ipynb)]
7. Combining Different Models for Ensemble Learning [[dir](./ch07)] [[ipynb](./ch07/ch07.ipynb)] [[nbviewer](http://nbviewer.ipython.org/github/rasbt/python-machine-learning-book/blob/master/code/ch07/ch07.ipynb)]
8. Applying Machine Learning to Sentiment Analysis [[dir](./ch08)] [[ipynb](./ch08/ch08.ipynb)] [[nbviewer](http://nbviewer.ipython.org/github/rasbt/python-machine-learning-book/blob/master/code/ch08/ch08.ipynb)]
9. Embedding a Machine Learning Model into a Web Application [[dir](./ch09)] [[ipynb](./ch09/ch09.ipynb)] [[nbviewer](http://nbviewer.ipython.org/github/rasbt/python-machine-learning-book/blob/master/code/ch09/ch09.ipynb)]
10. Predicting Continuous Target Variables with Regression Analysis [[dir](./ch10)] [[ipynb](./ch10/ch10.ipynb)] [[nbviewer](http://nbviewer.ipython.org/github/rasbt/python-machine-learning-book/blob/master/code/ch10/ch10.ipynb)]
11. Working with Unlabeled Data – Clustering Analysis [[dir](./ch11)] [[ipynb](./ch11/ch11.ipynb)] [[nbviewer](http://nbviewer.ipython.org/github/rasbt/python-machine-learning-book/blob/master/code/ch11/ch11.ipynb)]
12. Training Artificial Neural Networks for Image Recognition [[dir](./ch12)] [[ipynb](./ch12/ch12.ipynb)] [[nbviewer](http://nbviewer.ipython.org/github/rasbt/python-machine-learning-book/blob/master/code/ch12/ch12.ipynb)]
13. Parallelizing Neural Network Training via Theano [[dir](./ch13)] [[ipynb](./ch13/ch13.ipynb)] [[nbviewer](http://nbviewer.ipython.org/github/rasbt/python-machine-learning-book/blob/master/code/ch13/ch13.ipynb)]

## Contact

I am happy to answer questions! Just write me an [email](mailto:[email protected]) or consider asking the question on the [Google Groups Email List](https://groups.google.com/forum/#!forum/python-machine-learning-book).

If you are interested in keeping in touch, I have quite a lively twitter stream ([@rasbt](https://twitter.com/rasbt)) all about data science and machine learning. I also maintain a [blog](http://sebastianraschka.com/articles.html) where I post all of the things I am particularly excited about.

    # Sebastian Raschka, 2015
    # convenience script for myself to create nested Markdown TOC lists
    # use as `python md_toc.py /blank_tocs/ch01.toc`

    import sys

    # path to the raw, indented TOC file given on the command line
    toc_path = sys.argv[1]

    with open(toc_path, 'r') as f:
        for line in f:
            # keep the leading whitespace so the Markdown list stays nested
            out_str = ' ' * (len(line) - len(line.lstrip()))
            line = line.strip()
            # prepend the list-item marker to the entry text
            out_str += '- %s' % line
            print(out_str)
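
For example, a `.toc` file containing the two indented lines below (contents illustrative)

    Machine learning
        Supervised learning

would be printed as the nested Markdown list

    - Machine learning
        - Supervised learning
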
Sebastian Raschka, 2015

Python Machine Learning - Code Examples

## Chapter 2 - Training Machine Learning Algorithms for Classification

- Artificial neurons - a brief glimpse into the early history of machine learning
- Implementing a perceptron learning algorithm in Python
- Training a perceptron model on the Iris dataset
- Adaptive linear neurons and the convergence of learning
- Minimizing cost functions with gradient descent
- Implementing an Adaptive Linear Neuron in Python
- Large scale machine learning and stochastic gradient descent
- Summary
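
For a quick taste of the perceptron rule the chapter implements, here is a minimal sketch on a toy dataset (illustrative only, not taken from the book's notebooks):

```python
import numpy as np

# tiny linearly separable toy data: two features, labels in {-1, 1}
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])

eta, n_epochs = 0.1, 10        # learning rate and number of passes
w = np.zeros(X.shape[1] + 1)   # weight vector; w[0] acts as the bias unit

for _ in range(n_epochs):
    for xi, target in zip(X, y):
        prediction = 1 if np.dot(xi, w[1:]) + w[0] >= 0.0 else -1
        update = eta * (target - prediction)  # zero when the prediction is correct
        w[1:] += update * xi
        w[0] += update

print(w)  # learned weights and bias
```
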
Sebastian Raschka, 2015

Python Machine Learning - Code Examples

## Chapter 3 - A Tour of Machine Learning Classifiers Using Scikit-learn

- Choosing a classification algorithm
- First steps with scikit-learn
- Training a perceptron via scikit-learn
- Modeling class probabilities via logistic regression
- Logistic regression intuition and conditional probabilities
- Learning the weights of the logistic cost function
- Training a logistic regression model with scikit-learn
- Tackling overfitting via regularization
- Maximum margin classification with support vector machines
- Maximum margin intuition
- Dealing with the nonlinearly separable case using slack variables
- Alternative implementations in scikit-learn
- Solving nonlinear problems using a kernel SVM
- Using the kernel trick to find separating hyperplanes in higher dimensional space
- Decision tree learning
- Maximizing information gain – getting the most bang for the buck
- Building a decision tree
- Combining weak to strong learners via random forests
- K-nearest neighbors – a lazy learning algorithm
- Summary
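
A minimal scikit-learn sketch of the logistic regression workflow covered above (toy example, not from the book's notebooks):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# standardize features, then fit a regularized logistic regression
sc = StandardScaler().fit(X_train)
lr = LogisticRegression(C=100.0)  # C is the inverse regularization strength
lr.fit(sc.transform(X_train), y_train)

print(lr.predict_proba(sc.transform(X_test[:3])))  # class probabilities
print(lr.score(sc.transform(X_test), y_test))      # test accuracy
```
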
Sebastian Raschka, 2015

Python Machine Learning - Code Examples

## Chapter 4 - Building Good Training Sets – Data Preprocessing

- Dealing with missing data
- Eliminating samples or features with missing values
- Imputing missing values
- Understanding the scikit-learn estimator API
- Handling categorical data
- Mapping ordinal features
- Encoding class labels
- Performing one-hot encoding on nominal features
- Partitioning a dataset in training and test sets
- Bringing features onto the same scale
- Selecting meaningful features
- Sparse solutions with L1 regularization
- Sequential feature selection algorithms
- Assessing feature importance with random forests
- Summary
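
A minimal pandas sketch of ordinal mapping and one-hot encoding as listed above (data purely illustrative):

```python
import pandas as pd

df = pd.DataFrame([
    ['green', 'M', 10.1, 'class1'],
    ['red', 'L', 13.5, 'class2'],
    ['blue', 'XL', 15.3, 'class1']],
    columns=['color', 'size', 'price', 'classlabel'])

# map the ordinal feature explicitly, since its values have a natural order
df['size'] = df['size'].map({'M': 1, 'L': 2, 'XL': 3})

# one-hot encode the nominal feature so no artificial order is implied
df = pd.get_dummies(df, columns=['color'])
print(df)
```
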
Sebastian Raschka, 2015

Python Machine Learning - Code Examples

## Chapter 5 - Compressing Data via Dimensionality Reduction

- Unsupervised dimensionality reduction via principal component analysis
- Total and explained variance
- Feature transformation
- Principal component analysis in scikit-learn
- Supervised data compression via linear discriminant analysis
- Computing the scatter matrices
- Selecting linear discriminants for the new feature subspace
- Projecting samples onto the new feature space
- LDA via scikit-learn
- Using kernel principal component analysis for nonlinear mappings
- Kernel functions and the kernel trick
- Implementing a kernel principal component analysis in Python
- Example 1 – separating half-moon shapes
- Example 2 – separating concentric circles
- Projecting new data points
- Kernel principal component analysis in scikit-learn
- Summary
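
A minimal PCA sketch in the spirit of the sections above (illustrative; uses scikit-learn's bundled Wine data, which assumes a reasonably recent scikit-learn):

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)
X_std = StandardScaler().fit_transform(X)

# project the standardized data onto the two leading principal components
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_std)

print(pca.explained_variance_ratio_)  # explained variance per component
print(X_pca.shape)                    # (n_samples, 2)
```
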
Sebastian Raschka, 2015

Python Machine Learning - Code Examples

## Chapter 6 - Learning Best Practices for Model Evaluation and Hyperparameter Tuning

- Streamlining workflows with pipelines
- Loading the Breast Cancer Wisconsin dataset
- Combining transformers and estimators in a pipeline
- Using k-fold cross-validation to assess model performance
- The holdout method
- K-fold cross-validation
- Debugging algorithms with learning and validation curves
- Diagnosing bias and variance problems with learning curves
- Addressing overfitting and underfitting with validation curves
- Fine-tuning machine learning models via grid search
- Tuning hyperparameters via grid search
- Algorithm selection with nested cross-validation
- Looking at different performance evaluation metrics
- Reading a confusion matrix
- Optimizing the precision and recall of a classification model
- Plotting a receiver operating characteristic
- The scoring metrics for multiclass classification
- Summary
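
A minimal pipeline-plus-grid-search sketch of the workflow outlined above (illustrative; written against the modern `sklearn.model_selection` API rather than the 2015-era modules the book's notebooks use):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# chain scaling and an SVM into a single estimator
pipe = Pipeline([('scaler', StandardScaler()), ('clf', SVC())])

# grid search over the SVM's regularization strength with 5-fold CV
param_grid = {'clf__C': [0.1, 1.0, 10.0]}
gs = GridSearchCV(pipe, param_grid, cv=5)
gs.fit(X_train, y_train)

print(gs.best_params_, gs.score(X_test, y_test))
```
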
Sebastian Raschka, 2015

Python Machine Learning - Code Examples

## Chapter 7 - Combining Different Models for Ensemble Learning

- Learning with ensembles
- Implementing a simple majority vote classifier
- Combining different algorithms for classification with majority vote
- Evaluating and tuning the ensemble classifier
- Bagging – building an ensemble of classifiers from bootstrap samples
- Leveraging weak learners via adaptive boosting
- Summary
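
A minimal majority-vote sketch using scikit-learn's built-in `VotingClassifier` (illustrative; the chapter implements its own majority vote classifier from scratch):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# hard majority vote over three different base classifiers
ensemble = VotingClassifier(
    estimators=[('lr', LogisticRegression(max_iter=1000)),
                ('dt', DecisionTreeClassifier(max_depth=1)),
                ('knn', KNeighborsClassifier(n_neighbors=1))],
    voting='hard')

print(cross_val_score(ensemble, X, y, cv=5).mean())
```
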
Sebastian Raschka, 2015

Python Machine Learning - Code Examples

## Chapter 8 - Applying Machine Learning to Sentiment Analysis

- Obtaining the IMDb movie review dataset
- Introducing the bag-of-words model
- Transforming words into feature vectors
- Assessing word relevancy via term frequency-inverse document frequency
- Cleaning text data
- Processing documents into tokens
- Training a logistic regression model for document classification
- Working with bigger data – online algorithms and out-of-core learning
- Summary
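
A minimal bag-of-words/tf-idf sketch of the pipeline outlined above (toy documents and labels, purely illustrative):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

docs = ['the sun is shining',
        'the weather is sweet',
        'the sun is shining and the weather is sweet']
labels = [1, 0, 1]  # toy "sentiment" labels

# bag-of-words features weighted by tf-idf
tfidf = TfidfVectorizer()
X = tfidf.fit_transform(docs)

clf = LogisticRegression().fit(X, labels)
print(clf.predict(tfidf.transform(['the sun is sweet'])))
```
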
Sebastian Raschka, 2015

Python Machine Learning - Code Examples

## Chapter 10 - Predicting Continuous Target Variables with Regression Analysis

- Introducing a simple linear regression model
- Exploring the Housing Dataset
- Visualizing the important characteristics of a dataset
- Implementing an ordinary least squares linear regression model
- Solving regression for regression parameters with gradient descent
- Estimating the coefficient of a regression model via scikit-learn
- Fitting a robust regression model using RANSAC
- Evaluating the performance of linear regression models
- Using regularized methods for regression
- Turning a linear regression model into a curve – polynomial regression
- Modeling nonlinear relationships in the Housing Dataset
- Dealing with nonlinear relationships using random forests
- Decision tree regression
- Random forest regression
- Summary
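
A minimal ordinary least squares sketch via scikit-learn (synthetic toy data, illustrative only):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# toy data: a noisy linear relationship y ~ 2x + 1
rng = np.random.RandomState(1)
X = rng.uniform(0, 10, size=(50, 1))
y = 2.0 * X.ravel() + 1.0 + rng.normal(scale=0.5, size=50)

lr = LinearRegression().fit(X, y)
print(lr.coef_, lr.intercept_)  # should be close to [2.0] and 1.0
print(lr.score(X, y))           # R^2 on the training data
```
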
Sebastian Raschka, 2015

Python Machine Learning - Code Examples

## Chapter 11 - Working with Unlabeled Data – Clustering Analysis

- Grouping objects by similarity using k-means
- K-means++
- Hard versus soft clustering
- Using the elbow method to find the optimal number of clusters
- Quantifying the quality of clustering via silhouette plots
- Organizing clusters as a hierarchical tree
- Performing hierarchical clustering on a distance matrix
- Attaching dendrograms to a heat map
- Applying agglomerative clustering via scikit-learn
- Locating regions of high density via DBSCAN
- Summary
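
A minimal k-means sketch with k-means++ initialization (illustrative; uses synthetic blobs rather than a real dataset):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# toy data with three well-separated groups
X, _ = make_blobs(n_samples=150, centers=3, random_state=0)

km = KMeans(n_clusters=3, init='k-means++', n_init=10, random_state=0)
labels = km.fit_predict(X)

print(km.cluster_centers_)
print(km.inertia_)  # within-cluster SSE, the quantity the elbow method plots
```
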
Sebastian Raschka, 2015

Python Machine Learning - Code Examples

## Chapter 12 - Training Artificial Neural Networks for Image Recognition

- Modeling complex functions with artificial neural networks
- Single-layer neural network recap
- Introducing the multi-layer neural network architecture
- Activating a neural network via forward propagation
- Classifying handwritten digits
- Obtaining the MNIST dataset
- Implementing a multi-layer perceptron
- Training an artificial neural network
- Computing the logistic cost function
- Training neural networks via backpropagation
- Developing your intuition for backpropagation
- Debugging neural networks with gradient checking
- Convergence in neural networks
- Other neural network architectures
- Convolutional Neural Networks
- Recurrent Neural Networks
- A few last words about neural network implementation
- Summary
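
A minimal NumPy sketch of forward propagation through one hidden layer (illustrative; the chapter's full MLP implementation also handles bias units, backpropagation, and MNIST):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# random toy inputs and weights (bias terms omitted for brevity)
rng = np.random.RandomState(1)
X = rng.normal(size=(5, 4))      # 5 samples, 4 input features
W_h = rng.normal(size=(4, 3))    # input -> 3 hidden units
W_out = rng.normal(size=(3, 2))  # hidden -> 2 output units

A_h = sigmoid(X @ W_h)           # hidden-layer activations
A_out = sigmoid(A_h @ W_out)     # output-layer activations
print(A_out.shape)               # (5, 2)
```
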
Sebastian Raschka, 2015

Python Machine Learning - Code Examples

## Chapter 13 - Parallelizing Neural Network Training with Theano

- Building, compiling, and running expressions with Theano
- What is Theano?
- First steps with Theano
- Configuring Theano
- Working with array structures
- Wrapping things up – a linear regression example
- Choosing activation functions for feedforward neural networks
- Logistic function recap
- Estimating probabilities in multi-class classification via the softmax function
- Broadening the output spectrum by using a hyperbolic tangent
- Training neural networks efficiently using Keras
- Summary
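
A minimal "first steps with Theano" sketch, building and compiling a symbolic expression (illustrative; assumes the Theano 0.x-era API this chapter targets, and Theano is no longer maintained):

```python
import theano
import theano.tensor as T

# symbolic scalar variables and the expression z = w * x + b
x = T.dscalar('x')
w = T.dscalar('w')
b = T.dscalar('b')
z = w * x + b

# compile the expression graph into a callable function
net_input = theano.function(inputs=[w, x, b], outputs=z)
print(net_input(2.0, 1.0, 0.5))  # 2.5
```
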