The goals of this project are the following:
- Load the data set (see below for links to the project data set)
- Explore, summarize and visualize the data set
- Design, train and test a model architecture
- Use the model to make predictions on new images
- Analyze the softmax probabilities of the new images
- Summarize the results with a written report
Here I will consider the rubric points individually and describe how I addressed each point in my implementation.
You're reading it! Here is a link to my project code.
I used the pandas library to calculate summary statistics of the traffic signs data set:
- The size of training set is 34799
- The size of the validation set is 4410
- The size of test set is 12630
- The shape of a traffic sign image is 32x32x3
- The number of unique classes/labels in the data set is 43
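The statistics above can be reproduced with a few lines of code. The following is a minimal, dependency-free sketch rather than the pandas code I actually used; the `summarize` helper, the `(images, labels)` layout and the tiny stand-in data are assumptions made for illustration only.

```python
from collections import Counter

def summarize(train, valid, test):
    """Each argument is an (images, labels) pair; an image is a nested list H x W x C."""
    def shape(image):
        # Walk down the first element of each nesting level to read off the dimensions.
        dims = []
        while isinstance(image, list):
            dims.append(len(image))
            image = image[0]
        return tuple(dims)

    train_images, train_labels = train
    all_labels = train_labels + valid[1] + test[1]
    return {
        "n_train": len(train_images),
        "n_valid": len(valid[0]),
        "n_test": len(test[0]),
        "image_shape": shape(train_images[0]),
        "n_classes": len(set(all_labels)),
        "class_counts": Counter(train_labels),  # basis for a per-class bar chart
    }

# Hypothetical stand-in data: 2x2 RGB images instead of the real 32x32x3 ones.
def fake_image():
    return [[[0, 0, 0] for _ in range(2)] for _ in range(2)]

train = ([fake_image() for _ in range(4)], [0, 1, 1, 2])
valid = ([fake_image() for _ in range(2)], [0, 2])
test = ([fake_image() for _ in range(3)], [1, 1, 2])

stats = summarize(train, valid, test)
print(stats["n_train"], stats["n_valid"], stats["n_test"])
print(stats["image_shape"], stats["n_classes"])
```

On the real data set the same calls would be expected to report the sizes, image shape and class count listed above.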
Here is an exploratory visualization of the data set. It is a bar chart showing how the data is distributed.
To make it clear, I list them in a table:
Here, I decided not to convert the images to grayscale because color is also an important factor for a traffic sign classifier, for example for distinguishing the speed limit signs from the end-of-speed-limit signs.
To preprocess the data, I normalized the image data because optimization is much easier when the data has zero mean and equal variance.
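As an illustration of the normalization step, here is a minimal sketch. The exact formula is not stated in this writeup, so the `(pixel - 128) / 128` mapping, the `normalize` name and the sample pixels below are assumptions; this mapping only gives roughly zero mean when the pixel values are centered near 128.

```python
def normalize(image, center=128.0, scale=128.0):
    """Map 8-bit pixel values from [0, 255] into the range [-1, 1)."""
    return [
        [[(value - center) / scale for value in pixel] for pixel in row]
        for row in image
    ]

# Hypothetical image: one row with two RGB pixels.
image = [[[0, 128, 255], [64, 192, 32]]]
normalized = normalize(image)
print(normalized[0][0])
print(normalized[0][1])
```

All three color channels are kept, which matches the decision not to convert to grayscale.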
My final model consisted of the following layers:
Layer | Description |
---|---|
Input | 32x32x3 RGB image |
Convolution 5x5 | 1x1 stride, valid padding, outputs 28x28x6 |
RELU | |
Max pooling | 2x2 stride, outputs 14x14x6 |
Convolution 5x5 | 1x1 stride, valid padding, outputs 10x10x16 |
RELU | |
Max pooling | 2x2 stride, outputs 5x5x16 |
Flatten | outputs 400 |
Dropout | keep_prob 0.5 |
Fully connected | output 120 |
RELU | |
Dropout | keep_prob 0.6 |
Fully connected | output 84 |
RELU | |
Dropout | keep_prob 0.7 |
Fully connected | output 43 |
Softmax | output 43 |
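The output sizes in the table can be checked with the usual formula for valid padding, `output = floor((input - kernel) / stride) + 1`. The sketch below is only a sanity check, not part of the model code; the convolution stride of 1 is what the listed output sizes imply, and the 2x2 pooling window is an assumption taken from the standard LeNet layout.

```python
def valid_output_size(size, kernel, stride):
    """Spatial output size of a convolution or pooling layer with VALID padding."""
    return (size - kernel) // stride + 1

# (name, kernel size, stride, output depth)
layers = [
    ("conv1", 5, 1, 6),
    ("pool1", 2, 2, 6),   # 2x2 window is assumed
    ("conv2", 5, 1, 16),
    ("pool2", 2, 2, 16),  # 2x2 window is assumed
]

size = 32  # input images are 32x32x3
shapes = {}
for name, kernel, stride, depth in layers:
    size = valid_output_size(size, kernel, stride)
    shapes[name] = (size, size, depth)
    print(name, shapes[name])

height, width, depth = shapes["pool2"]
flattened = height * width * depth
print("flatten", flattened)
```

The final value is the input width of the first fully connected layer.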
To train the model, I used AdamOptimizer with a learning rate of 0.01 and a batch size of 128. I set the number of epochs to 100 because I wanted the accuracy to exceed 0.95; once the accuracy reaches 0.95, the training process stops automatically and saves the data.
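The stopping rule can be sketched as a plain loop. This is a simplified illustration under assumptions, not the actual training code: `run_epoch` is a hypothetical callback that trains for one epoch and returns the accuracy being monitored, and `fake_epoch` merely simulates a rising accuracy.

```python
def train_until_target(run_epoch, max_epochs=100, target_accuracy=0.95):
    """Call run_epoch(epoch) until the target accuracy is reached or epochs run out."""
    history = []
    for epoch in range(1, max_epochs + 1):
        accuracy = run_epoch(epoch)
        history.append(accuracy)
        if accuracy >= target_accuracy:
            # In the real project, this is the point where the result would be saved.
            return epoch, history
    return max_epochs, history

# Hypothetical stand-in for one epoch of training plus evaluation.
def fake_epoch(epoch):
    return min(80 + 2 * epoch, 99) / 100

stopped_at, history = train_until_target(fake_epoch)
print(stopped_at, history[-1])
```

With this rule, the 100 epochs act as an upper bound rather than a fixed training length.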
4. Approach taken for finding a solution and getting the validation set accuracy to be at least 0.93.
My final model results were:
- training set accuracy of 0.982
- validation set accuracy of 0.952
- test set accuracy of 0.933
To get a solution, I tried the LeNet architecture, which has proved to be a good structure for traffic sign classification. In this structure, the input passes through two convolution layers, which reduces the influence of the position of the sign within the image. To avoid overfitting, I added a dropout layer before each of the three fully connected layers and tuned the keep_prob values. As the results show, this model works well.
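To make the role of keep_prob concrete, here is a minimal sketch of inverted dropout, the scheme in which kept activations are scaled by `1 / keep_prob` so that the expected activation stays unchanged. It is a plain-Python illustration rather than the TensorFlow op used in the model; the `dropout` function and the sample activations are assumptions.

```python
import random

def dropout(values, keep_prob, rng):
    """Zero each value with probability 1 - keep_prob and scale survivors by 1 / keep_prob."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

rng = random.Random(0)  # fixed seed so the run is repeatable
activations = [1.0] * 10000
dropped = dropout(activations, keep_prob=0.5, rng=rng)

kept_fraction = sum(1 for v in dropped if v != 0.0) / len(dropped)
mean_after = sum(dropped) / len(dropped)
print(round(kept_fraction, 2), round(mean_after, 2))
```

Dropout is only meant to be active during training; for evaluation and prediction, keep_prob would be set to 1.0 so that no activations are dropped.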
The following is the precision, recall and F values of the test set:
From the results we can see that the F values of the following signs are relatively low, which means the predictions for these signs are relatively unreliable:
- 27: Pedestrians - 62.14%
- 30: Beware of ice/snow - 62.84%
- 24: Road narrows on the right - 67.11%
- 21: Double curve - 67.57%
- 41: End of no passing - 72.22%
- 0: Speed limit (20km/h) - 75.25%
- 19: Dangerous curve to the left - 77.55%
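For reference, per-class precision, recall and F value can be computed as in the sketch below. It assumes that the F value is the F1 score (the harmonic mean of precision and recall), which this writeup does not state explicitly; the `per_class_scores` helper and the toy labels are hypothetical and are not test-set results.

```python
def per_class_scores(y_true, y_pred, label):
    """Precision, recall and F1 for one class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical toy labels, for illustration only.
y_true = [27, 27, 27, 24, 24, 11]
y_pred = [27, 24, 24, 24, 11, 11]

precision, recall, f1 = per_class_scores(y_true, y_pred, label=24)
print(f"precision={precision:.2%} recall={recall:.2%} f1={f1:.2%}")
```

A low precision means other signs are often mistaken for this class, while a low recall means this sign is often predicted as something else.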
To test my model, I found the following traffic signs online:
Here are the results of the prediction:
Image | True ID | Prediction | Predicted ID | Correct (T/F) |
---|---|---|---|---|
Speed limit (20km/h) | 0 | Slippery road | 23 | F |
Speed limit (60km/h) | 3 | Speed limit (60km/h) | 3 | T |
Right-of-way at the next intersection | 11 | Right-of-way... | 11 | T |
Priority road | 12 | Priority road | 12 | T |
Stop | 14 | Stop | 14 | T |
Dangerous curve to the left | 19 | Dangerous curve to the left | 19 | T |
Double curve | 21 | Double curve | 21 | T |
Road narrows on the right | 24 | Priority road | 12 | F |
Road work | 25 | Road work | 25 | T |
Pedestrians | 27 | Road narrows on the right | 24 | F |
Beware of ice/snow | 30 | Right-of-way... | 11 | F |
Wild animals crossing | 31 | Wild animals crossing | 31 | T |
The model was able to correctly guess 8 of the 12 traffic signs, which gives an accuracy of 66.7%.
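The accuracy can be recomputed directly from the ID columns of the table above. The short check below uses only those IDs; the variable names are my own.

```python
# True and predicted class IDs, copied row by row from the table above.
true_ids = [0, 3, 11, 12, 14, 19, 21, 24, 25, 27, 30, 31]
pred_ids = [23, 3, 11, 12, 14, 19, 21, 12, 25, 24, 11, 31]

correct = sum(t == p for t, p in zip(true_ids, pred_ids))
accuracy = correct / len(true_ids)
print(f"{correct} of {len(true_ids)} correct, accuracy {accuracy:.1%}")
```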
To make it more clear, I also tried to show the top 5 of the softmax matrix:
For the 1st image, the model is completely wrong. It is not confident and predicts 'Slippery road' (probability of 0.408), while the image actually contains a 'Speed limit (20km/h)' sign. The top five softmax probabilities do not contain the right answer either. This corresponds to the 63.33% precision rate of 'Speed limit (20km/h)' and the 80.47% recall rate of 'Slippery road' in the test set.
For the 2nd image, the model is quite sure that this is a 'Speed limit (60km/h)' (probability of 0.919), and the image does contain a 'Speed limit (60km/h)'. This corresponds to the 95.56% precision rate of 'Speed limit (60km/h)' in the test set.
For the 3rd image, the model is very sure that this is a 'Right-of-way at the next intersection' (probability of 0.997), and the image does contain a 'Right-of-way at the next intersection'. This corresponds to the 92.62% precision rate of 'Right-of-way at the next intersection' in the test set.
For the 4th image, the model is very sure that this is a 'Priority road' (probability of 0.999), and the image does contain a 'Priority road'. This corresponds to the 96.38% precision rate of 'Priority road' in the test set.
For the 5th image, the model is pretty sure that this is a 'Stop' (probability of 0.950), and the image does contain a 'Stop'. This corresponds to the 99.63% precision rate of 'Stop' in the test set.
For the 6th image, the model is not sure that this is a 'Dangerous curve to the left' (probability of 0.634), and the image does contain a 'Dangerous curve to the left'. This corresponds to the 63.33% precision rate of 'Dangerous curve to the left' in the test set.
For the 7th image, the model is pretty sure that this is a 'Double curve' (probability of 0.961), and the image does contain a 'Double curve'. This corresponds to the 55.56% precision rate of 'Double curve' in the test set.
For the 8th image, the model is quite sure that this is a 'Road narrows on the right' (probability of 0.984), and the image does contain a 'Road narrows on the right'. This corresponds to the 55.56% precision rate of 'Road narrows on the right' in the test set.
For the 9th image, the model is very sure that this is a 'Road work' (probability of 1.000), and the image does contain a 'Road work'. This corresponds to the 95.42% precision rate of 'Road work' in the test set.
For the 10th image, the model is not very sure that this is a 'Road narrows on the right' (probability of 0.712), but the image actually contains a 'Pedestrians' sign. The top five softmax probabilities do include 'Pedestrians', with a probability of 0.194. This corresponds to the 53.33% precision rate of 'Pedestrians' and the 84.75% recall rate of 'Road narrows on the right' in the test set.
For the 11th image, the model is completely wrong. It is not very sure that this is a 'Right-of-way at the next intersection' (probability of 0.573), but the image actually contains a 'Beware of ice/snow' sign. This corresponds to the 62.00% precision rate of 'Beware of ice/snow' and the 88.81% recall rate of 'Right-of-way at the next intersection' in the test set.
For the 12th image, the model is very sure that this is a 'Wild animals crossing' (probability of 1.000), and the image does contain a 'Wild animals crossing'. This corresponds to the 95.56% precision rate of 'Wild animals crossing' in the test set.
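The top five probabilities discussed above come from applying softmax to the output of the last layer and keeping the five largest entries. The sketch below shows the idea in plain Python; in TensorFlow, `tf.nn.softmax` and `tf.nn.top_k` serve the same purpose. The logits are hypothetical and use 7 classes instead of 43 to keep the example short.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(probabilities, k=5):
    """Return the k (class_id, probability) pairs with the highest probability."""
    ranked = sorted(enumerate(probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Hypothetical logits for a single image.
logits = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, 1.0]
probabilities = softmax(logits)
top5 = top_k(probabilities)

for class_id, probability in top5:
    print(class_id, round(probability, 3))
```

When the first entry is close to 1.0 the model is confident; a flatter list, as for the 1st and 11th images, indicates uncertainty.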
1. Discuss the visual output of your trained network's feature maps. What characteristics did the neural network use to make classifications?
The visual output of the first layer is as follows:
We can see that the network takes both the shape of the sign and its content into account when making a prediction.