Classification

Hand-drawn images of shapes are classified to determine the type of shape, with one shape per image. Drawings exist for 4 classes:

Rectangle
Ellipse
Triangle
Other

Notebooks

hds_shapes_1_etl notebook to read the images and prepare data (ETL).
hds_shapes_2_classify notebook to do Shape classification.

Images

Images are 70 px x 70 px with 1 gray channel. In the ETL phase, I split the data into:

Set	Total	Other	Ellipse	Rectangle	Triangle
Training set	21386 images	5316	5025	5737	5308
Validation set	3983 images	1130	1069	860	924
Test set	1923 images	841	360	359	363
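
To make the class balance of each split easier to read, the counts in the table can be turned into shares. This is only a small sketch: the numbers are copied from the table above, and the names `SPLITS` and `class_shares` are hypothetical, not taken from the notebooks.

```python
# Image counts per class, copied from the table above.
SPLITS = {
    "Training": {"Other": 5316, "Ellipse": 5025, "Rectangle": 5737, "Triangle": 5308},
    "Validation": {"Other": 1130, "Ellipse": 1069, "Rectangle": 860, "Triangle": 924},
    "Test": {"Other": 841, "Ellipse": 360, "Rectangle": 359, "Triangle": 363},
}


def class_shares(counts):
    """Return each class's fraction of the images in one split."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


for name, counts in SPLITS.items():
    shares = class_shares(counts)
    summary = ", ".join(f"{label} {share:.1%}" for label, share in shares.items())
    print(f"{name} ({sum(counts.values())} images): {summary}")
```

The training set is roughly balanced (each class is between about 23% and 27%), while Other makes up about 44% of the test set.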

Easy to hard

The difficulty of the classification depends on how much flexibility we want to give the user when drawing.
Let's show some examples of various difficulties.

Easy

If we force the user to draw very distinct shapes, the classification is pretty easy.

Ellipse	Rectangle	Triangle

Why is it easy?

Shapes are pretty much all closed.
Only the Ellipse contains curves.
No extra drawing that can confuse the model.

Medium

If we consider that the user may be drawing quickly, with a basic shape in mind but an imperfect result, it gets harder and more interesting.

Ellipse	Rectangle	Triangle

Why is it more difficult?

Some open shapes.
Rectangles and Triangles may contain curves.
Some extra drawing is tolerated (like in the ellipse). Too much extra drawing will make the shape fall into the Other category.

Hard

If we try to go to the limit of what a human can interpret, then it becomes much harder and really interesting. Here are 2 sets of hard images.

Ellipse	Rectangle	Triangle

Why is it even more difficult?

The Ellipses have curves, but can be shaped almost like a Triangle.
The Triangles can also have curves.
The Rectangles have some parallel lines, but can contain curves.

or

Ellipse	Rectangle	Triangle

Why is it even more difficult?
When drawing quickly, the intent may only be roughly expressed in the actual drawing:

An ellipse may not contain many curves.
Sides may have different lengths. We don't expect the user to be precise in their drawing.

Hyper-parameters

The first consideration that affects the hyper-parameters is that we can apply a lot of very effective data augmentation to our training data, because images can be:

Flipped horizontally and vertically
Rotated by any angle (360 degrees)

and still be as good as the original images. It is easy to imagine that over, say, 500 epochs, each training image will look significantly different every time it is used.

For Classification, we can use Keras ImageDataGenerator to do this Augmentation.
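
As a rough illustration of the augmentation described above, the sketch below configures Keras ImageDataGenerator with both flips and a full rotation range. The settings come from the list above; the names `AUGMENTATION_KWARGS` and `make_training_generator`, the batch size, and the assumption that the training arrays have shape (N, 70, 70, 1) are hypothetical and may differ from what the notebook does.

```python
# Augmentation settings taken from the list above: flips in both directions
# and rotation by any angle. Everything else is left at the Keras defaults.
AUGMENTATION_KWARGS = {
    "horizontal_flip": True,
    "vertical_flip": True,
    "rotation_range": 360,  # degrees; angles are drawn from [-360, 360]
}


def make_training_generator(x_train, y_train, batch_size=32):
    """Return an iterator over randomly augmented (images, labels) batches.

    x_train is assumed to have shape (N, 70, 70, 1); batch_size is a
    hypothetical value, not one stated in this README.
    """
    # Imported lazily so the settings above can be inspected without TensorFlow.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(**AUGMENTATION_KWARGS)
    return datagen.flow(x_train, y_train, batch_size=batch_size)
```

Because the transformations are drawn at random for every batch, the model rarely sees exactly the same pixels twice, which is why long training runs remain useful.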

Learning Rate

I typically run 200 to 300 epochs and lower the learning rate only very slowly because of the heavy data augmentation.
The right settings for patience and factor are important to reach the best accuracy.
Note the factor of 0.9 and the patience of 15: if validation accuracy does not improve for 15 epochs, the learning rate is lowered to 90% of its current value.

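from tensorflow.keras.callbacks import ReduceLROnPlateau

# Multiply the learning rate by 0.9 when validation accuracy
# has not improved for 15 consecutive epochs.
learning_rate_reduction = ReduceLROnPlateau(monitor='val_accuracy',
                                            patience=15,
                                            verbose=1,
                                            factor=0.9,
                                            min_lr=1e-9)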
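
To get a feel for how slowly this schedule decays, here is a small pure-Python simulation of the reduce-on-plateau rule with the same patience, factor and min_lr. It is a simplified sketch, not the Keras implementation: it ignores the callback's min_delta and cooldown options, and the starting learning rate of 1e-3 is an assumption, since the README does not state it.

```python
def simulate_reduce_lr(val_accuracies, initial_lr=1e-3, patience=15,
                       factor=0.9, min_lr=1e-9):
    """Return the learning rate in effect after each epoch.

    Simplified model of reduce-on-plateau: min_delta and cooldown are ignored,
    and initial_lr is a hypothetical starting value.
    """
    lr = initial_lr
    best = float("-inf")
    wait = 0  # epochs since the last improvement (or the last reduction)
    history = []
    for acc in val_accuracies:
        if acc > best:
            best, wait = acc, 0
        else:
            wait += 1
            if wait >= patience:
                lr = max(lr * factor, min_lr)
                wait = 0
        history.append(lr)
    return history


# Worst case: validation accuracy never improves after the first epoch.
history = simulate_reduce_lr([0.5] * 300)
reductions = len(set(history)) - 1  # distinct rates seen, minus the initial one
print(reductions, history[-1] / 1e-3)
```

In this worst case the rate is reduced 19 times over 300 epochs and ends at about 13.5% of its starting value, so min_lr is never approached.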

Notes on classification

When training the classifier for Mix on Pix on a GPU over 300 epochs, I get a validation accuracy of around 0.9980.
Once the model was trained, I generated a TensorFlow Lite model that I then use in Mix on Pix.
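
A conversion step like the one mentioned above could look roughly like this, using the standard TensorFlow Lite converter. This is a sketch under assumptions: the function name `export_to_tflite` and the output file name are hypothetical, and the actual notebook may use different converter options.

```python
def export_to_tflite(model, output_path="hds_shapes.tflite"):
    """Convert a trained Keras model to TensorFlow Lite and save it.

    output_path is a hypothetical file name, not one taken from this project.
    """
    # Imported lazily so this module can be loaded without TensorFlow installed.
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    tflite_bytes = converter.convert()  # serialized model as bytes
    with open(output_path, "wb") as f:
        f.write(tflite_bytes)
    return output_path
```

Calling `export_to_tflite(model)` on the trained classifier writes a single file that a mobile app can bundle and load.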

by Francois Robert
