Introduction

In the previous article, I was playing around with Detectron2 for panoptic segmentation. After testing it on some specific instances, I wanted to fine-tune the model to label and capture them better. This article covers the steps I took to fine-tune the model so that it better detects the objects I pass in for instance segmentation.

Why do I want to fine-tune the model?

Fine-tuning means taking an existing model that already has, to varying degrees, a basic understanding of common objects and instances, and building on it with your own bespoke content.

In machine learning terms, fine-tuning means “unfreezing” the higher layers of the neural network and training the existing model’s parameters further. It is much like taking a Java developer and teaching them Go: they already understand the world, and you are only teaching them the new aspects they need for the task at hand.

Fine-tuning offers a huge advantage: you don’t have to start from the very beginning, which saves time, money and resources compared to training a bespoke model. It can also give more accurate results than the existing model when that model does not transfer to your task as well as you would like. If it already transfers well, even better!

How did I know the model needed fine-tuning?

Running the existing model on the custom dataset revealed that the model was pretty accurate at detecting the boundaries of the business cards, but it was incorrectly classifying them as books. As the COCO dataset does not include business cards, I had to teach the model what a business card is so that it could apply the correct class to those instances.
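A quick way to see this kind of misclassification is to count which class names the pretrained model predicts across the images. The sketch below is only an illustration: the helper name `summarize_predictions` and the example values are made up, and I am assuming the class ids come from `outputs["instances"].pred_classes` of a Detectron2 predictor, with the names taken from the dataset's `thing_classes` metadata.

```python
from collections import Counter


def summarize_predictions(class_ids, class_names):
    """Count how often each class name appears among the predicted instances."""
    return Counter(class_names[i] for i in class_ids)


# Hypothetical stand-in values. With Detectron2 these would typically be
# outputs["instances"].pred_classes.tolist() and the thing_classes list
# from the dataset metadata.
example_names = ["person", "book", "laptop"]
example_ids = [1, 1, 2, 1]

print(summarize_predictions(example_ids, example_names))
```

If one unexpected class dominates the counts, as "book" did for my business cards, that is a good hint that the boundaries are fine and only the labels need teaching.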

Steps

Grab commercially available images (20+)
Label and annotate them with CVAT (a free tool)
Create test + training datasets
Export into a zip file
Load into Colab
Run (requires GPU and processing credits)
Evaluate against your images
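
The steps above can be sketched roughly as follows. This is a minimal sketch and not my exact notebook: it assumes the CVAT export is in COCO format, the dataset name `cards_train`, the Mask R-CNN R50-FPN base model and the solver values are illustrative choices borrowed from the usual Detectron2 tutorial setup, and `split_coco` and `fine_tune` are hypothetical helper names.

```python
import os
import random


def split_coco(coco, train_fraction=0.8, seed=0):
    """Split a COCO-format dict into (train, test) dicts, image by image."""
    images = sorted(coco["images"], key=lambda im: im["id"])
    random.Random(seed).shuffle(images)  # seeded so the split is repeatable
    cut = max(1, int(len(images) * train_fraction))
    parts = []
    for subset in (images[:cut], images[cut:]):
        ids = {im["id"] for im in subset}
        parts.append({
            "images": subset,
            # keep only the annotations that belong to this subset's images
            "annotations": [a for a in coco["annotations"] if a["image_id"] in ids],
            "categories": coco["categories"],
        })
    return parts[0], parts[1]


def fine_tune(train_json, image_dir, num_classes=1, max_iter=300):
    """Fine-tune a COCO-pretrained Mask R-CNN on a small custom dataset.

    Detectron2 is imported here rather than at the top so that split_coco
    can be used on a machine without Detectron2 or a GPU.
    """
    from detectron2 import model_zoo
    from detectron2.config import get_cfg
    from detectron2.data.datasets import register_coco_instances
    from detectron2.engine import DefaultTrainer

    # "cards_train" is an assumed dataset name, not one from the article.
    register_coco_instances("cards_train", {}, train_json, image_dir)

    base = "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"  # assumed base model
    cfg = get_cfg()
    cfg.merge_from_file(model_zoo.get_config_file(base))
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(base)  # start from pretrained weights
    cfg.DATASETS.TRAIN = ("cards_train",)
    cfg.DATASETS.TEST = ()
    cfg.DATALOADER.NUM_WORKERS = 2
    cfg.SOLVER.IMS_PER_BATCH = 2
    cfg.SOLVER.BASE_LR = 0.00025  # illustrative value for a tiny dataset
    cfg.SOLVER.MAX_ITER = max_iter
    cfg.SOLVER.STEPS = []  # no learning-rate decay for such a short run
    cfg.MODEL.ROI_HEADS.NUM_CLASSES = num_classes  # e.g. just "business card"

    os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
    trainer = DefaultTrainer(cfg)
    trainer.resume_or_load(resume=False)
    trainer.train()
    return cfg
```

In Colab, `fine_tune` would be called with the path to the training JSON and the image folder from the unzipped export. How much of the backbone stays frozen during training is controlled in Detectron2 by the `MODEL.BACKBONE.FREEZE_AT` config key, which this sketch leaves at its default.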

Results

Original image

Original detectron2 instance segmentation

After fine-tuning

Business Cards tests

Poster Test

Store front sign test

Findings

My quick annotations really showed up in the results as squiggly lines along the objects’ borders. Not using polygons in CVAT would have helped here.

As my dataset was only about 20 images, with quite a high proportion having the object cut off by the frame, the results above show some corners missing.
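
One way to catch this before training is to measure how many annotations touch the edge of their image. The snippet below is a rough heuristic and not something I ran for this article: it assumes COCO-format annotations with `[x, y, width, height]` bounding boxes and image entries that carry `width` and `height`, and the function names and the one-pixel margin are made up for illustration.

```python
def touches_border(bbox, width, height, margin=1.0):
    """Return True if a COCO [x, y, w, h] box reaches the edge of the image."""
    x, y, w, h = bbox
    return (
        x <= margin
        or y <= margin
        or x + w >= width - margin
        or y + h >= height - margin
    )


def fraction_cut_off(coco, margin=1.0):
    """Fraction of annotations whose box touches the border of its image."""
    sizes = {im["id"]: (im["width"], im["height"]) for im in coco["images"]}
    annotations = coco["annotations"]
    if not annotations:
        return 0.0
    hits = sum(
        touches_border(a["bbox"], *sizes[a["image_id"]], margin)
        for a in annotations
    )
    return hits / len(annotations)
```

A high fraction suggests that many objects are cropped by the frame, so adding images that show the whole object would probably help more than extra training iterations.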

We can definitely make further refinements in the future. Overall, the fine-tuned model now correctly identifies business cards, and the task is complete. The model also carried over mostly well to the other tests, with the exception of the A4 poster test.

Thanks for following along!