TensorFlow Deep Learning Projects

Presenting our project plan

With such a powerful tool made available by TensorFlow, our plan is to build on its API and create a class that you can use to annotate images both visually and in an external file. By annotating, we mean the following:

  • Pointing out the objects in an image (as recognized by a model trained on MS COCO)
  • Reporting the level of confidence in the object recognition (we will consider only objects above a minimum probability threshold, set to 0.25 on the basis of the speed/accuracy trade-offs for modern convolutional object detectors discussed in the paper previously mentioned)
  • Outputting the coordinates of two opposite vertices of the bounding box of each object detected in the image
  • Saving all such information in a text file in JSON format
  • Visually representing the bounding box on the original image, if required
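
As a rough illustration of the annotation record described above, the following minimal sketch filters detections by the 0.25 threshold, converts each box to two opposite vertices in pixels, and serializes the result to JSON. The function name `annotate`, the record field names, the sample values, and the assumption that boxes arrive as normalized `[ymin, xmin, ymax, xmax]` lists are all illustrative choices, not the class built later in the project.

```python
import json


def annotate(boxes, scores, classes, labels, width, height, threshold=0.25):
    """Build one record per detection whose score reaches the threshold.

    Assumption: each box is [ymin, xmin, ymax, xmax], normalized to 0..1.
    """
    records = []
    for box, score, cls in zip(boxes, scores, classes):
        if score < threshold:
            continue  # discard low-confidence detections
        ymin, xmin, ymax, xmax = box
        records.append({
            "label": labels.get(cls, "unknown"),
            "confidence": round(float(score), 3),
            # Two opposite vertices of the bounding box, in pixels (x, y)
            "top_left": [round(xmin * width), round(ymin * height)],
            "bottom_right": [round(xmax * width), round(ymax * height)],
        })
    return records


# Hypothetical detector output for a 640x480 image
boxes = [[0.10, 0.20, 0.50, 0.60], [0.30, 0.30, 0.90, 0.80]]
scores = [0.91, 0.12]
classes = [18, 1]
labels = {1: "person", 18: "dog"}  # illustrative subset of a label map

records = annotate(boxes, scores, classes, labels, width=640, height=480)
json_text = json.dumps(records, indent=2)
print(json_text)
```

Here only the first detection survives the threshold, so the JSON output contains a single record. Writing `json_text` to a file would give the external annotation file mentioned in the list.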

In order to achieve such objectives, we need to:

  1. Download one of the pre-trained models (available in .pb format, that is, protobuf) and load it into memory as a TensorFlow session.
  2. Rework the helper code provided by TensorFlow into a class that can be easily imported into your scripts, so that loading the labels, the categories, and the visualization tools becomes simpler.
  3. Prepare a simple script that demonstrates the class on single images, on videos, and on live video captured from a webcam.

We start by setting up an environment suitable for the project.