Working with the face-tracking ROS package
We have already created or copied the face_tracker_pkg package to the workspace and have discussed some of its important dependencies. Now, we are going to discuss exactly what this package does!
This package consists of a ROS node called face_tracker_node that can track faces using OpenCV APIs and publish the centroid of the face to a topic. Here is the block diagram of the working of face_tracker_node:
Figure 15: Block diagram of face_tracker_node
Let's discuss the things connected to face_tracker_node. One of the components that may be unfamiliar to you is the face Haar classifier:
- Face Haar classifier: The Haar feature-based cascade classifier is a machine learning approach for detecting objects. This method was proposed by Paul Viola and Michael Jones in their 2001 paper Rapid Object Detection using a Boosted Cascade of Simple Features. In this method, a cascade file is trained using positive and negative sample images, and after training, that file is used for object detection.
- In our case, we are using a trained Haar classifier file along with the OpenCV source code. You will get these Haar classifier files from the OpenCV data folder (https://github.com/opencv/opencv/tree/master/data). You can replace the desired Haar file according to your application. Here, we are using the face classifier. The classifier is an XML file that has tags containing the features of a face. Once the features inside the XML match, we can retrieve the region of interest (ROI) of the face from the image using the OpenCV APIs. You can check the Haar classifier of this project at face_tracker_pkg/data/face.xml. A minimal standalone sketch of this detection workflow is shown after this list.
- track.yaml: This is a ROS parameter file with parameters such as the Haar file path, input image topic, output image topic, and flags to enable and disable face tracking. We are using ROS configuration files because we can change the node parameters without modifying the face tracker source code. You can get this file from face_tracker_pkg/config/track.yaml.
- usb_cam node: The usb_cam package has a node that publishes the image stream from the camera as ROS Image messages. The usb_cam node publishes camera images to the /usb_cam/image_raw topic, and this topic is subscribed to by the face tracker node for face detection. We can change the input topic in the track.yaml file if we require.
- face_tracker_control: This is the second package we are going to discuss. The face_tracker_pkg package can detect faces and find the centroid of the face in the image. The centroid message contains two values, X and Y. We are using a custom message definition to send the centroid values. These centroid values are subscribed to by the controller node, which moves the Dynamixel to track the face.
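The following is a minimal, standalone OpenCV sketch (not part of the package source) that illustrates this workflow: it loads a Haar cascade XML file, runs the detector on an image, and extracts the ROI and centroid of each detected face. The file names face.xml and test.jpg are placeholders; point them to your own cascade file and test image:

#include <opencv2/objdetect.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/imgcodecs.hpp>
#include <iostream>
#include <vector>

int main()
{
  // Load the trained Haar cascade (placeholder path).
  cv::CascadeClassifier cascade;
  if (!cascade.load("face.xml"))
  {
    std::cerr << "Could not load the Haar cascade file" << std::endl;
    return 1;
  }

  // Read a test image (placeholder path) and preprocess it.
  cv::Mat img = cv::imread("test.jpg");
  if (img.empty())
  {
    std::cerr << "Could not read the test image" << std::endl;
    return 1;
  }
  cv::Mat gray;
  cv::cvtColor(img, gray, cv::COLOR_BGR2GRAY);
  cv::equalizeHist(gray, gray);

  // Detect faces; each cv::Rect is the ROI of one detected face.
  std::vector<cv::Rect> faces;
  cascade.detectMultiScale(gray, faces, 1.1, 3, 0, cv::Size(30, 30));

  for (const cv::Rect& r : faces)
  {
    cv::Mat roi = img(r);  // region of interest of the detected face
    std::cout << "Face ROI " << roi.cols << "x" << roi.rows
              << ", centroid: (" << r.x + r.width / 2 << ", "
              << r.y + r.height / 2 << ")" << std::endl;
  }
  return 0;
}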
Here is the file structure of face_tracker_pkg:
Figure 16: File structure of face_tracker_pkg
Let's see how the face-tracking code works. You can open the CPP file at face_tracker_pkg/src/face_tracker_node.cpp. This code performs the face detection and sends the centroid value to a topic. We'll now look at and understand the important code snippets.
Understanding the face tracker code
Let's start with the header files included in the code. We have to include ros/ros.h in every ROS C++ node; otherwise, the source code will not compile. The image_transport header has functions to publish and subscribe to image messages using low-bandwidth transports. The cv_bridge header has functions to convert between OpenCV and ROS image data types. The sensor_msgs/image_encodings.h header has the image-encoding formats used during ROS-OpenCV conversions:
#include <ros/ros.h>
#include <image_transport/image_transport.h>
#include <cv_bridge/cv_bridge.h>
#include <sensor_msgs/image_encodings.h>
The next set of headers is for OpenCV. The imgproc header consists of image-processing functions, highgui has GUI-related functions, and objdetect.hpp has APIs for object detection, such as the Haar classifier:
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
#include "opencv2/objdetect.hpp"
The last header file is for accessing a custom message called centroid. The centroid message definition has two fields, int32 x and int32 y, which hold the centroid of a detected face. You can check this message definition in the face_tracker_pkg/msg/centroid.msg file:
#include <face_tracker_pkg/centroid.h>
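For reference, the centroid.msg definition itself consists of just the two integer fields described above:

int32 x
int32 y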
The following lines of code give a name to the raw image window and face-detection window:
static const std::string OPENCV_WINDOW = "raw_image_window";
static const std::string OPENCV_WINDOW_1 = "face_detector";
The following lines of code create a C++ class for our face detector. The snippet creates handles for NodeHandle, which is mandatory for a ROS node; image_transport, which helps send ROS Image messages across the ROS computation graph; and a publisher for the face centroid, which can publish the centroid values using the centroid.msg type defined by us. The remaining definitions are for handling the parameter values from the parameter file track.yaml:
class Face_Detector
{
  ros::NodeHandle nh_;
  image_transport::ImageTransport it_;
  image_transport::Subscriber image_sub_;
  image_transport::Publisher image_pub_;
  ros::Publisher face_centroid_pub;
  face_tracker_pkg::centroid face_centroid;

  string input_image_topic, output_image_topic, haar_file_face;
  int face_tracking, display_original_image, display_tracking_image, center_offset, screenmaxx;
The following is the code for retrieving the ROS parameters defined in the track.yaml file. The advantage of using ROS parameters is that we can avoid hard-coding these values inside the program and modify the values without recompiling the code:
try{
  nh_.getParam("image_input_topic", input_image_topic);
  nh_.getParam("face_detected_image_topic", output_image_topic);
  nh_.getParam("haar_file_face", haar_file_face);
  nh_.getParam("face_tracking", face_tracking);
  nh_.getParam("display_original_image", display_original_image);
  nh_.getParam("display_tracking_image", display_tracking_image);
  nh_.getParam("center_offset", center_offset);
  nh_.getParam("screenmaxx", screenmaxx);

  ROS_INFO("Successfully Loaded tracking parameters");
}
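As a side note, roscpp also provides NodeHandle::param(), which falls back to a default value when a parameter is missing. The following is a minimal sketch, not part of the book's source; the default values shown here are only illustrative:

// Fall back to defaults if the parameters are not set on the parameter server.
std::string input_image_topic;
int center_offset;
nh_.param<std::string>("image_input_topic", input_image_topic, "/usb_cam/image_raw");
nh_.param("center_offset", center_offset, 50);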
The following code creates a subscriber for the input image topic and a publisher for the face-detected image. Whenever an image arrives on the input image topic, it will call a function called imageCb. The names of the topics are retrieved from the ROS parameters. We create another publisher for publishing the centroid value, which is the last line of the code snippet:
image_sub_ = it_.subscribe(input_image_topic, 1, &Face_Detector::imageCb, this);
image_pub_ = it_.advertise(output_image_topic, 1);

face_centroid_pub = nh_.advertise<face_tracker_pkg::centroid>("/face_centroid", 10);
The next bit of code is the definition of imageCb, which is a callback for input_image_topic. What it basically does is convert the sensor_msgs/Image data into the cv::Mat OpenCV data type. The cv_bridge::CvImagePtr cv_ptr buffer is allocated for storing the OpenCV image after performing the ROS-OpenCV conversion using the cv_bridge::toCvCopy function:
void imageCb(const sensor_msgs::ImageConstPtr& msg)
{
  cv_bridge::CvImagePtr cv_ptr;
  namespace enc = sensor_msgs::image_encodings;

  try
  {
    cv_ptr = cv_bridge::toCvCopy(msg, sensor_msgs::image_encodings::BGR8);
  }
  catch (cv_bridge::Exception& e)
  {
    // Abort the callback if the ROS-OpenCV conversion fails.
    ROS_ERROR("cv_bridge exception: %s", e.what());
    return;
  }
We have already discussed the Haar classifier; here is the code to load the Haar classifier file:
string cascadeName = haar_file_face;
CascadeClassifier cascade;
if( !cascade.load( cascadeName ) )
{
  cerr << "ERROR: Could not load classifier cascade" << endl;
}
We are now moving to the core part of the program, which is the face detection performed on the OpenCV image data type converted from the ROS Image message. The following is the call to detectAndDraw(), which performs the face detection, and in the last line, you can see the output image topic being published. Using cv_ptr->image, we can retrieve the cv::Mat data type, and in the next line, cv_ptr->toImageMsg() converts it into a ROS Image message. The arguments of the detectAndDraw() function are the OpenCV image and the cascade variable:
detectAndDraw( cv_ptr->image, cascade );
image_pub_.publish(cv_ptr->toImageMsg());
Let's understand the detectAndDraw() function, which is adapted from the OpenCV sample code for face detection. The function arguments are the input image and the cascade object. The next bit of code converts the image to grayscale and equalizes its histogram using OpenCV APIs; this is a kind of preprocessing before detecting the face in the image. The detection itself is done by the cascade.detectMultiScale() function, whose arguments include the scale factor (1.1), the minimum number of neighboring detections required to accept a face (15), and the minimum object size (http://docs.opencv.org/2.4/modules/objdetect/doc/cascade_classification.html):
Mat gray, smallImg;

cvtColor( img, gray, COLOR_BGR2GRAY );
double fx = 1 / scale;
resize( gray, smallImg, Size(), fx, fx, INTER_LINEAR );
equalizeHist( smallImg, smallImg );

t = (double)cvGetTickCount();
cascade.detectMultiScale( smallImg, faces, 1.1, 15, 0 | CASCADE_SCALE_IMAGE, Size(30, 30) );
The following loop iterates over each face detected by the detectMultiScale() function. For each face, it finds the centroid and publishes it to the /face_centroid topic:
for ( size_t i = 0; i < faces.size(); i++ )
{
  Rect r = faces[i];
  Mat smallImgROI;
  vector<Rect> nestedObjects;
  Point center;
  Scalar color = colors[i%8];
  int radius;

  double aspect_ratio = (double)r.width/r.height;
  if( 0.75 < aspect_ratio && aspect_ratio < 1.3 )
  {
    center.x = cvRound((r.x + r.width*0.5)*scale);
    center.y = cvRound((r.y + r.height*0.5)*scale);
    radius = cvRound((r.width + r.height)*0.25*scale);
    circle( img, center, radius, color, 3, 8, 0 );

    face_centroid.x = center.x;
    face_centroid.y = center.y;

    //Publishing centroid of detected face
    face_centroid_pub.publish(face_centroid);
  }
To make the output image window more interactive, text and lines are drawn to indicate whether the user's face is on the left, on the right, or at the center of the frame. This last section of code is mainly for that purpose and uses OpenCV APIs to do the job. Here is the code to display text such as Left, Right, and Center on the screen:
putText(img, "Left", cvPoint(50,240), FONT_HERSHEY_SIMPLEX, 1, cvScalar(255,0,0), 2, CV_AA); putText(img, "Center", cvPoint(280,240), FONT_HERSHEY_SIMPLEX, 1, cvScalar(0,0,255), 2, CV_AA); putText(img, "Right", cvPoint(480,240), FONT_HERSHEY_SIMPLEX, 1, cvScalar(255,0,0), 2, CV_AA);
Excellent! We're done with the tracker code; let's see how to build it and make it executable.
Understanding CMakeLists.txt
The default CMakeLists.txt file created along with the package has to be edited in order to compile the preceding source code. Here is the CMakeLists.txt file used to build face_tracker_node.cpp.
The first line states the minimum version of cmake required to build this package, and the next line sets the package name:
cmake_minimum_required(VERSION 2.8.3)
project(face_tracker_pkg)
The following lines search for the dependent packages of face_tracker_pkg and raise an error if any of them is not found:
find_package(catkin REQUIRED COMPONENTS
  cv_bridge
  image_transport
  roscpp
  rospy
  sensor_msgs
  std_msgs
  message_generation
)
This line of code contains the system-level dependencies for building the package:
find_package(Boost REQUIRED COMPONENTS system)
As we've already seen, we are using a custom message definition called centroid.msg, which contains two fields, int32 x and int32 y. To build and generate the C++ equivalent headers, we should use the following lines:
add_message_files(
  FILES
  centroid.msg
)

## Generate added messages and services with any dependencies listed here
generate_messages(
  DEPENDENCIES
  std_msgs
)
The catkin_package() function is a catkin-provided CMake macro that is required to generate the pkg-config and CMake files:
catkin_package(
  CATKIN_DEPENDS roscpp rospy std_msgs message_runtime
)

include_directories(
  ${catkin_INCLUDE_DIRS}
)
Here, we are creating the executable called face_tracker_node and linking it against the catkin and OpenCV libraries:
add_executable(face_tracker_node src/face_tracker_node.cpp)

target_link_libraries(face_tracker_node
  ${catkin_LIBRARIES}
  ${OpenCV_LIBRARIES}
)
The track.yaml file
As we discussed, the track.yaml file contains the ROS parameters required by face_tracker_node. Here are the contents of track.yaml:
image_input_topic: "/usb_cam/image_raw"
face_detected_image_topic: "/face_detector/raw_image"
haar_file_face: "/home/robot/ros_robotics_projects_ws/src/face_tracker_pkg/data/face.xml"
face_tracking: 1
display_original_image: 1
display_tracking_image: 1
You can change any of these parameters according to your needs. In particular, you may need to change haar_file_face, which is the path of the Haar face file. Setting face_tracking: 1 enables face tracking, and setting it to 0 disables it. Also, if you want to display the original and face-tracking images, you can set the corresponding flags here.
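Once these parameters have been loaded onto the parameter server (for example, by the launch file discussed in the next section), you can inspect them from the command line. Assuming they are loaded into the global namespace, something like the following should print the Haar file path:

$ rosparam get /haar_file_face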
The launch files
Launch files in ROS can do multiple tasks in a single file. They have the .launch extension. The following code shows the definition of start_usb_cam.launch, which starts the usb_cam node for publishing the camera image as a ROS topic:
<launch> <node name="usb_cam" pkg="usb_cam" type="usb_cam_node" output="screen" > <param name="video_device" value="/dev/video0" /> <param name="image_width" value="640" /> <param name="image_height" value="480" /> <param name="pixel_format" value="yuyv" /> <param name="camera_frame_id" value="usb_cam" /> <param name="auto_focus" value="false" /> <param name="io_method" value="mmap"/> </node> </launch>
Within the <node>...</node> tags, there are camera parameters that can be changed by the user. For example, if you have multiple cameras, you can change the video_device value from /dev/video0 to /dev/video1 to get the second camera's frames.
The next important launch file is start_tracking.launch, which launches the face-tracker node. Here is the definition of this launch file:
<launch>
  <!-- Launching USB CAM launch files and Dynamixel controllers -->
  <include file="$(find face_tracker_pkg)/launch/start_usb_cam.launch"/>

  <!-- Starting face tracker node -->
  <rosparam file="$(find face_tracker_pkg)/config/track.yaml" command="load"/>

  <node name="face_tracker" pkg="face_tracker_pkg" type="face_tracker_node" output="screen" />
</launch>
It first includes the start_usb_cam.launch file in order to get the ROS image topics, then loads track.yaml to get the necessary ROS parameters, and finally starts face_tracker_node to begin tracking.
The final launch file is start_dynamixel_tracking.launch; this is the launch file we have to execute for tracking and Dynamixel control. We will discuss this launch file at the end of the chapter, after discussing the face_tracker_control package.
Running the face tracker node
Let's launch the start_tracking.launch file from face_tracker_pkg using the following command. Note that you should connect your webcam to your PC:
$ roslaunch face_tracker_pkg start_tracking.launch
If everything works fine, you will get output like the following; the first one is the original image, and the second one is the face-detected image:
Figure 17: Face-detected image
We have not enabled the Dynamixel yet; this node will just find the face and publish the centroid values to a topic called /face_centroid.
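You can verify this from another terminal while the tracker is running by echoing the topic:

$ rostopic echo /face_centroid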
So the first part of the project is done. What is next? It's the control part, right? Yes, so next, we are going to discuss the second package, face_tracker_control.
The face_tracker_control package
The face_tracker_control package is the control package used to track the face using the AX-12A Dynamixel servo.
Given here is the file structure of the face_tracker_control package:
Figure 18: File organization in the face_tracker_control package
We'll look at the use of each of these files first.
The start_dynamixel launch file starts the Dynamixel controller manager, which establishes a connection to the USB-to-Dynamixel adapter and the Dynamixel servos. Here is the definition of this launch file:
<!-- This will open USB To Dynamixel controller and search for servos -->
<launch>
  <node name="dynamixel_manager" pkg="dynamixel_controllers"
        type="controller_manager.py" required="true" output="screen">
    <rosparam>
      namespace: dxl_manager
      serial_ports:
        pan_port:
          port_name: "/dev/ttyUSB0"
          baud_rate: 1000000
          min_motor_id: 1
          max_motor_id: 25
          update_rate: 20
    </rosparam>
  </node>

  <!-- This will launch the Dynamixel pan controller -->
  <include file="$(find face_tracker_control)/launch/start_pan_controller.launch"/>
</launch>
We have to mention the port_name (you can get the port number from the kernel logs using the dmesg command). The baud_rate we configured is 1 Mbps, and the motor ID is 1. The controller_manager.py file will scan servo IDs from 1 to 25 and report any servos it detects.
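For example, one way to spot the serial port assigned to the USB-to-Dynamixel adapter is to filter the kernel log for USB serial devices right after plugging it in:

$ dmesg | grep ttyUSB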
After detecting the servos, it will start the start_pan_controller.launch file, which attaches a ROS joint position controller to each servo.
As we can see from the previous subsection, the pan controller launch file is the trigger for attaching the ROS controller to the detected servos. Here is the definition of the start_pan_controller.launch file, which starts the pan joint controller:
<launch>
  <!-- Start tilt joint controller -->
  <rosparam file="$(find face_tracker_control)/config/pan.yaml" command="load"/>
  <rosparam file="$(find face_tracker_control)/config/servo_param.yaml" command="load"/>

  <node name="tilt_controller_spawner" pkg="dynamixel_controllers"
        type="controller_spawner.py"
        args="--manager=dxl_manager --port pan_port pan_controller"
        output="screen"/>
</launch>
The controller_spawner.py node can spawn a controller for each detected servo. The parameters of the controllers and servos are included in pan.yaml and servo_param.yaml.
The pan controller configuration file
The pan controller configuration file contains the configuration of the controller that the controller spawner node is going to create. Here is the pan.yaml file definition for our controller:
pan_controller:
  controller:
    package: dynamixel_controllers
    module: joint_position_controller
    type: JointPositionController
  joint_name: pan_joint
  joint_speed: 1.17
  motor:
    id: 1
    init: 512
    min: 316
    max: 708
In this configuration file, we have to mention the servo details, such as its ID, initial position, minimum and maximum position limits, moving speed, and joint name. The name of the controller is pan_controller, and it's a joint position controller. We write only one controller configuration, for ID 1, because we are using only one servo. (The AX-12A uses raw position values from 0 to 1023 over roughly 300 degrees, so init: 512 corresponds approximately to the centered position.)
The servo parameters configuration file
The servo_param.yaml file contains the configuration of the pan_controller, such as the limits of the controller and the step distance of each movement. It also has screen parameters, such as the maximum resolution of the camera image and the offset from the center of the image; the offset is used to define an area around the actual center of the image:
servomaxx: 0.5        # max degree servo horizontal (x) can turn
servomin: -0.5        # min degree servo horizontal (x) can turn
screenmaxx: 640       # max screen horizontal (x) resolution
center_offset: 50     # offset pixels from actual center to right and left
step_distancex: 0.01  # x servo rotation steps
The face tracker controller node
As we've already seen, the face tracker controller node is responsible for controlling the Dynamixel servo according to the face centroid position. Let's understand the code of this node, which is placed at face_tracker_control/src/face_tracker_controller.cpp.
The main ROS headers included in this code are as follows. Here, the Float64 header is used to hold the position value message sent to the controller:
#include "ros/ros.h" #include "std_msgs/Float64.h" #include <iostream>
The following variables hold the parameter values from servo_param.yaml:
int servomaxx, servomin, screenmaxx, center_offset, center_left, center_right;
float servo_step_distancex, current_pos_x;
The following std_msgs::Float64 messages hold the initial and current positions of the controller, respectively. The controller only accepts this message type:
std_msgs::Float64 initial_pose;
std_msgs::Float64 current_pose;
This is the publisher handler for publishing the position commands to the controller:
ros::Publisher dynamixel_control;
Switching to the main() function of the code, you can see the following lines. The first line is the subscriber of /face_centroid, which carries the centroid value; when a value arrives on the topic, it calls the face_callback() function:
ros::Subscriber number_subscriber = node_obj.subscribe("/face_centroid",10,face_callback);
The following line initializes the publisher handle through which the position values will be published on the /pan_controller/command topic:
dynamixel_control = node_obj.advertise<std_msgs::Float64>("/pan_controller/command", 10);
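As a side note, once the Dynamixel controllers are running (see the later sections), you can exercise this topic independently of the tracker by publishing a position command manually; the value 0.2 below is arbitrary and should lie within the servo limits:

$ rostopic pub -1 /pan_controller/command std_msgs/Float64 -- 0.2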
The following code creates new limits around the actual center of the image. This gives us an approximate center band; for example, with screenmaxx: 640 and center_offset: 50, the band spans pixel columns 270 to 370:
center_left = (screenmaxx / 2) - center_offset;
center_right = (screenmaxx / 2) + center_offset;
Here is the callback function executed whenever a centroid value arrives on the /face_centroid topic. This callback also contains the logic for moving the Dynamixel for each centroid value.
In the first section, the x value of the centroid is checked against center_left; if the face is to the left of the center band, the servo controller position is incremented, and the new position is published to the controller only if it lies within the servo limits. The logic is the same for the right side: if the face is on the right side of the image, the controller position is decremented.
When the face is within the center band, the servo pauses and does nothing, which is exactly what we want. This loop repeats, and we get continuous tracking:
void track_face(int x, int y)
{
  if (x < (center_left)){
    current_pos_x += servo_step_distancex;
    current_pose.data = current_pos_x;
    if (current_pos_x < servomaxx and current_pos_x > servomin ){
      dynamixel_control.publish(current_pose);
    }
  }
  else if(x > center_right){
    current_pos_x -= servo_step_distancex;
    current_pose.data = current_pos_x;
    if (current_pos_x < servomaxx and current_pos_x > servomin ){
      dynamixel_control.publish(current_pose);
    }
  }
  else if(x > center_left and x < center_right){
    ;
  }
}
Creating CMakeLists.txt
The CMakeLists.txt of this package is very similar to that of the tracker package; the main difference is in the dependencies. Here, the main dependency is dynamixel_controllers. We are not using OpenCV in this package, so there's no need to include it:
cmake_minimum_required(VERSION 2.8.3)
project(face_tracker_control)

find_package(catkin REQUIRED COMPONENTS
  dynamixel_controllers
  roscpp
  rospy
  std_msgs
  message_generation
)

find_package(Boost REQUIRED COMPONENTS system)

add_message_files(
  FILES
  centroid.msg
)

## Generate added messages and services with any dependencies listed here
generate_messages(
  DEPENDENCIES
  std_msgs
)

catkin_package(
  CATKIN_DEPENDS dynamixel_controllers roscpp rospy std_msgs
)

include_directories(
  ${catkin_INCLUDE_DIRS}
)

add_executable(face_tracker_controller src/face_tracker_controller.cpp)

target_link_libraries(face_tracker_controller ${catkin_LIBRARIES})
Note
The complete source code of this project can be cloned from the following Git repository. The following command will clone the project repo:
$ git clone
Testing the face tracker control package
We have seen most of the files and their functionalities. So let's test this package first. We have to ensure that it is detecting the Dynamixel servo and creating the proper topic.
Before running the launch file, we may have to change the permission of the USB device, or it will throw an exception. The following command can be used to get permissions on the serial device:
$ sudo chmod 777 /dev/ttyUSB0
Note that you must replace ttyUSB0 with your device name; you can retrieve it by looking at the kernel logs. The dmesg command can help you find it.
Start the start_dynamixel.launch file using the following command:
$ roslaunch face_tracker_control start_dynamixel.launch
Figure 19: Finding Dynamixel servos and creating controllers
If everything is successful, you will get a message as shown in the previous figure.
The following topics are generated when we run this launch file:
Figure 20: Face tracker control topics
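You can list them on your own system with the following command while the launch file is running:

$ rostopic list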
Bringing all the nodes together
Next, we'll look at the final launch file, which we skipped while covering the face_tracker_pkg package, and that is start_dynamixel_tracking.launch. This launch file starts both face detection and tracking using the Dynamixel motors:
<launch>
  <!-- Launching USB CAM launch files and Dynamixel controllers -->
  <include file="$(find face_tracker_pkg)/launch/start_tracking.launch"/>
  <include file="$(find face_tracker_control)/launch/start_dynamixel.launch"/>

  <!-- Starting face tracker node -->
  <node name="face_controller" pkg="face_tracker_control" type="face_tracker_controller" output="screen" />
</launch>
Fixing the bracket and setting up the circuit
Before doing the final run of the project, we have to do something on the hardware side. We have to fix the bracket to the servo horn and fix the camera to the bracket. The bracket should be connected in such a way that it is always perpendicular to the center of the servo. The camera is mounted on the bracket, and it should be pointed toward the center position.
The following image shows the setup I did for this project. I simply used tape to fix the camera to the bracket. You can use any additional material to fix the camera, but it should always be aligned to the center first:
Figure 21: Fixing camera and bracket to the AX-12A
If you are done with this, then you are ready to go for the final run of this project.
The final run
I hope that you have followed all instructions properly; here is the command to launch all the nodes for this project and start tracking using Dynamixel:
$ roslaunch face_tracker_pkg start_dynamixel_tracking.launch
You will get the following windows, and it would be good if you could use a photo to test the tracking, because you will get continuous tracking of the face:
Figure 22: Final face tracking
Here, you can see the Terminal message saying that the face is on the right of the image, so the controller is reducing the position value to bring it back to the center position.