How to implement the Object Detection with YOLO 9000?

YAN PANG

Published Dec 23, 2017

YOLO: You only look once! It is a state-of-the-art, real-time object detection system in the deep learning domain. Joseph Redmon, Ali Farhadi, and their co-authors introduced YOLO in 2015. The improved model, YOLOv2, is state-of-the-art on standard detection tasks like PASCAL VOC and COCO. Using a novel multi-scale training method, the same YOLOv2 model can run at varying sizes, offering an easy tradeoff between speed and accuracy. At 67 FPS, YOLOv2 gets 76.8 mAP on VOC 2007. At 40 FPS, YOLOv2 gets 78.6 mAP, outperforming state-of-the-art methods like Faster R-CNN with ResNet and SSD while still running significantly faster [1]. See the paper for more details.

Detection Using A Pre-Trained Model

In this post, I only cover how to do object detection with a pre-trained model. If you want to train your own model, see my other post: How to train the YOLOv2 from zero? To get started, clone the repository and enter it:

git clone https://github.com/pypancho/darknet.git

cd darknet

Next, copy the Makefile for the version you need. Choose only one of the two. For the GPU version, make sure CUDA, cuDNN, and OpenCV are already installed on your system. For the CPU version, only OpenCV is required.

rm -f Makefile

# CPU version:
cp Make/Makefile_CPU Makefile

# GPU version (run this instead of the CPU line, not in addition to it):
cp Make/Makefile_GPU Makefile

make
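The choice between the two Makefiles can also be scripted. The following is a minimal sketch, not part of the repository: pick_makefile is a hypothetical helper name, and it assumes that finding nvcc on the PATH is a good enough sign that the GPU build is wanted. It does not check for cuDNN or OpenCV.

```shell
# Print the Makefile variant to use: GPU if the CUDA compiler (nvcc) is on PATH,
# otherwise CPU. Finding nvcc is only a heuristic (an assumption of this sketch);
# it does not prove that cuDNN or OpenCV are installed.
pick_makefile() {
    if command -v nvcc >/dev/null 2>&1; then
        echo "Make/Makefile_GPU"
    else
        echo "Make/Makefile_CPU"
    fi
}

# Usage, from inside the darknet directory:
#   cp "$(pick_makefile)" Makefile && make
```

If the GPU build fails, copying Make/Makefile_CPU by hand and running make again is the simpler fallback.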

You will have to download the pre-trained weight file to run this:

wget https://pjreddie.com/media/files/yolo.weights

OK. Now it is time to run it!

Image Detection

First, let us run the following command to detect the objects in one image.

./darknet detect cfg/yolo.cfg yolo.weights data/dog.jpg

The annotated image is saved as predictions.png. You can open it to see the detected objects.

If you want to run detection on several images in one session, leave out the image path:

./darknet detect cfg/yolo.cfg yolo.weights

You will then be prompted for the path of an image, like this:

layer     filters    size              input                output
    0 conv     32  3 x 3 / 1   416 x 416 x   3   ->   416 x 416 x  32
    1 max          2 x 2 / 2   416 x 416 x  32   ->   208 x 208 x  32
    .......
   29 conv    425  1 x 1 / 1    13 x  13 x1024   ->    13 x  13 x 425
   30 detection
Loading weights from yolo.weights ...Done!
Enter Image Path:

Once it is done, it will prompt you for more paths so you can try different images. Use Ctrl-C to exit the program when you are finished.
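For a whole folder of images, typing each path at the prompt gets tedious. Here is a minimal sketch of an alternative, assuming the output file is named predictions.png as described above. detect_dir and the DARKNET_BIN variable are hypothetical names introduced for illustration. The loop runs darknet once per image, so the weights are reloaded every time, which trades speed for simplicity.

```shell
# Hypothetical helper: run detection on every .jpg in a directory and keep one
# annotated copy per input. DARKNET_BIN defaults to ./darknet; predictions.png
# is the output name stated in this article (an assumption for other builds).
detect_dir() {
    in_dir="$1"
    out_dir="$2"
    darknet="${DARKNET_BIN:-./darknet}"
    mkdir -p "$out_dir" || return 1
    for img in "$in_dir"/*.jpg; do
        [ -e "$img" ] || continue    # skip the literal pattern if nothing matches
        "$darknet" detect cfg/yolo.cfg yolo.weights "$img" || return 1
        name="$(basename "$img" .jpg)"
        # Move the result aside so the next run does not overwrite it.
        mv predictions.png "$out_dir/$name.png" || return 1
    done
}

# Usage, from inside the darknet directory:
#   detect_dir data results
```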

Changing The Detection Threshold

By default, YOLO only displays objects detected with a confidence of 0.25 or higher. You can change this by passing the -thresh <val> flag to the YOLO command. For example, to display all detections you can set the threshold to 0:

./darknet detect cfg/yolo.cfg yolo.weights data/dog.jpg -thresh 0
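To compare the effect of different thresholds, the same image can be run several times and each result kept under its own name. This is only a sketch: sweep_thresh and DARKNET_BIN are hypothetical names, and it again assumes the result is written to predictions.png.

```shell
# Hypothetical helper: run one image at several thresholds and keep each result
# as predictions_<thresh>.png. Assumes predictions.png as the output name.
sweep_thresh() {
    img="$1"
    shift                            # remaining arguments are the thresholds
    darknet="${DARKNET_BIN:-./darknet}"
    for t in "$@"; do
        "$darknet" detect cfg/yolo.cfg yolo.weights "$img" -thresh "$t" || return 1
        mv predictions.png "predictions_$t.png" || return 1
    done
}

# Usage, from inside the darknet directory:
#   sweep_thresh data/dog.jpg 0.1 0.25 0.5
```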

Real-Time Detection on a Webcam or Video

To run this demo, you will need to compile with CUDA, cuDNN, and OpenCV. Please make sure you have already installed them on your machine. Then run the command:

./darknet detector demo cfg/coco.data cfg/yolo.cfg yolo.weights

YOLO will display the current FPS and predicted classes as well as the image with bounding boxes drawn on top of it.

You will need a webcam connected to the computer that OpenCV can connect to or it won't work. If you have multiple webcams connected and want to select which one to use you can pass the flag -c <num> to pick (OpenCV uses webcam 0 by default).

You can also run it on a video file if OpenCV can read the video:

./darknet detector demo cfg/coco.data cfg/yolo.cfg yolo.weights <video file>
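The webcam and video-file variants differ only in the last argument. The sketch below prints the command for a given source, treating a plain number as a webcam index for -c and anything else as a video file. demo_cmd is a hypothetical helper name, and a video file whose name consists only of digits would be misread as a webcam index.

```shell
# Print the demo command for a given source: a non-negative integer is treated
# as a webcam index (passed with -c), anything else as a video file path.
# With no argument, webcam 0 is assumed, matching the default described above.
demo_cmd() {
    src="${1:-0}"
    base="./darknet detector demo cfg/coco.data cfg/yolo.cfg yolo.weights"
    case "$src" in
        *[!0-9]*) echo "$base $src" ;;      # contains a non-digit: video file
        *)        echo "$base -c $src" ;;   # digits only: webcam index
    esac
}

# Usage:
#   demo_cmd 1          # prints the command for the second webcam
#   demo_cmd clip.mp4   # prints the command for a video file
```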

How to do Object Detection with a camera or video remotely

First, log in to the remote machine from your host machine with X forwarding enabled.

ssh remote_username@remote_IP -X

After logging in to the remote machine, set the DISPLAY variable.

export DISPLAY=":0.0"

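Then you can run object detection with a camera or video and see the result in a window. Note that ssh -X usually sets DISPLAY itself (typically to a value like localhost:10.0) so that windows are forwarded to your host monitor, whereas :0.0 names the first display of the remote machine. Use the value that matches where the window should appear.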
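Before starting the demo over ssh, it can help to check which display the current shell points to. The sketch below is one way to do that. report_display is a hypothetical helper name, and the assumption that a forwarded display starts with "localhost:" reflects typical ssh setups rather than a guarantee.

```shell
# Report where X11 windows opened from this shell would be sent.
report_display() {
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set: no X11 window can be opened from this shell"
        return 1
    fi
    case "$DISPLAY" in
        # Typical value set by ssh -X (assumption: default ssh configuration).
        localhost:*) echo "DISPLAY=$DISPLAY (looks like an ssh-forwarded display)" ;;
        *)           echo "DISPLAY=$DISPLAY" ;;
    esac
}

# Usage, on the remote machine:
#   report_display
```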

Please enjoy!

Reference:

[1] J. Redmon and A. Farhadi, "YOLO9000: Better, Faster, Stronger," https://arxiv.org/abs/1612.08242
