Object detection with OpenCV

Gopal Katariya
Mar 21, 2023

Object detection is a computer vision task that involves detecting and localizing objects within an image or video. In recent years, deep learning-based approaches have achieved state-of-the-art performance on this task, and libraries like OpenCV make it easy to run pretrained models of this kind through the dnn module.

In this blog post, we’ll walk through an example of object detection using OpenCV and MobileNet SSD, a pretrained deep learning model that we load here in Caffe format. The goal of this example is to detect objects in an input image, draw a bounding box around each one, and label it with its class and confidence.

First, let’s take a look at the code:

# import the necessary packages
import numpy as np
import argparse
import cv2

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
    help="path to input image")
ap.add_argument("-p", "--prototxt", required=False,
    default="models/MobileNetSSD_deploy.prototxt.txt",
    help="path to Caffe 'deploy' prototxt file")
ap.add_argument("-m", "--model", required=False,
    default="models/MobileNetSSD_deploy.caffemodel",
    help="path to Caffe pre-trained model")
ap.add_argument("-c", "--confidence", type=float, default=0.2,
    help="minimum probability to filter weak detections")
args = vars(ap.parse_args())

# class labels the model can predict (index 0 is the background class),
# plus a random bounding-box color for each class
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
    "bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
    "dog", "horse", "motorbike", "person", "pottedplant", "sheep",
    "sofa", "train", "tvmonitor"]
COLORS = np.random.uniform(0, 255, size=(len(CLASSES), 3))

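# load the serialized Caffe model from disk
print("[INFO] loading model...")
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])

# load the input image; cv2.imread returns None if the file cannot be read
image = cv2.imread(args["image"])
if image is None:
    raise SystemExit("[ERROR] could not read image: {}".format(args["image"]))
(h, w) = image.shape[:2]

# build a 300x300 input blob: subtract the mean 127.5, then scale by
# 0.007843 (about 1/127.5), which maps pixel values to roughly [-1, 1]
blob = cv2.dnn.blobFromImage(cv2.resize(image, (300, 300)), 0.007843,
    (300, 300), 127.5)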

print("[INFO] computing object detections...")
net.setInput(blob)
detections = net.forward()

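# loop over the detections; the output has shape (1, 1, N, 7) and each row is
# [image_id, class_id, confidence, startX, startY, endX, endY]
for i in range(detections.shape[2]):
    # extract the confidence (probability) of the prediction
    confidence = detections[0, 0, i, 2]

    # filter out weak detections
    if confidence > args["confidence"]:
        # read the class index, then scale the normalized box
        # coordinates back to the size of the original image
        idx = int(detections[0, 0, i, 1])
        box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
        (startX, startY, endX, endY) = box.astype("int").tolist()

        # display the prediction
        label = "{}: {:.2f}%".format(CLASSES[idx], confidence * 100)
        print("[INFO] {}".format(label))
        color = COLORS[idx].tolist()
        cv2.rectangle(image, (startX, startY), (endX, endY), color, 2)
        y = startY - 15 if startY - 15 > 15 else startY + 15
        cv2.putText(image, label, (startX, y),
            cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)

# save the annotated image; imwrite returns False on failure
# (for example, when the output directory does not exist)
if not cv2.imwrite("images/output/bus.jpg", image):
    print("[WARN] could not write images/output/bus.jpg")

# show image
cv2.imshow("Output", image)
cv2.waitKey(0)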
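
To make the detection loop above easier to follow, here is a minimal sketch of the same decoding step in plain Python, with no OpenCV or NumPy needed. It assumes each detection row has the layout [image_id, class_id, confidence, startX, startY, endX, endY], with box coordinates normalized to the range 0 to 1. The helper name decode_detections and the sample rows are made up for illustration; they are not part of the script or of OpenCV.

```python
# Hypothetical helper (not part of the original script): decode SSD-style
# detection rows using only the standard library.

def decode_detections(rows, width, height, min_confidence=0.2):
    """Keep rows above min_confidence and scale boxes to pixel coordinates.

    Each row is assumed to be
    [image_id, class_id, confidence, x1, y1, x2, y2], box values in [0, 1].
    """
    results = []
    for _image_id, class_id, confidence, x1, y1, x2, y2 in rows:
        if confidence <= min_confidence:
            continue  # weak detection, skip it
        # scale normalized coordinates to pixels (truncating to integers)
        box = (int(x1 * width), int(y1 * height),
               int(x2 * width), int(y2 * height))
        results.append((int(class_id), float(confidence), box))
    return results


# Made-up rows for illustration only; these are not real model output.
sample_rows = [
    [0.0, 6.0, 0.90, 0.25, 0.25, 0.75, 0.50],   # class 6 ("bus"), kept
    [0.0, 15.0, 0.10, 0.25, 0.50, 0.50, 0.75],  # below the threshold, dropped
]

decoded = decode_detections(sample_rows, width=640, height=480)
for class_id, confidence, box in decoded:
    print(class_id, "{:.2f}%".format(confidence * 100), box)
```

With the default threshold of 0.2, only the first row survives, and its box is scaled to the 640x480 image. Lowering min_confidence lets weaker detections through, which is the same trade-off the --confidence flag controls in the script.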

source code

References:

[1] MobileNetSSD_deploy.caffemodel

[2] https://docs.opencv.org/4.x/index.html

Feel free to connect:

LinkedIn: https://www.linkedin.com/in/gopalkatariya44/

GitHub: https://github.com/gopalkatariya44/

Instagram: https://www.instagram.com/_gk_44/

Twitter: https://twitter.com/GopalKatariya44

Thanks 😊 !
