Optimizing Object Detection with Bi-dimensional Empirical Mode Decomposition (BEMD) based Dimensionality Reduction and AlexNet

Dr. J. Anvar Shathik & Dr. Krishna Prasad K

doi:10.48165/bapas.2024.44.2.1

Published: Sep 26, 2024

Keywords:

Object Detectors, Bi-dimensional Empirical Mode Decomposition (BEMD), Deep Learning, AlexNet, Sparse Reduction, Common Objects in Context (COCO) dataset

Abstract

Object detection, which identifies target items in an image and determines their position and category, is among the most significant and widely used tasks in computer vision. Server-side production systems such as those of Google, Facebook, and Snapchat have greater freedom to optimize for accuracy, but they are still constrained by throughput limits. Many approaches to the object detection problem have been proposed in the literature, and the best results have been obtained with deep learning and computer vision based methods. However, most existing techniques perform below expectation, particularly in detecting small and dense objects and objects that undergo random geometric transformations. Sparse representation methods also often fail, particularly in cases that demand a higher recognition rate. The problem of object detection can be divided into three stages: representation, dimension reduction, and object detection. The primary goal of this work is to serve as a guide for selecting a detection architecture that achieves the proper balance of speed, memory, and accuracy for a chosen application. First, a dictionary is used to represent the test samples, followed by sparse representation. Second, to address the high dimensionality of the features, Bi-Dimensional Empirical Mode Decomposition (BEMD) is introduced for dimension reduction. Third, the AlexNet architecture is used to obtain multi-scale features and to add a convolutional structure over the dictionaries for identifying objects; this serves as a guideline for selecting an architecture with respect to speed, memory, and accuracy. Visualization of images from the Common Objects in Context (COCO) dataset offers a side-by-side comparison with current approaches that traces the accuracy/speed tradeoff. The proposed classifier reduces both the computational cost and the number of parameters compared with other methods.
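
The first stage described above, representing a test sample sparsely over a dictionary, can be illustrated with a minimal sketch. It assumes plain matching pursuit as the sparse coding step; the abstract does not say which algorithm the authors use, and the function names, the toy dictionary, and the sample vector below are all hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    n = math.sqrt(dot(v, v))
    return [a / n for a in v]


def matching_pursuit(sample, atoms, n_nonzero=2, tol=1e-9):
    """Greedy sparse coding (an assumed stand-in for the paper's method).

    Approximates `sample` as a combination of at most `n_nonzero`
    unit-norm dictionary `atoms`. Returns (coefficients, residual).
    """
    residual = list(sample)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_nonzero):
        # Correlate the current residual with every atom.
        scores = [dot(residual, a) for a in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        if abs(scores[k]) < tol:
            break  # nothing left to explain
        # Keep the best atom and subtract its contribution.
        coeffs[k] += scores[k]
        residual = [r - scores[k] * a for r, a in zip(residual, atoms[k])]
    return coeffs, residual


# Hypothetical toy dictionary with four atoms in three dimensions.
atoms = [normalize(v) for v in ([1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0])]
sample = [3.0, 3.0, 0.5]

coeffs, residual = matching_pursuit(sample, atoms, n_nonzero=2)
print("coefficients:", coeffs)
print("residual norm:", math.sqrt(dot(residual, residual)))
```

In this toy case the sample is explained by two of the four atoms, so the coefficient vector is sparse and the residual is close to zero. A real pipeline would learn the dictionary from image features and would apply the dimension reduction stage before coding.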

Issue

Vol. 44 No. 2 (2024): LIB PRO. 44(2), JUL-DEC 2024 (Published: 01-07-2024)

Section

Articles
