Extract Rich Information from Images and Video using Custom Vision Cognitive Services

Amr Elmaghraby

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 184 - Number 16

Year of Publication: 2022

Authors: Amr Elmaghraby

10.5120/ijca2022922152

Amr Elmaghraby. Extract Rich Information from Images and Video using Custom Vision Cognitive Services. International Journal of Computer Applications. 184, 16 (Jun 2022), 15-28. DOI=10.5120/ijca2022922152

@article{10.5120/ijca2022922152,
  author = {Amr Elmaghraby},
  title = {Extract Rich Information from Images and Video using Custom Vision Cognitive Services},
  journal = {International Journal of Computer Applications},
  issue_date = {Jun 2022},
  volume = {184},
  number = {16},
  month = {Jun},
  year = {2022},
  issn = {0975-8887},
  pages = {15-28},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume184/number16/32402-2022922152/},
  doi = {10.5120/ijca2022922152},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-07T01:21:35.598980+05:30
%A Amr Elmaghraby
%T Extract Rich Information from Images and Video using Custom Vision Cognitive Services
%J International Journal of Computer Applications
%@ 0975-8887
%V 184
%N 16
%P 15-28
%D 2022
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Computer vision is a branch of artificial intelligence (AI) that allows computers and systems to extract useful information from digital images, videos, and other visual inputs in order to address real-world visual problems. Machine learning is an area of AI, and deep learning is a subfield of machine learning. Cognitive Services are a set of machine learning methods based on data mining, and cognitive machine learning is a type of AI created to address such problems. Deep learning (DL) uses neural networks with more than one hidden layer of neurons to solve problems in domains such as computer vision. DL-powered computer vision adds real-world benefit to a variety of businesses: such systems are more accurate quality inspectors than humans, make fewer mistakes, and do not mind doing tedious, repetitive tasks all day [1].

Cognitive services algorithms are used in a variety of industries to help businesses and improve our daily lives. One of these domains is image classification, which uses convolutional neural networks to help humans discover the key components of a picture. The purpose of this paper is to introduce the Custom Vision Service of the Microsoft Azure framework. The Azure Custom Vision Service allows you to create, develop, and deploy high-quality image identification models, and the paper also shows how to make a Custom Vision Service model better. The quality of the classifier or object detector depends on the amount, quality, and variety of the labelled data you provide, as well as on the balance of the dataset as a whole. A good model has a well-balanced training dataset that is representative of the data it will be given. Creating such a model is an iterative process, and it is normal to go through several rounds of training before getting the desired results.

Convolutional neural networks, a cutting-edge technology with massive learning capacity, are used in the Custom Vision Service. Because constructing a convolutional neural network is time-consuming and requires expertise that most engineers lack, the Custom Vision Service supplies this component for building a classifier. The service analyses images using a machine learning algorithm. You can use Custom Vision to create your own labels and train custom models to detect them; each label denotes a different class or object. You submit groups of images that do and do not have the characteristics in question, and the images are labelled at the time of submission. The algorithm then trains on this data and calculates its own accuracy by testing itself on those same images. The model is trained by iterating over the entire dataset several times and is evaluated on the basis of the test results. The trained model can be downloaded and used without an internet connection.

Azure Cognitive Services provides a wide range of AI solutions. Because the Custom Vision service is optimized for quickly detecting significant differences between images, we can begin building our model with a small amount of data. We use 15 images in Custom Vision (the minimum required), although Microsoft recommends at least 50 images of different types to improve prediction accuracy. The proposed system can handle JPEG images, MPEG-1 bitstreams, and live video inputs, and its procedures can also be run individually and autonomously. Once training is complete, the model can be published and accessed through the Custom Vision API. The primary goal of Azure Custom Vision is to aid the image prediction process. The second proposed experiment uses Java to build an integration with the Video Indexer service in order to improve it even further.
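
The dataset guidance and the evaluation step described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's code and does not call the Custom Vision service. The function names, the `balance_ratio` cutoff, and the choice to apply the 15-image and 50-image thresholds per tag are assumptions made for illustration only; the precision and recall formulas are the standard definitions behind the terms listed in the keywords.

```python
from collections import Counter

MIN_IMAGES = 15          # minimum stated in the abstract
RECOMMENDED_IMAGES = 50  # Microsoft's recommendation, as quoted in the abstract


def check_dataset(labels, balance_ratio=2.0):
    """Report per-tag image counts and flag small or unbalanced tags.

    `labels` holds one tag name per training image. Applying the thresholds
    per tag and the `balance_ratio` cutoff are assumptions of this sketch.
    """
    counts = Counter(labels)
    if not counts:
        return {}
    smallest = min(counts.values())
    report = {}
    for tag, n in counts.items():
        if n < MIN_IMAGES:
            status = "too few images"
        elif n < RECOMMENDED_IMAGES:
            status = "below recommended"
        else:
            status = "ok"
        report[tag] = {
            "count": n,
            "status": status,
            # A tag is flagged when it has far more images than the rarest tag.
            "overrepresented": n > balance_ratio * smallest,
        }
    return report


def precision_recall(true_tags, predicted_tags, tag):
    """Compute precision and recall for one tag from paired label lists."""
    pairs = list(zip(true_tags, predicted_tags))
    tp = sum(1 for t, p in pairs if t == tag and p == tag)  # true positives
    fp = sum(1 for t, p in pairs if t != tag and p == tag)  # false positives
    fn = sum(1 for t, p in pairs if t == tag and p != tag)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

For example, `check_dataset(["apple"] * 60 + ["cherry"] * 10)` marks `cherry` as having too few images and `apple` as overrepresented, which is the kind of imbalance the abstract warns against before another training round.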

References

Frida Femling, Adam Olsson and Fernando Alonso-Fernandez 2018 Fruit and Vegetable Identification Using Machine Learning for Retail Applications 14th Int. Conf. on Signal-Image Technology & Internet-Based Systems (SITIS), p. 9-15
Y LeCun, L Bottou, Y Bengio and P Haffner 1998 Gradient-based learning applied to document recognition Proc. of the IEEE vol. 86, p. 2278-2324
X Yuan et al 2018 Deep Learning-Based Feature Representation and Its Application for Soft Sensor Modeling with Variable-Wise Weighted SAE IEEE Transactions on Industrial Informatics 14, p. 3235-43
Z Gao, L Wang, L Zhou and J Zhang 2017 HEp-2 Cell Image Classification with Deep Convolutional Neural Networks IEEE J. Biomed. Heal. Informatics vol. 21, p. 416-28
T Kooi et al 2017 Large scale deep learning for computer-aided detection of mammographic lesions Med. Image Anal. 35, p. 303-12
S. Du and R. K. Ward, "Adaptive Region-Based Image Enhancement Method for Robust Face Recognition Under Variable Illumination Conditions," IEEE Transactions on Circuits and Systems for Video Technology, vol. 20, no. 9, pp. 1165-1175, 2010.
Y. Zhu, J. Sun, and S. Naoi, “Recognizing Natural Scene Characters by Convolutional Neural Network and Bimodal Image Enhancement,” in Camera-Based Document Analysis and Recognition, vol. 7139, M. Iwamura and F. Shafait, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012, pp. 69–82.
Q. Tian, G. Xie, Y. Wang, and Y. Zhang, “Pedestrian Detection Based on Laplace Operator Image Enhancement Algorithm and Faster R-CNN,” in 2018 11th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), Beijing, China, 2018, pp. 1–5.
Lucas, S.M., Panaretos, A., Sosa, L., Tang, A., Wong, S., Young, R.: ICDAR 2003 robust reading competitions. In: 7th International Conference on Document Analysis and Recognition, Edinburgh, Scotland, vol. 2, pp. 682–687 (2003)
Alex Krizhevsky, Geoffrey Hinton, Technical Report, Learning Multiple Layers of Features from Tiny Images, vol. 1, University of Toronto, 2009. No. 4.
Yann LeCun, et al., Gradient-based learning applied to document recognition, Proc. IEEE (1998) 2278–2324.
Feng, X., Jiang, Y., Yang, X., Du, M. and Li, X., 2019. Computer vision algorithms and hardware implementations: A survey. Integration, 69, pp.309-320.
Ying, X., 2019. An Overview of Overfitting and its Solutions. Journal of Physics: Conference Series, 1168, p.022022.
Bossert, L. and Hagendorff, T., 2021. Animals and AI. The role of animals in AI research and application – An overview and ethical evaluation. Technology in Society, 67, p.101678.
Oosthuizen, K., Botha, E., Robertson, J. and Montecchi, M., 2020. Artificial intelligence in retail: The AI-enabled value chain. Australasian Marketing Journal, 29(3), pp.264-273.
Plinere, D. and Borisov, A., 2015. Case Study on Inventory Management Improvement. Information Technology and Management Science, 18(1).
B., 2014. A STUDY ON THE IMPORTANCE OF IMAGE PROCESSING AND ITS APPLICATIONS. International Journal of Research in Engineering and Technology, 03(15), pp.155-160.
L'Heureux, A., Grolinger, K., Elyamany, H. and Capretz, M., 2017. Machine Learning With Big Data: Challenges and Approaches. IEEE Access, 5, pp.7776-7797.
Razali, M. and Manshor, N., 2018. Object Detection Framework for Multiclass Food Object Localization and Classification. Advanced Science Letters, 24(2), pp.1357-1361.
Kene, Y., Khot, U. and Rizvi, I., 2018. A Survey of Image Classification and Techniques for Improving Classification Performance. SSRN Electronic Journal.

Index Terms

Computer Science

Information Sciences

Keywords

Artificial Intelligence, Big Data, Cloud Computing, Deep Learning, Machine Learning, Image Classification, Custom Vision Service, Convolutional Neural Network, Precision and Recall