Assistive Application for the Visually Impaired using Machine Learning and Image Processing

Bibliographic Details
Published in: 2023 14th International Conference on Computing Communication and Networking Technologies (ICCCNT), pp. 1-6
Main Authors: Sreerenganathan, Abirami; K P, Vyshali Rao; Dhanalakshmi, M
Format: Conference Proceeding
Language: English
Published: IEEE 06-07-2023
Description
Summary: The task of interpreting visual information poses a challenge for artificial intelligence due to the intricate and diverse characteristics of visual data. Visual data can be noisy and incomplete, which complicates a machine's attempt to accurately comprehend and decipher the meaning of an image. In this paper, a new method of image captioning for people who are blind is proposed. The method uses a CNN-LSTM architecture, in which a CNN extracts visual features from the image and an LSTM generates a text-based description from these features. A large dataset of images and their corresponding captions is used to train the proposed model, and its effectiveness is assessed using the BLEU metric. The model is validated on the benchmark Flickr8K dataset. Experimental results demonstrate that the proposed technique can produce relevant and accurate descriptions, which can help visually impaired people access visual content. This method has the potential to bridge the gap in access to visual media for the visually impaired.
ISSN: 2473-7674
DOI: 10.1109/ICCCNT56998.2023.10307136
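
As a complement to the record above, here is a minimal sketch of the CNN-LSTM pipeline the abstract describes: a pretrained CNN encodes the image into a fixed-length feature vector, and an LSTM decoder generates the caption word by word. The paper does not specify its backbone or hyperparameters, so the InceptionV3 encoder, the 256-unit layers, and the vocab_size and max_len values below are illustrative assumptions, not the authors' reported configuration.

    # Illustrative CNN-LSTM captioning sketch; hyperparameters are assumptions.
    import numpy as np
    from tensorflow.keras.applications import InceptionV3
    from tensorflow.keras.applications.inception_v3 import preprocess_input
    from tensorflow.keras.preprocessing import image
    from tensorflow.keras.layers import Input, Dense, Dropout, Embedding, LSTM, add
    from tensorflow.keras.models import Model

    vocab_size = 8000  # assumed caption vocabulary size for Flickr8K
    max_len = 34       # assumed maximum caption length in tokens

    # Encoder: a pretrained CNN, truncated at its pooled 2048-d feature layer.
    cnn = InceptionV3(weights="imagenet")
    encoder = Model(cnn.input, cnn.layers[-2].output)

    def extract_features(img_path):
        # Load and preprocess one image; return its 2048-d feature vector.
        img = image.load_img(img_path, target_size=(299, 299))
        x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
        return encoder.predict(x, verbose=0)

    # Decoder: an LSTM over the partial caption, merged with the image
    # features, predicts the next word (a merge-style architecture).
    img_in = Input(shape=(2048,))
    img_emb = Dense(256, activation="relu")(Dropout(0.5)(img_in))

    seq_in = Input(shape=(max_len,))
    seq_emb = Embedding(vocab_size, 256, mask_zero=True)(seq_in)
    seq_feat = LSTM(256)(Dropout(0.5)(seq_emb))

    merged = Dense(256, activation="relu")(add([img_emb, seq_feat]))
    next_word = Dense(vocab_size, activation="softmax")(merged)

    caption_model = Model([img_in, seq_in], next_word)
    caption_model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")

At inference time the decoder would be run greedily, feeding each predicted word back in until an end token or max_len is reached; the generated captions can then be scored against the Flickr8K reference captions with BLEU, for example NLTK's corpus_bleu(references, hypotheses, weights=(1.0, 0, 0, 0)) for BLEU-1.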