Image Description Generator using Residual Neural Network and Long Short-Term Memory

Computer Science Journal of Moldova (2023)

Abstract
Humans can easily describe the scenes and objects in a picture by sight, whereas performing the same task with a computer is complicated. Generating captions for the objects in an image helps people understand the scene it depicts. Automatically describing the content of an image requires an understanding of both computer vision and natural language processing. The task has become very popular, and a great deal of research is being carried out on it. Recent work has succeeded in identifying the objects in an image but still faces many challenges in generating accurate captions that reflect an understanding of the scene. To address this challenge, we propose a model that generates a caption for an image. A Residual Neural Network (ResNet) extracts features from the image, and these features are converted into a vector of size 2048. The caption is then generated with a Long Short-Term Memory (LSTM) network. The proposed model was evaluated on the Flickr8K dataset and obtained an accuracy of 88.4%. The experimental results indicate that our model produces appropriate captions compared with state-of-the-art models.
Keywords
image, description generator, ResNet, LSTM
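
The pipeline described in the abstract (a 2048-dimensional image feature vector fed to an LSTM that emits a caption word by word) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the feature vector initializes the LSTM hidden state, uses greedy decoding, omits bias terms, and runs with random, untrained weights, so the words it produces are meaningless. The ResNet step is replaced by a placeholder vector, and every name here (`TinyCaptioner`, `VOCAB`, `HIDDEN_DIM`) is hypothetical.

```python
import math
import random

FEATURE_DIM = 2048  # image feature size stated in the abstract
HIDDEN_DIM = 16     # hypothetical LSTM state size, kept tiny for illustration
VOCAB = ["<start>", "<end>", "a", "dog", "runs", "on", "grass"]  # toy vocabulary (assumption)


def rand_matrix(rows, cols, rng, scale=0.1):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyCaptioner:
    """Untrained LSTM decoder conditioned on an image feature vector."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        # Assumption: the image feature is projected to the initial hidden state.
        self.w_init = rand_matrix(HIDDEN_DIM, FEATURE_DIM, rng)
        self.embed = rand_matrix(len(VOCAB), HIDDEN_DIM, rng)
        # One weight matrix per gate, applied to [word embedding ; hidden state].
        self.gates = {g: rand_matrix(HIDDEN_DIM, 2 * HIDDEN_DIM, rng) for g in "ifog"}
        self.w_out = rand_matrix(len(VOCAB), HIDDEN_DIM, rng)

    def lstm_step(self, x, h, c):
        """One standard LSTM cell update (bias terms omitted for brevity)."""
        z = x + h  # list concatenation of input and previous hidden state
        i = [sigmoid(v) for v in matvec(self.gates["i"], z)]    # input gate
        f = [sigmoid(v) for v in matvec(self.gates["f"], z)]    # forget gate
        o = [sigmoid(v) for v in matvec(self.gates["o"], z)]    # output gate
        g = [math.tanh(v) for v in matvec(self.gates["g"], z)]  # candidate cell state
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new

    def caption(self, features, max_len=10):
        """Greedily decode up to max_len words from an image feature vector."""
        if len(features) != FEATURE_DIM:
            raise ValueError(f"expected {FEATURE_DIM} features, got {len(features)}")
        h = [math.tanh(v) for v in matvec(self.w_init, features)]
        c = [0.0] * HIDDEN_DIM
        token = VOCAB.index("<start>")
        words = []
        for _ in range(max_len):
            h, c = self.lstm_step(self.embed[token], h, c)
            scores = matvec(self.w_out, h)
            token = max(range(len(VOCAB)), key=scores.__getitem__)  # greedy choice
            if VOCAB[token] == "<end>":
                break
            words.append(VOCAB[token])
        return words


if __name__ == "__main__":
    # Placeholder standing in for the ResNet output; a real system would
    # compute this vector from an image.
    fake_features = [0.5] * FEATURE_DIM
    print(TinyCaptioner().caption(fake_features))
```

In a trained system the weights would be learned from image and caption pairs such as those in Flickr8K, and the placeholder vector would come from a ResNet with its classification layer removed.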