Research on Image Caption Algorithm Based on Improved Attention Mechanism

Conference: ISCTT 2021 - 6th International Conference on Information Science, Computer Technology and Transportation
November 26-28, 2021, in Xishuangbanna, China

Proceedings: ISCTT 2021

Pages: 5 | Language: English | Type: PDF


Ke, Jie; Zeng, Shangyou; Wang, Jinjin (School of Electronic Engineering, Guangxi Normal University, Gui Lin, China)

Image captioning produces a natural language description of a given picture. The output sentences are required to conform to natural language habits, point out the important information in the image, and cover the scene and the actions of the main characters. In the encoder of the proposed model, ResNet101 extracts the features of the picture; the decoder uses a long short-term memory (LSTM) network together with a new, improved attention mechanism that strengthens the correlation between the picture and the words, and finally outputs the natural language caption. The model is compared and verified on the public Flickr8k dataset and evaluated with several metrics (BLEU, METEOR). The experimental results show that, compared with the traditional attention mechanism model, the image caption generation model based on the improved attention mechanism improves accuracy on the image captioning task and offers significant advantages.
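
For orientation, the following is a minimal sketch of conventional additive (soft) attention, the kind of baseline the abstract refers to as the "traditional attention mechanism". It is not the authors' improved mechanism, which the abstract does not specify. The function names, the parameters `w_feat`, `w_hid`, and `v`, and the toy values are assumptions made for illustration only; in a real model the region features would come from the CNN encoder and the hidden state from the LSTM decoder.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(features, hidden, w_feat, w_hid, v):
    """Conventional additive attention over image regions (illustrative only).

    features: list of region feature vectors (stand-in for encoder output)
    hidden:   current decoder hidden state (stand-in for the LSTM state)
    w_feat, w_hid, v: hypothetical learned parameters
    Returns (weights, context): one weight per region and the weighted
    sum of the region features.
    """
    projected_hidden = matvec(w_hid, hidden)
    scores = []
    for region in features:
        projected_region = matvec(w_feat, region)
        activation = [math.tanh(a + b)
                      for a, b in zip(projected_region, projected_hidden)]
        scores.append(sum(vi * ai for vi, ai in zip(v, activation)))
    weights = softmax(scores)
    dim = len(features[0])
    context = [sum(w * region[d] for w, region in zip(weights, features))
               for d in range(dim)]
    return weights, context


# Toy inputs (assumed values, not taken from the paper).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hidden = [0.5, -0.5]
w_feat = [[1.0, 0.0], [0.0, 1.0]]
w_hid = [[1.0, 0.0], [0.0, 1.0]]
v = [1.0, 1.0]

weights, context = soft_attention(features, hidden, w_feat, w_hid, v)
```

At each decoding step the decoder would receive `context` alongside the previous word embedding, so that regions with larger weights contribute more to the next predicted word.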