Diverse Search Methods and Multi-Modal Fusion for High-Performance Video Retrieval.

Sieu Tran, Duc Minh Nguyen, Triet Huynh Minh Nguyen, Danh Phuc Ngo, Thu Minh Nguyen, Hao Vo, Khiem Le, Tien Do, Thanh Duc Ngo

Symposium on Information and Communication Technology (2023)

Abstract
Querying events in large video datasets is a prominent research topic in multimedia information retrieval. High-performance retrieval in this setting requires efficient extraction and effective storage of information from videos so that queries can be answered quickly. These challenges become especially pronounced with large datasets. In this paper, we introduce a system for event querying in video data. Our system is designed to optimize retrieval speed and to organize storage efficiently, using FAISS and ElasticSearch. It can process diverse forms of input, including textual video descriptions, Optical Character Recognition (OCR) results, Automatic Speech Recognition (ASR) transcriptions, visually similar images, and details about objects within videos, such as color and quantity. Our system can also extract information about the temporal sequence of events within videos, a particularly challenging task when extracting information from video frames. By combining these input types, our system achieves its best retrieval results.
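
The abstract does not state how results from the different input types are combined. As a minimal sketch of one common option, the snippet below applies reciprocal rank fusion (RRF) to per-modality ranked lists. RRF is a stand-in chosen for illustration, not the authors' confirmed method. The function name, the constant k=60, the modality names, and the frame IDs are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse per-modality rankings of frame IDs into a single ranking.

    ranked_lists maps a modality name (hypothetical, e.g. "text", "ocr")
    to a list of frame IDs ordered from best to worst match.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists.values():
        for rank, frame_id in enumerate(ranking, start=1):
            # Each modality contributes 1 / (k + rank) for a frame it returned.
            scores[frame_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by frame ID for determinism.
    return sorted(scores, key=lambda f: (-scores[f], f))


# Hypothetical results, e.g. from a vector index (text) and a text index (OCR, ASR).
results = {
    "text": ["v1_f10", "v2_f03", "v1_f42"],
    "ocr": ["v2_f03", "v1_f10"],
    "asr": ["v2_f03", "v3_f07"],
}
fused = reciprocal_rank_fusion(results)
print(fused)
```

Rank-based fusion avoids having to calibrate similarity scores that come from different backends, which is why it is a convenient default when combining vector-search and keyword-search results. A frame returned by several modalities rises to the top even if no single modality ranks it first.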