Publications

LifeInsight: An Interactive Lifelog Retrieval System with Comprehensive Spatial Insights and Query Assistance

Tien-Thanh Nguyen-Dang, Xuan-Dang Thai, Gia-Huy Vuong, Van-Son Ho, Minh-Triet Tran, Van-Tu Ninh, Minh-Khoi Pham, Tu-Khiem Le, and Graham Healy

Proceedings of the 6th Annual ACM Lifelog Search Challenge (ICMR), 2023

In this paper, we introduce LifeInsight, an interactive lifelog retrieval system developed for the sixth annual Lifelog Search Challenge (LSC’23). LifeInsight incorporates semantic search mechanisms from state-of-the-art lifelog retrieval systems while focusing on providing insights into the lifelogger’s routine using spatial information to support question-answering tasks. The system employs the Bootstrapping Language-Image Pre-training (BLIP) model for zero-shot image-text retrieval, which has been shown to achieve higher recall scores than the CLIP model on the Flickr30K dataset. In addition, an Elasticsearch filtering mechanism is used to remove irrelevant images. Apart from semantic search, the system also supports visual similarity search by computing the inner product between the query image's vector and the vectors of the lifelog image corpus. Furthermore, the system includes an explicit relevance feedback function, AI-based query description rewriting, and visual example generation, which rephrase the query into a clearer description and help end-users envision the target image.

  • interactive retrieval
  • AI-based assistance
  • spatial insights
  • lifelog

ViewsInsight2.0: Enhancing Video Retrieval for VBS 2025 with an Automatic Query Generator Powered by Large Language Models

Gia-Huy Vuong, Van-Son Ho, Tien-Thanh Nguyen-Dang, Xuan-Dang Thai, Minh-Quan Ho-Le, Tu-Khiem Le, Minh-Khoi Pham, Van-Tu Ninh, Cathal Gurrin, and Minh-Triet Tran

International Conference on Multimedia Modeling, 2025

We present ViewsInsight2.0, an advanced iteration of the ViewsInsight system, purpose-built for the Video Browser Showdown (VBS) 2025. ViewsInsight2.0 retains the core strengths of ViewsInsight while introducing significant enhancements to address previous performance limitations. This optimized architecture is designed to deliver exceptional search capabilities tailored for VBS 2025. At its core, ViewsInsight2.0 integrates the CLIP (Contrastive Language-Image Pre-training) model, trained on DFN-5B. Furthermore, we have refined our temporal query mechanism with a more efficient algorithm and a user-friendly interface. In addition, ViewsInsight2.0 utilizes an automatic query generator powered by open-source large language models to efficiently optimize user input queries.

  • video retrieval
  • large language model
  • vision language model

NewsInsight2.0: An Enhanced Version Integrating Large Language Model-based Query Optimization with Advanced Temporal Mechanisms

Gia-Huy Vuong, Van-Loc Nguyen, Van-Son Ho, Tien-Thanh Nguyen-Dang, Ngoc-Do Tran, Van-Tu Ninh and Minh-Triet Tran

The 12th International Symposium on Information and Communication Technology (SOICT), 2024

We introduce NewsInsight2.0, a cutting-edge evolution of the NewsInsight system, specifically designed for the Ho Chi Minh AI Challenge 2024. Building on the strengths of ViewsInsight, NewsInsight2.0 addresses previous performance limitations with significant enhancements. Our optimized architecture is tailored to deliver exceptional search capabilities for AIC 2024. At its core, NewsInsight2.0 leverages the CLIP (Contrastive Language-Image Pre-training) model, trained on DFN-5B, a dataset of 5 billion image-text pairs. Additionally, we have refined our temporal query mechanism with a more efficient algorithm and an intuitive user interface. Furthermore, NewsInsight2.0 features an automatic query generator powered by open-source large language models, which streamlines the optimization of user input queries.

  • news event retrieval
  • machine learning
  • image processing

Automatic Sub-Task Focus: LifeInsight’s Contribution to NTCIR-17 Lifelog-5

Thang-Long Nguyen-Ho, Tien-Thanh Nguyen-Dang, Gia Huy Vuong, Van-Son Ho, Xuan-Dang Thai, Minh-Khoi Pham, Tu-Khiem Le, Van-Tu Ninh, and Minh-Triet Tran

NTCIR-17, 2023

As the demand for personalized data retrieval systems continues to grow, recent research has emphasized the development of lifelog retrieval mechanisms. Much of this work has studied how to integrate user interactions and feedback into search engines. In this paper, we introduce the automation approach of LifeInsight, a retrieval system designed explicitly for the NTCIR-17 Lifelog-5 Automatic Task, facilitating a seamless search experience and efficient data mining. Our method is a two-step process: we first enrich the metadata from the raw query, then compose the retrieval method from the input entities. Our proposed system not only enhances the search process but also ensures a comprehensive and detailed analysis of lifelog data for diverse applications. By focusing primarily on the automatic sub-task, we demonstrate the efficacy of our LifeInsight retrieval algorithm, showcasing competitive results that rival those of an expert user.

  • lifelog
  • automatic retrieval
  • large language model

ViewsInsight: Enhancing Video Retrieval for VBS 2024 with a User-Friendly Interaction Mechanism

Gia-Huy Vuong, Van-Son Ho, Tien-Thanh Nguyen-Dang, Xuan-Dang Thai, Tu-Khiem Le, Minh-Khoi Pham, Van-Tu Ninh, Cathal Gurrin, and Minh-Triet Tran

International Conference on Multimedia Modeling, 2024

ViewsInsight revolutionizes video content retrieval with its comprehensive suite of AI-powered features, enabling users to locate relevant videos using a variety of query types effortlessly. Its intelligent query description rewriting capability ensures precise video matching, while the visual example generation feature provides a powerful tool for refining search results. Additionally, the temporal query mechanism allows users to easily pinpoint specific video segments. The system's intuitive chat-based interface seamlessly integrates these advanced features.

LifeInsight2.0: An Enhanced Approach for Automated Lifelog Retrieval in LSC'24

Gia Huy Vuong, Van-Son Ho, Tien-Thanh Nguyen-Dang, Xuan-Dang Thai, Thang-Long Nguyen-Ho, Minh-Khoi Pham, Tu-Khiem Le, Van-Tu Ninh, Graham Healy, Cathal Gurrin, and Minh-Triet Tran

Proceedings of the 7th Annual ACM Workshop on the Lifelog Search Challenge (ICMR), 2024

We introduce LifeInsight 2.0, an enhanced version of LifeInsight built specifically for the seventh annual Lifelog Search Challenge (LSC'24). LifeInsight 2.0 leverages the core functionalities of LifeInsight while incorporating significant improvements to address performance bottlenecks. This refined architecture aims to deliver superior search capabilities for LSC'24. LifeInsight 2.0 employs an ensemble approach combining two powerful foundation models: CLIP (Contrastive Language-Image Pretraining) and BLIP2 (Bootstrapping Language-Image Pretraining). In addition, the system incorporates a temporal query mechanism and an automatic query parser. The former enables LifeInsight 2.0 to interpret queries that include time-based information, while the latter specifically handles tasks involving question answering.

  • lifelog
  • interactive retrieval
  • automatic retrieval
  • spatial insights
  • AI-based assistance

Interactive Sub-Task Focus: LifeInsight’s Contribution to NTCIR-17 Lifelog-5

Gia Huy Vuong, Van-Son Ho, Tien-Thanh Nguyen-Dang, Xuan-Dang Thai, Thang-Long Nguyen-Ho, Minh-Khoi Pham, Tu-Khiem Le, Van-Tu Ninh, and Minh-Triet Tran

NTCIR-17, 2023

This paper presents LifeInsight, a robust lifelog retrieval system designed specifically for the NTCIR-17 Lifelog-5 Task. It focuses on the interactive sub-task, evaluating LifeInsight’s performance under the different interaction approaches employed by various users.
