Multimodal Deception Detection Using Machine and Deep Learning: A Comparative Study on Audio and Visual Cues

Date
2025
Publisher
Computer Science
Abstract
This thesis explores the use of artificial intelligence for automated deception detection using multimodal data. Traditional lie detection methods—such as polygraph testing and behavioral observation—are limited by subjectivity and poor reliability. In response, this work investigates machine learning and deep learning models applied to audio and visual cues from the Real-Life Trial Deception Dataset (RLDD), which features authentic courtroom testimonies. The proposed pipeline covers feature extraction, model training, and system deployment. Audio features were extracted using the ComParE_2016 set, while visual features were generated through Vision Transformer (ViT) embeddings. Multiple models, including SVM, XGBoost, Conv1D, BiGRU, and CNN+LSTM, were tested across four configurations: audio-only, visual-only, early fusion, and late fusion. SVM and Conv1D performed best for audio inputs, while BiGRU and CNN+LSTM achieved the highest accuracy for visual inputs. The final system—LieBusters—uses SVM for audio and BiGRU for visual detection, integrated via decision-level late fusion. It features a real-time interface, recording tools, and LIME-based interpretability for ethical transparency. This work provides both a practical prototype and a benchmark for future AI-driven behavioral analysis tools.
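The final LieBusters system combines the per-modality classifiers at the decision level rather than concatenating features. A minimal sketch of that late-fusion step is below; the weighted-average rule, the 0.5 decision threshold, and the function names are illustrative assumptions, since the abstract does not specify the exact fusion scheme.

```python
def late_fusion(p_audio: float, p_visual: float, w_audio: float = 0.5) -> str:
    """Decision-level late fusion of two deception probabilities.

    p_audio  -- probability of 'deceptive' from the audio classifier (SVM)
    p_visual -- probability of 'deceptive' from the visual classifier (BiGRU)
    w_audio  -- weight given to the audio modality (assumed equal weighting)
    """
    # Fuse the two independent decisions into a single score.
    fused = w_audio * p_audio + (1.0 - w_audio) * p_visual
    # Threshold the fused score to produce the final label.
    return "deceptive" if fused >= 0.5 else "truthful"


# Example: audio strongly suggests deception, visual leans truthful.
print(late_fusion(0.8, 0.3))  # -> deceptive (fused score 0.55)
```

Because each modality is trained and scored independently, this design lets either classifier be retrained or replaced without touching the other, which matches the modular pipeline the thesis describes.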