Amsterdam, The Netherlands  /  June 30, 2025

5th ACM International Workshop on Multimedia AI against Disinformation (MAD’26)

On June 16, 2026, the 5th edition of the ACM International Workshop on Multimedia AI against Disinformation (MAD’26), organized in conjunction with the ACM International Conference on Multimedia Retrieval (ICMR’26), will take place in Amsterdam, The Netherlands. The workshop welcomes contributions on all aspects of AI-powered disinformation detection, analysis, and mitigation.

Fraunhofer IDMT supports the organization of the workshop and will present its latest research findings there.

Detecting Audio-Text Decontextualization through Entailment and Semantic Analysis

Milica Gerhardt, Luca Cuccovillo, Patrick Aichroth

Audio-text decontextualization is a form of real-world misinformation in which genuine audio recordings – speech excerpts, news clips, interviews – are detached from their authentic context and paired with misleading textual narratives. Addressing it in practice requires both audio provenance analysis and context analysis: provenance retrieves candidate source recordings, while context analysis determines whether the recovered source supports the narrative attached to the post. This paper presents three context-analysis pipelines for this task, together with their cascade combinations, and evaluates them on the M3A dataset alongside four audio-language baselines. We show that a substantial fraction of M3A manipulations are fundamentally undetectable from audio-text content alone, and that, on the subset where detection is possible, our best pipelines reach 0.73 accuracy on Named Entity Manipulation (NEM) and 0.92 on Multimodal Misalignment (MM) audio swap. Building on these findings, we formulate an operational workflow for real-world investigations and demonstrate it on three case studies, which also motivate a lightweight linguistic middle layer for conditional and modal/hedging framing drops. This leads to two practical deployment recommendations: (1) a fast bulk-screening pipeline that flags context-stripping attacks via entailment failure; and (2) a large language model (LLM)-based deep-verification pipeline for the most suspicious cases, capable of explicit reasoning about framing shifts.
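To illustrate the bulk-screening recommendation, the sketch below flags a post when the retrieved source transcript fails to entail the narrative attached to the post. This is a minimal illustration, not the paper's implementation: the `entailment_score` function here is a hypothetical token-overlap placeholder; a real system would use a trained natural language inference (NLI) model.

```python
def entailment_score(premise: str, hypothesis: str) -> float:
    """Toy entailment proxy: fraction of hypothesis tokens present in the premise.
    Placeholder only -- a deployed system would score entailment with an NLI model.
    """
    premise_tokens = set(premise.lower().split())
    hypothesis_tokens = hypothesis.lower().split()
    if not hypothesis_tokens:
        return 1.0
    hits = sum(1 for tok in hypothesis_tokens if tok in premise_tokens)
    return hits / len(hypothesis_tokens)


def flag_context_stripping(source_transcript: str, post_narrative: str,
                           threshold: float = 0.5) -> bool:
    """Flag the post when the source transcript fails to entail its narrative."""
    return entailment_score(source_transcript, post_narrative) < threshold


transcript = "the minister said taxes will not rise this year"
# A narrative supported by the source passes the screen ...
print(flag_context_stripping(transcript, "taxes will not rise this year"))  # False
# ... while an unrelated claim triggers an entailment failure and is flagged.
print(flag_context_stripping(transcript, "war was announced today"))        # True
```

In a production pipeline this check would run over all posts paired with their retrieved sources, and only the flagged cases would be routed to the slower LLM-based deep-verification stage.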