# Following Clues, Approaching the Truth: Explainable Micro-Video Rumor Detection via Chain-of-Thought Reasoning


This repo provides the official implementation of ExMRD as described in the paper:

**Following Clues, Approaching the Truth: Explainable Micro-Video Rumor Detection via Chain-of-Thought Reasoning** (WWW'25 research track)

## Source Code Structure

```
data        # directory for each dataset
- FakeSV
- FakeTT
- FVC

preprocess  # video preprocessing and CoT preprocessing code

src
- config    # training configs
- model     # ExMRD model
- utils     # training utilities
- data      # ExMRD dataloaders
```

## Dataset

We provide video IDs for each dataset in both temporal and five-fold splits. Due to copyright restrictions, the raw datasets are not included. You can obtain the datasets from their respective original project sites.

### FakeSV

Access the full dataset from ICTMCG/FakeSV, the official repository for "FakeSV: A Multimodal Benchmark with Rich Social Context for Fake News Detection on Short Video Platforms" (AAAI 2023).

### FakeTT

Access the full dataset from ICTMCG/FakingRecipe, the official repository for "FakingRecipe: Detecting Fake News on Short Video Platforms from the Perspective of Creative Process" (ACM MM 2024).

### FVC

Access the full dataset from MKLab-ITI/fake-video-corpus, a dataset of debunked and verified user-generated videos.

## Start

### Environment Setup

To set up the environment, run the following commands:

```bash
# install ffmpeg (on a Debian-based OS; use your distro's package manager otherwise)
apt install ffmpeg
# create and activate the environment with conda
conda create --name ExMRD python=3.12
conda activate ExMRD
pip install -r requirements.txt
```

### Prepare Datasets

1. Obtain the raw dataset (videos and metadata) from its source, and save the raw videos to `{dataset}/videos`.
2. Create `{dataset}/data.jsonl` for each dataset, with one line per video containing its `vid` (video ID), so that every video in the full dataset is listed.
3. Create `{dataset}/label.jsonl` for each dataset, with one line per video containing its `vid` and the associated label (1 or 0); see the sketch after this list.
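
As a minimal illustration of steps 2 and 3, the Python sketch below writes both files. The `vid` values are hypothetical, and the field names are an assumption based on the descriptions above rather than the repo's exact schema:

```python
# Hypothetical sketch for steps 2 and 3: write data.jsonl and label.jsonl.
# The vids and labels below are placeholders; field names are assumed.
import json
from pathlib import Path

dataset = Path("data/FakeSV")                 # assumed dataset root
labels = {"video_0001": 1, "video_0002": 0}   # hypothetical vid -> label map

with open(dataset / "data.jsonl", "w") as f:
    for vid in labels:
        f.write(json.dumps({"vid": vid}) + "\n")

with open(dataset / "label.jsonl", "w") as f:
    for vid, label in labels.items():
        f.write(json.dumps({"vid": vid, "label": label}) + "\n")
```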

### Video Data Preprocessing

Run the following command:

```bash
bash run/preprocess.sh
```

Alternatively, you can preprocess the data manually by following these steps:

1. Sample 16 frames from each video in the dataset and store them in `{dataset}/frames_16`.
2. Extract on-screen text from the sampled keyframes with PaddleOCR and save it to `{dataset}/ocr.jsonl`.
3. Transcribe each video's audio with Whisper and save the transcripts to `{dataset}/transcript.jsonl`.
4. Extract visual features from each video with a pre-trained CLIP-ViT model and save them to `{dataset}/fea/vit_tensor.pt`. A sketch of steps 1 and 4 follows this list.
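
As a rough sketch of steps 1 and 4, the snippet below samples 16 evenly spaced frames with OpenCV and embeds them with a pre-trained CLIP-ViT from Hugging Face. The checkpoint name, file layout, and frame naming are assumptions; `run/preprocess.sh` and the `preprocess` directory contain the actual pipeline:

```python
# Sketch of steps 1 and 4 (not the repo's exact code): sample 16 evenly
# spaced frames per video and extract CLIP-ViT features for them.
from pathlib import Path

import cv2
import torch
from PIL import Image
from transformers import CLIPImageProcessor, CLIPVisionModel

DATASET = Path("data/FakeSV")   # assumed dataset root
N_FRAMES = 16

processor = CLIPImageProcessor.from_pretrained("openai/clip-vit-base-patch32")
model = CLIPVisionModel.from_pretrained("openai/clip-vit-base-patch32").eval()

def sample_frames(video_path: Path, n: int = N_FRAMES) -> list[Image.Image]:
    """Return n evenly spaced RGB frames from a video."""
    cap = cv2.VideoCapture(str(video_path))
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    indices = [round(i * (total - 1) / (n - 1)) for i in range(n)]
    frames = []
    for idx in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
        ok, frame = cap.read()
        if ok:
            frames.append(Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)))
    cap.release()
    return frames

features = {}
for video in sorted((DATASET / "videos").glob("*.mp4")):
    frames = sample_frames(video)
    # step 1: store the sampled frames (frame naming here is a placeholder)
    frame_dir = DATASET / "frames_16" / video.stem
    frame_dir.mkdir(parents=True, exist_ok=True)
    for i, img in enumerate(frames):
        img.save(frame_dir / f"{i:02d}.jpg")
    # step 4: one pooled CLIP embedding per frame -> shape (16, hidden_dim)
    inputs = processor(images=frames, return_tensors="pt")
    with torch.no_grad():
        features[video.stem] = model(**inputs).pooler_output

(DATASET / "fea").mkdir(exist_ok=True)
torch.save(features, DATASET / "fea" / "vit_tensor.pt")
```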

### R3CoT with MLLM Preprocessing

Run the following commands:

```bash
# rename .env.example to .env and set your OpenAI API credentials in it
mv .env.example .env
bash run/cot.sh
```
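
`run/cot.sh` drives the R3CoT preprocessing through the OpenAI API. As a loose illustration of the kind of call it wraps (the model name, prompts, and output handling below are placeholders, not the paper's actual R3CoT prompts):

```python
# Loose illustration only: a single chat-completion call of the kind the CoT
# preprocessing relies on. Model name and prompts are placeholders.
import os

from dotenv import load_dotenv   # python-dotenv; reads the .env created above
from openai import OpenAI

load_dotenv()
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

response = client.chat.completions.create(
    model="gpt-4o-mini",         # placeholder model
    messages=[
        {"role": "system", "content": "You are a fact-checking assistant."},
        {
            "role": "user",
            "content": "Given a micro-video's OCR text and transcript, "
                       "reason step by step about whether it is a rumor.",
        },
    ],
)
print(response.choices[0].message.content)
```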

## Run

```bash
# Run ExMRD for the FakeSV dataset
python src/main.py --config-name ExMRD_FakeSV

# Run ExMRD for the FakeTT dataset
python src/main.py --config-name ExMRD_FakeTT

# Run ExMRD for the FVC dataset
python src/main.py --config-name ExMRD_FVC
```

## Citation

```bibtex
@inproceedings{hong2025following,
  author       = {Hong, Rongpei and Lang, Jian and Xu, Jin and Cheng, Zhangtao and Zhong, Ting and Zhou, Fan},
  booktitle    = {The {Web} {Conference} ({WWW})},
  year         = {2025},
  organization = {ACM},
  title        = {Following Clues, Approaching the Truth: Explainable Micro-Video Rumor Detection via Chain-of-Thought Reasoning},
}
```