all search terms 2024-10-21

Realtime Fake News from Adversarial Feedback

title: Realtime Fake News from Adversarial Feedback

publish date:

2024-10-18

authors:

Sanxing Chen et al.

paper id:

2410.14651v1

abstract:

We show that existing evaluations for fake news detection based on conventional sources, such as claims on fact-checking websites, result in an increasing accuracy over time for LLM-based detectors, even after their knowledge cutoffs. This suggests that recent popular political claims, which form the majority of fake news on such sources, are easily classified using surface-level shallow patterns. Instead, we argue that a proper fake news detection dataset should test a model’s ability to reason factually about the current world by retrieving and reading related evidence. To this end, we develop a novel pipeline that leverages natural language feedback from a RAG-based detector to iteratively modify real-time news into deceptive fake news that challenges LLMs. Our iterative rewrite decreases the binary classification AUC by an absolute 17.5 percent for a strong RAG GPT-4o detector. Our experiments reveal the important role of RAG in both detecting and generating fake news, as retrieval-free LLM detectors are vulnerable to unseen events and adversarial attacks, while feedback from RAG detection helps discover more deceitful patterns in fake news.
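
The feedback loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `Verdict`, `adversarial_rewrite`, `toy_detect`, and `toy_rewrite`, the stopping threshold, and the round limit are all assumptions made for the example, and the toy detector and rewriter are keyword-based stand-ins for the RAG-based detector and LLM rewriter mentioned in the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Verdict:
    fake_score: float  # detector's estimate that the text is fake (0 to 1)
    feedback: str      # natural-language reason the detector gives


# Hypothetical interfaces: a real system would call a RAG detector and an LLM.
Detector = Callable[[str], Verdict]
Rewriter = Callable[[str, str], str]


def adversarial_rewrite(
    news: str,
    detect: Detector,
    rewrite: Rewriter,
    max_rounds: int = 3,      # assumed limit, not taken from the paper
    threshold: float = 0.5,   # assumed "detector is fooled" cutoff
) -> List[Tuple[str, float]]:
    """Return (candidate, fake_score) pairs, one per scored candidate."""
    candidate = rewrite(news, "")  # first fake version, no feedback yet
    verdict = detect(candidate)
    history = [(candidate, verdict.fake_score)]
    for _ in range(max_rounds):
        if verdict.fake_score < threshold:
            break  # the detector no longer flags the candidate
        # Feed the detector's rationale back into the rewriter.
        candidate = rewrite(candidate, verdict.feedback)
        verdict = detect(candidate)
        history.append((candidate, verdict.fake_score))
    return history


# Toy stand-ins so the sketch runs without any model or retrieval backend.
SUSPICIOUS = ["shocking", "secretly"]


def toy_detect(text: str) -> Verdict:
    for word in SUSPICIOUS:
        if word in text:
            return Verdict(0.9, word)  # feedback names the giveaway word
    return Verdict(0.2, "")


def toy_rewrite(text: str, feedback: str) -> str:
    if not feedback:
        # Initial falsification of the real story.
        return text.replace("opened", "secretly closed in a shocking move")
    # Drop the word the detector complained about and tidy whitespace.
    return " ".join(text.replace(feedback, "").split())


if __name__ == "__main__":
    story = "The city opened a new bridge on Monday."
    for text, score in adversarial_rewrite(story, toy_detect, toy_rewrite):
        print(f"{score:.1f}  {text}")
```

In this toy run the detector's score stays high until both giveaway words have been removed, which mirrors the idea that detector feedback steers each rewrite toward less detectable text.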

Compiled by: wanghaisheng. Last updated: 2024-10-21