Some highlights from the abstract:
- Analyzes fact-check requests made to LLMs on X (Grok and Perplexity)
- "exposure to LLM fact-checks meaningfully shifts belief accuracy", to a degree comparable to that observed in studies of professional fact-checking
- 54.5% of Grok ratings and 57.7% of Perplexity ratings agreed with human fact-checkers ("significantly lower than the inter-fact-checker agreement rate of 64.0%"). But "API-access versions of Grok had higher agreement with fact-checkers"
- "Responses to Grok fact-checks are polarized by partisanship when model identity is disclosed, whereas responses to Perplexity are not"
- "Users requesting fact-checks from Grok are much more likely to be Republican than Democratic, while the opposite is true for fact-check requests from Perplexity – indicating emerging polarization in attitudes toward specific AI models."
- "posts from Republican-leaning accounts are more likely to be rated as inaccurate by both LLMs"
- Grok and Perplexity "strongly disagree" (one rates a claim as true and the other as false) 13.6% of the time