Exciting news! Our team at SUMMETIX is proud to present our paper “Argument Summarization and its Evaluation in the Era of Large Language Models” (Altemeyer, Eger, Daxenberger, Chen, Altendorf, Cimiano, Schiller), which has been accepted at EMNLP 2025. The Conference on Empirical Methods in Natural Language Processing (EMNLP) takes place in Suzhou, China, this year and is one of the largest and most prestigious language technology conferences in the world.
In short:
- We used large language models to make argument summaries much better.
- We built two new systems and a smarter way to check summary quality.
- Surprisingly, a smaller model (Qwen3-32B) beat giants like GPT-4o.
So?
- LLMs can seriously boost how we summarize complex debates.
- Our new evaluation method lets AI judge summaries almost like a human.
- And the kicker? Larger models are not necessarily better for this task.
We believe this advances how we summarize and evaluate arguments in the age of LLMs, helping machines summarize not just what’s said, but what matters.


