Adversarial Text: Detection, Quality Enhancement, and Future Challenges in the LLM Era

Abstract:
Adversarial text, input carefully crafted to mislead NLP systems or degrade their performance, poses a growing challenge across a range of language technologies. In this talk, I will present my work on detecting adversarial text and on improving the quality and stability of such texts once they are identified. I will discuss the linguistic and structural characteristics of adversarial inputs, outline current approaches to automatic detection, and introduce techniques for refining adversarial examples so that they become more semantically coherent. While the primary focus will be on traditional NLP systems, I will also reflect on how these techniques might evolve to address the emerging complexities of large language models (LLMs). Looking ahead, I will highlight how adversarial methods could serve not only as defences but also as diagnostic tools for probing and improving LLM robustness, interpretability, and trustworthiness.