AttenMIA: LLM Membership Inference Attack through Attention Signals

Under submission to the IEEE Symposium on Security and Privacy (S&P), 2026

We introduce AttenMIA, the first framework for membership inference attacks on large language models that uses attention signals. We show that internal attention patterns leak whether a sample was included in a model's training data, even when output probabilities are unavailable.
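
To make the idea concrete, the sketch below shows one way attention-based membership features could be extracted with the Hugging Face transformers library. The model choice (gpt2), the entropy statistic, and the downstream member/non-member classifier are illustrative assumptions, not the exact feature set used by AttenMIA.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative sketch: per-layer attention statistics for a candidate sample.
# The model and feature choice (mean attention entropy) are assumptions.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, output_attentions=True)
model.eval()

def attention_features(text: str) -> torch.Tensor:
    """Return one scalar feature per transformer layer (mean attention entropy)."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    feats = []
    # outputs.attentions: tuple of (batch, heads, seq, seq) tensors, one per layer
    for layer_attn in outputs.attentions:
        probs = layer_attn.clamp_min(1e-12)
        entropy = -(probs * probs.log()).sum(dim=-1)  # entropy of each query token's attention
        feats.append(entropy.mean())                  # average over heads and tokens
    return torch.stack(feats)

# Features from known members and non-members could then train a binary classifier
# that scores unseen samples for membership.
scores = attention_features("Example sentence that may have been in training data.")
print(scores.shape)  # one feature per layer
```

In practice, such a feature vector would be computed for reference samples with known membership status and fed to a simple classifier; the point of the sketch is only that attention tensors are accessible without any output probabilities.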

This work exposes a new privacy risk at the architectural level of LLMs and shows that attention mechanisms can act as a previously overlooked side channel. Our results highlight the need for privacy-aware design and defenses that consider internal model representations, not just external model outputs.