Daniel Scalena

Milan, Italy

Hi! I am Daniel, a (double) second-year PhD student at the 🇮🇹 University of Milano - Bicocca and the 🇳🇱 University of Groningen working on interpretability, fairness and security of generative (and non-generative) Large Language Models. My supervisors are Elisabetta Fersini and Malvina Nissim.

My research focuses on the use of interpretability as a tool to make generative models safer, more reliable and less toxic to extend and improve their real-world applications.

In my spare time I take pictures and echo "from NL import infrastructure" > Milan.py.

news

Oct 14, 2025	🗣️ Want to save some tokens AND improve performance? 📝 New paper: EAGER: Entropy-Aware GEneRation for Adaptive Inference-Time Scaling.
May 23, 2025	📢 New paper out! Steering Large Language Models for Machine Translation Personalization, small thread about it on X or BSky.
Oct 02, 2024	📜 Multi-property Steering paper accepted to BlackBoxNLP 2024 (@EMNLP 2024) and 📜 A gentle push funziona benissimo accepted @ CLIC-it conference! 🎉

latest posts

Sep 10, 2023	In-depth Notes to understand more about Transformers
Jul 12, 2023	Let the Models Respond: Interpreting the Detoxification process of LMs

selected publications

EAGER: Entropy-Aware GEneRation for Adaptive Inference-Time Scaling

Daniel Scalena, Leonidas Zotos, Elisabetta Fersini, and 2 more authors

2025

Abs HTML

With the rise of reasoning language models and test-time scaling methods as a paradigm for improving model performance, substantial computation is often required to generate multiple candidate sequences from the same prompt. This enables exploration of different reasoning paths toward the correct solution, however, allocates the same compute budget for each prompt. Grounded on the assumption that different prompts carry different degrees of complexity, and thus different computation needs, we propose EAGer, a training-free generation method that leverages model uncertainty through token-wise entropy distribution to reduce redundant computation and concurrently improve overall performance. EAGer allows branching to multiple reasoning paths only in the presence of high-entropy tokens, and then reallocates the saved compute budget to the instances where exploration of alternative paths is most needed. We find that across multiple open-source models on complex reasoning benchmarks such as AIME 2025, EAGer can reallocate the budget without accessing target labels, achieving the best efficiency-performance trade-off in terms of reasoning length and Pass@k. When target labels are accessible, EAGer generates up to 65% fewer tokens (hence saving compute) and achieves up to 37% improvement in Pass@k compared to the Full Parallel Sampling.
Steering Large Language Models for Machine Translation Personalization

Daniel Scalena^*, Gabriele Sarti^*, Arianna Bisazza, and 2 more authors

2025

Abs HTML

High-quality machine translation systems based on large language models (LLMs) have simplified the production of personalized translations reflecting specific stylistic constraints. However, these systems still struggle in settings where stylistic requirements are less explicit and might be harder to convey via prompting. We explore various strategies for personalizing LLM-generated translations in low-resource settings, focusing on the challenging literary translation domain. We explore prompting strategies and inference-time interventions for steering model generations towards a personalized style, and propose a contrastive framework exploiting latent concepts extracted from sparse autoencoders to identify salient personalization properties. Our results show that steering achieves strong personalization while preserving translation quality. We further examine the impact of steering on LLM representations, finding model layers with a relevant impact for personalization are impacted similarly by multi-shot prompting and our steering method, suggesting similar mechanism at play.
Multi-property Steering of Large Language Models with Dynamic Activation Composition

Daniel Scalena, Gabriele Sarti, and Malvina Nissim

In Proceedings of the 7th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP, Nov 2024

Abs HTML PDF

Activation steering methods were shown to be effective in conditioning language model generation by additively intervening over models’ intermediate representations. However, the evaluation of these techniques has so far been limited to single conditioning properties and synthetic settings. In this work, we conduct a comprehensive evaluation of various activation steering strategies, highlighting the property-dependent nature of optimal parameters to ensure a robust effect throughout generation. To address this issue, we propose Dynamic Activation Composition, an information-theoretic approach to modulate the steering intensity of one or more properties throughout generation. Our experiments on multi-property steering show that our method successfully maintains high conditioning while minimizing the impact of conditioning on generation fluency.