Black-Box On-Policy Distillation of Large Language Models
Best AI papers explained - A podcast by Enoch H. Kang
This paper introduces a novel technique called **Generative Adversarial Distillation (GAD)** for knowledge transfer from a large, proprietary teacher language model (LLM), such as GPT-5-Chat, to a smaller student LLM in a **black-box setting**. Black-box distillation is necessary when the student only has access to the teacher’s final text outputs, not its internal parameters or probabilities. GAD frames the distillation process as a **minimax game** similar to a Generative Adversarial Network (GAN), where the student acts as a generator and an adaptive discriminator learns to distinguish the student’s outputs from the teacher’s, providing **on-policy feedback** without relying on likelihood-based objectives. Experimental results confirm that GAD **consistently outperforms** traditional sequence-level knowledge distillation (SeqKD) across various benchmarks, especially in terms of out-of-distribution generalization. The research validates GAD as a stable and effective method for extracting knowledge from closed-source LLMs by treating the discriminator as a continually evolving reward model.
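To make the minimax loop described above concrete, here is a minimal, self-contained PyTorch sketch of the general idea, not the paper's actual implementation. Toy MLP "policies" over random prompt vectors stand in for real LLMs, the frozen `teacher` module stands in for black-box teacher text outputs, and the REINFORCE-style student update, module names, and dimensions are illustrative assumptions rather than details taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, RESP_LEN, PROMPT_DIM, HIDDEN = 32, 8, 16, 64

class Policy(nn.Module):
    """Toy stand-in for an LLM: maps a prompt vector to per-position token logits."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(PROMPT_DIM, HIDDEN), nn.ReLU(),
            nn.Linear(HIDDEN, RESP_LEN * VOCAB),
        )
    def forward(self, prompts):
        return self.net(prompts).view(-1, RESP_LEN, VOCAB)  # [B, T, V]

class Discriminator(nn.Module):
    """Scores a (prompt, response) pair; a high score means 'looks like the teacher'."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(PROMPT_DIM + RESP_LEN * VOCAB, HIDDEN), nn.ReLU(),
            nn.Linear(HIDDEN, 1),
        )
    def forward(self, prompts, resp_onehot):
        x = torch.cat([prompts, resp_onehot.flatten(1)], dim=-1)
        return self.net(x).squeeze(-1)  # unnormalized logits

def sample(policy, prompts):
    """Sample a response and return it with its total log-probability."""
    dist = torch.distributions.Categorical(logits=policy(prompts))
    tokens = dist.sample()                       # [B, T]
    logp = dist.log_prob(tokens).sum(-1)         # [B]
    return F.one_hot(tokens, VOCAB).float(), logp

teacher = Policy()                               # proxy for black-box teacher outputs
for p in teacher.parameters():
    p.requires_grad_(False)

student, disc = Policy(), Discriminator()
opt_s = torch.optim.Adam(student.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)

for step in range(500):
    prompts = torch.randn(64, PROMPT_DIM)

    # 1) Discriminator update: teacher responses labeled real, student responses fake.
    with torch.no_grad():
        teacher_resp, _ = sample(teacher, prompts)
        student_resp, _ = sample(student, prompts)
    d_real = disc(prompts, teacher_resp)
    d_fake = disc(prompts, student_resp)
    loss_d = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # 2) Student update: sample on-policy and use the discriminator score as a reward
    #    (REINFORCE-style here, since a black-box teacher exposes no token probabilities).
    student_resp, logp = sample(student, prompts)
    with torch.no_grad():
        reward = torch.sigmoid(disc(prompts, student_resp))  # "teacher-likeness"
        reward = reward - reward.mean()                       # simple baseline
    loss_s = -(reward * logp).mean()
    opt_s.zero_grad(); loss_s.backward(); opt_s.step()
```

The key point the sketch illustrates is the one made in the summary: because the student is rewarded on its own samples, the discriminator acts as a continually evolving, on-policy reward model, replacing the likelihood-based objectives that black-box access rules out.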
