203 Episodes

  1. TextGrad: Backpropagating Language Model Feedback for Generative AI Optimization

    Published: 27.3.2025
  2. MemReasoner: Generalizing Language Models on Reasoning-in-a-Haystack Tasks

    Published: 27.3.2025
  3. RAFT: In-Domain Retrieval-Augmented Fine-Tuning for Language Models

    Published: 27.3.2025
  4. Inductive Biases for Exchangeable Sequence Modeling

    Published: 26.3.2025
  5. InverseRLignment: LLM Alignment via Inverse Reinforcement Learning

    Published: 26.3.2025
  6. Prompt-OIRL: Offline Inverse RL for Query-Dependent Prompting

    Published: 26.3.2025
  7. Alignment from Demonstrations for Large Language Models

    Published: 25.3.2025
  8. Q♯: Distributional RL for Optimal LLM Post-Training

    Published: 18.3.2025
  9. Scaling Test-Time Compute Without Verification or RL is Suboptimal

    Published: 14.3.2025
  10. Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning

    Published: 14.3.2025
  11. Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning

    Published: 14.3.2025
  12. Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

    Published: 14.3.2025
  13. Revisiting Superficial Alignment Hypothesis

    Published: 14.3.2025
  14. Diagnostic Uncertainty: Teaching Language Models to Describe Open-Ended Uncertainty

    Published: 14.3.2025
  15. Language Model Personalization via Reward Factorization

    Published: 14.3.2025
  16. Is a Good Foundation Necessary for Efficient Reinforcement Learning? The Computational Role of the Base Model in Exploration

    Published: 14.3.2025
  17. How Well do LLMs Compress Their Own Chain-of-Thought? A Token Complexity Approach

    Published: 14.3.2025
  18. Can Large Language Models Extract Customer Needs as well as Professional Analysts?

    Published: 13.3.2025
  19. SpurLens: Finding Spurious Correlations in Multimodal LLMs

    Published: 13.3.2025
  20. Improving Test-Time Search with Backtracking Against In-Context Value Verifiers

    Published: 13.3.2025

Men know other men best. Women know other women best. And yes, perhaps AIs know other AIs best. AI explains what you should know about this week's AI research progress.
