Best AI papers explained
A podcast by Enoch H. Kang - Fridays
203 Episodes
Is In-Context Learning in Large Language Models Bayesian? A Martingale Perspective
Published: 12.5.2025
Leaked Claude Sonnet 3.7 System Instruction tuning
Published: 12.5.2025
Converging Predictions with Shared Information
Published: 11.5.2025
Test-Time Alignment Via Hypothesis Reweighting
Published: 11.5.2025
Rethinking Diverse Human Preference Learning through Principal Component Analysis
Published: 11.5.2025
Active Statistical Inference
Published: 10.5.2025
Data Mixture Optimization: A Multi-fidelity Multi-scale Bayesian Framework
Published: 10.5.2025
AI-Powered Bayesian Inference
Published: 10.5.2025
Can Unconfident LLM Annotations Be Used for Confident Conclusions?
Published: 9.5.2025
Predictions as Surrogates: Revisiting Surrogate Outcomes in the Age of AI
Published: 9.5.2025
Learn then Test: Calibrating Predictive Algorithms to Achieve Risk Control
Published: 9.5.2025
How to Evaluate Reward Models for RLHF
Published: 9.5.2025
LLMs as Judges: Survey of Evaluation Methods
Published: 9.5.2025
The Alternative Annotator Test for LLM-as-a-Judge: How to Statistically Justify Replacing Human Annotators with LLMs
Published: 9.5.2025
Limits to scalable evaluation at the frontier: LLM as Judge won’t beat twice the data
Published: 9.5.2025
Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation
Published: 9.5.2025
Accelerating Unbiased LLM Evaluation via Synthetic Feedback
Published: 9.5.2025
Prediction-Powered Statistical Inference Framework
Published: 9.5.2025
Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL
Published: 9.5.2025
RM-R1: Reward Modeling as Reasoning
Published: 9.5.2025
Men know other men best. Women know other women best. And yes, perhaps AIs know other AIs best. AI explains what you should know about this week's progress in AI research.