OAPL: Efficient LLM Reasoning via Off-Policy RL

In this AI Research Roundup episode, Alex discusses the paper: 'LLMs Can Learn to Reason

AdaR1: Adaptive Reasoning for Efficient LLMs

In this AI Research Roundup episode, Alex discusses the paper: 'AdaR1: From Long-CoT to Hybrid-CoT

Stanford CME295 Transformers & LLMs | Autumn 2025 | Lecture 6 - LLM Reasoning

For more information about Stanford's graduate programs, visit: https://online.stanford.edu/graduate-education November 7, 2025 ...

[EMNLP 2025] Offloaded Reasoning: Efficient Inference for LLMs via Modular Reasoning and Refinement

EMNLP 2025 Findings Paper presentation video for our work: "Offloaded

OPSD: Faster LLM Reasoning via Self-Distillation

In this AI Research Roundup episode, Alex discusses the paper: 'Self-Distilled Reasoner: On-Policy Self-Distillation for Large ...

Fellowship: LlamaV-o1, Rethinking Step-by-step Visual Reasoning in LLMs

"Reliable LLM Reasoning: Agents, Evaluation, and Lean Inference" - Prof. Akhil Arora - EPFL AI Center

The talk was jointly organized by the EPFL AI Center and the EPFL LiGHT lab, as part of the AI Fundamentals series.

BuPO Bottom-up Policy Optimization: Enhancing LLM Reasoning via Internal Layer Policies

In this video, we dive deep into a novel reinforcement learning paradigm called BuPO (Bottom-up Policy Optimization) that ...

AutoThink: Efficient LLM Reasoning with Adaptive Budgeting

The article introduces AutoThink, an innovative approach designed to enhance the inference

Video LLM Reasoning - "Think With Video" INSANE New LLM Video Reasoning Training

arxiv - https://arxiv.org/pdf/2510.20579 Become AI Researcher & Train

RLCER: Better LLM CoT via Self-Evolving Rubrics

In this AI Research Roundup episode, Alex discusses the paper: 'Reinforcing Chain-of-Thought

LightThinker++: Adaptive Memory Management for Efficient LLM Reasoning

Introducing the LightThinker framework to solve the huge computational costs and memory overload problems that occur in the ...

RL for Reasoning in LLMs w/ One Training Example (Apr 2025)

Title: Reinforcement Learning for

Anti-Self-Distillation for LLM Reasoning

In this AI Research Roundup episode, Alex discusses the paper: 'Anti-Self-Distillation for

LLMs Can Learn to Reason Via Off-Policy RL (Feb 2026)

Title: LLMs Can Learn to Reason

SAGE: Efficient LLM Reasoning without Overthinking

In this AI Research Roundup episode, Alex discusses the paper: 'Does Your

AI Agents + LLM Reasoning: Transforming Autonomous Workflows

How to Train LLMs to "Think" (o1 & DeepSeek-R1)