6/17/2026

LLM, agentic AI, reinforcement learning, attention mechanisms, video embeddings, KV cache

Ling-2.6 and Ring-2.6 release public checkpoints, including a trillion-parameter model

One is tuned for instant replies, the other for deeper reasoning, with a hybrid linear attention design and a new reinforcement learning framework for agent training. Also in today’s digest: a 23-task video embedding benchmark, a context-aware RL method, and a fast key–value (KV) cache eraser.

One-Line Summary

Today’s work pushes agentic AI toward instant, grounded reasoning — from public trillion-scale checkpoints to a 23-task video benchmark, context-aware training, and efficient cache editing.

LLM & SOTA Models

Ling-2.6 and Ring-2.6 target instant, agentic AI at trillion-parameter scale

Two sister models — Ling-2.6 for instant responses and Ring-2.6 for deeper reasoning and agent workflows — are introduced with public checkpoints, aiming to pair low-latency answers with strong reasoning while staying practical to train, serve, and deploy. The technical report positions the 2.6 family as a path to efficient, scalable, and open agentic systems. ¹

Instead of training from scratch, the team migrates from the Ling-2.0 base through architectural migration pre-training and large-scale post-training, guided by a co-design of architecture, optimization objectives, serving systems, and agent training environments. A hybrid linear attention design that combines Lightning Attention with another linear-attention approach targets efficient long-context training and decoding. On the output side, “capability per token” is boosted with Evolutionary Chain-of-Thought, Linguistic Unit Policy Optimization, bidirectional preference alignment, and shortest-correct-response distillation. ¹
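As a rough illustration of why linear attention helps long-context decoding, the sketch below implements a generic causal linear attention in recurrent form: each step folds the new key and value into a fixed-size state rather than attending over a growing cache. It is not Lightning Attention or the report's hybrid design. The feature map `phi` (elu + 1) and all function names are assumptions chosen for illustration.

```python
import math

def phi(x):
    # Positive feature map elu(x) + 1 (an assumed choice, not taken from the report).
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]

def linear_attention_decode(queries, keys, values):
    """Causal linear attention, one token at a time, with a fixed-size state."""
    d_k, d_v = len(keys[0]), len(values[0])
    state = [[0.0] * d_v for _ in range(d_k)]  # running sum of phi(k) * v^T
    norm = [0.0] * d_k                         # running sum of phi(k)
    outputs = []
    for q, k, v in zip(queries, keys, values):
        fk = phi(k)
        for i in range(d_k):
            norm[i] += fk[i]
            for j in range(d_v):
                state[i][j] += fk[i] * v[j]
        fq = phi(q)
        denom = sum(fq[i] * norm[i] for i in range(d_k))
        outputs.append([sum(fq[i] * state[i][j] for i in range(d_k)) / denom
                        for j in range(d_v)])
    return outputs

def quadratic_reference(queries, keys, values):
    """The same result computed as explicit attention over all previous tokens."""
    outputs = []
    for t, q in enumerate(queries):
        fq = phi(q)
        weights = [sum(a * b for a, b in zip(fq, phi(keys[s]))) for s in range(t + 1)]
        total = sum(weights)
        outputs.append([sum(w * values[s][j] for s, w in enumerate(weights)) / total
                        for j in range(len(values[0]))])
    return outputs
```

The recurrent version keeps memory constant per step, while the reference version revisits every earlier token; both produce the same outputs on the same inputs.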

For agent capabilities, the report introduces KPop, a reinforcement learning framework that supports stable training of Ring-2.6-1T (one trillion parameters) on large-scale, environment-grounded data. KPop schedules coding, search, tool use, and workflow execution asynchronously, improving learning efficiency from complex agent–environment interactions. ¹
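To make the idea of asynchronous scheduling concrete, here is a minimal sketch that runs rollouts for several task types concurrently and collects them as they finish, so slow environments do not block fast ones. It is a generic `asyncio` pattern, not KPop's implementation. `run_episode`, `collect_rollouts`, the placeholder reward, and the simulated delays are all hypothetical.

```python
import asyncio

async def run_episode(task_type, episode_id, delay):
    # Stand-in for an agent-environment rollout; `delay` mimics uneven environment latency.
    await asyncio.sleep(delay)
    reward = 1.0 if episode_id % 2 == 0 else 0.0  # placeholder reward, not a real metric
    return {"task": task_type, "episode": episode_id, "reward": reward}

async def collect_rollouts(task_types, episodes_per_task, max_concurrency=4):
    semaphore = asyncio.Semaphore(max_concurrency)  # cap on simultaneous environments

    async def bounded(task_type, episode_id):
        async with semaphore:
            return await run_episode(task_type, episode_id, delay=0.01 * (episode_id + 1))

    jobs = [bounded(t, i) for t in task_types for i in range(episodes_per_task)]
    finished = []
    # Results arrive in completion order, not submission order.
    for job in asyncio.as_completed(jobs):
        finished.append(await job)
    return finished

rollouts = asyncio.run(
    collect_rollouts(["coding", "search", "tool_use", "workflow"], episodes_per_task=2)
)
```

A trainer could consume `rollouts` in batches as they arrive; how KPop actually batches and schedules updates is not described in this digest.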

Why it matters: the 2.6 family explicitly optimizes both speed and reasoning depth while coordinating training and serving. Watch for independent latency measurements, long-context behavior, and how these checkpoints translate into more reliable, tool-using agents in real tasks. ¹

Open Source & Repos

Rivet Actors provide stateful building blocks for agents

Rivet offers long-running, lightweight “actors” — processes that keep state in memory with automatic persistence — built for AI agents, collaborative apps, and durable execution. You can create one actor per agent, session, or user, and use built-in workflows and queues to coordinate work. ²
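The actor pattern described above can be sketched in plain Python: one long-lived object per key, state held in memory, and a mailbox that serializes access. This is not Rivet's API, and it omits the automatic persistence Rivet provides. `CounterActor`, `get_actor`, and the message names are hypothetical.

```python
import queue
import threading

class CounterActor:
    """One long-lived actor per key; only its own thread touches its state."""

    def __init__(self, key):
        self.key = key
        self.count = 0                      # in-memory state
        self.mailbox = queue.Queue()
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def _run(self):
        # Process messages one at a time so state updates never race.
        while True:
            message, reply = self.mailbox.get()
            if message == "stop":
                reply.put(self.count)
                return
            if message == "increment":
                self.count += 1
            reply.put(self.count)

    def send(self, message):
        reply = queue.Queue(maxsize=1)
        self.mailbox.put((message, reply))
        return reply.get(timeout=5)

actors = {}

def get_actor(key):
    # Create the actor on first use so each user or session gets its own state.
    if key not in actors:
        actors[key] = CounterActor(key)
    return actors[key]
```

Calling `get_actor("user-1").send("increment")` twice returns 1 and then 2, while a different key starts from its own fresh count.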

The project shows active development with Release v2.3.0 (2026-06-15), shipping bug fixes and frontend improvements alongside quickstart docs and community channels. For teams exploring agent backends, Rivet provides a pragmatic way to manage state without stitching together multiple services. ²

Research Papers

MVEB: a 23-task video embedding benchmark shows trade-offs

MVEB is a 23-task benchmark for video embeddings spanning classification, zero-shot classification, clustering, pair classification, retrieval, and video-centric question answering (QA). Evaluating 33 models, the authors report that no single approach dominates: embeddings from multimodal large language models (MLLMs) lead on classification, clustering, pair classification, and QA, while multimodal binding methods lead on retrieval and zero-shot classification; generative MLLMs without contrastive adaptation collapse on cross-modal tasks. ³
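For readers unfamiliar with how embedding retrieval is scored, the following sketch computes recall@k from cosine similarity between query and video embeddings. It is a generic metric illustration, not MVEB's evaluation code. The function names and the toy two-dimensional vectors are assumptions.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def recall_at_k(query_embeddings, video_embeddings, relevant_index, k=1):
    """Fraction of queries whose relevant video appears among the top-k most similar."""
    hits = 0
    for q, target in zip(query_embeddings, relevant_index):
        ranked = sorted(range(len(video_embeddings)),
                        key=lambda i: cosine(q, video_embeddings[i]),
                        reverse=True)
        hits += target in ranked[:k]
    return hits / len(query_embeddings)

# Toy data: three "videos" and three text queries in a shared embedding space.
videos = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
queries = [[0.9, 0.1], [0.1, 0.9], [0.2, 1.0]]
relevant = [0, 1, 2]  # index of the correct video for each query
```

On this toy data the third query ranks video 1 above its true match, so recall@1 is 2/3 and recall@2 is 1.0.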

Audio’s value depends on how datasets were labeled: when labels use both audio and visuals, audio helps; when labels come from visuals alone, adding audio hurts — a consistent six-point gap across model families. MVEB is distilled from a 184-task pool (MVEB+) and integrates into the Massive Text Embedding Benchmark (MTEB), with code and a leaderboard released alongside the tasks. ³

ContextRL improves long-horizon reasoning by teaching models to pick the right context

ContextRL is a context-aware reinforcement learning (RL) method for large language models (LLMs) that presents a query, an answer, and two highly similar contexts, rewarding the model for selecting the context that truly supports the answer — encouraging fine-grained grounding over long tool traces or subtle image details. ⁴

ContextRL builds on group relative policy optimization (GRPO) and uses contrastive context pairs curated for two domains: about 1,000 code-trajectory pairs and 7,000 image-based pairs. It reports average gains of +2.2% over standard GRPO on five long-horizon benchmarks and +1.8% across 12 visual question answering (VQA) benchmarks. Data-only augmentation baselines that use the same contrasts provide little to no improvement, which isolates the benefit to the new objective. ⁴
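A minimal sketch of the two ingredients named here: a reward for choosing the supporting context, and GRPO-style advantages that compare each sample with its own group. The binary reward shape, the use of population standard deviation, and the function names are assumptions, not details from the paper.

```python
import statistics

def context_choice_reward(chosen, supporting):
    # 1.0 when the policy picks the context that supports the answer, else 0.0
    # (an assumed reward shape for illustration).
    return 1.0 if chosen == supporting else 0.0

def group_relative_advantages(rewards, eps=1e-6):
    # GRPO-style normalization: center on the group mean and scale by the group spread.
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

# Four sampled responses to the same (query, answer, context A, context B) prompt.
choices = ["A", "B", "A", "A"]
rewards = [context_choice_reward(c, supporting="A") for c in choices]
advantages = group_relative_advantages(rewards)
```

Samples that picked the supporting context get a positive advantage and the one that did not gets a negative one; if every sample in a group earns the same reward, all advantages are zero and the group contributes no learning signal.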

KVEraser removes bad context from the KV cache without full recomputation

KVEraser is a learned method to edit a model’s key–value (KV) cache so you can erase a now-wrong span of context (like a harmful prompt injection or a stale retrieved fact) without reprocessing everything after it. It replaces only the KV states for the erased interval with learned “steering” states while reusing the rest of the cache, trained through a two-stage pipeline of generic span-neighbor pre-training and task-specific fine-tuning. ⁵
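The cache-editing step can be pictured with a toy sketch: swap the entries for the erased interval and reuse everything else. Real caches are per-layer, per-head tensors, and KVEraser's steering states come from a trained model; here the cache is a flat list and `steering_states` is simply passed in. `erase_span` and the placeholder values are hypothetical.

```python
def erase_span(kv_cache, start, end, steering_states):
    """Return a new cache with positions [start, end) swapped for steering states."""
    if not 0 <= start <= end <= len(kv_cache):
        raise ValueError("span is outside the cache")
    if len(steering_states) != end - start:
        raise ValueError("need one steering state per erased position")
    # Prefix and suffix entries are reused as-is; only the erased interval changes.
    return kv_cache[:start] + list(steering_states) + kv_cache[end:]

# Toy cache of (key, value) pairs for six token positions.
cache = [(f"k{i}", f"v{i}") for i in range(6)]
steering = [("s0", "s0"), ("s1", "s1")]  # placeholders for learned states
edited = erase_span(cache, 2, 4, steering)
```

The point of the sketch is the cost profile: positions after the erased span are not recomputed, which is where the savings over full recomputation come from.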

Experiments show post-erasure performance nearly matching full recomputation across 1K–32K context lengths while adding only 24% latency, compared with a 17.6× increase for exact recomputation. On unseen long-document question answering (QA) with harmful distractors, KVEraser outperforms approximate baselines while delivering a 3–4× speedup over full recomputation. ⁵

Why It Matters

Models, training recipes, and serving systems are converging on the same goal: agents that respond instantly yet reason reliably. The Ling/Ring 2.6 report co-designs architecture, objectives, serving, and environments, and releases public checkpoints that teams can try to test latency and tool use end-to-end. ¹

At the same time, evaluation and control are maturing: MVEB clarifies where video embeddings excel or fail, and KVEraser offers a practical way to remove bad context from a model's KV cache without paying a 17.6× recompute penalty, a useful capability for long-context retrieval-augmented generation (RAG) and agent pipelines. ³ ⁵

This Week, Try

Rivet Actors quickstart: open the GitHub repo and follow the Quickstart to spin up a simple stateful actor. https://github.com/rivet-dev/rivet
Skim the MVEB paper: browse the arXiv abstract and figures to see which embedding types fit your task. https://arxiv.org/abs/2606.14958

Sources

[1] arXiv: Ling and Ring 2.6 Technical Report: Efficient and Instant Agentic Intelligence at Trillion-Parameter Scale
[2] GitHub: rivet-dev/rivet: Rivet Actors are the primitive for stateful workloads
[3] arXiv: MVEB: Massive Video Embedding Benchmark
[4] arXiv: Context-Aware RL for Agentic and Multimodal LLMs
[5] arXiv: KVEraser: Learning to Steer KV Cache for Efficient Localized Context Erasing

0to1log Weekly