5 min read 6/2/2026

3D reconstruction, video generation, LLM agents, agentic systems, evaluation metrics, open source tools

New 3D method sharpens local geometry in point maps

SurGe introduces a local-surface metric and two training components to reduce visible micro-geometry errors in point maps while maintaining top global accuracy. Companion papers spotlight agent ‘harness’ design and a one-step video generator, pointing to gains in precision and latency rather than just bigger models.

One-Line Summary

Today's papers emphasize practical gains: cleaner local 3D geometry, stable one-step video generation for lower latency, and agent research that elevates the surrounding system or “harness” alongside the model.

Research Papers

SurGe improves local surface geometry in point maps

Point-map 3D reconstruction methods often get global shape right but local surface orientation wrong; SurGe makes those local errors measurable with a new “point map normal” metric and reduces them while preserving global accuracy. ¹
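To make the idea of a normal-based local metric concrete, here is a minimal sketch of how surface normals could be derived from a point map and compared. It is an illustration only: the summary does not give SurGe's exact definition, so the forward-difference stencil, the mean angular error, and the function names `point_map_normals` and `mean_normal_angle_deg` are all assumptions.

```python
import numpy as np

def point_map_normals(points):
    """Estimate unit normals from an (H, W, 3) point map.

    Assumption: normals come from the cross product of forward differences
    along image columns and rows. This is not confirmed as SurGe's definition.
    """
    dx = points[:-1, 1:] - points[:-1, :-1]   # 3D step to the right neighbor
    dy = points[1:, :-1] - points[:-1, :-1]   # 3D step to the neighbor below
    n = np.cross(dx, dy)
    length = np.linalg.norm(n, axis=-1, keepdims=True)
    return n / np.clip(length, 1e-12, None)   # avoid division by zero

def mean_normal_angle_deg(pred_points, gt_points):
    """Mean angle in degrees between predicted and reference normals."""
    n_pred = point_map_normals(pred_points)
    n_gt = point_map_normals(gt_points)
    cos = np.clip(np.sum(n_pred * n_gt, axis=-1), -1.0, 1.0)
    return float(np.degrees(np.arccos(cos)).mean())

# Toy data (hypothetical): a fronto-parallel plane versus a plane with z = 1 + x.
rows, cols = np.meshgrid(np.arange(4.0), np.arange(5.0), indexing="ij")
flat = np.stack([cols, rows, np.ones_like(rows)], axis=-1)
tilted = np.stack([cols, rows, 1.0 + cols], axis=-1)
print(mean_normal_angle_deg(flat, tilted))  # approximately 45 degrees
```

The toy planes keep the same pixel grid but differ in orientation, which is the kind of local error such a normal-based measure is meant to expose.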

To do this, it adds two parts: a point gradient matching loss that supervises depth-normalized 3D finite differences, and a Neighborhood Attention Decoder (NAD) that progressively upsamples features and mixes local context using Neighborhood Attention. ¹
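As a rough illustration of what supervising depth-normalized 3D finite differences might look like, the sketch below compares forward differences of a predicted and a reference point map after dividing each by its own local depth. The L1 penalty, the choice to normalize each map by its own z value, and the names `depth_normalized_gradients` and `point_gradient_matching_loss` are assumptions; the paper's actual loss may differ.

```python
import numpy as np

def depth_normalized_gradients(points, eps=1e-6):
    """Forward differences of an (H, W, 3) point map divided by local depth.

    Assumption: "depth" is the z coordinate of the pixel the difference starts from.
    """
    depth = np.clip(points[:-1, :-1, 2:3], eps, None)
    dx = (points[:-1, 1:] - points[:-1, :-1]) / depth   # horizontal neighbor step
    dy = (points[1:, :-1] - points[:-1, :-1]) / depth   # vertical neighbor step
    return dx, dy

def point_gradient_matching_loss(pred_points, gt_points):
    """Mean absolute difference between depth-normalized gradients (sketch only)."""
    pdx, pdy = depth_normalized_gradients(pred_points)
    gdx, gdy = depth_normalized_gradients(gt_points)
    return float(np.abs(pdx - gdx).mean() + np.abs(pdy - gdy).mean())
```

Under this particular normalization, uniformly rescaling a point map leaves its normalized gradients unchanged, so the sketch penalizes differences in local shape rather than in overall scale.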

Across eight zero-shot monocular geometry benchmarks, SurGe achieves the best average rank for global point map Absolute Relative Error (AbsRel) and improves consistently on the local point map and point map normal evaluations. ¹
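For readers unfamiliar with the two quantities named here, the following sketch shows one common way to compute a relative point error and an average rank across benchmarks. The per-point formulation of AbsRel, the tie handling, and the function names are assumptions, and the numbers in the example are made up for illustration, not results from the paper.

```python
import numpy as np

def point_abs_rel(pred_points, gt_points, eps=1e-12):
    """Mean of ||pred - gt|| / ||gt|| over all points (assumed formulation)."""
    err = np.linalg.norm(pred_points - gt_points, axis=-1)
    ref = np.clip(np.linalg.norm(gt_points, axis=-1), eps, None)
    return float((err / ref).mean())

def average_rank(errors):
    """Average rank per method; errors has shape (methods, benchmarks).

    Lower error is better and rank 1 is best. Ties are broken by row order.
    """
    order = np.argsort(errors, axis=0)
    ranks = np.argsort(order, axis=0) + 1   # double argsort yields ranks
    return ranks.mean(axis=1)

# Hypothetical errors for three methods on two benchmarks (not from the paper).
toy_errors = np.array([[0.1, 0.2],
                       [0.2, 0.1],
                       [0.3, 0.3]])
print(average_rank(toy_errors))  # [1.5 1.5 3. ]
```

Average rank rewards consistency: a method that is never first on a single benchmark can still have the best average if it is near the top everywhere.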

Agent self-evolution separates 'update skill' from 'benefit'

This study examines self-evolving Large Language Model (LLM) agents that update the “harness” around a base model — prompts, skills, memories, and tools — from execution evidence, asking which models write useful updates and which actually benefit from them at task time. ²

It disentangles two capabilities: harness-updating (producing persistent, useful updates) and harness-benefit (gaining from those updates during execution). Harness-updating is largely flat across base capability: even updates written by Qwen3.5-9B yield gains comparable to those written by Claude Opus 4.6. Harness-benefit, by contrast, is non-monotonic: mid-tier models gain the most, while weak-tier and strong-tier models gain less. Weak-tier gains are limited by failures to invoke harness artifacts or to follow them faithfully. ²
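One way to picture the separation of the two capabilities is a cross table in which every model's updates are tried with every executing model. The sketch below assumes such a table of gains and averages it along each axis; the table layout, the function name `marginal_scores`, and the toy values are hypothetical and do not reproduce the paper's protocol or results.

```python
def marginal_scores(gain):
    """Split a gain table into per-updater and per-executor averages.

    gain[updater][executor] is assumed to hold the executor's success rate
    with that updater's harness updates minus its success rate without them.
    """
    updaters = list(gain)
    executors = list(next(iter(gain.values())))
    updating = {u: sum(gain[u][e] for e in executors) / len(executors)
                for u in updaters}
    benefit = {e: sum(gain[u][e] for u in updaters) / len(updaters)
               for e in executors}
    return updating, benefit

# Made-up values for illustration only; model names are placeholders.
toy_gain = {
    "updater_a": {"executor_x": 0.0, "executor_y": 0.5},
    "updater_b": {"executor_x": 0.0, "executor_y": 0.5},
}
updating, benefit = marginal_scores(toy_gain)
print(updating)  # {'updater_a': 0.25, 'updater_b': 0.25}
print(benefit)   # {'executor_x': 0.0, 'executor_y': 0.5}
```

In the toy table both updaters look identical while the executors differ sharply, which is the pattern the row and column averages are meant to tell apart.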

One-Forcing stabilizes one-step autoregressive video generation

One-Forcing is a training recipe for high-quality “one-step” autoregressive video generation that augments the Self-Forcing distillation objective with an auxiliary Generative Adversarial Network (GAN) loss, aiming to cut latency without sacrificing sharpness or motion. ³
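The general shape of a distillation objective augmented with an auxiliary adversarial term can be sketched as a weighted sum. This is a generic illustration, not One-Forcing's recipe: the non-saturating GAN formulation, the `gan_weight` value, and all function names are assumptions, and `distill_loss` stands in for the Self-Forcing objective, which is not reproduced here.

```python
import math

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def generator_gan_loss(fake_logits):
    """Non-saturating generator loss: push discriminator logits on fakes up."""
    return sum(softplus(-z) for z in fake_logits) / len(fake_logits)

def discriminator_gan_loss(real_logits, fake_logits):
    """Standard logistic discriminator loss on real and generated samples."""
    real = sum(softplus(-z) for z in real_logits) / len(real_logits)
    fake = sum(softplus(z) for z in fake_logits) / len(fake_logits)
    return real + fake

def total_generator_loss(distill_loss, fake_logits, gan_weight=0.1):
    """Distillation term plus a weighted adversarial term (weight is a guess)."""
    return distill_loss + gan_weight * generator_gan_loss(fake_logits)

# Placeholder scalars standing in for real training quantities.
print(total_generator_loss(distill_loss=1.0, fake_logits=[0.0], gan_weight=0.5))
```

Setting `gan_weight` to zero recovers the distillation term alone, which makes the adversarial part easy to ablate in a sketch like this.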

On VBench, One-Forcing scores 83.76, the state of the art among one-step causal video methods, and remains competitive with strong many-step approaches. It also achieves stable one-step framewise autoregressive generation at about one-third of the training cost of the chunkwise model. ³

From model scaling to system scaling in agentic AI

This position paper argues that the next bottleneck in agentic AI is the auditable, modular system around a foundation model: the “agent harness” spanning memory, retrieval, tool use, orchestration, verification, and governance. It contends that model-centric evaluation does not adequately capture how the model and this harness interact. ⁴

It outlines three bottlenecks (context governance, trustworthy memory, dynamic skill routing) and a research agenda for harness-level benchmarks that measure trajectory quality, memory hygiene, context efficiency, communication fidelity, verification cost, and safe evolution. The authors also introduce CheetahClaws, a Python-native reference harness, and compare it with Claude Code and OpenClaw. ⁴

Open Source & Repos

InsForge: all-in-one backend for agentic coding

InsForge is an open-source backend platform that bundles database, authentication, storage, compute, hosting, and an AI gateway so a coding agent can ship end-to-end full-stack apps; it also provides a software development kit (SDK) on npm. ⁵

The repository highlights Apache-2.0 licensing and activity badges, and shows continued updates, including a v2.1.10 release dated 2026-05-29. ⁵

Why It Matters

Precision and latency are center stage: SurGe brings evaluation and training closer to the local surface details users notice, while One-Forcing collapses video sampling to a single step, aiming for lower latency without sacrificing sharpness or motion. ¹ ³

Equally, the agent studies argue that progress depends on system design (the harness for memory, routing, and governance) as much as on raw model power, a cue for teams to invest in harness-level evaluation and reusable infrastructure alongside models. ² ⁴

Sources

[1] arXiv: SurGe: Improved Surface Geometry in Point Maps
[2] arXiv: Harness Updating Is Not Harness Benefit: Disentangling Evolution Capabilities in Self-Evolving LLM Agents
[3] arXiv: One-Forcing: Towards Stable One-Step Autoregressive Video Generation
[4] arXiv: From Model Scaling to System Scaling: Scaling the Harness in Agentic AI
[5] GitHub: InsForge/InsForge: The all-in-one, open-source backend platform for agentic coding
