Vol.01 · No.10 Log Daily May 30, 2026

Latest AI News

AI · Papers · Daily Curation · Open Access
◆ HEADLINE · AI News · Business

Microsoft to unveil its own coding AI to boost Copilot

Reuters says Microsoft is preparing a homegrown coding model and other specialized AI for Build, as Asana buys StackAI and Groq lines up $650M for inference.

Microsoft · GitHub Copilot · OpenAI · Groq · Asana · 4 min read
AI News · Business

The funding pushes Anthropic past OpenAI in value, while the new model adds “effort control” and faster, cheaper responses; OpenAI, meanwhile, outlines election safeguards.

Anthropic · OpenAI · Claude Opus 4.8 · AI valuation · 5 min read
AI News · Business

The Devin maker cites fast enterprise uptake and a $492M run-rate as investors pile in, while YouTube begins auto-labeling photorealistic AI videos to boost transparency.

Cognition · Devin · YouTube · AI labeling · 4 min read
AI News · Research

A single “native multimodal” embedding reports strong retrieval scores across major image, video, and text benchmarks, pointing to simpler pipelines for search, recommendations, and retrieval-augmented generation.

multimodal embeddings · retrieval · GPU kernels · RLHF · 5 min read
AI News · Research

MotiMotion introduces a “reason-then-generate” approach to motion control and a new benchmark. Three agent-training papers target reliability from rewards to terminal feedback, and LocalAI ships a no‑GPU engine under the MIT License.

video generation · reinforcement learning · agents · reward hacking · 6 min read
AI News · Business

Three papers propose attractor-based reasoning, a Shannon scaling law, and staged vision training—pointing to better accuracy by tuning compute and reducing noise. Here’s what it means for budgets, prompts, and vendor evaluations.

Scaling laws · Test-time compute · Vision-language models · Inference efficiency · 5 min read
AI News · Research

New research frames inference as converging to learned ‘attractors,’ treats model training as a noisy channel with capacity limits, shows vision-language models learn more by separating seeing from thinking, and turns language-driven virtual photography into an executable 3D agent task.

scaling laws · reasoning · vision-language models · 3D agents · 5 min read
AI News · Business

Google is remaking Search around conversational agents and says Gemini 3.5 Flash powers AI Mode, while 10‑second AI video creation appears in its apps — as pricing and security pressures reshape how teams adopt AI.

Google · Gemini · AI agents · Search · 5 min read
AI News · Research

A convex-optimization tokenizer replaces greedy rules with a global objective, improving bits-per-byte for language models and certifying how close the vocabulary is to optimal. Plus: live music diffusion on consumer laptops, AI’s forecasting limits, promptable 3D animals, and an incremental engine for always‑fresh agent context.

Tokenization · Convex optimization · Diffusion models · Music generation · 5 min read
AI News · Business

The Financial Times frames planned listings by SpaceX, OpenAI, and Anthropic as a reality check on AI demand, while Corsair introduces Grace Blackwell–based workstations and servers for private AI deployments.

IPOs · SpaceX · OpenAI · Anthropic · 4 min read
AI News · Research

LoREnc suppresses recoverable low‑rank signals so stolen weights or unauthorized adapters fail, while authorized adapters restore full quality with under 1% overhead. Also in focus: self‑regulated planning that saves tokens, safer shared caches for multi‑agent systems, and a study showing chatbots’ reliance on retrieval.

model security · LoRA · multi-agent LLMs · KV cache · 6 min read
AI News · Research

AutoRubric-T2I teaches a vision‑language judge to grade images with learned checklists, outperforming prior reward models while using under 0.01% of human preference data. New papers also push execution‑grounded coding agents and steadier long‑context attention.

text-to-image · reward modeling · vision-language models · linear attention · 5 min read
AI News · Research

The new diffusion-based system adds variable-length generation and targeted inpainting, trained on licensed and Creative Commons data, and runs on consumer hardware. The team reports under-2-second outputs on an H200 and a few seconds on a MacBook Pro M4.

audio generation · diffusion LLM · Mixture-of-Experts · time-series forecasting · 5 min read
AI News · Research

A new paper argues the popular Rotary Positional Embedding loses its locality and token-order cues as context grows, while three studies push practical gains in efficient diffusion-MoE inference, VLM training, and clinical agents.

RoPE · long-context LLMs · MoE inference · Vision-language models · 6 min read
AI News · Research

Google’s new model emphasizes doing multi‑step work, not just chatting — it becomes the default in the Gemini app and powers a 24/7 agent, while open‑source tools focus on cleaner inputs and deployment patterns.

Google · Gemini 3.5 Flash · Agentic AI · Benchmarks · 6 min read
AI News · Business

The first Vera systems leave Nvidia’s labs with 88 custom cores to handle agent workloads, while Apple lines up AI writing tools for iOS 27 and marketers chase data collaboration with Publicis–LiveRamp.

Nvidia Vera · Agentic AI · Apple iOS 27 · Publicis LiveRamp · 6 min read
AI News · Research

A few-shot reinforcement approach matches full-data baselines with just 128 examples, while a cleaned omni-modal benchmark clarifies real gains — and a macOS app packages local AI agents for everyday use.

RLVR · Reinforcement Learning · Omni-modal · Visual reasoning · 6 min read
AI News · Research

A new image-generation paper reports consistent ImageNet-256 gains by keeping training steps on spherical paths — no architecture changes. Two more studies push single-image 3D from satellites, stress-test long video consistency, and lift Gemini 3.1 Pro’s coding Elo by 405 with a pairwise “tournament.”

diffusion models · flow matching · VAE latents · 3D scene generation · 5 min read
AI News · Business

The preview lets developers review outputs and approve changes from iOS and Android after linking to a Mac running Codex. OpenAI also spotlights enterprise uptake, with Sea reporting 87% weekly active usage among Codex users.

OpenAI · Codex · ChatGPT · Anthropic · 6 min read
AI News · Research

Lighthouse Attention compresses sequences around standard attention during pretraining, then removes itself after a short recovery phase. New papers also stress-test table understanding, speed up Mixture-of-Experts routing, and replay real news to grade adaptive agents, while a Kubernetes inference stack ships a breaking upgrade.

Transformers · Long-context · Mixture-of-Experts · Benchmarking · 5 min read
AI News · Research

FlowCompile builds a reusable set of accuracy–latency plans for structured agent pipelines before they run, reporting up to 6.4x speedups. Companion papers focus on shorter reasoning, communication-light MoE inference, and a full-stack voice agent benchmark.

FlowCompile · Mixture-of-Experts · LLM workflows · voice agents · 5 min read
AI News · Business

New “instant AI workforce” Hirebase arrives in closed beta to run agents across Google Docs, Slack, and Notion — alongside Meta’s encrypted AI chat, Microsoft’s agentic security system, and Alphabet’s $2.1B bet on AI drug design.

AI agents · BasedAI · Meta AI · Microsoft Security · 5 min read
AI News · Research

DeepMind demos a pointer that understands what you select and why, turning pixels into actions like “compare these” or “get directions.” Microsoft details an agent system that uncovered 16 Windows vulnerabilities, while new repos sharpen agent workflows for everyday builders.

DeepMind · Gemini · Agentic Security · Microsoft · 6 min read
AI News · Business

Daybreak combines GPT-5.5-Cyber and Codex Security to help organizations find and validate software flaws before attackers do. Google also outlines Gemini-powered automations coming to Android.

OpenAI · Daybreak · Cybersecurity · Android · 3 min read
AI News · Business

The new, majority-controlled unit starts with Tomoro’s 150 engineers and 19 investment partners—arriving as Google flags AI-driven hacking and a U.S. group pushes safety screens for federal AI deals.

OpenAI · Enterprise AI · Consulting · Anthropic · 5 min read
AI News · Research

ROPD turns teacher responses into prompt-specific checklists to score student rollouts, beating logit-based on-policy distillation in most tests. New work on model selection, agent skills, and test-time scaling also targets lower-cost, safer AI deployment.

LLM alignment · On-policy distillation · Model recommendation · Agent frameworks · 6 min read
AI News · Research

ActCam lets creators steer both motion and camera in generated footage. Meanwhile, a shared expert pool makes mixture-of-experts (MoE) models more efficient, and Hermes Agent climbs to #1 by usage.

video generation · Mixture-of-Experts · reasoning · agent systems · 6 min read
AI News · Business

A fast-rising Chinese model maker is courting fresh capital while Brussels gives companies more time to implement high‑risk AI safeguards. Under courtroom scrutiny, OpenAI is emphasizing ‘trusted’ cyber access and controlled coding agents.

Kimi · EU AI Act · OpenAI · Governance · 4 min read
AI News · Research

One‑line installers and a Docker image streamline local runs for Kimi‑K2.5, GLM‑5, MiniMax, DeepSeek, Qwen, and Gemma. New papers chart where AI‑written GPU kernels fail, organize audio‑plus‑vision learning, introduce a biomedical tool‑calling dataset, and prescribe training when good data is scarce.

Ollama · Local LLM · GPU kernels · Audio-Visual · 6 min read
AI News · Business

The open‑weight model claims up to 5x higher throughput and a 1M‑token context window to keep multi‑agent workflows on track. Nvidia also adds a unified multimodal model as investors pour $2B into China’s Moonshot AI.

NVIDIA · Agentic AI · Nemotron 3 · Open models · 5 min read
AI News · Research

KinDER bundles 25 physics-grounded robot environments and a Gymnasium library to stress-test planning, while new benchmarks flag creativity and app-builder weaknesses — and a one-token confidence trick offers a cheaper hallucination filter.

robotics · benchmarks · LLM judges · hallucination detection · 6 min read
AI News · Business

OpenAI makes a faster, less error-prone model the ChatGPT default and adds visible memory sources. Meanwhile Apple tests multi-model choices for iOS 27, Anthropic secures SpaceX compute, and SAP buys a tabular-AI lab—signals that AI is moving from demos to deployment.

OpenAI · Anthropic · Apple · SAP · 6 min read
AI News · Research

Two surveys codify how to design and govern the data flows behind RL-tuned reasoning models and evolving agent skills, while Google ships multi‑token prediction to speed Gemma 4 and developer webhooks for long jobs.

LLM reinforcement learning · Speculative decoding · Retrieval · Red teaming · 7 min read
AI News · Business

Google’s File Search now works across images and text and can cite the exact page it pulled from—pushing RAG toward audit-ready answers—while Microsoft ships an open-source toolkit to govern what agents can do.

Google · Gemini · RAG · Embeddings · 3 min read
AI News · Research

HiL-Bench plants hidden blockers in coding and SQL tasks to test whether agents ask clarifying questions instead of guessing. Its Ask-F1 metric focuses on judgment, and early reinforcement learning results show this skill is trainable.

agent benchmarks · speculative decoding · KV cache compression · agent governance · 6 min read
AI News · Business

Appfigures says image model releases generate 6.5× more downloads than standard model updates — but only ChatGPT turned one surge into $70M in 28 days. At the same time, Anthropic and OpenAI are forming private‑equity joint ventures to push AI into mid‑market firms, while Morgan Stanley flags AI-driven flows into Hong Kong tech.

OpenAI · Anthropic · AI image generation · Google Gemini · 5 min read
AI News · Research

Odysseus trains a multimodal agent to make 100+ decisions in Super Mario Land and goes at least 3× farther than prior agents. Meanwhile, open models scale on tough exams and fresh benchmarks stress-test video lectures and visual honesty.

Reinforcement Learning · Vision-Language Models · MoE routing · Multimodal benchmarks · 7 min read
AI News · Business

A massive raise gives OpenAI long-term compute across multiple clouds and chips. Paired with a new AWS tie-up and Nvidia’s agent-focused CPU, it signals how workplace AI will actually run.

OpenAI · AWS · Nvidia · Google Cloud · 6 min read
AI News · Research

A new paper proposes an event-driven cascade for computer-use agents: run a small policy by default and call a stronger model only when monitors flag stalls or semantic drift. Live workflow benchmarks and fresh visual datasets show why targeted compute and better evaluation matter.

computer-use agents · cascaded inference · preference optimization · workflow benchmarks · 6 min read
AI News · Business

The Defense Department inks agreements with Google, Nvidia, OpenAI and others to run AI tools on classified systems. Also today: Nvidia ships a multimodal model for faster agents, Meta buys a humanoid AI startup, and IBM rolls out an enterprise SDLC assistant.

Pentagon · OpenAI · Nvidia Nemotron · Google · 5 min read
AI News · Research

Nvidia’s Nemotron 3 Nano Omni folds audio, vision, and text into one lightweight system. Also in focus: faster red‑teaming for long‑context attacks, evidence that fine‑tuning can shift safety, a consumer‑GPU training boost, and a self‑hosted personal agent.

NVIDIA · multimodal LLM · red teaming · fine-tuning safety · 6 min read
AI News · Business

Cohere moves to buy Germany’s Aleph Alpha at a $20B valuation as Cisco ships an open-source provenance tool and investors pour fresh capital into legal and clinical AI. For teams, this means more vendor choice, tighter compliance, and clearer paths to production.

Cohere · Aleph Alpha · Sovereign AI · Cisco Model Provenance Kit · 5 min read
AI News · Research

Nemotron 3 Nano Omni unifies audio, vision, and language in a 30B‑A3B system with open weights. New papers highlight safety drift after fine‑tuning and cheaper, faster red‑teaming and training on consumer GPUs.

NVIDIA · Nemotron 3 Nano Omni · multimodal · Mixture-of-Experts · 5 min read
AI News · Business

Microsoft, Alphabet, Amazon and Meta report as investors look for proof that massive AI capex is translating into cloud growth and profits—while ad platforms and customer support get fresh AI upgrades.

Alphabet · Microsoft · Meta · AWS · 6 min read
AI News · Business

GPT-5.5 matches GPT-5.4’s latency while posting higher scores on coding and computer-use benchmarks, and it’s rolling out to Plus, Pro, Business, and Enterprise in ChatGPT and Codex. API access is delayed pending additional safety work, as Nvidia touts a CPU built for agentic AI and investors back agent infrastructure and clinical AI.

OpenAI · GPT-5.5 · agentic AI · NVIDIA Vera · 7 min read
AI News · Research

Researchers tie fine-tuning–induced hallucinations to interference in a model’s existing knowledge and propose a self‑distillation recipe to steady outputs. Meanwhile, HyLo extends context up to 32× and Nvidia’s Nemotron 3 Nano Omni claims 9× higher multimodal throughput.

fine-tuning · hallucinations · long-context · Mixture-of-Experts · 9 min read
AI News · Business

The reworked pact keeps Microsoft’s license through 2032 and removes AGI triggers, while OpenAI can take its models to AWS and Google Cloud. Enterprises get real choice as AI platform competition shifts to cloud marketplaces.

Microsoft · OpenAI · Google Cloud · AWS · 6 min read
AI News · Research

A process-aware reward model lifts data-analysis agents by 7.21% and 11.28% and delivers 78.73%/64.84% with reinforcement learning, while SketchVLM makes reasoning visible and promptfoo packages evals for teams.

Process reward models · Agentic AI · Vision-language · Prompt evaluation · 5 min read
AI News · Research

Researchers introduce 3D-VCD, an inference-time check that contrasts a scene with a deliberately distorted version to suppress ungrounded tokens. Alongside, papers push adaptive diffusion training, RL that builds full websites, and a terminal-native coding agent you can run locally.

Embodied AI · Contrastive decoding · Reinforcement learning · Diffusion models · 6 min read
AI News · Business

Cash meets compute: Google puts $10B in now with up to $30B more for Anthropic, as Amazon and OpenAI intensify the race. Plus, Cohere–Aleph Alpha and ComfyUI’s $500M valuation show where enterprises and creators are placing their bets.

Google · Anthropic · Cohere · Aleph Alpha · 5 min read
AI News · Research

GPT-5.5 hits ChatGPT with GPT‑5.4‑level latency and stronger coding, browsing, and analysis skills. At the same time, Google’s Gemma 4 and Alibaba’s Qwen3.6‑27B push efficient open models, while new MoE research trims training compute.

OpenAI · GPT-5.5 · Gemma 4 · Qwen3.6-27B · 7 min read
AI News · Research

River-LLM uses a KV-sharing trick so decoder-only models can skip layers mid-generation without losing context, claiming real wall‑clock gains. Also in focus: a dataset cataloging 3,632 reward hacks in terminal agents and a healthcare model trained on 25B records across 7.2M patients.

early exit · KV cache · inference-time control · agent security · 7 min read
AI News · Business

Bloomberg reports Google is preparing new chips for inference after striking deals with Meta and Anthropic. At the same time, Adobe and Siemens push agentic AI into enterprise workflows, hinting at faster, cheaper automation ahead.

Google · Nvidia · Inference chips · Adobe CX Enterprise · 4 min read
AI News · Research

Analyzing 935 ablation experiments, researchers report a heavy‑tailed distribution of fitness effects in AI architecture tweaks—68% harmful, 19% neutral, 13% helpful—and logistic bursts of new ideas. The same issue also brings a new robotics benchmark and a practical fix for diffusion models’ sampling bias.

evolutionary dynamics · Neural Architecture Search · RoboLab · diffusion models · 7 min read
AI News · Research

The new flagship boosts spreadsheet/presentation work, ships native computer-use for agents, and posts big gains on OSWorld and BrowseComp. Google counters with Gemma 4 under Apache 2.0 and a robotics model that reads analog gauges.

OpenAI · GPT-5.4 · Agentic workflows · Google DeepMind · 6 min read
AI News · Research

Researchers propose a three-phase residual stream that cuts perplexity by 7.2% at 123M parameters with just 1,536 extra weights and nearly 2x faster convergence. Alongside, new papers push RL fine-tuning and visual reasoning, while system optimizers squeeze 2–5x speed from kernels and compilers.

Transformers · Reinforcement Learning · Compiler Optimization · Kernel Tuning · 8 min read
AI News · Business

Factory, which builds autonomous coding agents, is in talks to raise a new round led by Khosla Ventures, with Keith Rabois set to join the board — squaring up against Anthropic, OpenAI, and Cursor.

Factory · agentic AI · OpenAI · Google Chrome AI Mode · 7 min read
AI News · Research

A new study shows desktop/web agents can cause serious harm even when users give innocuous instructions, while fresh training and architecture work races to make models faster and safer. We also track a major open-source agent release that brings enterprise-grade features to mobile and the browser.

Agent safety · Benchmarks · Distillation · Looped models · 7 min read
AI News · Business

OpenAI plugs a pricing gap with a $100 ChatGPT Pro plan while Microsoft debuts in-house speech, voice, and image models. Japan’s SoftBank rallies industry for homegrown ‘physical AI,’ and U.S. regulators push ad giants on boycott conduct.

OpenAI · Microsoft · Claude Max · Codex · 7 min read
AI News · Research

LG AI Research releases EXAONE 4.5 with native vision-language training and a 256K context window tuned for document-heavy use, while NVIDIA's Nemotron 3 Super targets agent workloads with a hybrid Mamba-Transformer MoE. Two vision papers push open-world 3D detection and parameter-efficient generation.

Vision-Language Models · Long-context · 3D Detection · Generative Models · 7 min read
AI News · Research

A new theory paper sets a floor on how few steps diffusion samplers can take, while fresh research tackles the open-loop vs. closed-loop gap in autonomous driving and makes coding agents harder to break. If you care about speed, today is about knowing the limits—and building around them.

diffusion models · sampling theory · speculative decoding · autonomous driving · 7 min read
AI News · Research

A once-anonymous video generator, HappyHorse-1.0, is confirmed as Alibaba’s work after it raced to the top of global leaderboards. At the same time, a new paper and tools rethink how AI agents remember and manage state.

Alibaba · video generation · Artificial Analysis · LLM agents · 7 min read