Global AI Weekly

Issue number: 145 | Tuesday, April 14, 2026

Highlights

Mythos could fix AI hallucinations

Anthropic’s Mythos preview goes after the real problem behind hallucinations: AI does not just get things wrong; it builds convincing stories. This work explores how models handle truth, uncertainty, and explanation, and why accuracy alone is not enough. The idea is simple but powerful: make AI show its doubt instead of hiding it. If this direction sticks, it could reshape how much we trust what AI says.

red.anthropic.com

The Role of Tech and AI in the Artemis II Moon Mission

NASA’s Artemis II mission is gearing up to launch the Space Launch System (SLS) rocket alongside the Orion spacecraft, setting the stage for groundbreaking advancements in lunar exploration. Equipped with cutting-edge technology and artificial intelligence, the mission aims to enhance crew safety, improve navigation, and pave the way for future deep space travel. This effort represents a pivotal step in returning humans to the Moon and expanding our reach into the cosmos.

aimagazine.com

Here's what that Claude Code source leak reveals about Anthropic's plans

Anthropic's Claude Code source leak sheds light on intriguing features and plans, including a persistent AI agent, a stealthy "Undercover" mode, and a virtual assistant named Buddy. These developments hint at a focus on creating adaptive, versatile systems that could offer personalized assistance while staying seamlessly integrated into users' daily routines. The leaked details provide an exciting glimpse into the next steps for Anthropic’s AI technology.

arstechnica.com

Research

Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents

This paper introduces Claw-Eval, a novel approach aimed at creating more reliable and trustworthy evaluations for autonomous agents. By addressing current challenges in assessing these systems, it proposes a framework that ensures consistency and fairness while highlighting key areas for improvement. The work emphasizes the importance of robust benchmarks to support the growth and deployment of AI-driven agents.

huggingface.co

Embarrassingly Simple Self-Distillation Improves Code Generation

This paper explores how an embarrassingly simple self-distillation approach can enhance code generation performance. By iteratively refining a model with its own predictions, this method achieves notable improvements without the need for complex processes or additional supervision. The study highlights the effectiveness of simplicity in advancing code generation tasks.

huggingface.co

AI is a common workplace tool: half of employed AI users now use it for work

A survey of over 2,000 Americans reveals that AI has become a common tool in the workplace, with half of employed AI users incorporating it into their jobs. It examines who is using AI, in what capacity, the services being utilized, and whether AI is replacing existing tasks or generating new ones. The findings highlight the growing role of AI in modern work environments.

epoch.ai

Video

Opinionated agentic development and sharing

In Episode 3 of the Made for Dev @DockerInc special, Sammy Deprez and Oleg Šelajev explore the capabilities of Docker Agent, a tool for building custom, portable AI agents. Oleg showcases the "Agent-as-Code" philosophy, demonstrating how to configure an agent's personality, AI model, and toolset using a simple YAML file. The episode highlights features like running agents locally with the Docker Agent CLI and integrating tools to handle tasks such as research, coding, and debugging. If you've been curious about creating shareable AI assistants tailored to your needs, this episode has you covered.

youtube.com

Claude Mythos, Project Glasswing and AI cybersecurity risks

This week's Mixture of Experts podcast explores key AI topics, including Anthropic's decision to withhold its Mythos model and the implications for AI security, along with a breakdown of the financial strategies of OpenAI and Anthropic as they tackle different market sectors. The discussion also covers whether AI can rediscover historic scientific breakthroughs, featuring findings from the GPT-1900 model experiment. To wrap up, IBM Fellow Aaron Baughman showcases the Masters Vault, an AI innovation that enables seamless searching of decades' worth of Masters golf footage using natural language.

youtube.com

Articles

VS Code Just Turned AI Agents Into Your New Dev Team

The latest Visual Studio Code update quietly levels up AI from autocomplete to full agents that can plan, act, and iterate across your workspace. These agents go beyond suggestions, handling multi-step tasks, tool use, and context-aware changes inside your codebase. It marks a shift from passive assistance to active collaboration, where AI can actually execute work. If you thought Copilot was useful, this release hints at what happens when it starts behaving like a real teammate.

code.visualstudio.com

glm-5.1:cloud

GLM-5.1 is a model designed for advanced agentic engineering, with much stronger coding capabilities than its predecessor. It posts strong results on SWE-Bench Pro and outperforms the earlier GLM-5 model by a significant margin.

ollama.com

Your AI Agents Are Broken. Here’s the Blueprint Fix

Most agent failures are not prompt issues but design flaws. This guide cuts through the hype and shows when to use patterns like ReAct, reflection, planning, or multi-agent setups without overengineering. It explains how to choose the simplest structure that works, enforce tool discipline, and avoid runaway loops and fragile workflows. If your agents feel unpredictable or expensive, this roadmap offers a grounded way to make them reliable and production-ready.

machinelearningmastery.com
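
To make "tool discipline" and "runaway loops" concrete, here is a minimal sketch of an agent loop with a hard step budget and a tool allowlist. It is not taken from the linked guide: the names `ALLOWED_TOOLS`, `run_agent`, `scripted_policy`, and `looping_policy` are hypothetical, and the policy functions stand in for a real model call.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical names throughout; none of this comes from the linked guide.
Action = Tuple[str, str]  # (tool name, argument); the tool "finish" ends the run

# Allowlist: the agent may only call tools registered here.
ALLOWED_TOOLS: Dict[str, Callable[[str], str]] = {
    "echo": lambda arg: arg,
    "upper": lambda arg: arg.upper(),
}


def run_agent(policy: Callable[[List[str]], Action], max_steps: int = 5) -> str:
    """Run a decide/act loop with a hard step cap and a tool allowlist."""
    history: List[str] = []
    for _ in range(max_steps):
        tool, arg = policy(history)
        if tool == "finish":
            return arg
        if tool not in ALLOWED_TOOLS:
            # Tool discipline: record the error instead of guessing a tool.
            history.append(f"error: unknown tool {tool!r}")
            continue
        history.append(ALLOWED_TOOLS[tool](arg))
    # Budget exhausted: stop rather than loop forever.
    return "stopped: step budget exhausted"


def scripted_policy(history: List[str]) -> Action:
    """Stand-in for a model call: uppercase a word, then finish with it."""
    if not history:
        return ("upper", "hello")
    return ("finish", history[-1])


def looping_policy(history: List[str]) -> Action:
    """Stand-in for a misbehaving model that never decides to finish."""
    return ("echo", "again")
```

Calling `run_agent(scripted_policy)` finishes after two steps, while `run_agent(looping_policy, max_steps=3)` is cut off by the budget. The step cap is the simplest guard against runaway cost; a real system would likely add timeouts and spend limits as well.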

Upcoming Events

AgentCamp - Coming to a City Near You

AgentCamp continues to grow as a global series of hands-on gatherings dedicated to building and experimenting with AI agents. These community-driven events bring developers, founders, and AI enthusiasts together for practical sessions, collaborative building, and open exchange of ideas. Hosted in cities around the world, AgentCamp focuses on real-world experimentation, giving participants the space to prototype agent workflows, explore emerging tools, and learn directly from peers working at the edge of autonomous AI. Join the community to build, share, and help advance what AI agents can do in practice.

globalai.community

Code

Building a Real-Time Multi-Agent UI with AG-UI and Microsoft Agent Framework Workflows

Learn how to create a real-time multi-agent user interface using AG-UI and the Microsoft Agent Framework in this comprehensive demo. Explore features like dynamic agent handoffs, human-in-the-loop approvals, and streaming server-sent events, all applied to a customer support workflow. This hands-on walkthrough highlights practical ways to streamline interactions and enhance efficiency.

devblogs.microsoft.com

Connecting MCP servers to Amazon Bedrock AgentCore Gateway using Authorization Code flow

Amazon Bedrock AgentCore Gateway simplifies managing AI agent connections to tools and MCP servers within your organization. This guide explains the process of setting up the gateway to link with an OAuth-protected MCP server through the Authorization Code flow, providing a secure and efficient connection method.

aws.amazon.com
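
For readers unfamiliar with the Authorization Code flow, the sketch below shows its three generic steps as defined by OAuth 2.0: build the authorization request, validate the callback, and prepare the token exchange. It does not use or describe the AgentCore Gateway API. The endpoint, client ID, redirect URI, and scope are placeholder assumptions, and no network calls are made.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

# Placeholder values (assumptions); substitute your identity provider's settings.
AUTHORIZE_ENDPOINT = "https://auth.example.com/oauth2/authorize"
CLIENT_ID = "example-client-id"
REDIRECT_URI = "https://app.example.com/callback"


def build_authorization_url(state: str, scope: str = "mcp.read") -> str:
    """Step 1: the URL the user's browser is sent to in order to sign in."""
    params = {
        "response_type": "code",  # asks for an authorization code
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": scope,
        "state": state,  # echoed back by the server; guards against CSRF
    }
    return f"{AUTHORIZE_ENDPOINT}?{urlencode(params)}"


def parse_callback(callback_url: str, expected_state: str) -> str:
    """Step 2: check the returned state and extract the authorization code."""
    query = parse_qs(urlparse(callback_url).query)
    if query.get("state", [""])[0] != expected_state:
        raise ValueError("state mismatch")
    return query["code"][0]


def build_token_request(code: str) -> dict:
    """Step 3: form fields to POST to the token endpoint for an access token."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": REDIRECT_URI,
        "client_id": CLIENT_ID,
    }


if __name__ == "__main__":
    state = secrets.token_urlsafe(16)  # fresh random value per login attempt
    print(build_authorization_url(state))
```

A confidential client would also authenticate itself at the token endpoint, and many deployments add PKCE on top of this flow; both are omitted here to keep the sketch short.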

Copilot CLI now supports BYOK and local models

GitHub Copilot CLI now allows users to bring their own model provider (BYOK) or run fully local models, giving more flexibility and control over how models are used. This enhancement means you’re no longer limited to GitHub-hosted model routing and can integrate the models that best suit your needs. It’s a step forward in customizing your AI-driven coding experience.

github.blog

Podcast

Machine Learning Street Talk

Machine Learning Street Talk (MLST) offers engaging conversations with leading experts in AI, covering topics like cognitive science, neuroscience, and the philosophy of mind. The show provides in-depth analysis of current developments in AI, emphasizing intellectual diversity while cutting through the hype. Hosted by Tim Scarfe, Ph.D., with regular contributions from MIT's Dr. Keith Duggar, MLST delivers a rigorous and wide-ranging perspective on the field.

open.spotify.com