Run AI Run

The Week AI Safety United Rivals and Reality Checked the Hype

Jul 18, 2025

Welcome to the inaugural edition of Run AI Run, as announced here.

This week marked a pivotal moment in AI development: major AI companies collaborated on safety research to an unprecedented degree, infrastructure investments exceeded $100 billion, and significant talent moves reshaped the competitive landscape. It also brought breakthrough model releases, geopolitically charged infrastructure deals, and a growing gap between what AI tools promise and what they deliver.

🚀 Model Innovations and Releases

Grok-4 Sets New Intelligence Standards
xAI unveiled Grok-4, which it bills as the world's most powerful AI model, with native tool use and real-time search. It is available immediately to subscribers and ranks third on global AI leaderboards.

Kimi K2 Tops Coding Benchmarks
Moonshot AI released Kimi K2, an open-source mixture-of-experts (MoE) model with 1T total parameters that excels at coding and agentic tasks, reportedly outperforming ChatGPT and Claude on some benchmarks at lower cost.

Voxtral Mini Advances Speech AI
Mistral launched Voxtral Mini, an open-source speech recognition model that supports transcription and summarization and reports lower word error rates than competing models.
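
For readers unfamiliar with the metric, word error rate (WER) is the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words, so lower is better. The sketch below is a minimal illustration of that definition. The function name and sample sentences are made up for this example and are not taken from Mistral's evaluation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts: one dropped word out of six reference words.
print(f"{word_error_rate('the cat sat on the mat', 'the cat sat on mat'):.3f}")  # 0.167
```

Real evaluations usually normalize casing and punctuation before scoring, which this sketch skips.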

Gemini 2.5 Flash-Lite Boosts Efficiency
Google released Gemini 2.5 Flash-Lite, a cost-efficient model 1.5x faster than predecessors with dynamic thinking control for latency-sensitive apps.

🔬 Research Breakthroughs and Techniques

Muon Optimizer Scales LLM Training
Researchers detailed how the Muon optimizer scales for LLM training, adding weight decay and update-scale adjustments to keep training stable at up to trillion-parameter scale.
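
To show where weight decay and the update adjustment sit in a Muon-style step, here is a simplified pure-Python sketch rather than the researchers' implementation. It makes several assumptions: a plain cubic Newton-Schulz iteration stands in for the tuned polynomial used in published implementations, Nesterov momentum is omitted, and the 0.2 * sqrt(max(rows, cols)) scale factor and all hyperparameter defaults are illustrative values.

```python
import math


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def orthogonalize(g, steps=30, eps=1e-12):
    """Push every singular value of g toward 1 with cubic Newton-Schulz steps."""
    # Dividing by the Frobenius norm keeps singular values at or below 1,
    # which the iteration needs in order to converge.
    norm = math.sqrt(sum(v * v for row in g for v in row))
    x = [[v / (norm + eps) for v in row] for row in g]
    for _ in range(steps):
        xxt_x = matmul(matmul(x, transpose(x)), x)
        x = [[1.5 * a - 0.5 * b for a, b in zip(rx, rb)] for rx, rb in zip(x, xxt_x)]
    return x


def muon_step(w, grad, momentum, lr=0.02, beta=0.95, weight_decay=0.1):
    """One simplified Muon-style update for a 2-D weight matrix (lists of rows)."""
    rows, cols = len(w), len(w[0])
    # Accumulate heavy-ball momentum, then orthogonalize the momentum matrix.
    momentum = [[beta * m + g for m, g in zip(rm, rg)] for rm, rg in zip(momentum, grad)]
    direction = orthogonalize(momentum)
    # Assumed update adjustment: rescale by matrix shape so the update
    # magnitude is comparable across layers of different sizes.
    scale = 0.2 * math.sqrt(max(rows, cols))
    # Decoupled weight decay is applied alongside the orthogonalized update.
    new_w = [
        [wv - lr * (scale * dv + weight_decay * wv) for wv, dv in zip(rw, rd)]
        for rw, rd in zip(w, direction)
    ]
    return new_w, momentum
```

Without the decay term, nothing in the update pulls weights back toward zero, which is the kind of instability the weight-decay addition is meant to address.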

AI Safety Collaboration Urges CoT Monitoring
A joint paper from researchers at OpenAI, Google DeepMind, Anthropic, and Meta highlights chain-of-thought (CoT) monitoring as a way to detect AI misbehavior, and warns that this visibility is fragile and could be lost as models advance.

Verifier's Law Explains AI Training Ease
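Jason Wei proposed Verifier's Law, which holds that how easily AI can be trained to solve a task tracks how easily solutions to that task can be verified, with implications for reinforcement learning and for tasks where checking is far easier than solving.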
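
The asymmetry between solving and verifying is easy to see on a toy problem. The sketch below uses subset sum purely as an illustration. It is not an example from Jason Wei's post, and the function names are hypothetical. Checking a proposed answer takes one pass over the candidate, while the brute-force solver may have to try every subset.

```python
from itertools import combinations


def verify(numbers, target, candidate):
    """Cheap check: are the indices valid, distinct, and do they sum to target?"""
    indices = list(candidate)
    return (
        len(set(indices)) == len(indices)
        and all(0 <= i < len(numbers) for i in indices)
        and sum(numbers[i] for i in indices) == target
    )


def solve(numbers, target):
    """Expensive search: try subsets of increasing size (exponential worst case)."""
    for size in range(len(numbers) + 1):
        for combo in combinations(range(len(numbers)), size):
            if sum(numbers[i] for i in combo) == target:
                return combo
    return None


numbers = [3, 34, 4, 12, 5, 2]
solution = solve(numbers, 9)
print(solution, verify(numbers, 9, solution))
```

A cheap, reliable check like `verify` is the kind of signal that can serve as a reward in reinforcement learning, which is the link the blurb above draws.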

SmolLM3 Enables Multilingual Reasoning
Hugging Face released SmolLM3, a 3B-parameter model that supports six languages and long-context reasoning and tops benchmarks in its size class.

⚡ Infrastructure and Hardware Advances

NVIDIA Unveils Blackwell Ultra
NVIDIA announced the Blackwell Ultra platform, which it says delivers up to 40x the performance of Hopper on AI reasoning workloads, with products launching in H2 2025.

AMD Launches Instinct MI350 Series
AMD introduced its Instinct MI350 Series GPUs, claiming a 4x generation-over-generation increase in AI compute and a 35x gain in inference performance, and announced partnerships with major AI firms.

DeepInfra Offers Affordable B200 GPUs
DeepInfra is offering NVIDIA B200 GPUs on demand at $1.99/hour, a promotional price aimed at making AI workloads more accessible and valid through the end of July.

TensorWave Unveils 21 ExaFLOPS Cluster
TensorWave launched a supercluster of 8,192 AMD Instinct MI325X GPUs offering 21 exaFLOPS for AI training, challenging NVIDIA's dominance.
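
As a quick sanity check on the headline number, the two figures above imply a per-GPU throughput. The arithmetic below uses only the quoted values. The announcement as summarized here does not say which numeric precision the figure refers to, or whether it is peak or sustained, so treat the result as a rough, assumption-laden estimate.

```python
def per_gpu_petaflops(cluster_exaflops: float, gpu_count: int) -> float:
    """Convert a cluster-wide exaFLOPS figure into petaFLOPS per GPU."""
    petaflops_per_exaflop = 1000
    return cluster_exaflops * petaflops_per_exaflop / gpu_count


# Figures quoted in the item above: 21 exaFLOPS across 8,192 GPUs.
print(f"{per_gpu_petaflops(21, 8192):.2f} petaFLOPS per GPU")  # 2.56 petaFLOPS per GPU
```

A figure in that range only makes sense as a low-precision peak number, so like-for-like comparisons with other clusters need the precision spelled out.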

🛠️ Tools and Developer Ecosystem

Perplexity Launches Comet Browser
Perplexity introduced Comet, an AI-powered browser that automates tasks and assists research with integrated agents.

Anthropic Debuts Claude Code Analytics
Anthropic released an analytics dashboard for Claude Code, tracking productivity metrics and ROI for engineering teams.

METR Study Reveals AI Tool Slowdowns
METR's randomized controlled trial (RCT) found that early-2025 AI tools made experienced open-source developers take 19% longer to complete tasks, challenging the productivity hype.

Asimov Agent Boosts Code Management
Reflection AI's Asimov acts as a code librarian, querying personal codebases to enhance developer productivity.

🏢 Industry Developments and Announcements

xAI Builds Frontier Lab Rapidly
xAI built a state-of-the-art AI lab in about 1.5 years and launched Grok-4 with a cluster of 100k H100s, raising questions about the pace of scaling at other Western labs.

Trump Announces Up to $90B AI Initiative
President Trump unveiled a $70-90B AI infrastructure plan at Carnegie Mellon, partnering with tech giants for data centers in Pennsylvania.

Anthropic Launches Claude for Financial Services
Anthropic introduced Claude for Financial Services, integrating with major data providers for investment analysis and compliance.

Meta Poaches Apple AI Leaders
Meta hired Ruoming Pang, Apple's top AI executive, and two key researchers for its Superintelligence Labs, with compensation packages reportedly exceeding $200M.

🐦 Twitter Recap

Discussion on X centered on the controversies around Grok-4's launch, Kimi K2's open-source impact, and AI safety warnings from major labs. Influencers such as @elonmusk and @_jasonwei drove debates on the talent wars and Verifier's Law, and the hashtags #Grok4, #AISafety, and #KimiK2 trended amid high engagement with the infrastructure announcements. Reactions to the historic AI safety collaboration paper were largely positive, with researchers praising the unprecedented cooperation between competing companies.

The METR study findings sparked heated debates about the effectiveness of AI tools, with many developers sharing experiences that contradicted benchmark claims. Infrastructure announcements from NVIDIA and AMD generated substantial technical discussion, while the Apple-to-Meta talent move prompted conversations about AI talent retention strategies. Industry leaders also discussed the implications of partnerships between government and the private sector on AI, particularly the Trump administration's Pennsylvania initiative.

Research Statistics

Sources Reviewed:

X/Twitter Posts: 150+ posts analyzed (minimum 20 favorites, 50 retweets)
Web Articles: 85+ articles from Reuters, WSJ, TechCrunch, VentureBeat, Bloomberg
Research Papers: 25+ ArXiv preprints and academic publications
Reddit Discussions: 40+ threads from r/MachineLearning, r/LocalLLaMA, r/singularity
Company Announcements: 15+ official press releases and blog posts

Run Data Run
