Software engineer and founder with a background in finance and tech. Currently building aVenture.vc, a platform for researching private companies. Based in San Francisco.
Anthropic just released their next generation of AI models: Claude Opus 4 and Claude Sonnet 4. This is a major update with real improvements for coding, reasoning, and agent workflows.
The Claude 4 Family: Opus and Sonnet
The Claude 4 family introduces two distinct models with complementary strengths:
Claude Opus 4 is Anthropic's most intelligent model to date and, according to their benchmarks, the world's best coding model. My experience anecdotally more or less confirms this, although Google and OpenAI models have recently come neck-and-neck with Anthropic.
It handles complex, long-running tasks that require thousands of steps and hours of focused effort—useful for AI agents.
Claude Sonnet 4 improves on Sonnet 3.7 with better coding and reasoning and more precise instruction-following. It doesn't match Opus 4 in most areas, but works well for everyday use.
Benchmark-Leading Performance
Claude models now lead on several benchmarks:
Benchmark
Claude Opus 4
Claude Sonnet 4
Previous Best
SWE-bench Verified
72.5%
72.7%
67.0% (GPT-4.1)
Terminal-bench
43.2%
39.6%
38.0% (GPT-4.1)
MMLU
90.2%
88.0%
88.7% (GPT-4.1)
TAU-bench
83.0%
78.0%
76.0% (o3)
Claude Opus 4 can maintain performance over extended periods, at least according to the benchmark results. They cited today a Rakuten validation of this by having Opus 4 run a demanding open-source refactor continuously for 7 hours with sustained performance—something previous models couldn't achieve. If true, this would constitute a significant improvement over the status quo -- I'll be testing this shortly.
Hybrid Reasoning with Extended Thinking
Both Claude 4 models are hybrid reasoning models offering two modes:
Standard mode: Near-instant responses for everyday queries
Extended thinking: Deeper reasoning for complex problems
The extended thinking capability now works with tools like web search, allowing Claude to alternate between reasoning and tool use to improve responses. This helps with research tasks, complex coding problems, and multi-step workflows.
Claude can dynamically decide when to use extended thinking based on task complexity, rather than requiring explicit configuration.
Claude Code: Now Generally Available
After a successful research preview, Claude Code is now generally available with more capabilities. It lives in your terminal, understands your codebase, and helps you code faster through natural language commands.
Capabilities
Claude Code can:
Editing files and fixing bugs across your codebase
Answering questions about your code's architecture and logic
Executing and fixing tests, linting, and other commands
Searching through git history, resolving merge conflicts, and creating commits and PRs
Browsing documentation and resources from the internet using web search
Inline code edits displayed directly in your files
GitHub reports that Claude Sonnet 4 "soars in agentic scenarios" and will serve as the base model for the new coding agent in GitHub Copilot.
Parallel Workflows with Git Worktrees
Claude Code also supports git worktrees, so you can run multiple Claude Code sessions in parallel across different branches of the same repository.
Using Git Worktrees with Claude Code
Git worktrees let you check out multiple branches of a repository simultaneously in different directories. With Claude Code, this enables:
Working on multiple features or bug fixes concurrently
Running separate Claude Code sessions for each worktree
Maintaining context isolation between different tasks
Comparing approaches across branches without context switching
To set up a git worktree workflow with Claude Code:
# Create a new worktree for a feature branchgit worktree add../repo-feature-a feature/a
# In one terminal, run Claude Code in the main worktreecd /path/to/main/repo
claude-code
# In another terminal, run Claude Code in the feature worktreecd../repo-feature-a
claude-code
New API Capabilities for Agent Development
Anthropic has released four new capabilities on their API for building AI agents:
Code execution tool: Allows Claude to run code in a sandboxed environment
MCP connector: Enables connection to external Model Context Protocol servers
Prompt caching: Allows caching prompts for up to one hour, reducing costs by up to 90%
Pricing and Availability
Both models maintain consistent pricing with previous Opus and Sonnet models:
Model
Input Tokens
Output Tokens
Context Window
Availability
Claude Opus 4
$15 per million
$75 per million
200K tokens
Pro, Max, Team, Enterprise
Claude Sonnet 4
$3 per million
$15 per million
200K tokens
All users (including free)
Both models are available on:
Claude.ai
Anthropic API
Amazon Bedrock
Google Cloud's Vertex AI
What This Means for Developers
Claude 4 matters for developers building AI-powered applications:
Sustained performance for complex tasks: The ability to work continuously for hours enables entirely new categories of AI applications.
Improved memory capabilities: When given access to local files, Claude Opus 4 can create and maintain "memory files" to store key information, enabling better long-term task awareness.
Reduced shortcut behavior: Both models are 65% less likely to take shortcuts or exploit loopholes compared to Sonnet 3.7, making them more reliable for autonomous workflows.
Thinking summaries: A new feature that condenses lengthy thought processes, making it easier to understand Claude's reasoning without sacrificing depth.
For developers working on coding assistants, research tools, or autonomous agents, these improvements enable more capable AI applications.
The Claude 4 models can maintain context and stay on task for longer projects—useful across many domains.
Anthropic's Claude 4 Announced: New AI Capabilities for Coding,...
Similar Content
Related Articles
How to Secure Environment Variables for LLMs, MCPs, and AI Tools Using 1Password or Doppler
Stop hardcoding API keys in MCP configs and AI tool settings. Learn how to use 1Password CLI or Doppler to inject secrets just-in-time for Claude, Cur...
Claude Code Output Styles: Explanatory, Learning, and Custom Options
An implementation guide to Claude Code's /output-style, the built‑in Explanatory and Learning modes (with to-do prompts), and creating reusable custom...
Claude Code: Automatic Linting, Error Analysis, & Custom Commands
How to use Claude Code's error analysis slash-commands and create your own linting commands to automate repetitive CLI tasks.
Sweep: Coding Agent and Next-Edit Autocomplete Sweep is a world-class AI coding agent and autocomplete built for JetBrains developers. We are the only...
docs.anthropic.com
SDK for Claude Code (CLI) - Anthropic
Programmatically integrate Claude Code into your applications using the SDK.
Related Books
Knowledge Graphs and LLMs in Action
Alessandro Negro, Vlastimil Kus +2
Description A technical manual on integrating knowledge graphs with Large Language Models (LLMs) to create intelligent systems with structured reasoni...
Build AI Applications with Spring AI
Fu Cheng
Description A guide for Java developers on using the Spring AI framework to integrate artificial intelligence capabilities into enterprise application...
Related Investments
Owners Platform
AI-powered investment management platform for real estate portfolios.
AngelList
Platform connecting startups with investors, talent, and resources for fundraising and growth.
Sudrania
Fund administration and accounting platform for investment managers.
Causal AI
Robert Osazuwa Ness
Description An introduction to building AI models that identify and reason about cause-and-effect relationships rather than just statistical correlati...