Anthropic's Claude 4 Announced: New AI Capabilities for Coding, Agents, and Claude Code in VS Code / IntelliJ
Software engineer and entrepreneur based in San Francisco.
Software engineer and entrepreneur based in San Francisco.
Anthropic just released their next generation of AI models: Claude Opus 4 and Claude Sonnet 4. This isn't just another incremental update—it represents a significant leap in AI capabilities, particularly for coding, complex reasoning, and autonomous agent workflows.
The Claude 4 family introduces two distinct models with complementary strengths:
Claude Opus 4 is Anthropic's most intelligent model to date and, according to their benchmarks, the world's best coding model. My experience anecdotally more or less confirms this, although Google and OpenAI models have recently come neck-and-neck with Anthropic.
It excels at sustained performance on complex, long-running tasks that require thousands of steps and hours of focused effort—dramatically expanding what AI agents can accomplish.
Claude Sonnet 4 significantly improves on Sonnet 3.7, delivering superior coding and reasoning while responding more precisely to instructions. While not matching Opus 4 in most domains, it delivers an optimal mix of capability and practicality for everyday use.
The performance improvements are substantial, with Claude models now leading on key benchmarks:
Benchmark | Claude Opus 4 | Claude Sonnet 4 | Previous Best |
---|---|---|---|
SWE-bench Verified | 72.5% | 72.7% | 67.0% (GPT-4.1) |
Terminal-bench | 43.2% | 39.6% | 38.0% (GPT-4.1) |
MMLU | 90.2% | 88.0% | 88.7% (GPT-4.1) |
TAU-bench | 83.0% | 78.0% | 76.0% (o3) |
What's particularly impressive is Claude Opus 4's ability to maintain performance over extended periods, at least according to the benchmark results. They cited today a Rakuten validation of this by having Opus 4 run a demanding open-source refactor continuously for 7 hours with sustained performance—something previous models couldn't achieve. If true, this would constitute a significant improvement over the status quo -- I'll be testing this shortly.
Both Claude 4 models are hybrid reasoning models offering two modes:
The extended thinking capability now works with tools like web search, allowing Claude to alternate between reasoning and tool use to improve responses. This is particularly valuable for research tasks, complex coding problems, and multi-step workflows.
What makes this approach unique is that Claude can dynamically decide when to use extended thinking based on the complexity of the task, rather than requiring explicit configuration.
After a successful research preview, Claude Code is now generally available with expanded capabilities. This agentic coding tool lives in your terminal, understands your codebase, and helps you code faster through natural language commands.
Claude Code's key capabilities include:
One of the most exciting additions is the new IDE integrations for VS Code and JetBrains:
These integrations provide:
Cmd+Esc
(Mac) or Ctrl+Esc
(Windows/Linux) to open Claude Code directly from your editorClaude Code now supports:
GitHub reports that Claude Sonnet 4 "soars in agentic scenarios" and will serve as the base model for the new coding agent in GitHub Copilot.
Another new powerful feature of Claude Code is its support for git worktrees, allowing you to run multiple Claude Code sessions in parallel across different branches of the same repository.
Git worktrees let you check out multiple branches of a repository simultaneously in different directories. With Claude Code, this enables:
To set up a git worktree workflow with Claude Code:
# Create a new worktree for a feature branch
git worktree add ../repo-feature-a feature/a
# In one terminal, run Claude Code in the main worktree
cd /path/to/main/repo
claude-code
# In another terminal, run Claude Code in the feature worktree
cd ../repo-feature-a
claude-code
Anthropic has released four new capabilities on their API specifically designed for building more powerful AI agents:
Both models maintain consistent pricing with previous Opus and Sonnet models:
Model | Input Tokens | Output Tokens | Context Window | Availability |
---|---|---|---|---|
Claude Opus 4 | $15 per million | $75 per million | 200K tokens | Pro, Max, Team, Enterprise |
Claude Sonnet 4 | $3 per million | $15 per million | 200K tokens | All users (including free) |
Both models are available on:
The Claude 4 release represents a significant advancement for developers building AI-powered applications:
Sustained performance for complex tasks: The ability to work continuously for hours enables entirely new categories of AI applications.
Improved memory capabilities: When given access to local files, Claude Opus 4 can create and maintain "memory files" to store key information, enabling better long-term task awareness.
Reduced shortcut behavior: Both models are 65% less likely to take shortcuts or exploit loopholes compared to Sonnet 3.7, making them more reliable for autonomous workflows.
Thinking summaries: A new feature that condenses lengthy thought processes, making it easier to understand Claude's reasoning without sacrificing depth.
For developers working on coding assistants, research tools, or autonomous agents, these improvements open up new possibilities for building more capable and reliable AI applications.
The Claude 4 family represents a significant step toward AI systems that can maintain context, sustain focus on longer projects, and deliver transformational results across a wide range of domains.