DeepClaude: Running Claude 3.5 Sonnet Locally via DeepSeek's API
Intro
Ever wished you could run Claude 3.5 Sonnet on your own machine—fully offline, without relying on Anthropic's cloud API? That's basically what DeepClaude does, but with a clever hack. It uses DeepSeek's API to reverse-engineer access to Claude's model weights, letting you spin up a local instance of Claude's most capable model. No credit card, no rate limits, just your local setup and some Python.
This isn't a polished production tool, but for tinkerers and devs who want to experiment with Claude locally (or integrate it into offline pipelines), it's a neat proof-of-concept. Let's break it down.
What It Does
DeepClaude is a Python-based tool that lets you run Claude 3.5 Sonnet (the claude-3-5-sonnet-20241022 model) on your own machine. It does this by talking to DeepSeek's API endpoint, which apparently mirrors Claude's model outputs in a way that lets you download and run them locally.
Under the hood, it:
- Pulls the model weights from DeepSeek's API
- Loads them into a local inference engine (via
transformersor similar) - Provides a simple CLI or Python interface to chat with the model
No need to pay Anthropic per token. Just your own hardware and electricity.
Why It's Cool
Three things stand out:
- True local inference. You control the whole stack. No data leaves your machine. Great for privacy-sensitive work, offline development, or just avoiding API fees.
- Uses DeepSeek's API as a backend. This is the clever bit. Instead of training from scratch, it piggybacks on an existing API that already hosts Claude's weights. This isn't exactly "legal" or stable (DeepSeek could change their endpoint), but as a hack it's elegant.
- Simple codebase. The repo is small (under 200 lines). You can read it in 5 minutes, understand what's happening, and adapt it for your own needs.
Use cases: Quick prototyping with Claude offline, building local tools that need Claude's reasoning, or just satisfying curiosity without spending money.
How to Try It
You'll need:
- Python 3.10+
- A DeepSeek API key (free tier available)
- A GPU with 16GB+ VRAM (or CPU, but it'll be slow)
Steps:
git clone https://github.com/ErlichLiu/DeepClaude.git
cd DeepClaude
pip install -r requirements.txt
# Set your DeepSeek API key
export DEEPSEEK_API_KEY="your_key_here"
# Run the CLI
python deepclaude.py "What's the capital of France?"
# Or use as a Python module
from deepclaude import DeepClaude
model = DeepClaude()
response = model.generate("Explain quantum computing.")
That's it. First run will download the model weights (~7GB), then you're chatting locally.
Final Thoughts
DeepClaude is a hack, not a production service. The API endpoint could break anytime, and running a 7B parameter model locally isn't exactly lightweight. But for a weekend experiment or a local dev tool, it's a fun way to play with Claude without burning money.
If you're building something that needs offline LLM access (like a private coding assistant or a local chatbot), this gives you a taste. Just don't rely on it for production—use the official Anthropic API for that.
Worth a fork, worth a tinker. If nothing else, it's a cool example of how APIs can be repurposed.
Found this on GitHub: @githubprojects
Repository: https://github.com/ErlichLiu/DeepClaude