Running AI Locally: Lessons Learned

April 6, 20265 min read

What I learned from running AI models on local hardware.

Practical insights from running a self-hosted AI assistant, including hardware challenges, memory architecture, and when to choose local vs cloud AI.

After months of running OpenClaw as my personal AI assistant, I've learned a lot about what works—and what doesn't—when hosting AI infrastructure locally. Here are the key lessons.

The biggest misconception about local AI is that you can just run any model on any hardware. You can't. I learned this the hard way when I tried to run a 70B parameter model on my home server and watched it crash repeatedly.

The reality: Local inference requires serious hardware. For anything beyond small models (7B-13B parameters), you need:

My solution? Use cloud models via Ollama Cloud. The local Ollama server acts as a relay, but the actual inference happens on cloud GPUs. This gives me access to powerful models like Qwen3.5 (397B parameters) without needing a $10,000 GPU setup.

An AI assistant without memory is just a chatbot. The key insight: memory should be external to the model. I use Neo4j as a graph database to store:

This allows the AI to have contextual conversations. When I ask "How's the media server doing?", it can query Neo4j and tell me about Radarr, Sonarr, and any stalled downloads—not just give a generic response.

After experimenting with both, here's my framework:

My current setup uses both:

This balances cost, performance, and capability. Simple tasks stay local; complex work goes to the cloud.

The real power of a self-hosted AI isn't chat—it's automation. OpenClaw runs scheduled jobs that:

These aren't scripted commands—they're AI-driven decisions. The media manager, for example, analyzes torrent health, decides what's dead, removes it, and triggers re-searches. All autonomously.

The roadmap includes expanding automation (email triage, calendar management), improving memory (better entity extraction from conversations), and exploring multi-agent workflows (specialized AI assistants for different domains).

If you're building your own AI infrastructure, I'm happy to share what I've learned. Reach out via the contact page or connect on Discord.

Running AI Locally: Lessons Learned

Hardware Reality Check

Memory Architecture Matters

When to Go Local vs Cloud

The Hybrid Approach

Automation is Where It Shines

Lessons Learned

What's Next