
Refactoring to a Hybrid Cloud Architecture


When I started building Aethas, I took a purely local-first approach. Everything would run on the user's machine: their data, their embeddings, their conversations. No cloud required. This was great for learning Tauri and iterating quickly.

I knew that would be just the first step. I wanted proactivity: something that acts on my behalf without a prompt. How could an AI prepare your meeting context if it only runs when you open your laptop? How could you access your knowledge base from your phone? How could the system surface relevant information throughout your day?

It's time to think about the server architecture.


The Original Architecture

The initial stack was elegant in its simplicity:

  • Tauri 2.0 for the desktop app (Rust backend, native webview)
  • React + TypeScript frontend
  • SQLite for local storage
  • FastEmbed (Rust) for local embeddings

Everything ran locally. The Rust backend handled file indexing, vector search, and Claude API calls. My test users brought their own API keys. Data never left their machine.

This worked beautifully for the core use case: searching your knowledge base and chatting with an AI that understood your context. But you can do that easily with MCP servers. Aethas was designed from the beginning for proactivity.


Why I Needed a Server

Three requirements drove the change:

Proactive AI. I want Aethas to prepare meeting briefs before calendar events, surface relevant context throughout the day, draft responses to incoming emails. This requires compute happening in the background, even when the desktop app is closed.

Mobile access. Querying your knowledge base from your phone means the data needs to be accessible somewhere other than your laptop. Running Tauri on mobile isn't practical, and I didn't want to build two native apps.

Continuous processing. Background agents that monitor for triggers, process new content, and maintain index freshness need a persistent runtime.

The solution: a hybrid architecture where the desktop app becomes a thin client that syncs local data, while a cloud server handles background processing and provides API access.


Choosing the Stack: Bun + Hono

I needed a backend stack with fast cold starts (for serverless later), TypeScript-native tooling (shared types with the frontend), minimal boilerplate, and great DX.

| Option  | Pros                       | Cons                       |
|---------|----------------------------|----------------------------|
| Express | Mature, huge ecosystem     | Verbose, slow cold starts  |
| Fastify | Fast, schema validation    | Still needs Node           |
| Hono    | Ultra-light, runs anywhere | Newer, smaller ecosystem   |
| tRPC    | Type-safe RPC, great DX    | Overkill for REST-ish API  |

Bun + Hono won.

Bun's Built-in SQLite

Bun ships with native SQLite bindings:

import { Database } from 'bun:sqlite';

const sqlite = new Database('./data/aethas.db');
sqlite.exec('PRAGMA journal_mode = WAL');

This eliminated my biggest Node.js pain point: cross-platform SQLite compilation. Anyone who's fought with better-sqlite3 on different platforms knows the pain. I might still switch to Postgres later, but this lets me iterate quickly and leverage some really nice tools for RAG and tokens.

Hono's Simplicity

The entire server setup:

import { Hono } from 'hono';
import { cors } from 'hono/cors';
import { logger } from 'hono/logger';

const app = new Hono();

app.use('*', logger());
app.use('*', cors({ origin: ['http://localhost:1420'], credentials: true }));

app.route('/api/conversations', conversationsRoutes);
app.route('/api/chat', chatRoutes);
app.route('/api/sources', sourcesRoutes);

export default { port: 3000, fetch: app.fetch };

The server starts in under 100ms. Hono's SSE streaming support is equally clean, which matters for chat applications where streaming UX is everything.

It's great to drop webpack and get near-instant hot reloads straight from TypeScript:

bun run --hot src/index.ts

Development velocity depends on tight feedback loops, especially when using an AI tool like Claude to help you code; anything longer than two seconds and you lose flow.


Database Layer: Drizzle

For the ORM, I chose Drizzle over Prisma or raw SQL.

Drizzle schemas are TypeScript-first, giving compile-time checks and autocompletion. The query syntax mirrors SQL directly. I hate "magic" in ORMs; I want no surprises and to stay as close to SQL as possible. And unlike Prisma's query engine, Drizzle compiles to thin wrappers around your database driver. Docker images stay small, startup stays fast.

The tradeoff: Drizzle's ecosystem is smaller than Prisma's. So far it's been pretty good. It's probably one of the best ORM experiences I've had with SQL, though I'll probably never get used to importing query operators just to use them.


Monorepo Structure

I restructured to a pnpm workspace monorepo:

aethas/
  apps/
    web/          # React frontend (Vite)
    desktop/      # Tauri app (wraps web)
  server/         # Bun + Hono backend
  packages/
    shared/       # TypeScript types shared across all

The @aethas/shared package contains every API contract, every event type, every data structure. Changes propagate to both frontend and backend compilation automatically. The TypeScript compiler catches interface mismatches.
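For a flavor of what lives in that package (these particular types are illustrative, not Aethas's actual contracts), a shared chat contract might look like:

```typescript
// packages/shared/src/api.ts — one source of truth for client and server.

// Hypothetical request shape for the chat endpoint.
export interface ChatRequest {
  conversationId: string;
  message: string;
}

// Hypothetical streamed chunk shape.
export interface ChatChunk {
  type: 'token' | 'done' | 'error';
  content: string;
}

// A runtime narrowing helper both sides can share for untrusted input.
export function isChatChunk(value: unknown): value is ChatChunk {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    (v.type === 'token' || v.type === 'done' || v.type === 'error') &&
    typeof v.content === 'string'
  );
}
```

Rename a field here and both the server route and the React hook that consume it fail to compile until they're updated.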

This was the biggest productivity win of the refactor. Start with shared types early. I've found over the years that shared types can be one of the largest pains in any system, especially in JavaScript. The monorepo approach avoids publishing yet another npm package just to share them.


Development Environment

For local development, I use Docker Compose with Tilt for orchestration.

The Dockerfile is minimal thanks to Bun:

FROM oven/bun:1
WORKDIR /app
COPY package.json bun.lockb* ./
RUN bun install
COPY . .
EXPOSE 3000
CMD ["bun", "run", "src/index.ts"]

Tilt coordinates the server container and local web dev. Running tilt up starts everything with live reload and a dashboard showing all services. The whole stack comes up in seconds. Tilt gives me a lot of flexibility on top of docker-compose while staying close to a live production deployment, which keeps deploys simple. Plus, I can switch to a local Kubernetes cluster later and get a near-mirror of my production setup.
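The Tiltfile for this kind of setup can stay tiny; a minimal sketch along these lines (the resource name and pnpm filter are assumptions, not the actual Aethas config):

```
# Tiltfile — orchestrate the compose services plus the local web dev server
docker_compose('./docker-compose.yml')

# Run the Vite dev server outside Docker for fast HMR
local_resource(
  'web',
  serve_cmd='pnpm --filter @aethas/web dev',
)
```

`docker_compose` pulls every service from the compose file into the Tilt dashboard, while `local_resource` runs the frontend directly on the host so Vite's HMR stays instant.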


Lessons Learned

Start with shared types. Every API contract in one package. TypeScript catches mismatches at compile time. This saved me hours of debugging.

Design for streaming from the start. Don't bolt SSE onto a request/response API later. The event types, streaming routes, and frontend handlers should be designed together.

SQLite scales further than you think. I debated PostgreSQL early on. For a single-user app with tens of thousands of documents, SQLite with WAL mode handles everything. The operational simplicity — one file, works everywhere — is worth more than theoretical scale.

Monorepo friction is real. pnpm workspaces work well, but the mental model of "which package am I in?" takes adjustment. Clear naming conventions help: @aethas/web, @aethas/server, @aethas/shared.

Hot reload or bust. Bun's instant restarts, Vite's HMR, and Tilt's live updates keep the feedback cycle under 2 seconds. Anything longer and you lose flow.


What This Unlocks

The hybrid architecture unlocks the proactive AI roadmap:

  • Calendar integration — prepare meeting briefs before events
  • Email triage — surface relevant context for incoming messages
  • Background agents — process new content continuously
  • Mobile access — same API, different client

The foundation is set!