<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Building Aethas | A Proactive AI Assistant Built in Public]]></title><description><![CDATA[Follow the development of Aethas, a proactive AI executive assistant. Technical deep-dives, architecture decisions, and honest progress updates from a solo founder.]]></description><link>https://blog.aethas.ai</link><image><url>https://cdn.hashnode.com/res/hashnode/image/upload/v1767853235779/21bb3dbc-84f6-40bf-94da-93ee84be71a6.png</url><title>Building Aethas | A Proactive AI Assistant Built in Public</title><link>https://blog.aethas.ai</link></image><generator>RSS for Node</generator><lastBuildDate>Fri, 17 Apr 2026 11:58:31 GMT</lastBuildDate><atom:link href="https://blog.aethas.ai/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Refactoring to a Hybrid Cloud Architecture]]></title><description><![CDATA[When I started building Aethas, I started with a purely local-first approach. Everything would run on the user's machine: their data, their embeddings, their conversations. No cloud required. This was great for learning Tauri and iterating quickly.
I...]]></description><link>https://blog.aethas.ai/refactoring-to-a-hybrid-cloud-architecture</link><guid isPermaLink="true">https://blog.aethas.ai/refactoring-to-a-hybrid-cloud-architecture</guid><category><![CDATA[Bun]]></category><category><![CDATA[Tauri]]></category><category><![CDATA[TypeScript]]></category><category><![CDATA[Build In Public]]></category><category><![CDATA[hono]]></category><dc:creator><![CDATA[Stephen Ashmore]]></dc:creator><pubDate>Tue, 20 Jan 2026 13:49:15 GMT</pubDate><content:encoded><![CDATA[<p>When I started building Aethas, I started with a purely local-first approach. Everything would run on the user's machine: their data, their embeddings, their conversations. No cloud required. This was great for learning Tauri and iterating quickly.</p>
<p>I knew that would be just the first step. I wanted proactivity; I wanted something to act on my behalf without a prompt. How could an AI prepare your meeting context if it only runs when you open your laptop? How could you access your knowledge base from your phone? How could the system surface relevant information throughout your day?</p>
<p>It's time to think about the server architecture.</p>
<hr />
<h2 id="heading-the-original-architecture">The Original Architecture</h2>
<p>The initial stack was elegant in its simplicity:</p>
<ul>
<li><strong>Tauri 2.0</strong> for the desktop app (Rust backend, native webview)</li>
<li><strong>React + TypeScript</strong> frontend</li>
<li><strong>SQLite</strong> for local storage</li>
<li><strong>FastEmbed (Rust)</strong> for local embeddings</li>
</ul>
<p>Everything ran locally. The Rust backend handled file indexing, vector search, and Claude API calls. My test users brought their own API keys. Data never left their machine.</p>
<p>This worked beautifully for the core use case: searching your knowledge base and chatting with an AI that understood your context. But you can do that easily with MCP servers. Aethas was designed from the beginning for proactivity.</p>
<hr />
<h2 id="heading-why-i-needed-a-server">Why I Needed a Server</h2>
<p>Three requirements drove the change:</p>
<p><strong>Proactive AI.</strong> I want Aethas to prepare meeting briefs before calendar events, surface relevant context throughout the day, draft responses to incoming emails. This requires compute happening in the background, even when the desktop app is closed.</p>
<p><strong>Mobile access.</strong> Querying your knowledge base from your phone means the data needs to be accessible somewhere other than your laptop. Running Tauri on mobile isn't practical, and I didn't want to build two native apps.</p>
<p><strong>Continuous processing.</strong> Background agents that monitor for triggers, process new content, and maintain index freshness need a persistent runtime.</p>
<p>The solution: a hybrid architecture where the desktop app becomes a thin client that can sync local data, but a cloud server handles background processing and provides API access.</p>
<hr />
<h2 id="heading-choosing-the-stack-bun-hono">Choosing the Stack: Bun + Hono</h2>
<p>I needed a backend stack with fast cold starts (for serverless later), native TypeScript support (to share types with the frontend), minimal boilerplate, and great DX.</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Option</td><td>Pros</td><td>Cons</td></tr>
</thead>
<tbody>
<tr>
<td>Express</td><td>Mature, huge ecosystem</td><td>Verbose, slow cold starts</td></tr>
<tr>
<td>Fastify</td><td>Fast, schema validation</td><td>Still needs Node</td></tr>
<tr>
<td>Hono</td><td>Ultra-light, runs anywhere</td><td>Newer, smaller ecosystem</td></tr>
<tr>
<td>tRPC</td><td>Type-safe RPC, great DX</td><td>Overkill for REST-ish API</td></tr>
</tbody>
</table>
</div><p><strong>Bun + Hono won.</strong></p>
<h3 id="heading-buns-built-in-sqlite">Bun's Built-in SQLite</h3>
<p>Bun ships with native SQLite bindings:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { Database } <span class="hljs-keyword">from</span> <span class="hljs-string">'bun:sqlite'</span>;

<span class="hljs-keyword">const</span> sqlite = <span class="hljs-keyword">new</span> Database(<span class="hljs-string">'./data/aethas.db'</span>);
sqlite.exec(<span class="hljs-string">'PRAGMA journal_mode = WAL'</span>);
</code></pre>
<p>This eliminated my biggest Node.js pain point: cross-platform SQLite compilation. Anyone who's fought with <code>better-sqlite3</code> on different platforms knows the struggle. We may still switch to Postgres later, but this lets me iterate quickly and leverage some really nice tools for RAG and tokens.</p>
<h3 id="heading-honos-simplicity">Hono's Simplicity</h3>
<p>The entire server setup:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> app = <span class="hljs-keyword">new</span> Hono();

app.use(<span class="hljs-string">'*'</span>, logger());
app.use(<span class="hljs-string">'*'</span>, cors({ origin: [<span class="hljs-string">'http://localhost:1420'</span>], credentials: <span class="hljs-literal">true</span> }));

app.route(<span class="hljs-string">'/api/conversations'</span>, conversationsRoutes);
app.route(<span class="hljs-string">'/api/chat'</span>, chatRoutes);
app.route(<span class="hljs-string">'/api/sources'</span>, sourcesRoutes);

<span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> { port: <span class="hljs-number">3000</span>, fetch: app.fetch };
</code></pre>
<p>The server starts in under 100ms. Hono's SSE streaming support is equally clean, which matters for chat applications where streaming UX is everything.</p>
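<p>For a sense of why streaming stays clean: SSE frames are just newline-delimited text over a kept-open response. Here is a minimal frame encoder sketch (a hypothetical helper to show the wire format, not Hono's actual streaming API):</p>

```typescript
// Minimal SSE frame encoder (hypothetical helper, not Hono's API).
// Each frame is optional "id:"/"event:" lines plus one or more "data:"
// lines, terminated by a blank line.
interface SSEMessage {
  id?: string;
  event?: string;
  data: string;
}

function encodeSSE(msg: SSEMessage): string {
  const lines: string[] = [];
  if (msg.id) lines.push(`id: ${msg.id}`);
  if (msg.event) lines.push(`event: ${msg.event}`);
  // Multi-line payloads become one "data:" line per payload line.
  for (const line of msg.data.split('\n')) {
    lines.push(`data: ${line}`);
  }
  return lines.join('\n') + '\n\n';
}
```

<p>Hono's streaming helpers handle this framing for you; the sketch is just to show there's no magic underneath.</p>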
<p>It's great to skip webpack and get near-instant hot reloads straight from TypeScript:</p>
<pre><code class="lang-bash">bun run --hot src/index.ts
</code></pre>
<p>Development velocity depends on tight feedback loops, especially when you're using an AI tool like Claude to help you code: anything longer than two seconds and you lose flow.</p>
<hr />
<h2 id="heading-database-layer-drizzle">Database Layer: Drizzle</h2>
<p>For the ORM, I chose Drizzle over Prisma or raw SQL.</p>
<p>Drizzle schemas are TypeScript-first, giving compile-time checks and autocompletion. The query syntax mirrors SQL directly. I <strong>hate</strong> "magic" in ORMs: I want no surprises, and I want to stay as close to SQL as possible. And unlike Prisma's query engine, Drizzle compiles to thin wrappers around your database driver. Docker images stay small, startup stays fast.</p>
<p>The tradeoff: Drizzle's ecosystem is smaller than Prisma's. So far it's been pretty good. It's probably one of the best ORM experiences I've had with SQL, though I'll probably never get used to importing operators and using them.</p>
<hr />
<h2 id="heading-monorepo-structure">Monorepo Structure</h2>
<p>I restructured to a pnpm workspace monorepo:</p>
<pre><code>aethas/
  apps/
    web/          # React frontend (Vite)
    desktop/      # Tauri app (wraps web)
  server/         # Bun + Hono backend
  packages/
    shared/       # TypeScript types shared across all
</code></pre><p>The <code>@aethas/shared</code> package contains every API contract, every event type, every data structure. Changes propagate to both frontend and backend compilation automatically. The TypeScript compiler catches interface mismatches.</p>
<p>This was the biggest productivity win of the refactor. Start with shared types early. I've found over the years that shared types can be one of the largest pains in any system, especially in JavaScript. The monorepo approach also means I don't have to publish yet another package to npm just to share them.</p>
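<p>To make the idea concrete, here's a hypothetical contract in the style of <code>@aethas/shared</code> (the names are illustrative, not the real Aethas schema):</p>

```typescript
// Hypothetical shared contract -- illustrative names, not the actual
// Aethas schema. Both the Hono routes and the React client import these.
export interface ChatRequest {
  conversationId: string;
  message: string;
}

export type ChatEvent =
  | { type: 'token'; text: string }
  | { type: 'done'; messageId: string }
  | { type: 'error'; message: string };

// A shared type guard keeps event handling consistent on both sides.
export function isDone(e: ChatEvent): e is Extract<ChatEvent, { type: 'done' }> {
  return e.type === 'done';
}
```

<p>Rename a field in this file and both the server and the frontend fail to compile, which is exactly the mismatch-catching the post describes.</p>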
<hr />
<h2 id="heading-development-environment">Development Environment</h2>
<p>For local development, I use Docker Compose with Tilt for orchestration.</p>
<p>The Dockerfile is minimal thanks to Bun:</p>
<pre><code class="lang-dockerfile"><span class="hljs-keyword">FROM</span> oven/bun:<span class="hljs-number">1</span>
<span class="hljs-keyword">WORKDIR</span><span class="bash"> /app</span>
<span class="hljs-keyword">COPY</span><span class="bash"> package.json bun.lockb* ./</span>
<span class="hljs-keyword">RUN</span><span class="bash"> bun install</span>
<span class="hljs-keyword">COPY</span><span class="bash"> . .</span>
<span class="hljs-keyword">EXPOSE</span> <span class="hljs-number">3000</span>
<span class="hljs-keyword">CMD</span><span class="bash"> [<span class="hljs-string">"bun"</span>, <span class="hljs-string">"run"</span>, <span class="hljs-string">"src/index.ts"</span>]</span>
</code></pre>
<p>Tilt coordinates the server container and local web dev. Running <code>tilt up</code> starts everything with live reload and a dashboard showing all services. The whole stack comes up in seconds. Tilt gives me a lot of flexibility on top of docker-compose while staying close to a live production setup, which makes deployment easier. Plus, I can switch to a local Kubernetes cluster later and get close to a mirror of my production deployments.</p>
<hr />
<h2 id="heading-lessons-learned">Lessons Learned</h2>
<p><strong>Start with shared types.</strong> Every API contract in one package. TypeScript catches mismatches at compile time. This saved me hours of debugging.</p>
<p><strong>Design for streaming from the start.</strong> Don't bolt SSE onto a request/response API later. The event types, streaming routes, and frontend handlers should be designed together.</p>
<p><strong>SQLite scales further than you think.</strong> I debated PostgreSQL early on. For a single-user app with tens of thousands of documents, SQLite with WAL mode handles everything. The operational simplicity — one file, works everywhere — is worth more than theoretical scale.</p>
<p><strong>Monorepo friction is real.</strong> pnpm workspaces work well, but the mental model of "which package am I in?" takes adjustment. Clear naming conventions help: <code>@aethas/web</code>, <code>@aethas/server</code>, <code>@aethas/shared</code>.</p>
<p><strong>Hot reload or bust.</strong> Bun's instant restarts, Vite's HMR, and Tilt's live updates keep the feedback cycle under 2 seconds. Anything longer and you lose flow.</p>
<hr />
<h2 id="heading-what-this-unlocks">What This Unlocks</h2>
<p>The hybrid architecture unlocks the proactive AI roadmap:</p>
<ul>
<li>Calendar integration — prepare meeting briefs before events</li>
<li>Email triage — surface relevant context for incoming messages</li>
<li>Background agents — process new content continuously</li>
<li>Mobile access — same API, different client</li>
</ul>
<p>The foundation is set!</p>
]]></content:encoded></item><item><title><![CDATA[Building a Sync Status Indicator]]></title><description><![CDATA[Aethas indexes local files like Obsidian vaults, or markdown folders for context-aware conversations. This happens at startup, when users manually trigger a re-index, and continuously via file watchers. After some use I found it annoying to guess if ...]]></description><link>https://blog.aethas.ai/building-a-sync-status-indicator</link><guid isPermaLink="true">https://blog.aethas.ai/building-a-sync-status-indicator</guid><category><![CDATA[React]]></category><category><![CDATA[TypeScript]]></category><category><![CDATA[state-machines]]></category><category><![CDATA[Tauri]]></category><category><![CDATA[Build In Public]]></category><dc:creator><![CDATA[Stephen Ashmore]]></dc:creator><pubDate>Tue, 13 Jan 2026 15:00:16 GMT</pubDate><content:encoded><![CDATA[<p>Aethas indexes local files like Obsidian vaults, or markdown folders for context-aware conversations. This happens at startup, when users manually trigger a re-index, and continuously via file watchers. After some use I found it annoying to guess if a file was synced when I made changes.</p>
<p>This is a small feature, but I thought it would be fun to talk about the design challenge. The sync indicator should be informative but not intrusive. Ideally as I build out some of the later features in my roadmap, we'll see the sync indicator move around and become less prominent.</p>
<hr />
<h2 id="heading-state-machine-design">State Machine Design</h2>
<p>I love state machines, so I modeled sync status as a state machine with four states:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">type</span> SyncStatus = <span class="hljs-string">'idle'</span> | <span class="hljs-string">'syncing'</span> | <span class="hljs-string">'error'</span> | <span class="hljs-string">'watching'</span>;
</code></pre>
<div class="hn-table">
<table>
<thead>
<tr>
<td>State</td><td>Meaning</td><td>Visual</td></tr>
</thead>
<tbody>
<tr>
<td><code>idle</code></td><td>All sources up to date</td><td>Green checkmark, "Last sync: 2 min ago"</td></tr>
<tr>
<td><code>syncing</code></td><td>Startup or manual sync in progress</td><td>Spinner, "Syncing: Source Name"</td></tr>
<tr>
<td><code>watching</code></td><td>File watcher detected changes</td><td>Spinner, "Indexing: filename.md"</td></tr>
<tr>
<td><code>error</code></td><td>Sync completed with errors</td><td>Amber warning, "Sync error"</td></tr>
</tbody>
</table>
</div><p>The transitions:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1768163058286/ef12a313-11ca-4604-b082-6716b351e004.png" alt class="image--center mx-auto" /></p>
<p>The key insight: <code>watching</code> is a transient state. When file watchers detect changes, I briefly show indexing activity, then automatically return to <code>idle</code> after 2 seconds of inactivity. This prevents the indicator from flickering during rapid edits while still providing feedback.</p>
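<p>The transitions boil down to a small pure function. This is my sketch of the rules described above (a full sync always takes priority, and only the transient <code>watching</code> state decays back to <code>idle</code>), not the exact store code:</p>

```typescript
type SyncStatus = 'idle' | 'syncing' | 'error' | 'watching';

type SyncEvent =
  | 'SYNC_STARTED'   // startup or manual sync begins
  | 'SYNC_OK'        // sync finished cleanly
  | 'SYNC_FAILED'    // sync finished with errors
  | 'FILE_CHANGED'   // file watcher detected a change
  | 'WATCH_TIMEOUT'; // 2s of watcher silence

// Transition table sketch based on the diagram, not the actual store code.
function transition(status: SyncStatus, event: SyncEvent): SyncStatus {
  switch (event) {
    case 'SYNC_STARTED':
      return 'syncing'; // a full sync always takes over
    case 'SYNC_OK':
      return 'idle';
    case 'SYNC_FAILED':
      return 'error';
    case 'FILE_CHANGED':
      return status === 'syncing' ? 'syncing' : 'watching'; // never demote a full sync
    case 'WATCH_TIMEOUT':
      return status === 'watching' ? 'idle' : status; // only the transient state decays
  }
}
```

<p>Keeping the transitions in one pure function makes them trivially unit-testable, which is most of the appeal of modeling this as a state machine.</p>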
<hr />
<h2 id="heading-the-debounce-pattern">The Debounce Pattern</h2>
<p>The <code>watching</code> state needs special handling. If a user is actively editing, I don't want the indicator flickering on every keystroke save (the file watchers can be that fast).</p>
<p>The solution: a debounce timeout that clears activity after 2 seconds of silence.</p>
<pre><code class="lang-typescript">setFileIndexed: <span class="hljs-function">(<span class="hljs-params">sourceId, sourceName, filePath</span>) =&gt;</span>
  set(<span class="hljs-function">(<span class="hljs-params">state</span>) =&gt;</span> ({
    <span class="hljs-comment">// Only transition to 'watching' if not already in a full sync</span>
    status: state.status === <span class="hljs-string">'syncing'</span> ? <span class="hljs-string">'syncing'</span> : <span class="hljs-string">'watching'</span>,
    watcherActivity: { sourceId, sourceName, filePath },
  })),

clearWatcherActivity: <span class="hljs-function">() =&gt;</span>
  set(<span class="hljs-function">(<span class="hljs-params">state</span>) =&gt;</span> ({
    status: state.status === <span class="hljs-string">'watching'</span> ? <span class="hljs-string">'idle'</span> : state.status,
    watcherActivity: <span class="hljs-literal">null</span>,
  })),
</code></pre>
<p>The component resets the timeout on each file event:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">if</span> (watcherActivityTimeoutRef.current) {
  <span class="hljs-built_in">clearTimeout</span>(watcherActivityTimeoutRef.current);
}
watcherActivityTimeoutRef.current = <span class="hljs-built_in">setTimeout</span>(<span class="hljs-function">() =&gt;</span> {
  clearWatcherActivity();
}, <span class="hljs-number">2000</span>);
</code></pre>
<p>User saves a file, indicator shows "Indexing: notes.md", then fades back to idle. User rapid-fires saves while editing, indicator stays on "Indexing" until they pause.</p>
<hr />
<h2 id="heading-cross-platform-events">Cross-Platform Events</h2>
<p>Aethas runs in two modes: desktop (Tauri) with full file system access, and web connecting to a backend server. The sync status needs to work in both.</p>
<p>Tauri has its own event system: the Rust backend emits events, the frontend subscribes via <code>@tauri-apps/api/event</code>. But that doesn't exist in web mode.</p>
<p>I created an abstraction that mirrors Tauri's API:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">listen</span>&lt;<span class="hljs-title">T</span>&gt;(<span class="hljs-params">
  event: <span class="hljs-built_in">string</span>,
  callback: (event: { payload: T }) =&gt; <span class="hljs-built_in">void</span>
</span>): <span class="hljs-title">Promise</span>&lt;(<span class="hljs-params"></span>) =&gt; <span class="hljs-title">void</span>&gt; </span>{
  <span class="hljs-keyword">if</span> (isTauri()) {
    <span class="hljs-keyword">const</span> { listen: tauriListen } = <span class="hljs-keyword">await</span> <span class="hljs-keyword">import</span>(<span class="hljs-string">'@tauri-apps/api/event'</span>);
    <span class="hljs-keyword">return</span> tauriListen&lt;T&gt;(event, callback);
  } <span class="hljs-keyword">else</span> {
    <span class="hljs-keyword">return</span> chatEvents.on&lt;T&gt;(event, <span class="hljs-function">(<span class="hljs-params">payload</span>) =&gt;</span> callback({ payload }));
  }
}
</code></pre>
<p>Components just call <code>listen()</code> without knowing which mode they're in. The abstraction handles the rest. We'll be making major changes to this in the future; I've already needed to use Aethas from my phone and other clients, and I fully expect to refactor Aethas to make Tauri a thin client.</p>
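<p>The web-mode <code>chatEvents</code> side isn't shown above, but a minimal stand-in with the same <code>on(event, cb)</code>-returns-unsubscribe shape could look like this (a sketch, not the actual implementation):</p>

```typescript
// Minimal web-mode event bus sketch -- a stand-in for chatEvents, which
// the post doesn't show. on() returns an unsubscribe function, mirroring
// the unlisten function Tauri's listen() resolves to.
type Handler<T> = (payload: T) => void;

class EventBus {
  private handlers = new Map<string, Set<Handler<unknown>>>();

  on<T>(event: string, handler: Handler<T>): () => void {
    if (!this.handlers.has(event)) this.handlers.set(event, new Set());
    const set = this.handlers.get(event)!;
    set.add(handler as Handler<unknown>);
    return () => set.delete(handler as Handler<unknown>);
  }

  emit<T>(event: string, payload: T): void {
    this.handlers.get(event)?.forEach((h) => h(payload));
  }
}
```

<p>In web mode, the server's SSE stream would feed <code>emit()</code>, so components get the same payload shape either way.</p>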
<hr />
<h2 id="heading-animation-trick">Animation Trick</h2>
<p>One subtle detail in the UI component. I render all three icons and use CSS to show/hide them:</p>
<pre><code class="lang-tsx">&lt;Loader2 className={cn('animate-spin', status !== 'syncing' &amp;&amp; 'hidden')} /&gt;
&lt;AlertTriangle className={cn(status !== 'error' &amp;&amp; 'hidden')} /&gt;
&lt;Check className={cn((status === 'syncing' || status === 'error') &amp;&amp; 'hidden')} /&gt;
</code></pre>
<p>Why not conditionally render? Because React would unmount and remount the spinner on state changes, restarting the animation from the beginning: a jarring visual jump. There could be a better way to do this, but this one is quite smooth.</p>
<hr />
<h2 id="heading-race-condition-on-mount">Race Condition on Mount</h2>
<p>What if the UI mounts after startup sync has already begun? The "started" event already fired. We'd miss it.</p>
<p>I handle this by checking sync status on mount:</p>
<pre><code class="lang-typescript">useEffect(<span class="hljs-function">() =&gt;</span> {
  isSyncing().then(<span class="hljs-function">(<span class="hljs-params">syncing</span>) =&gt;</span> {
    <span class="hljs-keyword">if</span> (syncing) {
      setStarted({ source_count: <span class="hljs-number">0</span> });
    }
  });
}, []);
</code></pre>
<p>The Rust backend exposes this via an atomic boolean. Events may fire before your component mounts, so always synchronize initial state.</p>
<hr />
<h2 id="heading-error-ux-philosophy">Error UX Philosophy</h2>
<p>Errors get collected during sync and displayed after completion. But I intentionally don't show a modal or toast for sync errors.</p>
<p>Why? Sync failures are usually recoverable. File temporarily locked, network hiccup, permission issue that resolves itself. They'll fix on the next sync. Intrusive error UX trains users to ignore notifications.</p>
<p>Instead: the indicator turns amber, the dropdown shows an error count, curious users can expand to see details. I might change this behavior in the future, but for debugging purposes it works pretty well. A resync usually fixes everything.</p>
<hr />
<h2 id="heading-key-takeaways">Key Takeaways</h2>
<ol>
<li><p><strong>Model sync as a state machine.</strong> Explicit states with defined transitions make the logic predictable, debuggable, and testable.</p>
</li>
<li><p><strong>Use transient states for ephemeral activity.</strong> The <code>watching</code> state with auto-timeout prevents flicker during rapid changes.</p>
</li>
<li><p><strong>Abstract platform differences early.</strong> A unified event system means components work identically in Tauri and web modes.</p>
</li>
<li><p><strong>CSS visibility over conditional rendering for animations.</strong> Keeps animations smooth across state transitions.</p>
</li>
<li><p><strong>Match error UX to error severity.</strong> Not every error needs a modal. Background errors deserve subtle indicators.</p>
</li>
<li><p><strong>Check state on mount.</strong> Events may fire before your component mounts. Always synchronize.</p>
</li>
</ol>
<hr />
<p><em>Building Aethas in public. Follow along at</em> <a target="_blank" href="https://aethas.ai"><em>aethas.ai</em></a></p>
]]></content:encoded></item><item><title><![CDATA[Building a File Upload System for an AI Assistant]]></title><description><![CDATA[Aethas indexes Obsidian vaults and other sources, making that content searchable via RAG. Users can @mention files from their indexed sources. But what about that PDF someone just sent you? Or a code file from a different project?
The friction: copy ...]]></description><link>https://blog.aethas.ai/building-a-file-upload-system-for-an-ai-assistant</link><guid isPermaLink="true">https://blog.aethas.ai/building-a-file-upload-system-for-an-ai-assistant</guid><category><![CDATA[React]]></category><category><![CDATA[Build In Public]]></category><category><![CDATA[TypeScript]]></category><category><![CDATA[AI]]></category><category><![CDATA[#ai-tools]]></category><dc:creator><![CDATA[Stephen Ashmore]]></dc:creator><pubDate>Thu, 08 Jan 2026 15:00:36 GMT</pubDate><content:encoded><![CDATA[<p>Aethas indexes Obsidian vaults and other sources, making that content searchable via RAG. Users can @mention files from their indexed sources. But what about that PDF someone just sent you? Or a code file from a different project?</p>
<p>The friction: copy file into indexed source, wait for indexing, then @mention it. We wanted: drop file into chat, ask your question.</p>
<p>This post walks through how we built the drag-and-drop file upload system.</p>
<hr />
<h2 id="heading-architecture">Architecture</h2>
<p>The system has three layers:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1767939208277/2f546c7e-3ebd-47cf-a1af-c70990d73c22.png" alt class="image--center mx-auto" /></p>
<hr />
<h2 id="heading-frontend-the-drop-zone">Frontend: The Drop Zone</h2>
<p>The heart of the UX is <code>FileDropZone.tsx</code>, a wrapper component that makes any area file-droppable. The component handles drag events, validates files, reads their contents, and adds them to state.</p>
<p>The interesting bits are in the details.</p>
<h3 id="heading-file-type-detection">File Type Detection</h3>
<p>We support text-based files using both extension and MIME type detection:</p>
<pre><code class="lang-typescript"><span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">isFileSupported</span>(<span class="hljs-params">file: File</span>): <span class="hljs-title">boolean</span> </span>{
  <span class="hljs-comment">// Check extension first (more reliable)</span>
  <span class="hljs-keyword">const</span> extension = getFileExtension(file.name);
  <span class="hljs-keyword">if</span> (extension &amp;&amp; SUPPORTED_EXTENSIONS.includes(extension)) {
    <span class="hljs-keyword">return</span> <span class="hljs-literal">true</span>;
  }
  <span class="hljs-comment">// Fall back to MIME type</span>
  <span class="hljs-keyword">if</span> (SUPPORTED_MIME_TYPES.includes(file.type)) {
    <span class="hljs-keyword">return</span> <span class="hljs-literal">true</span>;
  }
  <span class="hljs-comment">// Catch-all for anything text-like</span>
  <span class="hljs-keyword">if</span> (file.type.startsWith(<span class="hljs-string">'text/'</span>)) {
    <span class="hljs-keyword">return</span> <span class="hljs-literal">true</span>;
  }
  <span class="hljs-keyword">return</span> <span class="hljs-literal">false</span>;
}
</code></pre>
<p>Why the dual approach? Different browsers report different MIME types for the same file. A <code>.ts</code> file might be <code>application/typescript</code>, <code>text/typescript</code>, or empty depending on the browser and OS. Extension checking is more reliable, MIME types are the fallback, and <code>text/*</code> is the safety net.</p>
<h3 id="heading-visual-feedback">Visual Feedback</h3>
<p>Users see attached files through <code>UploadedFilesChips</code> including data such as file type emoji, name (truncated if long), size in human-readable format, and a remove button. Small thing, but immediate visual feedback makes drag-and-drop feel responsive.</p>
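<p>The human-readable size is the kind of tiny helper that's easy to get subtly wrong. A hypothetical version of what the chips might use (not the actual Aethas code):</p>

```typescript
// Hypothetical size formatter for the file chips: bytes below 1 KB,
// one decimal place for KB and MB.
function formatFileSize(bytes: number): string {
  if (bytes < 1024) return `${bytes} B`;
  if (bytes < 1024 * 1024) return `${(bytes / 1024).toFixed(1)} KB`;
  return `${(bytes / (1024 * 1024)).toFixed(1)} MB`;
}
```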
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/YJl7LIA3ovw">https://youtu.be/YJl7LIA3ovw</a></div>
<p> </p>
<hr />
<h2 id="heading-state-management">State Management</h2>
<p>Uploaded files live in the chat store, separate from @mentioned files:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">interface</span> ChatState {
  <span class="hljs-comment">// @mentioned files - reference indexed content by ID</span>
  selectedFiles: SelectedFile[];

  <span class="hljs-comment">// Dropped files - ephemeral, contain full content</span>
  uploadedFiles: UploadedFile[];
}
</code></pre>
<p>The key distinction: selected files reference indexed content (we look up content via RAG), while uploaded files contain full content in memory. They're ephemeral, not persisted anywhere, just context for this conversation.</p>
<p>We enforce a combined limit of 5 files across both types. Users might @mention some indexed files and upload others in the same message, so the limit applies to the total.</p>
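<p>Once both counts live in the same store, the combined limit is a one-line guard. A sketch (the constant name and helper are assumed, not the real code):</p>

```typescript
// Combined attachment limit sketch: @mentions and uploads count together.
const MAX_ATTACHED_FILES = 5; // assumed constant name

function canAttachMore(selectedCount: number, uploadedCount: number): boolean {
  return selectedCount + uploadedCount < MAX_ATTACHED_FILES;
}

// Handy for disabling the drop zone and showing "N slots left" in the UI.
function remainingSlots(selectedCount: number, uploadedCount: number): number {
  return Math.max(0, MAX_ATTACHED_FILES - selectedCount - uploadedCount);
}
```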
<hr />
<h2 id="heading-server-assembling-context">Server: Assembling Context</h2>
<p>When a user sends a message, the frontend sends both the message and any attached files. The server receives these and needs to build a prompt for Claude.</p>
<p>For now, we're keeping it simple: uploaded file contents get added directly to the prompt alongside the user's message. The format looks like:</p>
<pre><code class="lang-markdown"><span class="hljs-section">## Files Attached by User</span>

The user attached these files directly to this message:

<span class="hljs-section">### config.yaml</span>
```yaml
[file contents here]
```

<span class="hljs-section">### utils.ts</span>
```typescript
[file contents here]
```

---

[User's actual message here]
</code></pre>
<p>This works, but it's naive. We're not managing token budgets, not prioritizing content, not handling the case where files are too large. The files just get concatenated.</p>
<p>In a future post on Context Architecture, we'll revisit this and show how uploaded files fit into a larger system, one that manages RAG results, conversation history, tool outputs, and token budgets to prevent context rot. But that's getting ahead of ourselves. For now, concatenation gets us to a working feature.</p>
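<p>The concatenation step above can be sketched as a single function (the names are illustrative, not the actual server code):</p>

```typescript
// Naive context assembly sketch, mirroring the prompt format shown above.
// No token budgeting or prioritization -- files are simply concatenated.
interface UploadedFile {
  name: string;
  language: string; // fence hint, e.g. 'yaml' or 'typescript'
  content: string;
}

function buildPrompt(userMessage: string, files: UploadedFile[]): string {
  if (files.length === 0) return userMessage;
  const sections = files.map(
    (f) => `### ${f.name}\n\`\`\`${f.language}\n${f.content}\n\`\`\``
  );
  return [
    '## Files Attached by User',
    '',
    'The user attached these files directly to this message:',
    '',
    sections.join('\n\n'),
    '',
    '---',
    '',
    userMessage,
  ].join('\n');
}
```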
<hr />
<h2 id="heading-challenges-we-solved">Challenges We Solved</h2>
<h3 id="heading-drag-event-bubbling">Drag Event Bubbling</h3>
<p>HTML drag events bubble in unexpected ways. Dragging over a child element fires <code>dragLeave</code> on the parent even though you're still inside the drop zone. This causes the overlay to flicker as you move the cursor.</p>
<p>The fix: check if the event's <code>relatedTarget</code> is still inside the container before clearing the dragging state.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> handleDragLeave = useCallback(<span class="hljs-function">(<span class="hljs-params">e: DragEvent&lt;HTMLDivElement&gt;</span>) =&gt;</span> {
  e.preventDefault();
  e.stopPropagation();

  <span class="hljs-comment">// Only clear if actually leaving the zone, not entering a child</span>
  <span class="hljs-keyword">if</span> (e.relatedTarget &amp;&amp; e.currentTarget.contains(e.relatedTarget <span class="hljs-keyword">as</span> Node)) {
    <span class="hljs-keyword">return</span>;
  }
  setIsDragging(<span class="hljs-literal">false</span>);
}, []);
</code></pre>
<p>Small fix, but without it the UX feels broken.</p>
<h3 id="heading-file-reading-errors">File Reading Errors</h3>
<p>Not all files read cleanly as text. Binary files, encoding issues, permission problems. We handle errors gracefully and still show the file so users know which one failed:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">try</span> {
  <span class="hljs-keyword">const</span> content = <span class="hljs-keyword">await</span> readFileContent(file);
  addUploadedFile({ ...file, content, status: <span class="hljs-string">'ready'</span> });
} <span class="hljs-keyword">catch</span> (error) {
  addUploadedFile({ ...file, content: <span class="hljs-string">''</span>, status: <span class="hljs-string">'error'</span>, error: error.message });
}
</code></pre>
<p>The file chip shows an error state. Users can remove it and try a different file.</p>
<h3 id="heading-size-limits">Size Limits</h3>
<p>We enforce a 50KB per-file limit client-side. Large files would blow up the context window and provide diminishing returns anyway. If you're uploading a 500KB log file, you probably want to search it, not stuff it into a prompt. In future blog posts, we will be handling large files differently.</p>
<p>The limit is generous enough for code files and notes, restrictive enough to prevent accidents.</p>
<hr />
<h2 id="heading-key-takeaways">Key Takeaways</h2>
<ol>
<li><p><strong>Keep upload state ephemeral.</strong> Uploaded files don't need permanent storage. They're context for this conversation only.</p>
</li>
<li><p><strong>Validate client-side, format server-side.</strong> Client validation for UX (immediate feedback), server formatting for consistent prompt structure.</p>
</li>
<li><p><strong>Visual feedback matters.</strong> The drop overlay, file chips, and error toasts make drag-and-drop feel responsive. Without them, users don't trust it.</p>
</li>
<li><p><strong>Browser inconsistencies are real.</strong> MIME type detection, drag event bubbling — test across browsers or you'll get bug reports.</p>
</li>
<li><p><strong>Start simple, refine later.</strong> Concatenating files into the prompt isn't sophisticated, but it works. We'll build proper token budgeting when we tackle Context Architecture.</p>
</li>
</ol>
<hr />
<p><em>Building Aethas in public. Follow along at</em> <a target="_blank" href="https://blog.aethas.ai"><em>blog.aethas.ai</em></a></p>
]]></content:encoded></item><item><title><![CDATA[Building the AI Assistant I Always Wanted]]></title><description><![CDATA[Back in 1996, when I was just six years old, I watched Star Trek for the first time. I can remember sitting on the floor of my father's study in front of his cathode-ray television. We watched all of the Star Trek TV shows and films that had come out...]]></description><link>https://blog.aethas.ai/building-the-ai-assistant-i-always-wanted</link><guid isPermaLink="true">https://blog.aethas.ai/building-the-ai-assistant-i-always-wanted</guid><category><![CDATA[AI]]></category><category><![CDATA[Rust]]></category><category><![CDATA[obsidian]]></category><category><![CDATA[buildinpublic]]></category><category><![CDATA[side project]]></category><category><![CDATA[Tauri]]></category><dc:creator><![CDATA[Stephen Ashmore]]></dc:creator><pubDate>Sun, 04 Jan 2026 02:12:07 GMT</pubDate><content:encoded><![CDATA[<p>Back in 1996, when I was just six years old, I watched Star Trek for the first time. I can remember sitting on the floor of my father's study in front of his cathode-ray television. We watched all of the Star Trek TV shows and films that had come out over the years, but I think my first enthrallment with the world was The Next Generation. I vividly remember watching Data and Geordi work technological miracles to save the Enterprise in countless episodes. Imagining what it would be like to have dinner with Captain Picard and ask him what it was like to be Captain of the Enterprise. But behind all of those characters and stories, there was one singular constant: the computer.</p>
<p>The computer of 1996 was nothing like the computer of the Enterprise. My father and I used to play Doom cooperatively at his work over the local-area network. Some days, I went to work with him and spent time away in the data entry room. After the office closed and everyone had left, we would race rolling office chairs back and forth while Doom installed, and then try to beat levels together until my mom arrived to eat dinner with us.</p>
<p>It was mesmerizing to see what the future might hold for humans in Star Trek. The computer of the Enterprise could not only navigate the ship but also run an entire holodeck. It seemed to know everything that the characters knew and could respond whenever they needed it to. As I grew up and became more fascinated by computers, I saw how far off Gene Roddenberry's vision was from what we actually had.</p>
<p>Thirty years later, the world has dramatically shifted. Large language models have flooded the world with new capabilities and dangers. We may finally be on the cusp of a system that can be as useful as Jarvis from Iron Man or the Enterprise's computer.</p>
<hr />
<h2 id="heading-the-problem">The Problem</h2>
<p>I'm not one for idle hands. So when I found myself with a free two weeks of holiday time from my day job, I turned my attention to planning my 2026. The process was difficult. I had notes scattered across journals, my Obsidian vault, emails, chats, Slack, and various online tools.</p>
<p>I immediately thought about having Claude or ChatGPT try to parse through all my documents and get the context, but it wasn't quite that simple. For a long time, I've needed an executive assistant to help with all the product work, project management, and other parts of my day job. I needed something proactive though. A tool that could remind me, be autonomous, and know everything I needed to know.</p>
<p>I needed Tony Stark's Jarvis.</p>
<hr />
<h2 id="heading-so-i-started-building-it">So I Started Building It</h2>
<p>I named it <strong>Aethas</strong>, after one of my favorite Dungeons &amp; Dragons characters that I've played. Aethas was a fighter, but not without intellect. He was a tactician, prepared for every contingency, and carried multiple weapons designed to fell any enemy he came across. A system or AI that could do the crazy things that Jarvis could do would need to be equally well-prepared.</p>
<p>Here's where I got after about a week of work over the holidays.</p>
<hr />
<h2 id="heading-whats-working">What's Working</h2>
<p>To start, I revamped my Obsidian vault. I added new projects, archived old notes, and generally consolidated some of my disparate ideas. Then I built an app that could actually <em>use</em> all of that context.</p>
<p><strong>The core loop works like this:</strong></p>
<ol>
<li><p>Point Aethas at your Obsidian vault (or any folder of markdown files)</p>
</li>
<li><p>It indexes everything locally: parsing documents, chunking them intelligently, and generating embeddings</p>
</li>
<li><p>When you ask a question, it searches your knowledge base semantically</p>
</li>
<li><p>Relevant documents get injected into the conversation as context</p>
</li>
<li><p>The AI responds with actual knowledge of your notes</p>
</li>
</ol>
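<p>In code terms, the loop above boils down to chunk, embed, and search. Here's a rough TypeScript sketch of that shape. To be clear, this is not Aethas' actual implementation (the real indexing happens in Rust with a proper embedding model); <code>embed</code> here is a toy bag-of-words stand-in so the cosine-similarity search is visible end to end, and names like <code>indexVault</code> are mine:</p>

```typescript
// Toy sketch of the index-then-search loop. A real embedding model
// replaces `embed`; this stand-in just builds word-count vectors so
// the cosine math is visible end to end.

type Chunk = { file: string; text: string; vector: Map<string, number> };

// Hypothetical stand-in for a local embedding model: bag-of-words counts.
function embed(text: string): Map<string, number> {
  const v = new Map<string, number>();
  for (const word of text.toLowerCase().match(/[a-z0-9]+/g) ?? []) {
    v.set(word, (v.get(word) ?? 0) + 1);
  }
  return v;
}

// Cosine similarity between two sparse vectors.
function cosine(a: Map<string, number>, b: Map<string, number>): number {
  let dot = 0, na = 0, nb = 0;
  for (const [w, x] of a) { na += x * x; dot += x * (b.get(w) ?? 0); }
  for (const x of b.values()) nb += x * x;
  return na && nb ? dot / Math.sqrt(na * nb) : 0;
}

// Step 2: index a vault — naive paragraph chunking, one vector per chunk.
function indexVault(files: Record<string, string>): Chunk[] {
  return Object.entries(files).flatMap(([file, body]) =>
    body.split(/\n\s*\n/).map(text => ({ file, text, vector: embed(text) }))
  );
}

// Steps 3–4: semantic search returns the top-k chunks with scores.
function search(index: Chunk[], query: string, k = 3) {
  const q = embed(query);
  return index
    .map(c => ({ file: c.file, text: c.text, score: cosine(c.vector, q) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```

<p>The scores that come out of <code>search</code> are what the UI surfaces as relevance scores next to each matching note.</p>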
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/xoO6KlQe9GA">https://youtu.be/xoO6KlQe9GA</a></div>
<p>The UI shows matching notes with relevance scores. You can see which files the AI is drawing from and manually pin additional context using <code>@mentions</code>, similar to how Claude's file references work.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1767492406478/e78a337f-3db9-4bb7-8ab5-186e0ee6b13b.png" alt class="image--center mx-auto" /></p>
<p>I can ask things like <em>"What about the bugs we were looking at? Did we manage to fix the Lost Ability to Delete Chats issue?"</em> and Aethas pulls in the relevant files, shows me what it found, and gives me an answer grounded in my actual notes.</p>
<hr />
<h2 id="heading-the-technical-bits">The Technical Bits</h2>
<p>I'll write more detailed technical posts later, but here's the high-level stack:</p>
<p><strong>Desktop app built with Tauri 2.0</strong>: I wanted to try Rust for something real, and Tauri gives me a lightweight desktop app with a React frontend. The whole thing is under 20MB. I also chose Tauri because I want offline capability eventually, with my Obsidian vault staying local to my machine.</p>
<p><strong>Local embeddings</strong>: All the vector search happens on-device using a small embedding model. No API calls for indexing, which means it's fast and your notes never leave your machine.</p>
<p><strong>OpenRouter for LLM access</strong>: For LLMs, I hooked it up to OpenRouter, partly because I still had $30 of credit on my account. It also means I can switch models whenever I want to.</p>
<p><strong>SQLite for everything</strong>: I chose SQLite for its simplicity and speed. Everything lives in it: the conversations, indexed documents, and embeddings.</p>
<p>The interesting part is the context injection. When you send a message, Aethas:</p>
<ol>
<li><p>Searches your indexed vault semantically</p>
</li>
<li><p>Deduplicates to get the most relevant <em>files</em> (not just chunks)</p>
</li>
<li><p>Injects the full document content into the system prompt</p>
</li>
<li><p>Streams the response back in real-time</p>
</li>
</ol>
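<p>The middle two steps are the interesting ones, so here's a hedged TypeScript sketch of roughly what they do. The function names (<code>dedupeToFiles</code>, <code>buildSystemPrompt</code>) and the prompt wording are my illustrations, not Aethas' actual internals:</p>

```typescript
// Sketch of the context-injection step: search results arrive as scored
// chunks, but whole files get injected, so dedupe by file first.

type Hit = { file: string; score: number };

// Step 2: keep each file once, ranked by its best-scoring chunk.
function dedupeToFiles(hits: Hit[], maxFiles = 5): string[] {
  const seen = new Set<string>();
  const files: string[] = [];
  for (const h of [...hits].sort((a, b) => b.score - a.score)) {
    if (seen.has(h.file)) continue;
    seen.add(h.file);
    files.push(h.file);
    if (files.length === maxFiles) break;
  }
  return files;
}

// Step 3: inject full document content into the system prompt.
// `read` abstracts over however the file bodies are loaded.
function buildSystemPrompt(files: string[], read: (f: string) => string): string {
  const docs = files.map(f => `## ${f}\n${read(f)}`).join("\n\n");
  return `You are Aethas. Ground your answers in these notes:\n\n${docs}`;
}
```

<p>Capping the file count (here an arbitrary five) is the crude stand-in for real token budgeting, which is still future work.</p>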
<p>You can also explicitly reference files with <code>@filename</code>, which pins them into context with maximum priority. This is useful when you know exactly what you want to discuss.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1767492467199/06690e35-6484-4165-9a66-6c0054ab9bce.png" alt class="image--center mx-auto" /></p>
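<p>A minimal sketch of how that pinning could work. The regex and the merge logic here are my guesses at the mechanics, not the actual implementation: parse <code>@filename</code> tokens out of the message, then put pinned files ahead of the semantic search hits when assembling context, so explicit references always win:</p>

```typescript
// Sketch: extract @filename pins from a message and merge them ahead
// of semantically retrieved files, so explicit references take priority.

// Matches tokens like @notes.md or @project-plan (the pattern is a guess).
function extractPins(message: string): string[] {
  return [...message.matchAll(/@([\w./-]+)/g)].map(m => m[1]);
}

// Pinned files come first; search hits fill the remaining slots.
function mergeContext(pins: string[], searchHits: string[], max = 5): string[] {
  const out: string[] = [];
  for (const f of [...pins, ...searchHits]) {
    if (!out.includes(f)) out.push(f);
    if (out.length === max) break;
  }
  return out;
}
```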
<hr />
<h2 id="heading-whats-next">What's Next</h2>
<p>This is just the foundation. The vision for Aethas is an AI that can actually <em>act</em> on your behalf. I need Aethas to draft emails, create calendar events, and file tickets, but only with my approval before execution. I’m going to focus on the first of these actions soon.</p>
<p>I'm also thinking about proactive behavior: an assistant that notices you have a meeting in 30 minutes and surfaces relevant context without being asked. Or one that detects you have free time and asks if you want to review your drafted actions.</p>
<p>But that's future work. For now, I have an AI that finally knows what I know, and that alone is already useful. Alongside centralizing my notes into my Obsidian vault, I’m expanding Aethas’ storage and ingestion integrations. I want to add Google Drive and Slack as inputs to Aethas, so it can search my documents and Slack messages the way other local LLM systems do.</p>
<hr />
<h2 id="heading-follow-along">Follow Along</h2>
<p>I'm building Aethas in public. I'll be posting updates here as I ship new features, make architectural decisions, and inevitably break things.</p>
<p>If you want to follow along:</p>
<ul>
<li><p>Subscribe to this blog (button below)</p>
</li>
<li><p>Follow me on Twitter: @_StephenAshmore</p>
</li>
</ul>
<p>The code isn't public yet, but it might be eventually. We'll see.</p>
<hr />
<p><em>This is post #1 of building Aethas. Next up: either the action system and its draft-approve-execute flow, or the finer details of the context system.</em></p>
]]></content:encoded></item></channel></rss>