Building a File Upload System for an AI Assistant

Aethas indexes Obsidian vaults and other sources, making that content searchable via RAG. Users can @mention files from their indexed sources. But what about that PDF someone just sent you? Or a code file from a different project?

The friction: copy file into indexed source, wait for indexing, then @mention it. We wanted: drop file into chat, ask your question.

This post walks through how we built the drag-and-drop file upload system.

Architecture

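The system has three layers: the drop zone in the frontend, the chat store that holds uploaded files, and the server that assembles them into the prompt.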

Frontend: The Drop Zone

The heart of the UX is FileDropZone.tsx, a wrapper component that makes any area file-droppable. The component handles drag events, validates files, reads their contents, and adds them to state.

The interesting bits are in the details.

File Type Detection

We support text-based files using both extension and MIME type detection:

function isFileSupported(file: File): boolean {
  // Check extension first (more reliable)
  const extension = getFileExtension(file.name);
  if (extension && SUPPORTED_EXTENSIONS.includes(extension)) {
    return true;
  }
  // Fall back to MIME type
  if (SUPPORTED_MIME_TYPES.includes(file.type)) {
    return true;
  }
  // Catch-all for anything text-like
  if (file.type.startsWith('text/')) {
    return true;
  }
  return false;
}

Why the dual approach? Different browsers report different MIME types for the same file. A .ts file might be application/typescript, text/typescript, or empty depending on the browser and OS. Extension checking is more reliable, MIME types are the fallback, and text/* is the safety net.
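The check above leans on a helper and two lists that aren't shown. Here is a minimal sketch of how they might look. The contents of SUPPORTED_EXTENSIONS and SUPPORTED_MIME_TYPES and the body of getFileExtension are assumptions for illustration, not the actual Aethas values. FileLike is a hypothetical stand-in for the browser File type so the snippet runs outside a browser.

```typescript
// Minimal shape of the fields the check needs; a browser File satisfies it.
interface FileLike {
  name: string;
  type: string;
}

// Hypothetical lists: the real ones are not shown in this post.
const SUPPORTED_EXTENSIONS: readonly string[] = ['md', 'txt', 'ts', 'json', 'yaml'];
const SUPPORTED_MIME_TYPES: readonly string[] = ['application/json', 'application/x-yaml'];

// Returns the lowercase extension without the dot, or null if there is none.
function getFileExtension(fileName: string): string | null {
  const dotIndex = fileName.lastIndexOf('.');
  // No dot, a leading dot only (".gitignore"), or a trailing dot ("notes.")
  if (dotIndex <= 0 || dotIndex === fileName.length - 1) {
    return null;
  }
  return fileName.slice(dotIndex + 1).toLowerCase();
}

// Same three-step order as above: extension, then MIME type, then text/*.
function isFileSupported(file: FileLike): boolean {
  const extension = getFileExtension(file.name);
  if (extension !== null && SUPPORTED_EXTENSIONS.includes(extension)) return true;
  if (SUPPORTED_MIME_TYPES.includes(file.type)) return true;
  return file.type.startsWith('text/');
}
```

Lowercasing the extension means utils.TS and utils.ts are treated the same, and a .ts file with an empty MIME type still passes on the extension check alone.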

Visual Feedback

Attached files appear as chips rendered by UploadedFilesChips. Each chip shows a file-type emoji, the name (truncated if long), the size in human-readable format, and a remove button. Small thing, but immediate visual feedback makes drag-and-drop feel responsive.
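The size and name formatting on each chip can be done with two small pure functions. This is a sketch under assumptions: formatFileSize and truncateFileName are hypothetical names, and the 24-character default and the choice to keep the extension visible are illustrative, not taken from the component.

```typescript
// Formats a byte count as B, KB, or MB using 1024-based units.
function formatFileSize(bytes: number): string {
  if (bytes < 1024) return `${bytes} B`;
  const kb = bytes / 1024;
  if (kb < 1024) return `${kb.toFixed(1)} KB`;
  return `${(kb / 1024).toFixed(1)} MB`;
}

// Shortens long names but keeps the extension, so the file type stays readable.
function truncateFileName(name: string, maxLength = 24): string {
  if (name.length <= maxLength) return name;
  const dotIndex = name.lastIndexOf('.');
  const extension = dotIndex > 0 ? name.slice(dotIndex) : '';
  // Reserve room for the extension and one ellipsis character.
  const keep = Math.max(1, maxLength - extension.length - 1);
  return `${name.slice(0, keep)}…${extension}`;
}
```

Keeping these as plain functions, separate from the chip component, makes them easy to unit test without rendering anything.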

https://youtu.be/YJl7LIA3ovw

State Management

Uploaded files live in the chat store, separate from @mentioned files:

interface ChatState {
  // @mentioned files - reference indexed content by ID
  selectedFiles: SelectedFile[];

  // Dropped files - ephemeral, contain full content
  uploadedFiles: UploadedFile[];
}

The key distinction: selected files reference indexed content (we look up content via RAG), while uploaded files carry their full content in memory. Uploaded files are ephemeral: they are not persisted anywhere and serve only as context for the current conversation.

We enforce a combined limit of 5 files across both types. Users might @mention some indexed files and upload others in the same message, so the limit applies to the total.
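A combined limit like this can be expressed as a small helper that counts both lists together. The names below (MAX_ATTACHED_FILES, remainingSlots, takeWithinLimit) are hypothetical. Accepting the files that fit and rejecting the rest is one possible policy, assumed here for illustration; rejecting the whole drop would be another.

```typescript
const MAX_ATTACHED_FILES = 5;

interface AttachmentCounts {
  selectedCount: number; // @mentioned files already attached
  uploadedCount: number; // dropped files already attached
}

// How many more files can be attached to the current message.
function remainingSlots(counts: AttachmentCounts): number {
  return Math.max(0, MAX_ATTACHED_FILES - counts.selectedCount - counts.uploadedCount);
}

// Accepts as many incoming files as fit and reports the rest as rejected.
function takeWithinLimit<T>(
  incoming: readonly T[],
  counts: AttachmentCounts,
): { accepted: T[]; rejected: T[] } {
  const slots = remainingSlots(counts);
  return { accepted: incoming.slice(0, slots), rejected: incoming.slice(slots) };
}
```

With two @mentioned files and one uploaded file already attached, dropping three more would accept the first two and leave the third to be reported back to the user.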

Server: Assembling Context

When a user sends a message, the frontend sends both the message and any attached files. The server receives these and needs to build a prompt for Claude.

For now, we're keeping it simple: uploaded file contents get added directly to the prompt alongside the user's message. The format looks like:

## Files Attached by User

The user attached these files directly to this message:

### config.yaml
yaml
[file contents here]


### utils.ts
typescript
[file contents here]

---

[User's actual message here]

This works, but it's naive. We're not managing token budgets, not prioritizing content, not handling the case where files are too large. The files just get concatenated.

In a future post on Context Architecture, we'll revisit this and show how uploaded files fit into a larger system, one that manages RAG results, conversation history, tool outputs, and token budgets to prevent context rot. But that's getting ahead of ourselves. For now, concatenation gets us to a working feature.
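For reference, the concatenation step could look roughly like the sketch below. It assumes each file's contents are wrapped in a code fence labeled with a language derived from the extension. buildPrompt, fenceLanguage, and the extension map are hypothetical names and values, not the actual server code.

```typescript
interface AttachedFile {
  name: string;
  content: string;
}

// Built with repeat() so this example can itself sit inside a fenced block.
const FENCE = '`'.repeat(3);

// Hypothetical extension-to-language map for the fence label.
const FENCE_LANGUAGES = new Map<string, string>([
  ['yaml', 'yaml'],
  ['yml', 'yaml'],
  ['ts', 'typescript'],
  ['json', 'json'],
  ['md', 'markdown'],
]);

// Returns the fence label for a file name, or an empty string if unknown.
function fenceLanguage(fileName: string): string {
  const dotIndex = fileName.lastIndexOf('.');
  const extension = dotIndex > 0 ? fileName.slice(dotIndex + 1).toLowerCase() : '';
  return FENCE_LANGUAGES.get(extension) ?? '';
}

// Prepends one section per file, then a separator, then the user's message.
function buildPrompt(message: string, files: readonly AttachedFile[]): string {
  if (files.length === 0) return message;
  const sections = files.map(
    (file) => `### ${file.name}\n${FENCE}${fenceLanguage(file.name)}\n${file.content}\n${FENCE}`,
  );
  return [
    '## Files Attached by User',
    'The user attached these files directly to this message:',
    ...sections,
    '---',
    message,
  ].join('\n\n');
}
```

One caveat with this sketch: a file whose contents already include a triple-backtick fence would close the wrapper early, so a more careful version would pick a longer fence or escape the contents.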

Challenges We Solved

Drag Event Bubbling

HTML drag events bubble in unexpected ways. Dragging over a child element fires dragLeave on the parent even though you're still inside the drop zone. This causes the overlay to flicker as you move the cursor.

The fix: check if the event's relatedTarget is still inside the container before clearing the dragging state.

const handleDragLeave = useCallback((e: DragEvent<HTMLDivElement>) => {
  e.preventDefault();
  e.stopPropagation();

  // Only clear if actually leaving the zone, not entering a child
  if (e.relatedTarget && e.currentTarget.contains(e.relatedTarget as Node)) {
    return;
  }
  setIsDragging(false);
}, []);

Small fix, but without it the UX feels broken.

File Reading Errors

Not all files read cleanly as text. Binary files, encoding issues, permission problems. We handle errors gracefully and still show the file so users know which one failed:

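try {
  const content = await readFileContent(file);
  // Copy fields explicitly: File properties are prototype getters, so spreading a File copies nothing
  addUploadedFile({ name: file.name, size: file.size, type: file.type, content, status: 'ready' });
} catch (error) {
  // A caught value is not guaranteed to be an Error instance
  const message = error instanceof Error ? error.message : String(error);
  addUploadedFile({ name: file.name, size: file.size, type: file.type, content: '', status: 'error', error: message });
}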

The file chip shows an error state. Users can remove it and try a different file.
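The post doesn't show readFileContent itself, so here is one way it might be written, as a sketch under stated assumptions. It relies on the text() method that browser File and Blob objects provide, which decodes the bytes as UTF-8 and substitutes U+FFFD for invalid sequences rather than throwing. The looksBinary heuristic and the TextReadable interface are hypothetical additions for illustration.

```typescript
// Minimal shape needed here; browser File and Blob objects provide text().
interface TextReadable {
  text(): Promise<string>;
}

// Heuristic: NUL characters or U+FFFD replacement characters suggest
// the bytes were not valid UTF-8 text.
function looksBinary(content: string): boolean {
  return content.includes('\u0000') || content.includes('\uFFFD');
}

// Reads the file as text and rejects content that looks binary,
// so the caller's catch block can mark the chip as failed.
async function readFileContent(file: TextReadable): Promise<string> {
  const content = await file.text();
  if (looksBinary(content)) {
    throw new Error('File does not appear to be UTF-8 text');
  }
  return content;
}
```

The heuristic is deliberately blunt: a legitimate text file that happens to contain U+FFFD would also be rejected, which may or may not be acceptable depending on the files you expect.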

Size Limits

We enforce a 50KB per-file limit client-side. Large files would blow up the context window and provide diminishing returns anyway. If you're uploading a 500KB log file, you probably want to search it, not stuff it into a prompt. We'll cover a different approach to large files in a future post.

The limit is generous enough for code files and notes, restrictive enough to prevent accidents.
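A client-side size check along these lines is enough to enforce the limit before any file is read. This is a minimal sketch: checkFileSize and the SizeCheck result type are hypothetical, and it assumes 50KB means 50 × 1024 bytes.

```typescript
// Assumption: "50KB" is interpreted as 50 * 1024 bytes.
const MAX_FILE_SIZE_BYTES = 50 * 1024;

// Minimal shape needed; a browser File has both fields.
interface SizedFile {
  name: string;
  size: number;
}

type SizeCheck = { ok: true } | { ok: false; reason: string };

// Returns ok, or a user-facing reason when the file exceeds the limit.
function checkFileSize(file: SizedFile): SizeCheck {
  if (file.size > MAX_FILE_SIZE_BYTES) {
    const sizeKb = Math.ceil(file.size / 1024);
    return { ok: false, reason: `${file.name} is ${sizeKb} KB; the limit is 50 KB` };
  }
  return { ok: true };
}
```

Checking file.size before reading means an oversized file is rejected without loading its contents into memory.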

Key Takeaways

Keep upload state ephemeral. Uploaded files don't need permanent storage. They're context for this conversation only.
Validate client-side, format server-side. Client validation for UX (immediate feedback), server formatting for consistent prompt structure.
Visual feedback matters. The drop overlay, file chips, and error toasts make drag-and-drop feel responsive. Without them, users don't trust it.
Browser inconsistencies are real. MIME type detection, drag event bubbling: test across browsers or you'll get bug reports.
Start simple, refine later. Concatenating files into the prompt isn't sophisticated, but it works. We'll build proper token budgeting when we tackle Context Architecture.

Building Aethas in public. Follow along at blog.aethas.ai
