Building a File Upload System for an AI Assistant
Aethas indexes Obsidian vaults and other sources, making that content searchable via RAG. Users can @mention files from their indexed sources. But what about that PDF someone just sent you? Or a code file from a different project?
The friction: copy file into indexed source, wait for indexing, then @mention it. We wanted: drop file into chat, ask your question.
This post walks through how we built the drag-and-drop file upload system.
Architecture
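The system has three layers: the frontend drop zone, the chat store that holds uploaded files, and the server that assembles the prompt.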

Frontend: The Drop Zone
The heart of the UX is FileDropZone.tsx, a wrapper component that makes any area file-droppable. The component handles drag events, validates files, reads their contents, and adds them to state.
The interesting bits are in the details.
File Type Detection
We support text-based files using both extension and MIME type detection:
function isFileSupported(file: File): boolean {
  // Check extension first (more reliable)
  const extension = getFileExtension(file.name);
  if (extension && SUPPORTED_EXTENSIONS.includes(extension)) {
    return true;
  }
  // Fall back to MIME type
  if (SUPPORTED_MIME_TYPES.includes(file.type)) {
    return true;
  }
  // Catch-all for anything text-like
  if (file.type.startsWith('text/')) {
    return true;
  }
  return false;
}
Why the dual approach? Different browsers report different MIME types for the same file. A .ts file might be application/typescript, text/typescript, or empty depending on the browser and OS. Extension checking is more reliable, MIME types are the fallback, and text/* is the safety net.
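The function above leans on a helper and two lists that aren't shown. Here is a minimal sketch of what they might look like. The list contents and the hasSupportedExtension helper are illustrative assumptions, not the actual Aethas values.

```typescript
// Illustrative lists only (assumptions); the real ones are likely longer.
const SUPPORTED_EXTENSIONS: readonly string[] = [
  'md', 'txt', 'ts', 'tsx', 'js', 'json', 'yaml', 'yml', 'py',
];
const SUPPORTED_MIME_TYPES: readonly string[] = [
  'application/json', 'application/typescript', 'application/x-yaml',
];

// Returns the lowercased extension without the dot, or null when there is none.
function getFileExtension(fileName: string): string | null {
  const lastDot = fileName.lastIndexOf('.');
  // No dot, a leading dot only (".gitignore"), or a trailing dot ("notes.")
  if (lastDot <= 0 || lastDot === fileName.length - 1) {
    return null;
  }
  return fileName.slice(lastDot + 1).toLowerCase();
}

// Hypothetical convenience wrapper for the extension-first check.
function hasSupportedExtension(fileName: string): boolean {
  const extension = getFileExtension(fileName);
  return extension !== null && SUPPORTED_EXTENSIONS.includes(extension);
}
```

Lowercasing the extension means utils.TS and utils.ts are treated the same, which matters on case-insensitive file systems.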
Visual Feedback
Users see attached files as chips rendered by UploadedFilesChips. Each chip shows a file-type emoji, the name (truncated if long), the size in human-readable form, and a remove button. It's a small thing, but immediate visual feedback makes drag-and-drop feel responsive.
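Two small formatting helpers cover most of what a chip displays. The sketch below is one way to write them. The function names, the 24-character default, and the choice to truncate in the middle (so the extension stays visible) are assumptions, not the component's actual behavior.

```typescript
// Formats a byte count as B, KB, or MB (1 KB = 1024 bytes here).
function formatFileSize(bytes: number): string {
  if (bytes < 1024) return `${bytes} B`;
  const kb = bytes / 1024;
  if (kb < 1024) return `${kb.toFixed(1)} KB`;
  return `${(kb / 1024).toFixed(1)} MB`;
}

// Shortens long names by cutting the middle, so the extension stays visible.
function truncateFileName(fileName: string, maxLength = 24): string {
  if (fileName.length <= maxLength) return fileName;
  const keep = maxLength - 1; // one character is spent on the ellipsis
  const head = Math.ceil(keep / 2);
  const tail = Math.floor(keep / 2);
  return `${fileName.slice(0, head)}\u2026${fileName.slice(fileName.length - tail)}`;
}
```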
State Management
Uploaded files live in the chat store, separate from @mentioned files:
interface ChatState {
  // @mentioned files - reference indexed content by ID
  selectedFiles: SelectedFile[];
  // Dropped files - ephemeral, contain full content
  uploadedFiles: UploadedFile[];
}
The key distinction: selected files reference indexed content (we look up content via RAG), while uploaded files contain full content in memory. They're ephemeral, not persisted anywhere, just context for this conversation.
We enforce a combined limit of 5 files across both types. Users might @mention some indexed files and upload others in the same message, so the limit applies to the total.
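The combined limit can be sketched as a pure function over the two lists. The field shapes for SelectedFile and UploadedFile below are guesses based on how they're used in this post, and remainingSlots and addUploadedFiles are hypothetical names.

```typescript
const MAX_ATTACHED_FILES = 5;

// Assumed shapes; the real types likely carry more fields.
interface SelectedFile { id: string; name: string; }
interface UploadedFile {
  name: string;
  size: number;
  content: string;
  status: 'ready' | 'error';
  error?: string;
}
interface AttachmentState {
  selectedFiles: SelectedFile[];
  uploadedFiles: UploadedFile[];
}

// How many more files (of either kind) can still be attached.
function remainingSlots(state: AttachmentState): number {
  const used = state.selectedFiles.length + state.uploadedFiles.length;
  return Math.max(0, MAX_ATTACHED_FILES - used);
}

// Adds as many dropped files as fit and reports how many were rejected.
function addUploadedFiles(
  state: AttachmentState,
  incoming: UploadedFile[],
): { state: AttachmentState; rejected: number } {
  const accepted = incoming.slice(0, remainingSlots(state));
  return {
    state: { ...state, uploadedFiles: [...state.uploadedFiles, ...accepted] },
    rejected: incoming.length - accepted.length,
  };
}
```

Returning the rejected count lets the UI tell the user that some files were dropped because of the limit, instead of silently ignoring them.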
Server: Assembling Context
When a user sends a message, the frontend sends both the message and any attached files. The server receives these and needs to build a prompt for Claude.
For now, we're keeping it simple: uploaded file contents get added directly to the prompt alongside the user's message. The format looks like:
## Files Attached by User
The user attached these files directly to this message:
### config.yaml
yaml
[file contents here]
### utils.ts
typescript
[file contents here]
---
[User's actual message here]
This works, but it's naive. We're not managing token budgets, not prioritizing content, not handling the case where files are too large. The files just get concatenated.
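As a rough sketch, the concatenation could look like the function below. It assumes each file is wrapped in a triple-backtick fence tagged with a language derived from its extension, which is how the format above appears to be laid out. The names buildPrompt, fenceLanguage, and AttachedFile, along with the extension map, are hypothetical.

```typescript
interface AttachedFile { name: string; content: string; }

// Maps a few extensions to fence language tags; unknown ones get no tag.
const FENCE_LANGUAGES = new Map<string, string>([
  ['ts', 'typescript'], ['tsx', 'tsx'], ['js', 'javascript'],
  ['yaml', 'yaml'], ['yml', 'yaml'], ['json', 'json'],
  ['md', 'markdown'], ['py', 'python'],
]);

function fenceLanguage(fileName: string): string {
  const lastDot = fileName.lastIndexOf('.');
  const extension = lastDot > 0 ? fileName.slice(lastDot + 1).toLowerCase() : '';
  return FENCE_LANGUAGES.get(extension) ?? '';
}

// Prepends every attached file to the user's message, with no token budgeting.
function buildPrompt(message: string, files: AttachedFile[]): string {
  if (files.length === 0) return message;
  const fence = '`'.repeat(3);
  const sections = files.map((file) =>
    [`### ${file.name}`, `${fence}${fenceLanguage(file.name)}`, file.content, fence].join('\n'),
  );
  return [
    '## Files Attached by User',
    'The user attached these files directly to this message:',
    ...sections,
    '---',
    message,
  ].join('\n');
}
```

One caveat with this sketch: a file that itself contains a triple-backtick fence would close the wrapper early, so a real implementation would need a longer fence or some escaping.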
In a future post on Context Architecture, we'll revisit this and show how uploaded files fit into a larger system, one that manages RAG results, conversation history, tool outputs, and token budgets to prevent context rot. But that's getting ahead of ourselves. For now, concatenation gets us to a working feature.
Challenges We Solved
Drag Event Bubbling
HTML drag events bubble in unexpected ways. Dragging over a child element fires dragleave on the parent even though the cursor is still inside the drop zone, which makes the overlay flicker as the cursor moves.
The fix: check if the event's relatedTarget is still inside the container before clearing the dragging state.
const handleDragLeave = useCallback((e: DragEvent<HTMLDivElement>) => {
  e.preventDefault();
  e.stopPropagation();
  // Only clear if actually leaving the zone, not entering a child
  if (e.relatedTarget && e.currentTarget.contains(e.relatedTarget as Node)) {
    return;
  }
  setIsDragging(false);
}, []);
Small fix, but without it the UX feels broken.
File Reading Errors
Not all files read cleanly as text. Binary files, encoding issues, permission problems. We handle errors gracefully and still show the file so users know which one failed:
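try {
  const content = await readFileContent(file);
  // If `file` is a browser File, name/size/type are prototype getters, so `...file` would drop them
  addUploadedFile({ name: file.name, size: file.size, type: file.type, content, status: 'ready' });
} catch (error) {
  // `error` is `unknown` under strict TypeScript; narrow before reading `.message`
  const message = error instanceof Error ? error.message : String(error);
  addUploadedFile({ name: file.name, size: file.size, type: file.type, content: '', status: 'error', error: message });
}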
The file chip shows an error state. Users can remove it and try a different file.
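The post doesn't show readFileContent itself. One plausible sketch, assuming it relies on the standard Blob.text() method (which always decodes as UTF-8), is below. The looksLikeText heuristic and the TextReadable type are assumptions added for illustration, and the real implementation may well use FileReader instead.

```typescript
// Structural type so the sketch accepts File and Blob alike.
interface TextReadable { text(): Promise<string>; }

// Heuristic: NUL characters, or U+FFFD replacement characters produced by
// failed UTF-8 decoding, suggest the file was not plain text.
function looksLikeText(content: string): boolean {
  return !content.includes('\u0000') && !content.includes('\uFFFD');
}

async function readFileContent(file: TextReadable): Promise<string> {
  const content = await file.text();
  if (!looksLikeText(content)) {
    throw new Error('File does not appear to be UTF-8 text');
  }
  return content;
}
```

The heuristic can misfire on a legitimate text file that happens to contain U+FFFD, so treat it as a cheap first filter, not a guarantee.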
Size Limits
We enforce a 50 KB per-file limit client-side. Large files would blow up the context window and provide diminishing returns anyway. If you're uploading a 500 KB log file, you probably want to search it, not stuff it into a prompt. We'll cover a different approach to large files in a future post.
The limit is generous enough for code files and notes, restrictive enough to prevent accidents.
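A size check like this is a few lines. The sketch below assumes 1 KB means 1024 bytes; checkFileSize and SizeCheck are hypothetical names, not the actual code.

```typescript
const MAX_FILE_SIZE_BYTES = 50 * 1024; // 50 KB, assuming 1 KB = 1024 bytes

type SizeCheck = { ok: true } | { ok: false; reason: string };

// Accepts anything with a name and size, so it works with File objects.
function checkFileSize(file: { name: string; size: number }): SizeCheck {
  if (file.size > MAX_FILE_SIZE_BYTES) {
    const sizeKb = (file.size / 1024).toFixed(1);
    return { ok: false, reason: `${file.name} is ${sizeKb} KB; the limit is 50 KB` };
  }
  return { ok: true };
}
```

Returning a reason string instead of a bare boolean gives the file chip something specific to show in its error state.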
Key Takeaways
Keep upload state ephemeral. Uploaded files don't need permanent storage. They're context for this conversation only.
Validate client-side, format server-side. Client validation for UX (immediate feedback), server formatting for consistent prompt structure.
Visual feedback matters. The drop overlay, file chips, and error toasts make drag-and-drop feel responsive. Without them, users don't trust it.
Browser inconsistencies are real. MIME type detection and drag event bubbling both vary, so test across browsers or you'll get bug reports.
Start simple, refine later. Concatenating files into the prompt isn't sophisticated, but it works. We'll build proper token budgeting when we tackle Context Architecture.
Building Aethas in public. Follow along at blog.aethas.ai

