Building a Discord LLM Bot with OpenRouter, OpenAI, Gemini, and More

When I started tinkering with self-hosted large language models, my main goal was to see if I could actually use them to help with coding and automate simple tasks. After some successful experiments with basic web UIs, I realized most of my real interaction with friends and collaborators happens on Discord. So I wanted to bring this functionality to where I already communicate the most.

There are a few web-based tools out there—Open WebUI, for example—but I couldn’t find a solid, up-to-date Discord bot that worked with OpenRouter or supported multiple model providers. That’s when I decided to build my own, aiming for something flexible enough to swap out model providers for better features, lower costs, or just to stay on top of new developments.

Early Prototyping

My first working version was just a simple Python bot that replied to messages starting with "!", relaying prompts to an LLM via API calls and posting the answers back. No fancy features, no context memory, and definitely no image handling. Even so, getting a working AI reply inside Discord was already refreshing compared to the clunky web interfaces I'd been using.
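
For anyone curious, the core loop was tiny. Here's a minimal sketch of that first version, reconstructed in TypeScript with discord.js rather than the original Python; the model name and environment variable names are placeholders, not my real config:

```ts
import { Client, Events, GatewayIntentBits } from "discord.js";

const client = new Client({
  intents: [
    GatewayIntentBits.Guilds,
    GatewayIntentBits.GuildMessages,
    GatewayIntentBits.MessageContent, // must also be enabled in the developer portal
  ],
});

client.on(Events.MessageCreate, async (message) => {
  if (message.author.bot || !message.content.startsWith("!")) return;
  const prompt = message.content.slice(1).trim();

  // Relay the prompt to an OpenAI-compatible chat endpoint (OpenRouter here).
  const res = await fetch("https://openrouter.ai/api/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "openai/gpt-4o-mini", // placeholder model
      messages: [{ role: "user", content: prompt }], // single turn, no memory
    }),
  });
  const data = (await res.json()) as { choices: { message: { content: string } }[] };

  // Discord caps messages at 2000 characters.
  await message.reply(data.choices[0].message.content.slice(0, 2000));
});

client.login(process.env.DISCORD_TOKEN);
```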

Real-World Issues and Scaling Woes

Letting my friends loose on the bot quickly exposed every edge case. They’d manage to break it every time I added a feature—usually by sending messages or commands I hadn’t even considered. Error handling and input validation became a daily grind.

Trying to support more than two providers revealed architectural problems. Each AI platform's API has its own quirks: different authentication schemes, slightly different request methods, and subtle differences in output format. My original approach led to duplicated code and made adding new providers a pain. Eventually, I started rewriting the whole thing in TypeScript for better structure and maintainability. With the improved codebase, I could finally start building modular support for additional providers.
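
The shape that refactor settled into is roughly a provider interface with one adapter per platform. A simplified sketch (the interface and names here are illustrative, not my actual codebase):

```ts
// Each platform's quirks (auth, request shape, response parsing)
// live behind one adapter per provider.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

interface ChatProvider {
  readonly name: string;
  chat(messages: ChatMessage[]): Promise<string>;
}

class OpenRouterProvider implements ChatProvider {
  readonly name = "openrouter";
  constructor(private apiKey: string, private model: string) {}

  async chat(messages: ChatMessage[]): Promise<string> {
    const res = await fetch("https://openrouter.ai/api/v1/chat/completions", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${this.apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model: this.model, messages }),
    });
    if (!res.ok) throw new Error(`${this.name}: HTTP ${res.status}`);
    const data = (await res.json()) as { choices: { message: { content: string } }[] };
    return data.choices[0].message.content;
  }
}

// Adding Gemini or Anthropic means writing one more class,
// not threading another special case through the bot logic.
```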

There was another problem: OpenRouter doesn’t support image generation. Image capabilities turned out to be the most popular feature request among my friends. To cover that gap, I integrated AI Horde for general-purpose image generation, and even spun up some Cloudflare Workers to bridge image models like flux1 into Discord.
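
The Worker side of that bridge is small: the bot POSTs a prompt, the Worker runs a flux model on Workers AI, and raw image bytes come back ready to attach to a Discord message. A sketch, assuming Cloudflare's flux-1-schnell model and its base64 JSON response shape as documented at the time of writing:

```ts
// Cloudflare Worker sketch: prompt in, image bytes out.
export interface Env {
  AI: Ai; // Workers AI binding, configured in wrangler.toml
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    if (request.method !== "POST") {
      return new Response('POST a JSON body like {"prompt": "..."}', { status: 405 });
    }
    const { prompt } = (await request.json()) as { prompt: string };

    // flux-1-schnell returns a base64-encoded image in a JSON field.
    const result = (await env.AI.run("@cf/black-forest-labs/flux-1-schnell", {
      prompt,
    })) as { image: string };

    const bytes = Uint8Array.from(atob(result.image), (c) => c.charCodeAt(0));
    return new Response(bytes, { headers: { "Content-Type": "image/jpeg" } });
  },
};
```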

Key Features and What Worked

  • Image Generation: By far the most used (and requested) feature. Seeing what the AI comes up with is a big hit in group chats.
  • Discord Thread Support: Keeping interactions contained to threads has been a big improvement. It makes conversations cleaner, more manageable, and closer to the Open WebUI experience where each chat is its own space (see the sketch after this list).
  • Multiple Provider Support: With the TypeScript refactor, swapping or adding new AI providers is easier. I’m working toward seamless support for all major APIs, so I (or others) can pick the right model for the job, or fall back if something is down or too expensive.
  • Other Experiments: I tried adding things like a dungeon master/game mode for tabletop-RPG-style sessions and an RSS news aggregator. The game mode never really hit the quality I wanted, but it was useful for learning how to better structure code for more complex interactions.
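
The thread handling is one of my favorite small pieces. Here's a simplified sketch of the flow with discord.js v14 (the helper name is mine, not the real code):

```ts
import { Message, ThreadAutoArchiveDuration } from "discord.js";

// Reply inside a thread so each conversation stays in its own space.
async function replyInThread(message: Message<true>, answer: string) {
  const thread = message.channel.isThread()
    ? message.channel // already mid-conversation, keep using the same thread
    : await message.startThread({
        name: message.content.slice(0, 80) || "LLM chat",
        autoArchiveDuration: ThreadAutoArchiveDuration.OneHour,
      });

  await thread.send(answer.slice(0, 2000));
}
```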

Today’s Example: Fixing File Loading in Different Environments

Today I ran into a common but frustrating dev problem. The bot would load and run perfectly using npm run dev on my machine, but once I built the Docker container for production, the command modules stopped loading. When I finally adjusted things to work in Docker production (npm start), it broke again in development.

It turns out the root cause was file extension handling in the command loader (CommandHandler.ts). In dev mode, the bot loads TypeScript (.ts) files directly via tools like ts-node; in production (or Docker), everything has been compiled to JavaScript (.js). My loader was hardcoded to expect commands of only one filetype, so functionality silently went missing depending on the environment.

Fixing this meant updating the command loading mechanism to dynamically check which file extension to use based on the current execution environment. Now it reliably loads .ts files during development and .js files in production, making both npm run dev and Docker deployments work.
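
The gist of the fix, simplified from the real CommandHandler.ts (the directory layout and detection trick here are illustrative):

```ts
import { readdirSync } from "node:fs";
import { join } from "node:path";
import { pathToFileURL } from "node:url";

// If this file itself still ends in .ts, we're running under ts-node in dev;
// once tsc has compiled everything for the Docker image, it ends in .js.
const ext = __filename.endsWith(".ts") ? ".ts" : ".js";

export async function loadCommands(dir: string): Promise<Map<string, unknown>> {
  const commands = new Map<string, unknown>();
  // Only pick up files matching the extension for this environment.
  for (const file of readdirSync(dir).filter((f) => f.endsWith(ext))) {
    const mod = await import(pathToFileURL(join(dir, file)).href);
    commands.set(file.slice(0, -ext.length), mod.default);
  }
  return commands;
}
```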

What’s Next

My focus now is on finalizing support for all major LLM providers—OpenAI, Gemini, Anthropic, and maybe more—so users can switch as needed or use fallback options for cost and reliability. I also want to make self-hosting more straightforward so others can run their own instance and adapt it without running into the same issues I did in the early stages.

Recap

Building this bot has been a back-and-forth of hitting technical snags, fixing problems, rewiring features, and learning along the way. There’s more to do (and always something to break or improve), but being able to pull LLMs and AI-generated images into Discord has already been pretty rewarding for the groups I chat with.

If you’re interested in trying it, need help, or want to contribute, stay tuned: this bot will be replacing my current project, Gideon. Until then, please give Gideon a look over, and thanks for reading.

  • eoko