llm-service
A fast, secure local LLM inference service with API-key authentication, rate limiting, latency tracking, and structured logging. Runs open-source models locally, exposes both custom and OpenAI-compatible endpoints, and includes a dashboard for testing and monitoring. Docker-ready and easy to deploy.
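
Because the service exposes OpenAI-compatible endpoints behind API-key authentication, any OpenAI-style client can talk to it directly. Below is a minimal sketch of such a request; the base URL, port, API key, model name, and the `/v1/chat/completions` path are assumptions about a typical deployment, not values defined by this repository:

```python
import requests

# Placeholder values -- adjust host, port, key, and model name
# to match your own deployment of llm-service.
BASE_URL = "http://localhost:8000/v1"
API_KEY = "sk-local-example-key"

resp = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "llama-3-8b-instruct",  # placeholder model name
        "messages": [{"role": "user", "content": "Hello!"}],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

If the endpoint paths match OpenAI's, the official `openai` client should also work by pointing its `base_url` at the service and passing the same API key.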