RAG.io
A Retrieval-Augmented Generation platform: search your documents semantically and chat with them through the LLM provider of your choice.
Current Features
- Intelligent Document Search : ChromaDB-powered semantic search with adjustable retrieval parameters
- Multi-Provider LLM Support : Seamless integration with OpenAI, Claude, Gemini, Ollama, and 10+ providers
- Temperature Control : Per-conversation adjustment (0.0-2.0)
- Project-Based Organization : Isolate document collections and conversations by project
- Real-Time Streaming : Server-Sent Events for progressive responses
- Fine-Grained Control : Per-conversation top-k and context window management
- Enterprise Security : JWT authentication, AES-256 encryption, GDPR compliance
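The document-search step above boils down to ranking stored embeddings by similarity to the query and returning the top-k matches. A minimal sketch, using plain cosine similarity over toy vectors in place of ChromaDB (which handles embedding and indexing in the real app):

```python
# Illustrative only: ChromaDB performs this ranking internally.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, doc_vecs, k=3):
    """Return indices of the k documents most similar to the query."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]

docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
print(top_k([1.0, 0.0], docs, k=2))  # → [0, 1]
```

The adjustable retrieval parameter mentioned above (top-k) corresponds to `k` here: higher values pull more context into the prompt at the cost of more tokens.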
Advanced Features
- Supported Formats : PDF, DOCX, TXT, MD, HTML, CSV, JSON (50+ file types)
- Smart Chunking : Adaptive chunk size (100-2000 tokens) with configurable overlap
- Metadata Extraction : Automatic filename, page number, and document type tagging
- Token Tracking : Real-time token counting for cost estimation
- Batch Processing : Background async processing with progress tracking
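The chunking step above can be sketched as a sliding window over the token stream, with the chunk size and overlap exposed as the configurable parameters mentioned in the list. The sketch below approximates tokens with whitespace-split words; the real pipeline would use the model's tokenizer:

```python
# Hypothetical sketch of smart chunking: fixed-size windows with overlap.
def chunk_tokens(tokens, chunk_size=500, overlap=50):
    """Split `tokens` into overlapping windows of at most `chunk_size`."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break
    return chunks

words = ("lorem ipsum " * 300).split()          # 600 pseudo-tokens
chunks = chunk_tokens(words, chunk_size=200, overlap=20)
```

Overlap ensures a sentence cut at a chunk boundary still appears whole in one of the two adjacent chunks, which keeps retrieval from missing context that straddles a boundary.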
Providers Supported
- OpenAI : GPT-4o, GPT-4-turbo, o1-preview, o1-mini
- Anthropic Claude : Claude 3.5 Sonnet, Claude 3 Opus
- Google Gemini : Gemini 1.5 Pro/Flash, Gemini 2.0
- OpenRouter : 200+ models (free + paid)
- xAI Grok : Grok-3, Grok-3-mini, Grok-3-vision
- Groq : Mixtral, LLaMA 3, Gemma
- HuggingFace : Zephyr, Mistral, LLaMA 2
- Ollama : Local models via `ollama pull llama3`
- LM Studio : Local desktop GUI with a built-in server
- vLLM : Self-hosted inference (Python + CUDA)
- LMDeploy : Self-hosted inference (Python + TurboMind)
- Oobabooga : Text generation web UI
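Most of the providers above (including the local ones) expose OpenAI-compatible endpoints, so switching providers can reduce to a configuration change. A sketch of how such a registry might look; the provider names, URLs, and `resolve` helper are illustrative, not the app's actual configuration:

```python
# Hypothetical provider registry. Local backends serve on localhost;
# the ports shown are the common defaults for Ollama and LM Studio.
PROVIDERS = {
    "openai":   {"base_url": "https://api.openai.com/v1", "local": False},
    "groq":     {"base_url": "https://api.groq.com/openai/v1", "local": False},
    "ollama":   {"base_url": "http://localhost:11434/v1", "local": True},
    "lmstudio": {"base_url": "http://localhost:1234/v1", "local": True},
}

def resolve(provider: str) -> dict:
    """Look up connection settings for a named provider."""
    if provider not in PROVIDERS:
        raise KeyError(f"unknown provider: {provider}")
    return PROVIDERS[provider]
```

Because every entry speaks the same wire protocol, the chat code stays identical whether requests go to a hosted API or a model running on your own machine.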
About
This project is licensed under the MIT License - see the LICENSE file for details.
Contact
Built with ❤️ for the LLM community.
For questions, suggestions, or support, please open an issue or contact the maintainers.