This tool allows you to "chat" with your documents. Instead of relying solely on general knowledge, the AI retrieves specific information from the files and links you provide and uses it to give you accurate, sourced answers.
We initially experimented with the local sentence-transformers/all-MiniLM-L6-v2 model.
However, to keep the cloud deployment lightweight and reduce server-side compute, we switched to an API-based embedding approach using OpenAI.
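A minimal sketch of that embedding approach. The function name `embed_texts` and the model name `text-embedding-3-small` are assumptions, not taken from this document; when no OpenAI client is supplied, the sketch falls back to a deterministic hash-based mock so it can run offline.

```python
import hashlib

def embed_texts(texts, client=None, model="text-embedding-3-small"):
    """Return one embedding vector per input text.

    With an OpenAI client (openai>=1.0 style), call the Embeddings API;
    without one (e.g. in local tests), use a deterministic mock.
    """
    if client is not None:
        # Real API path: one request for the whole batch of texts.
        resp = client.embeddings.create(model=model, input=texts)
        return [item.embedding for item in resp.data]
    # Mock path: 8 floats derived from the MD5 digest of each text.
    vectors = []
    for text in texts:
        digest = hashlib.md5(text.encode("utf-8")).digest()
        vectors.append([b / 255.0 for b in digest[:8]])
    return vectors
```

Batching all texts into a single API call, as above, is what keeps the server-side compute low: the heavy lifting happens on OpenAI's side.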
We conducted a Proof of Concept using RAGAS for benchmarking.
To keep API token load low in production, we replaced the RAGAS framework with a lightweight, built-in "LLM-as-a-judge" to verify accuracy on the fly.
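A minimal sketch of such an "LLM-as-a-judge" check, under stated assumptions: the prompt wording, the `VERDICT:` convention, and the function names are illustrative, not taken from the actual implementation. The idea is one extra cheap model call per answer, whose reply is parsed into a badge label.

```python
def build_judge_prompt(question, answer, sources):
    """Assemble a grading prompt that asks the model to check the
    answer against the retrieved source passages."""
    context = "\n---\n".join(sources)
    return (
        "You are a strict grader. Given the sources below, reply with\n"
        "'VERDICT: grounded' if the answer is fully supported by them,\n"
        "or 'VERDICT: unsupported' otherwise.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}\nAnswer: {answer}"
    )

def parse_verdict(reply):
    """Map the judge's raw reply to a badge label."""
    text = reply.lower()
    if "verdict: grounded" in text:
        return "grounded"
    if "verdict: unsupported" in text:
        return "unsupported"
    return "unknown"
```

Compared with a full RAGAS evaluation run, this costs a single short completion per answer, which is what keeps the production token load low.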
The application is fully containerized using Docker. This ensures consistent performance and streamlines the CI/CD pipeline.
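A sketch of what such a container build might look like. The base image, port, file names, and entrypoint below are assumptions for illustration, not the project's actual Dockerfile.

```dockerfile
# Minimal sketch; base image, port, and entrypoint are assumptions.
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["python", "app.py"]
```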
The live site is hosted on Microsoft Azure App Service. The Docker image is pushed to a private Azure Container Registry (ACR).
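The push to ACR typically looks like the following; the registry name `myregistry` and image name `rag-chat` are placeholders, not the project's actual values.

```shell
# Authenticate against the private registry, then build and push.
az acr login --name myregistry
docker build -t myregistry.azurecr.io/rag-chat:latest .
docker push myregistry.azurecr.io/rag-chat:latest
```

App Service then pulls the tagged image from ACR on deployment.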
This application uses a technique called Retrieval-Augmented Generation (RAG): your documents are split into chunks and indexed as embeddings, the chunks most relevant to your question are retrieved, and the model generates an answer grounded in those chunks.
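The retrieval step of a RAG pipeline can be sketched as follows. This is a minimal illustration assuming embeddings have already been computed; the function names are hypothetical, and in the real pipeline the retrieved chunks would then be passed to the LLM as context for generation.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_vec, index, top_k=2):
    """Rank stored (text, vector) chunks by similarity to the query
    and return the top_k chunk texts."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:top_k]]
```

Production systems usually delegate this ranking to a vector database rather than a linear scan, but the principle is the same.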
Any document you add via the "Add Knowledge" tab is considered temporary. To keep the database clean and efficient, user uploads are automatically deleted 1 hour after creation.
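A minimal sketch of that one-hour cleanup rule, assuming each stored document carries a `created_at` timestamp; the function and field names are illustrative, not the actual schema.

```python
from datetime import datetime, timedelta, timezone

TTL = timedelta(hours=1)  # uploads expire one hour after creation

def expired(created_at, now=None):
    """True once an upload is more than one hour old."""
    now = now or datetime.now(timezone.utc)
    return now - created_at > TTL

def purge(documents, now=None):
    """Keep only documents still within their one-hour lifetime."""
    return [d for d in documents if not expired(d["created_at"], now)]
```

In practice a job like this would run on a schedule (or the vector store's own TTL feature would be used) to keep the database clean.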
The system automatically grades every answer for accuracy; the result appears as a badge above the answer.