Interact with your documents privately using local LLMs, with no internet connection required.
PrivateGPT is an open-source project hosted on GitHub by Zylon AI (repository: zylon-ai/private-gpt), designed to enable users to interact with their documents using the power of Large Language Models (LLMs) with complete privacy. Its core mission is to ensure that no data leaves the user's local execution environment at any point, making it an ideal solution for individuals and organizations dealing with sensitive or confidential information.
The project provides a production-ready AI setup that allows you to ingest documents and ask questions about them, leveraging local LLMs and embedding models, all without needing an internet connection (after initial setup and model downloads). It offers an API that follows OpenAI's schema, making it compatible with many existing tools and client libraries.
PrivateGPT offers a suite of features focused on private, local AI document interaction:

- Support for multiple LLM backends: llama.cpp, Ollama, Hugging Face Transformers, and vLLM, offering flexibility in model choice and hardware utilization.
- An OpenAI-compatible API exposing chat completions (`/v1/chat/completions`) and embeddings (`/v1/embeddings`), callable with `curl` or standard client libraries (see the sketch just below).
- Profile-based configuration (`settings-local.yaml`, `settings-ollama.yaml`, or custom profiles via the `PGPT_PROFILES` environment variable) to manage LLM choices, embedding models, vector stores, and other parameters.
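As a taste of the API surface (a chat example appears later in the setup walkthrough), here is a hedged sketch of calling the embeddings endpoint on a locally running server. The port matches the default used later in this guide, and the request body follows OpenAI's embeddings schema; adjust both to your configuration:

```bash
# Sketch: request an embedding from a local PrivateGPT server.
# Assumes the server is listening on http://localhost:8001 (the default
# used later in this guide); the body follows OpenAI's embeddings schema.
curl -X POST "http://localhost:8001/v1/embeddings" \
  -H "Content-Type: application/json" \
  -d '{"input": "A sentence to embed locally."}'
```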
PrivateGPT is particularly valuable for scenarios where data privacy and local processing are paramount.
Setting up and using PrivateGPT involves several steps, typically performed in a Python environment:
Prerequisites:

- Python with Poetry for dependency management (or `pip` with `requirements.txt`; see the note below).
- A C/C++ compiler toolchain (used to build `llama-cpp-python`).
- `make` (optional, but helpful for running scripts).

Clone the repository and install dependencies:

```bash
git clone https://github.com/zylon-ai/private-gpt.git
cd private-gpt
# Ensure Poetry is installed (see the official Poetry website)
# Upgrade Poetry to a tested version if needed (e.g., poetry self update 1.8.3)
poetry install --with ui,local  # Installs core, UI, and local LLM dependencies
# Or, for specific backends/features:
# poetry install --extras "ui llms-ollama embeddings-huggingface vector-stores-qdrant"
```
(Note: a `requirements.txt` is usually available for pip-based installation.)

Download the default models (e.g., `gpt4all-j` or a similar small model, and a sentence-transformer embedding model):
```bash
poetry run python scripts/setup
```
The setup script places these models in the `models` directory. You can also manually download other GGUF models and configure PrivateGPT to use them, as sketched below.
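For the manual route, one common approach (an illustration, not the project's prescribed method) is the Hugging Face CLI; the repository and file names below are placeholders to swap for the model you actually want:

```bash
# Sketch: manually fetch a quantized GGUF model into the models/ directory.
# Requires: pip install "huggingface_hub[cli]"
# Repo and file names are placeholders; substitute the model you want.
huggingface-cli download TheBloke/Mistral-7B-Instruct-v0.2-GGUF \
  mistral-7b-instruct-v0.2.Q4_K_M.gguf \
  --local-dir ./models
```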
PrivateGPT manages its configuration through profiles (`local`, `openai`, `ollama`) defined in YAML files (e.g., `settings-local.yaml`). Select the active profile with the `PGPT_PROFILES` environment variable (e.g., `export PGPT_PROFILES=local`).
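Profiles can also be layered. In recent versions `PGPT_PROFILES` accepts a comma-separated list, with later profiles overriding earlier ones; this is worth verifying against your version's documentation:

```bash
# Sketch: combine profiles. settings-local.yaml is applied first, then a
# hypothetical settings-my-overrides.yaml on top of it.
export PGPT_PROFILES=local,my-overrides
```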
Edit the relevant `settings-*.yaml` file or create a `settings.yaml` to customize:
- `llm`: specify the LLM mode (`llamacpp`, `ollama`, `openai`, `huggingface`), model path or name, context window, temperature, etc.
- `embedding`: specify the embedding mode (`huggingface`, `openai`, `ollama`) and model name.
- `vectorstore`: choose the vector store (`chroma`, `qdrant`, `lancedb`) and its settings.
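For orientation, a local profile might look roughly like the sketch below. Key names and nesting vary between PrivateGPT versions, and the model identifiers are placeholders, so treat this as a shape to adapt rather than a drop-in file:

```yaml
# settings-local.yaml -- illustrative sketch only; compare key names and
# nesting with the settings files your PrivateGPT version ships with.
llm:
  mode: llamacpp           # run a local GGUF model through llama.cpp

llamacpp:
  llm_hf_repo_id: TheBloke/Mistral-7B-Instruct-v0.2-GGUF   # placeholder repo
  llm_hf_model_file: mistral-7b-instruct-v0.2.Q4_K_M.gguf  # placeholder file

embedding:
  mode: huggingface        # local sentence-transformer embeddings

huggingface:
  embedding_hf_model_name: BAAI/bge-small-en-v1.5          # placeholder model

vectorstore:
  database: qdrant         # one of: chroma, qdrant, lancedb
```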
Place your documents in the `source_documents` directory (or a custom directory configured in your settings), then ingest them:

```bash
make ingest
# Or: poetry run python -m private_gpt ingest
```
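Ingestion can also happen over the HTTP API. The route below reflects recent PrivateGPT versions but is an assumption to verify against your server's interactive docs (usually served at `/docs`):

```bash
# Sketch: upload one file for ingestion over the HTTP API.
# The /v1/ingest/file route is an assumption; confirm it at /docs.
curl -X POST "http://localhost:8001/v1/ingest/file" \
  -F "file=@./my-report.pdf"
```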
Start the server with your chosen profile:

```bash
PGPT_PROFILES=local make run
# Or: poetry run python -m private_gpt run
```

This typically starts a FastAPI server (e.g., on `http://localhost:8001`) providing OpenAI-compatible API endpoints and a Gradio web UI accessible through your browser.

Example API usage (using `curl` for chat):
```bash
curl -X POST "http://localhost:8001/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-configured-llm-name",
    "messages": [{"role": "user", "content": "What are the main points in my documents about [topic]?"}],
    "use_context": true
  }'
```
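Because the API follows OpenAI's schema, the official `openai` Python client can also be pointed at the local server. This is a minimal sketch: the port, model name, and `use_context` flag mirror the curl example above, and the API key is just a placeholder since no key is required locally:

```python
# Minimal sketch: query a local PrivateGPT server with the official `openai`
# Python client (pip install openai). Adjust port and model name to match
# your own configuration.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8001/v1",  # local PrivateGPT API
    api_key="not-needed",                 # placeholder; no key is checked locally
)

response = client.chat.completions.create(
    model="your-configured-llm-name",
    messages=[{"role": "user", "content": "Summarize my ingested documents."}],
    # use_context is a PrivateGPT-specific field, passed through extra_body
    extra_body={"use_context": True},
)
print(response.choices[0].message.content)
```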
Alternatively, open the Gradio web UI (started by `make run`) in your browser and ask questions about your ingested documents.

Hardware needs depend significantly on the size of the LLM and embedding models used. While CPU-only inference is possible (especially with quantized GGUF models via llama.cpp), a compatible NVIDIA GPU (CUDA), AMD GPU (ROCm on Linux), or Apple Silicon (Metal) can significantly accelerate LLM inference if the chosen LLM backend and model support it.

PrivateGPT is a free and open-source project, licensed under the Apache 2.0 License.
Q1: What is PrivateGPT?
A1: PrivateGPT is an open-source AI project that allows you to ingest your documents and ask questions about them using Large Language Models (LLMs) that run entirely on your local machine. This ensures 100% privacy, as no data leaves your execution environment.

Q2: How does PrivateGPT ensure privacy?
A2: All components of PrivateGPT, including document parsing, embedding generation, vector storage, and LLM inference, are designed to run locally on your hardware. It does not require an internet connection to function (after initial setup and model downloads) and does not send your documents or queries to any third-party cloud services.

Q3: What types of documents can I use with PrivateGPT?
A3: PrivateGPT, through its integration with LlamaIndex, supports a wide range of document formats, including PDF, TXT, DOCX (Microsoft Word), CSV, MD (Markdown), EML (email), EPUB, HTML, and PPTX (PowerPoint).
Q4: What AI models does PrivateGPT use?
A4: PrivateGPT is flexible and can be configured to use various open-source LLMs (primarily in GGUF format via llama.cpp, or models via Ollama/Hugging Face Transformers) and embedding models (typically sentence-transformers from Hugging Face). The default setup often includes a smaller, generally capable model such as GPT4All-J or a Mistral variant for the LLM, and a common sentence-transformer for embeddings.
Q5: Do I need a powerful GPU to run PrivateGPT?
A5: While a GPU (NVIDIA, AMD, Apple Metal) will significantly improve performance for larger LLMs, PrivateGPT is designed to be runnable on CPU-only setups, especially with quantized GGUF models. Sufficient RAM is crucial regardless of GPU availability.

Q6: Is PrivateGPT free?
A6: Yes, PrivateGPT is free and open-source software, licensed under the Apache 2.0 license.

Q7: How do I interact with PrivateGPT?
A7: You can interact with PrivateGPT through its FastAPI-based API (which is OpenAI-compatible, allowing use with various client libraries) or through the provided Gradio web UI for a more user-friendly chat experience with your documents.

Q8: Can I use PrivateGPT for commercial purposes?
A8: The PrivateGPT software itself is Apache 2.0 licensed, which generally permits commercial use. However, you must also comply with the licenses of the specific LLMs and embedding models you choose to run with PrivateGPT, as these models have their own separate licenses (e.g., Llama 2 has specific commercial-use conditions).
Here are examples of the types of articles and guides you can find online to help you get started and explore advanced uses of PrivateGPT:
(To find current and relevant articles, search for terms like "PrivateGPT tutorial," "PrivateGPT setup guide," "PrivateGPT [your OS] install," "PrivateGPT [specific LLM] guide" on Google, Medium, DEV.to, YouTube, and other tech communities.)
Last updated: May 16, 2025
Related projects:

- llama2.c: a pure C implementation of the Llama 2 model developed by Andrej Karpathy, optimized for efficient inference.
- LocalAI: a self-hosted, community-driven, and locally-focused OpenAI alternative; no GPU required, runs language models and embeddings on CPU.
- Ollama: a framework for running, setting up, and using large language models on your local machine.