Use LlamaIndex, Streamlit, and OpenAI to Query Unstructured Data
Introduction
LlamaIndex is a data framework that makes it simple to build production-ready applications from your data using LLMs. Specifically, LlamaIndex specializes in context augmentation: a technique that provides custom data as context for queries to general-purpose LLMs, letting you inject your own contextual information without the trouble and expense of fine-tuning a dedicated model.
In this guide, we will demonstrate how to build an application with LlamaIndex and Streamlit, a Python framework for building and serving data-driven applications, and deploy it to Koyeb. The result will be an example web app that allows users to ask questions about custom data. In our example, this custom text will be the story "The Gift of the Magi" by O. Henry.
You can deploy and preview the example application from this guide with our LlamaIndex One-Click app or by clicking the Deploy to Koyeb button below:
Be sure to set the OPENAI_API_KEY
environment variable during configuration. You can consult the repository on GitHub to find out more about the example application that this guide uses.
Requirements
To successfully follow and complete this guide, you need:
- Python 3.11 installed on your local computer.
- A GitHub account to host your LlamaIndex application.
- A Koyeb account to deploy and run the preview environments for each pull request.
- An OpenAI API key so that our application can send queries to OpenAI.
Steps
To complete this guide and deploy a LlamaIndex application, you'll need to follow these steps:
- Set up the project directory
- Install project dependencies and fetch custom data
- Create the LlamaIndex application
- Test the application
- Create a Dockerfile
- Publish the repository to GitHub
- Deploy to Koyeb
Set up the project directory
To get started, create and then move into a project directory that will hold the application and assets we will be creating:
mkdir example-llamaindex
cd example-llamaindex
Next, create and activate a new Python virtual environment for the project. This will isolate our project's dependencies from system packages to avoid conflicts and offer better reproducibility:
python -m venv venv
source venv/bin/activate
Your virtual environment should now be activated.
Install project dependencies and fetch custom data
Now that we are working within a virtual environment, we can begin to install the packages our application will use and set up the project directory.
First, install the LlamaIndex and Streamlit packages so that we can use them to build the application. We can also take this opportunity to make sure that the local copy of pip
is up-to-date:
pip install --upgrade pip llama-index streamlit
After installing the dependencies, record them in a requirements.txt
file so that we can install the correct versions for this project at a later time:
pip freeze > requirements.txt
Next, we will download the story that we will be using as context for our LLM prompts. We can download a PDF copy of "The Gift of the Magi" by O. Henry from TSS Publishing, a platform for short fiction that hosts free short stories.
Create a data
directory to hold the contextual data for our application and then download a copy of the story by typing:
mkdir data
curl -L https://theshortstory.co.uk/devsitegkl/wp-content/uploads/2015/06/Short-stories-O-Henry-The-Gift-of-the-Magi.pdf -o data/gift_of_the_magi.pdf
You should now have a PDF file that we can load into our application and attach as context to our LLM queries.
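If you want to confirm that LlamaIndex can read the document before building the full app, you can run a short throwaway script (a hypothetical check_data.py, shown here only as a sketch) that loads the data directory and prints how many document objects were parsed. No OpenAI API key is needed for this step, since nothing is embedded or queried yet:
# check_data.py -- optional sanity check, not part of the final app
from llama_index.core import SimpleDirectoryReader

# Parse every file in the data/ directory into LlamaIndex Document objects
documents = SimpleDirectoryReader("data").load_data()
print(f"Loaded {len(documents)} document object(s) from data/")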
Create the LlamaIndex application
We have everything in place to start writing our LlamaIndex application. Create an app.py
file in your project directory and paste in the following content:
# app.py
import os.path

import streamlit as st
from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    StorageContext,
    load_index_from_storage,
)

# check if storage already exists
PERSIST_DIR = "./storage"
if not os.path.exists(PERSIST_DIR):
    # load the documents and create the index
    documents = SimpleDirectoryReader("data").load_data()
    index = VectorStoreIndex.from_documents(documents)
    # store it for later
    index.storage_context.persist(persist_dir=PERSIST_DIR)
else:
    # load the existing index
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    index = load_index_from_storage(storage_context)

query_engine = index.as_query_engine()

# Define a simple Streamlit app
st.title('Ask Llama about "The Gift of the Magi"')
query = st.text_input("What would you like to ask? (source: data/gift_of_the_magi.pdf)", "What happens in the story?")

# If the 'Submit' button is clicked
if st.button("Submit"):
    if not query.strip():
        st.error("Please provide a search query.")
    else:
        try:
            response = query_engine.query(query)
            st.success(response)
        except Exception as e:
            st.error(f"An error occurred: {e}")
Let's go over what the application is doing.
It begins by importing the basic packages and modules it will use to create the application. This includes Streamlit (aliased as st
) as well as functionality from LlamaIndex for reading documents from a directory, building vector indexes, and persisting and loading index storage.
Next, we set up some semi-persistent storage for the index files. This logic helps us avoid creating an index from our data document every time we run the application by storing index information in a storage
directory the first time the application runs. The application can then load the index data from the storage
directory directly on subsequent runs to increase performance.
After the index is created or loaded, we create a query engine based on it and begin constructing the application frontend with Streamlit. We add an input field that supplies our query text and then display the results upon submission.
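If you want to experiment with the index outside of the Streamlit frontend, a small standalone script can reuse the persisted storage and query it directly. The following is a minimal sketch (a hypothetical test_query.py, not part of the app); it assumes you have already run the app once so that the storage directory exists and that OPENAI_API_KEY is set in your environment:
# test_query.py -- optional: exercise the persisted index without Streamlit
from llama_index.core import StorageContext, load_index_from_storage

# Reuse the vector index that app.py saved to ./storage
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)

# similarity_top_k controls how many document chunks are retrieved as context
query_engine = index.as_query_engine(similarity_top_k=3)
print(query_engine.query("Who are Della and Jim?"))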
Test the application
We can now test that the application works as expected on our local machine.
First, set and export the OPENAI_API_KEY
environment variable using your OpenAI API key as the value:
export OPENAI_API_KEY="<YOUR_OPENAI_API_KEY>"
Next, run the application by typing:
streamlit run app.py
This will start the application server. Navigate to http://127.0.0.1:8501
in your web browser to view the page prompting for your questions about "The Gift of the Magi". You can submit the default query or ask any other questions you have about the story.
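By default, Streamlit listens on port 8501. If that port is already in use, or if you want to mirror the port the container will use later, you can pass a different port with the --server.port flag, for example:
streamlit run --server.port 8000 app.py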
When you are finished, press CTRL-C to stop the server.
Create a Dockerfile
During the deployment process, Koyeb will build our project from a Dockerfile
. This gives us more control over the version of Python, the build process, and the runtime environment. The next step is to create this Dockerfile
describing how to build and run the project.
Before we begin, create a basic .dockerignore
file. This will define files and artifacts that we don't want to copy over into the Docker image. In this case, we want to avoid copying the venv/
and storage/
directories since the image will install dependencies and manage cached index files at runtime:
# .dockerignore
storage/
venv/
Next, create a Dockerfile
with the following contents:
# Dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY . .
RUN pip install --requirement requirements.txt && pip cache purge
ARG PORT
EXPOSE ${PORT:-8000}
CMD streamlit run --server.port ${PORT:-8000} app.py
The image uses the 3.11-slim
tag of the official Python Docker image as its starting point. It defines a directory called /app
as the working directory and copies all of the project files inside. Afterwards, it installs the dependencies from the requirements.txt
file.
The PORT variable is declared as a build argument with the ARG instruction, which lets us choose the exposed port at build time. We use this value in the EXPOSE instruction, and the same variable is read again at runtime by the streamlit command we run with the CMD instruction. Both places fall back to port 8000 if PORT is not defined explicitly.
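If you have Docker installed, you can optionally verify that the image builds and runs before pushing the project. The image name example-llamaindex below is just an arbitrary tag used for this sketch, and the container needs your OpenAI API key at runtime:
docker build --build-arg PORT=8000 -t example-llamaindex .
docker run --rm -p 8000:8000 -e PORT=8000 -e OPENAI_API_KEY="<YOUR_OPENAI_API_KEY>" example-llamaindex
You can then visit http://127.0.0.1:8000 in your browser to confirm that the containerized app behaves the same way as the local test.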
Publish the repository to GitHub
The application is almost ready to deploy. We just need to commit the changes to Git and push the repository to GitHub.
In the project directory, initialize a new Git repository by running the following command:
git init
Next, download a generic .gitignore
file for Python projects from GitHub:
curl -L https://raw.githubusercontent.com/github/gitignore/main/Python.gitignore -o .gitignore
Add the storage/
directory to the .gitignore
file so that Git ignores the cached vector index files:
echo "storage/" >> .gitignore
You can also optionally add the Python runtime version to a runtime.txt
file if you want to try to build the repository from the Python buildpack instead of the Dockerfile:
echo "python-3.11.8" > runtime.txt
Next, add the project files to the staging area and commit them. If you don't have an existing GitHub repository to push the code to, create a new one on GitHub, then run the following commands to commit your changes and push them to the repository:
git add :/
git commit -m "Initial commit"
git remote add origin git@github.com:<YOUR_GITHUB_USERNAME>/<YOUR_REPOSITORY_NAME>.git
git branch -M main
git push -u origin main
Note: Make sure to replace <YOUR_GITHUB_USERNAME>/<YOUR_REPOSITORY_NAME>
with your GitHub username and repository name.
Deploy to Koyeb
Once the repository is pushed to GitHub, you can deploy the LlamaIndex application to Koyeb. Any changes in the deployed branch of your codebase will automatically trigger a redeploy on Koyeb, ensuring that your application is always up-to-date.
To get started, open the Koyeb control panel and complete the following steps:
- On the Overview tab, click Create Web Service.
- Select GitHub as the deployment method.
- Choose the repository containing your application code. Alternatively, you can enter our public LlamaIndex example repository into the Public GitHub repository field at the bottom of the page: https://github.com/koyeb/example-llamaindex.
- In the Builder section, choose Dockerfile.
- Choose an Instance of size micro or larger.
- Expand the Environment variables section and click Add variable to configure a new environment variable. Create a variable called OPENAI_API_KEY. Select the Secret type and choose Create secret in the value field. In the form that appears, create a new secret containing your OpenAI API key.
- Choose a name for your App and Service, for example example-llamaindex, and click Deploy.
Koyeb will clone the GitHub repository and use the Dockerfile
to build a new container image for the project. Once the build is complete, a container will be started from the image to run your application.
Once the deployment is healthy, visit your Koyeb Service's subdomain (you can find this on your Service's detail page). It will have the following format:
https://<YOUR_APP_NAME>-<KOYEB_ORG_NAME>.koyeb.app
You should see your LlamaIndex application's prompt, allowing you to ask questions about the story and get responses from the OpenAI API.
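As a quick sanity check from the command line, you can also request the URL with curl and confirm that the Service responds, substituting your own subdomain:
curl -I https://<YOUR_APP_NAME>-<KOYEB_ORG_NAME>.koyeb.app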
Conclusion
In this guide, we discussed how to build and deploy an LLM-based web app to Koyeb using LlamaIndex and Streamlit. The application loads a story from a PDF on disk and sends this as additional context when submitting user-supplied queries. This allows you to customize the focus of the query without having to fine-tune a model for the purpose.
This tutorial demonstrates a very simple implementation of these technologies. To learn more about how LlamaIndex can help you use LLMs to answer questions about your own data, take a look at the LlamaIndex documentation.