Build a Web-based GPT Chatbot with Custom Knowledge
The advancement of generative AI is remarkable. With roughly 70 lines of Python built on OpenAI GPT, llama-index, and Streamlit, you can now craft a chatbot infused with your own specialized domain knowledge — a personalized assistant tailored specifically to your needs.
This post shares how you can build your own powerful chatbot for your own data!
1 Dependencies
A few Python packages are needed for this app:
```shell
pip install streamlit openai llama-index nltk
```
Also, please get an OpenAI API key by following these steps:
- Go to https://platform.openai.com/account/api-keys.
- Click on the "+ Create new secret key" button.
- Enter an identifier name (optional) and click on the "Create secret key" button.
- Copy the API key to be used in this tutorial.
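As an aside, if you prefer not to store the key in a file at all, a common alternative (not part of the app below) is to read it from the `OPENAI_API_KEY` environment variable; a minimal sketch:

```python
import os

def get_openai_key() -> str:
    """Read the OpenAI API key from the environment, failing loudly if it is missing."""
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("Set the OPENAI_API_KEY environment variable first.")
    return key
```

You would then pass the returned key to `openai.api_key` instead of reading it from Streamlit secrets.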
2 Create the Chatbot
The project folder can be set up like this:
```
chatbot
 |_ main.py
 |_ data
 |_ .streamlit
     |_ secrets.toml
```
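If you'd rather not create these folders by hand, the layout above can be scaffolded with a few lines of standard-library Python (the `scaffold` helper is illustrative, not part of the app):

```python
import os

def scaffold(root: str = "chatbot") -> None:
    """Create the chatbot/, chatbot/data/, chatbot/.streamlit/ folders plus empty files."""
    os.makedirs(os.path.join(root, "data"), exist_ok=True)
    os.makedirs(os.path.join(root, ".streamlit"), exist_ok=True)
    # Touch main.py and secrets.toml so the expected files exist.
    open(os.path.join(root, "main.py"), "a").close()
    open(os.path.join(root, ".streamlit", "secrets.toml"), "a").close()
```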
These files are:
- `main.py` stores the code of the app.
- the `data` folder stores the input data.
- `secrets.toml` in the `.streamlit` folder holds the OpenAI API key, like `openai_key = 'Your OpenAI API key'`.
The `main.py` looks like this; it is all that is needed to pull up the chatbot:
```python
import os

import openai
import streamlit as st
from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms import OpenAI

st.set_page_config(
    page_title="Chat with your domain knowledge, powered by LlamaIndex",
    page_icon="🦙",
    layout="centered",
    initial_sidebar_state="auto",
    menu_items=None,
)
openai.api_key = st.secrets.openai_key
st.title("Chat with the domain knowledge, powered by LlamaIndex 💬🦙")
st.info(
    "Check out the full tutorial to build this app in our "
    "[blog post](https://blog.streamlit.io/build-a-chatbot-with-custom-data-sources-powered-by-llamaindex/)",
    icon="📃",
)

if "messages" not in st.session_state.keys():  # Initialize the chat messages history
    st.session_state.messages = [
        {"role": "assistant", "content": "Ask me a question about the files you uploaded just now!"}
    ]

# Utility functions
def save_uploadedfile(uploadedfile):
    with open(os.path.join("data", uploadedfile.name), "wb") as f:
        f.write(uploadedfile.getbuffer())
    return st.success("File saved to data folder!")

@st.cache_resource(show_spinner=False)
def load_data():
    with st.spinner(text="Loading and indexing your documents - hang tight! This should take 1-2 minutes."):
        reader = SimpleDirectoryReader(input_dir="./data", recursive=True)
        docs = reader.load_data()
        service_context = ServiceContext.from_defaults(
            llm=OpenAI(
                model="gpt-3.5-turbo",
                temperature=0.5,
                system_prompt=(
                    "You are an expert on credit card fraud detection and your job is to "
                    "answer technical questions. Assume that all questions are related to "
                    "credit card fraud detection. Keep your answers technical and based "
                    "on facts - do not hallucinate features."
                ),
            )
        )
        index = VectorStoreIndex.from_documents(docs, service_context=service_context)
        return index

# Main app
datafile = st.file_uploader("Upload your data (text only)", type=["str", "csv", "txt"])
if datafile is not None:
    save_uploadedfile(datafile)
    index = load_data()

    if "chat_engine" not in st.session_state.keys():  # Initialize the chat engine
        st.session_state.chat_engine = index.as_chat_engine(
            chat_mode="condense_question", verbose=True
        )

    if prompt := st.chat_input("Your question"):  # Prompt for user input and save to chat history
        st.session_state.messages.append({"role": "user", "content": prompt})

    for message in st.session_state.messages:  # Display the prior chat messages
        with st.chat_message(message["role"]):
            st.write(message["content"])

    # If last message is not from assistant, generate a new response
    if st.session_state.messages[-1]["role"] != "assistant":
        with st.chat_message("assistant"):
            with st.spinner("Thinking..."):
                response = st.session_state.chat_engine.chat(prompt)
                st.write(response.response)
                message = {"role": "assistant", "content": response.response}
                st.session_state.messages.append(message)  # Add response to message history
```
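The message-history bookkeeping in the app is independent of Streamlit. As a rough sketch of the same pattern in plain Python (the `ChatHistory` class and the canned `fake_llm` reply are illustrative stand-ins, not part of the app):

```python
class ChatHistory:
    """Minimal stand-in for st.session_state.messages."""

    def __init__(self, greeting: str):
        self.messages = [{"role": "assistant", "content": greeting}]

    def ask(self, question: str, llm) -> str:
        # Record the user turn, then answer only if the last turn is not ours.
        self.messages.append({"role": "user", "content": question})
        if self.messages[-1]["role"] != "assistant":
            answer = llm(question)
            self.messages.append({"role": "assistant", "content": answer})
        return self.messages[-1]["content"]

def fake_llm(question: str) -> str:
    # Stand-in for the real chat engine call.
    return f"You asked: {question}"
```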
Intuitively, the app accepts input from users; in this example it only accepts text and CSV files, and you can tweak the accepted types following the Streamlit guide.
The uploaded file is then saved and used to create a vector store, which GPT refers to during the question-answering sessions.
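Conceptually, the vector store maps each document chunk to an embedding vector and answers a query by returning the closest chunks. A toy illustration with made-up 2-D vectors (real embeddings come from an embedding model, which is not shown here):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, store, top_k=1):
    """Return the top_k chunk texts ranked by cosine similarity to the query."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item["vec"]), reverse=True)
    return [item["text"] for item in ranked[:top_k]]

# Made-up chunks with hand-assigned vectors, purely for illustration.
store = [
    {"text": "chargeback rules", "vec": [1.0, 0.1]},
    {"text": "card fraud patterns", "vec": [0.1, 1.0]},
]
```

The real index additionally handles chunking, embedding calls, and persistence; this sketch only shows the retrieval idea.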
3 Activate the Chatbot
In the terminal of your laptop, run this command from the project folder (so the relative `data` and `.streamlit` paths resolve):
```shell
streamlit run main.py
```
Then the app is up and you can upload your own files. Once that is done, the chat box appears and you can ask questions regarding the data.
The sample credit card transaction data can be downloaded from DataCamp.
![](https://cdn.jsdelivr.net/gh/BulletTech2021/Pics/img/1_V/bot_2.png)
Reference
- LlamaIndex app demo on the Streamlit blog
- Streamlit file uploader documentation