Learn how to build an advanced chatbot with a cloud vector database.

In this blog, we build a QA chatbot that uses a custom knowledge base stored in a free cloud vector database, Qdrant.

Hargurjeet
6 min read · Dec 6, 2023

In this post, I will show you how to use a free cloud vector database to store your information and then use that stored information to get answers from your data. Along the way, you will also learn to generate embeddings free of cost using Hugging Face.

What Are Vector Databases?

Vector databases store and manage vector data: lists of numerical values arranged in a specific order. They are designed to handle queries over large volumes of such data efficiently, in particular similarity searches, which makes them useful for machine learning applications. To make text usable alongside a Large Language Model (LLM), the text is first converted into vectors, typically by tokenizing it and computing embeddings. In simple terms, embedding is a technique for converting textual information into numbers, and there are many ways of doing it. You can read more about word embeddings here.
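To make the idea of "similar vectors" concrete, here is a minimal sketch of cosine similarity, the distance measure used for the collection later in this post. The three-dimensional vectors below are made up for illustration; real embeddings have hundreds of dimensions and come from a model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up 3-dimensional "embeddings", for illustration only
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
car = [0.0, 0.1, 0.9]

# Vectors pointing in a similar direction score closer to 1
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, car))
```

A vector database does essentially this comparison at scale, using indexes so that it does not have to scan every stored vector.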

Table of Contents

  1. Know about Qdrant
  2. Setting up the free cloud vector database via Qdrant
  3. Request and response to the vector database
  4. Understanding collections
  5. Creating a vector store
  6. Building a QA chatbot with knowledge limited to the vector store
  7. Conclusion
  8. References

№1:Know about Qdrant

Qdrant Cloud is Qdrant's SaaS (software-as-a-service) offering, providing managed Qdrant instances in the cloud. It gives you a fast and reliable similarity search engine without the need to maintain your own infrastructure.

Moving from an on-premise installation to the cloud version of Qdrant does not change the way you interact with the service. All you have to do is create a Qdrant Cloud account and send your API key with each request.

№2:Setting up the Free Cloud vector database via Qdrant

  1. Go to https://qdrant.tech/
  2. Sign in using ‘GitHub’ or ‘User Account’.
  3. Enter a cluster name and click on ‘free tier cluster’.
  4. Go to the next section and click on ‘Get API key’. Make a note of the API key and the cluster URL; they are all you need.
  5. Click on ‘complete’ and you should be able to see your vector database like this. This marks the completion of the database setup. Easy, isn’t it? 😃

№3:Request and Response to vector database

To test whether you can make requests to this cloud database and receive responses, you can use ‘Thunder Client’ in your VS Code editor. You can install it from the Extensions view if you do not already have it.

Paste the cluster URL into a GET request, add a header named ‘api-key’ in the Headers section, and paste your API key as its value.

Now click on Send, and you should receive a response confirming that the connection is okay, something like this
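If you prefer to run the same check from Python, the sketch below builds an equivalent GET request with the standard library. It assumes the key is sent in an `api-key` header and that the cluster root URL answers with a small JSON document; the helper names `build_health_request` and `check_connection` are made up for this example.

```python
import json
import urllib.request

def build_health_request(cluster_url, api_key):
    """Build a GET request for the cluster root, with the key in the 'api-key' header (assumed name)."""
    return urllib.request.Request(
        cluster_url,
        headers={"api-key": api_key},
        method="GET",
    )

def check_connection(cluster_url, api_key):
    """Send the request and return (HTTP status, parsed JSON body). Needs network access."""
    request = build_health_request(cluster_url, api_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status, json.loads(response.read())

# Example call, with your own values filled in:
# print(check_connection("https://<your-cluster-url>", "<your-api-key>"))
```

A 200 status means the URL and key are accepted; an authentication error usually points to a wrong or missing key.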

№4:Understanding Collections

A collection is a kind of sub-database: a named set of vectors among which you can search. Every vector in a collection must have the same dimension as all the other vectors in that collection.

In our use case, we will store all the embedding vectors we create in a collection and direct the LLM to answer questions by referring to the embeddings stored there.

Below is sample Python code to create the collection. The complete code and a link to the GitHub repo are provided at the end of this post.


import os

import qdrant_client
from qdrant_client.http import models

# Create the client; QDRANT_HOST holds the cluster URL
client = qdrant_client.QdrantClient(
    url=os.getenv('QDRANT_HOST'),
    api_key=os.getenv('QDRANT_API_KEY'),
)

os.environ['QDRANT_COLLECTION_NAME'] = "my-collection-hf"

# Vector configuration: size must match the embedding model's output dimension
vectors_config = models.VectorParams(
    size=768,  # instructor embeddings have size 768; OpenAI embeddings have size 1536
    distance=models.Distance.COSINE,
)

# Create the collection
client.create_collection(
    collection_name=os.getenv('QDRANT_COLLECTION_NAME'),
    vectors_config=vectors_config,
)

№5:Creating a Vector Store

A vector store is a data storage layer optimized for storing and querying vector data. It uses an embedding model to convert the input text into embeddings. For embedding, you can use models from the Hugging Face Hub, which are free to use but slow. If you want a faster model, you can use OpenAI embeddings instead.

We create a vector store on top of the collection as follows:

from langchain.embeddings import HuggingFaceInstructEmbeddings
from langchain.vectorstores import Qdrant

# Create the vector store
embeddings = HuggingFaceInstructEmbeddings(model_name="hkunlp/instructor-xl")

vectore_store = Qdrant(
    client=client,
    collection_name=os.getenv('QDRANT_COLLECTION_NAME'),
    embeddings=embeddings,
)
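To show what a vector store does conceptually, here is a toy in-memory version: it embeds each text, keeps the vectors, and ranks them by cosine similarity at query time. This is a sketch only. `ToyVectorStore` and `toy_embed` (letter counts in place of a real embedding model) are hypothetical stand-ins and are not how Qdrant or LangChain are implemented.

```python
import math
from collections import Counter

def toy_embed(text):
    """Hypothetical stand-in for an embedding model: counts of the letters a-z (26 dimensions)."""
    counts = Counter(ch for ch in text.lower() if "a" <= ch <= "z")
    return [float(counts.get(chr(ord("a") + i), 0)) for i in range(26)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class ToyVectorStore:
    def __init__(self, embed):
        self.embed = embed
        self.items = []  # list of (text, vector) pairs

    def add_texts(self, texts):
        """Embed each text and keep it next to its vector."""
        for text in texts:
            self.items.append((text, self.embed(text)))

    def similarity_search(self, query, k=2):
        """Return the k stored texts whose vectors are closest to the query vector."""
        query_vector = self.embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(query_vector, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

store = ToyVectorStore(toy_embed)
store.add_texts(["New Delhi is the capital of India", "zzz qqq xxx", "Mumbai is a large city"])
print(store.similarity_search("capital of India", k=1))
```

The real vector store has the same two responsibilities, adding texts and searching by similarity, but delegates the embedding to a trained model and the storage and search to the Qdrant cluster.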

№6:Building QA chatbot with knowledge limited to Vector Store

First, we need to decide on the knowledge base. In my case, I copied the Wikipedia page on India and pasted it into a text file. This acts as the knowledge base for any queries about India.

Next, we use the LangChain framework to build the QA chatbot. We use the ‘CharacterTextSplitter’ class to split the text into chunks of our desired size; I have chosen 500 characters. We read the entire knowledge base, split it into chunks of size 500, and embed the chunks using the vector store, which saves them in the collection. The following code block does this.

from langchain.text_splitter import CharacterTextSplitter

# Split the text into chunks
def get_chunks(text):
    text_splitter = CharacterTextSplitter(
        separator="\n",
        chunk_size=500,
        chunk_overlap=100,
        length_function=len,
    )
    chunks = text_splitter.split_text(text)
    return chunks

# Read the text file
with open("india.txt") as f:
    raw_text = f.read()

texts = get_chunks(raw_text)

# Add the extracted chunks to the vector store
vectore_store.add_texts(texts)
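To see what chunk size and chunk overlap mean, here is a simplified sliding-window sketch. It is not the algorithm CharacterTextSplitter uses (that one splits on the separator first and then merges the pieces); the function `sliding_chunks` is a made-up illustration of the two parameters only.

```python
def sliding_chunks(text, chunk_size=500, chunk_overlap=100):
    """Cut text into windows of chunk_size characters, each sharing chunk_overlap characters with the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_chunks(sample, chunk_size=10, chunk_overlap=4)
for chunk in chunks:
    print(chunk)
```

The overlap means a sentence that falls on a chunk boundary still appears complete in at least one chunk, which tends to help retrieval.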

Now we use the knowledge base embeddings to create the ‘RetrievalQA’ chain. Here the vector store serves as a retriever: for each question, it fetches the most relevant chunks so that the LLM can use them to answer. The following code demonstrates ‘RetrievalQA’ together with the vector store.

from langchain.chains import RetrievalQA
from langchain.llms import HuggingFaceHub

# HuggingFaceHub reads the API token from the HUGGINGFACEHUB_API_TOKEN environment variable
qa = RetrievalQA.from_chain_type(
    llm=HuggingFaceHub(
        repo_id="google/flan-t5-xxl",
        model_kwargs={"temperature": 0.7, "max_length": 512},
    ),
    chain_type="stuff",
    retriever=vectore_store.as_retriever(),
)
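The "stuff" chain type simply places all retrieved chunks into one prompt together with the question. The sketch below illustrates that idea; `build_stuff_prompt` and the prompt wording are hypothetical and do not reproduce LangChain's actual template.

```python
def build_stuff_prompt(question, retrieved_chunks):
    """Concatenate ("stuff") every retrieved chunk into a single prompt ahead of the question."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

prompt = build_stuff_prompt(
    "What is the capital of India?",
    ["New Delhi is the capital of India.", "India is in South Asia."],
)
print(prompt)
```

Because everything goes into one prompt, this approach works only while the retrieved chunks fit within the model's context window, which is one reason to keep chunks small.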

Now you are all set to ask questions against the knowledge base you created. To pass a query to the chain, you can use the following code block:

query = "What do you know about India? Can you answer elaborately"

response = qa.run(query)

print(response)

№7:Conclusion

Qdrant offers a simple method for setting up a vector database. Generating embeddings from the advanced OpenAI model can be costly. By using a vector database, you can save money by reusing stored embeddings instead of creating new ones each time.

Link to GitHub repo: https://github.com/hargurjeet/VectoreStore_Streamlit_Projects

Link to the notebook: https://github.com/hargurjeet/VectoreStore_Streamlit_Projects/blob/main/qdrant.ipynb

№8:References

Qdrant — https://qdrant.tech/

Langchain — https://js.langchain.com/docs/get_started/introduction

Thank you for taking the time to read my article. I hope you found it valuable. For more content like this, feel free to follow me on Medium or connect with me on LinkedIn.

Photo by Jan Tinneberg on Unsplash
