Building a question answering system using semantic search with GPT-3
How to build a neural search system that can power chatbots, Q&A on books, documentation, websites, and more.
Semantic neural search has been around for a long time. Once we figured out that sentence embeddings can capture the essence of a sentence, we realized we could store those embeddings in a database and query them to find sentences with “similar meaning.” The sentences need not share any common words; as long as the meaning is similar, we can find them.
People quickly applied this to document search. But with the advent of ChatGPT and the exposure of this technology to new audiences, lots of innovative ideas are cropping up.
We are turning books into chatbots, building interactive interfaces to websites, and more.
So I thought it might be a good idea to list the steps to follow if you want to create a similar application using semantic search.
Step 1: Segment and get embeddings
Here, we assume we have already scraped the website or have the content of a book in a text file. First, we need to extract sentences using segmentation. We can use a library like Stanza (https://stanfordnlp.github.io/stanza/tokenize.html), or fall back to basic rule-based segmentation such as splitting on full stops (“.”).
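As a rough illustration, here is a minimal segmentation sketch using Stanza; the file name book.txt is just a placeholder, and the first run downloads the English models:

```python
import stanza

# Download the English models once (cached afterwards)
stanza.download("en")

# Build a tokenization-only pipeline; we only need sentence boundaries
nlp = stanza.Pipeline(lang="en", processors="tokenize")

def segment(text: str) -> list[str]:
    """Split raw text into sentences."""
    doc = nlp(text)
    return [sentence.text for sentence in doc.sentences]

sentences = segment(open("book.txt").read())
print(sentences[:5])
```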
Once we have the sentences, we need to get embeddings for them. We can use Sentence Transformers for this. In our case, we used our own embeddings, which are 40–50 times smaller than the embeddings produced by Sentence Transformers.
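If you go with Sentence Transformers, the embedding step is only a few lines. This is a minimal sketch; the model name all-MiniLM-L6-v2 is just one common choice, not necessarily what we use:

```python
from sentence_transformers import SentenceTransformer

# Any pretrained sentence-embedding model works; this one is small and fast
model = SentenceTransformer("all-MiniLM-L6-v2")

# Encode all sentences into a (num_sentences, embedding_dim) float32 array.
# Normalizing lets us use inner product as cosine similarity later on.
embeddings = model.encode(sentences, normalize_embeddings=True)
print(embeddings.shape)
```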
Step 2: Store embeddings in a vector database
You can choose a vector database like Weaviate, Pinecone, Jina, or Vespa. In our case, we chose Vespa because it supports bit embeddings and has better storage efficiency.
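To keep this post self-contained, here is a minimal in-memory sketch using FAISS as a stand-in for a hosted vector database; the same “add vectors, then query them” idea applies to Vespa, Weaviate, and the rest:

```python
import faiss
import numpy as np

# Embeddings were L2-normalized above, so inner product == cosine similarity
dim = embeddings.shape[1]
index = faiss.IndexFlatIP(dim)
index.add(np.asarray(embeddings, dtype="float32"))

# Keep the original sentences around so search results map back to text
id_to_sentence = {i: s for i, s in enumerate(sentences)}
```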
Step 3: Find nearest neighbors
Whenever a user submits a query, embed it with the same model and search the vector database for the “k” nearest neighbors. You can tune “k” for your domain; there are no hard and fast rules, so experiment and pick a value that works.
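Continuing the FAISS sketch from Step 2 (k=3 here is an arbitrary default):

```python
def nearest_sentences(query: str, k: int = 3):
    """Embed the query and return the k most similar stored sentences with scores."""
    query_vec = model.encode([query], normalize_embeddings=True)
    scores, ids = index.search(np.asarray(query_vec, dtype="float32"), k)
    return [(id_to_sentence[int(i)], float(s)) for i, s in zip(ids[0], scores[0])]

print(nearest_sentences("What is Ozonetel?"))
```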
Step 3.1: Adjust thresholds to give only valid answers
One trick that has worked well for us is to use the distance returned with the nearest neighbors. If a neighbor is beyond a certain threshold, we drop it from the result set. And if the result set then contains fewer than a certain number of neighbors, we assume our dataset does not have an answer for the query. This gives us some control over what our AI engine responds to.
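A sketch of that filtering on top of the previous example; the similarity threshold of 0.5 and the minimum of 2 neighbors are made-up numbers to illustrate the idea, not values we actually use:

```python
MIN_SIMILARITY = 0.5   # neighbors scoring below this are discarded (assumed value)
MIN_NEIGHBORS = 2      # fewer than this and we refuse to answer (assumed value)

def retrieve_context(query: str, k: int = 3):
    """Return context sentences for the query, or None if we likely have no answer."""
    hits = [(text, score) for text, score in nearest_sentences(query, k)
            if score >= MIN_SIMILARITY]
    if len(hits) < MIN_NEIGHBORS:
        return None  # dataset likely has no answer; fall back to a canned response
    return [text for text, _ in hits]
```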
Step 4: Generative AI to do the magic
Let’s assume in the above step we got 3 nearest sentences for the user query. You can pass these sentences to GPT-3 and ask it to generate an answer from them, with a prompt like:
“Use the below sentences and generate an answer for the query: What is Ozonetel
Sentences:
Ozonetel is a cloud telephony company
Ozonetel is a cloud call center company
ozonetel is an award-winning advanced cloud call center solution
Answer:
”
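To send such a prompt to GPT-3, a sketch with the older-style OpenAI Python client could look like this; the model name, temperature, and token limit are assumptions you would tune:

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

def answer(query: str, context_sentences: list[str]) -> str:
    """Build the prompt shown above and ask GPT-3 to answer from the given sentences."""
    prompt = (
        f"Use the below sentences and generate an answer for the query: {query}\n"
        "Sentences:\n"
        + "\n".join(context_sentences)
        + "\nAnswer:\n"
    )
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        max_tokens=150,
        temperature=0.7,
    )
    return response.choices[0].text.strip()
```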
When we send the above prompt, we get an answer like:
“Ozonetel is an award-winning advanced cloud call center solution that is trusted by thousands of businesses worldwide to manage their customer communications cost-effectively. Our deep CRM integrations ensure high productivity and excellent customer experience thanks to the support across all our pricing plans. Ozonetel is loved by businesses of all sizes and fast-growing businesses love the unmatched flexibility and scalability we offer.”
You can play around with the prompts to generate fun, marketing-oriented, or fact-filled responses. So basically, we filter using sentence embeddings and get creative using generative AI. That way, we get the best of both worlds.
Pretty good I would say :)