For example, if a user asks, 'I gained 3 kg over the holiday season. How do I lose them?', the AI assistant will consult the FAQ database and suggest an article titled 'How to Lose Weight.'
- How to organize your product or company’s knowledge base.
- How to teach a virtual assistant to extract information from the knowledge base and compose responses to user questions.
- How to manage costs: use an LLM where it helps while avoiding hefty bills from OpenAI or other providers, especially if you have a large customer base.
Ingredients
- Chatwoot: An open-source operator interface and knowledge base.
- Rasa: An open-source framework for creating chatbots.
- Botfront: A visual interface for building chatbots on top of Rasa.
- Qdrant: A vector database that stores vector representations of knowledge base articles.
- Datapipe: An ETL tool that extracts articles from Chatwoot, processes them, and loads them into Qdrant.
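The extract-transform-load flow between these components can be sketched in miniature. This is a toy illustration, not the Datapipe API: the function names and the in-memory `store` standing in for Qdrant are assumptions, and the "extract" step takes a plain list instead of calling the Chatwoot API.

```python
import hashlib

def extract_articles(chatwoot_articles):
    """Stand-in for the extract step: the real pipeline would call the
    Chatwoot API; here we just drop articles with no content."""
    return [a for a in chatwoot_articles if a.get("content")]

def transform(article):
    """Normalize an article and give it a stable id, so re-running the
    pipeline updates the same record instead of duplicating it."""
    text = article["content"].strip()
    return {
        "id": hashlib.md5(text.encode("utf-8")).hexdigest(),
        "title": article["title"],
        "text": text,
    }

def load(records, store):
    """Stand-in for the load step: production code would upsert vectors
    into Qdrant; here `store` is a plain dict keyed by record id."""
    for r in records:
        store[r["id"]] = r
    return store

store = {}
raw = [
    {"title": "How to Lose Weight", "content": "Eat fewer calories..."},
    {"title": "Empty draft", "content": ""},
]
load([transform(a) for a in extract_articles(raw)], store)
```

The stable id is the piece worth keeping: idempotent upserts let the content team edit articles freely without the vector store accumulating stale duplicates.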
Recipe
It's best to keep each article short and focused on one topic.
2. Programming: Converting All Knowledge Base Articles into Vector Form
- Documents should be segmented so that each vector corresponds to a single logical theme. This matters because the more text you encode into one vector, the more averaged and fuzzy that vector becomes, and the harder it is to recognize any specific theme in it. There is no one-size-fits-all segmentation: typically you split by structural heuristics (chapters or paragraphs), refine the boundaries with a Next Sentence Prediction (NSP) model, and then verify the result by hand. For FAQs this step wasn't necessary, since the answers were already short. However, to enrich the search field, we generated human-like questions for each answer as well as a synthetic "answer representation"; each of these is converted into a vector and added as an extra example for the target article.
- We need to choose an effective method for vector generation. We used either the OpenAI embedding encoder or the multilingual-e5 model; both work well because they were trained on parallel corpora in multiple languages.
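The structural-heuristic part of the segmentation step above can be sketched as splitting on blank lines and merging fragments that are too short to stand alone. This is a minimal sketch with assumed names (`segment`, `max_chars`); the NSP refinement and human verification mentioned above are not shown.

```python
def segment(document, max_chars=500):
    """Split a document into paragraph-level chunks (a structural
    heuristic), merging consecutive paragraphs while they still fit
    under max_chars so each chunk stays close to one topic."""
    paragraphs = [p.strip() for p in document.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for p in paragraphs:
        if current and len(current) + len(p) > max_chars:
            chunks.append(current)
            current = p
        else:
            current = f"{current}\n\n{p}".strip() if current else p
    if current:
        chunks.append(current)
    return chunks

doc = "Intro paragraph.\n\nSecond topic, more detail.\n\nThird topic."
print(segment(doc, max_chars=40))
```

In a real pipeline each chunk (plus the generated questions and synthetic answer representations) would then be passed to the embedding model and upserted into Qdrant.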
3. Programming: Configuring the FAQ Service
4. Programming: Configuring the Chatbot Assistant
5. Optional: Free-form Responses with LLM (Using RAG, Retrieval-augmented Generation)
Our client faced precisely this scenario, so we disabled response generation and left only article-based responses from the knowledge base. This avoids paying for an expensive LLM on every query.
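The article-only policy can be sketched as follows: return the best-matching FAQ article when its similarity to the query clears a threshold, otherwise hand off to a human operator instead of generating text. Everything here is illustrative: the function names, the threshold value, and the tiny hand-made vectors are assumptions, with cosine similarity standing in for a Qdrant search.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def answer(query_vec, articles, threshold=0.8):
    """Retrieval-only policy: no LLM generation. Return the closest
    article if it is similar enough, otherwise escalate to an operator."""
    best = max(articles, key=lambda a: cosine(query_vec, a["vector"]))
    if cosine(query_vec, best["vector"]) >= threshold:
        return {"type": "article", "title": best["title"]}
    return {"type": "handoff"}

faq = [
    {"title": "How to Lose Weight", "vector": [0.9, 0.1, 0.0]},
    {"title": "Refund Policy", "vector": [0.0, 0.2, 0.9]},
]
print(answer([1.0, 0.0, 0.0], faq))   # close to the weight-loss article
print(answer([0.3, 0.5, 0.4], faq))   # ambiguous query: hand off
```

The threshold is the cost lever: below it, nothing is generated and a human takes over, so the LLM bill stays zero no matter how many queries arrive.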
6. Programming: Ensuring Overall Relevance
We must ensure that Rasa correctly identifies the FAQ intent. When the chatbot is retrained, we select the most diverse set of examples possible, so that a minimal number of them covers the entire search field, and add it to the training data.
Project Conclusions
Statistics
As a result of the implementation, the share of support requests handled by the chatbot increased from 30% to 70%. The content team continues to add articles so that the chatbot can handle an even larger share of requests.