Ada Support

Knowledge base best practices for generative AI

Jérôme Solis
Senior Product Manager
AI & Automation | 9 min read

The rise of generative AI has brought significant advancements to the fields of natural language processing and conversational interfaces. The parameter counts of Large Language Models (LLMs) keep growing, and with them the perceived quality of their predictions. When users ask a question, the AI appears to provide an elegant, natural-language answer, with varying degrees of confidence in the underlying knowledge.

This is a key point to unpack. The “knowledge” that the AI is serving to users is distilled from vast amounts of data from all over the internet. If you’re using a publicly available LLM to power AI in a customer-facing chatbot, the AI is at risk of serving answers that are irrelevant, inaccurate, or harmful.

More than ever, subject matter expertise and reliable sources are necessary to make the most of generative AI. It’s critical to ensure that the AI you’re using in customer service automation only distills information found in your support documentation. The easiest and fastest way to do this is to train the AI to use your knowledge bases (KBs).

In this blog post, we will discuss best practices for knowledge management to make the most out of generative AI for customer service automation while ensuring it remains relevant, accurate, and safe.

Best practices for building a knowledge base

1. Mutually exclusive and collectively exhaustive ontology

As the saying goes, “measure twice, cut once.” Thoughtful planning and preparation of your KB architecture upfront will save you significant time and effort down the road.

We refer to the KB’s architecture as its ontology. Categories at each level of the knowledge tree should be mutually exclusive and collectively exhaustive. Let’s dive into that:

  1. Mutually exclusive means that no two categories contain knowledge that overlaps. Having mutually exclusive categories ensures that you have a single source of truth for a piece of information. This removes the risk of inconsistency, and if you ever need to update the information, you only have to update it in one place.
  2. Collectively exhaustive means that all together, the categories cover all the information that your customers need to know or may ask about.

With a mutually exclusive and collectively exhaustive ontology, the AI will be able to find the answer to a large percentage of customer questions, and deliver it with a high degree of confidence.
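
To make this concrete, here is a minimal sketch (in Python, using hypothetical category and article names, not any particular KB product) of what a mutually exclusive, collectively exhaustive knowledge tree might look like, plus a quick check that no article appears under more than one category:

```python
# A hypothetical knowledge tree: each category holds its own articles,
# and no article should be duplicated across categories (mutually exclusive).
knowledge_tree = {
    "Billing": {
        "Payment methods": ["Paying by credit card", "Paying by invoice"],
        "Fees": ["Late payment fees", "Equipment damage fees"],
    },
    "Shipping & Returns": {
        "Shipping": ["Delivery times by region"],
        "Returns": ["Return policy (North America)", "Return policy (Europe)"],
    },
}

def find_overlaps(tree):
    """Flag article titles that appear under more than one category."""
    seen = {}
    overlaps = []
    for category, sections in tree.items():
        for section, articles in sections.items():
            for title in articles:
                if title in seen:
                    overlaps.append((title, seen[title], (category, section)))
                seen[title] = (category, section)
    return overlaps

print(find_overlaps(knowledge_tree))  # an empty list means a single source of truth per topic
```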

2. Precise and exhaustive titles for clear context setting

The best way to illustrate context setting in this, well, context, is with an example. Let’s say you’re a support agent and you’re having a conversation with a customer:

  • The customer asks, “are there any fees?”
  • You look at the KB and you see that there are multiple sections that cover different kinds of fees. So you ask, “what fees would you like to know about?”
  • The customer replies, “Fees for damaged rental equipment.”
  • Now you scan the “fee” sections to find something relating to damaged rental equipment. You find that different pieces of rental equipment carry different fees if damaged. So you ask, “which equipment are you inquiring about?”

And so on…

What this example illustrates is the necessity for precise and exhaustive titles. If the KB had multiple sections about “fees” that all shared the same title, it would have been very hard for you to scan them and give the customer an accurate answer. The AI would find it equally challenging.

You may have already noticed that titles and section headers should become increasingly precise as a customer goes down a branch of the knowledge tree: fees, then equipment rental damage fees, then the kind of equipment, and so on.

Descriptive titles are always better than questions, but if your KB is already organized as a list of questions and answers, ensure the answers are self-contained. For example, “Can I pay by credit card?” → “Yes, you can pay by credit card”, rather than simply “Yes.”

If many KB articles describe similar topics from different perspectives or market segments (for example, return policies in North America, Europe, Asia, etc.), ensure the distinction is made in the titles and article body. Here again, a good ontology will ensure the information won’t be used out of context when generating an answer. The objective is to make it simple for the system to rank the best possible information given a customer inquiry.

Finally, from a formatting perspective, HTML headers will always work better than rich text formatting, such as bold or underlined fonts.
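
One way to picture why precise, hierarchical titles matter is to imagine how a retrieval step might build the context it ranks against. The sketch below is purely illustrative (hypothetical article data and function names, not Ada’s actual pipeline): it prepends the full title path to each article before indexing, so “fees” questions about kayaks and bikes resolve to different, unambiguous entries.

```python
# Hypothetical articles, each carrying its full path in the knowledge tree.
articles = [
    {"path": ["Fees", "Equipment rental damage fees", "Kayaks"],
     "body": "Damaged kayaks incur a $50 repair fee."},
    {"path": ["Fees", "Equipment rental damage fees", "Bikes"],
     "body": "Damaged bikes incur a $30 repair fee."},
]

def build_index_entries(articles):
    """Prefix each article with its title path so retrieval sees the full context."""
    return [
        {"key": " > ".join(a["path"]),
         "text": f'{" > ".join(a["path"])}: {a["body"]}'}
        for a in articles
    ]

for entry in build_index_entries(articles):
    print(entry["text"])
# Fees > Equipment rental damage fees > Kayaks: Damaged kayaks incur a $50 repair fee.
# Fees > Equipment rental damage fees > Bikes: Damaged bikes incur a $30 repair fee.
```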

3. Tags for augmented context

We know what you’re thinking: do we really need precise and exhaustive titles if we have tags?

To some extent, yes, you can use tags as a proxy for context, especially for search engines that only rely on keyword overlap and not semantics.

However, we see them as more of a band-aid solution for imprecise topic categories. Depending on which system you use, your KB may not support tags at all. You’re better off investing in a proper ontology that follows the best practices above. It’s easier to update, more future-proof, and friendlier to the AI.

4. Self-contained articles

Now that we have the ontology down, let’s talk about content: how much detail is too much detail? What’s the right balance between being concise and being thorough?

It really depends on your product or service, and what your customers expect from your KB — are they looking for step-by-step instructions for a procedure or a quick answer to a quick question?

Either way, the best practice is to maintain one topic per knowledge article, and ensure that customers can get all the essential information about that one topic from that one article so that they don’t have to jump around between pages.

Best practices for prioritizing content updates

You’ve built your KB, you’ve trained your AI on it, and you’re all set to automatically resolve customer inquiries. What you need to do now is ensure that the content is always up to date so that the answers the AI generates are always relevant and accurate.

The best case scenario is that you update content as soon as it needs updating: for example, when you add a new product, change your return policy, or start serving a new market.

However, this can be challenging, especially if you are understaffed. In this case, you should prioritize updates based on:

  • High traffic pages
  • Pages that have a high helpfulness rate based on review votes (thumbs up/down)
  • Pages that rank highly on any other “success” metrics you’ve established for your KB

Additionally, you can disable indexing for low-traffic articles. This way, the bot will know not to surface answers from them, and you can minimize potential inaccuracies during generation.
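
As a rough illustration (not a prescribed formula, and the thresholds and field names here are hypothetical), a simple prioritization pass could combine traffic and helpfulness into a review score and flag the lowest-traffic articles for exclusion from indexing:

```python
# Hypothetical article stats: page views and helpfulness votes (thumbs up/down).
articles = [
    {"title": "Return policy (Europe)", "views": 1200, "up": 90, "down": 10},
    {"title": "Paying by invoice", "views": 40, "up": 1, "down": 3},
    {"title": "Equipment damage fees", "views": 800, "up": 50, "down": 30},
]

MIN_VIEWS_TO_INDEX = 100  # illustrative threshold; tune to your own traffic

def review_priority(article):
    """Blend traffic and helpfulness: busy pages with low helpfulness come first."""
    votes = article["up"] + article["down"]
    helpfulness = article["up"] / votes if votes else 0.0
    return article["views"] * (1.0 - helpfulness)

for a in sorted(articles, key=review_priority, reverse=True):
    a["indexed"] = a["views"] >= MIN_VIEWS_TO_INDEX
    print(f'{a["title"]}: review score {review_priority(a):.0f}, indexed={a["indexed"]}')
```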

If you’ve set up your KB ontology correctly, and you maintain it as the single source of truth that other systems (such as your AI) pull from, it’ll greatly minimize the effort of updating content.

Best practices for performance metrics

Measuring the performance of a KB accurately can be challenging. Helpfulness metrics, like the ones we listed above, can provide a good signal on the success of your KB. However, they often lack context about why a review was positive or negative. Other metrics, like time spent per article, are subjective and ambiguous.

The ideal measurement is the automated resolution rate, and without a doubt, knowledge management excellence will have a profound positive impact on that. Tracking the resolution rate of automated conversations that involve KB articles, and following the trend as your CX organization iterates on the ontology, is a great way to measure the performance of the KB and its ROI.
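
As a minimal sketch, assuming you can label each automated conversation as resolved or not and whether a KB article was involved (the record fields below are hypothetical), the metric reduces to a simple ratio:

```python
# Hypothetical conversation records: did the AI resolve it, and was a KB article used?
conversations = [
    {"resolved_by_ai": True,  "kb_article_used": True},
    {"resolved_by_ai": False, "kb_article_used": True},
    {"resolved_by_ai": True,  "kb_article_used": False},
]

kb_convos = [c for c in conversations if c["kb_article_used"]]
resolution_rate = sum(c["resolved_by_ai"] for c in kb_convos) / len(kb_convos)
print(f"Automated resolution rate (KB-assisted): {resolution_rate:.0%}")  # 50%
```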

Additionally, giving users a place to provide feedback on the documentation yields valuable learnings. Feedback can help fill gaps in the KB and prioritize efforts. Feedback can feel threatening for both parties, but thoughtful design unlocks a gold mine of insights, ranging from CX improvement opportunities to new market offerings.

Wrapping it up

In conclusion, as generative AI continues to advance, it is increasingly important to rely on a well-structured and up-to-date knowledge base to ensure safe, accurate, and relevant interactions between humans and machines. By following the best practices outlined in this blog post, companies can ensure that their knowledge management practices are optimized for use with generative AI, resulting in better customer experiences and more efficient operations.

The generative AI toolkit for customer service leaders

Evolve your team, strategy, and tech stack for an AI-first future.

Get the toolkit