Designing More Robust and Lower Risk Generative AI

RIViR Reads logo

If you’re asking yourself when the last time you received human support was, you’re not alone. After decades of dealing with sub-par IVRs (Interactive Voice Response systems) and grammatically incorrect chatbots, we now live in an era powered by conversational agents using regional dialects. New copilots accompany us while shopping on the web who can help us make decisions when booking hotels and flights.

In a rush to beat the competition and slash costs, companies have connected chatbots and generative AI systems to their websites and databases to initiate customer support. A company offering an instant agent to chat their way through a problem or sale is an alluring prospect for companies looking for an edge, but can expose them to costly consequences. Air Canada shut down its customer support bot after losing a court case to someone who received incorrect policy information for requesting a refund about a bereavement flight.

Generative AI is still a brand spanking new area of computing. Even though there aren’t many proven design patterns for integrating AI into software systems, there are ways of lowering the risk of having your conversational agent or chatbot run wild.

Avoid Common Pitfalls

One of the most common design patterns for integrating generative AI use Retrieval Augmented Generation (RAG). The RAG model augments a generative AI implementation with a data source. In a conversational scenario, the conversational agent’s software doesn’t rely on its training corpus for information, but retrieves information from a data source, usually a database. After retrieving data, the conversational agent returns generated content based upon the data.

Generative AI ChatbotConversational agents and chatbots require more than simple search. A way to bolster RAG implementations is to use a vector database. Vector databases are a storage technology that utilizes mathematical modeling to compute distances, or associations, between words and phrases. This technique can more closely align with a user’s intentions than regular ‘search for’ matches.

Vector databases should be configured to store information using the same embeddings as the generative AI being used. Embeddings are numerical representations of words that are closely related. Generative AI models, like LLMs, use embeddings to better understand user communications. Different AI models may use varying embeddings. For example, it’s better to use OpenAI’s embedding model to store data for information retrieval while using OpenAI’s GPT APIs.

Thorough prompt engineering can also minimize hallucinations and rampancy. The true power of large language models is their ability to interpret and understand what being asked of them. Tuning prompts to decipher and understand user intent programmatically can lead to greater success of finding the most appropriate information for users. OpenAI’s GPT models can perform simple logic and reasoning on user inputs that can be resolved in software. This helps conversational agents fully understand the context and need a user may have before simply generating a response.

Established software development and testing practices can further improve reliability. Adversarial prompt testing is a relatively new concept using older negative testing and malformed input practices. Adversarial testing can include everything from asking a conversational agent out on a date to embedding prompts in a conversation. This type of testing can be useful for developing prompts that understand valid inputs. Finally, keeping your company’s policies and procedures up-to-date and storing them appropriately for the conversation agent will go a long way to avoiding Air Canada’s mistake.

 

We are at the beginning of seeing generative AI embedded into applications and devices all around us. The rush to integrate GenAI is tempting, but the risks can be lowered by properly planning the use case, implementing solid designs, and avoiding pitfalls. Additionally, a human element should still be desired. Once upon a time, call supervisors would monitor conversations between human call center employees and customers. A similar system can be deployed for managing the conversations our AI employees will engage in tomorrow.

about the author

Will Mapp

As Chief Technology Officer, Will Mapp keeps a constant eye on the future and ensures Qlarant is at the forefront of the latest and emerging technologies. See all posts from Will Mapp, III.

Leave a Reply

Let's Talk About Solutions

How can Qlarant help you improve quality and program performance?