Harnessing AI for Quality Assurance in Customer Interactions: Our Journey
Introduction
As an organization at the forefront of the fusion between human and artificial intelligence, we have embarked on an exciting journey, and we want to show you how we are automating quality assurance (QA) in our contact center with AI.
The Role of QA in Our Contact Centers
Our QA team has always been the backbone of customer satisfaction, ensuring every interaction aligns with our company standards. We have traditionally relied on manual methods or third-party solutions to assess these interactions. However, the emergence of AI has opened a new, more cost-effective, and exciting avenue that we decided to explore.
Our Proof of Concept with AI and QA
We conducted a proof of concept where we fed chat transcripts into a popular language model, ChatGPT. Our goal was to evaluate the model’s efficiency and accuracy in assessing these interactions based on a scoring system we have honed over the years.
Automating QA Forms with AI: Our Approach
Step 1: Choosing GPT-4 as Our AI Model
Based on our specific needs, we selected GPT-4 as our AI model of choice. This model has been trained on an extensive dataset and is capable of generating contextually relevant responses.
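For illustration, here is a minimal sketch of what a single evaluation call looks like with the openai Python SDK; the transcript and system prompt below are placeholders, not our production values:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Placeholder transcript; in production this comes from our chat platform.
transcript = (
    "Agent: Hello! How can I help you today?\n"
    "Customer: My order arrived damaged.\n"
    "Agent: I'm sorry to hear that. I've issued a replacement right away."
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a contact-center QA evaluator."},
        {"role": "user", "content": f"Evaluate this interaction:\n\n{transcript}"},
    ],
    temperature=0,  # deterministic output is preferable for scoring
)
print(response.choices[0].message.content)
```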
Step 2: Analyzing Our QA Form
Understanding our QA form was critical to the success of automating the process. We scrutinized every aspect of the forms that our clients have us use, considering the number and types of fields, the kinds of answers expected, and any specific formatting or validation rules. We also evaluated the context in which these forms are used.
Most of our QA forms comprise a series of fields, each representing a different aspect of the customer interaction, from the initial contact to the final resolution. Some fields require binary (yes/no) responses, while others demand more detailed descriptions or evaluations. Still others call for numerical ratings or sentiment scoring, as sketched below.
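As a rough sketch, the structure of such a form can be captured in code; the field names and weights here are hypothetical, since the real forms are client-specific:

```python
from dataclasses import dataclass
from typing import Literal

@dataclass
class QAField:
    name: str
    kind: Literal["binary", "rating", "text", "sentiment"]
    weight: float = 1.0  # contribution to the overall score

# Hypothetical form definition; real forms vary by client.
FORM_FIELDS = [
    QAField("greeting_used", "binary"),
    QAField("issue_resolved", "binary", weight=2.0),
    QAField("professional_tone", "rating"),        # e.g. a 1-5 scale
    QAField("customer_sentiment", "sentiment"),    # e.g. -1.0 to 1.0
    QAField("resolution_summary", "text", weight=0.0),  # descriptive, unscored
]
```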
It was crucial for us to know how these fields worked together to represent a comprehensive picture of the interaction. This involved a thorough understanding of the interdependencies between fields, the flow of information, and the conditions that determined the responses.
Our team spent a substantial amount of time familiarizing themselves with the form and its nuances, ensuring that every detail was accounted for in the automation process.
Step 3: Crafting Effective Prompts
Crafting the prompts to guide the AI model was arguably one of the most challenging yet creative aspects of the process. We understood that the efficacy of our AI solution was directly tied to the quality of these prompts.
To begin with, we had to ensure that the prompts were clear and specific. We learned that vague or overly broad prompts could confuse the model or lead to inaccurate evaluations. For example, instead of asking “Was the agent polite?”, we found it more effective to ask, “Did the agent use respectful language and maintain a positive tone throughout the interaction?”
We also discovered that we could use prompts to instruct the AI on how to interpret and evaluate certain parts of the chat transcripts. We could ask it to look for specific phrases, keywords, or sentiment, or instruct it to evaluate the interaction based on the time taken to resolve the issue.
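To make this concrete, here is a sketch of how a field-level prompt might be templated; the wording mirrors the specificity we aimed for, not our exact production prompts:

```python
# Hypothetical field-level instruction, phrased to be specific rather than vague.
POLITENESS_PROMPT = (
    "Did the agent use respectful language and maintain a positive tone "
    "throughout the interaction? Answer YES or NO, then quote the phrases "
    "from the transcript that support your answer."
)

def build_field_prompt(instruction: str, transcript: str) -> str:
    """Attach the transcript under review to a field-level instruction."""
    return f"{instruction}\n\nTranscript:\n{transcript}"
```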
The process of creating prompts was iterative. We started with a basic set of prompts and then refined them based on the results we obtained. We continually tested and tweaked the prompts to improve their effectiveness, incorporating feedback from our QA and customer service teams.
Overall, the time and effort spent on analyzing our QA form and crafting effective prompts were instrumental in the success of our AI implementation for QA in customer interactions.
Step 4: Thorough Testing and Validation
After defining the prompts and fine-tuning the model, we embarked on extensive testing. This involved running the model on various QA forms and comparing the AI-generated responses against the ones scored by our human QA team. For the most part, the AI has been within 10% of the human scoring, and in many cases, after reviewing both scores and outcomes, we found the AI model was arguably “more correct” than the human score, so the true deviation is most likely well under 10%.
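A minimal sketch of the kind of comparison we ran is below; the paired scores and the 10% tolerance are purely illustrative:

```python
def deviation(ai_score: float, human_score: float) -> float:
    """Absolute deviation of the AI score from the human score, as a fraction."""
    return abs(ai_score - human_score) / max(human_score, 1e-9)

# Hypothetical (AI score, human score) pairs for the same interactions.
pairs = [(92, 95), (78, 80), (88, 85), (90, 94)]

within = sum(1 for ai, human in pairs if deviation(ai, human) <= 0.10)
print(f"{within}/{len(pairs)} evaluations within 10% of the human score")
```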
Scaling with OpenAI/ChatGPT APIs
Once we had our QA form analyzed and effective prompts crafted, the next crucial step was to scale the process using OpenAI’s GPT API. The API allowed us to automate the scoring of customer interactions at a far larger scale than could ever be achieved manually, providing the framework that let our crafted prompts and the AI model work together in assessing customer interactions against our QA form.
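In outline, the scaling loop is simple: one API call per interaction. The sketch below uses the openai Python SDK, with fetch_transcripts() as a hypothetical stand-in for however transcripts are pulled from the chat platform:

```python
from openai import OpenAI

client = OpenAI()

def fetch_transcripts():
    """Stand-in for pulling transcripts from the chat platform."""
    yield "Agent: Hi there!\nCustomer: I was double-charged this month..."

def score_interaction(transcript: str) -> str:
    """One API call per interaction; each call carries its own context."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You are a contact-center QA evaluator."},
            {"role": "user", "content": f"Score this interaction against our QA form:\n\n{transcript}"},
        ],
        temperature=0,
    )
    return response.choices[0].message.content

for transcript in fetch_transcripts():
    print(score_interaction(transcript))
```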
One of the challenges we encountered during this process was managing the continuity of the AI’s “thought process.” Unlike interactive multi-turn conversations, where the AI model remembers past prompts and responses within the session, the API calls are stateless: they maintain no memory of previous interactions.
This means that each prompt has to be self-contained and provide all the necessary context for the AI to generate an accurate response. This required us to craft our prompts in a way that includes all the necessary information and instructions for each assessment, instead of relying on the continuity of an ongoing conversation.
For instance, if we wanted to evaluate whether the agent addressed the customer’s problem effectively, we had to craft the prompt so that it not only asks the AI to assess the effectiveness of the problem-solving but also provides the necessary context like what the problem was, what solution was offered by the agent, and how the customer responded to it.
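A sketch of such a self-contained prompt builder, with the context fields drawn from the example above (the function and parameter names are hypothetical):

```python
def build_effectiveness_prompt(problem: str, solution: str, customer_reaction: str) -> str:
    """Each call must carry its full context, since the API keeps no session memory."""
    return (
        "Evaluate whether the agent addressed the customer's problem effectively.\n\n"
        f"Customer's problem: {problem}\n"
        f"Solution offered by the agent: {solution}\n"
        f"Customer's response to the solution: {customer_reaction}\n\n"
        "Answer YES or NO, with a one-sentence justification."
    )
```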
Despite these challenges, the API proved to be an invaluable tool in our efforts to automate QA in customer interactions. It made the process not only more efficient but also more consistent by minimizing the human error factor.
Using OpenAI’s GPT API, we managed to take our QA process to the next level, marking a significant step forward in harnessing AI for quality assurance in customer interactions.
The Complexity and Team Effort
The complexity of our QA systems demanded different sets of questions and prompts for various scenarios. This task required collective input from our IT department, management, floor staff, and QA teams. Together, we were able to create robust prompts that could cater to a wide range of customer interactions.
The Potential Benefits and Challenges Ahead
We believe our AI-led QA initiative could revolutionize our customer interactions, offering a cost-effective and efficient alternative to our current solutions. However, we also recognize that there are challenges ahead, particularly concerning data security and the integration of these scores into our existing CRM platforms.
In Conclusion
As of now, our AI isn’t learning from interactions, but it is providing invaluable support to our QA process. Although our journey is still a work in progress, our initial results are promising. Automating QA in contact centers with AI might not be a distant dream after all; it’s a reality we are actively shaping.