Glossary

A

AI (Artificial Intelligence)

is a type of computer system that is able to perform tasks that normally require human intelligence: decision-making, speech recognition and understanding, translation between languages, and more.

AI agent

AI agents are autonomous systems that can perform tasks without constant human intervention. They analyse data, make decisions based on algorithms, and interact with other systems and users. These agents can be virtual assistants, chatbots, data analysis systems, and other tools that simplify and automate various business processes.

Through more flexible automation, they can improve customer service efficiency, analyse business metrics, and even enhance a company’s competitiveness. With such systems, companies can create intelligent chat assistants for customers, automate document analysis, optimise HR processes, and much more.

AI Copilot

AI Copilot is an AI-powered virtual assistant designed to assist with various tasks, from text and image generation to script creation from audio and more. These tools are developed for businesses, company employees, and individual users to improve personal productivity and eliminate work-related routines. In the AI Copilot interface, all services are often presented as convenient applications that are built on various neural network models and help solve specific business tasks.

In addition to the dialogue mode, applications can be created to automate the most common tasks: preparing meeting minutes, Q&A based on large datasets, summarisation, rewriting and generating text, transcribing audio recordings, converting texts into speech in various formats with different voices and in multiple languages, developing code and generating images. AI Copilot also simplifies working with other document formats – you can send the bot a link to a website or file (txt, docx, pdf, audio files) to translate the text into another language, get a summary of the content, or use other functions.

AI Copilot for businesses may also offer APIs for integration directly with companies’ information systems and employees’ workstations.

AI search

An AI search engine is an “intelligent” search system that goes beyond simply providing links and websites. It independently studies sources, analyses information, and delivers a complete answer to the user’s query.

AI search services that are transforming internet search are based on machine learning, specifically Natural Language Processing (NLP). Language models are trained to interpret audio and text, analyse large amounts of data, summarise information, and extract relevant details from text queries. Language neural networks can also generate text from predefined prompts and translate it into various languages. Companies like Microsoft, Google, Baidu, and others have already implemented such technologies.

Alternative paths (unexpected conversational turns)

unexpected outcomes of a conversation: exceptions that arise when the user sends answers that the chatbot cannot process. For example, a service delivers only within Germany, and the user enters an address that is outside of Germany. In this case, no delivery can take place. Such edge cases should be considered in advance. For example, you can inform the user that you only deliver within Germany, or you can give them the option to enter the address again in case of a misunderstanding.

Artificial general intelligence (AGI)

AGI can be defined as artificial intelligence that operates across a wide range of tasks and has strong generalisation abilities in various contexts. In other words, it refers to AI capable of completing tasks as successfully as a human.

AGI represents the next level of AI development, unlike most artificial intelligence models today, which specialise in specific tasks. In principle, general AI would be able to handle any intellectual task a human can. It does not yet exist, and estimates of when it might emerge vary widely; some experts anticipate it within this decade.

Automated Speech Recognition (ASR)

is a technology that allows users of information systems to speak entries rather than punching numbers on a keypad.

B

Benchmarking

AI benchmarking is key for evaluating the performance of Large Language Models (LLMs). It offers a reliable way to assess and compare these models and AI systems across different tasks. By defining specific setups and relevant tasks, benchmarks ensure that LLMs are tested under similar conditions.

Standard benchmarks cover language generation, comprehension, translation, logical reasoning, and code understanding. They often use standardised tests similar to those in human education to measure performance. With many models available, these benchmarks help users determine which LLM is best suited for their goals by clarifying what “good performance” looks like for different tasks.

Bot

is a computer program that acts as an intelligent intermediary between people, digital systems, and Internet-enabled things and is able to interpret text, hand gestures, images or video provided by users in real-time and respond to questions.

C

Channel

the place where your chatbot will exist. These days, chatbots can operate in almost any channel where a two-way conversation is possible, so pick one where your target audience spends time. Examples of channels are Facebook Messenger, WhatsApp, Telegram, a web widget, etc.

Chatbot

is a computer program that simulates a human conversation, either written or spoken, allowing users to interact with digital devices as if they were communicating with a real person. Chatbots often combine click commands and keywords (such as asking a customer to choose a topic, e.g. transferring money or checking a balance) with machine learning to help resolve problems or direct customers to a live agent for further troubleshooting and resolution.

ChatGPT

ChatGPT (Chat Generative Pre-trained Transformer) is a chatbot developed by OpenAI in 2022, based on generative AI. Within five days of its launch, it gained 1 million users and exceeded 100 million in two months.

The GPT bot is both a conversational partner and a business assistant, capable of tasks like replacing customer support, generating content, or providing information.

It is trained on vast amounts of text, which gives it a broad knowledge base and fast responses. Drawing on the terabytes of online content seen during training, it quickly produces information and continues conversations instantly.

Limitations of ChatGPT:

— Inaccuracy of responses. The quality of the GPT bot’s answers depends on the data gathered and analysed, which may be incorrect or outdated. Additionally, the effectiveness of the responses heavily relies on the proper phrasing of the prompt (the question).

— Lack of 100% reliability. In many fields, relying solely on verified sources is crucial, but with AI, it’s impossible to know what data was used to generate the response. This uncertainty makes ChatGPT less dependable in situations where accuracy is critical.

— Vagueness in responses. ChatGPT is sometimes criticised for overly wordy and formal responses. Behind the polished phrasing, it can be difficult to grasp the real meaning or substance of the answer.

— Copyright infringement. ChatGPT generates responses based on information from open sources, including copyrighted materials, without obtaining permission from the content creators.

Chat widget

a graphical user interface (GUI) component that enables users to interact with a conversational AI system through a chat-based interface. It is a user-facing element that provides a platform for users to input their queries or messages and receive responses from the AI system in a conversational format.

The chat widget is typically embedded within a website, mobile app, or other digital platforms, allowing users to have interactive conversations with the conversational AI system.

Conversational AI

is a dialogue system that interacts with users based on the principles of human-to-human communication. This communication usually happens through voice or text messages, or some non-verbal signal (i.e., gestures) which is available on the device. The key to this interaction is that the speaker and the machine can understand each other and hold a conversation on topics the AI has learned about. Developers use Conversational AI to build conversational user interfaces, chatbots and virtual assistants for a variety of use cases and integrate them into chat interfaces such as messaging platforms, smart devices, social networks and websites.

Conversational Script

is a set of dialogues in the conversation, which is usually created by the Conversational Designer before development.

Conversational Flow

has two meanings: a) the way the conversation is going; b) a kind of flowchart that represents parts of the whole conversation. In terms of Playbook, the conversational flow is used primarily in meaning (b). Since the bot is technically a state-machine system, every flow should illustrate the small independent bot states, the user’s reactions as input, the bot’s output, and the connections between all of these parts. Every state is a stage that follows some user input, during which the bot performs some logic or reacts in some way; after that, the bot moves internally into the following state.
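As a rough illustration of the state-machine view described here, the sketch below models a flow as a table of states, each with a bot output and transitions keyed by user input. The state names, prompts, and the `step` and `run` functions are hypothetical and chosen only for this example.

```python
# Hypothetical flow for illustration: each state has a bot output and
# transitions keyed by the (normalised) user input.
FLOW = {
    "ask_size": {
        "prompt": "Which size would you like: small or large?",
        "transitions": {"small": "confirm", "large": "confirm"},
    },
    "confirm": {
        "prompt": "Shall I place the order?",
        "transitions": {"yes": "done", "no": "ask_size"},
    },
    "done": {"prompt": "Your order is placed.", "transitions": {}},
}


def step(state: str, user_input: str) -> str:
    """Return the next state; stay in the current state on unknown input."""
    key = user_input.strip().lower()
    return FLOW[state]["transitions"].get(key, state)


def run(inputs: list[str], start: str = "ask_size") -> list[str]:
    """Feed inputs through the flow and collect the bot outputs."""
    state = start
    outputs = [FLOW[state]["prompt"]]
    for text in inputs:
        state = step(state, text)
        outputs.append(FLOW[state]["prompt"])
    return outputs
```

Keeping the flow as data rather than nested conditionals makes it easier to compare against the flowchart a designer has drawn.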

Confirmation

parts of a conversational script that assure chatbot users that their input was received. “Sure”, “Okay”, “Excellent”, and “I see” are all ways in which the bot can acknowledge user input and make users feel that they are being heard. These also add a touch of humanity to the bot and build trust with the user. Confirmation can be implicit or explicit.

Customer agent platform

is a centralised platform that helps businesses manage their customer communication touchpoints through various channels.

D

Data labelling

the process of annotating or tagging data to provide labels or annotations that describe specific aspects or elements of the data. Data labelling is a critical step in training machine learning models for conversational AI, as it helps the models understand and learn from the labeled data, improving their ability to comprehend and generate appropriate responses during conversations. Data labelling typically involves annotating different components of a conversation, such as user utterances, system responses, intents, entities, dialogue acts, sentiment, or other relevant attributes.

Deep learning

Deep learning is a collection of machine learning methods that gained popularity with the development of artificial neural networks.

These methods use neural networks with many neurons and layers to extract features. In a multilayer neural network, in addition to the input layer (which receives data) and the output layer (which provides results), one or more hidden layers of computational neurons process the data. Each subsequent layer receives the output from the previous layer as its input.
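The layer-by-layer flow can be sketched in a few lines of plain Python. This is a minimal illustration, not a trained model: the weights are hand-picked, the sigmoid activation is one common choice among several, and the names `dense` and `forward` are hypothetical.

```python
import math


def dense(inputs, weights, biases):
    """One fully connected layer: weighted sum per neuron, then sigmoid."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(1.0 / (1.0 + math.exp(-total)))
    return outputs


def forward(inputs, layers):
    """Pass data through the layers; each layer consumes the previous output."""
    activation = inputs
    for weights, biases in layers:
        activation = dense(activation, weights, biases)
    return activation


# Hypothetical, hand-picked weights: 2 inputs -> 3 hidden neurons -> 1 output.
hidden = ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1])
output = ([[0.7, -0.5, 0.2]], [0.05])
prediction = forward([1.0, 2.0], [hidden, output])
```

In practice the weights are learned during training rather than written by hand, and libraries perform these sums as matrix operations.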

Artificial intelligence employing deep learning autonomously discovers algorithms to solve the initial task, learns from its mistakes, and provides more accurate results after each training iteration.

Deep learning is used to solve tasks related to facial recognition, speech recognition, text analysis, as well as image and video processing.

Dual-tone multi-frequency (DTMF)

is the set of tones generated by a telephone when its keys are pressed. Each key produces a pair of tones, one low-frequency and one high-frequency. DTMF was used in older voicebots that worked over phone calls.
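For readers who want to see the mapping, here is a small sketch of the standard DTMF keypad layout, in which every key combines one row frequency with one column frequency. The function name `dtmf_frequencies` is hypothetical; the frequency values are the standard ones.

```python
# Standard DTMF keypad: each key is the sum of one row tone and one column tone.
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_frequencies(key: str) -> tuple[int, int]:
    """Return the (low, high) frequency pair in Hz for a keypad key."""
    for row, keys in enumerate(KEYPAD):
        col = keys.find(key.upper())
        if len(key) == 1 and col != -1:
            return ROW_HZ[row], COL_HZ[col]
    raise ValueError(f"not a DTMF key: {key!r}")
```

A phone-based bot that detects these two frequencies in the audio can tell which key the caller pressed.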

E

Enterprise AI

Enterprise AI refers to the use of artificial intelligence (AI) technologies in large organisations to enhance various business functions. It involves applying AI tools and techniques to automate processes, improve decision-making, and develop new products and services.

In recent years, the corporate sector has closely observed the progress of generative AI. Now, the trend has shifted: companies are actively adopting AI technologies and modifying their development strategies accordingly. Generative AI is no longer merely a source of entertainment; it is now seen as a tool to enhance user efficiency. As a result, companies are integrating AI into their business processes.

If your organisation plans to integrate AI automation for improved efficiency and business results, consult our team. We will clarify where the technology can provide the most value to your enterprise and assist you in its adoption.

Entity

data buckets that contain words and phrases with similar characteristics. They can be fields, data or text describing just about anything: time, place, person, item, number, etc. Entities make it easy to extract important information from the user’s utterances. Examples of extracted information are a phone number, e-mail address, name, type of insurance, etc.
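One simple way to extract such information is pattern matching. The sketch below is only an assumption about how this might be done: the two regular expressions are deliberately loose, and the names `ENTITY_PATTERNS` and `extract_entities` are hypothetical.

```python
import re

# Deliberately simple, hypothetical patterns; production systems need
# stricter validation and locale-aware phone formats.
ENTITY_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "phone": re.compile(r"\+?\d[\d\s-]{6,}\d"),
}


def extract_entities(utterance: str) -> dict[str, list[str]]:
    """Return every match per entity type found in the utterance."""
    found = {}
    for name, pattern in ENTITY_PATTERNS.items():
        matches = pattern.findall(utterance)
        if matches:
            found[name] = matches
    return found
```

Entities such as names or insurance types usually cannot be captured by fixed patterns and are typically handled by trained models or curated word lists.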

Explicit confirmation

is a situation in which a bot asks the user to confirm by explicitly repeating parts of the query. It is useful when the bot’s confidence in recognising the intent is not high enough or when the stakes are high. For example, if the task involves transferring a large sum of money or sending a message to a number of contacts, it helps to check the data again to make sure there is no confusion.
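The decision between explicit and implicit confirmation can be sketched as a simple rule. The threshold value, the intent names, and both function names are assumptions made for this example; real systems tune these per use case.

```python
# Hypothetical threshold and intent names, chosen only for this sketch.
CONFIDENCE_THRESHOLD = 0.8
HIGH_STAKES_INTENTS = {"transfer_money", "send_bulk_message"}


def confirmation_mode(intent: str, confidence: float) -> str:
    """Pick explicit confirmation for risky or uncertain requests."""
    if intent in HIGH_STAKES_INTENTS or confidence < CONFIDENCE_THRESHOLD:
        return "explicit"
    return "implicit"


def confirmation_prompt(intent: str, confidence: float, summary: str) -> str:
    """Build the bot reply for the chosen confirmation mode."""
    if confirmation_mode(intent, confidence) == "explicit":
        return f"You want to {summary}. Is that correct?"
    return f"Okay, I will {summary}."
```

For example, `confirmation_prompt("transfer_money", 0.95, "send 5,000 euros to Anna")` asks the user to confirm even though recognition confidence is high, because the intent is marked as high-stakes.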

F

Fallback (CatchAll)

is a common term for the bot’s reaction to “No match” or “No input” events. A fallback is a designed state, not a system state, so CUI/UX designers should be responsible for creating it.

Example #1 (No match):

User: Do you know where the kangaroo lives?

Bot: I didn’t get that. Can you please rephrase?

Example #2 (No input):

User: [speech not recognised]

Bot: Can you please repeat?
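The two examples above can be sketched as a single reply function. The intent table and the function name `reply` are hypothetical; the fallback texts are taken from the examples.

```python
from typing import Optional

# Hypothetical replies and intent table, used only for illustration.
NO_INPUT_REPLY = "Can you please repeat?"
NO_MATCH_REPLY = "I didn't get that. Can you please rephrase?"
KNOWN_INTENTS = {"order pizza": "What kind of pizza would you like to order?"}


def reply(user_input: Optional[str]) -> str:
    """Answer a known request, or fall back on no-input / no-match events."""
    if user_input is None or not user_input.strip():
        return NO_INPUT_REPLY  # "No input": nothing was detected
    answer = KNOWN_INTENTS.get(user_input.strip().lower())
    if answer is None:
        return NO_MATCH_REPLY  # "No match": input is outside the design
    return answer
```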

Fine-tuning

Fine-tuning is the process of further training a pre-trained AI model on smaller, specialised datasets to improve its abilities and increase accuracy for specific tasks. If we compare neural network training to human learning, fine-tuning is much like university education. It’s during this stage that the algorithm gains the specialised knowledge it needs to perform well in a particular field, just as people do.

The first step in fine-tuning is selecting the pre-trained model that best suits the task. Next, it’s necessary to decide which layers will be frozen and which will be adjusted during fine-tuning. Freezing layers helps retain the capabilities of the pre-trained model.

The next stage involves setting the training parameters and processing the training dataset. This may include augmentation, which generates new elements from existing ones to increase the dataset size. Once the model and dataset are prepared, the fine-tuning process begins. After training, the model must be evaluated on a validation set or test dataset to assess its performance and answer quality.
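The idea of freezing some parameters while adjusting others can be shown on a deliberately tiny model. This is a toy sketch under strong assumptions, not a neural network: the "pre-trained model" is a line y = w*x + b, the weight `w` is frozen, and only the bias `b` is adjusted on a small task dataset. All names and values are hypothetical.

```python
def fine_tune(params, frozen, data, lr=0.05, epochs=200):
    """Gradient descent on mean squared error, skipping frozen parameters."""
    params = dict(params)  # leave the pre-trained values untouched
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            error = params["w"] * x + params["b"] - y
            grad_w += 2 * error * x / len(data)
            grad_b += 2 * error / len(data)
        if "w" not in frozen:
            params["w"] -= lr * grad_w
        if "b" not in frozen:
            params["b"] -= lr * grad_b
    return params


pretrained = {"w": 2.0, "b": 0.0}
# Small specialised dataset that follows y = 2x + 1.
task_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
tuned = fine_tune(pretrained, frozen={"w"}, data=task_data)
```

After training, the frozen weight is unchanged while the bias has moved close to 1. Deep learning frameworks apply the same principle by excluding frozen layers from gradient updates.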

Foundation Model

a base or fundamental model that serves as the starting point for the development of more specialized or advanced models in the field of artificial intelligence. In the context of large language models, a foundation model is a powerful and comprehensive language model that is pre-trained on a vast amount of text data to understand and generate human-like language. It serves as the building block or baseline for creating more specific models tailored to particular tasks or domains.

G

Generative AI

Generative AI refers to a class of AI models and algorithms that are designed to generate new content, such as text, images, music, or even entire videos. These models are trained on large datasets and learn patterns and structures to create new content that is similar to the data they were trained on. Generative AI models use techniques like neural networks, deep learning, and reinforcement learning to understand and mimic the underlying patterns in the training data and produce novel outputs. While conversational AI involves understanding and generating human-like conversations, generative AI is broader in scope and can generate content beyond just conversational interactions. Generative AI can be used for creative purposes, such as generating artwork or music, as well as for other applications like data synthesis, content creation, and even deepfakes.

Generative pre-trained transformer (GPT)

GPT (Generative Pre-trained Transformer) is a powerful machine learning model designed for natural language processing (NLP). It can generate text, answer questions, translate languages, and handle various other text-related tasks. Developed by OpenAI, GPT is one of the most advanced models in the field. In recent years, it has gained widespread recognition for its ability to produce text that is difficult to distinguish from human writing, making it valuable in content creation and task automation.

GPT uses transformer architecture, which allows it to efficiently process large amounts of data and generate text with a high degree of accuracy. The model has undergone several stages of development, from its initial version to more recent versions, that have significantly enhanced its capabilities and performance. Notably, GPT generates text and understands context, making it particularly useful for complex tasks requiring a deeper understanding of language.

GPT has a wide range of applications, from automatically answering messages and creating personalised recommendations to writing texts and analysing natural language. A notable example of GPT’s success is the creation of human-like conversational agents that can engage in discussions on virtually any topic, ask questions, and respond according to the context. This opens up new possibilities for the digital world and enhances human capabilities in communication and information exchange.

Global intent

an intent that, owing to the bot’s flow structure and code implementation, can be triggered by a direct user request at any moment during the dialogue.

H

Hallucination/Hallucinate

Before the emergence of ChatGPT, hallucinations were considered an exclusively human phenomenon. However, according to many, this neural network often displays traits associated with human behaviour. For example, many people are familiar with the feeling of not knowing something but attempting to guess or imagine the answer based on existing knowledge. ChatGPT’s hallucinations occur similarly—when it encounters a gap in its knowledge.

The neural network is designed to always provide an answer, even when its training data does not contain the relevant facts. In that case it still generates a response from the statistical patterns in its training data, which can lead to incorrect or disconnected information.

AI hallucinations can take various forms, such as false news, incorrect statements, or fictional documents about people, historical events, or scientific facts that the neural network fabricates by predicting likely word sequences within a given context.

Happy Path (golden path, main path)

is a kind of conversational flow in which users achieve their goal in the simplest and most obvious way: they engage with the bot, write, click on or say the right thing, and follow through until the desired outcome is achieved, whatever that is. For example, if you have a pizza restaurant chatbot, then the Happy Path will be that the customer orders a pizza from your restaurant. If you’re using your chatbot internally for HR or onboarding purposes, then the Happy Path is that the user successfully receives the information that they need.

I

Implicit confirmation

is one that does not require confirmation from the user, but also leaves the option open for the user to confirm or deny. It makes the conversation a lot more natural, and closer to how humans talk with each other.

Intent

the main idea or purpose of the user’s utterance. Bots are usually built around a set of intents. The scope of intents represents the user’s needs and the bot’s possible replies. In practice, an intent is defined by a finite set of example phrases. A banking NLU system, for example, should be able to respond to “Can you show me my bank account?” or “Can you send money to someone?”, and both of those phrases correspond to one intent, “banking account”.

IVR (Interactive Voice Response)

a technology that allows users to interact with a computerised system using voice and telephone keypad inputs. It is commonly used in telephone-based systems to provide automated self-service options and route calls to the appropriate resources or departments. IVR systems are designed to handle a large volume of incoming calls and provide a seamless and efficient user experience. They use pre-recorded voice prompts and menus to guide callers through a series of options and gather information or provide automated responses. IVR systems often integrate with speech recognition technology to enable voice-based input from callers.

L

LLMs (Large Language Models)

sophisticated artificial intelligence models that have been trained on vast amounts of text data to understand and generate human-like language. These models leverage deep learning techniques, particularly using neural networks with many layers, to process and analyze textual information. LLMs are designed to understand and generate text in a way that resembles human language patterns, allowing them to perform a wide range of natural language processing tasks. They can comprehend and generate coherent sentences, understand context, detect sentiment, translate languages, answer questions, summarize documents, and even engage in conversational interactions. One of the most well-known examples of an LLM is OpenAI’s GPT (Generative Pre-trained Transformer) series, including models like GPT-3. These models have achieved remarkable language generation capabilities, demonstrating the potential for various applications in content creation, virtual assistants, chatbots, language translation, and more.

Local intent

in contrast to global intents, these are intents that are available only within certain dialogue branches, under certain conditions or circumstances, or after, for example, slot-filling.

Low-code

Low-code platforms enable the creation of IT products with minimal need for writing and editing code. These services are similar to no-code platforms but allow developers to modify code as needed. The main advantage of low code is the time savings for developers while still providing the flexibility to make adjustments to the product.

M

Mock object

a simulated or imitation object that is created for testing and development purposes. Mock objects are used to mimic the behaviour of real objects or components within a conversational AI system, allowing developers to test and validate their code in isolation.
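A minimal sketch using `Mock` from Python’s standard `unittest.mock` module shows the idea. The weather client, its `get_forecast` method, and the function `greet_with_forecast` are hypothetical stand-ins for whatever external service a bot depends on.

```python
from unittest.mock import Mock


def greet_with_forecast(weather_client, city: str) -> str:
    """Bot logic under test; weather_client is any object with get_forecast()."""
    forecast = weather_client.get_forecast(city)
    return f"The weather in {city} is {forecast}."


# The mock stands in for a real (hypothetical) weather service client.
fake_client = Mock()
fake_client.get_forecast.return_value = "sunny"

message = greet_with_forecast(fake_client, "Berlin")
# Verify that the bot logic called the dependency as expected.
fake_client.get_forecast.assert_called_once_with("Berlin")
```

The bot logic is exercised without any network call, so the test stays fast and does not depend on the real service being available.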

Multimodal interface (also mixmodal interface)

processes various user input modes, such as speech, text, touch, hand gestures, etc., and supports various bot outputs. It also combines modalities in a way that replicates interpersonal human interaction. Multimodal interfaces allow users to switch flexibly between different types of interaction, for example by using voice as the input mechanism together with a graphical user interface (GUI) as the output.

N

NER (Named Entity Recognition)

a subtask of natural language processing (NLP) that focuses on identifying and classifying named entities in text. Named entities are specific objects, locations, names of people, organizations, dates, quantities, and other elements that carry semantic meaning. NER helps extract important information from user queries or statements, enabling the system to understand and respond appropriately. By identifying and categorising named entities, NER provides context and relevance to the conversation, facilitating more meaningful interactions.

NLG (Natural Language Generation)

the component or process of a conversational AI system that focuses on generating human-like, natural language responses to interact with users. By generating coherent and contextually appropriate responses, NLG enhances the user experience, improves user engagement, and creates a more interactive and satisfying conversation. NLG is a key technology in transforming structured data and system prompts into meaningful and engaging human language responses, allowing conversational AI systems to have dynamic and interactive conversations with users.

NLP (Natural Language Processing)

a branch of artificial intelligence that focuses on the interaction between computers and human language. It involves the analysis, understanding, and generation of natural language to enable effective communication and interaction between humans and machines.

NLU (Natural Language Understanding)

the component or process of a conversational AI system that focuses on interpreting and comprehending the natural language input provided by users. NLU is responsible for extracting the meaning, intent, and context from user queries, commands, or statements, enabling the conversational AI system to understand and respond appropriately. It involves a range of techniques and algorithms that analyse and process the user’s input to derive actionable insights. NLU plays a critical role in conversational AI systems as it bridges the gap between human language and machine understanding. By accurately interpreting user input, NLU enables the conversational AI system to provide relevant and contextually appropriate responses, improving the overall user experience.

NLU Engine

is an AI-powered solution for extracting information from utterances in a human language (the user’s utterances) to use it in further dialogue. A good NLU system is designed to keep the conversation going smoothly even when it doesn’t receive enough information from the client. In its most simplified form, the process of “understanding” a language consists of the following major steps: text preprocessing (query); classification of the request, correlation with one of the classes known to the system (definition of intent); retrieving query parameters (entity retrieval).
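The three steps can be sketched as a tiny pipeline. This is only an illustration under stated assumptions: intent classification here is a keyword overlap count, whereas real NLU engines typically use trained classifiers, and the intent names, keywords, and function names are all hypothetical.

```python
import re

# Hypothetical intents and keywords, for illustration only.
INTENT_KEYWORDS = {
    "check_balance": {"balance", "account"},
    "transfer_money": {"send", "transfer", "money"},
}


def preprocess(query: str) -> list[str]:
    """Step 1: lower-case the query and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", query.lower())


def classify_intent(tokens: list[str]):
    """Step 2: pick the intent sharing the most keywords, or None."""
    scores = {
        intent: len(keywords & set(tokens))
        for intent, keywords in INTENT_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def extract_amount(query: str):
    """Step 3: retrieve a simple numeric parameter (entity) from the query."""
    match = re.search(r"\d+(?:\.\d+)?", query)
    return float(match.group()) if match else None


def understand(query: str) -> dict:
    """Run all three steps and return the intent with its parameters."""
    tokens = preprocess(query)
    return {"intent": classify_intent(tokens), "amount": extract_amount(query)}
```

A result with `intent` set to `None` is the case a dialogue system would treat as a “No match” event.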

No-code

No-code, or zero-code, technologies allow users to build IT products (websites, web and mobile applications) without writing code. These tools do not require programming skills or software development expertise. No-code services automate various development tasks, making it easier to create and manage applications. No-code works best for standard business processes relevant to nearly all types of businesses, such as building simple websites or chatbots.

No input

an event in which a dialogue system doesn’t detect any input (voice, text, etc.) from a user.

No match

an event in which a dialogue system cannot match the user’s utterance with the predesigned states in the bot’s code.

O

Open-source

Open-source software is software with source code available to everyone. Having open source means anyone can examine how the program is built, identify vulnerabilities, and create something compatible or similar. Users can also take an algorithm and build something based on it or find a flaw and suggest improvements.

Popular examples of open-source software include operating systems like Linux, Android, and BSD. Open-source software is developed using languages such as Python, Java, and JavaScript. Among well-known open-source AI models are LLaMA, Stable Diffusion, and many others.

P

Pattern

a predefined structure or template used to recognise and interpret user input or generate system responses. Patterns play a crucial role in natural language understanding and generation, allowing conversational AI systems to understand user queries, extract relevant information, and generate appropriate responses. By defining and recognising patterns in user input, conversational AI systems can accurately interpret user intent, gather relevant information, and generate appropriate and contextually relevant responses. Patterns serve as a foundational building block for designing conversational AI systems that can engage in meaningful and effective conversations with users.

Prompt engineering

Prompting, or prompt engineering, is a method of interacting with neural networks that helps set tasks for them. Prompting involves crafting prompts or instructions with detailed information to ensure the system generates high-quality content.

With the rise of AI models like GPT or Midjourney, it became evident that bots don’t always fully understand human queries. To address this issue, a new approach emerged, allowing users to communicate with AI models almost like they would with a person without needing specialised programming languages. This expertise is held by prompt engineers, developers who train AI, identify errors, and generate responses using well-constructed prompts and commands.

Prompting has evolved significantly in a short time. Many foundational prompts that previously required human input are now integrated into various AI applications. However, manual prompting remains an essential skill for those working with neural networks, especially when tackling complex, creative, or critical tasks.

Prompter

a component or system that assists users in formulating or refining their input during a conversation with a conversational AI system. The purpose of a prompter is to guide users, suggest possible options, or provide contextually relevant prompts to facilitate a smoother and more effective conversation. Prompters play a crucial role in improving user engagement, reducing user effort, and ensuring successful interactions between users and conversational AI systems. They are designed to make conversations more intuitive, efficient, and user-friendly, ultimately leading to a more satisfying conversational AI experience.

R

Retrieval augmented generation (RAG)

RAG (Retrieval-Augmented Generation) is a method used with large language models (LLMs) that incorporates external information to enhance a user’s query. This additional data is combined with the query and then processed by the language model, enabling it to deliver more accurate and comprehensive responses.

It’s beneficial for foundation models, like LLMs, when they are asked questions beyond their training data. For example, suppose you’re developing an AI-powered customer support system using a large language model and have a knowledge base filled with FAQs or detailed functional descriptions. In that case, you can utilise retrieval-augmented generation (RAG) instead of continually retraining the model. This approach is more cost-effective and efficient.

Here’s how it works: when a user asks a question, the RAG system searches for the relevant article or content in the knowledge base and passes both the user’s question and the pertinent information from the knowledge base into the LLM. This way, the model can generate a more accurate and informed response without needing to be retrained every time the knowledge base changes.
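The flow described above can be sketched as follows. This is a simplified illustration: retrieval here is plain word overlap, whereas real RAG systems usually rely on a search index or vector embeddings, and the knowledge base entries, function names, and the `llm` callable are all hypothetical.

```python
import re

# Hypothetical knowledge base; a real system would use a search index
# or vector store instead of word overlap.
KNOWLEDGE_BASE = [
    "Refunds are processed within 5 business days of the return request.",
    "Delivery is available within Germany only.",
    "Passwords can be reset from the account settings page.",
]


def tokenize(text: str) -> set[str]:
    """Lower-case the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: list[str]) -> str:
    """Return the document sharing the most words with the question."""
    words = tokenize(question)
    return max(documents, key=lambda doc: len(words & tokenize(doc)))


def build_prompt(question: str, documents: list[str]) -> str:
    """Combine the retrieved context with the user's question."""
    context = retrieve(question, documents)
    return f"Context: {context}\nQuestion: {question}\nAnswer:"


def answer(question: str, llm) -> str:
    """`llm` is any callable mapping a prompt string to a reply string."""
    return llm(build_prompt(question, KNOWLEDGE_BASE))
```

Updating the knowledge base only means editing the list of documents; the language model itself is left unchanged.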

S

Sample dialogues

is a conversational design technique that helps show how a future CUI will work.

Example:

Bot: Hi! This is a Pizza bot. Would you like to order a huge cheesy pizza?

User: Yes, sure!

Bot: What kind of pizza would you like to order?

Script

a predefined sequence of instructions or dialogue that guides the behaviour of a conversational AI system during a conversation with a user. A script outlines the flow, structure, and specific responses of the system based on different user inputs or system prompts.

Scripts are commonly used in conversational AI to design and control conversational interactions, ensuring a consistent and effective user experience.

Skill discovery

is the process of informing the user about the bot’s capabilities, and a crucial element in making virtual assistants more effective and humanlike. A skill is a Conversational AI component containing the dialogue configuration. It usually covers some piece of functionality, e.g. “weather forecasting”, “applying forms”, etc.

Slots

these are chunks of information incorporated into the user’s speech. While speaking or typing, some people will give information up front, while others will provide it piece by piece. Two different people might say the same thing in two different ways, so to make your bot flexible, it is important to teach the dialogue system to recognise what information already exists in a request. Designed slots help to define which pieces of information still need to be requested from the user by the bot. Slots are usually entities.

Slot-filling

this is the process of filling slots during a bot-human conversation. Creating an appointment is a good use case for slot-filling. Scheduling an appointment requires a date, time, and location. The user can say “I need an appointment for Elm on Tuesday at 3 p.m.” (filling all slots at once) or “I need an appointment” (filling no slots, so the bot has to ask additional questions until all the slots are filled). CUI/UX specialists should predesign all possible variants, including fully and partially filled slots.
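The appointment example can be sketched as a small loop that asks only for what is still missing. This is a minimal sketch, assuming the NLU has already extracted entities into a dictionary; the slot names, the prompt wording, and the helper names `missing_slots` and `next_prompt` are illustrative, not part of any particular platform.

```python
# Slots an appointment needs, in the order the bot asks for them.
REQUIRED_SLOTS = ("date", "time", "location")

# Hypothetical follow-up questions, one per slot.
PROMPTS = {
    "date": "Which day would you like to come in?",
    "time": "What time suits you?",
    "location": "Which location would you prefer?",
}


def missing_slots(filled):
    """Return the required slots that have no value yet."""
    return [slot for slot in REQUIRED_SLOTS if not filled.get(slot)]


def next_prompt(filled):
    """Return the next question to ask, or None when every slot is filled."""
    missing = missing_slots(filled)
    return PROMPTS[missing[0]] if missing else None


# "I need an appointment for Elm on Tuesday at 3 p.m." fills everything at once.
all_at_once = {"location": "Elm", "date": "Tuesday", "time": "3 p.m."}

# "I need an appointment" fills nothing, so the bot starts asking.
nothing_yet = {}

# A partially filled request: only the day is known.
partial = {"date": "Tuesday"}
```

Here `next_prompt(all_at_once)` returns `None`, so the bot can book straight away, while `next_prompt(partial)` skips the date question and asks for the time.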

Speech synthesis (text-to-speech, TTS)

this is the technology and process of converting written text into spoken audio. It involves generating human-like speech from textual input, allowing conversational AI systems to interact with users using natural-sounding voices.

State

this term comes from systems analysis and the state-machine model. Each step in a dialogue system is a state. Some examples of states are waiting for user input, performing some internal logic, reacting to a user’s request, moving to the following state, storing or processing client or external system data, etc.
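A dialogue modelled as a state machine can be sketched as a table of transitions. The state names and event names below are assumptions chosen to mirror the examples above; real dialogue platforms define their own.

```python
from enum import Enum


class State(Enum):
    """Illustrative dialogue states (names are hypothetical)."""
    WAITING_FOR_INPUT = "waiting_for_input"
    PROCESSING = "processing"
    RESPONDING = "responding"


# (current state, event) -> next state
TRANSITIONS = {
    (State.WAITING_FOR_INPUT, "user_message"): State.PROCESSING,
    (State.PROCESSING, "logic_done"): State.RESPONDING,
    (State.RESPONDING, "reply_sent"): State.WAITING_FOR_INPUT,
}


def step(state, event):
    """Move to the next state, or stay put if the event is not expected here."""
    return TRANSITIONS.get((state, event), state)


# One full turn of the conversation.
state = State.WAITING_FOR_INPUT
for event in ("user_message", "logic_done", "reply_sent"):
    state = step(state, event)
```

Keeping the transitions in one table makes it easy to see every path through the dialogue and to spot states that can never be reached.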

Structured data

Structured data has a defined format that is easy to describe. It is convenient to work with because it can be easily stored, sorted, analysed, and processed. With the emergence of large language models, companies can now convert unstructured data — such as customer reviews, reports, documentation, and social media content — into structured formats that are simpler to analyse.

Summarisation

Summarisation is one of the key capabilities of neural networks, and it has many applications in the workplace. AI excels at processing and analysing large volumes of text, producing a concise summary and generating insights. It can be used in document summarisation, analysis of recorded conversations, creation of follow-ups and meeting notes, and many other business applications.

T

Training datasets

A dataset is a processed and structured collection of data. Each object within it has specific attributes, such as features, relationships between objects, or a particular position in the data sample. For training neural networks, a dataset must contain a sufficient amount of data, especially when multiple features are being analysed.

Training phrases

are a set of example user utterances. It is better to collect or create a variety of training phrases for each intent, according to the chosen NLU classification model. The best sources for gathering groups of training phrases are sets of saved dialogues between users and customer support agents.

Token

a specific unit or element within a script that represents a meaningful component of a conversation. Script tokens are used to break down the dialogue or instructions into smaller, manageable parts, allowing for easier processing, analysis, and manipulation by conversational AI systems. In addition to their role in model training, script tokens are also important in conversation management, allowing developers to modify or extend the conversation flow by manipulating or replacing specific tokens. This flexibility enables conversational AI systems to adapt to different use cases, handle various scenarios, and provide personalised and dynamic conversations.

U

UI (user interface)

any graphical interface through which people interact with a system, usually via a display using a touchscreen or peripheral devices (buttons, mouse). Examples include modal windows on websites, mobile applications, devices with hardware buttons, etc.

Unstructured data

Unstructured data consists of data sets without a defined structure. This leads to challenges when processing and extracting value from them. Although organisations have access to a vast amount of information, they often struggle to derive useful insights because the data is in raw form.

One of the key applications of AI is extracting important information from large volumes of unstructured data. This is particularly relevant when dealing with lengthy reports, legal documents, and other types of documentation that are difficult to analyse manually.

Utterance

is a continuous piece of speech beginning and ending with a clear pause. As speakers take turns to produce utterances, they strive to make them recognisable: that is, not only for the counterpart to distinguish words and sentences, but also to interpret the meaning of the utterance and react in the desired way. For example, when a user states “I’d like to order a pizza please”, the entire sentence is the utterance. There is no strict rule about what an utterance comprises. It can be a sentence, but it does not need to be a complete sentence. It can also consist of multiple sentences. Fun fact: even a simple sigh can be understood as an utterance.

UX (user experience)

the term used to describe the whole of a user’s interaction with a system and the user’s emotional reaction to it. User experience designers take the utmost care to ensure that systems are intuitive and easy to comprehend. Conversational User Experience is a user experience that combines chat, voice, or any other natural language-based technology to mimic a human conversation.

V

Voicebot

is a bot that communicates through voice replies. Voicebots can be smart (with NLU) or rule-based (with DTMF). Voicebots usually run on IVR, Smart IVR, and smart assistants in smart speakers. A voicebot can imitate specific human voice patterns: pauses, accents, etc.

W

Wake word

is the gateway between you and your digital assistant. Common wake words include “Hey, [bot name]”. When the bot detects its wake word, it records the next spoken request, sends the recording for intent processing, and then returns a response or initiates an action.

Webhook

a mechanism or integration method that allows real-time communication and data transfer between two applications or systems. Specifically, it enables communication between a conversational AI platform (such as a chatbot or virtual assistant) and external services or applications. Webhooks enable conversational AI systems to interact with various external services and systems seamlessly. They can be used for a wide range of purposes, such as querying databases, fetching real-time data, integrating with APIs, performing calculations, or connecting with other applications.
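As a rough illustration, a webhook endpoint can be sketched with Python’s standard library alone. Everything specific here is an assumption: the payload shape (`intent`, `order_id`), the `ORDERS` lookup standing in for an external system, and the `reply` field in the response. Each conversational AI platform defines its own request and response format.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-in for an external system such as an order database.
ORDERS = {"A1": "out for delivery", "B2": "delivered"}


def handle_webhook(payload):
    """Turn a platform request (assumed shape) into a reply for the user."""
    if payload.get("intent") != "order_status":
        return {"reply": "Sorry, I cannot help with that yet."}
    order_id = payload.get("order_id")
    status = ORDERS.get(order_id)
    if status is None:
        return {"reply": f"I could not find order {order_id}."}
    return {"reply": f"Order {order_id} is {status}."}


class WebhookHandler(BaseHTTPRequestHandler):
    """Accepts POST requests with a JSON body and answers with JSON."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(handle_webhook(payload)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Start the endpoint locally; call this yourself, it blocks until stopped."""
    HTTPServer(("127.0.0.1", port), WebhookHandler).serve_forever()
```

Keeping the business logic in `handle_webhook`, separate from the HTTP handler, means the logic can be tested without starting a server.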
