Alan AI basics

Before you start building a conversational experience with Alan AI, it is important to see the big picture. This section provides a high-level overview of Alan AI’s fundamentals, outlines the available tools and components, and explains how they work together.

Actionable AI Platform

Alan AI is a complete Actionable AI Platform to build, deploy and manage AI agents in days. AI agents built with Alan AI can interact with users in natural language via voice or text and take actions in any app or software to increase productivity and delight users.

The Alan AI Platform offers a comprehensive AI stack to build AI agents, including:

  • Alan AI Studio: web-based IDE featuring a toolkit for dialog design, testing, management and analytics

  • Alan AI Cloud: AI backend including natural language understanding and natural language generation, speech recognition, speech-to-text and text-to-speech conversion modules

  • Alan AI SDK: toolset for building and deploying conversational experiences across multiple platforms, OSes and devices

Alan AI takes care of all the heavy lifting, eliminating the need to configure separate speech components, set up the infrastructure for voice and text command processing and train speech recognition software. All deployment, maintenance and data processing tasks are managed by Alan AI, allowing businesses to focus on their core objectives.

To create an AI agent with the Alan AI Platform, you need to:

  1. Design a dialog script for your AI agent

  2. Add the Alan AI SDK to your app

  3. Analyze the users’ behavior and iterate

Voice and text AI agents

The Alan AI Platform enables the rapid development and deployment of voice and text AI agents.

Alan AI agents are AI-powered virtual agents that employ artificial intelligence and natural language processing to handle conversations with users and fulfill their requests. They can perform a variety of tasks: provide information, collect data, navigate in the app, complete app-specific activities and so on, all through voice- or text-based interactions.

Alan AI agents are in-app agents. They are embedded into the app and designed to provide guidance and support within the context of this particular app, software or platform. Due to their in-app nature, Alan AI agents extend the app functionality with conversational capabilities, making it more user-friendly and intuitive.

You can design the following types of AI agents with Alan AI:

  • Voice AI agents: AI-powered agents that understand users’ voice requests, respond to them by voice and take actions in the app

  • Voice and text AI chat: AI-powered chatbots or agents that leverage the chat interface to handle conversations, provide information and take actions through text- or voice-based messaging

Multimodal conversational design

Human interaction is a complex process that goes beyond speech and text. To convey information and meaning, we use a wide range of non-verbal cues and signals.

Alan AI employs a multimodal approach to enable organizations to create conversational experiences that closely resemble human communication. It allows combining several modalities: voice, text and visuals, for a more natural and intuitive user experience.

When performing users’ requests, AI agents can leverage the app’s GUI to provide better context and visual support for users. For example, if the user wants to know about a product, the AI agent can navigate to the product page so the user can refer to it. Or, it can display the cart during the checkout process. Blending voice and text with the app’s visuals helps increase accessibility and improve the reliability of performed actions and requested information.

Dialog scripts

To build an AI agent, you must write a dialog script in Alan AI Studio. The dialog script describes the anticipated conversation between users and the AI agent, including all topics, questions and phrases that users may ask or say, as well as the replies and actions that the AI agent must take in response.

Dialog scripts are written in JavaScript, so you can use regular programming logic to create customized conversation flows.

To streamline dialog design, it can be helpful to map out all potential conversation paths and prototype alternative routes that users may take beforehand. With a clear “navigation flow” of the conversation, creating a dialog script becomes much more straightforward.

Dialog models

Alan AI offers two approaches to dialog design: intentless and intent-driven. You can use either approach independently or combine them.

Intentless dialogs

The intentless dialog model does not require prior knowledge of the user’s intent — the goal or purpose behind the user’s message. It employs the capabilities of Large Language Models (LLMs) and generative AI to understand the user’s goal and respond appropriately.

The intentless dialog approach significantly accelerates dialog development. The script developer does not need to think up all possible utterances and phrases that users may say to express their need and purpose. Rather, the developer needs to provide content for the AI agent as a list of plain-text strings or URL-based data in the dialog script.

At dialog model build time, Alan AI ingests the content from the specified data sources to produce an app- and business-specific AI model. When the user asks a question or gives a command, Alan AI analyzes the user’s input, derives meaning from it and generates an appropriate response based on the context of the conversation.

The intentless dialog model is designed to handle undirected, open-ended conversations. It can be advantageous when the user’s goal is unclear or when the dialog is free flowing. It enables users to freely express their requests without being constrained by predefined intents and answers.

To build a dialog script using the intentless dialog approach, you can use the Q&A service.

Intent-driven dialogs

To let users accomplish tasks with text or voice commands or get a response from the AI agent, you can add intents to the dialog script. In Alan AI, an intent is a function that identifies the goal or purpose behind the user’s utterance and takes the appropriate action. For example, an AI agent for flight booking would have a separate intent for each type of request it supports.

To design an intent-driven dialog, the script developer needs to define all possible users’ intents and describe them in the dialog script.

When the user gives a voice or text command, Alan AI leverages NLU techniques such as intent classification and named entity recognition to analyze the user’s input and understand the intent. The user’s utterance is then matched to the best intent in the dialog script, and the AI agent provides an appropriate response or takes the action defined in this intent.

Intent-driven dialogs work best when the user’s goal is well defined and the dialog is structured and follows a fixed path. For example, an AI agent for a food ordering or hotel booking app can make use of the intent-driven dialog model.

Intent-driven dialogs can be designed using intents, slots and contexts, which are described in the Dialog tools section.

Dialog tools

The Alan AI Platform offers a set of tools and functions for dialog design.

Q&A service

The Q&A service enables you to develop dialog scripts using the intentless approach. With it, you can quickly build an AI agent that responds to users’ questions in natural language without requiring you to specify intents first.

To use the Q&A service, the script developer needs to provide a collection of texts in the corpus() function in the dialog script. These can be any types of materials: product manuals, guidelines, FAQ pages, articles, policies and so on, provided as plain-text strings or URL-based data. The Q&A service ingests the text input and crawls the specified URLs to obtain the content and build an app-specific AI model for the AI agent, which is then used in dialogs with users.
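As a rough sketch of what this can look like in a dialog script, the snippet below passes one plain-text source and one URL-based source to corpus(). The corpus stub at the top is a hypothetical stand-in that only records its arguments so the snippet runs outside Alan AI Studio; in a real script the function is provided by the platform. The sample text, the example.com URL and the depth option are assumptions for illustration.

```javascript
// Hypothetical stand-in for the platform-provided corpus() function.
// It only records the sources so this sketch can run locally.
const sources = [];
const corpus = (source) => { sources.push(source); };

// Plain-text content (sample text, assumed for illustration)
corpus(`
    Orders are shipped within two business days.
    Returns are accepted within 30 days of delivery.
`);

// URL-based content (placeholder URL; the depth option is an assumption)
corpus({url: 'https://example.com/faq', depth: 1});
```

In Alan AI Studio, only the two corpus() calls would go into the dialog script.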

Intents

To create dialogs using the intent-driven approach, the script developer must add a set of intents to the dialog script. A typical dialog script includes many intents that can be used to satisfy various types of user requests to the AI agent.

An intent is essentially a function with some logic that handles a specific user request in the dialog. When a user sends a voice or text message to the AI agent, Alan AI matches the message to one of the intents in the dialog script and triggers this intent.

An intent is triggered by a specific utterance that the user says or types. In Alan AI, this invoking phrase is known as a pattern. When designing a dialog script, the script developer needs to think of the different ways users may formulate their requests and come up with a list of likely spoken phrases. For example, the list of phrases for a coffee ordering intent may be the following:

  • I want a cup of coffee

  • One coffee, please

  • Can I get a coffee?

Along with the pattern, the script developer defines the action that must be executed when this intent is triggered. For example, the AI agent can play a response to the user or send a command to the client app to do some app-side activity.
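The coffee ordering intent described above might be sketched as follows. The intent() call lists the patterns first and the action last; the action plays a reply and then sends a command to the client app. The intent and say functions defined at the top are hypothetical stand-ins that use simple exact-phrase matching so the snippet runs locally (the real platform matches utterances with NLU), and the addToOrder command name is an assumption.

```javascript
// Hypothetical stand-ins for the Alan AI runtime, for local runs only.
const intents = [];
function intent(...args) {
    const handler = args.pop();            // last argument is the action
    intents.push({patterns: args, handler});
}
// Simulates a user utterance with naive exact-phrase matching.
function say(text) {
    const replies = [];
    const p = {play: (reply) => replies.push(reply)};
    const match = intents.find(i =>
        i.patterns.some(x => x.toLowerCase() === text.toLowerCase()));
    if (match) match.handler(p);
    return replies;
}

// Dialog script part: patterns plus the action to execute.
intent(
    'I want a cup of coffee',
    'One coffee, please',
    'Can I get a coffee?',
    p => {
        p.play('Sure, one coffee coming up');            // response to the user
        p.play({command: 'addToOrder', item: 'coffee'}); // command for the client app
    }
);

const replies = say('One coffee, please');
```

Here replies holds the text response followed by the command object, in the order the action played them.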

Slots

Alan AI allows the script developer to define slots in intent patterns. A slot is a variable or parameter indicating a specific piece of information that the AI agent needs to collect from the user’s input to fulfill their request or intent. For example, the script developer can use a slot to collect information about the user’s location for a booking app or the essential product category for a shopping app.

Alan AI allows using the following types of slots:

  • User-defined slots to get custom values

  • Predefined slots to get specific types of information: location, time, numbers and so on
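A user-defined slot might be sketched like this: the pattern declares a SIZE slot with three possible values, and the action reads the captured value from p.SIZE.value. The intent and say functions are hypothetical stand-ins that turn the slot into a regular expression group so the snippet runs locally; they are not how the platform matches input. The slot name and its values are assumptions.

```javascript
// Hypothetical stand-ins for the Alan AI runtime, for local runs only.
const intents = [];
function intent(pattern, handler) {
    const names = [];
    // Turn "$(NAME a|b|c)" into a capturing group "(a|b|c)"
    const source = pattern.replace(/\$\((\w+) ([^)]+)\)/g, (_, name, values) => {
        names.push(name);
        return `(${values})`;
    });
    intents.push({regex: new RegExp(`^${source}$`, 'i'), names, handler});
}
function say(text) {
    const replies = [];
    for (const {regex, names, handler} of intents) {
        const m = regex.exec(text);
        if (!m) continue;
        const p = {play: (reply) => replies.push(reply)};
        names.forEach((name, i) => { p[name] = {value: m[i + 1]}; });
        handler(p);
        break;
    }
    return replies;
}

// Dialog script part: SIZE is a user-defined slot with three values.
intent('I want a $(SIZE small|medium|large) coffee', p => {
    p.play(`One ${p.SIZE.value} coffee coming up`);
});

const replies = say('I want a large coffee');
```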

Contexts

Similar to real-world conversations, dialog scripts can involve user queries that are only meaningful within a particular context. For example, when interacting with a weather forecast agent, the user can ask: “And what about tomorrow?” To provide an optimal response, the agent needs to know the specific context the user is referring to, namely the location the user is asking about.

Contexts are especially useful in multi-step dialogs where certain steps can only occur under certain conditions. In such cases, the script developer can place a certain portion of the dialog in a context and activate this context when the user says a specific phrase or reaches a given point in the conversation. For example, when creating a dialog for an online store, the script developer can define a context to collect address and time details and activate it only after the checkout process has been initiated.
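The online store example could be sketched as below: the delivery intent lives inside a context and becomes available only after the checkout intent activates that context with p.then(). The intent, context and say functions at the top are hypothetical stand-ins with exact-phrase matching so the snippet runs locally; the phrases and the deliveryDetails name are assumptions.

```javascript
// Hypothetical stand-ins for the Alan AI runtime, for local runs only.
const globalIntents = [];
let collecting = globalIntents; // list that intent() currently registers into
let active = null;              // currently active context, if any

function intent(pattern, handler) {
    collecting.push({pattern: pattern.toLowerCase(), handler});
}
function context(define) {
    const ctx = [];
    const previous = collecting;
    collecting = ctx;   // intents declared inside define() belong to this context
    define();
    collecting = previous;
    return ctx;
}
function say(text) {
    const replies = [];
    const p = {
        play: (reply) => { replies.push(reply); },
        then: (ctx) => { active = ctx; },
    };
    const candidates = [...(active || []), ...globalIntents];
    const match = candidates.find(i => i.pattern === text.toLowerCase());
    if (match) match.handler(p);
    return replies;
}

// Dialog script part: delivery details are collected only during checkout.
const deliveryDetails = context(() => {
    intent('Deliver it tomorrow morning', p => {
        p.play('Delivery scheduled for tomorrow morning');
    });
});

intent('I want to check out', p => {
    p.play('When should we deliver your order?');
    p.then(deliveryDetails); // activate the context
});

const beforeCheckout = say('Deliver it tomorrow morning'); // context inactive
const checkout = say('I want to check out');
const afterCheckout = say('Deliver it tomorrow morning');  // context active
```

The same phrase gets no response before checkout and a confirmation after it, which is the behavior the context is meant to provide.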

Predefined script objects

Alan AI offers a set of predefined objects that the script developer can use to store and communicate data in dialog scripts.

Lifecycle callbacks

The script developer may need to perform specific actions or execute some code at a given point in the dialog lifecycle. Alan AI offers a set of predefined callbacks that are invoked as the dialog transitions from one state to another.

Built-in JavaScript libraries

The script developer can utilize the following JavaScript libraries when developing dialog scripts:

  • axios and request to make API calls from the dialog script

  • moment-timezone and luxon to work with time

  • lodash to work with arrays, strings, objects and so on

Alan AI SDK

To bring the voice and text conversational experience to the app, you need to integrate the AI agent into the client app. Upon integration, the app receives a draggable button on top of its UI that allows users to interact with the AI agent. The AI agent button is located in the bottom left corner by default and can be moved around as needed.

Alan AI integrates with client apps through the Alan AI SDKs.

Client API methods

To let businesses design customized and personalized conversational experiences, the Alan AI SDK exposes a set of client API methods. The client API methods can be used to implement any business logic in the client app. Through client API methods, the client can interact with Alan AI and initiate conversational activities in the app.

Handlers

The Alan AI SDK provides a collection of handlers, a crucial instrument for creating interactive and responsive conversational experiences. Alan AI handlers can be used to manage specific events or respond to actions triggered by the user or Alan AI in real time.
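As an illustration, a handler for commands sent from the dialog script might look like the sketch below. It assumes the web SDK, where a function can be passed as the onCommand option when the button is created. The app object, the command names and the fields on the command data are assumptions, and the SDK wiring is left as a comment so the snippet runs without a browser.

```javascript
// Hypothetical app state that the handler updates; a real app would change its UI.
const app = {page: 'home', cart: []};

// Command handler: receives the data object sent from the dialog script.
function handleCommand(commandData) {
    switch (commandData.command) {
        case 'navigate':
            app.page = commandData.page;
            return true;
        case 'addToOrder':
            app.cart.push(commandData.item);
            return true;
        default:
            return false; // unknown command, ignored
    }
}

// With the Alan AI Web SDK loaded in a browser, the handler could be
// registered like this (the key is a placeholder):
// alanBtn({key: 'YOUR_ALAN_AI_SDK_KEY', onCommand: handleCommand});

handleCommand({command: 'navigate', page: 'cart'});
handleCommand({command: 'addToOrder', item: 'coffee'});
```

Keeping the handler as a plain function, separate from the SDK call, makes it easy to test the app-side logic without a running AI agent.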