API Reference

Data corpuses

Static corpus

Defines a corpus of static documents to be processed and indexed. For details, see Static corpus.

Syntax

Function syntax

corpus({ [title,] urls [, auth] [, include] [, exclude] [, depth] [, maxPages] [, query] [, transforms] [, priority] })

Function syntax

corpus({ [title,] text [, query] [, transforms] [, priority] })

Parameters

Name

Type

Required/Optional

Description

title

string

Optional

Corpus title.

urls

string array

Required

List of URLs from which information must be retrieved. You can define URLs of website folders and pages.

auth

JSON object

Optional

Credentials to access resources that require basic authentication: {username: 'johnsmith', password: 'password'}. For details, see Protected web resources.

include

string array

Optional

Resources that must be indexed. You can define an array of URLs or use RegEx to specify a rule. For details, see Corpus includes and excludes.

exclude

string array

Optional

Resources to be excluded from indexing. You can define an array of URLs or use RegEx to specify a rule. For details, see Corpus includes and excludes.

query

function

Optional

Transforms function used to process user queries. For details, see Static corpus transforms.

transforms

function

Optional

Transforms function used to format the corpus output. For details, see Static corpus transforms.

depth

integer

Optional

Crawl depth for web and PDF resources. The minimum value is 0 (crawling only the page content without linked resources). For details, see Crawling depth.

maxPages

integer

Optional

Maximum number of pages and files to index. If not set, only 1 page with the defined URL will be indexed.

priority

integer

Optional

Priority level assigned to the corpus. Corpuses with higher priority are considered more relevant when user requests are processed. For details, see Corpus priority.

Name

Type

Required/Optional

Description

title

string

Optional

Corpus title.

text

plain text or Markdown-formatted strings

Required

Text corpus presented as plain text strings or Markdown-formatted strings.

query

function

Optional

Transforms function used to process user queries. For details, see Static corpus transforms.

transforms

function

Optional

Transforms function used to format the corpus output. For details, see Static corpus transforms.

priority

integer

Optional

Priority level assigned to the corpus. Corpuses with higher priority are considered more relevant when user requests are processed. For details, see Corpus priority.

Example

Dialog script
corpus({
    title: `HTTP corpus`,
    urls: [
        `https://developer.mozilla.org/en-US/docs/Web/HTTP/Overview`,
        `https://developer.mozilla.org/en-US/docs/Web/HTTP/Messages`,
        `https://developer.mozilla.org/en-US/docs/Web/HTTP/Session`],
    auth: {username: 'johnsmith', password: 'password'},
    include: [/.*\.pdf/],
    exclude: [`https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Evolution_of_HTTP`],
    query: transforms.queries,
    transforms: transforms.answers,
    depth: 1,
    maxPages: 5,
    priority: 0,
});
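The include and exclude rules in the example above mix a regular expression and a plain URL. The sketch below shows how such a rule list could be checked against a URL. The matchesAnyRule helper and the example.com URLs are hypothetical, and treating string rules as exact matches is an assumption rather than documented Alan AI behavior. The last two lines rely only on standard JavaScript RegExp semantics: a pattern without a trailing $ also matches URLs that merely contain .pdf.

```javascript
// Hypothetical helper, not part of the Alan AI API.
// Assumption: strings are compared as exact URLs, RegExp values are tested as patterns.
function matchesAnyRule(url, rules = []) {
    return rules.some(rule =>
        rule instanceof RegExp ? rule.test(url) : rule === url);
}

const include = [/.*\.pdf/];
const exclude = ['https://example.com/docs/old'];

console.log(matchesAnyRule('https://example.com/guide.pdf', include)); // true
console.log(matchesAnyRule('https://example.com/docs/old', exclude));  // true
console.log(matchesAnyRule('https://example.com/docs/new', exclude));  // false

// An unanchored pattern matches anywhere in the URL; anchor it with $ to match the ending only.
console.log(matchesAnyRule('https://example.com/guide.pdf.html', include)); // true
console.log(/\.pdf$/.test('https://example.com/guide.pdf.html'));           // false
```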
Dialog script
corpus({
    title: `Async/await corpus`,
    text: `
        # Understanding **async/await** in JavaScript

       **async/await** is a feature in JavaScript that makes working with asynchronous code easier and more readable. It allows you to write asynchronous code that looks and behaves like synchronous code, making it easier to follow and understand.

       ## How Does **async/await** Work?

       ### **async** Keyword:

       - The **async** keyword is used to declare a function as asynchronous.
       - An **async** function returns a **Promise**, and it can contain **await** expressions that pause the execution of the function until the awaited **Promise** is resolved.

       ### **await** Keyword:

       - The **await** keyword can only be used inside an **async** function.
       - It pauses the execution of the function until the **Promise** passed to it is settled (either fulfilled or rejected).
       - The resolved value of the **Promise** is returned, allowing you to work with it like synchronous code.

       ## Why Use **async/await**?

       ### Readability:

       - By using **async/await**, you can avoid the complexity of chaining multiple **.then()** methods when dealing with Promises.
       - Your code looks more like traditional synchronous code, making it easier to read.

       ### Error Handling:

       - Error handling with **async/await** is simpler and more consistent with synchronous code.
       - You can use **try/catch** blocks to handle errors.
    `,
    query: transforms.queries,
    transforms: transforms.answers,
    priority: 0,
});
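To make the interplay of depth and maxPages more concrete, the sketch below simulates a crawl over a small in-memory link graph. It is not Alan AI's crawler: the simulateCrawl function, the links map, the breadth-first order, and the default depth of 0 are assumptions made for illustration. It mirrors only the two documented rules: depth 0 indexes a page without its linked resources, and a single page is indexed when maxPages is not set.

```javascript
// Hypothetical simulation, not part of the Alan AI API.
// links maps each URL to the URLs it links to.
function simulateCrawl(urls, links, { depth = 0, maxPages = 1 } = {}) {
    const indexed = [];
    const seen = new Set(urls);
    // Each queue entry records how many link hops separate it from a start URL.
    const queue = urls.map(url => ({ url, level: 0 }));

    while (queue.length > 0 && indexed.length < maxPages) {
        const { url, level } = queue.shift();
        indexed.push(url);
        if (level >= depth) continue; // do not follow links beyond the crawl depth
        for (const next of links[url] || []) {
            if (!seen.has(next)) {
                seen.add(next);
                queue.push({ url: next, level: level + 1 });
            }
        }
    }
    return indexed;
}

const links = {
    '/docs': ['/docs/a', '/docs/b'],
    '/docs/a': ['/docs/a/deep'],
};

console.log(JSON.stringify(simulateCrawl(['/docs'], links)));
// ["/docs"]
console.log(JSON.stringify(simulateCrawl(['/docs'], links, { depth: 1, maxPages: 5 })));
// ["/docs","/docs/a","/docs/b"]
console.log(JSON.stringify(simulateCrawl(['/docs'], links, { depth: 2, maxPages: 3 })));
// ["/docs","/docs/a","/docs/b"]
```

In the last call, depth 2 would allow /docs/a/deep to be reached, but the maxPages limit of 3 stops the crawl first.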

Dynamic corpus

Defines a dynamic corpus of data to be processed.

The dynamic corpus allows retrieving JSON data from external data sources and using it to answer user queries in natural language. For details, see Dynamic corpus.

Syntax

Function syntax

corpus({ [title,] [input,] query [, output] [, transforms] [, priority] })

Parameters

Name

Type

Required/Optional

Description

title

string

Optional

Corpus title.

input

function

Optional

Function used to populate the Input field of the query transform.

query

function

Required

Transforms function used to process user queries and generate code to retrieve necessary data.

output

function

Optional

Function used to process obtained data, before it is passed to the transforms function.

transforms

function

Optional

Transforms function used to process and format data obtained with the query transform, and, optionally, the output function.

priority

integer

Optional

Priority level assigned to the corpus. Corpuses with higher priority are considered more relevant when user requests are processed. For details, see Corpus priority.

Example

Dialog script
corpus({
   title: `Infrastructure requests`,
   input: project.objects,
   query: transforms.vms_queries,
   output: project.cleanObjects,
   transforms: transforms.vms_answer,
   priority: 1
});
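The output function runs on the retrieved data before it reaches the transforms function. As a rough sketch of what a function such as project.cleanObjects could do, the example below drops empty fields from retrieved records. The signature (an array in, an array out) and the sample record fields are assumptions for illustration; see Dynamic corpus for the arguments Alan AI actually passes.

```javascript
// Hypothetical cleanup step; the signature expected by Alan AI may differ.
function cleanObjects(objects) {
    return objects.map(obj =>
        Object.fromEntries(
            // Keep only the fields that carry a value.
            Object.entries(obj).filter(([, value]) =>
                value !== null && value !== undefined && value !== '')
        )
    );
}

const raw = [
    { name: 'vm-01', cpu: 4, notes: '' },
    { name: 'vm-02', cpu: null, region: 'eu-west' },
];

console.log(JSON.stringify(cleanObjects(raw)));
// [{"name":"vm-01","cpu":4},{"name":"vm-02","region":"eu-west"}]
```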

Corpus with the Puppeteer crawler

Defines a corpus of pages with complex structures or interactions.

The corpus with the Puppeteer crawler allows crawling specific page content and dynamically loaded resources and using the retrieved content to answer user queries. For details, see Puppeteer crawler.

Syntax

Function syntax

corpus({ [title,] urls, crawler [, auth] [, include] [, exclude] [, depth] [, maxPages] [, query] [, transforms] [, priority] })

Parameters

Name

Type

Required/Optional

Description

title

string

Optional

Corpus title.

urls

string array

Required

List of URLs from which information must be retrieved. You can define URLs of website folders and pages.

crawler

JSON object

Required

Type of crawler and function used to index corpus content. To use the default crawler, set puppeteer to api.defaultCrawler() with the following parameters:

  • waitAfterLoad: time, in milliseconds, to pause function execution after the page is loaded

  • excludeSelectors: list of selectors to exclude from crawling

auth

JSON object

Optional

Credentials to access resources that require basic authentication: {username: 'johnsmith', password: 'password'}. For details, see Protected web resources.

include

string array

Optional

Resources that must be indexed. You can define an array of URLs or use RegEx to specify a rule. For details, see Corpus includes and excludes.

exclude

string array

Optional

Resources to be excluded from indexing. You can define an array of URLs or use RegEx to specify a rule. For details, see Corpus includes and excludes.

query

function

Optional

Transforms function used to process user queries. For details, see Puppeteer transforms.

transforms

function

Optional

Transforms function used to format the corpus output. For details, see Puppeteer transforms.

depth

integer

Optional

Crawl depth for web and PDF resources. The minimum value is 0 (crawling only the page content without linked resources). For details, see Crawling depth.

maxPages

integer

Optional

Maximum number of pages and files to index. If not set, only 1 page with the defined URL will be indexed.

priority

integer

Optional

Priority level assigned to the corpus. Corpuses with higher priority are considered more relevant when user requests are processed. For details, see Corpus priority.

Name

Type

Required/Optional

Description

title

string

Optional

Corpus title.

urls

string array

Required

List of URLs from which information must be retrieved. You can define URLs of website folders and pages.

crawler

JSON object

Required

Type of crawler and function to be used to index corpus content. Crawler parameters:

  • puppeteer: function used to crawl data.

  • args: arguments to be passed to the crawler function.

  • browserLog: parameter that controls the Puppeteer logging mode. Set it to 'on' to print logs from the browser interaction process to Alan AI Studio logs, or to 'off' to disable logging.

auth

JSON object

Optional

Credentials to access resources that require basic authentication: {username: 'johnsmith', password: 'password'}. For details, see Protected web resources.

include

string array

Optional

Resources that must be indexed. You can define an array of URLs or use RegEx to specify a rule. For details, see Corpus includes and excludes.

exclude

string array

Optional

Resources to be excluded from indexing. You can define an array of URLs or use RegEx to specify a rule. For details, see Corpus includes and excludes.

query

function

Optional

Transforms function used to process user queries. For details, see Puppeteer transforms.

transforms

function

Optional

Transforms function used to format the corpus output. For details, see Puppeteer transforms.

depth

integer

Optional

Crawl depth for web and PDF resources. The minimum value is 0 (crawling only the page content without linked resources). For details, see Crawling depth.

maxPages

integer

Optional

Maximum number of pages and files to index. If not set, only 1 page with the defined URL will be indexed.

priority

integer

Optional

Priority level assigned to the corpus. Corpuses with higher priority are considered more relevant when user requests are processed. For details, see Corpus priority.

Example

Dialog script
corpus({
    title: `Slack docs`,
    urls: [
        `https://slack.com/help/articles/360017938993-What-is-a-channel`,
        `https://slack.com/help/articles/205239967-Join-a-channel`,
        `https://slack.com/help/articles/201402297-Create-a-channel`,
    ],
    crawler: {
        puppeteer: api.defaultCrawler({
            waitAfterLoad: 1000,
            excludeSelectors: [
                `header.header`,
                `section.banner`,
                `.hidden`,
                `.category_list`,
                `.article_footer`,
                `footer.c-nav--expanded-footer`,
                `#onetrust-consent-sdk`
            ]
        }),
    },
    auth: {username: 'johnsmith', password: 'password'},
    include: [/.*\.pdf/],
    exclude: [/.*\.zip/],
    query: transforms.queries,
    transforms: transforms.answers,
    depth: 3,
    maxPages: 3,
    priority: 1
});
Dialog script
corpus({
    title: `Knowledge Base`,
    urls: [`urls to crawl`],
    crawler: {
        // Pass the crawler function itself; do not call it here
        puppeteer: crawlPages,
        browserLog: 'on',
        args: {arg1: 'value1', arg2: 'value2'},
    },
    auth: {username: 'johnsmith', password: 'password'},
    include: [/.*\.pdf/],
    exclude: [/.*\.zip/],
    query: transforms.queries,
    transforms: transforms.answers,
    depth: 10,
    maxPages: 10,
    priority: 1
});

async function* crawlPages({url, page, document, args}) {

    // crawlPages function code ...

}

api.createApiStubs

Generates API stub code from one or more OpenAPI or Postman specifications using the provided configuration options.

Use api.createApiStubs() to generate client code stubs automatically from API definitions. The function accepts either a single configuration object or an array of configuration objects.

Syntax

Function syntax

api.createApiStubs({ callFun, callFunScript, outScript [, fileUrls] [, postmanIds] [, specUrls] [, convertOperationIdToCamelCase] [, removeDeprecated] [, disableSSL] })

Parameters

options (Object or Array) A configuration object or an array of configuration objects. Each configuration object may include:

Name

Type

Required/Optional

Description

callFun

string

Required

The name of the function to be used in the generated code.

callFunScript

string

Required

The helper script file that contains the API function.

outScript

string

Required

The file name where the generated code will be saved.

fileUrls

string array

Optional

An array of URLs pointing to OpenAPI specification files.

postmanIds

string array

Optional

An array of Postman specification IDs.

specUrls

string array

Optional

Deprecated: Use fileUrls instead.

convertOperationIdToCamelCase

boolean

Optional

Converts operation IDs to camelCase in the generated code. (default: true)

removeDeprecated

boolean

Optional

Removes deprecated methods and deprecated properties from types. (default: false)

disableSSL

boolean

Optional

Disables SSL certificate verification for specification URLs. (default: false)

Example

Dialog script
async function generateStubs() {
    const configObject = {
       callFun: 'customApi_1',
       callFunScript: 'api_helper',
       fileUrls: ["https://example.com/openapi1.json"],
       outScript: '_gen_api_example_1',
       convertOperationIdToCamelCase: false,
       removeDeprecated: true,
       disableSSL: true
    };

    const configArray = [
       {
          callFun: 'customApi_2',
          callFunScript: 'api_helper',
          fileUrls: [
             "https://example.com/openapi2.json",
             "https://example.com/openapi3.json"
          ],
          outScript: '_gen_api_example_2',
          convertOperationIdToCamelCase: true
       },
       {
          callFun: 'customApi_3',
          callFunScript: 'api_helper',
          postmanIds: ["12345-abcde"],
          outScript: '_gen_api_example_3',
          removeDeprecated: true,
          disableSSL: false
       }
    ];

    await api.createApiStubs(configObject);
    await api.createApiStubs(configArray);
    return 'ok';
}
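Because the function accepts either an object or an array, and because specUrls is deprecated in favor of fileUrls, it can be convenient to normalize configurations before passing them on. The helper below is a hypothetical sketch, not part of the Alan AI API: normalizeStubConfigs is an invented name, and merging specUrls into fileUrls is an assumption about how a migration could look. The required options and default values are the ones listed in the parameters above.

```javascript
// Hypothetical pre-processing helper, not part of the Alan AI API.
function normalizeStubConfigs(options) {
    // Accept a single configuration object or an array of them.
    const configs = Array.isArray(options) ? options : [options];
    return configs.map(({ specUrls, ...config }) => {
        for (const key of ['callFun', 'callFunScript', 'outScript']) {
            if (typeof config[key] !== 'string' || config[key] === '') {
                throw new Error(`Missing required option: ${key}`);
            }
        }
        const normalized = {
            // Documented defaults, applied only when the option is omitted.
            convertOperationIdToCamelCase: true,
            removeDeprecated: false,
            disableSSL: false,
            ...config,
        };
        // Assumption: deprecated specUrls values can simply be appended to fileUrls.
        if (specUrls) {
            normalized.fileUrls = [...(config.fileUrls || []), ...specUrls];
        }
        return normalized;
    });
}

const [config] = normalizeStubConfigs({
    callFun: 'customApi_1',
    callFunScript: 'api_helper',
    outScript: '_gen_api_example_1',
    specUrls: ['https://example.com/openapi1.json'],
});

console.log(config.fileUrls);   // [ 'https://example.com/openapi1.json' ]
console.log(config.disableSSL); // false
console.log('specUrls' in config); // false
```

The normalized array could then be passed to api.createApiStubs() in a single call.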

api.sleep

Pauses the execution of the current script for the specified number of milliseconds.

Use api.sleep() in the data crawling function if a page takes a long time to load and there is no selector to wait for. For details, see Puppeteer crawler.

Syntax

Function syntax

api.sleep(time)

Parameters

Name

Type

Description

time

integer

Duration for which the script should be paused in milliseconds.

Example

Dialog script
async function* crawlKB({url, page, document}) {
    if (url.includes(`knowledge.hubspot.com/search?`)) {
        let linksSelector = `a.st-search-result-link`;
        await page.waitForSelector(linksSelector, { timeout: 20000 });
        const urls = await page.evaluate(() => {
            let linksSelector = `a.st-search-result-link`;
            return Array.from(document.querySelectorAll(linksSelector)).map(e=>e.href);
        });
        console.log(`# index [](${url}) => `, urls);
        yield {urls};
    } else if (url.includes(`knowledge.hubspot.com`)) {
        console.log(`# parse [](${url})`);
        await page.waitForSelector(`div.blog-post-wrapper.cell-wrapper`, {timeout: 20000});
        await api.sleep(4000);
        const html = await page.evaluate(() => {
            const selector = 'div.blog-post-wrapper.cell-wrapper';
            return document.querySelector(selector).innerHTML;
        });
        let content = await api.html2md_v2({html, url});
        yield {content, mimeType: 'text/markdown'};
    }
}

api.html2md_v2

Converts the HTML content into Markdown format, preserving the structure of the original HTML. For details, see Puppeteer crawler.

Syntax

Function syntax

api.html2md_v2({ html [, url] })

Parameters

Name

Type

Description

html

HTML string

A string of HTML content to be converted to Markdown.

url

URL

The URL of the crawled page. The function uses it to replace relative links in the content with full URLs that include the domain.

Example

Dialog script
async function* crawlPages({url, page, document}) {
    const html = await page.evaluate(() => {
        const selector = 'div.content_col';
        return document.querySelector(selector).innerHTML;
    });
    let content = await api.html2md_v2({html, url});
    yield {content, mimeType: 'text/markdown'};
}
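The role of the url parameter can be illustrated with the standard URL constructor, which resolves a relative link against a base address. This is only a sketch of the idea: toAbsoluteLink is a hypothetical helper, the example.com addresses are placeholders, and api.html2md_v2 may handle links differently internally.

```javascript
// Illustration only; not the implementation used by api.html2md_v2.
function toAbsoluteLink(href, pageUrl) {
    // The URL constructor resolves href against pageUrl when href is relative
    // and leaves it unchanged when href is already absolute.
    return new URL(href, pageUrl).href;
}

const pageUrl = 'https://example.com/docs/guide/intro.html';

console.log(toAbsoluteLink('../api/index.html', pageUrl));   // https://example.com/docs/api/index.html
console.log(toAbsoluteLink('/pricing', pageUrl));            // https://example.com/pricing
console.log(toAbsoluteLink('https://other.org/a', pageUrl)); // https://other.org/a
```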

Transforms

Pre-processes and formats input data according to the defined template. For details, see Transforms.

Syntax

Function syntax

query: transforms.transformName

Or

Function syntax

transforms: transforms.transformName

Parameters

Name

Type

Description

transformName

string

Name of the transform created in the Agentic Interface project.

Example

Dialog script
corpus({
    title: `Infrastructure requests`,
    query: transforms.infrastructure_queries,
    transforms: transforms.infrastructure_queries,
    priority: 1
});

Action Transformer

act

Allows controlling the behavior of the Action Transformer: provide UI context, prioritize data, gather additional information and execute tasks. For details, see Action Transformer.

Syntax

Function syntax

act({ [uiContext,] [context,] [fallback,] [execution,] [merge] })

Parameters

Name

Type

Required/Optional

Description

uiContext

function

Optional

Function used to retrieve the UI context of the page currently open in the app when the user query is given. For details, see UI context.

context

function

Optional

Function used to gather conversational context and pass it to the Action Transformer. For details, see UI context.

fallback

function

Optional

Transforms function used to handle unclear queries that require additional information, as well as sensitive or biased queries. For details, see Fallback transforms.

execution

function

Optional

Transforms function used to specify what actions the Agentic Interface must take in response to the user query. For details, see Action execution.

merge

function

Optional

Transforms function used to combine data from different corpuses to provide a comprehensive response. For details, see Data merging.

Example

Dialog script
act({
    uiContext: getUiContext,
    context: getContext,
    fallback: transforms.act_fallback,
    execution: transforms.act_execute,
    merge: transforms.act_merge
})

Session-specific objects and methods

Session-specific objects and methods are accessible through the predefined p object. They persist throughout a user session, which lasts until the user ends the dialog with the Agentic Interface or 30 minutes pass without activity.

userData

A runtime object used to store any relevant data. The data in p.userData is available only for the duration of the user session. You can access it at any time from any dialog script in the Agentic Interface project, regardless of the context. For details, see userData.

authData

A runtime object used to store static data specific to the device or user, such as user credentials or product version. For details, see authData.

visual

A runtime object used to store any arbitrary JSON data. Use it to provide dynamic information about the app’s state or visual context to the Agentic Interface. For details, see Visual state.

Global objects and methods

project

A global object used to store data that can be accessed by any dialog scripts in the project.

When Alan AI builds the dialog model, it loads scripts from top to bottom as listed in the scripts panel. As a result, the project object will be accessible in any script that follows the one where it was defined.

Dialog script
// Script 1
project.config = {initValue: 1};

// Script 2
console.log(`Init value is ${project.config.initValue}`);

project API

Allows sending data from the client app to the dialog script or executing script logic without a user command.

Define the logic using projectAPI in the dialog script and then invoke it with the callProjectApi() method in the Alan AI SDK. For details, see Project API.

Syntax

Function syntax

projectAPI.functionName = function(p, data, callback) {}

Parameters

Name

Type

Description

p

object

Predefined object containing the user session data and exposing Alan AI’s methods.

data

object

An object containing the data you want to pass to your script.

callback

function

A callback function used to send data back to the app.

Example

Dialog script
projectAPI.setToken = function(p, param, callback) {
    if (!param || !param.token) {
        // Report the error and stop so that the callback is invoked only once
        callback("error: token is undefined");
        return;
    }
    p.userData.token = param.token;
    callback();
};
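The flow of data through p, the passed object, and the callback can be tried outside Alan AI with simple stand-ins. In the sketch below, the local projectAPI and p objects are stubs created for illustration (in a dialog script both are provided by the platform), and the direct function calls only simulate what the callProjectApi() method in the Alan AI SDK triggers from the client app. The handler returns right after reporting an error, so the callback fires once per call.

```javascript
// Stand-ins so the sketch runs outside Alan AI; in a dialog script,
// projectAPI and p are provided by the platform.
const projectAPI = {};
const p = { userData: {} };

projectAPI.setToken = function(p, param, callback) {
    if (!param || !param.token) {
        // Report the problem and stop, so the callback fires only once.
        callback('error: token is undefined');
        return;
    }
    p.userData.token = param.token;
    callback();
};

// Simulated invocations; a real client app would use callProjectApi() in the SDK.
const results = [];
projectAPI.setToken(p, { token: 'abc123' }, error => results.push(error));
projectAPI.setToken(p, {}, error => results.push(error));

console.log(results);          // [ undefined, 'error: token is undefined' ]
console.log(p.userData.token); // abc123
```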

Predefined callbacks

To perform actions at different stages of the dialog lifecycle, use the following predefined callback functions:

onCreateProject

onCreateProject is invoked when the dialog model for the dialog script is built. Use this function for activities that must be accomplished after the creation of the dialog model, such as data initialization.

Syntax

Function syntax

onCreateProject(()=> {action})

Parameters

Name

Type

Description

action

function

Defines what actions must be taken when the dialog model is created on the server in Alan AI Cloud.

Example

In the example below, onCreateProject is used to define values for project.drinks.

Dialog script
onCreateProject(() => {
    project.drinks = "green tea, black tea, oolong";
});

onCreateUser

onCreateUser is invoked when a new user starts a dialog session. Use this function to set up user-specific data.

Syntax

Function syntax

onCreateUser(p => {action})

Parameters

Name

Type

Description

p

object

Predefined object containing the user session data and exposing Alan AI’s methods.

action

function

Defines what actions must be taken when a new user starts a dialog session.

Example

In the example below, the onCreateUser function is used to assign a value to p.userData.name:

Dialog script
onCreateUser(p => {
    p.userData.name = "John Smith";
});

onUserEvent((p, e) => {
    if (e.event == 'firstClick') {
        p.play(`Hi, ${p.userData.name}, how can I help you today?`);
    }
});

onCleanupUser

onCleanupUser is invoked when the user session ends. Use this function for cleanup tasks.

Syntax

Function syntax

onCleanupUser(p => {action})

Parameters

Name

Type

Description

p

object

Predefined object containing the user session data and exposing Alan AI’s methods.

action

function

Defines what actions must be taken when the user session ends.

Example

In the example below, the onCleanupUser function is used to reset the p.userData.name value:

Dialog script
onCleanupUser(p => {
    p.userData.name = "";
});

onVisualState

onVisualState is invoked when the visual state object is set. Use this function to process any data stored in the visual state or to accomplish tasks to be performed when the new visual state is set.

Syntax

Function syntax

onVisualState((p, s) => {action})

Parameters

Name

Type

Description

p

object

Predefined object containing the user session data and exposing Alan AI’s methods.

s

object

JSON object passed with the visual state.

action

function

Defines what actions must be taken when the visual state is sent.

Example

In the example below, when the user opens the Admittance section of the website, the Agentic Interface plays a greeting to the user.

Setting the visual state in the app:

Client app
<script>
  function myFunction() {
    alanBtnInstance.setVisualState({"page": "admittance"});
  }
</script>

Playing a greeting in the dialog script:

Dialog script
onVisualState((p, s) => {
    if (p.visual.page === "admittance") {
        p.play("Hello there! I'm your Agentic Interface, here to guide you through your journey towards academic success.");
    }
});

onUserEvent

onUserEvent is invoked when Alan AI emits an event driven by users’ interactions with the Agentic Interface. For the events list, see User events.

Syntax

Function syntax

onUserEvent((p, e) => {action})

Parameters

Name

Type

Description

p

object

Predefined object containing the user session data and exposing Alan AI’s methods.

e

object

Event fired by Alan AI.

action

function

Defines what actions must be taken when the event is fired.

Example

In the example below, the Agentic Interface listens to the firstClick event and, if the user activates the Agentic Interface for the first time, plays a greeting to the user.

Dialog script
onCreateUser(p => {
    p.userData.name = "John Smith";
});

onUserEvent((p, e) => {
    if (e.event == 'firstClick') {
        p.play(`Hi, ${p.userData.name}, how can I help you today?`);
    }
});

onEnter

onEnter() is invoked when the dialog script enters a context. For details, see Using onEnter() function.

Syntax

Function syntax

onEnter(action)

Parameters

Name

Type

Description

action

function

Defines what actions must be taken when the context is activated.

Example

In the example below, when the user enters the countContext, the p.state.result value is set to 0:

Dialog script
let countContext = context(() => {
    onEnter(p => {
        p.state.result = 0;
    });
});

Debugging

console.log

Outputs informational messages to Alan AI Studio logs. Use console.log() for debugging purposes to display slot values, messages and other data.

Syntax

Function syntax

console.log(message)

Parameters

Name

Type

Description

message

string

Message to be logged.

Example

Dialog script
try {
    console.log("This is a debug message");
}
catch (e) {
    console.error(e);
}

console.error

Outputs error messages to Alan AI Studio logs. Use it to report errors or exceptions that occur during the dialog script execution.

Syntax

Function syntax

console.error(message)

Parameters

Name

Type

Description

message

string

Message to be logged.

Example

Dialog script
try {
    // your code
}
catch (e) {
    console.error(e);
}