atticus - A versatile voice agent library for seamless UI interactions.

Pitch

Atticus is a framework-agnostic voice agent library designed to enhance user interface interactions through voice control. Leveraging OpenAI's Realtime API, it enables developers to integrate sophisticated voice capabilities into their applications, creating intuitive and responsive user experiences.

Description

Atticus adds voice control to an application's user interface, regardless of the framework the application is built with. It is powered by OpenAI's Realtime API and is meant to be integrated without a steep learning curve.

Key Features

Framework Agnostic: Integrate with any existing application, regardless of the underlying technology stack.
Voice-Controlled UI: Empower users to interact with your application using natural voice commands, enhancing accessibility and user experience.
Multiple Language Support: Capable of understanding and responding in over 40 languages, providing a truly global reach for applications.
Customizable Voices: Choose from various voice options to match the tone and personality of your application.

Quick Start Example

To get started with Atticus, simply import the library and create a voice agent:

import { Atticus } from "atticus";

// Obtain a client secret from your backend.
// fetchClientSecret is an application-defined helper, not an Atticus export.
const clientSecret = await fetchClientSecret();

const agent = new Atticus({
    clientSecret,
    voice: "shimmer",
    language: "en",
    agent: {
        name: "Assistant",
        instructions: "You are a helpful assistant that helps users interact with the UI.",
    },
    ui: {
        enabled: true,
        rootElement: document.getElementById("app"),
    },
});

// Listen to connection events
agent.on("connected", () => console.log("Connected!"));
await agent.connect();
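
The snippet above calls fetchClientSecret() without defining it. A minimal sketch of such a helper follows, assuming your backend exposes an endpoint that responds with JSON containing a clientSecret string. The function name requestClientSecret, the /api/voice-session path, and the response shape are all hypothetical; adapt them to your own server.

```typescript
// Hypothetical helper, not part of Atticus. The endpoint path and the
// { clientSecret: string } response shape are assumptions about your backend.

// Minimal subset of the Fetch API that this helper relies on.
type FetchLike = (
    url: string,
    init?: { method?: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

async function requestClientSecret(
    fetchImpl: FetchLike,
    endpoint: string = "/api/voice-session",
): Promise<string> {
    const response = await fetchImpl(endpoint, { method: "POST" });
    if (!response.ok) {
        throw new Error(`Client secret request failed with status ${response.status}`);
    }
    const body = (await response.json()) as { clientSecret?: unknown } | null;
    if (!body || typeof body.clientSecret !== "string" || body.clientSecret === "") {
        throw new Error("Backend response did not contain a client secret");
    }
    return body.clientSecret;
}
```

In the browser you would pass the global fetch, for example `const clientSecret = await requestClientSecret(fetch);`. Taking the fetch implementation as a parameter keeps the helper easy to exercise with a stub.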

UI-Aware Mode

Atticus provides a UI-aware mode in which users control interface elements directly with their voice. The agent carries out the corresponding UI actions automatically:

const agent = new Atticus({
    clientSecret,
    agent: {
        name: "UI Assistant",
        instructions: "Help users fill out the form on this page.",
    },
    ui: {
        enabled: true,
        rootElement: document.getElementById("app")!,
    },
});

await agent.connect();
// Voice command example: "Fill the name field with John Doe"
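
document.getElementById returns null when no element has the given id, and the ! in the snippet above only silences the TypeScript compiler. One way to fail early with a clear message is a small lookup helper. requireElement below is a hypothetical helper, not part of Atticus.

```typescript
// Hypothetical helper, not part of Atticus: look up the UI root element and
// throw a descriptive error instead of passing null to the agent.

// Minimal subset of Document that the helper needs.
interface ElementLookup<T> {
    getElementById(id: string): T | null;
}

function requireElement<T>(doc: ElementLookup<T>, id: string): T {
    const element = doc.getElementById(id);
    if (element === null) {
        throw new Error(`No element with id "${id}" found; cannot enable UI-aware mode`);
    }
    return element;
}
```

With this in place, the configuration could use `rootElement: requireElement(document, "app")` in place of the non-null assertion.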

Preserving DOM Structure

To keep the full nested DOM structure of a complex component available to the AI, mark its container with the data-preserve attribute. In the following example, the attribute value describes the section's contents:

<div class="product-list" data-preserve="List of available products with prices">
    <!-- Product items here -->
</div>
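
When several containers are marked, it can help to list them and check their descriptions before connecting the agent. The sketch below uses the standard querySelectorAll attribute selector [data-preserve]. The function name listPreservedSections is hypothetical, and the helper only inspects the page; it does not reproduce how Atticus itself processes the attribute.

```typescript
// Hypothetical debugging helper, not part of Atticus.

// Minimal subsets of the DOM interfaces used here; Document and Element satisfy them.
interface AttributeSource {
    getAttribute(name: string): string | null;
}
interface SelectorRoot {
    querySelectorAll(selector: string): ArrayLike<AttributeSource>;
}

function listPreservedSections(root: SelectorRoot): string[] {
    const marked = Array.from(root.querySelectorAll("[data-preserve]"));
    // An attribute written without a value yields "", so blank descriptions are easy to spot.
    return marked.map((element) => element.getAttribute("data-preserve") ?? "");
}
```

Calling `listPreservedSections(document)` in the browser console returns one string per marked container, in document order.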

Events and Actions

Atticus comes equipped with a range of events to track status and user interaction:

Event	Description
`connected`	Fires when the connection to the voice agent is established
`message`	Fires when a new message is received

It also supports various UI actions, including clicking buttons, typing in fields, scrolling, and navigating pages, all through voice commands.
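
Both events in the table can be subscribed to with the on method shown in the quick start. The sketch below assumes only that on(eventName, handler) exists. The message payload is typed as unknown because its shape is not described here, and attachLogging and AgentLike are hypothetical names that may need adapting to the library's actual typings.

```typescript
// Hypothetical wiring helper, not part of Atticus.
// Assumes only the on(eventName, handler) method used in the quick start.
interface AgentLike {
    on(event: string, handler: (payload?: unknown) => void): unknown;
}

function attachLogging(agent: AgentLike, log: (line: string) => void): void {
    agent.on("connected", () => log("connected"));
    agent.on("message", (payload) => {
        // The payload shape is not specified here: strings are logged as-is,
        // anything else is logged as JSON.
        const text = typeof payload === "string" ? payload : JSON.stringify(payload);
        log(`message: ${text}`);
    });
}
```

A typical call would be `attachLogging(agent, console.log)` before `agent.connect()`, so that the connected event is not missed.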

Conclusion

Atticus gives developers a flexible way to add voice-driven interaction to an existing UI. Its framework-agnostic design, UI-aware mode, and multi-language support help make applications more accessible to a diverse audience.
