WhiteCollarAgent
An intelligent agent for automating complex tasks seamlessly.
Pitch

White Collar Agent is a versatile AI tool designed to automate complex computer and browser tasks efficiently. With a seamless transition between CLI and GUI modes, it empowers users to execute intricate workflows. Ideal for developers and researchers, this open-source project offers a foundation for creating specialized agents, fostering innovation in system-based AI.

Description

White Collar Agent is a computer-use AI agent for computer-based and browser-based tasks. It interprets a task, plans the actions needed, and executes them autonomously. Depending on task complexity, it switches between Command-Line Interface (CLI) and Graphical User Interface (GUI) modes.

Key Features

  • Intelligent Task Execution: Utilize the built-in agent to autonomously tackle complex tasks.
  • Custom Agent Development: Extend the base agent architecture to create specialized agents tailored to specific workflows and behaviors.
  • Text User Interface (TUI): Interact with the agent and manage tasks from the terminal.
  • Cross-Platform Compatibility: Built to operate effortlessly on both Windows and Linux environments.

Usage Scenarios

White Collar Agent is aimed at organizations, researchers, and developers investigating System-Based Agentic AI, Runtime Code Generation, and Autonomous Execution. It is intended to automate workflows, with the goal of increasing efficiency and accuracy in task execution.

System Requirements

  • Requires Python version 3.9 or higher, along with git, conda, and pip.
  • Requires an API key from a language model provider such as OpenAI or Gemini.
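
The requirements above can be checked before the first run. The following is a minimal sketch, not part of the project: the preflight function and its parameters are hypothetical, and OPENAI_API_KEY is the only key name taken from this page (add the variable name your provider expects).

```python
import os
import shutil
import sys


def preflight(env=None, required_tools=("git", "conda", "pip"),
              key_names=("OPENAI_API_KEY",)):
    """Return a list of problems found; an empty list means every check passed.

    Hypothetical helper for illustration; it is not shipped with White Collar Agent.
    """
    env = os.environ if env is None else env
    problems = []

    # The page states Python 3.9 or higher is required.
    if sys.version_info < (3, 9):
        problems.append("Python 3.9 or higher is required")

    # Each required tool must be discoverable on PATH.
    for tool in required_tools:
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")

    # At least one of the listed API key variables must be set and non-empty.
    if not any(env.get(name) for name in key_names):
        problems.append("no API key set (looked for: " + ", ".join(key_names) + ")")

    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("WARNING:", problem)
```

Running the script prints one warning per failed check and nothing when the environment looks ready.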

Quick Start Example

To run the built-in White Collar Agent, simply export the necessary API key and invoke the CLI tool:

export OPENAI_API_KEY=<YOUR_KEY_HERE>
python -m core.main

Custom Agent Development

The base agent structure allows users to create personalized agents by subclassing core components. Here’s an example of how to create a custom agent:

import asyncio
import os

from core.agent_base import AgentBase


class MyCustomAgent(AgentBase):
    def __init__(self, *, data_dir: str = "core/data", chroma_path: str = "./chroma_db"):
        super().__init__(data_dir=data_dir, chroma_path=chroma_path)
        # Your implementation

    # Defined at class level so that it overrides the base-class method.
    def _generate_role_info_prompt(self) -> str:
        return (
            "You are MyCustomAgent, an intelligent research assistant. "
            "Your role is to find, summarize, and synthesize information from multiple sources."
        )


# Read paths from the environment, falling back to the defaults.
agent = MyCustomAgent(
    data_dir=os.getenv("DATA_DIR", "core/data"),
    chroma_path=os.getenv("CHROMA_PATH", "./chroma_db"),
)
asyncio.run(agent.run())
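
The subclassing pattern can be tried without installing the project. The sketch below uses a stand-in base class; StubAgentBase and build_system_prompt are hypothetical, and it is only an assumption that the real AgentBase calls _generate_role_info_prompt in a similar way when it builds its prompt.

```python
class StubAgentBase:
    """Hypothetical stand-in for core.agent_base.AgentBase; not the real class."""

    def _generate_role_info_prompt(self) -> str:
        return "You are a general-purpose agent."

    def build_system_prompt(self) -> str:
        # Hypothetical method: the base class calls the hook, so a subclass
        # only needs to override the hook itself.
        return self._generate_role_info_prompt() + " Follow the task document."


class ResearchAgent(StubAgentBase):
    # The override must be a method of the class. A function nested inside
    # __init__ would be a local function and would never be called by the base.
    def _generate_role_info_prompt(self) -> str:
        return "You are a research assistant."


print(ResearchAgent().build_system_prompt())
```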

Architecture Overview

  • Base Agent (AgentBase): Core engine for reasoning and execution, which can be used as is or extended.
  • Action/Tool Library: Provides reusable functions for tasks like web searches and file operations.
  • Task Document: A structured outline detailing the agent’s objectives and methodology.
  • Planner/Executor: Manages goal decomposition and the execution process.
  • LLM Wrapper: Facilitates interactions with various language models, enhancing versatility.
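
How these components might fit together can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the project's code: TOOLS, Step, TaskDocument, fake_llm, plan, and execute are all hypothetical names, and the canned plan replaces a real language model call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Action/tool library: a registry mapping tool names to plain callables (hypothetical).
TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda query: f"results for {query!r}",
    "write_file": lambda text: f"wrote {len(text)} characters",
}


@dataclass
class Step:
    tool: str
    argument: str


@dataclass
class TaskDocument:
    """Structured record of the objective, the planned steps, and what happened."""
    objective: str
    steps: List[Step] = field(default_factory=list)
    log: List[str] = field(default_factory=list)


def fake_llm(prompt: str) -> str:
    """Stand-in for the LLM wrapper: returns a canned plan, one 'tool: argument' per line."""
    return "search: agentic AI frameworks\nwrite_file: summary of findings"


def plan(task: TaskDocument, llm: Callable[[str], str] = fake_llm) -> None:
    """Planner: ask the model to decompose the objective and parse its reply into steps."""
    for line in llm(f"Plan steps for: {task.objective}").splitlines():
        tool, _, argument = line.partition(":")
        task.steps.append(Step(tool.strip(), argument.strip()))


def execute(task: TaskDocument) -> None:
    """Executor: run each step through the tool registry and record the outcome."""
    for step in task.steps:
        action = TOOLS.get(step.tool)
        if action is None:
            task.log.append(f"{step.tool}: unknown tool, skipped")
            continue
        task.log.append(f"{step.tool}: {action(step.argument)}")


task = TaskDocument(objective="Survey agentic AI frameworks")
plan(task)
execute(task)
print("\n".join(task.log))
```

Keeping the model call behind a plain callable is what lets the planner work with different providers: swapping fake_llm for a real client changes nothing else in the loop.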

Community Contributions

This open-source project encourages community involvement. Contributions and suggestions are welcomed to improve White Collar Agent. For feedback, the maintainer can be contacted through their GitHub profile or email at thamyikfoong(at)craftos.net.

Explore the capabilities of White Collar Agent and leverage its functionality to automate complex tasks and enhance productivity in diverse environments.
