White Collar Agent is a versatile AI tool designed to automate complex computer and browser tasks efficiently. With a seamless transition between CLI and GUI modes, it empowers users to execute intricate workflows. Ideal for developers and researchers, this open-source project offers a foundation for creating specialized agents, fostering innovation in system-based AI.
White Collar Agent is a general-purpose computer-use AI agent for computer-based and browser-based tasks. It interprets a task, plans the actions required, and executes them autonomously to reach the stated objective. Depending on task complexity, it switches between Command-Line Interface (CLI) and Graphical User Interface (GUI) modes.
Key Features
- Intelligent Task Execution: Utilize the built-in agent to autonomously tackle complex tasks.
- Custom Agent Development: Extend the base agent architecture to create specialized agents tailored to specific workflows and behaviors.
- TUI Interface: Engage with the agent through an intuitive Text User Interface, enabling interactive task management.
- Cross-Platform Compatibility: Built to operate effortlessly on both Windows and Linux environments.
Usage Scenarios
White Collar Agent is aimed at organizations, researchers, and developers who are investigating System-Based Agentic AI, Runtime Code Generation, and Autonomous Execution and who want to automate workflows with it.
System Requirements
- Requires Python version 3.9 or higher, along with git, conda, and pip.
- Users must obtain an API key from a language model provider such as OpenAI or Gemini to maximize functionality.
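The requirements above can be checked before the first run. The following is a minimal sketch, not part of the project: the function name check_environment is hypothetical, and GEMINI_API_KEY is an assumed variable name (only OPENAI_API_KEY appears in the quick start).

```python
import os
import shutil
import sys

REQUIRED_TOOLS = ("git", "conda", "pip")
# OPENAI_API_KEY comes from the quick start; GEMINI_API_KEY is an assumed name.
API_KEY_VARS = ("OPENAI_API_KEY", "GEMINI_API_KEY")


def check_environment(env=None, which=shutil.which, version=None):
    """Return a list of problems; an empty list means the prerequisites look met."""
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else tuple(version)
    problems = []

    # Python 3.9 or higher is required.
    if version < (3, 9):
        problems.append(f"Python 3.9+ required, found {version[0]}.{version[1]}")

    # Each command-line tool must be discoverable on PATH.
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")

    # At least one provider key must be set to a non-empty value.
    if not any(env.get(name) for name in API_KEY_VARS):
        problems.append("no API key set (tried: " + ", ".join(API_KEY_VARS) + ")")

    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("problem:", problem)
```

The env, which, and version parameters exist only so the check can be exercised without touching the real machine state; calling check_environment() with no arguments inspects the current interpreter, PATH, and environment.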
Quick Start Example
To run the built-in White Collar Agent, simply export the necessary API key and invoke the CLI tool:
export OPENAI_API_KEY=<YOUR_KEY_HERE>
python -m core.main
Custom Agent Development
The base agent structure allows users to create personalized agents by subclassing core components. Here’s an example of how to create a custom agent:
import asyncio
import os  # needed for os.getenv below

from core.agent_base import AgentBase

class MyCustomAgent(AgentBase):
    def __init__(self, *, data_dir: str = "core/data", chroma_path: str = "./chroma_db"):
        super().__init__(data_dir=data_dir, chroma_path=chroma_path)
        # Your implementation

    def _generate_role_info_prompt(self) -> str:
        return "You are MyCustomAgent, an intelligent research assistant. Your role is to find, summarize, and synthesize information from multiple sources."

# DATA_DIR and CHROMA_PATH override the default paths when set in the environment.
agent = MyCustomAgent(data_dir=os.getenv("DATA_DIR", "core/data"), chroma_path=os.getenv("CHROMA_PATH", "./chroma_db"))
asyncio.run(agent.run())
Architecture Overview
- BaseAgent (imported as AgentBase from core.agent_base in the example above): Core engine for reasoning and execution, which can be used as is or extended.
- Action/Tool Library: Provides reusable functions for tasks like web searches and file operations.
- Task Document: A structured outline detailing the agent’s objectives and methodology.
- Planner/Executor: Manages goal decomposition and the execution process.
- LLM Wrapper: Facilitates interactions with various language models, enhancing versatility.
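To make the relationship between these components concrete, here is a toy sketch of how a planner, an executor, an action library, and a task document could fit together. It is an illustration only and does not reflect the project's actual classes: every name below (TaskDocument, ActionLibrary, plan, execute, fake_llm, build_demo_library) is hypothetical, and fake_llm is a canned stand-in for a real LLM wrapper.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Action = Callable[[str], str]  # hypothetical tool signature: text in, text out


@dataclass
class TaskDocument:
    """Structured record of the goal, the planned steps, and their results."""
    goal: str
    steps: List[Tuple[str, str]] = field(default_factory=list)  # (action, argument)
    results: List[str] = field(default_factory=list)


class ActionLibrary:
    """Registry of reusable actions, looked up by name."""

    def __init__(self) -> None:
        self._actions: Dict[str, Action] = {}

    def register(self, name: str, fn: Action) -> None:
        self._actions[name] = fn

    def run(self, name: str, argument: str) -> str:
        if name not in self._actions:
            raise KeyError(f"unknown action: {name}")
        return self._actions[name](argument)


def plan(goal: str, llm: Callable[[str], str]) -> TaskDocument:
    """Planner: ask the LLM for one 'action: argument' step per line."""
    doc = TaskDocument(goal=goal)
    for line in llm(f"Plan steps for: {goal}").splitlines():
        name, sep, argument = line.partition(":")
        if sep:  # skip lines that do not follow the expected format
            doc.steps.append((name.strip(), argument.strip()))
    return doc


def execute(doc: TaskDocument, library: ActionLibrary) -> TaskDocument:
    """Executor: run each planned step in order and record its result."""
    for name, argument in doc.steps:
        doc.results.append(library.run(name, argument))
    return doc


def fake_llm(prompt: str) -> str:
    """Stand-in for an LLM wrapper; always returns the same two-step plan."""
    return "search: agentic AI\nwrite_file: notes.txt"


def build_demo_library() -> ActionLibrary:
    """Register two dummy actions that only describe what they would do."""
    library = ActionLibrary()
    library.register("search", lambda query: f"results for {query}")
    library.register("write_file", lambda path: f"wrote {path}")
    return library


if __name__ == "__main__":
    doc = execute(plan("Summarize agentic AI", fake_llm), build_demo_library())
    for (name, argument), result in zip(doc.steps, doc.results):
        print(f"{name}({argument!r}) -> {result}")
```

Keeping the plan in a separate document object means the executor needs no knowledge of how the plan was produced, so the LLM stand-in can be swapped for a real provider without touching the execution loop.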
Community Contributions
This open-source project encourages community involvement. Contributions and suggestions are welcomed to improve White Collar Agent. For feedback, the maintainer can be contacted through their GitHub profile or email at thamyikfoong(at)craftos.net.
Explore the capabilities of White Collar Agent and leverage its functionality to automate complex tasks and enhance productivity in diverse environments.