Tokuin is a command-line interface (CLI) tool for estimating token usage and API costs across large language model (LLM) providers such as OpenAI and Anthropic. Written in Rust for performance, portability, and safety, it also offers multi-model comparison and load testing to support the development workflow.
Key Features
- Token Count Estimation: Quickly analyze text prompts to determine token counts for specific models, including gpt-4 and gpt-3.5-turbo.
- Cost Estimation: Calculate the associated API costs based on token pricing for each model, enabling effective budgeting and forecasting.
- Multi-Model Comparison: Assess token usage and costs across different providers to make informed decisions.
- Role-Based Breakdown: View token counts categorized by system, user, and assistant roles, providing deeper insights into interactions.
- Multiple Input Formats: Accept plain text as well as JSON chat formats, accommodating a variety of input styles.
- Flexible Output Options: Generate human-readable text or JSON outputs, ideal for scripting and further analysis.
- Load Testing: When enabled, conduct concurrent load tests against LLM APIs, offering real-time metrics on performance, latency, and cost analysis.
Usage Examples
Basic Token Counting
Estimate the token count for a given prompt:
```shell
echo "Hello, world!" | tokuin --model gpt-4
```

Output:

```
Model: gpt-4
Tokens: 4
```
Cost Analysis
Estimate the cost along with the token count:
```shell
echo "Hello, world!" | tokuin --model gpt-4 --price
```

Output:

```
Model: gpt-4
Tokens: 4
Cost: $0.0001 (input)
```
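The cost figure comes from per-model, per-1K-token pricing. A minimal sketch of that arithmetic in Python, using illustrative rates (these are assumptions for the example, not Tokuin's actual pricing table, which changes over time):

```python
# Illustrative per-1K-token input prices (assumed for this sketch,
# not Tokuin's real pricing data).
PRICES_PER_1K_INPUT = {
    "gpt-4": 0.03,
    "gpt-3.5-turbo": 0.0005,
}

def input_cost(model: str, tokens: int) -> float:
    """USD cost for `tokens` input tokens on `model`."""
    return tokens / 1000 * PRICES_PER_1K_INPUT[model]

# 4 tokens on gpt-4 at $0.03/1K -> $0.00012, displayed rounded as $0.0001.
print(f"${input_cost('gpt-4', 4):.4f}")
```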
Role Breakdown
View a breakdown by role:
```shell
echo '[{"role":"system","content":"You are a helpful assistant"},{"role":"user","content":"Hello!"}]' | tokuin --model gpt-4 --breakdown --price
```

Output:

```
Model: gpt-4
Tokens: 15
System: 8 tokens
User: 2 tokens
Assistant: 0 tokens
------------------------------
Total: 15 tokens
Cost: $0.0005 (input)
```
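The --breakdown view groups counts by message role. A rough sketch of that grouping in Python, using a naive whitespace split as a stand-in for real model tokenization (Tokuin uses proper per-model tokenizers, so its counts will differ):

```python
import json
from collections import defaultdict

def breakdown(chat_json: str) -> dict:
    """Tally an approximate token count per role (whitespace split, not BPE)."""
    counts = defaultdict(int)
    for msg in json.loads(chat_json):
        counts[msg["role"]] += len(msg["content"].split())
    return dict(counts)

chat = '[{"role":"system","content":"You are a helpful assistant"},{"role":"user","content":"Hello!"}]'
print(breakdown(chat))  # approximate word counts per role, not real token counts
```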
Multi-Model Comparison
Compare token counts and costs across multiple models:
```shell
echo "Hello, world!" | tokuin --compare gpt-4 gpt-3.5-turbo --price
```

Output:

```
Model           Tokens   Cost
-----------------------------------------------
gpt-4           4        $0.0001
gpt-3.5-turbo   4        $0.0000
```
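Conceptually, --compare repeats the same count-and-price step once per model and tabulates the rows. A sketch of that loop, again with assumed prices rather than Tokuin's real rates:

```python
# Assumed per-1K-token input prices, for illustration only.
PRICES_PER_1K_INPUT = {"gpt-4": 0.03, "gpt-3.5-turbo": 0.0005}

def compare(tokens: int, models: list[str]) -> list[tuple[str, int, float]]:
    """Return (model, tokens, cost) rows for the same prompt across models."""
    return [(m, tokens, tokens / 1000 * PRICES_PER_1K_INPUT[m]) for m in models]

for model, toks, cost in compare(4, ["gpt-4", "gpt-3.5-turbo"]):
    print(f"{model:<16}{toks:<9}${cost:.4f}")
```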
Load Testing
Run load tests against various LLM APIs to measure performance, latency, and costs effectively:
```shell
# Basic load test with OpenAI
export OPENAI_API_KEY="sk-openai-..."
echo "What is 2+2?" | tokuin load-test --model gpt-4 --runs 100 --concurrency 10 --openai-api-key "$OPENAI_API_KEY"
```

Output:

```
Running test...
```
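Conceptually, a load test fires --runs requests through a pool of --concurrency workers and records per-request latency. A minimal sketch of that pattern in Python, with a stubbed request function (a sleep stands in for the network call; no real API is contacted):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_request(prompt: str) -> float:
    """Stub for one API call; returns its latency in seconds."""
    start = time.perf_counter()
    time.sleep(0.01)  # stand-in for network + model time
    return time.perf_counter() - start

def load_test(prompt: str, runs: int, concurrency: int) -> dict:
    """Run `runs` requests across `concurrency` workers; summarize latency."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(fake_request, [prompt] * runs))
    return {
        "runs": runs,
        "avg_latency_s": sum(latencies) / runs,
        "max_latency_s": max(latencies),
    }

stats = load_test("What is 2+2?", runs=20, concurrency=5)
print(stats)
```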
Supported Models
Tokuin supports a range of models from various providers, including:
- OpenAI: gpt-4, gpt-3.5-turbo
- Anthropic: claude-3-sonnet
- Google Gemini (requires --features gemini)
For extensive model support, including numerous APIs provided through OpenRouter, visit the OpenRouter catalog.
Tokuin's modular architecture makes it straightforward to integrate token estimation and load testing into existing workflows and to extend the tool itself. The project welcomes contributions that enhance functionality and support an ever-expanding range of models.