Estimate LLM token usage and API costs with ease.
Pitch

Tokuin is a CLI tool for estimating token usage and API costs across LLM providers such as OpenAI and Anthropic. Built in Rust for performance and safety, it also offers multi-model comparison and load testing to support the development workflow.

Description

Tokuin is a command-line tool that estimates token usage and API costs when working with large language model (LLM) providers such as OpenAI, Anthropic, and others. Written in Rust, it prioritizes performance, portability, and safety.

Key Features

  • Token Count Estimation: Quickly analyze text prompts to determine token counts for specific models, including gpt-4 and gpt-3.5-turbo.
  • Cost Estimation: Calculate the associated API costs based on token pricing for each model, enabling effective budgeting and forecasting.
  • Multi-Model Comparison: Assess token usage and costs across different providers to make informed decisions.
  • Role-Based Breakdown: View token counts categorized by system, user, and assistant roles, providing deeper insights into interactions.
  • Multiple Input Formats: Accept plain text as well as JSON chat formats, accommodating a variety of input styles.
  • Flexible Output Options: Generate human-readable text or JSON outputs, ideal for scripting and further analysis.
  • Load Testing: With the optional load-testing feature enabled, run concurrent load tests against LLM APIs and collect real-time metrics on performance, latency, and cost.
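To make the cost-estimation feature concrete, here is a minimal sketch of the usual per-1K-token arithmetic; the prices below are illustrative assumptions, not Tokuin's bundled pricing data:

```python
# Hypothetical per-1K-token input prices (USD); real prices change often.
PRICES_PER_1K_INPUT = {
    "gpt-4": 0.03,
    "gpt-3.5-turbo": 0.0005,
}

def estimate_cost(model: str, tokens: int) -> float:
    """Cost = tokens / 1000 * per-1K price, the usual provider convention."""
    return tokens / 1000 * PRICES_PER_1K_INPUT[model]

print(f"${estimate_cost('gpt-4', 4):.4f}")  # prints $0.0001
```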

Usage Examples

Basic Token Counting

Estimate the token count for a given prompt:

echo "Hello, world!" | tokuin --model gpt-4

Output:

Model: gpt-4
Tokens: 4

Cost Analysis

Estimate the cost along with the token count:

echo "Hello, world!" | tokuin --model gpt-4 --price

Output:

Model: gpt-4
Tokens: 4
Cost: $0.0001 (input)

Role Breakdown

View a breakdown by role:

echo '[{"role":"system","content":"You are a helpful assistant"},{"role":"user","content":"Hello!"}]' | tokuin --model gpt-4 --breakdown --price

Output:

Model: gpt-4
Tokens: 15

System:     8 tokens
User:       2 tokens
Assistant:  0 tokens
------------------------------
Total:      15 tokens
Cost: $0.0005 (input)
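Note that the per-role counts (8 + 2 + 0 = 10) sum to less than the 15-token total: chat-format requests also spend tokens on per-message framing and reply priming, and the exact attribution of those tokens to roles is model-dependent. A minimal sketch of this accounting, using OpenAI-cookbook-style constants (3 framing tokens per message plus 3 priming tokens) and invented content-token counts as illustrative assumptions:

```python
# Hypothetical framing costs; real values depend on the model's chat format.
TOKENS_PER_MESSAGE = 3   # wraps each message (role marker and separators)
REPLY_PRIMING = 3        # tokens that prime the assistant's reply

def chat_total(content_tokens: list[int]) -> int:
    """Total = message contents plus per-message framing plus reply priming."""
    framing = TOKENS_PER_MESSAGE * len(content_tokens)
    return sum(content_tokens) + framing + REPLY_PRIMING

# Two messages whose contents tokenize to 5 and 1 tokens (illustrative):
print(chat_total([5, 1]))  # prints 15
```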

Multi-Model Comparison

Compare token counts and costs across multiple models:

echo "Hello, world!" | tokuin --compare gpt-4 gpt-3.5-turbo --price

Output:

Model              Tokens    Cost
-----------------------------------------------
gpt-4              4         $0.0001
gpt-3.5-turbo      4         $0.0000
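A comparison table like the one above boils down to the same per-1K arithmetic applied per model; a short sketch (prices again illustrative assumptions, not Tokuin's data):

```python
# Illustrative per-1K-token input prices (USD); not Tokuin's bundled data.
PRICES_PER_1K = {"gpt-4": 0.03, "gpt-3.5-turbo": 0.0005}

def cost_row(model: str, tokens: int) -> str:
    """One formatted comparison row: model, token count, rounded cost."""
    cost = tokens / 1000 * PRICES_PER_1K[model]
    return f"{model:<18} {tokens:<9} ${cost:.4f}"

print(f"{'Model':<18} {'Tokens':<9} Cost")
for model in PRICES_PER_1K:
    print(cost_row(model, 4))
```

Note that costs this small round to $0.0000 at four decimal places, as in the gpt-3.5-turbo row above.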

Load Testing

Run load tests against LLM APIs to measure performance, latency, and cost:

# Basic load test with OpenAI
export OPENAI_API_KEY="sk-openai-..."
echo "What is 2+2?" | tokuin load-test --model gpt-4 --runs 100 --concurrency 10 --openai-api-key "$OPENAI_API_KEY"

Output:

Running test...
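For a feel of the statistics such a load test can report, here is a sketch that computes mean and percentile latency over a batch of per-request timings; the timing values are invented and the report format is illustrative, not Tokuin's actual output:

```python
import statistics

def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile over sorted latency samples."""
    ordered = sorted(samples)
    k = max(0, min(len(ordered) - 1, round(p / 100 * (len(ordered) - 1))))
    return ordered[k]

# Invented per-request latencies from a hypothetical 8-request run.
latencies_ms = [120.0, 95.0, 210.0, 130.0, 88.0, 150.0, 101.0, 175.0]
print(f"mean: {statistics.mean(latencies_ms):.1f} ms")
print(f"p50:  {percentile(latencies_ms, 50):.1f} ms")
print(f"p95:  {percentile(latencies_ms, 95):.1f} ms")
```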

Supported Models

Tokuin supports a range of models from various providers, including:

  • OpenAI: gpt-4, gpt-3.5-turbo
  • Anthropic: claude-3-sonnet
  • Google Gemini (requires --features gemini)

For broader model support, including the many models available through OpenRouter, see the OpenRouter catalog.

Tokuin's modular architecture makes it straightforward to extend with new providers and to integrate token estimation and load testing into existing workflows. Contributions are welcome to expand functionality and support an ever-growing range of models.
