tokentoll is a CLI tool and GitHub Action that analyzes LLM API calls in your code. It statically estimates the cost of each call and surfaces cost changes during code review, helping catch unexpected expenses before they ship. With zero runtime dependencies, it drops into existing workflows with minimal setup.
Key Features
- Static Analysis: Automatically scans Python code to identify LLM API calls and estimate their associated costs, ensuring developers are aware of the financial impact of their changes.
- Cost Impact Reporting: Outputs cost estimates for API calls in the terminal as well as comments on pull requests, streamlining the code review process and preventing costly surprises.
- Simplicity and Efficiency: Runs with zero runtime dependencies, so setup is a single install and there is nothing extra to maintain.
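To make the static-analysis idea concrete, here is a minimal sketch of the kind of AST walk such a scanner could perform. This is an illustration built on Python's standard `ast` module, not tokentoll's actual implementation; the suffix list and helper names are assumptions for the example.

```python
import ast

# Dotted call suffixes an LLM-call scanner might look for (assumed set,
# not tokentoll's real detection table).
LLM_CALL_SUFFIXES = {"chat.completions.create", "messages.create"}

def dotted_name(node: ast.AST) -> str:
    """Rebuild a dotted call path such as 'client.chat.completions.create'."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
    return ".".join(reversed(parts))

def find_llm_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, dotted path) for each call matching a known suffix."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            path = dotted_name(node.func)
            if any(path.endswith(suffix) for suffix in LLM_CALL_SUFFIXES):
                hits.append((node.lineno, path))
    return hits

sample = 'resp = client.chat.completions.create(model="gpt-4o", max_tokens=4096)'
print(find_llm_calls(sample))  # [(1, 'client.chat.completions.create')]
```

Because this operates purely on the parsed syntax tree, it needs no network access and no SDK installed, which is consistent with the zero-runtime-dependency design described above.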
The Problem Addressed
Switching models or adding new API calls can lead to significant cost increases, such as a roughly 15x hike when changing from gpt-4o-mini to gpt-4o. An unguarded API call could add $10,000/month to operational expenses. tokentoll addresses this by surfacing estimated costs during development and code review, before the change reaches production.
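The arithmetic behind a model-swap cost jump is simple to sketch. The per-token prices below are illustrative assumptions for this example, not tokentoll's pricing table; real rates change and should come from the provider's published pricing.

```python
# Assumed USD prices per 1M tokens (illustrative, not authoritative).
PRICE_PER_1M_INPUT = {"gpt-4o-mini": 0.15, "gpt-4o": 2.50}
PRICE_PER_1M_OUTPUT = {"gpt-4o-mini": 0.60, "gpt-4o": 10.00}

def monthly_cost(model, input_tokens, output_tokens, calls_per_month):
    """Estimated monthly cost for one call site, given a fixed token mix."""
    per_call = (input_tokens * PRICE_PER_1M_INPUT[model]
                + output_tokens * PRICE_PER_1M_OUTPUT[model]) / 1_000_000
    return per_call * calls_per_month

cheap = monthly_cost("gpt-4o-mini", 2000, 500, 100_000)
pricey = monthly_cost("gpt-4o", 2000, 500, 100_000)
print(f"${cheap:.2f}/month -> ${pricey:.2f}/month ({pricey / cheap:.1f}x)")
```

With these assumed prices, a single high-volume call site jumps from $60/month to $1000/month, on the order of the multiplier quoted above, which is exactly the kind of change worth flagging in review.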
Usage
tokentoll is driven by a few simple commands:
tokentoll scan . # Scan current directory for LLM API calls and their costs
tokentoll diff HEAD~1 # Show cost impact of the last commit
tokentoll diff main..feature-branch # Compare costs between branches
GitHub Action Integration
Integrating tokentoll into GitHub CI pipelines is straightforward. Here’s how you can set up a simple cost diff action:
name: LLM Cost Diff
on:
  pull_request:
    paths:
      - "**.py"
permissions:
  pull-requests: write
jobs:
  cost-diff:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: Jwrede/tokentoll@v0.6.1
Comprehensive Detection
tokentoll supports multiple SDKs, including:
- OpenAI
- Anthropic
- Google GenAI
- LiteLLM
- LangChain
- Zhipu AI
This breadth of support makes tokentoll versatile and adaptable across various projects and libraries.
Example Scenarios
When scanning your code, tokentoll produces detailed reports:
Example Output for tokentoll scan
LLM API Calls Detected
============================================================
File: src/agents/summarizer.py
  Line 42: openai client.chat.completions.create
    Model: gpt-4o | Max tokens: 4096
    Est. cost/call: $0.03 | Monthly (1000 calls/month per call site): $26.50
  Line 78: openai client.chat.completions.create
    Model: gpt-4o-mini | Max tokens: 1000
    Est. cost/call: $0.000301 | Monthly (1000 calls/month per call site): $0.30
------------------------------------------------------------
Total estimated monthly cost: $26.80
(assuming 1000 calls/month per call site)
This breakdown shows both per-call and projected monthly costs, so reviewers can see the financial impact of a change at a glance.
Configuration and Customization
Developers can customize tokentoll's behavior via a .tokentoll.yml configuration file, specifying defaults and overrides to tailor the tool to their specific project requirements.
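As one possible shape for such a file, the fragment below shows defaults and per-file overrides. The key names here are hypothetical, chosen only to illustrate the idea; consult tokentoll's documentation for the actual schema.

```yaml
# .tokentoll.yml -- illustrative only; key names are assumptions,
# not tokentoll's documented configuration schema.
defaults:
  calls_per_month: 1000        # assumed volume per call site
overrides:
  - path: src/agents/summarizer.py
    calls_per_month: 100000    # a hot path scanned at higher volume
```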
Future Enhancements
tokentoll currently supports Python; support for JavaScript/TypeScript SDKs is planned, which would extend its coverage to a broader range of codebases.
Conclusion
Overall, tokentoll combines static code analysis with cost estimation, keeping LLM spending visible throughout development and review.