Billable Lines of Code (BLoC)
Overview
AuditAgent uses a token-based estimation method called Billable Lines of Code (BLoC) to determine scan pricing. This page explains exactly how it works and why.
The Formula
estimated_bloc = ceil(tokens / 10)
Worked example: a contract with 5,000 tokens is billed as ceil(5,000 / 10) = 500 BLoC.
Here, "tokens" refers to the count produced by the tokenizer of our language models, the same tokenization used during the actual AI analysis.
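The formula can be sketched in a few lines of Python. The function name `estimate_bloc` and the parameter `tokens_per_line` are illustrative assumptions, not AuditAgent's actual code. The token count is taken as an input because the production tokenizer is not described on this page.

```python
import math


def estimate_bloc(token_count: int, tokens_per_line: float = 10.0) -> int:
    """Return the estimated Billable Lines of Code for a given token count.

    Hypothetical helper: mirrors estimated_bloc = ceil(tokens / 10).
    """
    if token_count < 0:
        raise ValueError("token_count must be non-negative")
    # ceil rounds any fractional line up to the next whole line.
    return math.ceil(token_count / tokens_per_line)


# Worked example from the text: 5,000 tokens.
print(estimate_bloc(5000))
```

Because of the ceiling, a count that is not a multiple of 10 (for example 5,001 tokens) is billed as the next whole line.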
Why Token-Based Counting?
Formatting Independence
- Traditional SLOC varies with style (80 vs 120 char lines, one-liners, etc.)
- Token-based counting produces the same result regardless of formatting
- Prettier config, tabs vs spaces, compressed code: all are treated equally
- Prevents gaming (no one can collapse a contract onto a single line to be charged for 1 line)
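The contrast between line counting and token counting can be illustrated with a small sketch. The tokenizer here is a deliberately simplified stand-in (a regex that splits on words and punctuation and ignores whitespace). It is an assumption for illustration only and is not the tokenizer AuditAgent uses. The Solidity-style snippet is likewise made up.

```python
import re


def toy_token_count(source: str) -> int:
    """Count tokens with a hypothetical stand-in tokenizer.

    Words and individual punctuation marks are tokens; whitespace is ignored.
    """
    return len(re.findall(r"\w+|[^\w\s]", source))


def line_count(source: str) -> int:
    """Count non-blank lines, as a simple traditional counter might."""
    return sum(1 for line in source.splitlines() if line.strip())


# The same code, formatted two ways.
multi_line = """function add(uint a, uint b) public pure returns (uint) {
    return a + b;
}"""
one_line = "function add(uint a, uint b) public pure returns (uint) { return a + b; }"

for label, src in (("multi-line", multi_line), ("one-line", one_line)):
    print(label, "lines:", line_count(src), "tokens:", toy_token_count(src))
```

Under this toy tokenizer the line count changes with formatting while the token count does not, which is the property the list above describes.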
Why Comments Are Included
- Unlike traditional SLOC tools, BLoC includes comments
- LLMs use comments to understand code intent, context, and documentation
- Comments are the key differentiator between static analysis and LLM analysis
- Well-documented code is NOT penalized — it's fairly reflected
Including comments means the count reflects the actual content our AI models process during analysis. This is fair because comments directly contribute to analysis quality.
Validation: Is This Fair?
Our Analysis
We analyzed 344 scans (336 of them completed) over 3 months, comparing each BLoC estimate against the traditional line count for the same code.
Key Results
| Metric | Value |
|---|---|
| Scans analyzed | 344 (336 completed) |
| Analysis period | 3 months |
| Mean difference | -50 lines (-4.86%) |
| Median difference | -32.5 lines (-7.14%) |
| Interquartile range (IQR) | -16.84% to +2.43% |
A negative value means BLoC underestimates the traditional line count. On average, users are billed for about 5% fewer lines than a traditional counter would report, and the median user is billed for about 7% fewer. This user-favorable bias comes from the choice of ratio, not from rounding: the ceil in the formula only ever rounds up, by less than one line.
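To make the sign convention concrete, the sketch below computes the difference for a single scan. The helper name `bloc_difference` is hypothetical, the input numbers are made up rather than taken from the study, and dividing by the traditional line count is an assumption about how the percentage is defined.

```python
import math


def bloc_difference(token_count: int, traditional_lines: int) -> tuple[int, float]:
    """Return (difference in lines, difference in percent) for one scan.

    Negative values mean BLoC is lower than the traditional line count.
    Assumption: the percentage is relative to the traditional line count.
    """
    bloc = math.ceil(token_count / 10)
    diff = bloc - traditional_lines
    pct = 100.0 * diff / traditional_lines
    return diff, pct


# Made-up scan: 3,720 tokens in a file a traditional counter reports as 400 lines.
diff, pct = bloc_difference(3720, 400)
print(diff, pct)
```

In this made-up case the user is billed for 372 lines instead of 400, so both the line difference and the percentage are negative.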
Edge Cases
- Very small files (fewer than 20 lines) can show a higher percentage of overestimation because the formula rounds up, but the absolute difference is minimal
- Larger files show wider absolute variance, but the percentage stays reasonable
- The largest observed underestimation was 1,455 lines (a difference of -1,455) on a large codebase
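The small-file effect follows directly from the ceiling in the formula: rounding adds less than one line, which is a large share of a tiny file and a negligible share of a large one. A minimal sketch, with a hypothetical helper name and made-up token counts:

```python
import math


def rounding_overhead_pct(token_count: int, tokens_per_line: int = 10) -> float:
    """Percentage added by rounding up, relative to the unrounded value."""
    if token_count <= 0:
        raise ValueError("token_count must be positive")
    exact = token_count / tokens_per_line
    return 100.0 * (math.ceil(exact) - exact) / exact


# Made-up token counts: a tiny file and a large one, both 1 token past a multiple of 10.
for tokens in (41, 5001):
    print(tokens, round(rounding_overhead_pct(tokens), 2))
```

The absolute overhead is under one line in both cases, but it is a much larger percentage of the tiny file.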
Why a Ratio of 10.0?
- We tested ratios from 9.0 to 12.0
- 10.0 minimizes the mean error while keeping the bias user-favorable
- Lower ratios (9.8-9.9) would shift toward overcharging users
- Higher ratios (10.5-12.0) would increase underestimation unsustainably
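A ratio comparison of this kind can be sketched as a simple sweep over candidate values. This is not the analysis script behind the published figures: the function name and the sample data are made up for illustration, and mean signed percentage error is an assumed choice of metric.

```python
import math
from statistics import mean


def mean_signed_error_pct(samples: list[tuple[int, int]], ratio: float) -> float:
    """Mean of (estimate - actual) / actual, in percent, for one ratio.

    Each sample is (token_count, traditional_line_count).
    Positive values mean overcharging; negative values favor the user.
    """
    return mean(
        100.0 * (math.ceil(tokens / ratio) - lines) / lines
        for tokens, lines in samples
    )


# Made-up samples, NOT the 344 scans from the study.
samples = [(5000, 520), (12000, 1230), (800, 85)]

for ratio in (9.0, 10.0, 11.0, 12.0):
    print(ratio, round(mean_signed_error_pct(samples, ratio), 2))
```

On these made-up samples, lowering the ratio pushes the mean error positive (overcharging) and raising it pushes the error further negative, which matches the direction described in the list above.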
Summary
BLoC is formatting-independent, reflects actual analysis workload, and statistically favors users by a small margin. We publish this data so you can verify our approach.