Billable Lines of Code (BLoC)

AuditAgent uses a token-based estimation method called Billable Lines of Code (BLoC) to determine scan pricing. This page explains how it works and why. For the per-scan cost calculation, see Scan Pricing.

The formula

BLoC is calculated from the token count produced by the tokenizer our language models use.

estimated_bloc = ceil(tokens / 10)

A 5,000-token contract equals 500 BLoC. "Tokens" here means the count from the model's tokenizer, the same tokenisation used during the actual AI analysis.
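The formula can be sketched in a few lines of Python. This is a minimal illustration, not AuditAgent's implementation: the function name `estimated_bloc` and the constant `TOKENS_PER_BLOC` are hypothetical, and the token count is assumed to have been obtained from the model's tokenizer already.

```python
TOKENS_PER_BLOC = 10  # ratio from the formula above (hypothetical constant name)


def estimated_bloc(tokens: int) -> int:
    """Return ceil(tokens / 10), computed with integer arithmetic."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    # -(-a // b) is the integer ceiling of a / b and avoids float rounding.
    return -(-tokens // TOKENS_PER_BLOC)


if __name__ == "__main__":
    print(estimated_bloc(5_000))  # the 5,000-token contract from the text
    print(estimated_bloc(5_001))  # one extra token rounds up to the next BLoC
```

Any partial group of ten tokens is rounded up to a whole BLoC, which is what `ceil` in the formula implies.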

Why token-based counting

Formatting independence

Traditional SLOC varies with style (80 versus 120 character lines, one-liners, and so on).
Token-based counting produces the same result regardless of formatting.
Prettier configuration, tabs versus spaces, and compressed code are all treated equally.
This also prevents gaming: no one can collapse a contract onto a single line to be billed for one line.

Why comments are included

Unlike traditional SLOC tools, BLoC includes comments.
LLMs use comments to understand code intent, context, and documentation.
Comments are the differentiator between static analysis and LLM analysis.
Well-documented code is not penalised. The count simply reflects what the model actually reads.

The agent reads NatSpec and inline comments as part of its working context, alongside any documentation you attach. See How It Works for the full pipeline.

Including comments means the count reflects the actual content our AI models process during analysis. This is fair because comments directly contribute to analysis quality.

Validation data

We analysed 344 scans over three months comparing BLoC estimates against traditional line counts.

Metric	Value
Scans analysed	344 (336 completed)
Analysis period	3 months
Mean difference	-50 lines (-4.86%)
Median difference	-32.5 lines (-7.14%)
Interquartile range (IQR)	-16.84% to +2.43%

What the numbers mean

A negative value means the BLoC estimate is lower than the traditional line count. On average, users are billed for about 5% fewer lines than a traditional counter would report, and the median user is billed for about 7% fewer. The ratio was chosen deliberately so that this bias favours users.
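To make the sign convention concrete, the sketch below computes the difference for a single file. It assumes the difference is BLoC minus the traditional line count, expressed as a percentage of the traditional count. That convention is inferred from the explanation above rather than stated outright, and the function name and the example figures are hypothetical.

```python
def signed_difference(bloc: int, traditional_lines: int) -> tuple[int, float]:
    """Return (difference in lines, difference in percent).

    Assumed convention: difference = BLoC - traditional line count,
    percentage relative to the traditional count.
    """
    if traditional_lines <= 0:
        raise ValueError("traditional_lines must be positive")
    diff = bloc - traditional_lines
    pct = 100.0 * diff / traditional_lines
    return diff, pct


if __name__ == "__main__":
    # Hypothetical file: 1,000 lines by a traditional counter, 950 BLoC.
    diff, pct = signed_difference(950, 1_000)
    print(f"{diff:+d} lines ({pct:+.2f}%)")
```

A negative result here means the file is billed for fewer lines than a traditional counter would report.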

Edge cases

Very small files (fewer than 20 lines) can show higher percentage overestimation due to rounding, though the absolute difference is minimal.
Larger files show wider absolute variance, but the percentage stays reasonable.
The largest observed underestimation was 1,455 lines, on a large codebase.
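One way to see why rounding matters more for small files: `ceil` adds less than one BLoC to any file, so the relative effect shrinks as the file grows. The sketch below measures only that rounding component. The helper name is hypothetical, and it does not model the other sources of difference between BLoC and traditional line counts.

```python
TOKENS_PER_BLOC = 10  # ratio from the formula (hypothetical constant name)


def rounding_overhead_pct(tokens: int) -> float:
    """Percentage added by ceil() relative to the unrounded tokens / 10."""
    if tokens <= 0:
        raise ValueError("tokens must be positive")
    exact = tokens / TOKENS_PER_BLOC
    billed = -(-tokens // TOKENS_PER_BLOC)  # integer ceiling
    return 100.0 * (billed - exact) / exact


if __name__ == "__main__":
    for tokens in (11, 101, 1_001, 10_001):
        print(f"{tokens:>6} tokens: +{rounding_overhead_pct(tokens):.2f}% from rounding")
```

The absolute overhead never reaches one full BLoC, which is consistent with the small absolute differences noted above for very small files.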

Why a ratio of 10.0

We tested ratios from 9.0 to 12.0.
10.0 minimises the mean error while keeping the bias user-favourable.
Lower ratios (9.8 to 9.9) would shift toward overcharging users.
Higher ratios (10.5 to 12.0) would increase underestimation unsustainably.
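The direction of these effects can be illustrated with a small sweep over candidate ratios. The sample pairs below are invented purely for illustration and are not taken from the 344-scan validation set. The function name and the choice of mean signed percentage difference as the error measure are also assumptions.

```python
import math
from statistics import mean

# Hypothetical (tokens, traditional_line_count) pairs, invented for illustration.
SAMPLES = [(1_950, 200), (4_800, 500), (9_300, 1_000), (28_500, 3_000)]


def mean_pct_difference(ratio: float, samples=SAMPLES) -> float:
    """Mean of (BLoC - lines) / lines, in percent, for a given tokens-per-BLoC ratio."""
    diffs = []
    for tokens, lines in samples:
        bloc = math.ceil(tokens / ratio)
        diffs.append(100.0 * (bloc - lines) / lines)
    return mean(diffs)


if __name__ == "__main__":
    for ratio in (9.0, 10.0, 12.0):
        print(f"ratio {ratio:>4}: mean difference {mean_pct_difference(ratio):+.2f}%")
```

Because a lower ratio divides the same token count by a smaller number, it always yields more BLoC and pushes the mean toward overcharging. A higher ratio does the opposite.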

Our commitment

BLoC is formatting-independent, reflects actual analysis workload, and statistically favours users by a small margin. We publish this data so you can verify our approach.

Last validated January 2026, based on 344 scans.
