Billable Lines of Code (BLoC)

Overview

AuditAgent uses a token-based estimation method called Billable Lines of Code (BLoC) to determine scan pricing. This page explains exactly how it works and why.

The Formula

estimated_bloc = ceil(tokens / 10)

Worked example: a contract with 5,000 tokens is estimated at ceil(5,000 / 10) = 500 BLoC.

Here, "tokens" is the count produced by our language models' tokenizer, the same tokenization used during the actual AI analysis.
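The formula can be sketched in a few lines of Python. This is a minimal illustration, not AuditAgent's implementation: the function name `estimate_bloc` and the constant `TOKENS_PER_BLOC` are hypothetical, and the token count is assumed to be supplied by the caller.

```python
TOKENS_PER_BLOC = 10  # ratio from the formula above (name is hypothetical)


def estimate_bloc(tokens: int) -> int:
    """Return ceil(tokens / 10), computed with integer arithmetic."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    # Adding (ratio - 1) before floor division rounds any remainder up.
    return (tokens + TOKENS_PER_BLOC - 1) // TOKENS_PER_BLOC


if __name__ == "__main__":
    print(estimate_bloc(5000))  # the worked example: 5,000 tokens
    print(estimate_bloc(5001))  # one extra token rounds up to the next BLoC
```

Integer arithmetic avoids any floating-point surprises on very large token counts; `math.ceil(tokens / 10)` would behave the same for realistic inputs.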

Why Token-Based Counting?

Formatting Independence

Traditional SLOC varies with style (80- vs 120-character lines, one-liners, and so on).
Token-based counting produces the same result regardless of formatting.
Prettier config, tabs vs spaces, compressed code: all are treated equally.
This prevents gaming: no one can collapse a contract onto a single line to be charged for 1 line.
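To illustrate the idea, the sketch below compares a line count with a token count for the same function written two ways. It assumes a toy regex tokenizer that ignores whitespace; this is a stand-in for illustration only, not the production tokenizer, and all names are hypothetical.

```python
import re

# Toy stand-in tokenizer: identifiers, integer literals, or single
# punctuation characters. Whitespace is never a token here (an assumption).
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[^\w\s]")


def toy_token_count(source: str) -> int:
    """Count tokens with the toy tokenizer."""
    return len(TOKEN_RE.findall(source))


def line_count(source: str) -> int:
    """Count non-blank lines, as a simple SLOC-style measure."""
    return sum(1 for line in source.splitlines() if line.strip())


expanded = """function add(uint a, uint b) public pure returns (uint) {
    return a + b;
}
"""
one_liner = "function add(uint a,uint b) public pure returns(uint){return a+b;}"

if __name__ == "__main__":
    for name, src in (("expanded", expanded), ("one_liner", one_liner)):
        print(name, "lines:", line_count(src), "tokens:", toy_token_count(src))
```

The line counts differ between the two versions while the toy token counts match, which is the property the list above describes.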

Why Comments Are Included

Unlike traditional SLOC tools, BLoC includes comments
LLMs use comments to understand code intent, context, and documentation
Comments are the key differentiator between static analysis and LLM analysis
Well-documented code is NOT penalized; it is fairly reflected

Including comments means the count reflects the actual content our AI models process during analysis. This is fair because comments directly contribute to analysis quality.

Validation: Is This Fair?

Our Analysis

We analyzed 344 scans (336 of which completed) over 3 months, comparing BLoC estimates against traditional line counts.

Key Results

Metric	Value
Scans analyzed	344 (336 completed)
Analysis period	3 months
Mean difference	-50 lines (-4.86%)
Median difference	-32.5 lines (-7.14%)
Interquartile range (IQR)	-16.84% to +2.43%

What the numbers mean

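A negative value means the BLoC estimate is lower than the traditional line count, that is, we underestimate. On average, users are billed for about 5% fewer lines than a traditional counter would report, and the median user is billed for about 7% fewer. The interquartile range shows that the middle half of scans fall between 16.84% under and 2.43% over the traditional count. This bias in favor of our users is intentional.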
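The sign convention and summary statistics can be sketched as follows. The data below is hypothetical (it is not the scan data behind the table), the function names are illustrative, and the exact aggregation used for the published figures is an assumption: here each scan's difference is the BLoC estimate minus the traditional line count, and percentages are relative to the traditional count.

```python
from statistics import mean, median, quantiles

# Hypothetical (estimated_bloc, traditional_line_count) pairs, NOT real scans.
scans = [(95, 100), (180, 200), (310, 300), (450, 500), (42, 40)]


def signed_diff(estimated: int, actual: int) -> int:
    """Negative means the estimate is below the traditional line count."""
    return estimated - actual


def signed_pct(estimated: int, actual: int) -> float:
    """Signed difference as a percentage of the traditional line count."""
    return 100.0 * (estimated - actual) / actual


def summarize(pairs):
    """Return mean/median differences and the IQR of percentage differences."""
    diffs = [signed_diff(e, a) for e, a in pairs]
    pcts = [signed_pct(e, a) for e, a in pairs]
    q1, _, q3 = quantiles(pcts, n=4)  # quartile cut points
    return {
        "mean_diff": mean(diffs),
        "median_diff": median(diffs),
        "mean_pct": mean(pcts),
        "median_pct": median(pcts),
        "iqr_pct": (q1, q3),
    }


if __name__ == "__main__":
    for key, value in summarize(scans).items():
        print(key, value)
```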

Edge Cases

Very small files (fewer than 20 lines) can show a higher percentage overestimation due to rounding, but the absolute difference is minimal
Larger files show wider absolute variance, but the percentage difference stays reasonable
The largest observed underestimation was -1,455 lines on a large codebase
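The rounding effect on small files follows directly from the ceil step, which adds less than one BLoC in absolute terms but a larger share of a small total. The sketch below isolates only that rounding component under the published ratio of 10; it does not model how tokens map to lines, and the function name is hypothetical.

```python
def rounding_overshoot(tokens: int, ratio: int = 10) -> tuple[int, float]:
    """Return (BLoC, percentage added by rounding up) for a token count."""
    if tokens <= 0:
        raise ValueError("tokens must be positive")
    bloc = -(-tokens // ratio)   # integer ceil(tokens / ratio)
    exact = tokens / ratio       # unrounded value
    return bloc, 100.0 * (bloc - exact) / exact


if __name__ == "__main__":
    for tokens in (101, 1001, 10001):
        bloc, pct = rounding_overshoot(tokens)
        print(f"{tokens} tokens -> {bloc} BLoC, rounding adds {pct:.3f}%")
```

The absolute extra is always under one BLoC, so its relative weight shrinks as files grow.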

Why a Ratio of 10.0?

We tested ratios from 9.0 to 12.0
10.0 minimizes the mean error while keeping the bias user-favorable
Lower ratios (9.8-9.9) would shift toward overcharging users
Higher ratios (10.5-12.0) would increase underestimation unsustainably
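A ratio sweep of this kind can be sketched as below. The sample data is hypothetical and the selection criterion shown (mean signed error) is an assumption about how such a comparison might be run, not a description of the actual analysis. It only illustrates the direction of the effect: lower ratios raise the estimate, higher ratios lower it.

```python
import math

# Hypothetical (token_count, traditional_line_count) pairs, NOT real scans.
samples = [(1000, 105), (2500, 260), (480, 50), (7200, 730)]


def mean_signed_error(ratio: float, pairs) -> float:
    """Mean of (estimated BLoC - traditional lines); negative favors users."""
    errors = [math.ceil(tokens / ratio) - lines for tokens, lines in pairs]
    return sum(errors) / len(errors)


def sweep(ratios, pairs):
    """Map each candidate ratio to its mean signed error."""
    return {ratio: mean_signed_error(ratio, pairs) for ratio in ratios}


if __name__ == "__main__":
    for ratio, err in sweep([9.0, 9.8, 10.0, 10.5, 12.0], samples).items():
        print(f"ratio {ratio}: mean signed error {err:+.2f} lines")
```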

Summary

Our commitment

BLoC is formatting-independent, reflects actual analysis workload, and statistically favors users by a small margin. We publish this data so you can verify our approach.

Last validated: January 2026 — based on 344 scans
