
Billable Lines of Code (BLoC)

Overview

AuditAgent uses a token-based estimation method called Billable Lines of Code (BLoC) to determine scan pricing. This page explains exactly how it works and why.

The Formula

estimated_bloc = ceil(tokens / 10)

Worked example: A contract that tokenizes to 5,000 tokens is billed as ceil(5,000 / 10) = 500 BLoC.

Here, "tokens" refers to the tokenizer count from our language models — the same tokenization used during the actual AI analysis.

Why Token-Based Counting?

Formatting Independence

  • Traditional SLOC varies with style (80 vs 120 char lines, one-liners, etc.)
  • Token-based counting produces the same result regardless of formatting (see the sketch after this list)
  • Prettier config, tab vs spaces, compressed code — all treated equally
  • Prevents gaming (no one can one-line a contract to be charged 1 line)
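
A rough illustration of this, using the open-source tiktoken tokenizer as a stand-in (the production tokenizer may differ): reformatting the same function changes its raw line count dramatically, while the token count, and therefore the BLoC figure, barely moves.

    import math
    import tiktoken  # stand-in tokenizer for illustration only

    enc = tiktoken.get_encoding("cl100k_base")

    formatted = (
        "function add(uint256 a, uint256 b) public pure returns (uint256) {\n"
        "    return a + b;\n"
        "}\n"
    )
    one_lined = (
        "function add(uint256 a, uint256 b) public pure returns (uint256) { return a + b; }\n"
    )

    for name, src in [("formatted", formatted), ("one-lined", one_lined)]:
        lines = src.count("\n")
        tokens = len(enc.encode(src))
        print(f"{name}: {lines} lines, {tokens} tokens, {math.ceil(tokens / 10)} BLoC")

Whitespace can shift the token count by a token or two, but nothing like the 3-to-1 swing in raw line counts, which is what makes the metric hard to game.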

Why Comments Are Included

  • Unlike traditional SLOC tools, BLoC includes comments
  • LLMs use comments to understand code intent, context, and documentation
  • Comments are the key differentiator between static analysis and LLM analysis
  • Well-documented code is NOT penalized — it's fairly reflected

Including comments means the count reflects the actual content our AI models process during analysis. This is fair because comments directly contribute to analysis quality.

Validation: Is This Fair?

Our Analysis

We analyzed 344 scans over three months, comparing each BLoC estimate against the actual traditional line count.

Key Results

Metric                         Value
Scans analyzed                 344 (336 completed)
Analysis period                3 months
Mean difference                -50 lines (-4.86%)
Median difference              -32.5 lines (-7.14%)
Interquartile range (IQR)      -16.84% to +2.43%

What the numbers mean

A negative percentage means we underestimate the line count. On average, users are billed for about 5% fewer lines than a traditional counter would report, and the median user is billed for roughly 7% fewer. We intentionally calibrate the ratio in users' favor.
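
For readers who want to see how these metrics are derived, the sketch below computes the same per-scan differences; the (billed BLoC, traditional line count) pairs are hypothetical stand-ins, not our scan data.

    from statistics import mean, median, quantiles

    # Hypothetical (billed BLoC, traditional line count) pairs, for illustration only
    scans = [(480, 512), (1200, 1310), (95, 90), (2300, 2410), (640, 700)]

    line_diffs = [bloc - lines for bloc, lines in scans]
    pct_diffs = [(bloc - lines) / lines * 100 for bloc, lines in scans]

    print("mean difference (lines):", mean(line_diffs))
    print("median difference (%):", median(pct_diffs))
    q1, _, q3 = quantiles(pct_diffs, n=4)  # quartiles of the % differences
    print("IQR (%):", (round(q1, 2), round(q3, 2)))

A negative value means that scan was billed for fewer lines than a traditional counter would report.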

Edge Cases

  • Very small files (fewer than 20 lines) can show higher % overestimation due to rounding, but the absolute difference is minimal
  • Larger files show wider absolute variance, but the percentage stays reasonable
  • The largest observed underestimation was -1,455 lines on a large codebase

Why a Ratio of 10.0?

  • We tested ratios from 9.0 to 12.0 (a sketch of this kind of sweep follows this list)
  • 10.0 minimizes the mean error while keeping the bias user-favorable
  • Lower ratios (9.8-9.9) would shift toward overcharging users
  • Higher ratios (10.5-12.0) would increase underestimation unsustainably
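
The sweep itself is straightforward to reproduce in outline. The sketch below shows its shape; the (tokens, traditional lines) pairs are hypothetical, whereas the real evaluation ran over the full scan dataset.

    import math
    from statistics import mean

    # Hypothetical (tokens, traditional line count) pairs; the real sweep used all scans
    scans = [(5_000, 520), (12_400, 1_300), (800, 75)]

    for ratio in (9.0, 9.5, 10.0, 10.5, 11.0, 12.0):
        errors = [(math.ceil(tokens / ratio) - lines) / lines for tokens, lines in scans]
        print(f"ratio {ratio:4}: mean relative error {mean(errors):+.2%}")

A small negative mean error means users are, on average, billed slightly below the traditional count, which is the bias we want to preserve.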

Summary

Our commitment

BLoC is formatting-independent, reflects actual analysis workload, and statistically favors users by a small margin. We publish this data so you can verify our approach.

Last validated: January 2026 — based on 344 scans