
Detection Tiers

Ki!'s privacy engine runs entirely on your device. Detection happens across eight tiers in sequence: each tier catches what the previous one missed, with increasing sophistication and cost. Every tier completes before the prompt is sent.

Your prompt
┌──────────────────────────────────────────────────┐
│ Tier 0.1  Base64 / JWT Evasion Decoder           │
│ Tier 0    Dictionary (10,000+ names)             │
│ Tier 0.5  Entropy Scanner (API keys, secrets)    │
│ Tier 1    Rule Engine (50+ structured formats)   │
│ Tier 1.1  Phone Shield (libphonenumber)          │
│ Tier 1.2  Address Shield (EN/FR/ES/DE)           │
│ Tier 1.5  Greeting / Signature Context           │
│ Tier 2    Local SLM NER (Sovereign)              │
└──────────────────────────────────────────────────┘
Masked prompt → Cloud LLM

Tier 0.1 decodes Base64-encoded strings and JWT payloads before scanning. This closes the most common evasion vector: PII hidden inside encoded blobs (e.g., a Base64-encoded CSV or a JWT containing an email claim) passes through naive regex scanners undetected. Ki! decodes each blob, scans the decoded content, and re-encodes the result before the downstream tiers run.
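To make the evasion vector concrete, here is a minimal sketch of decoding a JWT payload so that an email claim becomes visible to a scanner. It is not Ki!'s implementation: the function names and both regular expressions are assumptions chosen for illustration, and the re-encoding step is left out.

```python
import base64
import binascii
import json
import re

# Hypothetical patterns, for illustration only (not Ki!'s actual rules).
JWT_RE = re.compile(r"\b[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*")
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    padded = segment + "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(padded)


def emails_in_jwt_payloads(text: str) -> list[str]:
    """Return email addresses found inside the payload of any JWT-like token."""
    found = []
    for match in JWT_RE.finditer(text):
        payload_segment = match.group(0).split(".")[1]
        try:
            payload = json.loads(b64url_decode(payload_segment))
        except (binascii.Error, ValueError):
            continue  # not a decodable JSON payload; leave it to later tiers
        # Scan the decoded JSON text, which a scanner of the raw token never sees.
        found.extend(EMAIL_RE.findall(json.dumps(payload)))
    return found
```

A scanner that only looks at the raw prompt sees three opaque dot-separated segments; scanning the decoded payload is what exposes the address.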


Tier 0 is a high-speed dictionary of 10,000+ common names, matched with a greedy longest-match strategy. This is the fastest tier: a pure in-memory lookup with no regex or AI overhead. It fires on full names, surnames, and common given names across EN/FR/ES/DE name corpora.

Any term you add to your custom dictionary in Settings fires here, before any network call.
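Greedy longest-match means that at each position the matcher prefers the longest dictionary entry that starts there, so "Sophie Müller" is reported once as a full name rather than as "Sophie" alone. The sketch below illustrates the idea with a toy dictionary; `NAMES` and `find_names` are hypothetical names, and Ki!'s real data structure is not described in this page.

```python
# Toy dictionary of lower-cased token tuples; a real corpus would be far larger.
NAMES = {
    ("sophie",),
    ("sophie", "müller"),
    ("chen",),
}

PUNCTUATION = ".,;:!?"


def find_names(text: str, names: set[tuple[str, ...]]) -> list[str]:
    """At each token position, take the longest dictionary entry that matches."""
    tokens = text.split()
    keys = [t.strip(PUNCTUATION).lower() for t in tokens]
    max_len = max((len(entry) for entry in names), default=0)
    hits, i = [], 0
    while i < len(tokens):
        # Try the longest candidate span first, then shrink.
        for size in range(min(max_len, len(tokens) - i), 0, -1):
            if tuple(keys[i:i + size]) in names:
                hits.append(" ".join(tokens[i:i + size]).strip(PUNCTUATION))
                i += size  # skip past the matched span
                break
        else:
            i += 1  # no entry starts here; move on one token
    return hits
```

For example, `find_names("Ask Sophie Müller, then Chen.", NAMES)` returns `["Sophie Müller", "Chen"]`.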


Tier 0.5 computes a Shannon entropy score for every token in the prompt. Strings above the entropy threshold (e.g., ghp_aBcD1234XyZ..., sk-live-..., base64-encoded keys) are flagged as high-confidence credentials even if they match no known pattern.

Catches: API keys, bearer tokens, private keys, long random identifiers.
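As a rough illustration of entropy scoring, the sketch below computes Shannon entropy in bits per character and flags long tokens above a cut-off. The threshold and minimum length are assumptions picked for the example; this page does not state the values Ki! uses.

```python
import math
from collections import Counter

# Hypothetical cut-offs, for illustration only.
ENTROPY_THRESHOLD = 3.5  # bits per character
MIN_LENGTH = 16          # short tokens carry too little signal


def shannon_entropy(token: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not token:
        return 0.0
    total = len(token)
    counts = Counter(token)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def high_entropy_tokens(prompt: str) -> list[str]:
    """Return whitespace-separated tokens that look like random secrets."""
    return [
        token
        for token in prompt.split()
        if len(token) >= MIN_LENGTH and shannon_entropy(token) > ENTROPY_THRESHOLD
    ]
```

A token made of one repeated character scores 0 bits, while a token whose characters are all distinct scores log2 of its length, which is why random-looking keys stand out from ordinary words.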


Tier 1 applies deterministic regex rules with mathematical validation:

Type                       | Validation method
Email                      | RFC 5321 syntax
IBAN                       | MOD-97 checksum
Credit card                | Luhn algorithm
SSN (US)                   | Format + range check
PESEL, NIN, BSN, DNI, CPF… | Country-specific checksum
Phone                      | Passed to Tier 1.1
Street address             | Passed to Tier 1.2

50+ national ID formats are covered across EU, US, UK, Brazil, and APAC jurisdictions.
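Two of the checks in the table are standard public algorithms and can be sketched briefly. The functions below implement the Luhn check for card numbers and the MOD-97 check for IBANs; the function names are hypothetical, and the regex matching that would precede these checks in a rule engine is omitted.

```python
def luhn_valid(number: str) -> bool:
    """Luhn check: double every second digit from the right, sum, test mod 10."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0


def iban_valid(iban: str) -> bool:
    """MOD-97 check: move the first four characters to the end, map letters
    to numbers (A=10 ... Z=35), and require the result mod 97 to equal 1."""
    compact = iban.replace(" ", "").upper()
    if len(compact) < 15 or not (compact.isascii() and compact.isalnum()):
        return False
    rearranged = compact[4:] + compact[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1
```

Checksum validation is what separates a real identifier from a digit string that merely has the right shape, which keeps false positives low without any model inference.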


Tier 1.1 validates phone numbers using libphonenumber rather than regex alone. This eliminates false positives on digit sequences (order numbers, product codes, timestamps) that match naive phone patterns but fail international validation. Only numbers that parse as valid phone numbers for at least one supported country code are masked.


Tier 1.2 is a heuristic street address parser that recognises EN, FR, ES, and DE address formats. It catches “23 rue de Rivoli, Paris 75001” and “4200 Wilson Blvd, Arlington VA” using structural grammar, not just keyword matching.


Tier 1.5 detects names in salutations, sign-offs, and introductory phrases that the dictionary might miss:

  • “My name is Jean-Baptiste Dubois”
  • “Kind regards, Sophie Müller”
  • “Hi Dr. Chen, as discussed…”

Trigger phrases activate a context window that treats the following proper noun as a probable name regardless of dictionary coverage.
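The trigger-and-window idea can be sketched as follows. The trigger list, the title list, and the three-word window are all assumptions made for this example; the page does not list Ki!'s actual trigger phrases or window size.

```python
import re

# Hypothetical triggers, titles, and window size, for illustration only.
TRIGGER_RE = re.compile(r"\b(?:my name is|kind regards,|hi|dear)\s+", re.IGNORECASE)
TITLES = {"dr", "mr", "mrs", "ms", "prof"}
WINDOW = 3  # maximum number of words treated as part of the name
PUNCTUATION = ".,;:!?"


def names_from_context(text: str) -> list[str]:
    """After each trigger phrase, collect the capitalised words that follow."""
    names = []
    for match in TRIGGER_RE.finditer(text):
        words = text[match.end():].split()
        if words and words[0].rstrip(".").lower() in TITLES:
            words = words[1:]  # skip an honorific such as "Dr."
        picked = []
        for word in words[:WINDOW]:
            cleaned = word.strip(PUNCTUATION)
            if not cleaned or not cleaned[0].isupper():
                break  # a lower-case word ends the probable name
            picked.append(cleaned)
            if word != cleaned:
                break  # punctuation attached to the word ends the name
        if picked:
            names.append(" ".join(picked))
    return names
```

Applied to the three examples above, this sketch yields "Jean-Baptiste Dubois", "Sophie Müller", and "Chen" respectively, none of which depend on the name appearing in a dictionary.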


Tier 2 is available on the Sovereign tier. A small language model (SLM) runs locally, with no network call, and performs semantic Named Entity Recognition (NER) over the full prompt. This catches entities that all previous tiers missed: unusual names, contextual PII, ambiguous references.

The SLM runs inside a sandboxed sidecar process. If it fails or times out (> 5 seconds), Ki! blocks the prompt (fail-closed) rather than sending unmasked text.
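The fail-closed timeout can be illustrated with a small wrapper. This is a sketch under stated assumptions: `PromptBlocked` and `run_ner_fail_closed` are hypothetical names, and a worker thread stands in for the sandboxed sidecar process, which this page does not describe in detail.

```python
import concurrent.futures


class PromptBlocked(Exception):
    """Raised when the prompt must not be sent."""


def run_ner_fail_closed(ner, prompt: str, timeout_s: float = 5.0):
    """Run ner(prompt); on any error or timeout, block instead of falling back."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(ner, prompt)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError as exc:
        raise PromptBlocked("NER timed out") from exc
    except Exception as exc:
        raise PromptBlocked("NER failed") from exc
    finally:
        # Do not wait for a stuck worker; the caller has already been blocked.
        pool.shutdown(wait=False)
```

The important property is that neither failure path returns the original text: the caller gets detected entities or an exception, never an unmasked prompt. A thread cannot be forcibly stopped, so a separate process is the more natural fit for a component that may hang.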


Every detected value is replaced with a deterministic token:

[TYPE_xxxxxxxx]
  • TYPE is the PII category (PERSON, EMAIL, IBAN, PHONE, SSN, ADDRESS, CREDENTIAL, CUSTOM, …)
  • xxxxxxxx is an 8-character hex hash derived from the original value — consistent within a session so the LLM can refer back to the same entity

The mapping between token and original value is stored in your local Vault (vault.db). It never leaves your machine.
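One way to produce a token of this shape is sketched below. The choice of HMAC-SHA256 with a random per-session key is an assumption made for the example; this page only states that the suffix is an 8-character hex hash derived from the original value and consistent within a session. The in-memory dictionary stands in for the Vault.

```python
import hashlib
import hmac
import secrets

# Hypothetical per-session key: the same value maps to the same token
# within a session, and to a different token in the next one.
SESSION_KEY = secrets.token_bytes(32)


def mask_token(pii_type: str, value: str, key: bytes = SESSION_KEY) -> str:
    """Build a [TYPE_xxxxxxxx] token from a keyed hash of the original value."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"[{pii_type}_{digest[:8]}]"


def mask_value(pii_type: str, value: str, vault: dict[str, str]) -> str:
    """Mask a value and record the token-to-original mapping locally."""
    token = mask_token(pii_type, value)
    vault[token] = value  # stand-in for the local Vault; never sent anywhere
    return token
```

Because the token is deterministic within a session, two mentions of the same email address in one prompt produce the same token, so the LLM can still tell that they refer to one entity.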


If any tier of the pipeline encounters an error, Ki! blocks the prompt entirely. There is no fallback to sending the unmasked original. The sidecar health check must pass before any prompt is processed.

This behaviour is testable: Ki! ships with a fail-closed verification test in the egress log — kill the sidecar process and observe that the next prompt is blocked, not sent.