Dossier · Internal docs

Dossier Engine

Schema-driven document assembly. The engine knows nothing about law, insurance, or tax — all domain knowledge lives in schemas, forms, and bindings.

The Insight

Every regulated industry has the same problem: structured forms that need to be filled with data that already exists somewhere. A bankruptcy attorney types the debtor's name into Form 101, then types the same name into Schedule A/B, Schedule D, Schedule E/F, the Statement of Financial Affairs, and every other form in the package. The data exists — it was entered once. The routing is what's missing.

Dossier solves this with three ideas that compose into a general-purpose document assembly engine. The engine itself knows nothing about law, bankruptcy, insurance, tax, or any specific domain. All domain knowledge is externalized into three JSON artifacts: schemas, forms (with bindings), and data sources.

The result: populate a few schema keys from any source, and the binding engine cascades the values to every form that needs them — across 70+ forms in a bankruptcy package, or across any set of forms in any domain.

Shared Vocabulary (Schema)
Every data point in a domain gets a canonical name. debtor1.first_name is the debtor's first name, regardless of which form asks for it, regardless of which source provided it.
Bindings Route Data to Forms
Each form declares how its PDF fields map to schema keys. Enter debtor1.first_name once, and bindings carry it to every form that needs it — across 70+ forms in a bankruptcy package.
Data Sources Are Interchangeable
Manual entry, credit report XML, a case management API, or an LLM reading a pay stub all produce the same thing: schema key = value pairs. The binding engine doesn't care where the data came from.

How It Works

The Schema sits at the center. DataSources write to it (inbound), Bindings read from it (outbound). Every interaction — manual or automated — produces the same artifact: an Entry (a batch of schema key = value pairs with source tracking).

Fill Once, Populate Everywhere

The debtor's first name is entered once. The binding engine routes it to every form that needs it. This works for every data point — creditor names cascade to every schedule, addresses appear on every form that asks, social security numbers are masked where required. One schema key, many targets.

debtor1.first_name routes to 6+ form fields
debtor1.first_name
     Form 101, field "Debtor1.First name"
     Schedule A/B, field "Debtor 1 First name"
     Schedule D, field "Debtor 1 Name"
     Schedule E/F, field "Debtor 1 Name"
     Statement of Financial Affairs, field "Debtor 1"
     Declaration, field "Name of Debtor"
     ...every form in the package that needs it

Any Source, Same Result

All data sources produce the same thing: an Entry with schema key = value pairs. The binding engine doesn't distinguish between a lawyer typing a name and a credit report XML providing 80 creditor records.

| Source | What Happens | Resulting Entry |
|---|---|---|
| Manual entry | Lawyer types 5 fields, clicks save | 1 entry, 5 values, source="manual", auto-confirmed |
| Credit report | MISMO XML parsed by DataSource config | 1 entry, 80 values, source="credit-report", pending review |
| Case mgmt API | REST sync from Clio/MyCase | 1 entry, 12 values, source="case-mgmt", auto-confirmed |
| Document upload | LLM reads a pay stub, maps to schema keys | 1 entry, 5 values, source="pay-stub", pending review |
| Manual correction | Lawyer fixes a creditor name from the credit report | 1 entry, 1 value, overrides the credit report's value |
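Current state = merge all entries by priority and timestamp
A higher-priority source wins per key (a manual entry outranks an import); within the same priority, the latest entry wins. The case history shows every entry: what changed, where it came from, when. A correction doesn't erase the original; both entries stay in the timeline.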
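
The table above can be sketched as one Entry shape plus a default confirmation rule per source. This is a minimal sketch, not the engine's API: `makeEntry` and `AUTO_CONFIRMED` are hypothetical names, the field names follow the Entry table later in this document, and the timestamps (including the year) are illustrative.

```typescript
// Hypothetical sketch: every producer emits the same Entry shape.
type SchemaValue = string | number | boolean | null;

interface Entry {
  source: string;                       // e.g. "manual", "credit-report"
  timestamp: string;                    // ISO 8601 (assumed format)
  confirmed: boolean;                   // false = pending review
  values: Record<string, SchemaValue>;  // schema key = value pairs
}

// Assumption taken from the table above: manual and case-mgmt entries are
// auto-confirmed; everything else (credit-report, pay-stub) starts as pending.
const AUTO_CONFIRMED = new Set<string>(["manual", "case-mgmt"]);

function makeEntry(
  source: string,
  values: Record<string, SchemaValue>,
  timestamp: string,
): Entry {
  return { source, timestamp, confirmed: AUTO_CONFIRMED.has(source), values };
}

// A lawyer typing a name and a credit report import produce the same artifact.
const typed = makeEntry("manual", { "debtor1.first_name": "Nickie" }, "2025-03-20T14:32:00Z");
const imported = makeEntry(
  "credit-report",
  { "creditors.unsecured[0].name": "AMEX" },
  "2025-03-20T14:40:00Z",
);
```

Because both producers return the same type, anything downstream (merge, bindings, history) needs only one code path.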

Example Flows

Every path through the system follows the same pattern: something produces entries on a case, bindings route them to PDF fields.

A. Lawyer fills in debtor information and clicks save
Lawyer types 5 fields, saves → Entry (manual, 5 values, confirmed) → Bindings route each value to every form field
No DataSource needed. Source = "manual", auto-confirmed.
B. Credit report fills creditor schedules
MISMO XML → DataSource: credit-report (parse + classify + map) → Entry (credit-report, 80 values, pending) → Bindings → Schedule D rows, Schedule E/F rows, ...
Lawyer reviews and confirms before values flow to PDFs.
C. Case management sync fills debtor demographics
Clio REST API → DataSource: case-mgmt (field mapping) → Entry (case-mgmt, 12 values, confirmed) → Bindings → Form 101, Schedule A/B, Schedule I, ...
Same schema keys, same bindings, different source. If the lawyer already typed the name manually, the API value doesn't override.
D. Pay stub upload fills income
Pay stub PDF → DataSource: pay-stub (LLM extraction) → Entry (pay-stub, 5 values, pending) → Bindings → Schedule I income fields
First time: lawyer confirms the LLM mapping (saved as reusable template). Second time: auto-applied.
E. Lawyer corrects a value from the credit report
Fixes creditor name → Entry (manual, 1 value, confirmed) → Overrides credit report's value for creditors.secured[0].name
Credit report entry still exists in timeline — history shows what changed and why.

What's Been Built

The first domain is bankruptcy. The extraction pipeline has processed every federal form, generated bindings, and composed packages — proving the engine works end-to-end.

Schemas
bankruptcy.individual — ~1,100 keys covering debtor identity, income, expenses, assets, liabilities, creditors (secured, unsecured, priority), executory contracts, co-debtors, prior filings, and administrative data.

bankruptcy.nonindividual — ~640 keys covering entity information, officers, revenue, assets, liabilities, and corporate-specific data.

12 shared administrative keys (case.*, attorney.*).
Forms Processed
70 federal leaf forms — Every fillable PDF in the bankruptcy form set. Fields extracted, bindings generated, schema keys mapped.

Local forms from IL and GA — State-specific local bankruptcy forms processed with the same pipeline.

16 composite forms — Chapter packages (Ch.7 Individual, Ch.7 Non-Individual, Ch.13 Individual, etc.), Petition group, Schedules group, and other logical groupings.
What Bindings Encode
debtor1.first_name appears on 40+ forms under different field names. Creditor arrays map to repeating table rows on Schedule D, E/F, and the creditor matrix. The means test uses income from Schedule I and expenses from Schedule J. Summary totals aggregate values from individual schedules. Conditional forms are included or excluded based on case data.

The Extraction Pipeline

Building a new domain follows a repeatable 5-step pipeline:

1. Define the schema — Enumerate every data point in the domain. Give each a canonical key, type, label, and group.

2. Process the forms — Take each government/standard PDF, extract its AcroForm fields (key, type, page, position, rect).

3. Generate bindings — Map each PDF field to the appropriate schema key. This is where domain knowledge is captured: understanding that "Debtor1.First name" on Form 101 and "Debtor 1 Name" on Schedule D both mean debtor1.first_name.

4. Compose form packages — Group leaf forms into composites (Petition, Schedules, Chapter 7 Package). Write cross-child bindings that route data between sibling forms.

5. Configure data sources — Define how external data (credit reports, APIs, documents) maps to schema keys.

Engine Implementation

Expression Engine
Tokenizer, parser, AST, evaluator. 22 Excel-style functions (string, math, logical, date). AST caching for performance.
Binding Resolver
Condition evaluation, cycle detection, multi-target routing. Resolves entries to form field values.
PDF Filler
pdf-lib AcroField filling. Multi-form export as merged PDF or ZIP. Handles text, checkboxes, dropdowns.
DataSource Framework
Credit reports, CSV import, API sync, and document upload. Same expression engine for field mapping.

Why It Generalizes

The engine has no concept of "law" or "bankruptcy." It knows five abstract things:

Schemas: Vocabularies of typed data points
Forms: PDFs with extractable AcroForm fields
Bindings: Routes from schema keys to form fields
DataSources: Recipes for importing external data
Entries: Batches of key=value with source tracking

Swap the schema, forms, and bindings — you have a different domain. The engine, server, database, and API routes are unchanged.

What Changes Per Domain

| To target... | What you build | Code changes |
|---|---|---|
| Another law type (immigration, family, PI) | New schemas + forms + bindings + UI config | None |
| Tax preparation | New schemas + forms. Expression engine handles calculations. | None (maybe DataSource for tax tables) |
| Insurance claims | New schemas + forms + 1-2 tables for payouts/settlements | Minimal: new routes for claim financials |
| Real estate closings | New schemas + forms + 1-2 tables for escrow management | Minimal: new routes for escrow |
| Healthcare credentialing | New schemas + forms + 1 table for credential expiry | Minimal: new routes for re-credentialing |
| Government permits | New schemas + forms + 1-2 tables for inspection workflows | Minimal: new routes for inspections |

For any industry where structured forms need to be filled with data from a shared vocabulary, the engine works as-is. The only question is whether the domain needs concepts beyond the core model — and if so, it's 1-2 new tables, not a rewrite.

Cross-Domain Concept Mapping

The core concepts translate directly across industries:

| Dossier Concept | Law | Insurance | Real Estate | Tax | Healthcare | Government |
|---|---|---|---|---|---|---|
| Case | Bankruptcy case | Claim | Transaction | Return | Provider app | Permit app |
| Schema | Data vocabulary (1,100 keys) | Claim fields | Transaction fields | Tax data | Provider info | Application data |
| Form | Court forms | ACORD forms | Closing docs | IRS forms | Credentialing apps | Application forms |
| Filing | Court filing | Claim submission | County recording | IRS e-file | Board submission | Agency submission |
| Binding | Schema → form fields | Same | Same | Same | Same | Same |
| Validation | Means test, schedule totals | Coverage limits | Loan-to-value | Tax calculations | License expiry | Zoning compliance |
| Contact | Debtor, Attorney, Trustee | Claimant, Adjuster | Buyer, Seller, Agent | Taxpayer, CPA | Provider, Payer | Applicant, Inspector |
| DataSource | Credit report, CSV | Policy system | MLS, title search | W-2 import | NPDB, license DB | GIS, prior permits |

Next Verticals

Ranked by volume and Dossier fit:

| Vertical | Fit | Rationale |
|---|---|---|
| Immigration | 10/10 | 8-13M USCIS form receipts/year. ~100+ federal fillable PDFs. Same architecture: federal forms, fillable PDFs, schema → bindings → AcroFields. No local forms needed. |
| Family Law | 8/10 | 1.5-2M matters/year. Very form-dense (15-30 forms per contested case). State-by-state build, but schema overlaps with bankruptcy (asset/debt inventories, financial disclosures). |
| Eviction | 7/10 | 3.6M filings/year. Simple forms (3-7 per case) but massive volume. Good for bulk automation. |
| Probate | 7/10 | 2.6M filings/year. Schema overlaps with bankruptcy (asset/debt inventories, creditor lists). State variation is the main cost. |
| Workers' Compensation | 6/10 | 2.5M claims/year. IAIABC standards provide a natural schema. Attorney-side market is open. |

The 8 Names

| # | Name | What it is | Layer |
|---|---|---|---|
| 1 | Schema | Named collection of data point definitions for a domain. The shared vocabulary. | vocabulary |
| 2 | Form | Recursive. Leaf (has PDF + fields), composite (has children), or both. References a schema. | composition |
| 3 | Field | A fillable spot on a PDF: text box, checkbox, dropdown. | composition |
| 4 | Binding | Routes a schema key to form fields or $child targets. Outbound from the schema. | composition |
| 5 | Validation | Expression-based correctness rule. Required fields, format checks, cross-field logic. | composition |
| 6 | DataSource | Reusable recipe for getting data into the schema. Parse rules + field mapping. Inbound to the schema. | producer |
| 7 | File | One case: a client's forms filled with their data. References a top-level form. | runtime |
| 8 | Entry | A batch of data point changes on a file. One event (save, import, sync) = one entry with N values. | runtime |
Blueprint → merged into Form
Datasheet → merged into Schema
DatasheetEntry → Schema entry (no separate name)
file_values → Entry

Architecture

Four layers, one spine. The Schema sits at the center. DataSources write to it (inbound), Bindings read from it (outbound). Every interaction — manual or automated — produces Entries on a File.

Data flow: schema as spine (diagram)
Schema: data point vocabulary, the spine everything references. Examples: bankruptcy.individual (debtor1.first_name, creditors.secured[].name), bankruptcy.nonindividual (entity.legal_name, officer[].name).
DataSources (inbound, write to schema keys): Manual Entry, Credit Report (MISMO XML), Case Management API, Document Upload (LLM).
Forms + Bindings (outbound, read schema keys, fill PDFs): a Form is a leaf (PDF) or a composite (children), with children[], bindings[], fields[], validations[], schema_id, filePath?. Outputs: PDF Fill, ZIP Export.
File: one case, holding Entries (batch of values per event), Attachments, and Status.
Edges: DataSources map to schema keys and produce Entries; Bindings reference schema keys, read Entries, and fill forms.

Entities

The domain splits into two sides: Producers create entry values, Consumers use them. The schema key = value pair is the central currency.

Producers: things that create entry values (user input, credit reports, APIs, document uploads)
→ schema key = value →
Consumers: things that read values (bindings, computed fields, validations, conditions)
Form (Recursive · Consumer)
A leaf form has a PDF with extracted AcroForm fields; it gets filled and printed. A composite form groups children (other forms) and routes data between them via bindings. A form can be both (has a PDF and children). References a schemaId. Has fields[], children[], bindings[], and validations[].

Schema (Vocabulary · The spine)
Named collection of data point definitions for a domain. Every key has a type, label, hint, and group. Forms bind to schema keys. DataSources map to schema keys. One schema per domain variant (e.g. bankruptcy.individual).

File (Runtime instance)
A filled instance of a form for a specific case (e.g. "Smith Ch7 Filing"). References a formId + version. Contains entries (batches of values) and tracks which DataSources produced them.

DataSource (Reusable recipe · Producer)
A reusable definition that knows how to turn external data into schema key values. Combines parse rules (how to read the source format) with field mapping (how to transform into schema keys). Lives in domains/{domain}/data-sources/. Same expression engine as bindings.

Entry (Batch of values · Runtime)
A batch of data point changes on a file. One event (save, import, API sync) = one entry with N values. Has a source, timestamp, confirmed flag, and a values map of schema key = value pairs. A file is a timeline of entries.

Schema

The schema defines what data points exist for a domain. Every key has a type, label, hint, and group. Forms bind to schema keys. DataSources map to schema keys. One schema per domain variant.

bankruptcy.individual
  debtor1.first_name
  debtor1.ssn
  creditors.secured[].name
  creditors.secured[].claim_amount
  income.d1.gross_wages
  sofa.q1.prior_addresses[]
  case.district
  ...1,100 keys

bankruptcy.nonindividual
  entity.legal_name
  entity.ein
  officer[].name
  officer[].title
  creditors.secured[].name
  revenue.gross_year1
  case.district
  ...640 keys

insurance.auto.claim
  claimant.full_name
  vehicle.vin
  incident.date
  damage[].part
  damage[].cost
  future domain...

Schema Entry

One data point definition in a schema. The schema entry is the human-facing side — simpler than the raw form fields.

| Property | Type | Description |
|---|---|---|
| key | string | Dotted key: debtor.first_name, assets.total |
| type | text \| money \| date \| boolean \| enum \| number | Data type (not PDF widget type) |
| label | string | Question label shown to user |
| hint | string? | Plain-language explanation |
| expression | string? | If present, this is computed (not entered by user) |
| options | string[]? | Choices for enum type |
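
The property table above translates directly into a type. The sketch below is an assumption about how such a definition might look in code, not the project's actual type; the sample labels, the `assets.real` and `assets.personal` keys, and the helper names `isComputed` and `checkEntry` are made up for illustration.

```typescript
type DataType = "text" | "money" | "date" | "boolean" | "enum" | "number";

interface SchemaEntry {
  key: string;          // dotted key
  type: DataType;       // data type, not the PDF widget type
  label: string;        // question label shown to the user
  hint?: string;        // plain-language explanation
  expression?: string;  // if present, the value is computed
  options?: string[];   // choices for enum type
}

// Illustrative definitions only.
const firstName: SchemaEntry = { key: "debtor.first_name", type: "text", label: "First name" };
const chapter: SchemaEntry = { key: "case.chapter", type: "enum", label: "Chapter", options: ["7", "13"] };
const totalAssets: SchemaEntry = {
  key: "assets.total",
  type: "money",
  label: "Total assets",
  expression: "SUM(assets.real, assets.personal)", // hypothetical keys
};

// A key is user-entered unless it carries an expression.
function isComputed(entry: SchemaEntry): boolean {
  return entry.expression !== undefined;
}

// Structural checks: enum entries need options, other types must not carry them.
function checkEntry(entry: SchemaEntry): string[] {
  const problems: string[] = [];
  if (entry.type === "enum" && (entry.options === undefined || entry.options.length === 0)) {
    problems.push(`${entry.key}: enum type requires options`);
  }
  if (entry.type !== "enum" && entry.options !== undefined) {
    problems.push(`${entry.key}: options only apply to enum type`);
  }
  return problems;
}
```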

Form

One entity replaces the old Form + Blueprint split. A form can be a leaf (has a PDF with fields), a composite (has children), or both. Recursive — composites can contain composites.

Leaf form: has a PDF file, extracted fields, and bindings. Gets filled and printed. Example: Form 101, Voluntary Petition (155 fields, 67 bindings).
Composite form: has children (other forms) and bindings that route data between them. No PDF of its own. Example: The Schedules (11 children, 17 bindings).
Top-level form: the complete filing package. Children include composites and leaf forms. Cross-child bindings. Example: Chapter 7 Individual (15 children, 8 bindings, 335 schema keys).

Form entity

A PDF with extracted fields, children, and bindings. All optional — a leaf has fields + filePath, a composite has children, a top-level form has a schemaId.

| Property | Type | Description |
|---|---|---|
| id | string | Deterministic UUID (see ID convention) |
| tenantId | string \| null | Null = platform-level |
| number | string \| null | Official form number: "101", "106E/F" |
| name | string | Full title |
| effectiveDate | string \| null | When this version took effect: "2024-06-22" |
| pages | number | Page count of the PDF (0 for composites without a PDF) |
| schemaId | string \| null | Schema this form binds to (typically set on the top-level form) |
| filePath | string \| null | Path to the PDF file (null for pure composites) |
| children | ChildEntry[] | Child forms in this container (empty for leaf forms) |
| fields | FormField[] | Extracted AcroForm fields (empty for composites) |
| bindings | Binding[] | Bindings ($.field for self, $child.field for children) |
| validations | Validation[] | Form-level validation rules |

FormField on Form

A single field extracted from a PDF's AcroForm dictionary.

| Property | Type | Description |
|---|---|---|
| key | string | Field name from the PDF (e.g. debtor_name) |
| type | text \| checkbox \| dropdown \| radio \| signature \| optionList | AcroForm widget type |
| page | number | 1-indexed page number |
| rect | { x, y, w, h } \| null | Position in PDF points, used for UX highlighting |
| options | string[] \| null | Choices for dropdown/radio/optionList |
| label | string \| null | Human-readable label |
| hint | string \| null | Help text explaining what to fill |
| needsReview | boolean | True if the key is generic and needs human cleanup |
| source | acro \| llm \| manual | How the field was discovered |

ChildEntry on Form

A reference to a child form within a composite form.

| Property | Type | Description |
|---|---|---|
| id | string | UUID of the child form |
| key | string | Alias used in expressions: $key.field |
| position | number | Print order |
| condition | string? | Include/exclude expression: case.has_preparer == true |
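
Since forms are recursive and each child carries a position and an optional condition, assembling a package amounts to a depth-first walk. The sketch below illustrates that idea under stated assumptions: `FormNode` is a trimmed-down stand-in for the Form entity, `evalCondition` stands in for the expression engine, a form's own PDF is assumed to print before its children, and the short IDs and file paths are made up.

```typescript
interface ChildEntry {
  id: string;          // ID of the child form
  key: string;         // alias used in expressions: $key.field
  position: number;    // print order
  condition?: string;  // include/exclude expression
}

interface FormNode {
  id: string;
  filePath: string | null;  // null for pure composites
  children: ChildEntry[];   // empty for leaf forms
}

// Return the IDs of forms that have a PDF, in print order.
function printOrder(
  rootId: string,
  forms: Map<string, FormNode>,
  evalCondition: (expr: string) => boolean,
): string[] {
  const form = forms.get(rootId);
  if (form === undefined) throw new Error(`Unknown form: ${rootId}`);
  const out: string[] = [];
  if (form.filePath !== null) out.push(form.id); // a form can be both leaf and composite
  const ordered = [...form.children].sort((a, b) => a.position - b.position);
  for (const child of ordered) {
    // Children without a condition are always included.
    if (child.condition !== undefined && !evalCondition(child.condition)) continue;
    out.push(...printOrder(child.id, forms, evalCondition));
  }
  return out;
}

// Illustrative tree: a top-level composite with two composites and a conditional leaf.
const formTree = new Map<string, FormNode>([
  ["ch7", { id: "ch7", filePath: null, children: [
    { id: "schedules", key: "schedules", position: 2 },
    { id: "petition", key: "petition", position: 1 },
    { id: "preparer", key: "preparer", position: 3, condition: "case.has_preparer == true" },
  ] }],
  ["petition", { id: "petition", filePath: null, children: [{ id: "b101", key: "b101", position: 1 }] }],
  ["schedules", { id: "schedules", filePath: null, children: [
    { id: "b106ab", key: "b106ab", position: 1 },
    { id: "b106d", key: "b106d", position: 2 },
  ] }],
  ["b101", { id: "b101", filePath: "b101.pdf", children: [] }],
  ["b106ab", { id: "b106ab", filePath: "b106ab.pdf", children: [] }],
  ["b106d", { id: "b106d", filePath: "b106d.pdf", children: [] }],
  ["preparer", { id: "preparer", filePath: "preparer.pdf", children: [] }],
]);
```

Calling `printOrder("ch7", formTree, () => false)` skips the conditional child and yields the petition form followed by the two schedules.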

Binding

Connects a source expression to one or more target field addresses. The core data-wiring mechanism. Array syntax with [] supports repeating groups.

| Property | Type | Description |
|---|---|---|
| source | string | Expression: debtor.first_name, $b106ab.line55 + $b106ef.total |
| targets | string[] | Target addresses: ["$b106sum.1a", "$b106ab.line63"] |
| condition | string? | Boolean expression; the binding only applies when true |
| note | string? | Documentation for the binding author |

Cross-form sync

A composite form's binding can route one schema key to multiple child forms. The debtor's name entered once appears on every form that needs it.

Binding on the Ch7 Individual top-level form
{
  "source": "debtor1.first_name",
  "targets": [
    "$b101.Debtor1.First name",
    "$b106ab.Debtor 1 First name",
    "$b106d.Debtor 1 Name",
    "$b106ef.Debtor 1 Name",
    "$b107.Debtor 1 Name"
  ]
}
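
To make the routing concrete, here is a minimal sketch of how a binding like the one above could be fanned out to per-form field values. It handles only plain schema-key sources (no expressions or conditions), and it assumes the child alias ends at the first dot of a target address, so the rest of the address is the PDF field name even when that name contains dots or spaces. `parseTarget` and `route` are hypothetical names.

```typescript
interface Binding {
  source: string;     // schema key (plain keys only in this sketch)
  targets: string[];  // "$child.field" addresses
  condition?: string;
  note?: string;
}

// Split "$b101.Debtor1.First name" into { child: "b101", field: "Debtor1.First name" }.
// "$.line55" yields an empty child alias, meaning the form's own field.
function parseTarget(target: string): { child: string; field: string } {
  if (!target.startsWith("$")) throw new Error(`Not a form field address: ${target}`);
  const dot = target.indexOf(".");
  if (dot < 0) throw new Error(`Missing field name: ${target}`);
  return { child: target.slice(1, dot), field: target.slice(dot + 1) };
}

// child alias -> PDF field name -> value
type FieldValues = Record<string, Record<string, string>>;

function route(bindings: Binding[], state: Record<string, string>): FieldValues {
  const out: FieldValues = {};
  for (const binding of bindings) {
    const value = state[binding.source];
    if (value === undefined) continue; // nothing entered yet for this key
    for (const target of binding.targets) {
      const { child, field } = parseTarget(target);
      let fields = out[child];
      if (fields === undefined) {
        fields = {};
        out[child] = fields;
      }
      fields[field] = value;
    }
  }
  return out;
}

const nameBinding: Binding = {
  source: "debtor1.first_name",
  targets: [
    "$b101.Debtor1.First name",
    "$b106ab.Debtor 1 First name",
    "$b106d.Debtor 1 Name",
    "$b106ef.Debtor 1 Name",
    "$b107.Debtor 1 Name",
  ],
};
const filled = route([nameBinding], { "debtor1.first_name": "Nickie" });
```

One value in, five form fields out: `filled` holds an entry for each of the five child aliases.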

Validation on Form

| Property | Type | Description |
|---|---|---|
| expression | string | Boolean expression that should evaluate to true |
| severity | error \| warning | Errors block export, warnings inform |
| description | string | Human message shown when validation fails |
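
The severity rule above (errors block export, warnings inform) can be sketched as a small report function. This is an illustration rather than the engine's implementation: `runValidations` and `ValidationReport` are hypothetical names, the `evaluate` callback stands in for the expression engine, and the sample rules are invented.

```typescript
type Severity = "error" | "warning";

interface Validation {
  expression: string;   // should evaluate to true
  severity: Severity;
  description: string;  // shown when the rule fails
}

interface ValidationReport {
  errors: string[];
  warnings: string[];
  canExport: boolean;   // false as soon as one error-level rule fails
}

function runValidations(
  rules: Validation[],
  evaluate: (expression: string) => boolean,
): ValidationReport {
  const errors: string[] = [];
  const warnings: string[] = [];
  for (const rule of rules) {
    if (evaluate(rule.expression)) continue; // rule holds, nothing to report
    (rule.severity === "error" ? errors : warnings).push(rule.description);
  }
  return { errors, warnings, canExport: errors.length === 0 };
}

// Invented rules for illustration.
const sampleRules: Validation[] = [
  { expression: "LEN(debtor1.ssn) == 9", severity: "error", description: "SSN must have 9 digits" },
  { expression: "income.d1.gross_wages > 0", severity: "warning", description: "No wages entered" },
];

// Stub evaluator: pretend the SSN rule passes and the wages rule fails.
const stubResults: Record<string, boolean> = {
  "LEN(debtor1.ssn) == 9": true,
  "income.d1.gross_wages > 0": false,
};
const report = runValidations(sampleRules, (expr) => stubResults[expr] === true);
```

Here the failed rule is only a warning, so `report.canExport` stays true while `report.warnings` carries the message.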

DataSource

A reusable recipe that turns external data into schema key values. Combines parse rules (how to read the source format) with field mapping (how to transform into schema keys). Scoped to a schema, not to individual forms.

DataSource entity

| Property | Type | Description |
|---|---|---|
| id | string | Unique identifier |
| name | string | Human-readable name: "Credit Report (Bankruptcy)" |
| type | api \| document \| import \| manual | How data is acquired |
| schemaId | string | Which schema this DataSource maps to |
| config | object | Parse rules, field mapping, execution config (API auth, XML paths, LLM prompts) |

Inbound vs Outbound

DataSource (inbound)
Direction: external data → schema key
Author: integration developer or platform
Contains: parse rules, field mapping, execution config
Lifecycle: written once, used by every case
Example: cms.contact.first_name → debtor1.first_name
Binding (outbound)
Direction: schema key → PDF form field
Author: form builder (platform)
Contains: source key, target field(s), condition
Lifecycle: defined on a form, static
Example: debtor1.first_name → $b101.Debtor1.First name

The schema is the clean boundary. DataSources don't know about PDF fields. Bindings don't know about APIs. Both use the same key vocabulary — build the key picker once, use it everywhere.

Input Types

| Type | Parse Step | Examples |
|---|---|---|
| api | Deterministic parser (XML→JSON, JSON→JSON) | Credit report (MISMO XML), Clio, PACER |
| document | LLM extraction, then optionally saved as reusable template | Pay stub, tax return, bank statement |
| import | Column/field mapping | CSV, JSON file upload |
| manual | None (direct user input) | Schema form entry |

Data Flow

Example: Credit Report → Bankruptcy Schema Keys
Source: MISMO XML (raw credit report)
Parser: xml → json parse step
Mapper: DataSource config (filter + classify + map)
Output: Entry values, e.g. debtor1.full_name = "NICKIE GREEN", creditors.unsecured[0].name = "AMEX"

DataSource Structure

domains/bankruptcy/data-sources/credit-report.json
{
  "id": "credit-report-bankruptcy",
  "name": "Credit Report (Bankruptcy)",
  "type": "api",
  "schemaId": "bankruptcy.individual",

  "singleton": {                            // one-to-one mappings
    "debtor1.full_name": "CONCAT(borrowers[0].firstName, ' ', borrowers[0].lastName)",
    "debtor1.ssn": "borrowers[0].ssn"
  },

  "collections": {                          // one-to-many mappings
    "creditors": {
      "source": "liabilities",
      "filter": "AND(IN(status, 'Open'), balance > 0)",
      "classify": {
        "secured": { "rule": "IN(loanType, 'Secured', 'Mortgage')", "targetPrefix": "creditors.secured[]" },
        "unsecured": { "rule": "DEFAULT", "targetPrefix": "creditors.unsecured[]" }
      },
      "fields": {
        "name": "creditorName",
        "account_number": "RIGHT(accountNumber, 4)",
        "claim_amount": "balance"
      }
    }
  }
}
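
The collections block in the config above boils down to filter, classify, then map. The sketch below walks through those three steps with hand-written TypeScript equivalents of the config's expressions; the real engine would evaluate the expression strings instead. The `Liability` shape, the sample records, and the function names are assumptions, and the output keys follow the schema's creditors.secured[] and creditors.unsecured[] naming.

```typescript
interface Liability {
  creditorName: string;
  accountNumber: string;
  loanType: string;
  status: string;
  balance: number;
}

type Values = Record<string, string | number>;

// Equivalent of: AND(IN(status, 'Open'), balance > 0)
const keepLiability = (l: Liability): boolean => l.status === "Open" && l.balance > 0;

// Equivalent of the classify rules: secured if loanType matches, otherwise DEFAULT.
const classifyLiability = (l: Liability): "secured" | "unsecured" =>
  ["Secured", "Mortgage"].includes(l.loanType) ? "secured" : "unsecured";

function mapCreditors(liabilities: Liability[]): Values {
  const out: Values = {};
  const nextIndex = { secured: 0, unsecured: 0 }; // each class fills its own array
  for (const l of liabilities.filter(keepLiability)) {
    const kind = classifyLiability(l);
    const prefix = `creditors.${kind}[${nextIndex[kind]}]`;
    nextIndex[kind] += 1;
    out[`${prefix}.name`] = l.creditorName;
    out[`${prefix}.account_number`] = l.accountNumber.slice(-4); // RIGHT(accountNumber, 4)
    out[`${prefix}.claim_amount`] = l.balance;
  }
  return out;
}

// Made-up parsed records (account numbers are placeholders).
const parsedLiabilities: Liability[] = [
  { creditorName: "CHASE BANK NA", accountNumber: "9900112233", loanType: "Mortgage", status: "Open", balance: 180000 },
  { creditorName: "AMEX", accountNumber: "371234567", loanType: "Revolving", status: "Open", balance: 12450 },
  { creditorName: "OLD STORE CARD", accountNumber: "555000", loanType: "Revolving", status: "Closed", balance: 0 },
];
const creditorValues = mapCreditors(parsedLiabilities);
```

The closed account is filtered out, so `creditorValues` contains six keys: three for `creditors.secured[0]` and three for `creditors.unsecured[0]`.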
Document templates
For document DataSources, the first upload uses an LLM to extract fields. The user reviews and confirms the mapping. The extraction rules are then saved as a reusable template; the next time the same document type is uploaded, the DataSource skips the LLM and applies the saved template directly.

File & Entry

Everything that produces data on a file — typing, credit reports, API syncs — produces the same thing: an Entry. One event = one entry containing a batch of data point changes. A file is a timeline of entries.

File entity

| Property | Type | Description |
|---|---|---|
| id | string | UUID |
| tenantId | string | Owning tenant |
| formId | string | Top-level form this file is an instance of |
| formVersion | number | Frozen snapshot version |
| name | string | Case name: "Smith Ch7 Filing" |
| status | string | draft \| in-review \| ready \| filed \| closed |

Entry (batch of values)

| Property | Type | Description |
|---|---|---|
| source | string | What produced this entry: "manual", "credit-report", "case-mgmt" |
| timestamp | string | When this entry was created |
| confirmed | boolean | False = pending review (e.g. credit report auto-import) |
| values | Record<string, any> | Map of schema key = value pairs |
An entry has:
Source: credit-report
Timestamp: Mar 20 14:40
Confirmed: pending
Values: 80 schema keys
  creditors.secured[0].name = "CHASE BANK NA"
  creditors.secured[0].claim_amount = 180000
  creditors.unsecured[0].name = "AMEX"
  creditors.unsecured[0].claim_amount = 12450
  ...76 more values

File timeline

Nickie Green's Chapter 7 case. Three entries: lawyer types basics, credit report is pulled, lawyer corrects a name.

Entry 1 · Manual save · Mar 20 14:32 · ✓ confirmed · 5 values
  debtor1.first_name = Nickie
  debtor1.last_name = Green
  case.district = Northern District of Illinois
  case.chapter = 7
  debtor1.street = 123 Main St
Entry 2 · Credit Report (Experian via Xactus) · Mar 20 14:40 · ? pending · 80 values total
  creditors.secured[0].name = CHASE BANK NA
  creditors.secured[0].claim_amount = 180,000
  creditors.unsecured[0].name = AMEX
  creditors.unsecured[0].claim_amount = 12,450
  ...
Entry 3 · Manual correction · Mar 20 15:10 · ✓ confirmed · 1 value (overrides Entry 2)
  creditors.secured[0].name = Chase Bank, N.A.

Three entries, not 86. The credit report is one entry with 80 values (pending confirmation). The lawyer's correction is a separate entry that overrides one value from the credit report. Current state = merge all entries by priority and timestamp, latest wins per key. PDF generation = snapshot of the current state.
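
The merge rule can be sketched as a fold over the timeline. The version below is an illustration under explicit assumptions: the only ranking this document states is that manual entries outrank imported ones, so `priorityOf` encodes just that; pending entries are excluded by default because values are reviewed before they flow to PDFs; and the timestamps (including the year) are illustrative.

```typescript
type Value = string | number | boolean | null;

interface Entry {
  source: string;
  timestamp: string;  // ISO 8601
  confirmed: boolean;
  values: Record<string, Value>;
}

// Assumed ranking: manual outranks every other source.
function priorityOf(source: string): number {
  return source === "manual" ? 2 : 1;
}

// True when the challenger should replace the current holder of a key.
function beats(challenger: Entry, holder: Entry): boolean {
  const pc = priorityOf(challenger.source);
  const ph = priorityOf(holder.source);
  if (pc !== ph) return pc > ph;
  return Date.parse(challenger.timestamp) >= Date.parse(holder.timestamp);
}

function currentState(entries: Entry[], includePending = false): Record<string, Value> {
  const winners = new Map<string, { entry: Entry; value: Value }>();
  for (const entry of entries) {
    if (!entry.confirmed && !includePending) continue;
    for (const key of Object.keys(entry.values)) {
      const held = winners.get(key);
      if (held === undefined || beats(entry, held.entry)) {
        winners.set(key, { entry, value: entry.values[key] ?? null });
      }
    }
  }
  const state: Record<string, Value> = {};
  winners.forEach((winner, key) => { state[key] = winner.value; });
  return state;
}

// The three-entry timeline from the example above, abbreviated.
const timeline: Entry[] = [
  { source: "manual", timestamp: "2025-03-20T14:32:00Z", confirmed: true,
    values: { "debtor1.first_name": "Nickie", "debtor1.last_name": "Green" } },
  { source: "credit-report", timestamp: "2025-03-20T14:40:00Z", confirmed: false,
    values: { "creditors.secured[0].name": "CHASE BANK NA", "creditors.unsecured[0].name": "AMEX" } },
  { source: "manual", timestamp: "2025-03-20T15:10:00Z", confirmed: true,
    values: { "creditors.secured[0].name": "Chase Bank, N.A." } },
];
```

With `includePending` set, the corrected creditor name wins over the credit report's value while the untouched `AMEX` value still comes from the import; nothing is deleted from `timeline` itself, which is what keeps the history intact.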

Composition Graph

A composite form contains children — other forms (leaf or composite). Each child has a key (alias used in expressions) and a position (print order). Children can have a condition that controls inclusion.

Example: Chapter 7 Non-Individual Filing
Top-level form (composite): Ch7 Non-Individual, F0000005-BA01-...-C407...0002
Children: Petition (key: petition), Schedules (key: schedules), SOFA (B207) (key: b207), Top 20 (key: top20), B201 (key: b201), Declaration (key: b202)
Nested forms shown: Sum, A/B, D, E/F
Legend: Form (composite), Form (leaf)
Schema (referenced by top-level form): debtor.name, case.district, assets.has_real_property, revenue.gross_year1_amount, ...
Key rule
Bindings only flow downward. A composite form can reference any descendant form's fields via $key.field. Never up, never sideways. Cross-tree bindings live on the common ancestor.

Expression Syntax

All source, condition, and computed expression fields are parsed by the expression engine in packages/core/src/expressions/.

References

| Pattern | Meaning | Example |
|---|---|---|
| $child.field | A child form's field | $b106ab.line55 |
| $.field | Current form's own field (self-ref) | $.line55 |
| group.key | Schema key (dotted notation) | debtor.first_name |
| '...' | String literal | 'installments' |
| bare word | Literal or function name | true, 42, CONCAT |

Disambiguation Rule

How the parser decides
Starts with $ → form field reference
Has a . but no $ → schema key
No . and no $ → literal value or function name
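
The three rules above fit in a few lines. The sketch below is an illustration, not the parser in packages/core: `classifyToken` is a hypothetical name, and it assumes that quoted strings and numeric literals are recognized before the rules run, since a number such as 1234.57 contains a dot and would otherwise look like a schema key.

```typescript
type TokenKind = "formField" | "schemaKey" | "literalOrFunction";

function classifyToken(token: string): TokenKind {
  // Assumption: quoted strings and numbers are literals regardless of dots.
  const isQuoted = /^'.*'$/.test(token) || /^".*"$/.test(token);
  const isNumber = /^\d+(\.\d+)?$/.test(token);
  if (isQuoted || isNumber) return "literalOrFunction";

  if (token.startsWith("$")) return "formField";  // starts with $
  if (token.includes(".")) return "schemaKey";    // has a . but no $
  return "literalOrFunction";                     // no . and no $
}
```

For example, `classifyToken("$b106ab.line55")` and `classifyToken("$.line55")` are both form field references, `classifyToken("debtor.first_name")` is a schema key, and `classifyToken("CONCAT")` is a literal or function name.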

Operators

+  -  *  /          // arithmetic (+ also concatenates strings)
== != >  <  >= <=  // comparison

Functions

22 built-in functions following Excel/Google Sheets naming conventions. Used in bindings, DataSource mappings, validations, and computed fields.

| Category | Function | Example |
|---|---|---|
| String | CONCAT(a, b, ...) | CONCAT(first, ' ', last) |
| String | UPPER(s) | UPPER(name) → "JOHN" |
| String | LOWER(s) | LOWER(name) → "john" |
| String | TRIM(s) | TRIM(name) → strip whitespace |
| String | LEFT(s, n) | LEFT(ssn, 3) → first 3 chars |
| String | RIGHT(s, n) | RIGHT(acct, 4) → last 4 chars |
| String | LEN(s) | LEN(name) → string length |
| String | SUBSTITUTE(s, old, new) | SUBSTITUTE(phone, "-", "") |
| Math | SUM(a, b, ...) | SUM(line1, line2) (skips nulls) |
| Math | MAX(a, b) | MAX(income, 0) |
| Math | MIN(a, b) | MIN(balance, limit) |
| Math | COUNT(arr) | COUNT(creditors) → array length |
| Math | ROUND(n, decimals?) | ROUND(amount, 2) → 1234.57 |
| Math | ABS(n) | ABS(income - expenses) |
| Logical | IF(cond, then, else) | IF(joint, d2_name, "") |
| Logical | AND(a, b, ...) | AND(employed, income > 0) |
| Logical | OR(a, b, ...) | OR(has_car, has_house) |
| Logical | NOT(a) | NOT(exempt) |
| Logical | IN(val, a, b, ...) | IN(type, "Secured", "Mortgage") |
| Date | TODAY() | TODAY() → MM/DD/YYYY |
| Date | YEAR(d) | YEAR(opened) → 2020 |
| Date | MONTH(d) | MONTH(opened) → 5 |
| Date | DAY(d) | DAY(filed) → 23 |
Date formats
Date functions accept YYYY-MM-DD, MM/DD/YYYY, and YYYY-MM. All functions return null for null input (null propagation).
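
As an illustration of the three accepted formats and of null propagation, the sketch below implements YEAR, MONTH, and DAY on top of a small parser. It is a sketch under assumptions rather than the engine's code: `parseDate` is a hypothetical helper, unrecognized strings are treated as null, and DAY on a YYYY-MM value (which has no day) is also assumed to return null.

```typescript
interface DateParts {
  year: number;
  month: number;
  day: number | null; // null when the input has no day (YYYY-MM)
}

function parseDate(input: string | null): DateParts | null {
  if (input === null) return null; // null propagation
  let m = /^(\d{4})-(\d{2})-(\d{2})$/.exec(input);      // YYYY-MM-DD
  if (m) return { year: Number(m[1]), month: Number(m[2]), day: Number(m[3]) };
  m = /^(\d{2})\/(\d{2})\/(\d{4})$/.exec(input);        // MM/DD/YYYY
  if (m) return { year: Number(m[3]), month: Number(m[1]), day: Number(m[2]) };
  m = /^(\d{4})-(\d{2})$/.exec(input);                  // YYYY-MM
  if (m) return { year: Number(m[1]), month: Number(m[2]), day: null };
  return null; // assumption: unrecognized formats behave like null
}

const YEAR = (d: string | null): number | null => parseDate(d)?.year ?? null;
const MONTH = (d: string | null): number | null => parseDate(d)?.month ?? null;
const DAY = (d: string | null): number | null => parseDate(d)?.day ?? null;
```

With these definitions, `YEAR("2020-05-23")`, `MONTH("05/23/2020")`, and `DAY("05/23/2020")` give 2020, 5, and 23, which matches the examples in the function table.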

AST

Expressions are parsed into an AST with 5 node types:

literal   → { value: "hello" | 42 | true }
reference → { key: "debtor.first_name" }         // schema key
formRef   → { child: "b106ab", field: "line55" } // $child.field
binary    → { op: "+", left: ..., right: ... }
call      → { name: "CONCAT", args: [...] }
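
The five node types map naturally onto a discriminated union with a recursive evaluator. The sketch below is a simplified illustration and not the code in packages/core: the `type` tag, the `Scope` shape, and the choice to implement only CONCAT and SUM are assumptions, as is treating a null operand as a null result for arithmetic and comparison operators.

```typescript
type Value = string | number | boolean | null;

type Expr =
  | { type: "literal"; value: Value }
  | { type: "reference"; key: string }                  // schema key
  | { type: "formRef"; child: string; field: string }   // $child.field
  | { type: "binary"; op: string; left: Expr; right: Expr }
  | { type: "call"; name: string; args: Expr[] };

interface Scope {
  schema: Record<string, Value>;                 // schema key -> value
  forms: Record<string, Record<string, Value>>;  // child alias -> field -> value
}

function applyBinary(op: string, l: Value, r: Value): Value {
  if (op === "==") return l === r;
  if (op === "!=") return l !== r;
  if (l === null || r === null) return null; // assumption: null propagates
  if (op === "+" && (typeof l === "string" || typeof r === "string")) {
    return String(l) + String(r); // + also concatenates strings
  }
  if (typeof l !== "number" || typeof r !== "number") {
    throw new Error(`Operator ${op} needs numeric operands`);
  }
  switch (op) {
    case "+": return l + r;
    case "-": return l - r;
    case "*": return l * r;
    case "/": return l / r;
    case ">": return l > r;
    case "<": return l < r;
    case ">=": return l >= r;
    case "<=": return l <= r;
    default: throw new Error(`Unknown operator: ${op}`);
  }
}

function applyCall(name: string, args: Value[]): Value {
  switch (name) {
    case "CONCAT": // null input gives null output
      return args.some((a) => a === null) ? null : args.map((a) => String(a)).join("");
    case "SUM": // skips nulls and other non-numbers
      return args.reduce<number>((total, a) => (typeof a === "number" ? total + a : total), 0);
    default:
      throw new Error(`Function not implemented in this sketch: ${name}`);
  }
}

function evaluate(node: Expr, scope: Scope): Value {
  switch (node.type) {
    case "literal": return node.value;
    case "reference": return scope.schema[node.key] ?? null;
    case "formRef": return scope.forms[node.child]?.[node.field] ?? null;
    case "binary": return applyBinary(node.op, evaluate(node.left, scope), evaluate(node.right, scope));
    case "call": return applyCall(node.name, node.args.map((arg) => evaluate(arg, scope)));
  }
}

// AST for the binding source "$b106ab.line55 + $b106ef.total".
const totalExpr: Expr = {
  type: "binary",
  op: "+",
  left: { type: "formRef", child: "b106ab", field: "line55" },
  right: { type: "formRef", child: "b106ef", field: "total" },
};
```

Evaluating `totalExpr` against a scope where the two referenced fields hold 100 and 50 returns 150. Parsing the string into this tree is the tokenizer's and parser's job and is left out here.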

Scoping Rules

Not every reference type is available in every context. This matrix defines what's legal where.

Context $child.field $.field schema key
Composite form binding source
Composite form binding target
Composite form condition
Child entry condition (siblings)
Leaf form computed field
Leaf form validation
Ancestry rule
A binding on a composite form can reference any form that is a descendant (child, grandchild, etc.) of that form. Never up, never sideways — only down. Cross-tree bindings (e.g. means test results → schedules summary) must live on the common ancestor form.
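
The ancestry rule lends itself to a static check when bindings are authored. The sketch below shows one way to do it, assuming that aliases are unique within a form's subtree and that any descendant is addressed directly by its alias; `descendantKeys` and `illegalTargets` are hypothetical helpers, and the short IDs are made up.

```typescript
interface TreeForm {
  id: string;
  children: { id: string; key: string }[];
}

// Collect the aliases of every descendant (children, grandchildren, ...).
function descendantKeys(
  rootId: string,
  forms: Map<string, TreeForm>,
  seen: Set<string> = new Set<string>(),
): Set<string> {
  const form = forms.get(rootId);
  if (form === undefined) return seen;
  for (const child of form.children) {
    if (seen.has(child.key)) continue; // already visited; also guards against cycles
    seen.add(child.key);
    descendantKeys(child.id, forms, seen);
  }
  return seen;
}

// Return the addresses that point up or sideways instead of down.
function illegalTargets(ownerId: string, addresses: string[], forms: Map<string, TreeForm>): string[] {
  const allowed = descendantKeys(ownerId, forms);
  return addresses.filter((address) => {
    const dot = address.indexOf(".");
    if (!address.startsWith("$") || dot < 0) return true; // not a form field address
    const alias = address.slice(1, dot);
    return alias !== "" && !allowed.has(alias); // "$.field" (self) is always legal
  });
}

const tree = new Map<string, TreeForm>([
  ["ch7", { id: "ch7", children: [{ id: "petition", key: "petition" }, { id: "schedules", key: "schedules" }] }],
  ["petition", { id: "petition", children: [{ id: "b101", key: "b101" }] }],
  ["schedules", { id: "schedules", children: [{ id: "b106ab", key: "b106ab" }, { id: "b106d", key: "b106d" }] }],
  ["b101", { id: "b101", children: [] }],
  ["b106ab", { id: "b106ab", children: [] }],
  ["b106d", { id: "b106d", children: [] }],
]);
```

A binding on the `schedules` composite that targets `$b101.x` is flagged because `b101` sits under a sibling; the same target is legal on `ch7`, the common ancestor.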

ID Convention

IDs are deterministic UUIDs that encode the entity type, domain, and a hex-leet name. Readable at a glance.

F0001010-BA01-4000-8000-000000000000
Group 1: F0001010 (type + number)
Group 2: BA01 (domain + version)
Group 3: 4000 (UUID v4)
Group 4: 8000 (constant)
Group 5: 000000000000 (hex name)

Type Prefixes

| Prefix | Meaning | Example |
|---|---|---|
| F | Form (leaf or composite) | F0001010-BA01-... (Form 101) |
| S | Schema | S0000001-BA01-... (bankruptcy.individual) |
Legacy prefixes
Older IDs may use B (master blueprint) and BB (nested blueprint). These map to composite forms under the merged model.

Group 5: Hex Name (12 chars, read right to left)

NNNNNNNNCCCT
│       │  └―― T: filer type (last char)
│       └――――― CCC: chapter hint (0C7=Ch7, CD3=Ch13, 000=shared)
└―――――――――――― NNNNNNNN: hex-leet name (zero-padded)
| Filer Type (T) | Meaning |
|---|---|
| 1 | Individual |
| 2 | Non-individual (corporation, LLC, partnership) |
| 0 | Form (no filer type) or shared |

Examples

F0000001-BA01-4000-8000-C40700000001   ← Ch7 Individual (top-level)
F0000005-BA01-4000-8000-C40700000002   ← Ch7 Non-Individual (top-level)
F0000101-BA01-4000-8000-BE7171090001  ← Petition (composite, Individual)
F0000206-BA01-4000-8000-5C4ED0000002  ← Schedules (composite, Non-Individual)
F0001010-BA01-4000-8000-000000000000  ← Form 101 (leaf, no filer suffix)
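
Because the layout is fixed, an ID can be decoded with a single regular expression. The sketch below is an illustrative parser and not part of the codebase; `parseId` and `ParsedId` are hypothetical names. It treats IDs as plain strings: the letter S used for schema IDs is not a hexadecimal digit, so S-prefixed IDs would not pass a strict UUID validator.

```typescript
interface ParsedId {
  prefix: string;         // "F" (form) or "S" (schema)
  number: string;         // remaining 7 characters of group 1
  domainVersion: string;  // group 2, e.g. "BA01"
  hexName: string;        // NNNNNNNN: first 8 characters of group 5
  chapterHint: string;    // CCC: next 3 characters
  filerType: "shared" | "individual" | "non-individual"; // T: last character
}

const ID_PATTERN =
  /^([A-Z])([0-9A-F]{7})-([0-9A-Z]{4})-4000-8000-([0-9A-F]{8})([0-9A-F]{3})([0-2])$/;

const FILER_TYPES = ["shared", "individual", "non-individual"] as const;

function parseId(id: string): ParsedId | null {
  const m = ID_PATTERN.exec(id);
  if (m === null) return null;
  const group = (i: number): string => m[i] ?? "";
  return {
    prefix: group(1),
    number: group(2),
    domainVersion: group(3),
    hexName: group(4),
    chapterHint: group(5),
    filerType: FILER_TYPES[Number(group(6))] ?? "shared",
  };
}
```

For instance, `parseId("F0000206-BA01-4000-8000-5C4ED0000002")` yields the hex name `5C4ED000` and the filer type `non-individual`, while an ID with the wrong shape returns null.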

Database Tables

PostgreSQL 17 with Drizzle ORM. Core tables store forms, files, and entries.

| Table | Purpose | Key Columns |
|---|---|---|
| forms | All forms (leaf + composite) | id, tenant_id, name, number, schema_id, file_path, fields, children, bindings, validations |
| schemas | Domain schemas | id, name, domain, entries (jsonb) |
| cases | Client matters (top-level) | id, tenant_id, schema_id, form_id, name, status, references (jsonb), dates (jsonb) |
| files | Filed documents within a case | id, case_id, form_id, form_version, name, status, filed_at, snapshot (jsonb) |
| entries | Batches of values per case | case_id, source, timestamp, confirmed, values (jsonb) |
| data_sources | Reusable import recipes | id, name, type, schema_id, config (jsonb) |
JSONB columns
fields, bindings, validations, children, and entries are stored as JSONB. This keeps the schema flat — no join tables for these. The structured types (FormField[], Binding[], etc.) live in code.

Example Flows

Every path through the system — manual or automated — follows the same pattern: something produces entries on a file, bindings route them to PDF fields.

A. Lawyer fills in debtor information and clicks save
Lawyer types 5 fields, saves → Entry (manual, 5 values) → Bindings → $b101.Debtor1.First name, $b106ab.Debtor 1, ...
No DataSource needed. One save = one entry with all changed values. Source = "manual", auto-confirmed. The binding engine routes each value to every form field that needs it.
B. Credit report fills creditor schedules
MISMO XML → DataSource: credit-report → Entry (credit-report, 80 values, pending) → Bindings → $b106d.Creditors Name, $b106ef.Creditors Name, ...
The DataSource parses MISMO XML and maps fields to schema keys. One import = one entry with 80 values. Pending confirmation — lawyer reviews before values flow to PDFs.
C. Case management sync fills debtor demographics
Case management REST API → DataSource: case-mgmt → Entry (case-mgmt, 12 values) → Bindings → $b101.Debtor1.First name, $b106ab.Debtor 1, ...
One sync = one entry. Same schema keys, same bindings, different source. If the lawyer already typed the name manually (higher-priority entry), the Clio values don't override. Both entries stay in the timeline.
D. Pay stub upload fills income
Pay stub PDF → DataSource: pay-stub (LLM) → Entry (pay-stub, 5 values, pending) → Bindings → $b106i.Amount 1 Debtor 1, ...
LLM extracts values, DataSource maps them to schema keys. One upload = one entry. First time: lawyer confirms the mapping (it becomes a reusable DataSource template). Second time: auto-applied.
E. Lawyer corrects a value from the credit report
Fixes creditor name, saves → Entry (manual, 1 value) → Overrides Entry B's value for creditors.secured[0].name
A new entry with one value. Higher priority (manual > credit-report) and newer timestamp. The credit report entry still exists in the timeline — the file history shows what changed and why.
F. Same data, different schema (future)
Same MISMO XML → DataSource: credit-report-auto → Entry (credit-report, N values) → Different bindings → Insurance form fields
A different DataSource maps the same credit report to a different schema (insurance.auto.claim). The DataSource is scoped to a schema, not to forms.