Intake — Many Front Doors, One Artifact
Internal architecture reference for the Dossier dev team. Companion to engine.md; this is the zoomed-in view of the "Any Source, Same Result" section.
The invariant: every path produces an Entry
Every way data enters a Dossier case — a lawyer typing in the Data tab, a client filling a portal form on their phone, an MISMO XML credit report, a pay stub drop, a Clio sync, a CSV import — terminates at the same write: one row in the entries table. An Entry has case_id, source, optional data_source_id, user_id (invoker), timestamp, confirmed, values (JSONB), context (JSONB).
The binding engine reads the merged set of entries on a case and routes schema keys to PDF fields. It does not know, and does not care, where an entry came from. Intake variety is absorbed at the schema boundary. This is what lets us plug in new front doors — a new DataSource config is usually enough, with no changes to the server, the client, or the forms.
| Path | source tag | Invoker | Review state |
|---|---|---|---|
| Manual entry by lawyer in client app | manual | authenticated user | auto-confirmed |
| Portal intake by client (full-page) | intake (+ data_source_id identifies slug) | public / portal user | pending review (confirmed=false) |
| Embedded widget on firm's own site | intake (same pipeline) | public user | pending review |
| Credit report XML upload | credit-report | user who uploaded | pending review |
| Doc drop (pay stub, bank statement) | doc-upload with subtype | user who uploaded | pending review (extraction pipeline scaffolded; mapping templates planned) |
| API sync (Clio) | clio | system | auto-confirmed per config |
| CSV import | csv | user | pending review |
Type reference: Entry in packages/core/src/api/types.ts (lines 312–325). The invoker field is populated by a server join and is read-only on the wire.
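As a rough illustration of the invariant and the table above, the Entry shape and its initial review state can be sketched as follows. This is a sketch, not the real definition: the id types, the `EntrySource` union, and the `initialConfirmed` helper are assumptions made for illustration, and the authoritative type is the one in packages/core/src/api/types.ts.

```typescript
// Source tags taken from the table above.
type EntrySource = 'manual' | 'intake' | 'credit-report' | 'doc-upload' | 'clio' | 'csv'

// Sketch of an Entry row. Field names follow the prose; id types are assumed.
interface EntrySketch {
  case_id: string
  source: EntrySource
  data_source_id?: string // optional, e.g. the intake slug
  user_id: string // the invoker
  timestamp: string // ISO 8601
  confirmed: boolean
  values: Record<string, unknown> // JSONB
  context: Record<string, unknown> // JSONB
}

// Hypothetical helper: the review state an Entry starts in, per the table.
// Manual entries are auto-confirmed, API syncs follow their DataSource config,
// and every other path waits for a reviewer.
function initialConfirmed(source: EntrySource, autoConfirmSync = false): boolean {
  if (source === 'manual') return true
  if (source === 'clio') return autoConfirmSync
  return false
}

const portalEntry: EntrySketch = {
  case_id: 'case-42',
  source: 'intake',
  data_source_id: 'intake-atlas',
  user_id: 'user-7',
  timestamp: '2024-01-01T00:00:00Z',
  confirmed: initialConfirmed('intake'),
  values: { 'debtor1.first_name': 'Ada' },
  context: {},
}
```

Keeping the review decision in one place mirrors the idea that intake variety stops at the Entry boundary: the binding engine only ever looks at `confirmed` and `values`.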
Path A — Manual entry
The shortest path. The Data tab in the client app is read-only by default (per UX rule); edits start with an explicit "New Entry" action, which opens a form over the schema sections and writes one Entry on save.
- Route: POST /api/cases/:id/entries (packages/server/src/routes/cases.ts)
- Source tag: manual
- Invoker: the signed-in user from the JWT
- Confirmed: true on write
An Entry produced here is immediately visible to the binding engine, so the PDF preview in the split-pane editor updates on save. Corrections to any prior Entry — whether it came from a credit report, a portal submission, or another lawyer — are just new manual entries that append to the case's entry timeline.
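A minimal sketch of what a client-side call for a manual entry might look like. Only the route comes from this document; the Bearer header, the `{ values }` body shape, the base URL, and the dotted creditor key are assumptions for illustration. The function builds the request without sending it.

```typescript
// Plain description of an HTTP request, so the sketch has no runtime dependencies.
interface ManualEntryRequest {
  method: 'POST'
  url: string
  headers: Record<string, string>
  body: string
}

// Build the request for POST /api/cases/:id/entries.
// Assumptions: Bearer JWT auth and a JSON body of the form { values }.
// The server is expected to set source='manual', confirmed=true, and the invoker.
function buildManualEntryRequest(
  baseUrl: string,
  caseId: string,
  jwt: string,
  values: Record<string, unknown>,
): ManualEntryRequest {
  return {
    method: 'POST',
    url: `${baseUrl}/api/cases/${encodeURIComponent(caseId)}/entries`,
    headers: { 'Content-Type': 'application/json', Authorization: `Bearer ${jwt}` },
    body: JSON.stringify({ values }),
  }
}

// A one-key correction to a creditor name that originally came from a credit report.
// The key format is hypothetical.
const correctionRequest = buildManualEntryRequest('https://dossier.example', 'case-42', '<jwt>', {
  'creditor.unsecured[0].name': 'Capital One, N.A.',
})
```

The correction carries only the key being fixed; the earlier Entry is left untouched and the new one is appended to the timeline.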
Path B — Portal + embeddable widget
This is the biggest intake surface and the one most likely to grow. The portal is a real React app on port 4105:
packages/portal/src/
├── intake/
│   ├── intake-shell.tsx      full-page intake experience
│   ├── intake-modes.ts       grid | spread | wizard | chat | voice
│   ├── landing.tsx
│   └── modes/
│       ├── grid.tsx
│       ├── spread.tsx
│       ├── wizard/
│       ├── chat/             conversational intake (active)
│       └── voice.tsx
├── pages/
│   ├── dashboard.tsx         invited-client dashboard
│   ├── login.tsx
│   └── complete.tsx
├── widget/
│   ├── widget-entry.tsx      embeddable entry point (Shadow DOM, CSS inlined)
│   ├── widget-bubble.tsx     floating chat-style bubble
│   └── widget-panel.tsx      side panel that hosts the intake mode
└── shared/
    ├── api.ts                calls /intake/:slug and /portal/:tenantSlug
    ├── intake-resolver.ts    resolves tenant token → IntakeContext
    ├── theme.ts              applies tenant branding (primary/accent colors)
    └── fallback-*.ts         offline defaults so the widget renders before fetch
Three surfaces
- Full-page intake (intake-shell.tsx): rendered at the tenant-branded portal URL. The client fills the whole intake in one session, optionally picks an intake mode from those the tenant allows (TenantBranding.allowedModes, with legacy allowChat/allowForm fallbacks; see packages/core/src/api/types.ts lines 34–40).
- Dashboard (pages/dashboard.tsx): for invited clients returning via an invite token. Shows progress, pending documents, pending questions.
- Embeddable widget (widget/*): firms drop a <script> onto their own website; the widget renders into a Shadow DOM (see cssText import in widget-entry.tsx) so the host page's CSS cannot bleed in. Three modes:
  - Bubble: floating chat-style launcher (widget-bubble.tsx)
  - Panel: side panel that hosts the chosen intake mode (widget-panel.tsx)
  - Full: the same intake-shell served inline
All three surfaces call the same backend and produce the same artifact.
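The mode selection mentioned for the full-page surface can be sketched as a small resolver. The field names `allowedModes`, `allowChat`, and `allowForm` come from this document; how the legacy flags map onto modes, and the default when nothing is configured, are assumptions made here for illustration.

```typescript
type IntakeMode = 'grid' | 'spread' | 'wizard' | 'chat' | 'voice'

// Only the branding fields relevant to mode selection.
interface TenantBrandingModes {
  allowedModes?: IntakeMode[]
  allowChat?: boolean // legacy flag
  allowForm?: boolean // legacy flag
}

// Prefer the explicit list; otherwise fall back to the legacy flags.
function resolveAllowedModes(branding: TenantBrandingModes): IntakeMode[] {
  if (branding.allowedModes && branding.allowedModes.length > 0) {
    return [...branding.allowedModes]
  }
  const modes: IntakeMode[] = []
  // Assumption: the legacy "form" flag corresponds to the wizard mode.
  if (branding.allowForm) modes.push('wizard')
  if (branding.allowChat) modes.push('chat')
  // Assumption: default to the wizard when the tenant configured nothing.
  return modes.length > 0 ? modes : ['wizard']
}
```

Whatever mode is chosen, the submission that comes out of it is the same, so this choice affects presentation only.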
Server routes
Defined in packages/server/src/routes/intake.ts and packages/server/src/routes/portal.ts. All three routes go through checkIntakeRate(ip) from packages/server/src/services/rate-limit.ts — public endpoints, untrusted callers.
- GET /portal/:tenantSlug[?intake=<slug>]: tenant bootstrap. Returns PortalBootstrap { tenant, intake | null } (see types.ts lines 56–72). Used by the portal SPA on first load to pick up branding + the default intake.
- GET /intake/:slug: fetch an intake config. Returns { name, slug, schemaId, config, prefill }. If a ?token= is attached and resolves to a live invite whose dataSourceId matches the slug, the server prefills the response with confirmed values from the target case (loadCaseValues(invite.caseId) in services/intake.ts).
- POST /intake/:slug: submit. Accepts application/json (values + context) or multipart/form-data (values + context + file parts). File parts are carried as IntakeUploadFile { fieldname, filename, mimeType, buffer } and written into storage/attachments/<caseId>/ with a UUID prefix. See readSubmissionBody in routes/intake.ts and writeAttachmentFiles in services/intake.ts.
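Because these routes are public, each one is gated by a per-IP rate check. The actual algorithm and limits behind checkIntakeRate are not described in this document; the following is a generic fixed-window limiter, shown only to illustrate the kind of check involved. The factory name, the limit, and the window size are all hypothetical.

```typescript
interface RateWindow {
  start: number // window start, in ms
  count: number // requests seen in this window
}

// Fixed-window limiter keyed by IP. The clock is injectable so it can be tested.
function createIntakeRateLimiter(
  limit: number,
  windowMs: number,
  now: () => number = Date.now,
): (ip: string) => boolean {
  const windows = new Map<string, RateWindow>()
  return function check(ip: string): boolean {
    const t = now()
    const current = windows.get(ip)
    // First request from this IP, or the previous window has elapsed: start fresh.
    if (current === undefined || t - current.start >= windowMs) {
      windows.set(ip, { start: t, count: 1 })
      return true
    }
    if (current.count >= limit) return false
    current.count += 1
    return true
  }
}
```

A route handler would call the returned function with the caller's IP and reject the request (for example with HTTP 429) when it returns false.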
IntakeConfig shape
IntakeConfig is the config payload on a DataSource row whose type = 'intake'. From packages/core/src/api/types.ts lines 235–256:
interface IntakeQuestion {
  key: string              // schema key this answer writes to
  prompt?: string
  hint?: string
  options?: Array<{ value: string; label: string }>
  condition?: string       // expression evaluated against current values
  kind?: 'field' | 'upload'
  uploadCategory?: string  // category for the attachments row
}
interface IntakeSection {
  id: string
  title: string
  description?: string
  condition?: string
  sections?: IntakeSection[]  // recursive
  questions?: IntakeQuestion[]
}
interface IntakeConfig { sections: IntakeSection[] }
Four published intake configs live under domains/bankruptcy/data-sources/:
- intake-atlas.json: Atlas firm's new-matter intake
- intake-debtstoppers.json: DebtStoppers branded intake
- intake-greenfield.json: Greenfield branded intake
- intake-individual-self.json: self-file individual chapter 7 intake
Each is a single JSON document with slug, tenantSlug, schemaNamespace, published, and a nested config.sections tree. These are the IP — the lawyer's expertise about what to ask, in what order, conditioned on what — externalized into data. Adding a new intake flow means adding a JSON, not writing code.
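To make "conditioned on what" concrete, here is a sketch of how an intake mode might walk a config and collect the questions that are currently visible. The condition expression language is not specified in this document, so the evaluator is pluggable and the default one (`simpleEquals`, supporting only `key == 'literal'`) is purely hypothetical. The flat dotted-key value map and the `debtor1.marital_status` / `debtor2.first_name` keys are likewise assumptions.

```typescript
// Trimmed copies of the shapes above, so the sketch is self-contained.
interface IntakeQuestion { key: string; prompt?: string; condition?: string; kind?: 'field' | 'upload' }
interface IntakeSection {
  id: string
  title: string
  condition?: string
  sections?: IntakeSection[]
  questions?: IntakeQuestion[]
}
interface IntakeConfig { sections: IntakeSection[] }

type Values = Record<string, unknown>
type ConditionEvaluator = (expr: string, values: Values) => boolean

// Hypothetical evaluator: understands only `some.key == 'literal'`.
const simpleEquals: ConditionEvaluator = (expr, values) => {
  const m = /^\s*([\w.]+)\s*==\s*'([^']*)'\s*$/.exec(expr)
  if (m === null) return false
  const key = m[1] ?? ''
  const literal = m[2] ?? ''
  return String(values[key]) === literal
}

// Depth-first walk: a hidden section hides everything beneath it.
function visibleQuestions(
  config: IntakeConfig,
  values: Values,
  evaluate: ConditionEvaluator = simpleEquals,
): IntakeQuestion[] {
  const out: IntakeQuestion[] = []
  const walk = (sections: IntakeSection[]): void => {
    for (const section of sections) {
      if (section.condition && !evaluate(section.condition, values)) continue
      for (const q of section.questions ?? []) {
        if (!q.condition || evaluate(q.condition, values)) out.push(q)
      }
      walk(section.sections ?? [])
    }
  }
  walk(config.sections)
  return out
}

const demoConfig: IntakeConfig = {
  sections: [{
    id: 'debtor',
    title: 'About you',
    questions: [{ key: 'debtor1.first_name' }, { key: 'debtor1.marital_status' }],
    sections: [{
      id: 'spouse',
      title: 'Your spouse',
      condition: "debtor1.marital_status == 'married'",
      questions: [{ key: 'debtor2.first_name' }],
    }],
  }],
}
```

With empty values the spouse section stays hidden; once `debtor1.marital_status` is `'married'` its question joins the list. Emitting a section's own questions before its subsections is a choice made in this sketch.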
Tokenized invites
An attorney sends a client an intake link of the form /intake/<slug>?token=<token>. Behaviors:
- GET with token: resolves the invite via findInviteByToken(token) (services/intake.ts lines 219–242), validates used_at and expires_at, then prefills the response with merged confirmed values from the target case. The client sees the questions with their existing answers already populated.
- POST with token: writes the new Entry against the invite's case_id (not a new case), marks the invite used_at = NOW(), writes an intake activity row. See submitInviteIntake in services/intake.ts lines 183–217.
Without a token, POST creates a brand-new case in the target tenant (submitPublicIntake, lines 138–176). The tenant is resolved via IntakeRecord.tenantId, falling back to a seeded dev tenant when the intake is platform-level. A cases row is INSERTed with status = 'Intake' and a name derived from debtor1.first_name + debtor1.last_name (see deriveCaseName); the Entry, attachments, and an activity row follow.
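The two decisions described above, whether an invite is still usable and what to call a brand-new case, can be sketched as pure functions. These are illustrations under assumptions, not the code in services/intake.ts: the `InviteSketch` shape, the camel-cased field names, the treatment of a missing expiry as "never expires", and the placeholder name are all hypothetical.

```typescript
interface InviteSketch {
  token: string
  caseId: string
  dataSourceId: string
  usedAt: Date | null
  expiresAt: Date | null // assumption: null means the invite does not expire
}

// An invite is live when it is unused, unexpired, and bound to the requested slug.
function isInviteLive(invite: InviteSketch, slug: string, now: Date): boolean {
  if (invite.usedAt !== null) return false
  if (invite.expiresAt !== null && invite.expiresAt.getTime() <= now.getTime()) return false
  return invite.dataSourceId === slug
}

// Derive a case name from the debtor's first and last name.
// Assumption: values is a flat map of dotted schema keys.
function deriveCaseNameSketch(values: Record<string, unknown>): string {
  const first = String(values['debtor1.first_name'] ?? '').trim()
  const last = String(values['debtor1.last_name'] ?? '').trim()
  const name = [first, last].filter((part) => part.length > 0).join(' ')
  return name.length > 0 ? name : 'New intake' // placeholder is hypothetical
}
```

With a live invite the submission lands on the existing case; otherwise the public path creates a case and names it from the submitted values.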
From submission to Entry
Client fills portal form (values) + uploads (files)
↓
POST /intake/:slug (multipart)
↓
readSubmissionBody → IntakeSubmission { values, context, files }
↓
submitPublicIntake | submitInviteIntake
├── writeAttachmentFiles → storage + attachments rows (category from question.uploadCategory)
├── INSERT INTO entries (case_id, source='intake', data_source_id=<slug>, confirmed=false, values, context)
└── INSERT INTO activity ('intake', 'Intake submitted via <slug>', {dataSourceId, attachments})
The Entry is confirmed=false. It doesn't flow through the binding engine to PDFs until a reviewer confirms it from the Data tab. Context captured: UTM params, referrer, user-agent, invite token.
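The last two steps of the flow above can be sketched as a pure function that plans the rows to insert, without touching a database. The column names for the entry follow the diagram; the activity row's field names (`type`, `message`, `meta`) and the use of filenames as attachment references are assumptions for illustration.

```typescript
interface IntakeUploadRef { fieldname: string; filename: string }

interface IntakeSubmissionSketch {
  values: Record<string, unknown>
  context: Record<string, unknown> // e.g. UTM params, referrer, user-agent, invite token
  files: IntakeUploadRef[]
}

interface PlannedIntakeWrites {
  entry: {
    case_id: string
    source: 'intake'
    data_source_id: string
    confirmed: false
    values: Record<string, unknown>
    context: Record<string, unknown>
  }
  activity: {
    type: 'intake'
    message: string
    meta: { dataSourceId: string; attachments: string[] }
  }
}

// Turn a parsed submission into the entry and activity rows to be inserted.
function planIntakeWrites(
  caseId: string,
  slug: string,
  submission: IntakeSubmissionSketch,
): PlannedIntakeWrites {
  return {
    entry: {
      case_id: caseId,
      source: 'intake',
      data_source_id: slug,
      confirmed: false, // always pending until a reviewer confirms it
      values: submission.values,
      context: submission.context,
    },
    activity: {
      type: 'intake',
      message: `Intake submitted via ${slug}`,
      meta: { dataSourceId: slug, attachments: submission.files.map((f) => f.filename) },
    },
  }
}
```

Typing `confirmed` as the literal `false` makes it impossible for this path to produce a pre-confirmed Entry by accident.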
AI chat in the portal
The portal's chat intake mode lives at packages/portal/src/intake/modes/chat/ (engine, UI, types). Recent commits 8c20180 ("feat(portal): richer chat header + AI avatars, align seed with branding") and 9d2ddd6 ("client uses server") indicate this surface is under active work. Architecturally it is just another intake mode: the chat engine collects values and passes them to the same submitIntake call that the grid and wizard modes use, so it produces the same Entry as every other path.
Path C — Document upload (credit report, pay stub, etc.)
Credit report (live)
The credit-report DataSource parses MISMO 2.3.1 XML (tri-bureau merge) and emits one Entry with roughly 80 values. The mapping is documented in domains/bankruptcy/data-sources/credit-report-mapping.md — every CREDIT_LIABILITY becomes a creditor, classified into creditor.secured[] or creditor.unsecured[] based on _AccountType and collateral.
- Source tag: credit-report
- Invoker: the user who uploaded
- Confirmed: false (the reviewer must confirm before values flow to Schedule D / Schedule E/F)
- Volume: ~80 schema keys per submission, mostly repeating creditor rows
What the credit report provides: creditor identity, balance, account number, account type, date opened, collateral description. What it does not provide (from credit-report-mapping.md): secured/unsecured/priority legal classification, contingent/unliquidated/disputed flags, collateral value, priority type, who-owes (debtor1/debtor2/joint) beyond Individual/Joint/AuthorizedUser. Those require attorney judgment and arrive as follow-on manual entries.
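The first-pass bucketing of liabilities can be illustrated with a small function. The real rule is the one documented in credit-report-mapping.md, which is not reproduced here; the rule below (collateral present, or an account type of 'Mortgage', means secured) is an assumed stand-in, and the `CreditLiabilitySketch` and `CreditorRow` shapes are hypothetical.

```typescript
// Simplified view of one parsed CREDIT_LIABILITY element.
interface CreditLiabilitySketch {
  creditorName: string
  accountType: string // value of _AccountType
  balance: number
  collateralDescription?: string
}

interface CreditorRow { name: string; balance: number; collateral?: string }

// Illustrative rule only: treat a liability as secured when it names collateral
// or is a mortgage; everything else goes to the unsecured bucket.
function bucketLiabilities(
  liabilities: CreditLiabilitySketch[],
): { secured: CreditorRow[]; unsecured: CreditorRow[] } {
  const secured: CreditorRow[] = []
  const unsecured: CreditorRow[] = []
  for (const liability of liabilities) {
    const collateral = liability.collateralDescription?.trim()
    const row: CreditorRow = { name: liability.creditorName, balance: liability.balance }
    if (collateral) {
      secured.push({ ...row, collateral })
    } else if (liability.accountType === 'Mortgage') {
      secured.push(row)
    } else {
      unsecured.push(row)
    }
  }
  return { secured, unsecured }
}
```

Whatever the actual rule, the output is a proposal: the resulting Entry is unconfirmed, and the legal classification still needs attorney review.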
Generic doc drop (scaffolded)
Pay stubs, bank statements, 1099s, leases. Current state:
- The attachment is uploaded through POST /api/cases/:id/attachments (multipart) and lands in the attachments table.
- An extraction step is planned but not implemented end-to-end: the LLM-based mapper that reads a pay stub and proposes an Entry with debtor1.income.wages, debtor1.employer, etc. exists as a design in engine.md (Example Flow D) and as a scaffolded path in our DataSource model. The code path from attachment to reviewable Entry for arbitrary documents is not yet wired in production.
- Design intent: the first time an attorney confirms an extraction, the field mapping is saved as a reusable template keyed on document shape. Subsequent uploads of the same document type auto-apply. This is also planned, not built.
The invariant holds regardless: whenever the extraction path completes, its output is a doc-upload Entry that gets reviewed like any other. The PDF never sees unconfirmed values.
Path D — API sync
One mapping spec lives in domains/bankruptcy/data-sources/, fully documented but not yet wired to a running external API in this codebase.
Clio — clio-api-mapping.md
Clio is a general practice-management platform, not a bankruptcy system. The mapping splits cleanly:
- CRM layer → app tables (fully specified): Matter → cases, Contact → contacts, Relationship → case_contacts, Task → tasks, CalendarEntry → events, Note → notes, Bill / Activity → billing_entries, Document → attachments.
- Bankruptcy schema → Clio custom fields (documented, firm-specific): Clio stores chapter/SSN/trustee etc. in per-firm custom fields. The per-firm mapping has to be configured by the user.
Clio does not carry creditor schedules, property, income/expense detail, means-test data, SOFA data, or plan data. It provides the case management shell; everything form-specific comes from another source (credit report, direct intake).
Status: documented mapping, not yet wired. There is no running Clio OAuth + sync loop in packages/server. The doc is the spec; hooking it up is a future task.
The mechanism
When wired, this uses the existing DataSource mechanism:
External payload → DataSource.config (field map) → schema keys → Entry (source='clio')
Same database write. Same review flow (or auto-confirmed per config). Same binding engine on the other side. Adding a new API source is a parser plus a DataSource JSON — the Entry shape doesn't change.
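The "field map → schema keys" step can be sketched as a small mapper. The shape of DataSource.config is not given in this document, so the `FieldMap` type (external dotted path to schema key), the sample payload, and the `case.chapter` key are assumptions for illustration only.

```typescript
// Hypothetical field map: external dotted path -> schema key.
type FieldMap = Record<string, string>

// Read a nested value by dotted path; returns undefined when any segment is missing.
function getPath(obj: unknown, path: string): unknown {
  let current: unknown = obj
  for (const part of path.split('.')) {
    if (current === null || typeof current !== 'object') return undefined
    current = (current as Record<string, unknown>)[part]
  }
  return current
}

// Apply the field map to an external payload, skipping fields the payload lacks.
function mapToEntryValues(payload: unknown, fieldMap: FieldMap): Record<string, unknown> {
  const values: Record<string, unknown> = {}
  for (const externalPath of Object.keys(fieldMap)) {
    const schemaKey = fieldMap[externalPath]
    const value = getPath(payload, externalPath)
    if (schemaKey !== undefined && value !== undefined) values[schemaKey] = value
  }
  return values
}

// Made-up payload and map, loosely modeled on a practice-management matter.
const samplePayload = { client: { first_name: 'Ada', last_name: 'Lovelace' }, custom: { chapter: '7' } }
const sampleFieldMap: FieldMap = {
  'client.first_name': 'debtor1.first_name',
  'client.last_name': 'debtor1.last_name',
  'custom.chapter': 'case.chapter',
  'custom.trustee': 'case.trustee',
}
const syncedValues = mapToEntryValues(samplePayload, sampleFieldMap)
```

The resulting values would be written as one Entry with source='clio'; absent fields such as `custom.trustee` simply produce no key, so they cannot overwrite earlier values during the merge.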
The merge model
Current value of any schema key on a case = the value from the latest confirmed Entry that touched that key. loadCaseValues(caseId) in services/intake.ts (lines 244–257) is the canonical implementation:
SELECT "values" FROM entries
WHERE case_id = ? AND confirmed = true
ORDER BY timestamp ASC
…then Object.assign across all rows. Latest-wins per key, because later objects overwrite earlier ones.
Consequences:
- Corrections are writes. A lawyer who fixes a creditor name from a credit-report Entry just adds a new manual Entry with that one key. The credit-report Entry stays in the timeline unchanged.
- Append-only. No Entry is ever mutated in place. The only state change on an existing Entry is confirmed: false → true.
- Reviewable before filing. Unconfirmed Entries are visible in the Data tab and in the Activity feed, but their values don't reach the binding engine. A reviewer has to promote them.
- Source-tracked. entries.source and entries.user_id are always set, along with entries.data_source_id where applicable. Every value on every form is traceable to the Entry that produced it.
- Invoker on join. The server populates Entry.invoker = { id, name } on read for UI purposes; it is not a wire-level column on entries.
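The merge rule and its first consequences can be sketched in memory as follows. This mirrors the described behavior (confirmed rows only, oldest first, Object.assign across them) rather than reproducing loadCaseValues itself; the `EntryRow` shape, the flat dotted keys, and the sample timeline are assumptions.

```typescript
interface EntryRow {
  source: string
  timestamp: string // ISO 8601
  confirmed: boolean
  values: Record<string, unknown>
}

// Latest-wins merge: later confirmed entries overwrite earlier ones, key by key.
function mergeCaseValues(entries: EntryRow[]): Record<string, unknown> {
  const confirmed = entries
    .filter((e) => e.confirmed) // unconfirmed values never reach the merge
    .sort((a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp)) // ORDER BY timestamp ASC
  return Object.assign({}, ...confirmed.map((e) => e.values))
}

const timeline: EntryRow[] = [
  {
    source: 'credit-report',
    timestamp: '2024-01-01T10:00:00Z',
    confirmed: true,
    values: { 'creditor.unsecured[0].name': 'CAP ONE', 'creditor.unsecured[0].balance': 1200 },
  },
  {
    // One-key correction by a lawyer; the credit-report row above is untouched.
    source: 'manual',
    timestamp: '2024-01-02T09:00:00Z',
    confirmed: true,
    values: { 'creditor.unsecured[0].name': 'Capital One, N.A.' },
  },
  {
    // Pending portal submission: visible to reviewers, ignored by the merge.
    source: 'intake',
    timestamp: '2024-01-03T08:00:00Z',
    confirmed: false,
    values: { 'creditor.unsecured[0].balance': 9999 },
  },
]

const currentValues = mergeCaseValues(timeline)
```

Here the corrected name wins because its Entry is later, the balance still comes from the credit report, and the unconfirmed portal value has no effect until a reviewer promotes it.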
Why this is powerful
- One schema, N sources, one artifact. Manual, portal, widget, document, API: all five produce INSERT INTO entries (...) with the same shape. Reviewable, auditable, append-only, invoker-tracked by construction. There is no second code path for "external data" versus "typed data."
- Adding a new intake source is data, not code. A new intake JSON under domains/<vertical>/data-sources/ with a slug and published: true is live on /intake/<slug> immediately. A new API sync needs a small parser plus a DataSource config; the Entry write is the same line.
- The intake JSONs are the IP. What to ask, in what order, conditioned on what, in what tone (the lawyer's domain expertise) lives in IntakeConfig.sections[].questions[]. This is the same place the four Atlas / DebtStoppers / Greenfield / self-file configs live. Everything below that point (the portal UI, the rate limiter, the attachment writer, the Entry insert, the binding engine) is generic.
- Cross-vertical by default. The same portal, widget, invite, and Entry pipeline runs for any domain whose schema + forms are loaded. Immigration intakes, family-law intakes, and bankruptcy intakes share one intake subsystem. Swap the schema and the intake configs; nothing else changes.
Key files
- Types: packages/core/src/api/types.ts
- Server routes: packages/server/src/routes/intake.ts, packages/server/src/routes/portal.ts
- Server services: packages/server/src/services/intake.ts, packages/server/src/services/portal.ts, packages/server/src/services/rate-limit.ts
- Portal app: packages/portal/src/intake/, packages/portal/src/widget/, packages/portal/src/pages/, packages/portal/src/shared/
- Intake configs: domains/bankruptcy/data-sources/intake-atlas.json, intake-debtstoppers.json, intake-greenfield.json, intake-individual-self.json
- DataSource mappings: domains/bankruptcy/data-sources/credit-report-mapping.md, clio-api-mapping.md
- Companion doc: docs/engine.md (Any Source, Same Result)
docs/intake.md