ce67e38d77
- Add reconciled_with_id and created_at columns to transactions table docs - Document split_payments, expense_metadata, rule_apply_runs tables - Update /api/transactions route docs with has_split filter and all sort options - Add /api/transactions/reconcile and /api/split-payments to API table - Document import date (created_at) behaviour and reconciliation caveat - Add Prisma regeneration note to CLAUDE.md - Note schema drift for tables added without migration files
383 lines
19 KiB
Markdown
383 lines
19 KiB
Markdown
# Finance App
|
||
|
||
Personal finance tracker built on Next.js 16 (App Router), PostgreSQL, and Prisma. Bank statements are ingested automatically from Paperless-NGX via an N8N workflow that uses Gemini to extract structured data from PDF statements.
|
||
|
||
## Stack
|
||
|
||
- **Frontend**: Next.js 16 App Router, TypeScript, Tailwind CSS, Recharts
|
||
- **Backend**: Next.js API routes, raw PostgreSQL via `pg` + `@prisma/adapter-pg`
|
||
- **Database**: PostgreSQL (`postgres-personal` container)
|
||
- **Auth**: `X-Forwarded-User` header (email) set by Traefik forward-auth → mapped to `participants.email`
|
||
- **Ingestion**: N8N workflow → Gemini 2.5 Flash (PDF parsing) → PostgreSQL
|
||
|
||
---
|
||
|
||
## Data Model
|
||
|
||
### `statements`
|
||
The top-level document, one row per billing period per account.
|
||
|
||
| Column | Type | Description |
|
||
|--------|------|-------------|
|
||
| `id` | int | Primary key |
|
||
| `bank_name` | text | Normalised bank name (e.g. "American Express") |
|
||
| `card_name` | text | Product name (e.g. "Rewards Travel Adventures") |
|
||
| `account_number` | text | Account/card number (spaces stripped) |
|
||
| `account_type` | text | Raw account type string from statement |
|
||
| `statement_type` | text | Normalised type: `Credit Card`, `Business Card`, `multi-currency account`, etc. |
|
||
| `account_holder_name` | text | Name on the account if extracted |
|
||
| `billing_start_date` | date | Period start |
|
||
| `billing_end_date` | date | Period end — used as the deduplication anchor |
|
||
| `opening_balance` | numeric | Balance at start of period |
|
||
| `closing_balance` | numeric | Balance at end of period |
|
||
| `total_credits` | numeric | Sum of all credits in period |
|
||
| `total_debits` | numeric | Sum of all debits in period |
|
||
| `total_amount_due` | numeric | Amount due (credit cards) |
|
||
| `minimum_amount_due` | numeric | Minimum payment due (credit cards) |
|
||
| `payment_due_date` | date | Payment due date (credit cards) |
|
||
| `credit_limit` | numeric | Credit limit (credit cards) |
|
||
| `available_credit` | numeric | Available credit at statement date |
|
||
| `interest_charged` | numeric | Interest charged this period (from statement summary) |
|
||
| `fees_charged` | numeric | Fees charged this period (from statement summary) |
|
||
| `currency` | text | Statement currency (e.g. `AUD`, `USD`) |
|
||
| `exchange_rate_to_aud` | numeric | FX rate at ingestion time (live from open.er-api.com) |
|
||
| `owner_id` | int FK → `participants` | Which person owns this statement |
|
||
| `paperless_doc_id` | int | Paperless-NGX document ID — deduplication key |
|
||
| `tier_used` | text | AI model used for extraction (e.g. `gemini-2.5-flash`) |
|
||
| `event_created` | bool | Whether a Google Calendar reminder was created for payment due date |
|
||
|
||
**Deduplication**: unique index on `(bank_name, account_number, billing_end_date)` prevents re-ingestion of the same period. `paperless_doc_id` has a separate unique index for Paperless-linked documents.
|
||
|
||
**Credit card detection**: `statement_type ILIKE '%card%'`
|
||
|
||
---
|
||
|
||
### `transactions`
|
||
One row per line item within a statement. Cascade-deleted when the parent statement is deleted.
|
||
|
||
| Column | Type | Description |
|
||
|--------|------|-------------|
|
||
| `id` | int | Primary key |
|
||
| `statement_id` | int FK → `statements` (nullable) | Parent statement; NULL for manually-entered transactions |
|
||
| `owner_id` | int FK → `participants` (nullable) | Owner for manual transactions (no statement); statement-linked transactions derive owner from `statements.owner_id` |
|
||
| `transaction_date` | date | Date of transaction |
|
||
| `description` | text | Raw description from the statement |
|
||
| `amount` | numeric | Original amount in statement currency |
|
||
| `amount_aud` | numeric | AUD-converted amount (= amount if already AUD) |
|
||
| `transaction_type` | text | `debit`, `credit`, `payment`, `refund`, `fee`, `interest`, `transfer` |
|
||
| `merchant_name` | text | Raw merchant name extracted by Gemini |
|
||
| `merchant_normalized` | text | Cleaned/normalised merchant name (Gemini) |
|
||
| `location` | text | Location if present on statement |
|
||
| `foreign_currency_amount` | numeric | Original foreign amount if this was an FX transaction |
|
||
| `foreign_currency_code` | text | Foreign currency code (e.g. `USD`) |
|
||
| `category` | text | AI-assigned category (see category taxonomy below) |
|
||
| `row_index` | int | Position in statement — used for deduplication |
|
||
| `reconciled_with_id` | int FK → `transactions` (nullable) | Links a manually-entered transaction to its matching statement transaction after reconciliation |
|
||
| `created_at` | timestamptz | When the row was inserted — the "import date". For reconciled transactions the UI shows the original manual/CSV `created_at`, not the statement's |
|
||
|
||
**Deduplication**: unique index on `(statement_id, transaction_date, description, amount, row_index)`.
|
||
|
||
**Analytics**: all spend queries use `amount_aud` for cross-currency consistency. Split-adjusted queries apply `amount_aud * share_percent / 100` where a split exists for the current user.
|
||
|
||
---
|
||
|
||
### `transaction_overrides`
|
||
User corrections to AI-extracted data. Stored separately to preserve the original extraction.
|
||
|
||
| Column | Type | Description |
|
||
|--------|------|-------------|
|
||
| `transaction_id` | int FK → `transactions` (unique) | One override per transaction |
|
||
| `merchant_normalized` | text | User-corrected merchant name |
|
||
| `category_override` | text | User-corrected category |
|
||
| `notes` | text | Free-text notes |
|
||
|
||
All analytics queries use `COALESCE(o.category_override, t.category)` and `COALESCE(o.merchant_normalized, t.merchant_normalized, t.merchant_name)` to prefer overrides over AI values.
|
||
|
||
---
|
||
|
||
### `transaction_splits`
|
||
Shared expense tracking — records that a transaction was split between participants.
|
||
|
||
| Column | Type | Description |
|
||
|--------|------|-------------|
|
||
| `transaction_id` | int FK → `transactions` | The transaction being split |
|
||
| `participant_id` | int FK → `participants` | Who shares in this transaction |
|
||
| `share_percent` | numeric(5,2) | Their percentage (1–100) |
|
||
| `settled` | bool | Whether this share has been settled |
|
||
| `settled_at` | timestamptz | When it was settled |
|
||
|
||
A transaction can be split across multiple participants. The statement owner's own share is implicit (`100 - SUM(other shares)`). Analytics queries LEFT JOIN `transaction_splits` on `participant_id = current_user.id` — if no split row exists, the full amount belongs to the owner.
|
||
|
||
---
|
||
|
||
### `transaction_tags`
|
||
Many-to-many join between transactions and tags.
|
||
|
||
| Column | Type |
|
||
|--------|------|
|
||
| `transaction_id` | int FK → `transactions` |
|
||
| `tag_id` | int FK → `tags` |
|
||
|
||
---
|
||
|
||
### `tags`
|
||
User-defined coloured labels for ad-hoc transaction grouping beyond the fixed category taxonomy.
|
||
|
||
| Column | Type | Description |
|
||
|--------|------|-------------|
|
||
| `id` | int | Primary key |
|
||
| `name` | text (unique) | Tag name |
|
||
| `color` | text | Hex colour (default `#6366f1`) |
|
||
|
||
---
|
||
|
||
### `participants`
|
||
People who own statements or share expenses.
|
||
|
||
| Column | Type | Description |
|
||
|--------|------|-------------|
|
||
| `id` | int | Primary key |
|
||
| `name` | text (unique) | Display name |
|
||
| `email` | text (unique) | Login identity — matched against `X-Forwarded-User` header |
|
||
|
||
---
|
||
|
||
### `account_owner_mappings`
|
||
Persists `(bank, account_number) → owner` assignments so future ingestion auto-assigns the correct owner without manual intervention.
|
||
|
||
| Column | Type | Description |
|
||
|--------|------|-------------|
|
||
| `bank_name` | text | |
|
||
| `account_number` | text | |
|
||
| `owner_id` | int FK → `participants` | |
|
||
|
||
Written when a user reassigns a statement owner in the UI. Consulted by the N8N workflow on every new statement insert.
|
||
|
||
---
|
||
|
||
### `rules`
|
||
Saved auto-categorisation rules. Applied in bulk via the Rules page.
|
||
|
||
| Column | Type | Description |
|
||
|--------|------|-------------|
|
||
| `owner_id` | int FK → `participants` | Rule belongs to this user |
|
||
| `name` | text | Rule label |
|
||
| `conditions` | jsonb | Array of `{field, operator, value}` — AND logic |
|
||
| `actions` | jsonb | `{set_category, add_tag_ids, set_merchant}` |
|
||
| `enabled` | bool | |
|
||
| `priority` | int | Higher priority rules run first |
|
||
|
||
**Condition fields**: `merchant_normalized`, `description`, `category`, `bank_name`, `amount`, `transaction_type`
|
||
**Condition operators**: `contains`, `equals`, `starts_with`, `gt`, `lt`, `not_equals`
|
||
**Actions**: `set_category`, `set_merchant`, `add_tag_ids`, `apply_split`
|
||
|
||
---
|
||
|
||
### `split_payments`
|
||
Records of actual cash settlements between participants.
|
||
|
||
| Column | Type | Description |
|
||
|--------|------|-------------|
|
||
| `from_participant_id` | int FK → `participants` | Who paid |
|
||
| `to_participant_id` | int FK → `participants` | Who received |
|
||
| `amount` | numeric | Amount settled |
|
||
| `payment_date` | date | Date of settlement |
|
||
| `notes` | text | Optional note (e.g. "bank transfer") |
|
||
| `linked_transaction_id` | int FK → `transactions` (nullable) | If the payment was itself a transaction |
|
||
|
||
---
|
||
|
||
### `expense_metadata`
|
||
Enrichment records for non-statement expenses (email receipts, manual entries). Linked to a `transaction` if one exists; otherwise a standalone record awaiting reconciliation.
|
||
|
||
| Column | Type | Description |
|
||
|--------|------|-------------|
|
||
| `transaction_id` | int FK → `transactions` (unique, nullable) | Linked transaction; NULL until reconciled |
|
||
| `source` | text | Origin: `email`, `manual` |
|
||
| `paperless_doc_id` | int | Paperless-NGX document ID |
|
||
| `payment_method` | text | `credit_card`, `debit_card`, `paypal`, `afterpay`, `cash`, etc. |
|
||
| `payment_method_detail` | text | Card last-4 or provider detail |
|
||
| `order_reference` | text | Order/confirmation number |
|
||
| `line_items` | jsonb | Array of `{description, qty, unit_price, total}` |
|
||
| `merchant_normalized` | text | Canonical merchant for matching |
|
||
| `amount` / `transaction_date` | numeric / date | Used for reconciliation matching when `transaction_id IS NULL` |
|
||
| `extraction_model` | text | AI model used (`gemini-2.5-flash`) |
|
||
|
||
Partial index on `(merchant_normalized, transaction_date) WHERE transaction_id IS NULL` powers reconciliation queries.
|
||
|
||
---
|
||
|
||
### `rule_apply_runs`
|
||
Audit log of bulk rule-apply operations. Each run captures which transactions were affected and a full snapshot for revert support.
|
||
|
||
| Column | Type | Description |
|
||
|--------|------|-------------|
|
||
| `owner_id` | int FK → `participants` | |
|
||
| `applied_at` | timestamptz | When the run executed |
|
||
| `split_from` | date | Optional date filter used for this run |
|
||
| `matched` | int | Number of rules matched |
|
||
| `transactions_affected` | int | Number of transactions changed |
|
||
| `reverted_at` | timestamptz | Set when run was reverted |
|
||
| `snapshot` | jsonb | Pre-run state of all affected transactions |
|
||
|
||
---
|
||
|
||
### `budgets`
|
||
Monthly spend targets per category. Stored but currently unused in the UI (replaced by the analytics/insights views).
|
||
|
||
| Column | Type | Description |
|
||
|--------|------|-------------|
|
||
| `owner_id` | int FK → `participants` | |
|
||
| `category` | text | Category name |
|
||
| `month` | date | Always first of month (e.g. `2026-03-01`) |
|
||
| `amount_limit` | numeric | Spend target for that category/month |
|
||
|
||
---
|
||
|
||
## Category Taxonomy
|
||
|
||
Fixed set defined in `src/lib/categories.ts`. Applied by Gemini at ingestion and overridable by the user or rules engine:
|
||
|
||
`groceries` · `dining` · `transport` · `fuel` · `shopping` · `utilities` · `entertainment` · `travel` · `health` · `insurance` · `subscriptions` · `cash_advance` · `government` · `education` · `rent` · `home_goods` · `home_maintenance` · `transfers` · `income` · `investment` · `personal_care` · `pets` · `gifts` · `charity` · `other`
|
||
|
||
- **home_goods** — items purchased for the house (appliances, furniture, kitchenware, electronics)
|
||
- **home_maintenance** — services on the property (cleaning, mowing, repairs)
|
||
|
||
**Committed spend** (Insights page): `rent`, `utilities`, `insurance`, `subscriptions`
|
||
**Excluded from spend analytics**: `transfers`, `investment`
|
||
|
||
---
|
||
|
||
## API Routes
|
||
|
||
All routes require authentication via `X-Forwarded-User` header (set by Traefik). Responses are always scoped to the authenticated user's `owner_id`.
|
||
|
||
| Method | Route | Description |
|
||
|--------|-------|-------------|
|
||
| GET | `/api/statements` | All statements for current user |
|
||
| GET / PATCH | `/api/statements/[id]` | Get statement; PATCH to reassign owner (also writes `account_owner_mappings`) |
|
||
| GET | `/api/transactions` | Paginated transactions. Filters: `from`, `to`, `categories`, `bank_names`, `tag_ids`, `transaction_types`, `search`, `statement_id`, `amount_min`, `amount_max`, `has_split` (`yes`/`no`). Sort: `sort_by` (`transaction_date`\|`amount`\|`created_at`), `sort_dir` (`asc`\|`desc`) |
|
||
| POST | `/api/transactions` | Create a manual transaction (no statement) |
|
||
| GET / PATCH | `/api/transactions/[id]` | Get transaction; PATCH to upsert override (category, merchant, notes) |
|
||
| GET / POST | `/api/transactions/[id]/splits` | List or create splits on a transaction |
|
||
| GET / POST | `/api/transactions/[id]/tags` | List or apply tags to a transaction |
|
||
| POST | `/api/transactions/bulk` | Bulk update category/merchant across multiple transactions |
|
||
| POST | `/api/transactions/reconcile` | Link manual transactions to statement transactions; copies overrides, tags, splits across |
|
||
| GET | `/api/analytics/monthly` | Split-adjusted monthly spend by category + income + investments. Params: `months` (1–24, default 6) |
|
||
| GET | `/api/analytics/subscriptions` | Recurring charge detection — merchants with ≥3 occurrences at consistent intervals |
|
||
| GET | `/api/analytics/fees` | Fees and interest from statement summaries + individual fee/interest transactions |
|
||
| GET | `/api/shared-transactions` | Transactions with active splits; sorted client-side by date/imported/amount in the UI |
|
||
| POST | `/api/splits/settle` | Mark a split as settled |
|
||
| GET / POST | `/api/split-payments` | List or record cash settlements between participants |
|
||
| GET / POST | `/api/participants` | List participants; POST to create (with optional `email`) |
|
||
| GET | `/api/participants/[id]/balance` | Net balance owed by/to a specific participant |
|
||
| GET | `/api/participants/balances` | All participant balances |
|
||
| GET / POST | `/api/rules` | List or create rules |
|
||
| PATCH / DELETE | `/api/rules/[id]` | Update or delete a rule |
|
||
| POST | `/api/rules/apply` | Run all enabled rules against all transactions; returns `{matched, transactions_affected}` |
|
||
| GET / POST | `/api/budgets` | List budgets for a month (`?month=YYYY-MM`); upsert budget |
|
||
| DELETE | `/api/budgets/[id]` | Delete a budget |
|
||
| GET | `/api/merchants` | Merchant name autocomplete suggestions |
|
||
| GET | `/api/me` | Current user info derived from `X-Forwarded-User` header |
|
||
| GET / POST | `/api/tags` | List or create tags |
|
||
| PATCH / DELETE | `/api/tags/[id]` | Update or delete a tag |
|
||
|
||
---
|
||
|
||
## Ingestion Pipeline
|
||
|
||
```
|
||
Paperless-NGX
|
||
└─ documents tagged "Bank Statement" + "Credit Card" (without "cc-processor")
|
||
│
|
||
▼
|
||
N8N workflow — polls every 5 minutes (workflow ID: FysADdFwEtwONQl4)
|
||
│
|
||
├─ Duplicate check: SELECT WHERE paperless_doc_id = <id>
|
||
│ └─ Already processed → skip, mark in Paperless
|
||
│
|
||
├─ Download PDF binary from Paperless API
|
||
│
|
||
├─ Gemini 2.5 Flash — PDF → structured JSON
|
||
│ responseSchema: { summary: {...}, transactions: [...] }
|
||
│ timeout: 180s, retryOnFail: 3×, delay: 30s
|
||
│
|
||
├─ Parse & normalise
|
||
│ account_number: strip spaces
|
||
│ bank_name: title-case
|
||
│ FX rate: fetch live from open.er-api.com if non-AUD
|
||
│
|
||
├─ Statement exists? (bank + account + billing_end_date)
|
||
│ └─ Duplicate → skip, mark in Paperless
|
||
│
|
||
├─ New bank? → Slack approval gate (human confirms before insert)
|
||
│
|
||
├─ Lookup account_owner_mappings → resolve owner_id (default: 1 = "Me")
|
||
│
|
||
├─ INSERT statements + transactions
|
||
│
|
||
├─ Google Calendar reminder for payment_due_date (credit cards)
|
||
│
|
||
└─ Paperless: PATCH document to add "cc-processor" tag
|
||
```
|
||
|
||
N8N workflow JSON: `docker/automation/workflows/cc-statement-processor-paperless.json` in the smarthome repo.
|
||
|
||
---
|
||
|
||
## Schema Migrations
|
||
|
||
Located in `prisma/migrations/`. Applied manually against the running container:
|
||
|
||
```bash
|
||
docker exec postgres-personal psql -U personal -d personal \
|
||
< prisma/migrations/<migration>/migration.sql
|
||
```
|
||
|
||
| Migration | What it adds |
|
||
|-----------|-------------|
|
||
| `0001_init` | `statements`, `transactions`, `participants` |
|
||
| `0002_splits` | `transaction_splits` |
|
||
| `0003_owner_segregation` | `owner_id` on statements, `account_owner_mappings`, `email` on participants |
|
||
| `0004_tags` | `tags`, `transaction_tags` |
|
||
| `0005_rules` | `rules` |
|
||
| `0006_budgets` | `budgets` |
|
||
| `0007_cashflow` | `amount_aud`, `exchange_rate_to_aud` on transactions; `exchange_rate_to_aud` on statements |
|
||
|
||
> `paperless_doc_id` on statements and the `uq_statements_paperless_doc_id` index were added directly (not tracked in a migration file).
|
||
> `owner_id` on transactions and `statement_id` made nullable were applied directly (March 2026) to support manual transaction entry without a fake statement.
|
||
> `reconciled_with_id` on transactions, `expense_metadata`, `rule_apply_runs`, `split_payments` were added directly and are covered by the Prisma schema but lack individual migration files.
|
||
|
||
---
|
||
|
||
## Known Gaps / TODOs
|
||
|
||
### Payment Provider tracking
|
||
|
||
Currently `merchant_normalized` conflates the *payment provider* with the *merchant*. Transactions processed through PayPal, Afterpay, Zip, Alipay, etc. end up with the provider as the merchant when the real merchant can't be recovered.
|
||
|
||
**What's been done so far:**
|
||
- PayPal entries that embed the merchant name (e.g. `PAYPAL *BUNNINGSGRO`) were cleaned up — the real merchant was extracted during the March 2026 consolidation pass.
|
||
- Pure PayPal/Afterpay/Zip entries where the merchant is unrecoverable were left as-is.
|
||
- A one-time SQL consolidation pass normalised ~50 merchant name variant groups (March 2026).
|
||
|
||
**Remaining work:**
|
||
1. **DB migration**: `ALTER TABLE transactions ADD COLUMN payment_provider text` and same on `transaction_overrides`.
|
||
2. **Gemini prompt**: add `payment_provider` to the `responseSchema` so the AI extracts it separately (`"PayPal"`, `"Afterpay"`, `"Zip"`, `null`, etc.) — the raw bank description usually contains enough signal.
|
||
3. **Backfill**: for existing transactions, derive `payment_provider` from `merchant_name` patterns (`PAYPAL *`, `AFTERPAY`, `ZIP/ZIPPAY`, `BPAY`).
|
||
4. **App**: surface `payment_provider` as a filter/column in the transactions view; exclude payment providers from merchant analytics so they don't inflate the merchant list.
|
||
|
||
---
|
||
|
||
## Deployment
|
||
|
||
Runs as a Docker container alongside the rest of the home lab stack. Build and deploy:
|
||
|
||
```bash
|
||
# From smarthome repo root
|
||
docker compose --env-file docker/common.env --env-file docker/finance/.env \
|
||
-f docker/finance/docker-compose.yml up -d --build
|
||
```
|
||
|
||
The container uses Next.js standalone output. `@prisma/adapter-pg` and `pg` are listed in `serverExternalPackages` in `next.config.ts` to ensure they are included in the standalone bundle.
|