# Finance App

Personal finance tracker built on Next.js 16 (App Router), PostgreSQL, and Prisma. Bank statements are ingested automatically from Paperless-NGX via an N8N workflow that uses Gemini to extract structured data from PDF statements.

## Stack

- **Frontend**: Next.js 16 App Router, TypeScript, Tailwind CSS, Recharts
- **Backend**: Next.js API routes, raw PostgreSQL via `pg` + `@prisma/adapter-pg`
- **Database**: PostgreSQL (`postgres-personal` container)
- **Auth**: `X-Forwarded-User` header (email) set by Traefik forward-auth → mapped to `participants.email`
- **Ingestion**: N8N workflow → Gemini 2.5 Flash (PDF parsing) → PostgreSQL

---

## Data Model

### `statements`

The top-level document, one row per billing period per account.

| Column | Type | Description |
|--------|------|-------------|
| `id` | int | Primary key |
| `bank_name` | text | Normalised bank name (e.g. "American Express") |
| `card_name` | text | Product name (e.g. "Rewards Travel Adventures") |
| `account_number` | text | Account/card number (spaces stripped) |
| `account_type` | text | Raw account type string from statement |
| `statement_type` | text | Normalised type: `Credit Card`, `Business Card`, `multi-currency account`, etc. |
| `account_holder_name` | text | Name on the account if extracted |
| `billing_start_date` | date | Period start |
| `billing_end_date` | date | Period end — used as the deduplication anchor |
| `opening_balance` | numeric | Balance at start of period |
| `closing_balance` | numeric | Balance at end of period |
| `total_credits` | numeric | Sum of all credits in period |
| `total_debits` | numeric | Sum of all debits in period |
| `total_amount_due` | numeric | Amount due (credit cards) |
| `minimum_amount_due` | numeric | Minimum payment due (credit cards) |
| `payment_due_date` | date | Payment due date (credit cards) |
| `credit_limit` | numeric | Credit limit (credit cards) |
| `available_credit` | numeric | Available credit at statement date |
| `interest_charged` | numeric | Interest charged this period (from statement summary) |
| `fees_charged` | numeric | Fees charged this period (from statement summary) |
| `currency` | text | Statement currency (e.g. `AUD`, `USD`) |
| `exchange_rate_to_aud` | numeric | FX rate at ingestion time (live from open.er-api.com) |
| `owner_id` | int FK → `participants` | Which person owns this statement |
| `paperless_doc_id` | int | Paperless-NGX document ID — deduplication key |
| `tier_used` | text | AI model used for extraction (e.g. `gemini-2.5-flash`) |
| `event_created` | bool | Whether a Google Calendar reminder was created for payment due date |

**Deduplication**: unique index on `(bank_name, account_number, billing_end_date)` prevents re-ingestion of the same period. `paperless_doc_id` has a separate unique index for Paperless-linked documents.

**Credit card detection**: `statement_type ILIKE '%card%'`

---

### `transactions`

One row per line item within a statement. Cascade-deleted when the parent statement is deleted.

| Column | Type | Description |
|--------|------|-------------|
| `id` | int | Primary key |
| `statement_id` | int FK → `statements` (nullable) | Parent statement; NULL for manually-entered transactions |
| `owner_id` | int FK → `participants` (nullable) | Owner for manual transactions (no statement); statement-linked transactions derive owner from `statements.owner_id` |
| `transaction_date` | date | Date of transaction |
| `description` | text | Raw description from the statement |
| `amount` | numeric | Original amount in statement currency |
| `amount_aud` | numeric | AUD-converted amount (= amount if already AUD) |
| `transaction_type` | text | `debit`, `credit`, `payment`, `refund`, `fee`, `interest`, `transfer` |
| `merchant_name` | text | Raw merchant name extracted by Gemini |
| `merchant_normalized` | text | Cleaned/normalised merchant name (Gemini) |
| `location` | text | Location if present on statement |
| `foreign_currency_amount` | numeric | Original foreign amount if this was an FX transaction |
| `foreign_currency_code` | text | Foreign currency code (e.g. `USD`) |
| `category` | text | AI-assigned category (see category taxonomy below) |
| `row_index` | int | Position in statement — used for deduplication |
| `reconciled_with_id` | int FK → `transactions` (nullable) | Links a manually-entered transaction to its matching statement transaction after reconciliation |
| `created_at` | timestamptz | When the row was inserted — the "import date". For reconciled transactions the UI shows the original manual/CSV `created_at`, not the statement's |

**Deduplication**: unique index on `(statement_id, transaction_date, description, amount, row_index)`.

**Analytics**: all spend queries use `amount_aud` for cross-currency consistency. Split-adjusted queries apply `amount_aud * share_percent / 100` where a split exists for the current user.

---

### `transaction_overrides`

User corrections to AI-extracted data. Stored separately to preserve the original extraction.

| Column | Type | Description |
|--------|------|-------------|
| `transaction_id` | int FK → `transactions` (unique) | One override per transaction |
| `merchant_normalized` | text | User-corrected merchant name |
| `category_override` | text | User-corrected category |
| `notes` | text | Free-text notes |

All analytics queries use `COALESCE(o.category_override, t.category)` and `COALESCE(o.merchant_normalized, t.merchant_normalized, t.merchant_name)` to prefer overrides over AI values.

---

### `transaction_splits`

Shared expense tracking — records that a transaction was split between participants.

| Column | Type | Description |
|--------|------|-------------|
| `transaction_id` | int FK → `transactions` | The transaction being split |
| `participant_id` | int FK → `participants` | Who shares in this transaction |
| `share_percent` | numeric(5,2) | Their percentage (1–100) |
| `settled` | bool | Whether this share has been settled |
| `settled_at` | timestamptz | When it was settled |

A transaction can be split across multiple participants. The statement owner's own share is implicit (`100 - SUM(other shares)`). Analytics queries LEFT JOIN `transaction_splits` on `participant_id = current_user.id` — if no split row exists, the full amount belongs to the owner.

---

### `transaction_tags`

Many-to-many join between transactions and tags.

| Column | Type |
|--------|------|
| `transaction_id` | int FK → `transactions` |
| `tag_id` | int FK → `tags` |

---

### `tags`

User-defined coloured labels for ad-hoc transaction grouping beyond the fixed category taxonomy.

| Column | Type | Description |
|--------|------|-------------|
| `id` | int | Primary key |
| `name` | text (unique) | Tag name |
| `color` | text | Hex colour (default `#6366f1`) |

---

### `participants`

People who own statements or share expenses.
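
To make the identity mapping concrete: the Stack section says the `X-Forwarded-User` header carries an email that maps to `participants.email`. The sketch below shows one way that lookup could work over an in-memory list. `resolveParticipant`, `HeaderReader`, and the sample rows are hypothetical names used only for illustration, and treating a missing header or an unknown email as "not authenticated" is an assumption rather than confirmed behaviour of the app.

```typescript
// Shape of a row in `participants` (columns as documented in this README).
interface Participant {
  id: number;
  name: string;
  email: string | null; // email is optional when a participant is created
}

// Minimal structural type so the sketch does not depend on a runtime;
// the standard Fetch API `Headers` object satisfies it.
interface HeaderReader {
  get(name: string): string | null;
}

// Hypothetical helper: resolve the logged-in participant from the header.
// Assumption: a missing header or an unknown email means "not authenticated".
function resolveParticipant(
  headers: HeaderReader,
  participants: readonly Participant[],
): Participant | null {
  const email = headers.get("x-forwarded-user");
  if (!email) return null;
  return participants.find((p) => p.email === email) ?? null;
}

// Example with made-up data.
const demoParticipants: Participant[] = [
  { id: 1, name: "Me", email: "me@example.com" },
  { id: 2, name: "Partner", email: null },
];
const demoHeaders: HeaderReader = {
  get: (name) =>
    name.toLowerCase() === "x-forwarded-user" ? "me@example.com" : null,
};
console.log(resolveParticipant(demoHeaders, demoParticipants)?.name); // "Me"
```

In a route handler, `request.headers` could be passed in directly, with the in-memory list replaced by a query on `participants.email`.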

| Column | Type | Description |
|--------|------|-------------|
| `id` | int | Primary key |
| `name` | text (unique) | Display name |
| `email` | text (unique) | Login identity — matched against `X-Forwarded-User` header |

---

### `account_owner_mappings`

Persists `(bank, account_number) → owner` assignments so future ingestion auto-assigns the correct owner without manual intervention.

| Column | Type | Description |
|--------|------|-------------|
| `bank_name` | text | |
| `account_number` | text | |
| `owner_id` | int FK → `participants` | |

Written when a user reassigns a statement owner in the UI. Consulted by the N8N workflow on every new statement insert.

---

### `rules`

Saved auto-categorisation rules. Applied in bulk via the Rules page.

| Column | Type | Description |
|--------|------|-------------|
| `owner_id` | int FK → `participants` | Rule belongs to this user |
| `name` | text | Rule label |
| `conditions` | jsonb | Array of `{field, operator, value}` — AND logic |
| `actions` | jsonb | `{set_category, add_tag_ids, set_merchant}` |
| `enabled` | bool | |
| `priority` | int | Higher priority rules run first |

**Condition fields**: `merchant_normalized`, `description`, `category`, `bank_name`, `amount`, `transaction_type`

**Condition operators**: `contains`, `equals`, `starts_with`, `gt`, `lt`, `not_equals`

**Actions**: `set_category`, `set_merchant`, `add_tag_ids`, `apply_split`

---

### `split_payments`

Records of actual cash settlements between participants.

| Column | Type | Description |
|--------|------|-------------|
| `from_participant_id` | int FK → `participants` | Who paid |
| `to_participant_id` | int FK → `participants` | Who received |
| `amount` | numeric | Amount settled |
| `payment_date` | date | Date of settlement |
| `notes` | text | Optional note (e.g. "bank transfer") |
| `linked_transaction_id` | int FK → `transactions` (nullable) | If the payment was itself a transaction |

---

### `expense_metadata`

Enrichment records for non-statement expenses (email receipts, manual entries). Linked to a `transaction` if one exists; otherwise a standalone record awaiting reconciliation.

| Column | Type | Description |
|--------|------|-------------|
| `transaction_id` | int FK → `transactions` (unique, nullable) | Linked transaction; NULL until reconciled |
| `source` | text | Origin: `email`, `manual` |
| `paperless_doc_id` | int | Paperless-NGX document ID |
| `payment_method` | text | `credit_card`, `debit_card`, `paypal`, `afterpay`, `cash`, etc. |
| `payment_method_detail` | text | Card last-4 or provider detail |
| `order_reference` | text | Order/confirmation number |
| `line_items` | jsonb | Array of `{description, qty, unit_price, total}` |
| `merchant_normalized` | text | Canonical merchant for matching |
| `amount` / `transaction_date` | numeric / date | Used for reconciliation matching when `transaction_id IS NULL` |
| `extraction_model` | text | AI model used (`gemini-2.5-flash`) |

Partial index on `(merchant_normalized, transaction_date) WHERE transaction_id IS NULL` powers reconciliation queries.

---

### `rule_apply_runs`

Audit log of bulk rule-apply operations. Each run captures which transactions were affected and a full snapshot for revert support.

| Column | Type | Description |
|--------|------|-------------|
| `owner_id` | int FK → `participants` | |
| `applied_at` | timestamptz | When the run executed |
| `split_from` | date | Optional date filter used for this run |
| `matched` | int | Number of rules matched |
| `transactions_affected` | int | Number of transactions changed |
| `reverted_at` | timestamptz | Set when run was reverted |
| `snapshot` | jsonb | Pre-run state of all affected transactions |

---

### `budgets`

Monthly spend targets per category.
Stored but currently unused in the UI (replaced by the analytics/insights views).

| Column | Type | Description |
|--------|------|-------------|
| `owner_id` | int FK → `participants` | |
| `category` | text | Category name |
| `month` | date | Always first of month (e.g. `2026-03-01`) |
| `amount_limit` | numeric | Spend target for that category/month |

---

## Category Taxonomy

Fixed set defined in `src/lib/categories.ts`. Applied by Gemini at ingestion and overridable by the user or rules engine:

`groceries` · `dining` · `transport` · `fuel` · `shopping` · `utilities` · `entertainment` · `travel` · `health` · `insurance` · `subscriptions` · `cash_advance` · `government` · `education` · `rent` · `home_goods` · `home_maintenance` · `transfers` · `income` · `investment` · `personal_care` · `pets` · `gifts` · `charity` · `other`

- **home_goods** — items purchased for the house (appliances, furniture, kitchenware, electronics)
- **home_maintenance** — services on the property (cleaning, mowing, repairs)

**Committed spend** (Insights page): `rent`, `utilities`, `insurance`, `subscriptions`

**Excluded from spend analytics**: `transfers`, `investment`

---

## API Routes

All routes require authentication via `X-Forwarded-User` header (set by Traefik). Responses are always scoped to the authenticated user's `owner_id`.

| Method | Route | Description |
|--------|-------|-------------|
| GET | `/api/statements` | All statements for current user |
| GET / PATCH | `/api/statements/[id]` | Get statement; PATCH to reassign owner (also writes `account_owner_mappings`) |
| GET | `/api/transactions` | Paginated transactions. Filters: `from`, `to`, `categories`, `bank_names`, `tag_ids`, `transaction_types`, `search`, `statement_id`, `amount_min`, `amount_max`, `has_split` (`yes`/`no`). Sort: `sort_by` (`transaction_date`\|`amount`\|`created_at`), `sort_dir` (`asc`\|`desc`) |
| POST | `/api/transactions` | Create a manual transaction (no statement) |
| GET / PATCH | `/api/transactions/[id]` | Get transaction; PATCH to upsert override (category, merchant, notes) |
| GET / POST | `/api/transactions/[id]/splits` | List or create splits on a transaction |
| GET / POST | `/api/transactions/[id]/tags` | List or apply tags to a transaction |
| POST | `/api/transactions/bulk` | Bulk update category/merchant across multiple transactions |
| POST | `/api/transactions/reconcile` | Link manual transactions to statement transactions; copies overrides, tags, splits across |
| GET | `/api/analytics/monthly` | Split-adjusted monthly spend by category + income + investments. Params: `months` (1–24, default 6) |
| GET | `/api/analytics/subscriptions` | Recurring charge detection — merchants with ≥3 occurrences at consistent intervals |
| GET | `/api/analytics/fees` | Fees and interest from statement summaries + individual fee/interest transactions |
| GET | `/api/shared-transactions` | Transactions with active splits; sorted client-side by date/imported/amount in the UI |
| POST | `/api/splits/settle` | Mark a split as settled |
| GET / POST | `/api/split-payments` | List or record cash settlements between participants |
| GET / POST | `/api/participants` | List participants; POST to create (with optional `email`) |
| GET | `/api/participants/[id]/balance` | Net balance owed by/to a specific participant |
| GET | `/api/participants/balances` | All participant balances |
| GET / POST | `/api/rules` | List or create rules |
| PATCH / DELETE | `/api/rules/[id]` | Update or delete a rule |
| POST | `/api/rules/apply` | Run all enabled rules against all transactions; returns `{matched, transactions_affected}` |
| GET / POST | `/api/budgets` | List budgets for a month (`?month=YYYY-MM`); upsert budget |
| DELETE | `/api/budgets/[id]` | Delete a budget |
| GET | `/api/merchants` | Merchant name autocomplete suggestions |
| GET | `/api/me` | Current user info derived from `X-Forwarded-User` header |
| GET / POST | `/api/tags` | List or create tags |
| PATCH / DELETE | `/api/tags/[id]` | Update or delete a tag |

---

## Ingestion Pipeline

```
Paperless-NGX
  └─ documents tagged "Bank Statement" + "Credit Card" (without "cc-processor")
       │
       ▼
N8N workflow — polls every 5 minutes (workflow ID: FysADdFwEtwONQl4)
  │
  ├─ Duplicate check: SELECT WHERE paperless_doc_id =
  │    └─ Already processed → skip, mark in Paperless
  │
  ├─ Download PDF binary from Paperless API
  │
  ├─ Gemini 2.5 Flash — PDF → structured JSON
  │    responseSchema: { summary: {...}, transactions: [...] }
  │    timeout: 180s, retryOnFail: 3×, delay: 30s
  │
  ├─ Parse & normalise
  │    account_number: strip spaces
  │    bank_name: title-case
  │    FX rate: fetch live from open.er-api.com if non-AUD
  │
  ├─ Statement exists? (bank + account + billing_end_date)
  │    └─ Duplicate → skip, mark in Paperless
  │
  ├─ New bank? → Slack approval gate (human confirms before insert)
  │
  ├─ Lookup account_owner_mappings → resolve owner_id (default: 1 = "Me")
  │
  ├─ INSERT statements + transactions
  │
  ├─ Google Calendar reminder for payment_due_date (credit cards)
  │
  └─ Paperless: PATCH document to add "cc-processor" tag
```

N8N workflow JSON: `docker/automation/workflows/cc-statement-processor-paperless.json` in the smarthome repo.

---

## Schema Migrations

Located in `prisma/migrations/`.
Applied manually against the running container:

```bash
# -i keeps stdin open so the redirected SQL file actually reaches psql
docker exec -i postgres-personal psql -U personal -d personal \
  < prisma/migrations//migration.sql
```

| Migration | What it adds |
|-----------|-------------|
| `0001_init` | `statements`, `transactions`, `participants` |
| `0002_splits` | `transaction_splits` |
| `0003_owner_segregation` | `owner_id` on statements, `account_owner_mappings`, `email` on participants |
| `0004_tags` | `tags`, `transaction_tags` |
| `0005_rules` | `rules` |
| `0006_budgets` | `budgets` |
| `0007_cashflow` | `amount_aud`, `exchange_rate_to_aud` on transactions; `exchange_rate_to_aud` on statements |

> `paperless_doc_id` on statements and the `uq_statements_paperless_doc_id` index were added directly (not tracked in a migration file).
>
> `owner_id` on transactions and `statement_id` made nullable were applied directly (March 2026) to support manual transaction entry without a fake statement.
>
> `reconciled_with_id` on transactions, `expense_metadata`, `rule_apply_runs`, `split_payments` were added directly and are covered by the Prisma schema but lack individual migration files.

---

## Known Gaps / TODOs

### Payment Provider tracking

Currently `merchant_normalized` conflates the *payment provider* with the *merchant*. Transactions processed through PayPal, Afterpay, Zip, Alipay, etc. end up with the provider as the merchant when the real merchant can't be recovered.

**What's been done so far:**

- PayPal entries that embed the merchant name (e.g. `PAYPAL *BUNNINGSGRO`) were cleaned up — the real merchant was extracted during the March 2026 consolidation pass.
- Pure PayPal/Afterpay/Zip entries where the merchant is unrecoverable were left as-is.
- A one-time SQL consolidation pass normalised ~50 merchant name variant groups (March 2026).

**Remaining work:**

1. **DB migration**: `ALTER TABLE transactions ADD COLUMN payment_provider text` and same on `transaction_overrides`.
2. **Gemini prompt**: add `payment_provider` to the `responseSchema` so the AI extracts it separately (`"PayPal"`, `"Afterpay"`, `"Zip"`, `null`, etc.) — the raw bank description usually contains enough signal.
3. **Backfill**: for existing transactions, derive `payment_provider` from `merchant_name` patterns (`PAYPAL *`, `AFTERPAY`, `ZIP/ZIPPAY`, `BPAY`).
4. **App**: surface `payment_provider` as a filter/column in the transactions view; exclude payment providers from merchant analytics so they don't inflate the merchant list.

---

## Deployment

Runs as a Docker container alongside the rest of the home lab stack. Build and deploy:

```bash
# From smarthome repo root
docker compose --env-file docker/common.env --env-file docker/finance/.env \
  -f docker/finance/docker-compose.yml up -d --build
```

The container uses Next.js standalone output. `@prisma/adapter-pg` and `pg` are listed in `serverExternalPackages` in `next.config.ts` so that Next.js leaves them out of the server bundle and loads them from `node_modules` at runtime, which the standalone output includes.
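
A minimal sketch of the relevant part of `next.config.ts`, assuming the file needs nothing beyond the two settings this section mentions. The real file may carry more options and would normally type the object with `NextConfig` from `next`; that import is left out here so the fragment stands alone.

```typescript
// next.config.ts (sketch, not the project's actual file)
// Only the two settings described in this section are shown.
const nextConfig = {
  // Emit a self-contained server under .next/standalone for the Docker image.
  output: "standalone",
  // Leave these packages unbundled so they are loaded with native require().
  serverExternalPackages: ["@prisma/adapter-pg", "pg"],
};

export default nextConfig;
```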