siddharthd b8cd1b0f89 fix(reconcile): prevent split/tag double-counting on reconciled transactions
Move splits, tags and overrides from manual to statement side on reconcile
(delete from manual after copying) instead of just copying. Add read-time
filter to exclude reconciled manual transactions from balance and shared
transaction queries. Also adds participant filter to shared expenses page.
2026-05-11 19:15:41 +10:00
2026-05-10 16:13:40 +10:00

Finance App

Personal finance tracker built on Next.js 16 (App Router), PostgreSQL, and Prisma. Bank statements are ingested automatically from Paperless-NGX via an N8N workflow that uses Gemini to extract structured data from PDF statements.

Stack

  • Frontend: Next.js 16 App Router, TypeScript, Tailwind CSS, Recharts
  • Backend: Next.js API routes, raw PostgreSQL via pg + @prisma/adapter-pg
  • Database: PostgreSQL (postgres-personal container)
  • Auth: X-Forwarded-User header (email) set by Traefik forward-auth → mapped to participants.email
  • Ingestion: N8N workflow → Gemini 2.5 Flash (PDF parsing) → PostgreSQL

Data Model

statements

The top-level document, one row per billing period per account.

Column Type Description
id int Primary key
bank_name text Normalised bank name (e.g. "American Express")
card_name text Product name (e.g. "Rewards Travel Adventures")
account_number text Account/card number (spaces stripped)
account_type text Raw account type string from statement
statement_type text Normalised type: Credit Card, Business Card, multi-currency account, etc.
account_holder_name text Name on the account if extracted
billing_start_date date Period start
billing_end_date date Period end — used as the deduplication anchor
opening_balance numeric Balance at start of period
closing_balance numeric Balance at end of period
total_credits numeric Sum of all credits in period
total_debits numeric Sum of all debits in period
total_amount_due numeric Amount due (credit cards)
minimum_amount_due numeric Minimum payment due (credit cards)
payment_due_date date Payment due date (credit cards)
credit_limit numeric Credit limit (credit cards)
available_credit numeric Available credit at statement date
interest_charged numeric Interest charged this period (from statement summary)
fees_charged numeric Fees charged this period (from statement summary)
currency text Statement currency (e.g. AUD, USD)
exchange_rate_to_aud numeric FX rate at ingestion time (live from open.er-api.com)
owner_id int FK → participants Which person owns this statement
paperless_doc_id int Paperless-NGX document ID — deduplication key
tier_used text AI model used for extraction (e.g. gemini-2.5-flash)
event_created bool Whether a Google Calendar reminder was created for payment due date

Deduplication: unique index on (bank_name, account_number, billing_end_date) prevents re-ingestion of the same period. paperless_doc_id has a separate unique index for Paperless-linked documents.

Credit card detection: statement_type ILIKE '%card%'


transactions

One row per line item within a statement. Cascade-deleted when the parent statement is deleted.

Column Type Description
id int Primary key
statement_id int FK → statements (nullable) Parent statement; NULL for manually-entered transactions
owner_id int FK → participants (nullable) Owner for manual transactions (no statement); statement-linked transactions derive owner from statements.owner_id
transaction_date date Date of transaction
description text Raw description from the statement
amount numeric Original amount in statement currency
amount_aud numeric AUD-converted amount (= amount if already AUD)
transaction_type text debit, credit, payment, refund, fee, interest, transfer
merchant_name text Raw merchant name extracted by Gemini
merchant_normalized text Cleaned/normalised merchant name (Gemini)
location text Location if present on statement
foreign_currency_amount numeric Original foreign amount if this was an FX transaction
foreign_currency_code text Foreign currency code (e.g. USD)
category text AI-assigned category (see category taxonomy below)
row_index int Position in statement — used for deduplication
reconciled_with_id int FK → transactions (nullable) Links a manually-entered transaction to its matching statement transaction after reconciliation
created_at timestamptz When the row was inserted — the "import date". For reconciled transactions the UI shows the original manual/CSV created_at, not the statement's

Deduplication: unique index on (statement_id, transaction_date, description, amount, row_index).

Analytics: all spend queries use amount_aud for cross-currency consistency. Split-adjusted queries apply amount_aud * share_percent / 100 where a split exists for the current user.


transaction_overrides

User corrections to AI-extracted data. Stored separately to preserve the original extraction.

Column Type Description
transaction_id int FK → transactions (unique) One override per transaction
merchant_normalized text User-corrected merchant name
category_override text User-corrected category
notes text Free-text notes

All analytics queries use COALESCE(o.category_override, t.category) and COALESCE(o.merchant_normalized, t.merchant_normalized, t.merchant_name) to prefer overrides over AI values.


transaction_splits

Shared expense tracking — records that a transaction was split between participants.

Column Type Description
transaction_id int FK → transactions The transaction being split
participant_id int FK → participants Who shares in this transaction
share_percent numeric(5,2) Their percentage (1100)
settled bool Whether this share has been settled
settled_at timestamptz When it was settled

A transaction can be split across multiple participants. The statement owner's own share is implicit (100 - SUM(other shares)). Analytics queries LEFT JOIN transaction_splits on participant_id = current_user.id — if no split row exists, the full amount belongs to the owner.


transaction_tags

Many-to-many join between transactions and tags.

Column Type
transaction_id int FK → transactions
tag_id int FK → tags

tags

User-defined coloured labels for ad-hoc transaction grouping beyond the fixed category taxonomy.

Column Type Description
id int Primary key
name text (unique) Tag name
color text Hex colour (default #6366f1)

participants

People who own statements or share expenses.

Column Type Description
id int Primary key
name text (unique) Display name
email text (unique) Login identity — matched against X-Forwarded-User header

account_owner_mappings

Persists (bank, account_number) → owner assignments so future ingestion auto-assigns the correct owner without manual intervention.

Column Type Description
bank_name text
account_number text
owner_id int FK → participants

Written when a user reassigns a statement owner in the UI. Consulted by the N8N workflow on every new statement insert.


rules

Saved auto-categorisation rules. Applied in bulk via the Rules page.

Column Type Description
owner_id int FK → participants Rule belongs to this user
name text Rule label
conditions jsonb Array of {field, operator, value} — AND logic
actions jsonb {set_category, add_tag_ids, set_merchant}
enabled bool
priority int Higher priority rules run first

Condition fields: merchant_normalized, description, category, bank_name, amount, transaction_type Condition operators: contains, equals, starts_with, gt, lt, not_equals Actions: set_category, set_merchant, add_tag_ids, apply_split


split_payments

Records of actual cash settlements between participants.

Column Type Description
from_participant_id int FK → participants Who paid
to_participant_id int FK → participants Who received
amount numeric Amount settled
payment_date date Date of settlement
notes text Optional note (e.g. "bank transfer")
linked_transaction_id int FK → transactions (nullable) If the payment was itself a transaction

expense_metadata

Enrichment records for non-statement expenses (email receipts, manual entries). Linked to a transaction if one exists; otherwise a standalone record awaiting reconciliation.

Column Type Description
transaction_id int FK → transactions (unique, nullable) Linked transaction; NULL until reconciled
source text Origin: email, manual
paperless_doc_id int Paperless-NGX document ID
payment_method text credit_card, debit_card, paypal, afterpay, cash, etc.
payment_method_detail text Card last-4 or provider detail
order_reference text Order/confirmation number
line_items jsonb Array of {description, qty, unit_price, total}
merchant_normalized text Canonical merchant for matching
amount / transaction_date numeric / date Used for reconciliation matching when transaction_id IS NULL
extraction_model text AI model used (gemini-2.5-flash)

Partial index on (merchant_normalized, transaction_date) WHERE transaction_id IS NULL powers reconciliation queries.


rule_apply_runs

Audit log of bulk rule-apply operations. Each run captures which transactions were affected and a full snapshot for revert support.

Column Type Description
owner_id int FK → participants
applied_at timestamptz When the run executed
split_from date Optional date filter used for this run
matched int Number of rules matched
transactions_affected int Number of transactions changed
reverted_at timestamptz Set when run was reverted
snapshot jsonb Pre-run state of all affected transactions

budgets

Monthly spend targets per category. Stored but currently unused in the UI (replaced by the analytics/insights views).

Column Type Description
owner_id int FK → participants
category text Category name
month date Always first of month (e.g. 2026-03-01)
amount_limit numeric Spend target for that category/month

Category Taxonomy

Fixed set defined in src/lib/categories.ts. Applied by Gemini at ingestion and overridable by the user or rules engine:

groceries · dining · transport · fuel · shopping · utilities · entertainment · travel · health · insurance · subscriptions · cash_advance · government · education · rent · home_goods · home_maintenance · transfers · income · investment · personal_care · pets · gifts · charity · other

  • home_goods — items purchased for the house (appliances, furniture, kitchenware, electronics)
  • home_maintenance — services on the property (cleaning, mowing, repairs)

Committed spend (Insights page): rent, utilities, insurance, subscriptions Excluded from spend analytics: transfers, investment


API Routes

All routes require authentication via X-Forwarded-User header (set by Traefik). Responses are always scoped to the authenticated user's owner_id.

Method Route Description
GET /api/statements All statements for current user
GET / PATCH /api/statements/[id] Get statement; PATCH to reassign owner (also writes account_owner_mappings)
GET /api/transactions Paginated transactions. Filters: from, to, categories, bank_names, tag_ids, transaction_types, search, statement_id, amount_min, amount_max, has_split (yes/no). Sort: sort_by (transaction_date|amount|created_at), sort_dir (asc|desc)
POST /api/transactions Create a manual transaction (no statement)
GET / PATCH /api/transactions/[id] Get transaction; PATCH to upsert override (category, merchant, notes)
GET / POST /api/transactions/[id]/splits List or create splits on a transaction
GET / POST /api/transactions/[id]/tags List or apply tags to a transaction
POST /api/transactions/bulk Bulk update category/merchant across multiple transactions
POST /api/transactions/reconcile Link manual transactions to statement transactions; copies overrides, tags, splits across
GET /api/analytics/monthly Split-adjusted monthly spend by category + income + investments. Params: months (124, default 6)
GET /api/analytics/subscriptions Recurring charge detection — merchants with ≥3 occurrences at consistent intervals
GET /api/analytics/fees Fees and interest from statement summaries + individual fee/interest transactions
GET /api/shared-transactions Transactions with active splits; sorted client-side by date/imported/amount in the UI
POST /api/splits/settle Mark a split as settled
GET / POST /api/split-payments List or record cash settlements between participants
GET / POST /api/participants List participants; POST to create (with optional email)
GET /api/participants/[id]/balance Net balance owed by/to a specific participant
GET /api/participants/balances All participant balances
GET / POST /api/rules List or create rules
PATCH / DELETE /api/rules/[id] Update or delete a rule
POST /api/rules/apply Run all enabled rules against all transactions; returns {matched, transactions_affected}
GET / POST /api/budgets List budgets for a month (?month=YYYY-MM); upsert budget
DELETE /api/budgets/[id] Delete a budget
GET /api/merchants Merchant name autocomplete suggestions
GET /api/me Current user info derived from X-Forwarded-User header
GET / POST /api/tags List or create tags
PATCH / DELETE /api/tags/[id] Update or delete a tag

Ingestion Pipeline

Paperless-NGX
  └─ documents tagged "Bank Statement" + "Credit Card" (without "cc-processor")
       │
       ▼
  N8N workflow — polls every 5 minutes (workflow ID: FysADdFwEtwONQl4)
       │
       ├─ Duplicate check: SELECT WHERE paperless_doc_id = <id>
       │    └─ Already processed → skip, mark in Paperless
       │
       ├─ Download PDF binary from Paperless API
       │
       ├─ Gemini 2.5 Flash — PDF → structured JSON
       │    responseSchema: { summary: {...}, transactions: [...] }
       │    timeout: 180s, retryOnFail: 3×, delay: 30s
       │
       ├─ Parse & normalise
       │    account_number: strip spaces
       │    bank_name: title-case
       │    FX rate: fetch live from open.er-api.com if non-AUD
       │
       ├─ Statement exists? (bank + account + billing_end_date)
       │    └─ Duplicate → skip, mark in Paperless
       │
       ├─ New bank? → Slack approval gate (human confirms before insert)
       │
       ├─ Lookup account_owner_mappings → resolve owner_id (default: 1 = "Me")
       │
       ├─ INSERT statements + transactions
       │
       ├─ Google Calendar reminder for payment_due_date (credit cards)
       │
       └─ Paperless: PATCH document to add "cc-processor" tag

N8N workflow JSON: docker/automation/workflows/cc-statement-processor-paperless.json in the smarthome repo.


Schema Migrations

Located in prisma/migrations/. Applied manually against the running container:

docker exec postgres-personal psql -U personal -d personal \
  < prisma/migrations/<migration>/migration.sql
Migration What it adds
0001_init statements, transactions, participants
0002_splits transaction_splits
0003_owner_segregation owner_id on statements, account_owner_mappings, email on participants
0004_tags tags, transaction_tags
0005_rules rules
0006_budgets budgets
0007_cashflow amount_aud, exchange_rate_to_aud on transactions; exchange_rate_to_aud on statements

paperless_doc_id on statements and the uq_statements_paperless_doc_id index were added directly (not tracked in a migration file). owner_id on transactions and statement_id made nullable were applied directly (March 2026) to support manual transaction entry without a fake statement. reconciled_with_id on transactions, expense_metadata, rule_apply_runs, split_payments were added directly and are covered by the Prisma schema but lack individual migration files.


Known Gaps / TODOs

Payment Provider tracking

Currently merchant_normalized conflates the payment provider with the merchant. Transactions processed through PayPal, Afterpay, Zip, Alipay, etc. end up with the provider as the merchant when the real merchant can't be recovered.

What's been done so far:

  • PayPal entries that embed the merchant name (e.g. PAYPAL *BUNNINGSGRO) were cleaned up — the real merchant was extracted during the March 2026 consolidation pass.
  • Pure PayPal/Afterpay/Zip entries where the merchant is unrecoverable were left as-is.
  • A one-time SQL consolidation pass normalised ~50 merchant name variant groups (March 2026).

Remaining work:

  1. DB migration: ALTER TABLE transactions ADD COLUMN payment_provider text and same on transaction_overrides.
  2. Gemini prompt: add payment_provider to the responseSchema so the AI extracts it separately ("PayPal", "Afterpay", "Zip", null, etc.) — the raw bank description usually contains enough signal.
  3. Backfill: for existing transactions, derive payment_provider from merchant_name patterns (PAYPAL *, AFTERPAY, ZIP/ZIPPAY, BPAY).
  4. App: surface payment_provider as a filter/column in the transactions view; exclude payment providers from merchant analytics so they don't inflate the merchant list.

Deployment

Runs as a Docker container alongside the rest of the home lab stack. Build and deploy:

# From smarthome repo root
docker compose --env-file docker/common.env --env-file docker/finance/.env \
  -f docker/finance/docker-compose.yml up -d --build

The container uses Next.js standalone output. @prisma/adapter-pg and pg are listed in serverExternalPackages in next.config.ts to ensure they are included in the standalone bundle.

S
Description
Personal Finance SPA — Next.js + Prisma
Readme 902 KiB
Languages
TypeScript 99.4%
Shell 0.3%
Dockerfile 0.2%
JavaScript 0.1%