Comptabilité Scripts

This repository contains Python scripts to process financial statements from various banks and financial institutions, extract transaction data, and categorize expenses.

Features

  • Processes PDF and CSV financial statements from multiple sources
  • Categorizes transactions automatically based on descriptions
  • Generates expense summaries with percentages
  • Optionally exports all transactions to CSV
  • Supports Boursobank, American Express, Monabanq, Revolut, SNCF, and La Poste
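
To make the categorization and percentage features more concrete, here is a minimal sketch of keyword-based categorization followed by a per-category summary. The keyword table, the category names, and the function names `categorize` and `summarize` are illustrative assumptions, not the rules the scripts actually use.

```python
from collections import defaultdict

# Hypothetical keyword rules; the real scripts may use different categories.
CATEGORY_KEYWORDS = {
    "Groceries": ["carrefour", "monoprix"],
    "Transport": ["sncf", "ratp"],
    "Subscriptions": ["netflix", "spotify"],
}


def categorize(description):
    """Return the first category whose keyword appears in the description."""
    lowered = description.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    return "Other"


def summarize(transactions):
    """Map each category to (total, percentage of overall spending)."""
    totals = defaultdict(float)
    for description, amount in transactions:
        totals[categorize(description)] += amount
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {
        category: (total, 100 * total / grand_total)
        for category, total in totals.items()
    }
```

Matching is a case-insensitive substring test, so the order of the rules decides which category wins when a description matches several keywords.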

Project Structure

comptabilite/
├── .gitignore
├── docs/                    # Documentation files
│   └── README.md
├── scripts/                 # All processing scripts
│   ├── process_bourso.py     # Boursobank account statements
│   ├── process_amex.py       # American Express credit card statements
│   ├── process_monabanq.py   # Monabanq account statements
│   ├── process_expenses.py   # Revolut account statements (CSV format)
│   ├── process_sncf.py       # SNCF salary statements
│   ├── process_laposte.py    # La Poste (CCP) account statements
│   └── process_all.py        # Master script that runs all processing scripts
├── data/                    # Input data
│   ├── pdf/                 # PDF statements by institution
│   │   ├── boursobank/
│   │   ├── american_express/
│   │   ├── monabanq/
│   │   ├── sncf/
│   │   ├── la_poste/
│   │   └── impots/
│   └── raw_csv/             # Raw CSV files (e.g., Revolut statements)
└── output/                  # Generated output
    ├── csv/                  # CSV exports of transactions
    └── reports/              # Financial reports and summaries

Scripts Overview

Main Scripts

  1. process_bourso.py - Processes Boursobank account statements
  2. process_amex.py - Processes American Express credit card statements
  3. process_monabanq.py - Processes Monabanq account statements
  4. process_expenses.py - Processes Revolut account statements (CSV format)
  5. process_sncf.py - Processes SNCF salary statements
  6. process_laposte.py - Processes La Poste (CCP) account statements

Master Scripts

  • process_all.py - Runs all processing scripts with unified options
  • export_all_csv.py - Exports CSV files for all account statements in one run
  • dynamic_all_processor.py - Auto-discovers all PDF directories and processes them
  • aggregate_by_month.py - Aggregates all account statements by month and creates reports

Usage

Individual Scripts

From the scripts/ directory:

# Process without CSV output
python process_bourso.py

# Process with CSV output
python process_bourso.py --csv

# Process all PDFs in a specific directory
python process_bourso.py --pdf-dir ../data/pdf/boursobank --output-dir ../output/csv --csv
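
The three options used above could be declared with `argparse` roughly as follows. This is a sketch only: the flag names come from the commands above, while the default paths, the help texts, and the function name `build_parser` are assumptions.

```python
import argparse


def build_parser():
    """Declare the --csv, --pdf-dir and --output-dir options."""
    parser = argparse.ArgumentParser(
        description="Process Boursobank statements (illustrative sketch)"
    )
    parser.add_argument(
        "--csv", action="store_true", help="also write transactions to CSV"
    )
    # Assumed defaults, chosen to match the project layout when run from scripts/.
    parser.add_argument(
        "--pdf-dir", default="../data/pdf/boursobank", help="directory with PDFs"
    )
    parser.add_argument(
        "--output-dir", default="../output/csv", help="directory for CSV files"
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.csv, args.pdf_dir, args.output_dir)
```

`argparse` turns `--pdf-dir` into the attribute `args.pdf_dir`, and `--csv` stays `False` unless the flag is given.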

Master Scripts

From the scripts/ directory, you can use these master scripts:

process_all.py

Process statements with the standard individual scripts:

# Process all statements
python process_all.py

# Process all statements with CSV output
python process_all.py --csv

# Process only specific accounts
python process_all.py --bourso --amex --csv
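
One way a master script could turn these flags into calls to the individual scripts is sketched below. Only `--bourso`, `--amex`, and `--csv` appear in this README; the other flag names and the helpers `select_commands` and `run_all` are hypothetical.

```python
import argparse
import subprocess
import sys

# Flag name -> script. Flags other than "bourso" and "amex" are assumed names.
SCRIPTS = {
    "bourso": "process_bourso.py",
    "amex": "process_amex.py",
    "monabanq": "process_monabanq.py",
    "revolut": "process_expenses.py",
    "sncf": "process_sncf.py",
    "laposte": "process_laposte.py",
}


def select_commands(argv):
    """Build one command per selected script; no selection means all scripts."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--csv", action="store_true")
    for flag in SCRIPTS:
        parser.add_argument(f"--{flag}", action="store_true")
    args = parser.parse_args(argv)
    chosen = [flag for flag in SCRIPTS if getattr(args, flag)] or list(SCRIPTS)
    extra = ["--csv"] if args.csv else []
    return [[sys.executable, SCRIPTS[flag], *extra] for flag in chosen]


def run_all(argv):
    """Run each selected script in turn and report success or failure."""
    for command in select_commands(argv):
        result = subprocess.run(command)
        status = "ok" if result.returncode == 0 else "failed"
        print(f"{command[1]}: {status}")


if __name__ == "__main__":
    run_all(sys.argv[1:])
```

Running each script as a subprocess keeps the individual scripts usable on their own, and a failure in one script does not stop the others.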

export_all_csv.py

Export CSV files for all account statements in one run:

# Export all account statements to CSV
python export_all_csv.py

# Export to specific output directory
python export_all_csv.py --output-dir /path/to/output

dynamic_all_processor.py (NEW)

Fully dynamic processor that auto-discovers all PDF directories and processes them:

# Automatically discover and process all accounts
python dynamic_all_processor.py

# Specify custom data directory
python dynamic_all_processor.py --data-dir /custom/path/to/pdfs

# Generate CSV outputs
python dynamic_all_processor.py --csv

# Custom data and output directories
python dynamic_all_processor.py --data-dir /path/to/pdfs --output-dir /path/to/output

Features:

  • Automatically scans for any directory containing PDF files
  • Processes all discovered directories without requiring predefined paths
  • Handles special cases (Revolut uses CSV files, Impôts are skipped)
  • Displays progress and results for each account type
  • Provides clear success/failure feedback
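
The discovery step described above might look like the following sketch. The function name `discover_pdf_dirs` and the `SKIPPED_DIRS` set are assumptions. Only the Impôts skip is modelled; the Revolut special case is left out.

```python
from pathlib import Path

# Directory names to ignore; the README states that Impôts are skipped.
SKIPPED_DIRS = {"impots"}


def discover_pdf_dirs(data_dir):
    """Return the sorted directories under data_dir that directly hold PDFs."""
    found = set()
    for pdf in Path(data_dir).rglob("*.pdf"):
        if pdf.parent.name.lower() not in SKIPPED_DIRS:
            found.add(pdf.parent)
    return sorted(found)


if __name__ == "__main__":
    for directory in discover_pdf_dirs("../data/pdf"):
        print(directory)
```

Note that `rglob("*.pdf")` is case-sensitive on most Linux file systems, so files ending in `.PDF` would need a second pattern.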

aggregate_by_month.py

Create monthly and yearly aggregated reports from all CSV files:

# Aggregate all transactions by month
python aggregate_by_month.py

# Specify input and output directories
python aggregate_by_month.py --input-dir /path/to/csv --output-dir /path/to/reports

# Create annual reports
python aggregate_by_month.py --annual

# Create annual report for a specific year
python aggregate_by_month.py --annual --year 2025
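
As an illustration of the aggregation idea, the sketch below sums amounts per month and category. It assumes the columns `Date`, `Category`, and `Amount` and an ISO date format (`YYYY-MM-DD`); the actual exports may use other names or formats, and `monthly_totals` is a hypothetical helper.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime


def monthly_totals(csv_file, date_format="%Y-%m-%d"):
    """Sum the Amount column per (YYYY-MM, Category) pair."""
    totals = defaultdict(float)
    for row in csv.DictReader(csv_file):
        month = datetime.strptime(row["Date"], date_format).strftime("%Y-%m")
        totals[(month, row["Category"])] += float(row["Amount"])
    return dict(totals)


if __name__ == "__main__":
    sample = io.StringIO(
        "Date,Description,Category,Amount\n"
        "2025-01-03,A,Food,10.5\n"
        "2025-01-20,B,Food,4.5\n"
        "2025-02-01,C,Transport,20\n"
    )
    for key, total in sorted(monthly_totals(sample).items()):
        print(key, total)
```

An annual report could reuse the same loop with `"%Y"` in place of `"%Y-%m"` as the grouping key.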

Output

When run with the --csv flag, each script generates:

  • Individual CSV files for each input file (when applicable)
  • A consolidated CSV file containing all transactions in the output/csv/ directory

The CSV files include:

  • Date
  • Description
  • Category
  • Amount/Debit/Credit
  • Source file
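
A file with these columns can be written with the standard `csv` module, as in this sketch. The exact header spellings and the use of a single `Amount` column (rather than separate Debit and Credit columns) are assumptions, as is the helper name `write_transactions`.

```python
import csv
import io

# Assumed header names, based on the column list above.
FIELDNAMES = ["Date", "Description", "Category", "Amount", "Source file"]


def write_transactions(rows, stream):
    """Write a header line followed by one line per transaction dict."""
    writer = csv.DictWriter(stream, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    buffer = io.StringIO()
    write_transactions(
        [
            {
                "Date": "2025-01-03",
                "Description": "CARREFOUR",
                "Category": "Groceries",
                "Amount": -30.0,
                "Source file": "releve.pdf",
            }
        ],
        buffer,
    )
    print(buffer.getvalue())
```

When writing to a real file, open it with `newline=""` so that the `csv` module controls the line endings.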

Requirements

  • Python 3.6+
  • pdftotext utility (for PDF processing)
  • Python modules used: csv, subprocess, re, os, glob, collections (all part of the standard library, so no separate installation is needed)

Installation

  1. Install pdftotext:

    # Ubuntu/Debian
    sudo apt-get install poppler-utils
    
    # macOS
    brew install poppler
    
    # Windows
    # Download from https://github.com/xfftt/poppler-windows/releases/
    
  2. Clone or download this repository
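
To check that pdftotext is reachable after installation, a small wrapper like the one below can help. It is a sketch: the helper names are hypothetical, and whether the scripts pass `-layout` is an assumption. The trailing `-` tells pdftotext to write the text to standard output.

```python
import shutil
import subprocess


def pdftotext_command(pdf_path):
    """Build a pdftotext call that keeps the layout and writes to stdout."""
    return ["pdftotext", "-layout", pdf_path, "-"]


def extract_text(pdf_path):
    """Return the text of a PDF, or raise if pdftotext is not installed."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; install poppler first")
    result = subprocess.run(
        pdftotext_command(pdf_path), stdout=subprocess.PIPE, check=True
    )
    return result.stdout.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print("pdftotext available:", shutil.which("pdftotext") is not None)
```

`stdout=subprocess.PIPE` is used instead of `capture_output=True` so that the sketch also runs on Python 3.6.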

CSV Export Feature

The scripts support exporting all transaction data to CSV format. This allows for:

  • Further analysis in spreadsheet applications
  • Data archiving
  • Integration with other financial tools
  • Transaction-level review and editing

To enable CSV export, add the --csv flag when running any script.

Organization

This project has been reorganized with a clean directory structure:

  • Input files are separated from processing scripts
  • All generated outputs go to the output/ directory
  • Scripts are organized in their own directory
  • Documentation is in the docs/ directory

Git Repository

This is a Git repository. To start tracking changes:

git add .
git commit -m "Your commit message"