Comptabilité Scripts
This repository contains Python scripts to process financial statements from various banks and financial institutions, extract transaction data, and categorize expenses.
Features
- Processes PDF and CSV financial statements from multiple sources
- Categorizes transactions automatically based on descriptions
- Generates expense summaries with percentages
- Optionally exports all transactions to CSV
- Supports Boursobank, American Express, Monabanq, Revolut, SNCF, and La Poste statements
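The categorization and percentage summary listed above could work roughly as follows. This is a minimal sketch, not the repository's actual code: the `CATEGORY_KEYWORDS` table, the function names, and the sample transactions are all hypothetical and only illustrate keyword matching on descriptions.

```python
from collections import defaultdict

# Hypothetical keyword rules; the real scripts define their own categories.
CATEGORY_KEYWORDS = {
    "Groceries": ["carrefour", "monoprix"],
    "Transport": ["sncf", "ratp"],
}


def categorize(description):
    """Return the first category whose keyword occurs in the description."""
    text = description.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "Other"


def summarize(transactions):
    """Map each category to (total, percent of all expenses).

    `transactions` is an iterable of (description, amount) pairs.
    """
    totals = defaultdict(float)
    for description, amount in transactions:
        totals[categorize(description)] += amount
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {
        category: (round(total, 2), round(100 * total / grand_total, 1))
        for category, total in totals.items()
    }


# Made-up sample data, for illustration only.
sample = [
    ("CARTE CARREFOUR PARIS", 60.0),
    ("SNCF INTERNET", 30.0),
    ("PHARMACIE", 10.0),
]
summary = summarize(sample)
```

Descriptions that match no keyword fall into an "Other" bucket, so the percentages always add up to the full total.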
Project Structure
comptabilite/
├── .gitignore
├── docs/                       # Documentation files
│   └── README.md
├── scripts/                    # All processing scripts
│   ├── process_bourso.py       # Boursobank account statements
│   ├── process_amex.py         # American Express credit card statements
│   ├── process_monabanq.py     # Monabanq account statements
│   ├── process_expenses.py     # Revolut account statements (CSV format)
│   ├── process_sncf.py         # SNCF salary statements
│   ├── process_laposte.py      # La Poste (CCP) account statements
│   └── process_all.py          # Master script that runs all processing scripts
├── data/                       # Input data
│   ├── pdf/                    # PDF statements by institution
│   │   ├── boursobank/
│   │   ├── american_express/
│   │   ├── monabanq/
│   │   ├── sncf/
│   │   ├── la_poste/
│   │   └── impots/
│   └── raw_csv/                # Raw CSV files (e.g., Revolut statements)
└── output/                     # Generated output
    ├── csv/                    # CSV exports of transactions
    └── reports/                # Financial reports and summaries
Scripts Overview
Main Scripts
- process_bourso.py - Processes Boursobank account statements
- process_amex.py - Processes American Express credit card statements
- process_monabanq.py - Processes Monabanq account statements
- process_expenses.py - Processes Revolut account statements (CSV format)
- process_sncf.py - Processes SNCF salary statements
- process_laposte.py - Processes La Poste (CCP) account statements
Master Scripts
- process_all.py - Runs all processing scripts with unified options
- export_all_csv.py - Exports CSV files for all account statements in one run
- dynamic_all_processor.py - Auto-discovers all PDF directories and processes them
- aggregate_by_month.py - Aggregates all account statements by month and creates reports
Usage
Individual Scripts
From the scripts/ directory:
# Process without CSV output
python process_bourso.py
# Process with CSV output
python process_bourso.py --csv
# Process all PDFs in a specific directory
python process_bourso.py --pdf-dir ../data/pdf/boursobank --output-dir ../output/csv --csv
Master Scripts
From the scripts/ directory, you can use these master scripts:
process_all.py
Process statements with the standard individual scripts:
# Process all statements
python process_all.py
# Process all statements with CSV output
python process_all.py --csv
# Process only specific accounts
python process_all.py --bourso --amex --csv
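How process_all.py turns these flags into work is not documented here. The following is a minimal sketch of one plausible approach, assuming each account flag maps to one script and that running with no account flag selects every account. Only `--csv`, `--bourso`, and `--amex` come from the usage above; the `ACCOUNT_SCRIPTS` table and the function names are hypothetical.

```python
import argparse

# Flag name -> script to run. Only these two flags appear in the usage
# examples; flags for the other accounts would be assumptions.
ACCOUNT_SCRIPTS = {
    "bourso": "process_bourso.py",
    "amex": "process_amex.py",
}


def build_parser():
    """Create a parser with --csv plus one on/off flag per account."""
    parser = argparse.ArgumentParser(description="Run the processing scripts")
    parser.add_argument("--csv", action="store_true", help="also write CSV output")
    for name in ACCOUNT_SCRIPTS:
        parser.add_argument("--" + name, action="store_true")
    return parser


def select_commands(argv):
    """Return the command lines that would be run for the given arguments."""
    args = build_parser().parse_args(argv)
    chosen = [name for name in ACCOUNT_SCRIPTS if getattr(args, name)]
    if not chosen:
        # No account flag given: process every known account.
        chosen = list(ACCOUNT_SCRIPTS)
    commands = []
    for name in chosen:
        command = ["python", ACCOUNT_SCRIPTS[name]]
        if args.csv:
            command.append("--csv")
        commands.append(command)
    return commands
```

Each returned list can be passed to `subprocess.run`; building the commands separately from running them keeps the selection logic easy to test.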
export_all_csv.py
Export CSV files for all account statements in one run:
# Export all account statements to CSV
python export_all_csv.py
# Export to specific output directory
python export_all_csv.py --output-dir /path/to/output
dynamic_all_processor.py (NEW)
Fully dynamic processor that auto-discovers all PDF directories and processes them:
# Automatically discover and process all accounts
python dynamic_all_processor.py
# Specify custom data directory
python dynamic_all_processor.py --data-dir /custom/path/to/pdfs
# Generate CSV outputs
python dynamic_all_processor.py --csv
# Custom data and output directories
python dynamic_all_processor.py --data-dir /path/to/pdfs --output-dir /path/to/output
Features:
- Automatically scans for any directory containing PDF files
- Processes all discovered directories without requiring predefined paths
- Handles special cases (Revolut uses CSV files, Impôts are skipped)
- Displays progress and results for each account type
- Provides clear success/failure feedback
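The directory discovery described above can be sketched as follows. This is an illustration under assumptions, not the actual dynamic_all_processor.py code: the function name `discover_pdf_dirs` and the `skip` parameter are hypothetical, and the Revolut CSV special case is not handled here.

```python
from pathlib import Path


def discover_pdf_dirs(data_dir, skip=("impots",)):
    """Return the sorted directories under data_dir that directly contain a PDF.

    Directories whose name is listed in `skip` are ignored, mirroring the
    note above that the Impôts statements are skipped.
    """
    found = set()
    for path in Path(data_dir).rglob("*"):
        is_pdf = path.is_file() and path.suffix.lower() == ".pdf"
        if is_pdf and path.parent.name not in skip:
            found.add(path.parent)
    return sorted(found)
```

Matching on the lowercased suffix means files named `.PDF` are found as well, and returning a sorted list keeps the processing order stable between runs.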
aggregate_by_month.py
Create monthly and yearly aggregated reports from all CSV files:
# Aggregate all transactions by month
python aggregate_by_month.py
# Specify input and output directories
python aggregate_by_month.py --input-dir /path/to/csv --output-dir /path/to/reports
# Create annual reports
python aggregate_by_month.py --annual
# Create annual report for a specific year
python aggregate_by_month.py --annual --year 2025
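The core of a monthly aggregation could look like the sketch below. It assumes the exported CSV has `Date`, `Category`, and `Amount` columns and that dates are written as YYYY-MM-DD; both are assumptions, since the exact header names and date format of the real exports are not specified here. The sample rows are made up.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime


def aggregate_by_month(rows, date_format="%Y-%m-%d"):
    """Sum amounts per (YYYY-MM, category) for an iterable of CSV row dicts."""
    totals = defaultdict(float)
    for row in rows:
        month = datetime.strptime(row["Date"], date_format).strftime("%Y-%m")
        totals[(month, row["Category"])] += float(row["Amount"])
    return dict(totals)


# Made-up sample data standing in for a file from output/csv/.
sample_csv = io.StringIO(
    "Date,Description,Category,Amount\n"
    "2025-01-03,CARREFOUR,Groceries,42.50\n"
    "2025-01-17,MONOPRIX,Groceries,7.50\n"
    "2025-02-01,SNCF,Transport,30.00\n"
)
monthly = aggregate_by_month(csv.DictReader(sample_csv))
```

An annual report follows the same pattern with `"%Y"` as the grouping key instead of `"%Y-%m"`.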
Output
When run with the --csv flag, each script generates:
- Individual CSV files for each input file (when applicable)
- A consolidated CSV file containing all transactions in the output/csv/ directory
The CSV files include:
- Date
- Description
- Category
- Amount/Debit/Credit
- Source file
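Writing rows with these columns can be sketched with the standard `csv` module. The exact header names and the choice of a single `Amount` column (rather than separate debit and credit columns) are assumptions, and the sample row is invented for illustration.

```python
import csv
import io

# Assumed header names based on the column list above.
FIELDNAMES = ["Date", "Description", "Category", "Amount", "Source file"]


def write_transactions(transactions, handle):
    """Write a header row followed by one row per transaction dict."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(transactions)


# An in-memory buffer stands in for a file opened with newline="".
buffer = io.StringIO()
write_transactions(
    [
        {
            "Date": "2025-01-03",
            "Description": "CARREFOUR",
            "Category": "Groceries",
            "Amount": "-42.50",
            "Source file": "releve_01.pdf",
        }
    ],
    buffer,
)
```

When writing to a real file, open it with `newline=""` so the `csv` module controls line endings itself.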
Requirements
- Python 3.6+
- pdftotext utility (for PDF processing)
- Python modules used: csv, subprocess, re, os, glob, collections (all part of the standard library, so nothing needs to be installed with pip)
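Calling pdftotext from Python typically goes through `subprocess`. The sketch below shows one way to do it and is not taken from the scripts themselves: the function names are hypothetical, and the use of the `-layout` option (which preserves the column layout of the page) is an assumption about how the statements are best extracted.

```python
import shutil
import subprocess


def pdftotext_command(pdf_path):
    """Build the command line; the trailing "-" sends the text to stdout."""
    return ["pdftotext", "-layout", str(pdf_path), "-"]


def extract_text(pdf_path):
    """Return the text of a PDF, or raise if pdftotext is missing or fails."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; install poppler first")
    result = subprocess.run(
        pdftotext_command(pdf_path),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text (works on Python 3.6)
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Checking for the binary up front gives a clearer error than the `FileNotFoundError` that `subprocess.run` would otherwise raise.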
Installation
- Install pdftotext:
  # Ubuntu/Debian
  sudo apt-get install poppler-utils
  # macOS
  brew install poppler
  # Windows
  # Download from https://github.com/xfftt/poppler-windows/releases/
- Clone or download this repository
CSV Export Feature
The scripts support exporting all transaction data to CSV format. This allows for:
- Further analysis in spreadsheet applications
- Data archiving
- Integration with other financial tools
- Transaction-level review and editing
To enable CSV export, add the --csv flag when running any script.
Organization
This project has been reorganized with a clean directory structure:
- Input files are separated from processing scripts
- All generated outputs go to the output/ directory
- Scripts are organized in their own directory
- Documentation is in the docs/ directory
Git Repository
This is a Git repository. To start tracking changes:
git add .
git commit -m "Your commit message"