Comptabilité Scripts
This repository contains Python scripts to process financial statements from various banks and financial institutions, extract transaction data, and categorize expenses.
Features
- Processes PDF and CSV financial statements from multiple sources
- Categorizes transactions automatically based on descriptions
- Generates expense summaries with percentages
- Optional CSV output for all transactions
- Support for Boursobank, American Express, Monabanq, Revolut, SNCF, and La Poste
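The automatic categorization and percentage summary listed above could be pictured roughly as follows. This is a minimal sketch, not the scripts' actual logic: the `CATEGORY_KEYWORDS` table, the `categorize` and `summarize` helpers, and the sample transactions are all hypothetical.

```python
from collections import defaultdict

# Hypothetical keyword-to-category rules; the real scripts may use different ones.
CATEGORY_KEYWORDS = {
    "Groceries": ["carrefour", "monoprix"],
    "Transport": ["sncf", "ratp"],
    "Subscriptions": ["netflix", "spotify"],
}


def categorize(description):
    """Return the first category whose keyword appears in the description."""
    text = description.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "Other"


def summarize(transactions):
    """Total expenses per category and compute each category's share in percent."""
    totals = defaultdict(float)
    for description, amount in transactions:
        totals[categorize(description)] += amount
    grand_total = sum(totals.values())
    return {
        category: (total, 100 * total / grand_total if grand_total else 0.0)
        for category, total in totals.items()
    }


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        ("CB CARREFOUR CITY PARIS", 50.0),
        ("SNCF INTERNET", 30.0),
        ("PHARMACIE DU CENTRE", 20.0),
    ]
    for category, (total, share) in summarize(sample).items():
        print(f"{category:<15} {total:>8.2f} EUR {share:5.1f}%")
```

Substring matching on a lowercased description is the simplest possible rule; anything that matches no keyword falls into a catch-all "Other" category.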
Project Structure
comptabilite/
├── .gitignore
├── docs/ # Documentation files
│ └── README.md
├── scripts/ # All processing scripts
│ ├── process_bourso.py # Boursobank account statements
│ ├── process_amex.py # American Express credit card statements
│ ├── process_monabanq.py # Monabanq account statements
│ ├── process_expenses.py # Revolut account statements (CSV format)
│ ├── process_sncf.py # SNCF salary statements
│ ├── process_laposte.py # La Poste (CCP) account statements
│ └── process_all.py # Master script that runs all processing scripts
├── data/ # Input data
│ ├── pdf/ # PDF statements by institution
│ │ ├── boursobank/
│ │ ├── american_express/
│ │ ├── monabanq/
│ │ ├── sncf/
│ │ ├── la_poste/
│ │ └── impots/
│ └── raw_csv/ # Raw CSV files (e.g., Revolut statements)
└── output/ # Generated output
    ├── csv/ # CSV exports of transactions
    └── reports/ # Financial reports and summaries
Scripts Overview
Main Scripts
- process_bourso.py - Processes Boursobank account statements
- process_amex.py - Processes American Express credit card statements
- process_monabanq.py - Processes Monabanq account statements
- process_expenses.py - Processes Revolut account statements (CSV format)
- process_sncf.py - Processes SNCF salary statements
- process_laposte.py - Processes La Poste (CCP) account statements
Master Script
- process_all.py - Master script that runs all processing scripts with unified options
Usage
Individual Scripts
From the scripts/ directory:
# Process without CSV output
python process_bourso.py
# Process with CSV output
python process_bourso.py --csv
# Process all PDFs in a specific directory
python process_bourso.py --pdf-dir ../data/pdf/boursobank --output-dir ../output/csv --csv
Master Script
From the scripts/ directory, the master script can process all statements at once:
# Process all statements
python process_all.py
# Process all statements with CSV output
python process_all.py --csv
# Process only specific accounts
python process_all.py --bourso --amex --csv
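One way a master script could turn these flags into per-script commands is sketched below. It is an assumption-laden illustration, not the actual `process_all.py`: the `SCRIPTS` mapping, `build_parser`, and `plan_commands` are hypothetical names, and only the `--bourso`, `--amex`, and `--csv` flags shown above are covered.

```python
import argparse

# Hypothetical mapping from account flag to script file. Only the flags shown
# in the usage examples are listed; the real master script likely has more.
SCRIPTS = {
    "bourso": "process_bourso.py",
    "amex": "process_amex.py",
}


def build_parser():
    """Create a parser with one on/off flag per account plus --csv."""
    parser = argparse.ArgumentParser(description="Run statement processing scripts")
    parser.add_argument("--csv", action="store_true", help="also export CSV files")
    for name in SCRIPTS:
        parser.add_argument(f"--{name}", action="store_true", help=f"run {SCRIPTS[name]}")
    return parser


def plan_commands(argv, python="python"):
    """Return the list of commands to run for the given arguments."""
    args = build_parser().parse_args(argv)
    selected = [name for name in SCRIPTS if getattr(args, name)]
    if not selected:
        # No account flag given: process every account.
        selected = list(SCRIPTS)
    commands = []
    for name in selected:
        command = [python, SCRIPTS[name]]
        if args.csv:
            command.append("--csv")
        commands.append(command)
    return commands


if __name__ == "__main__":
    for command in plan_commands(["--bourso", "--csv"]):
        print(" ".join(command))
```

Each planned command could then be handed to `subprocess.run(command, check=True)`. Keeping the planning step separate from execution makes the flag handling easy to check without running any script.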
Output
When run with the --csv flag, each script generates:
- Individual CSV files for each input file (when applicable)
- A consolidated CSV file containing all transactions in the output/csv/ directory
The CSV files include:
- Date
- Description
- Category
- Amount/Debit/Credit
- Source file
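For illustration, a file with these columns could be written and read back with the standard `csv` module as sketched here. The exact header spelling is an assumption, `Amount` stands in for the Amount/Debit/Credit variants, and the sample row is invented.

```python
import csv
import io

# Column names follow the list above; the exact header spelling used by the
# scripts is an assumption.
FIELDNAMES = ["Date", "Description", "Category", "Amount", "Source file"]


def write_transactions(rows, handle):
    """Write transaction dictionaries as CSV with a header row."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


def read_transactions(handle):
    """Read the CSV back into a list of dictionaries (all values are strings)."""
    return list(csv.DictReader(handle))


# An in-memory buffer stands in for a file under output/csv/.
buffer = io.StringIO()
write_transactions(
    [
        {
            "Date": "12/03/2024",
            "Description": "SNCF INTERNET",
            "Category": "Transport",
            "Amount": "-30.00",
            "Source file": "statement_2024_03.pdf",  # made-up file name
        }
    ],
    buffer,
)
buffer.seek(0)
rows = read_transactions(buffer)
```

With a real file, open it with `newline=""` as the `csv` module documentation recommends, so that line endings are not translated twice.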
Requirements
- Python 3.6+
- pdftotext utility (for PDF processing)
- Python modules used: csv, subprocess, re, os, glob, collections (all part of the standard library)
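To show how pdftotext, subprocess, and re might fit together, here is a hedged sketch. The `-layout` option and the `-` output argument are real pdftotext features, but the helper names, the `LINE_PATTERN` regex, and the assumed line layout (date, description, amount with a decimal comma) are hypothetical; each bank's statements will need their own pattern.

```python
import re
import subprocess


def pdftotext_command(pdf_path):
    """Build the pdftotext call: -layout keeps columns aligned, '-' writes to stdout."""
    return ["pdftotext", "-layout", pdf_path, "-"]


def extract_text(pdf_path):
    """Run pdftotext and return the extracted text (requires poppler-utils)."""
    result = subprocess.run(
        pdftotext_command(pdf_path), stdout=subprocess.PIPE, check=True
    )
    return result.stdout.decode("utf-8", errors="replace")


# Hypothetical line layout: "DD/MM/YYYY  description  1 234,56".
LINE_PATTERN = re.compile(r"^\s*(\d{2}/\d{2}/\d{4})\s+(.+?)\s+(-?\d[\d ]*,\d{2})\s*$")


def parse_line(line):
    """Return (date, description, amount) or None if the line is not a transaction."""
    match = LINE_PATTERN.match(line)
    if not match:
        return None
    date, description, amount = match.groups()
    # Convert a French-style amount such as "1 234,56" to a float.
    return date, description, float(amount.replace(" ", "").replace(",", "."))


if __name__ == "__main__":
    print(parse_line("12/03/2024   CB CARREFOUR CITY      45,90"))
```

`stdout=subprocess.PIPE` is used instead of `capture_output=True` so the sketch stays compatible with Python 3.6.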
Installation
- Install pdftotext:
# Ubuntu/Debian
sudo apt-get install poppler-utils
# macOS
brew install poppler
# Windows
# Download from https://github.com/xfftt/poppler-windows/releases/
- Clone or download this repository
CSV Export Feature
The scripts support exporting all transaction data to CSV format. This allows for:
- Further analysis in spreadsheet applications
- Data archiving
- Integration with other financial tools
- Transaction-level review and editing
To enable CSV export, add the --csv flag when running any script.
Organization
This project has been reorganized with a clean directory structure:
- Input files are separated from processing scripts
- All generated outputs go to the output/ directory
- Scripts are organized in their own directory
- Documentation is in the docs/ directory
Git Repository
This is a Git repository. To start tracking changes:
git add .
git commit -m "Your commit message"