Reorganize project structure with separate directories for scripts, data, and output

Kevin Bataille
2026-02-09 10:20:55 +01:00
parent acb1276b38
commit 73ff2b70f7
125 changed files with 1786 additions and 56 deletions

docs/README.md Normal file

@@ -0,0 +1,149 @@
# Comptabilité Scripts
This repository contains Python scripts to process financial statements from various banks and financial institutions, extract transaction data, and categorize expenses.
## Features
- Processes PDF and CSV financial statements from multiple sources
- Categorizes transactions automatically based on descriptions
- Generates expense summaries with percentages
- Optionally exports all transactions to CSV
- Supports Boursobank, American Express, Monabanq, Revolut, SNCF, and La Poste
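
As an illustration of the categorization and summary features listed above, here is a minimal sketch of keyword-based categorization with per-category percentages. The keyword table, the `categorize` and `summarize` helpers, and the sample transactions are all hypothetical; the actual scripts may match descriptions differently.

```python
from collections import defaultdict

# Hypothetical keyword-to-category rules; the real scripts may use different ones.
CATEGORY_KEYWORDS = {
    "Groceries": ["carrefour", "monoprix"],
    "Transport": ["sncf", "ratp"],
    "Subscriptions": ["netflix", "spotify"],
}


def categorize(description):
    """Return the first category whose keyword appears in the description."""
    lowered = description.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    return "Other"


def summarize(transactions):
    """Map each category to (total, percentage of overall spending)."""
    totals = defaultdict(float)
    for description, amount in transactions:
        totals[categorize(description)] += amount
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {
        category: (total, 100 * total / grand_total)
        for category, total in totals.items()
    }


# Made-up sample data, for illustration only.
transactions = [
    ("CARTE CARREFOUR CITY", 40.0),
    ("SNCF INTERNET", 50.0),
    ("PRLV NETFLIX", 10.0),
]
for category, (total, percent) in summarize(transactions).items():
    print(f"{category}: {total:.2f} ({percent:.1f}%)")
```

A first-match rule keeps the logic easy to follow, but the order of the keyword table then decides which category wins when a description matches several.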
## Project Structure
```
comptabilite/
├── .gitignore
├── docs/                       # Documentation files
│   └── README.md
├── scripts/                    # All processing scripts
│   ├── process_bourso.py       # Boursobank account statements
│   ├── process_amex.py         # American Express credit card statements
│   ├── process_monabanq.py     # Monabanq account statements
│   ├── process_expenses.py     # Revolut account statements (CSV format)
│   ├── process_sncf.py         # SNCF salary statements
│   ├── process_laposte.py      # La Poste (CCP) account statements
│   └── process_all.py          # Master script that runs all processing scripts
├── data/                       # Input data
│   ├── pdf/                    # PDF statements by institution
│   │   ├── boursobank/
│   │   ├── american_express/
│   │   ├── monabanq/
│   │   ├── sncf/
│   │   ├── la_poste/
│   │   └── impots/
│   └── raw_csv/                # Raw CSV files (e.g., Revolut statements)
└── output/                     # Generated output
    ├── csv/                    # CSV exports of transactions
    └── reports/                # Financial reports and summaries
```
## Scripts Overview
### Main Scripts
1. **process_bourso.py** - Processes Boursobank account statements
2. **process_amex.py** - Processes American Express credit card statements
3. **process_monabanq.py** - Processes Monabanq account statements
4. **process_expenses.py** - Processes Revolut account statements (CSV format)
5. **process_sncf.py** - Processes SNCF salary statements
6. **process_laposte.py** - Processes La Poste (CCP) account statements
### Master Script
- **process_all.py** - Master script that runs all processing scripts with unified options
## Usage
### Individual Scripts
From the `scripts/` directory:
```bash
# Process without CSV output
python process_bourso.py
# Process with CSV output
python process_bourso.py --csv
# Process all PDFs in a specific directory
python process_bourso.py --pdf-dir ../data/pdf/boursobank --output-dir ../output/csv --csv
```
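
The flags used above (`--csv`, `--pdf-dir`, `--output-dir`) could be parsed along the following lines. This is only a sketch: the default paths and the `build_parser` and `find_statements` helpers are assumptions, not the actual contents of `process_bourso.py`.

```python
import argparse
import glob
import os


def build_parser():
    """Parser exposing the three flags shown in the usage examples."""
    parser = argparse.ArgumentParser(description="Process Boursobank statements")
    # Default paths are assumptions, taken relative to the scripts/ directory.
    parser.add_argument("--pdf-dir", default="../data/pdf/boursobank",
                        help="directory containing the PDF statements")
    parser.add_argument("--output-dir", default="../output/csv",
                        help="directory for CSV exports")
    parser.add_argument("--csv", action="store_true",
                        help="also write transactions to CSV")
    return parser


def find_statements(pdf_dir):
    """Return the PDF files in pdf_dir, sorted for a stable processing order."""
    return sorted(glob.glob(os.path.join(pdf_dir, "*.pdf")))


# Parse an explicit argument list instead of sys.argv to keep the demo self-contained.
args = build_parser().parse_args(["--pdf-dir", "../data/pdf/boursobank", "--csv"])
print(args.pdf_dir, args.output_dir, args.csv)
```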
### Master Script
From the `scripts/` directory, the master script can process all statements at once:
```bash
# Process all statements
python process_all.py
# Process all statements with CSV output
python process_all.py --csv
# Process only specific accounts
python process_all.py --bourso --amex --csv
```
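
One way a master script like this could dispatch to the individual scripts is sketched below. The `--bourso`, `--amex`, and `--csv` flags appear in the examples above; the other selection flags, the flag-to-script mapping, and the `build_commands` helper are assumptions made for illustration.

```python
import argparse
import sys

# Script names come from the project structure; the flag names other than
# --bourso and --amex are assumed.
SCRIPTS = {
    "bourso": "process_bourso.py",
    "amex": "process_amex.py",
    "monabanq": "process_monabanq.py",
    "revolut": "process_expenses.py",
    "sncf": "process_sncf.py",
    "laposte": "process_laposte.py",
}


def build_parser():
    """One selection flag per script, plus a shared --csv flag."""
    parser = argparse.ArgumentParser(description="Run the statement processing scripts")
    for name in SCRIPTS:
        parser.add_argument(f"--{name}", action="store_true",
                            help=f"include {SCRIPTS[name]}")
    parser.add_argument("--csv", action="store_true",
                        help="forward --csv to each script")
    return parser


def build_commands(args):
    """Return one command per selected script; no selection means all scripts."""
    selected = [name for name in SCRIPTS if getattr(args, name)]
    if not selected:
        selected = list(SCRIPTS)
    commands = []
    for name in selected:
        command = [sys.executable, SCRIPTS[name]]
        if args.csv:
            command.append("--csv")
        commands.append(command)
    return commands


args = build_parser().parse_args(["--bourso", "--amex", "--csv"])
for command in build_commands(args):
    # subprocess.run(command, check=True) would execute each script in turn.
    print(" ".join(command))
```

Building the command lists separately from running them makes the selection logic easy to check without launching any script.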
## Output
When run with the `--csv` flag, each script generates:
- Individual CSV files for each input file (when applicable)
- A consolidated CSV file containing all transactions in the `output/csv/` directory
The CSV files include:
- Date
- Description
- Category
- Amount/Debit/Credit
- Source file
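
A CSV with these fields can be written with the standard `csv` module, as in the sketch below. The exact header names, the `write_transactions` helper, and the sample row are assumptions; the real scripts may name or order the columns differently.

```python
import csv
import io

# Assumed header names, based on the fields listed above.
FIELDNAMES = ["Date", "Description", "Category", "Amount", "Source"]


def write_transactions(rows, handle):
    """Write a header line followed by one line per transaction."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


# Made-up sample row, for illustration only.
rows = [
    {
        "Date": "2026-01-15",
        "Description": "CARTE CARREFOUR CITY",
        "Category": "Groceries",
        "Amount": "-23.40",
        "Source": "releve_01.pdf",
    },
]
buffer = io.StringIO()
write_transactions(rows, buffer)
print(buffer.getvalue())
```

When writing to a real file rather than an in-memory buffer, open it with `newline=""` so the `csv` module controls the line endings.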
## Requirements
- Python 3.6+
- pdftotext utility (for PDF processing)
- Python modules used: csv, subprocess, re, os, glob, collections (all part of the standard library, so nothing needs to be installed with pip)
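
For context, calling `pdftotext` from Python through `subprocess` might look like the sketch below. The helper names are hypothetical, and the use of the `-layout` option is an assumption; the scripts in this repository may invoke the tool with different options.

```python
import shutil
import subprocess


def pdftotext_command(pdf_path):
    """Build the command line: -layout keeps the page layout, '-' writes to stdout."""
    return ["pdftotext", "-layout", pdf_path, "-"]


def extract_text(pdf_path):
    """Return the text of a PDF, or raise if pdftotext is not installed."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; install poppler-utils first")
    # stdout=subprocess.PIPE is used instead of capture_output for Python 3.6 support.
    result = subprocess.run(pdftotext_command(pdf_path),
                            stdout=subprocess.PIPE, check=True)
    return result.stdout.decode("utf-8", errors="replace")


print(pdftotext_command("statement.pdf"))
```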
## Installation
1. Install pdftotext:
```bash
# Ubuntu/Debian
sudo apt-get install poppler-utils
# macOS
brew install poppler
# Windows
# Download from https://github.com/xfftt/poppler-windows/releases/
```
2. Clone or download this repository
## CSV Export Feature
The scripts support exporting all transaction data to CSV format. This allows for:
- Further analysis in spreadsheet applications
- Data archiving
- Integration with other financial tools
- Transaction-level review and editing
To enable CSV export, add the `--csv` flag when running any script.
## Organization
The project has been reorganized so that inputs, code, and outputs live in separate directories:
- Input files are separated from processing scripts
- All generated outputs go to the `output/` directory
- Scripts are organized in their own directory
- Documentation is in the `docs/` directory
## Git Repository
The project is tracked with Git. To record your changes:
```bash
git add .
git commit -m "Your commit message"
```