diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
new file mode 100644
index 0000000..c0c0b3e
--- /dev/null
+++ b/ARCHITECTURE.md
@@ -0,0 +1,196 @@
+# SEO Automation - Architecture Overview
+
+## Integrated Application Structure
+
+The SEO automation tool has been refactored into a proper Python package with a clean, modular architecture inspired by Ruby on Rails and modern Python best practices.
+
+## Package Structure
+
+```
+seo/
+├── seo                      # Main executable (entry point)
+├── src/seo/                 # Main package
+│   ├── __init__.py          # Package initialization
+│   ├── cli.py               # CLI interface
+│   ├── app.py               # Main application class
+│   ├── config.py            # Configuration
+│   ├── exporter.py          # Export functionality
+│   ├── analyzer.py          # AI analysis
+│   ├── recategorizer.py     # Recategorization
+│   ├── seo_checker.py       # SEO checking
+│   ├── categories.py        # Category management
+│   └── approval.py          # Approval system
+├── scripts/                 # Legacy scripts (deprecated)
+├── setup.py                 # Package installation
+├── setup.cfg                # Package configuration
+└── config.yaml              # Application config
+```
+
+## Application Architecture
+
+### Main Application Class (SEOApp)
+
+The `SEOApp` class in `app.py` is the heart of the application, providing a unified, Rails-inspired API:
+
+```python
+from seo.app import SEOApp
+
+# Initialize
+app = SEOApp(verbose=True)
+
+# Use methods directly
+app.export()
+app.analyze()
+app.seo_check()
+app.categories()
+app.approve()
+
+# Or run full pipeline
+app.full_pipeline()
+```
+
+### Design Principles
+
+1. **Convention Over Configuration**: Sensible defaults, minimal configuration needed
+2. **DRY (Don't Repeat Yourself)**: Shared functionality in base classes
+3. **Separation of Concerns**: Each module has a single responsibility
+4. **Rails-Inspired API**: Simple, intuitive method names
+5. **Pythonic**: Follows Python best practices and idioms
+
+## Modules
+
+### Core Modules
+
+- **app.py**: Main orchestrator, coordinates all operations
+- **cli.py**: Command-line interface, parses arguments, routes commands
+- **config.py**: Configuration management, loads from .env and config.yaml
+
+### Feature Modules
+
+- **exporter.py**: Exports posts from WordPress sites
+- **analyzer.py**: AI-powered post analysis
+- **recategorizer.py**: AI-powered recategorization
+- **seo_checker.py**: SEO quality analysis
+- **categories.py**: Category management across sites
+- **approval.py**: User approval system for recommendations
+
+## Usage Patterns
+
+### As a Library
+
+```python
+from seo.app import SEOApp
+
+# Create instance
+app = SEOApp()
+
+# Export posts
+csv_file = app.export()
+
+# Analyze with AI (uses latest export)
+app.analyze()
+
+# Or analyze specific file
+app.analyze(csv_file)
+
+# Check SEO quality
+app.seo_check(top_n=20)
+
+# Manage categories
+app.categories()
+
+# Approve recommendations
+app.approve()
+```
+
+### Via CLI
+
+```bash
+# Individual commands
+./seo export
+./seo analyze
+./seo seo_check
+./seo categories
+./seo approve
+
+# With arguments
+./seo analyze output/all_posts_2026-02-16.csv
+./seo approve output/category_assignments_*.csv
+./seo seo_check --top-n 20
+
+# Full pipeline
+./seo full_pipeline
+
+# Get status
+./seo status
+```
+
+### Installation
+
+```bash
+# Install in development mode
+cd seo
+pip install -e .
+
+# Now you can use from anywhere
+seo help
+seo export
+```
+
+## Data Flow
+
+```
+User Input (CLI or Library)
+         ↓
+SEOApp (orchestrator)
+         ↓
+┌───────────────────────────────────────┐
+│  exporter    analyzer    seo_checker  │
+│  categories  approval    recategorizer│
+└───────────────────────────────────────┘
+         ↓
+WordPress API / AI API
+         ↓
+Output Files (output/)
+```
+
+## Configuration
+
+Configuration is loaded from multiple sources in order of precedence:
+
+1. Environment variables (.env file)
+2. config.yaml
+3. Default values
+
+```python
+from seo.config import Config
+
+# Access configuration
+api_key = Config.OPENROUTER_API_KEY
+sites = Config.WORDPRESS_SITES
+```
+
+## Benefits of This Architecture
+
+1. **Maintainability**: Clear separation of concerns
+2. **Testability**: Each module can be tested independently
+3. **Extensibility**: Easy to add new features
+4. **Reusability**: Can be used as a library or CLI
+5. **Discoverability**: Intuitive API, easy to learn
+6. **Robustness**: Proper error handling and logging
+7. **Pythonic**: Follows Python conventions and best practices
+
+## Future Enhancements
+
+- Add async support for parallel API calls
+- Implement caching for API responses
+- Add progress bars for long operations
+- Create web interface
+- Add plugin system for custom analyzers
+- Implement scheduling for automated runs
+
+---
+
+**Version**: 1.0.0
+**Last Updated**: 2026-02-16
+**Architecture**: Integrated Python Package
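
Note on the Configuration section: the stated precedence (environment variables, then config.yaml, then default values) could be implemented roughly as below. This is only a sketch under assumptions, not the actual `seo.config` code, which the diff does not show. The names `resolve_setting`, `DEFAULTS`, and `TOP_N` are hypothetical, and the parsed contents of config.yaml are passed in as a plain mapping so the example needs no YAML library.

```python
import os
from typing import Any, Mapping, Optional

# Hypothetical fallback values, used only for this illustration.
DEFAULTS: Mapping[str, Any] = {"OPENROUTER_API_KEY": "", "TOP_N": 10}


def resolve_setting(
    name: str,
    file_config: Mapping[str, Any],
    env: Optional[Mapping[str, str]] = None,
    defaults: Mapping[str, Any] = DEFAULTS,
) -> Any:
    """Return the value for `name` from the highest-precedence source that defines it."""
    env = os.environ if env is None else env
    if name in env:            # 1. environment variables (e.g. loaded from .env)
        return env[name]
    if name in file_config:    # 2. values parsed from config.yaml
        return file_config[name]
    return defaults.get(name)  # 3. built-in default, or None if unknown


# Usage: the file sets TOP_N and the environment may override it.
file_config = {"TOP_N": 20}
print(resolve_setting("TOP_N", file_config, env={"TOP_N": "5"}))  # environment wins
print(resolve_setting("TOP_N", file_config, env={}))              # file value is used
print(resolve_setting("TOP_N", {}, env={}))                       # default is used
```

Environment values are always strings, so a real implementation would probably need to convert types after lookup; the sketch leaves that out.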