Refactor SEO automation into unified CLI application

Major refactoring to create a clean, integrated CLI application:

### New Features:
- Unified CLI executable (./seo) with simple command structure
- All commands accept optional CSV file arguments
- Auto-detection of latest files when no arguments provided
- Simplified output directory structure (output/ instead of output/reports/)
- Cleaner export filename format (all_posts_YYYY-MM-DD.csv)

### Commands:
- export: Export all posts from WordPress sites
- analyze [csv]: Analyze posts with AI (optional CSV input)
- recategorize [csv]: Recategorize posts with AI
- seo_check: Check SEO quality
- categories: Manage categories across sites
- approve [files]: Review and approve recommendations
- full_pipeline: Run complete workflow
- analytics, gaps, opportunities, report, status

### Changes:
- Moved all scripts to scripts/ directory
- Created config.yaml for configuration
- Updated all scripts to use output/ directory
- Deprecated old seo-cli.py in favor of new ./seo
- Added AGENTS.md and CHANGELOG.md documentation
- Consolidated README.md with updated usage

### Technical:
- Added PyYAML dependency
- Removed hardcoded configuration values
- All scripts now properly integrated
- Better error handling and user feedback

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Kevin Bataille
2026-02-16 14:24:44 +01:00
parent 3b51952336
commit 8c7cd24685
57 changed files with 16095 additions and 560 deletions


@@ -0,0 +1,365 @@
# AI Analysis for Post Migration & Automation
## Complete Workflow
This guide shows you how to export posts, get AI recommendations, and automate the migrations.
---
## Step 1: Export All Posts
```bash
python scripts/export_posts_for_ai_decision.py
```
**Output:** `output/reports/all_posts_for_ai_decision_TIMESTAMP.csv`
This creates a CSV containing the details of every post (title, content, current site, and so on).
---
## Step 2: Analyze with AI and Get Recommendations
```bash
python scripts/ai_analyze_posts_for_decisions.py \
output/reports/all_posts_for_ai_decision_TIMESTAMP.csv
```
**What happens:**
1. ✓ Reads your posts CSV
2. ✓ Sends batches to Claude via OpenRouter
3. ✓ Gets clear, actionable recommendations
4. ✓ Creates multiple output CSVs
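
The batching step above can be sketched as follows. This is a minimal illustration, not the script's actual code: `make_batches` is a hypothetical helper, and the batch size of 10 is taken from the cost figures later in this guide.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def make_batches(items: Sequence[T], batch_size: int = 10) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


if __name__ == "__main__":
    # Stand-in for the rows read from the exported CSV.
    posts = [{"post_id": i} for i in range(368)]
    batches = list(make_batches(posts))
    print(f"{len(posts)} posts -> {len(batches)} batches")
```

Each batch would then be sent as one API request, so the number of batches equals the number of API calls.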
---
## Output Files Generated
### 1. Main File: `posts_with_ai_recommendations_TIMESTAMP.csv`
Contains ALL posts with AI recommendations added:
| site | post_id | title | decision | recommended_category | reason | priority | ai_notes |
|------|---------|-------|----------|---------------------|--------|----------|----------|
| mistergeek.net | 2845 | Best VPN 2025 | Keep on mistergeek.net | VPN | High traffic, core topic | High | Already optimized |
| mistergeek.net | 1234 | YggTorrent Guide | Move to webscroll.fr | Torrenting | Torrent content | Medium | Good SEO potential |
| mistergeek.net | 5678 | Niche Post | Move to hellogeek.net | Other | Low traffic | Low | Experimental content |
### 2. Action-Specific Files
**`posts_to_move_TIMESTAMP.csv`**
- Only posts with "Move to X" decisions
- Ready for export/import automation
**`posts_to_consolidate_TIMESTAMP.csv`**
- Posts with "Consolidate with post_id:X" decisions
- Indicates which posts are duplicates
**`posts_to_delete_TIMESTAMP.csv`**
- Posts marked for deletion
- Low quality, spam, or zero traffic
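
If you need to regenerate the action-specific files from the main file, a sketch along these lines may help. It assumes the `decision` column uses exactly the strings shown in this guide; `bucket_for`, `split_by_decision`, and `write_csv` are hypothetical helpers, not functions from the repository.

```python
import csv
from collections import defaultdict
from typing import Dict, Iterable, List

Row = Dict[str, str]


def bucket_for(decision: str) -> str:
    """Map a decision string to an action bucket (anything else counts as keep)."""
    text = decision.strip()
    if text.startswith("Move to "):
        return "move"
    if text.startswith("Consolidate with "):
        return "consolidate"
    if text == "Delete":
        return "delete"
    return "keep"


def split_by_decision(rows: Iterable[Row]) -> Dict[str, List[Row]]:
    """Group recommendation rows by action bucket, preserving row order."""
    buckets: Dict[str, List[Row]] = defaultdict(list)
    for row in rows:
        buckets[bucket_for(row.get("decision", ""))].append(row)
    return dict(buckets)


def write_csv(path: str, rows: List[Row]) -> None:
    """Write rows to `path`, using the first row's keys as the header."""
    if not rows:
        return  # nothing to write, so no file is created
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
```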
---
## Understanding Decisions
### Decision Types
| Decision | Meaning | Action |
|----------|---------|--------|
| `Keep on mistergeek.net` | High-value, optimized | Optimize & promote |
| `Move to webscroll.fr` | Torrenting/file-sharing | Export & import |
| `Move to hellogeek.net` | Low-traffic/experimental | Export & import |
| `Consolidate with post_id:2845` | Duplicate content | Merge into post 2845 |
| `Delete` | Low quality or spam | Delete from WordPress |
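
Because the decision strings follow a fixed pattern, they can be parsed into an action and a target. The sketch below assumes the exact wording from the table above; `Decision` and `parse_decision` are hypothetical names introduced for illustration.

```python
import re
from typing import NamedTuple, Optional


class Decision(NamedTuple):
    action: str            # "keep", "move", "consolidate", or "delete"
    target: Optional[str]  # site name, target post ID, or None for delete


_PATTERNS = [
    ("keep", re.compile(r"^Keep on (?P<target>\S+)$")),
    ("move", re.compile(r"^Move to (?P<target>\S+)$")),
    ("consolidate", re.compile(r"^Consolidate with post_id:(?P<target>\d+)$")),
]


def parse_decision(text: str) -> Decision:
    """Turn a decision string into an (action, target) pair."""
    cleaned = text.strip()
    if cleaned == "Delete":
        return Decision("delete", None)
    for action, pattern in _PATTERNS:
        match = pattern.match(cleaned)
        if match:
            return Decision(action, match.group("target"))
    # Fail loudly so unexpected AI output is not silently acted on.
    raise ValueError(f"Unrecognized decision: {text!r}")
```

Raising on unknown strings is a deliberate choice here: a malformed recommendation is safer to surface for manual review than to guess at.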
### Categories
AI assigns one of these categories:
- **VPN** - VPN & privacy tools
- **Software/Tools** - Software reviews & guides
- **Gaming** - Gaming content & emulation
- **Streaming** - Streaming guides & tools
- **Torrenting** - Torrent trackers & guides
- **File-Sharing** - File-sharing services
- **SEO** - SEO & marketing content
- **Content Marketing** - Marketing strategies
- **Other** - Miscellaneous
### Priority
- **High**: Act first (high traffic, core content, duplicates)
- **Medium**: Act second (important but less urgent)
- **Low**: Act last (niche, experimental, low impact)
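
To work through recommendations in this order, the rows can be sorted on the `priority` column. This is a small sketch that assumes the three values appear exactly as written above; `sort_by_priority` is a hypothetical helper.

```python
from typing import Dict, List

# Lower rank means "act sooner".
PRIORITY_ORDER = {"High": 0, "Medium": 1, "Low": 2}


def sort_by_priority(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return rows ordered High, Medium, Low; unknown values go last."""
    unknown_rank = len(PRIORITY_ORDER)
    return sorted(
        rows,
        key=lambda row: PRIORITY_ORDER.get(row.get("priority", "").strip(), unknown_rank),
    )
```

`sorted` is stable, so rows with the same priority keep their original CSV order.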
---
## Automation-Friendly Format
The recommendations are designed for automation:
```
"decision": "Move to webscroll.fr"
    → Export post from mistergeek.net
    → Import to webscroll.fr
    → Set 301 redirect

"decision": "Consolidate with post_id:2845"
    → Merge content into post 2845
    → Set 301 redirect from this post

"recommended_category": "VPN"
    → Set WordPress category to "VPN"

"decision": "Delete"
    → Remove post from WordPress
```
---
## Example: Using Recommendations
### Review Moves
```bash
open output/reports/posts_to_move_*.csv
```
Shows all posts that should move sites:
```
post_id | title | current_site | decision | reason
1234 | YggTorrent Guide | mistergeek.net | Move to webscroll.fr | Torrent content
5678 | File Sharing | mistergeek.net | Move to webscroll.fr | File-sharing focus
9012 | Experiment | mistergeek.net | Move to hellogeek.net | Very low traffic
```
### Review Consolidations
```bash
open output/reports/posts_to_consolidate_*.csv
```
Shows duplicates:
```
post_id | title | decision | reason
100 | Best VPN 2025 | Consolidate with post_id:2845 | Duplicate topic
101 | VPN Review | Consolidate with post_id:2845 | Similar content
102 | Top VPNs | Consolidate with post_id:2845 | Same theme
```
Action: Keep post 2845, merge in the content from posts 100, 101, and 102, then delete those three posts and 301-redirect their URLs to post 2845.
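
To see at a glance which posts fold into which target, the consolidation rows can be grouped by target post ID. The sketch assumes the `post_id` and `decision` columns shown above; `group_consolidations` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

PREFIX = "Consolidate with post_id:"


def group_consolidations(rows: Iterable[Dict[str, str]]) -> Dict[str, List[str]]:
    """Map each target post ID to the IDs of the posts to merge into it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for row in rows:
        decision = row.get("decision", "").strip()
        if decision.startswith(PREFIX):
            target = decision[len(PREFIX):].strip()
            groups[target].append(row["post_id"])
    return dict(groups)


if __name__ == "__main__":
    sample = [
        {"post_id": "100", "decision": "Consolidate with post_id:2845"},
        {"post_id": "101", "decision": "Consolidate with post_id:2845"},
        {"post_id": "102", "decision": "Consolidate with post_id:2845"},
    ]
    for target, sources in group_consolidations(sample).items():
        print(f"Merge {', '.join(sources)} into {target}")
```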
---
## Cost & Performance
### API Usage
For 368 posts in batches of 10:
- **Batches**: ~37 API calls
- **Tokens**: ~300-400k total
- **Cost**: ~$1.50-2.00 (well within €50 budget)
- **Time**: ~5-10 minutes
### Token Breakdown
| Operation | Tokens | Cost |
|-----------|--------|------|
| Analyze 10 posts | ~8-10k | ~$0.04-0.05 |
| Full 368 posts | ~300-400k | ~$1.50-2.00 |
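
The arithmetic behind these estimates can be reproduced with a short sketch. The per-batch token and cost figures are the approximate values from the table above, not measured prices, and `estimate_run` is a hypothetical helper.

```python
import math
from typing import Tuple


def estimate_run(
    posts: int,
    batch_size: int = 10,
    tokens_per_batch: int = 10_000,
    cost_per_batch: float = 0.05,
) -> Tuple[int, int, float]:
    """Return (batches, total_tokens, total_cost_usd) for one analysis run."""
    batches = math.ceil(posts / batch_size)
    return batches, batches * tokens_per_batch, batches * cost_per_batch


if __name__ == "__main__":
    # Upper and lower ends of the per-batch ranges quoted in the table.
    for tokens, cost in ((8_000, 0.04), (10_000, 0.05)):
        batches, total_tokens, total_cost = estimate_run(368, 10, tokens, cost)
        print(f"{batches} batches, ~{total_tokens:,} tokens, ~${total_cost:.2f}")
```

With 368 posts this gives 37 batches, which matches the number of API calls listed above.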
---
## Complete End-to-End Workflow
```bash
# Step 1: Export all posts (5 min)
python scripts/export_posts_for_ai_decision.py
# Step 2: Analyze with AI (10 min)
python scripts/ai_analyze_posts_for_decisions.py \
output/reports/all_posts_for_ai_decision_20260216_150000.csv
# Step 3: Review recommendations
open output/reports/posts_with_ai_recommendations_*.csv
# Step 4: Create master decision sheet (in Google Sheets)
# Copy recommendations, add "Completed" column, share with team
# Step 5: Execute moves (Week 1-4)
# For each post in posts_to_move_*.csv:
# 1. Export from source site
# 2. Import to destination site
# 3. Set 301 redirect
# 4. Update internal links
# Step 6: Consolidate duplicates (Week 3-4)
# For each post in posts_to_consolidate_*.csv:
# 1. Merge content into target post
# 2. Set 301 redirect
# 3. Delete old post
# Step 7: Delete posts (Week 4)
# For each post in posts_to_delete_*.csv:
# 1. Verify no traffic
# 2. Delete post
# 3. No redirect needed
```
---
## Example Output
### Terminal Output
```
======================================================================
AI-POWERED POST ANALYSIS AND RECOMMENDATIONS
======================================================================
Loading CSV: output/reports/all_posts_for_ai_decision_20260216_150000.csv
✓ Loaded 368 posts from CSV
mistergeek.net: 328 posts
webscroll.fr: 17 posts
hellogeek.net: 23 posts
======================================================================
ANALYZING POSTS WITH AI
======================================================================
Processing 368 posts in 37 batches of 10...
Batch 1/37: Analyzing 10 posts...
Sending batch to Claude for analysis...
✓ Got recommendations (tokens: 8234+1456)
Batch 2/37: Analyzing 10 posts...
...
✓ Analysis complete!
Total recommendations: 368
API calls: 37
Estimated cost: $1.84
======================================================================
ANALYSIS SUMMARY
======================================================================
DECISIONS:
Keep on mistergeek.net: 185 posts
Move to webscroll.fr: 42 posts
Move to hellogeek.net: 89 posts
Consolidate with post_id:XX: 34 posts
Delete: 18 posts
RECOMMENDED CATEGORIES:
VPN: 52
Software/Tools: 48
Gaming: 45
Torrenting: 42
Other: 181
...
PRIORITY BREAKDOWN:
High: 95 posts
Medium: 187 posts
Low: 86 posts
======================================================================
EXPORTING RESULTS
======================================================================
✓ Main file: output/reports/posts_with_ai_recommendations_20260216_150000.csv
✓ Moves file (131 posts): output/reports/posts_to_move_20260216_150000.csv
✓ Consolidate file (34 posts): output/reports/posts_to_consolidate_20260216_150000.csv
✓ Delete file (18 posts): output/reports/posts_to_delete_20260216_150000.csv
======================================================================
NEXT STEPS
======================================================================
1. Review main file with all recommendations:
output/reports/posts_with_ai_recommendations_20260216_150000.csv
2. Execute moves (automate with script):
output/reports/posts_to_move_20260216_150000.csv
3. Consolidate duplicates:
output/reports/posts_to_consolidate_20260216_150000.csv
4. Delete low-quality posts:
output/reports/posts_to_delete_20260216_150000.csv
✓ Analysis complete!
```
---
## Integration with Other Tools
### Future: Export/Import Automation
Once you have recommendations, you could automate:
```python
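# Pseudo-code for automation: the four helpers below are hypothetical placeholders
for post in posts_to_move:
    xml = export_post_xml(post)            # 1. Export post XML from source site
    new_url = import_post(xml, post)       # 2. Import to destination site
    create_301_redirect(post, new_url)     # 3. Create 301 redirect
    update_internal_links(post, new_url)   # 4. Update internal links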
```
### Future: Category Bulk Update
```python
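# Pseudo-code for category automation: set_post_category is a hypothetical
# placeholder for a call to the WordPress API
for post in all_posts:
    category = post["recommended_category"]        # 1. Read recommended_category from CSV
    set_post_category(post["post_id"], category)   # 2. Set post category via WordPress API
# 3. Update in bulk: batch these calls where the API allows it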
```
---
## Troubleshooting
### "OPENROUTER_API_KEY not set"
- Make sure .env file has OPENROUTER_API_KEY
- Verify key is valid and has credits
- Check file permissions
### "Could not find JSON array in response"
- AI response format might have changed
- Check OpenRouter API documentation
- Try again (might be temporary API issue)
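
This error usually means the model wrapped its JSON in extra prose or a code fence. One tolerant way to pull the array out is sketched below. It is an assumption about how such parsing could be done, not the script's actual implementation, and `extract_json_array` is a hypothetical helper.

```python
import json
from typing import Any, List


def extract_json_array(text: str) -> List[Any]:
    """Return the first valid JSON array embedded in `text`."""
    decoder = json.JSONDecoder()
    start = text.find("[")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores whatever text follows it.
            value, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, list):
            return value
        start = text.find("[", start + 1)
    raise ValueError("Could not find JSON array in response")
```

Trying each `[` in turn means stray brackets in the surrounding prose do not prevent the real array from being found.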
### CSV files are empty
- Check export worked: verify `all_posts_for_ai_decision_*.csv` exists
- Verify WordPress API is working
- Check credentials in .env
### Higher cost than expected
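- Check the batch size (default is 10)
- Reducing the batch size to 5 lowers the cost of each call but roughly doubles the number of calls, so it may not lower the total
- Use a cheaper model (for example, GPT-3.5 instead of Claude)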
---
## Next Steps
Ready to analyze?
```bash
# Step 1: Export posts
python scripts/export_posts_for_ai_decision.py
# Step 2: Get AI recommendations
python scripts/ai_analyze_posts_for_decisions.py \
output/reports/all_posts_for_ai_decision_*.csv
# Step 3: Review and execute!
```