seo/guides/STORAGE_AND_DRAFTS.md
Kevin Bataille 8c7cd24685 Refactor SEO automation into unified CLI application

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-02-16 14:24:44 +01:00


Storage & Draft Posts - Complete Guide

Storage Architecture

How Data is Stored

The Multi-Site SEO Analyzer does NOT use a local database. Instead, it:

  1. Fetches posts on demand from the WordPress REST API
  2. Analyzes them in memory using Python
  3. Exports the results to CSV files for long-term storage and review
┌─────────────────────────────┐
│   3 WordPress Sites         │
│  (via REST API)             │
└──────────┬──────────────────┘
           │
           ├─→ Fetch posts (published + optional drafts)
           │
┌──────────▼──────────────────┐
│   Python Analysis           │
│  (in-memory processing)     │
└──────────┬──────────────────┘
           │
           ├─→ Analyze titles
           │
           ├─→ Analyze meta descriptions
           │
           ├─→ Score (0-100)
           │
           ├─→ AI recommendations (optional)
           │
┌──────────▼──────────────────┐
│   CSV File Export           │
│  (persistent storage)       │
└─────────────────────────────┘

Why CSV Instead of Database?

Advantages:

  • ✓ No database setup or maintenance
  • ✓ Easy to import to Excel/Google Sheets
  • ✓ Human-readable format
  • ✓ Shareable with non-technical team members
  • ✓ Version control friendly (Git-trackable)
  • ✓ No dependencies on database software

Disadvantages:

  • ✗ Each run is independent (no running total)
  • ✗ No real-time updates
  • ✗ Manual comparison between runs

When to use a database instead:

  • If analyzing >10,000 posts regularly
  • If you need real-time dashboards
  • If you want automatic tracking over time

CSV Output Structure

File Location

output/reports/seo_analysis_TIMESTAMP.csv
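
As a rough illustration of the export step, the sketch below writes rows to a timestamped file. It assumes the timestamp has the form YYYYMMDD_HHMMSS, inferred from the example filename seo_analysis_20250216_120000.csv later in this guide. `export_rows` is a hypothetical helper, not the analyzer's actual function.

```python
import csv
from datetime import datetime
from pathlib import Path


def export_rows(rows, out_dir="output/reports", now=None):
    """Write analysis rows to seo_analysis_<timestamp>.csv and return the path."""
    if not rows:
        raise ValueError("no rows to export")
    now = now or datetime.now()
    # Assumed timestamp format: YYYYMMDD_HHMMSS
    out_path = Path(out_dir) / f"seo_analysis_{now:%Y%m%d_%H%M%S}.csv"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    # newline="" lets the csv module control line endings on every platform.
    with out_path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return out_path
```

Passing `now` explicitly makes the filename predictable, which is convenient when archiving a run under a known date.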

Columns

| Column | Description | Example |
|---|---|---|
| site | WordPress site | mistergeek.net |
| post_id | WordPress post ID | 2845 |
| status | Post status | publish / draft |
| title | Post title | "Best VPN Services 2025" |
| slug | URL slug | best-vpn-services-2025 |
| url | Full URL | https://mistergeek.net/best-vpn-2025/ |
| meta_description | Meta description text | "Compare 50+ VPN..." |
| title_score | Title SEO score (0-100) | 92 |
| title_issues | Problems with title | "None" |
| title_recommendations | How to improve | "None" |
| meta_score | Meta description score (0-100) | 88 |
| meta_issues | Problems with meta | "None" |
| meta_recommendations | How to improve | "None" |
| overall_score | Combined score | 90 |
| ai_recommendations | Claude-generated tips | "Consider adding..." |
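
If you prefer to read the export from Python rather than a spreadsheet, a minimal loader might look like the sketch below. It assumes the header row uses exactly the column names listed above and that the three score columns hold whole numbers. `load_results` is a hypothetical helper.

```python
import csv

# Column names taken from the table in this guide.
EXPECTED_COLUMNS = [
    "site", "post_id", "status", "title", "slug", "url", "meta_description",
    "title_score", "title_issues", "title_recommendations",
    "meta_score", "meta_issues", "meta_recommendations",
    "overall_score", "ai_recommendations",
]
SCORE_COLUMNS = ("title_score", "meta_score", "overall_score")


def load_results(csv_path):
    """Read an analyzer CSV, check its columns, and convert scores to int."""
    with open(csv_path, newline="", encoding="utf-8") as fh:
        reader = csv.DictReader(fh)
        header = reader.fieldnames or []
        missing = [c for c in EXPECTED_COLUMNS if c not in header]
        if missing:
            raise ValueError(f"CSV is missing columns: {missing}")
        rows = []
        for row in reader:
            # CSV values are always strings; scores are assumed to be integers.
            for col in SCORE_COLUMNS:
                row[col] = int(row[col])
            rows.append(row)
    return rows
```

Failing early on a missing column tends to be easier to debug than a KeyError halfway through a later report.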

Importing to Google Sheets

  1. Download CSV from output/reports/
  2. Open Google Sheets
  3. File → Import → Upload CSV
  4. Add columns for tracking:
    • Status (Not Started / In Progress / Done)
    • Notes
    • Date Completed
  5. Share with team
  6. Filter and sort as needed

Draft Posts Feature

What Are Drafts?

Draft posts are unpublished WordPress posts. They're:

  • Written but not published
  • Not visible on the website
  • Still stored in the WordPress database (and readable through the REST API by authenticated users)
  • Perfect for analyzing before publishing

Using Draft Posts

By default, the analyzer fetches only published posts:

python scripts/multi_site_seo_analyzer.py

To include draft posts, use the --include-drafts flag:

python scripts/multi_site_seo_analyzer.py --include-drafts
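
This guide does not show how the analyzer builds its requests. As a sketch only, the function below shows what a request URL could look like, assuming the standard WordPress REST API posts endpoint (/wp-json/wp/v2/posts) with its `status`, `per_page`, and `page` parameters. `posts_url` is a hypothetical helper. WordPress returns drafts only to authenticated requests from a user with sufficient permissions.

```python
from urllib.parse import urlencode


def posts_url(site, include_drafts=False, page=1, per_page=100):
    """Build a WordPress REST API URL for one page of posts.

    per_page=100 is the maximum page size the posts endpoint accepts.
    """
    statuses = ["publish", "draft"] if include_drafts else ["publish"]
    query = urlencode(
        {"status": ",".join(statuses), "per_page": per_page, "page": page},
        safe=",",  # keep the comma-separated status list readable
    )
    return f"https://{site}/wp-json/wp/v2/posts?{query}"
```

Requesting both statuses in one query is consistent with the note later in this guide that published posts and drafts arrive in the same API call.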

Output with Drafts

The CSV will include a status column showing which posts are published vs. draft:

site,post_id,status,title,meta_score,overall_score
mistergeek.net,2845,publish,"Best VPN",88,90
mistergeek.net,2901,draft,"New VPN Draft",45,55
webscroll.fr,1234,publish,"Torrent Guide",72,75
webscroll.fr,1235,draft,"Draft Tracker Review",20,30

Use Cases for Drafts

1. Optimize Before Publishing

If you have draft posts ready to publish:

python scripts/multi_site_seo_analyzer.py --include-drafts

Review their SEO scores and improve titles/meta before publishing.

2. Recover Previous Content

If you have unpublished posts and kept them as drafts:

python scripts/multi_site_seo_analyzer.py --include-drafts

Analyze them to decide: republish, improve, or delete.

3. Audit Unpublished Work

See what's sitting in drafts that could be published:

python scripts/multi_site_seo_analyzer.py --include-drafts
grep ",draft," output/reports/seo_analysis_*.csv

Complete Examples

Example 1: Analyze Published Only

python scripts/multi_site_seo_analyzer.py

Output:

  • Analyzes: ~262 published posts
  • Time: 2-3 minutes
  • Drafts: Not included

Example 2: Analyze Published + Drafts

python scripts/multi_site_seo_analyzer.py --include-drafts

Output:

  • Analyzes: ~262 published + X drafts
  • Time: 2-5 minutes (depending on draft count)
  • Shows status column: "publish" or "draft"

Example 3: Analyze Published + Drafts + AI

python scripts/multi_site_seo_analyzer.py --include-drafts --top-n 20

Output:

  • Analyzes: All posts (published + drafts)
  • AI recommendations: Top 20 worst-scoring posts
  • Cost: ~$0.20
  • Time: 10-15 minutes

Example 4: Focus on Drafts Only

The script has no drafts-only mode: with --include-drafts it returns published posts and drafts together. To focus on drafts, filter in Excel/Sheets:

  1. Run: python scripts/multi_site_seo_analyzer.py --include-drafts
  2. Open CSV in Google Sheets
  3. Filter status column = "draft"
  4. Sort by overall_score (lowest first)
  5. Optimize top 10 drafts before publishing
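
The same filter and sort can be done without a spreadsheet. This is a minimal sketch, assuming the CSV has the `status` and `overall_score` columns described earlier. `drafts_by_score` is a hypothetical helper.

```python
import csv


def drafts_by_score(csv_path, limit=10):
    """Return up to `limit` draft rows, lowest overall_score first."""
    with open(csv_path, newline="", encoding="utf-8") as fh:
        drafts = [row for row in csv.DictReader(fh) if row["status"] == "draft"]
    # Scores are read as strings, so convert before sorting numerically.
    return sorted(drafts, key=lambda row: float(row["overall_score"]))[:limit]
```

Each returned row is a dict, so `row["title"]` and `row["overall_score"]` give a quick worklist of drafts to improve first.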

Comparing Results Over Time

Manual Comparison

Since results are exported to CSV, you can track progress manually:

# Week 1
python scripts/multi_site_seo_analyzer.py --no-ai
# Save: seo_analysis_week1.csv

# (Optimize posts for 4 weeks)

# Week 5
python scripts/multi_site_seo_analyzer.py --no-ai
# Save: seo_analysis_week5.csv

# Compare in Excel/Sheets:
# Sort both by post_id
# Compare scores: Week 1 vs Week 5

Calculating Improvement

Example:

| Post | Week 1 Score | Week 5 Score | Change |
|---|---|---|---|
| Best VPN | 45 | 92 | +47 |
| Top 10 Software | 38 | 78 | +40 |
| Streaming Guide | 52 | 65 | +13 |
| Average | 45 | 78 | +33 |
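
The comparison can also be scripted. The sketch below assumes both runs contain the `site`, `post_id`, and `overall_score` columns. It matches posts on the pair (site, post_id), because post IDs can repeat across sites. All function names are hypothetical.

```python
import csv


def load_scores(csv_path):
    """Map (site, post_id) -> overall_score for one analyzer run."""
    with open(csv_path, newline="", encoding="utf-8") as fh:
        return {
            (row["site"], row["post_id"]): float(row["overall_score"])
            for row in csv.DictReader(fh)
        }


def score_changes(before, after):
    """Return {key: after - before} for posts present in both runs."""
    return {key: after[key] - before[key] for key in before.keys() & after.keys()}


def average(values):
    """Arithmetic mean, or 0.0 for an empty input."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0
```

For the three posts in the table above, the changes are 47, 40, and 13, which average to about 33.3 and round to the +33 shown.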

Organizing Your CSV Files

Naming Convention

Create a folder for historical analysis:

output/
├── reports/
│   ├── 2025-02-16_initial_analysis.csv
│   ├── 2025-03-16_after_optimization.csv
│   ├── 2025-04-16_follow_up.csv
│   └── seo_analysis_20250216_120000.csv  (latest)

Archive Strategy

  1. Run analyzer monthly
  2. Save result with date: seo_analysis_2025-02-16.csv
  3. Keep 12 months of history
  4. Compare trends over time
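
A small script can apply the archive strategy. This sketch assumes archived files follow the dated name from step 2 (seo_analysis_YYYY-MM-DD.csv) and that one run is kept per month, so keeping the 12 most recent files gives 12 months of history. `archive_name` and `files_to_prune` are hypothetical helpers, and nothing is deleted here.

```python
import re
from datetime import date

# Only files with the dated archive name are considered; timestamped
# exports such as seo_analysis_20250216_120000.csv are left alone.
ARCHIVE_PATTERN = re.compile(r"seo_analysis_\d{4}-\d{2}-\d{2}\.csv")


def archive_name(run_date):
    """Return the archive filename for a run on `run_date`."""
    return f"seo_analysis_{run_date.isoformat()}.csv"


def files_to_prune(filenames, keep=12):
    """Return archive files beyond the `keep` most recent, oldest first."""
    # ISO dates sort chronologically when sorted as plain strings.
    dated = sorted(name for name in filenames if ARCHIVE_PATTERN.fullmatch(name))
    return dated[: max(len(dated) - keep, 0)]
```

Returning the list instead of deleting files leaves the final decision to you, which is safer for historical data.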

Advanced: Storing Recommendations

Using a Master Spreadsheet

Instead of relying on CSV alone, create a master Google Sheet:

Columns:

  • Post ID
  • Title
  • Current Score
  • Issues
  • Improvements Needed
  • Status (Not Started / In Progress / Done)
  • Completed Date
  • New Score

Process:

  1. Run analyzer: python scripts/multi_site_seo_analyzer.py
  2. Copy relevant rows to master spreadsheet
  3. As you optimize: update "Status" and "New Score"
  4. Track progress visually

Performance Considerations

Fetch Time

  • Published only: ~10-30 seconds (262 posts)
  • Published + drafts: ~10-30 seconds (+X seconds per 100 drafts)

Drafts don't significantly affect speed because published posts and drafts are fetched in the same API call.

Analysis Time

  • Without AI: ~1-2 minutes
  • With AI (10 posts): ~5-10 minutes
  • With AI (50 posts): ~20-30 minutes

AI recommendations account for most of the run time; fetching does not.

Memory Usage

  • 262 posts: ~20-30 MB
  • 262 posts + 100 drafts: ~35-50 MB

No memory issues for typical WordPress sites.


Troubleshooting

"No drafts found"

Problem: You're using --include-drafts but get the same result as without it.

Solutions:

  1. Verify you have draft posts on the site
  2. Check user has permission to view drafts (needs edit_posts capability)
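
To confirm whether any drafts reached the export, you can count rows per status. This is a minimal sketch, assuming the CSV has the `status` column described earlier. `status_counts` is a hypothetical helper.

```python
import csv
from collections import Counter


def status_counts(csv_path):
    """Count rows per post status (for example publish and draft)."""
    with open(csv_path, newline="", encoding="utf-8") as fh:
        return Counter(row["status"] for row in csv.DictReader(fh))
```

A count of 0 for "draft" after a run with --include-drafts suggests the site has no drafts or the API user cannot see them.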
  3. Try logging in and checking WordPress directly

CSV Encoding Issues

Problem: CSV opens with weird characters in Excel.

Solution: Open with UTF-8 encoding:

  • Excel: Data → From Text/CSV → Select CSV → Set File Origin to "65001: Unicode (UTF-8)" → Load
  • Sheets: Upload CSV, let Google handle encoding
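
Another option is to add a UTF-8 byte-order mark (BOM) to the file, which Excel generally uses to detect the encoding when the file is opened directly. This is a sketch that assumes the analyzer writes plain UTF-8. `add_utf8_bom` is a hypothetical helper.

```python
import codecs
from pathlib import Path


def add_utf8_bom(src, dst):
    """Copy a UTF-8 CSV to `dst`, adding a BOM if one is not already present."""
    data = Path(src).read_bytes()
    if not data.startswith(codecs.BOM_UTF8):
        data = codecs.BOM_UTF8 + data
    Path(dst).write_bytes(data)
```

Working on raw bytes leaves line endings and cell contents untouched; only the three BOM bytes are added.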

Want to Use a Database Later?

If you outgrow CSV files, consider:

SQLite (built-in, no installation):

import sqlite3
conn = sqlite3.connect('seo_analysis.db')
# Insert results into database
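
As a fuller sketch of the SQLite route, the function below loads one analyzer CSV into a table. It assumes the CSV has the `site`, `post_id`, `status`, `title`, and `overall_score` columns. The table name `results` and the `run_label` column are hypothetical choices made for this example.

```python
import csv
import sqlite3


def load_csv_into_sqlite(csv_path, db_path="seo_analysis.db", run_label="latest"):
    """Insert one analyzer run into SQLite and return the number of rows read."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            """CREATE TABLE IF NOT EXISTS results (
                   run_label TEXT, site TEXT, post_id INTEGER,
                   status TEXT, title TEXT, overall_score REAL,
                   PRIMARY KEY (run_label, site, post_id))"""
        )
        with open(csv_path, newline="", encoding="utf-8") as fh:
            rows = [
                (run_label, r["site"], int(r["post_id"]), r["status"],
                 r["title"], float(r["overall_score"]))
                for r in csv.DictReader(fh)
            ]
        # INSERT OR REPLACE makes re-loading the same run idempotent.
        conn.executemany(
            "INSERT OR REPLACE INTO results VALUES (?, ?, ?, ?, ?, ?)", rows
        )
        conn.commit()
        return len(rows)
    finally:
        conn.close()
```

Storing a label per run lets several runs share one table, so score changes over time become a query instead of a manual spreadsheet comparison.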

PostgreSQL (professional option):

import psycopg2
conn = psycopg2.connect("dbname=seo_db user=postgres")
# Insert results

For typical use, though, CSV files are sufficient.


Summary

Storage

| Aspect | Implementation |
|---|---|
| Database? | No - CSV files |
| Location | output/reports/ |
| Format | CSV (Excel/Sheets compatible) |
| Persistence | Permanent (until deleted) |

Draft Posts

| Aspect | Usage |
|---|---|
| Default | Published only |
| Include drafts | --include-drafts flag |
| Output column | status (publish/draft) |
| Use case | Optimize before publishing, recover removed content |

Commands

# Published only
python scripts/multi_site_seo_analyzer.py

# Published + Drafts
python scripts/multi_site_seo_analyzer.py --include-drafts

# Published + Drafts + AI
python scripts/multi_site_seo_analyzer.py --include-drafts --top-n 20

# Skip AI (faster)
python scripts/multi_site_seo_analyzer.py --no-ai

Next Steps

  1. First run (published only):

    python scripts/multi_site_seo_analyzer.py --no-ai
    
  2. Analyze results:

    open output/reports/seo_analysis_*.csv
    
  3. Optimize published posts with score < 50

  4. Second run (include drafts):

    python scripts/multi_site_seo_analyzer.py --include-drafts
    
  5. Decide on drafts: Publish, improve, or delete

  6. Track progress: Re-run monthly and compare scores

Ready? Start with step 1: python scripts/multi_site_seo_analyzer.py --no-ai