Storage & Draft Posts - Complete Guide
Storage Architecture
How Data is Stored
The Multi-Site SEO Analyzer does NOT use a local database. Instead:
- Fetches on-demand from WordPress REST API
- Analyzes in-memory using Python
- Exports to CSV files for long-term storage and review
┌─────────────────────────────┐
│      3 WordPress Sites      │
│       (via REST API)        │
└──────────┬──────────────────┘
           │
           ├─→ Fetch posts (published + optional drafts)
           │
┌──────────▼──────────────────┐
│       Python Analysis       │
│   (in-memory processing)    │
└──────────┬──────────────────┘
           │
           ├─→ Analyze titles
           │
           ├─→ Analyze meta descriptions
           │
           ├─→ Score (0-100)
           │
           ├─→ AI recommendations (optional)
           │
┌──────────▼──────────────────┐
│       CSV File Export       │
│    (persistent storage)     │
└─────────────────────────────┘
Why CSV Instead of Database?
Advantages:
- ✓ No database setup or maintenance
- ✓ Easy to import to Excel/Google Sheets
- ✓ Human-readable format
- ✓ Shareable with non-technical team members
- ✓ Version control friendly (Git-trackable)
- ✓ No dependencies on database software
Disadvantages:
- ✗ Each run is independent (no history is carried over between runs)
- ✗ No real-time updates
- ✗ Manual comparison between runs
When to use a database instead:
- If analyzing >10,000 posts regularly
- If you need real-time dashboards
- If you want automatic tracking over time
CSV Output Structure
File Location
output/reports/seo_analysis_TIMESTAMP.csv
Columns
| Column | Description | Example |
|---|---|---|
| site | WordPress site | mistergeek.net |
| post_id | WordPress post ID | 2845 |
| status | Post status | publish / draft |
| title | Post title | "Best VPN Services 2025" |
| slug | URL slug | best-vpn-services-2025 |
| url | Full URL | https://mistergeek.net/best-vpn-2025/ |
| meta_description | Meta description text | "Compare 50+ VPN..." |
| title_score | Title SEO score (0-100) | 92 |
| title_issues | Problems with title | "None" |
| title_recommendations | How to improve | "None" |
| meta_score | Meta description score (0-100) | 88 |
| meta_issues | Problems with meta | "None" |
| meta_recommendations | How to improve | "None" |
| overall_score | Combined score | 90 |
| ai_recommendations | Claude-generated tips | "Consider adding..." |
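As a rough illustration of how rows with the columns listed above could be serialized, here is a minimal sketch using Python's standard csv module. The helper name rows_to_csv and the COLUMNS constant are assumptions made for this example; the analyzer's own export code is not shown in this guide and may differ.

```python
import csv
import io

# Column order taken from the table above.
COLUMNS = [
    "site", "post_id", "status", "title", "slug", "url", "meta_description",
    "title_score", "title_issues", "title_recommendations",
    "meta_score", "meta_issues", "meta_recommendations",
    "overall_score", "ai_recommendations",
]

def rows_to_csv(rows):
    """Serialize analysis rows (dicts) to CSV text; missing keys become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, restval="", extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Example values reuse the "Example" column of the table.
example_row = {
    "site": "mistergeek.net", "post_id": 2845, "status": "publish",
    "title": "Best VPN Services 2025", "slug": "best-vpn-services-2025",
    "url": "https://mistergeek.net/best-vpn-2025/",
    "title_score": 92, "meta_score": 88, "overall_score": 90,
}
csv_text = rows_to_csv([example_row])
print(csv_text)
```

Fixing the column order in one place keeps every export compatible with spreadsheets built on earlier runs.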
Importing to Google Sheets
- Download CSV from output/reports/
- Open Google Sheets
- File → Import → Upload CSV
- Add columns for tracking:
  - Status (Not Started / In Progress / Done)
  - Notes
  - Date Completed
- Share with team
- Filter and sort as needed
Draft Posts Feature
What Are Drafts?
Draft posts are unpublished WordPress posts. They're:
- Written but not published
- Not visible on the website
- Still stored in WordPress and retrievable through the REST API by an authenticated user
- Perfect for analyzing before publishing
Using Draft Posts
By default, the analyzer fetches only published posts:
python scripts/multi_site_seo_analyzer.py
To include draft posts, use the --include-drafts flag:
python scripts/multi_site_seo_analyzer.py --include-drafts
Output with Drafts
The CSV will include a status column showing which posts are published vs. draft:
site,post_id,status,title,meta_score,overall_score
mistergeek.net,2845,publish,"Best VPN",88,90
mistergeek.net,2901,draft,"New VPN Draft",45,55
webscroll.fr,1234,publish,"Torrent Guide",72,75
webscroll.fr,1235,draft,"Draft Tracker Review",20,30
Use Cases for Drafts
1. Optimize Before Publishing
If you have draft posts ready to publish:
python scripts/multi_site_seo_analyzer.py --include-drafts
Review their SEO scores and improve titles/meta before publishing.
2. Recover Previous Content
If you have posts that were taken offline and saved back as drafts:
python scripts/multi_site_seo_analyzer.py --include-drafts
Analyze them to decide: republish, improve, or delete.
3. Audit Unpublished Work
See what's sitting in drafts that could be published:
python scripts/multi_site_seo_analyzer.py --include-drafts | grep "draft"
Complete Examples
Example 1: Analyze Published Only
python scripts/multi_site_seo_analyzer.py
Output:
- Analyzes: ~262 published posts
- Time: 2-3 minutes
- Drafts: Not included
Example 2: Analyze Published + Drafts
python scripts/multi_site_seo_analyzer.py --include-drafts
Output:
- Analyzes: ~262 published + X drafts
- Time: 2-5 minutes (depending on draft count)
- Shows status column: "publish" or "draft"
Example 3: Analyze Published + Drafts + AI
python scripts/multi_site_seo_analyzer.py --include-drafts --top-n 20
Output:
- Analyzes: All posts (published + drafts)
- AI recommendations: Top 20 worst-scoring posts
- Cost: ~$0.20
- Time: 10-15 minutes
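Limiting AI recommendations to the worst-scoring posts is what keeps cost and time bounded. The selection step could look something like the sketch below; the function worst_posts is hypothetical and only illustrates the idea behind --top-n, not the script's actual implementation.

```python
import heapq

def worst_posts(rows, top_n=20):
    """Return the top_n rows with the lowest overall_score (worst first)."""
    return heapq.nsmallest(top_n, rows, key=lambda row: row["overall_score"])

# Illustrative rows; scores reuse the sample output shown earlier.
rows = [
    {"post_id": 2845, "overall_score": 90},
    {"post_id": 2901, "overall_score": 55},
    {"post_id": 1234, "overall_score": 75},
    {"post_id": 1235, "overall_score": 30},
]
selected = worst_posts(rows, top_n=2)
print([row["post_id"] for row in selected])
```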
Example 4: Focus on Drafts Only
The script has no drafts-only mode (published posts are always included), but you can filter the CSV in Excel/Sheets:
- Run: python scripts/multi_site_seo_analyzer.py --include-drafts
- Open CSV in Google Sheets
- Filter status column = "draft"
- Sort by overall_score (lowest first)
- Optimize top 10 drafts before publishing
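If you prefer a script to a spreadsheet, the same filter-and-sort can be sketched in a few lines of Python. The sample data is copied from the "Output with Drafts" example; the function name lowest_scoring_drafts is an assumption for illustration.

```python
import csv
import io

# Sample rows copied from the "Output with Drafts" example.
SAMPLE_CSV = """site,post_id,status,title,meta_score,overall_score
mistergeek.net,2845,publish,"Best VPN",88,90
mistergeek.net,2901,draft,"New VPN Draft",45,55
webscroll.fr,1234,publish,"Torrent Guide",72,75
webscroll.fr,1235,draft,"Draft Tracker Review",20,30
"""

def lowest_scoring_drafts(csv_file, limit=10):
    """Return up to `limit` draft rows, lowest overall_score first."""
    drafts = [row for row in csv.DictReader(csv_file) if row["status"] == "draft"]
    drafts.sort(key=lambda row: int(row["overall_score"]))
    return drafts[:limit]

drafts = lowest_scoring_drafts(io.StringIO(SAMPLE_CSV))
for row in drafts:
    print(row["site"], row["post_id"], row["overall_score"])
```

For a real export, pass an open file instead of the in-memory sample, for example open(path, newline="", encoding="utf-8").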
Comparing Results Over Time
Manual Comparison
Since results are exported to CSV, you can track progress manually:
# Week 1
python scripts/multi_site_seo_analyzer.py --no-ai
# Save: seo_analysis_week1.csv
# (Optimize posts for 4 weeks)
# Week 5
python scripts/multi_site_seo_analyzer.py --no-ai
# Save: seo_analysis_week5.csv
# Compare in Excel/Sheets:
# Sort both by post_id
# Compare scores: Week 1 vs Week 5
Calculating Improvement
Example:
| Post | Week 1 Score | Week 5 Score | Change |
|---|---|---|---|
| Best VPN | 45 | 92 | +47 |
| Top 10 Software | 38 | 78 | +40 |
| Streaming Guide | 52 | 65 | +13 |
| Average | 45 | 78 | +33 |
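The comparison can also be automated. The sketch below matches posts between two runs and computes the per-post and average change, using the scores from the table above. It keys on (site, post_id) because post IDs are only unique within one WordPress site; the site name and IDs in the sample data are placeholders.

```python
def score_changes(before, after):
    """Map each (site, post_id) present in both runs to (old, new, change)."""
    changes = {}
    for key, old in before.items():
        if key in after:
            changes[key] = (old, after[key], after[key] - old)
    return changes

# Scores from the table above; the site and post IDs are made up for illustration.
week1 = {("example.com", 1): 45, ("example.com", 2): 38, ("example.com", 3): 52}
week5 = {("example.com", 1): 92, ("example.com", 2): 78, ("example.com", 3): 65}

changes = score_changes(week1, week5)
average_change = sum(delta for _, _, delta in changes.values()) / len(changes)
print(changes)
print(round(average_change, 1))
```

Posts that appear in only one run (newly published or deleted) are skipped, so they do not distort the average.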
Organizing Your CSV Files
Naming Convention
Create a folder for historical analysis:
output/
├── reports/
│ ├── 2025-02-16_initial_analysis.csv
│ ├── 2025-03-16_after_optimization.csv
│ ├── 2025-04-16_follow_up.csv
│ └── seo_analysis_20250216_120000.csv (latest)
Archive Strategy
- Run analyzer monthly
- Save result with date: seo_analysis_2025-02-16.csv
- Keep 12 months of history
- Compare trends over time
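The archive strategy above could be scripted roughly as follows. This is a sketch under the assumption that archives use the seo_analysis_YYYY-MM-DD.csv naming shown in the list; the helper names archive_name and expired are hypothetical, and files with any other name are deliberately left alone.

```python
from datetime import date

PREFIX, SUFFIX = "seo_analysis_", ".csv"

def archive_name(run_date):
    """Build a dated file name such as seo_analysis_2025-02-16.csv."""
    return f"{PREFIX}{run_date.isoformat()}{SUFFIX}"

def expired(names, today, keep_months=12):
    """Return the dated archive names older than keep_months before `today`."""
    # Compare on a month index so day-of-month differences do not matter.
    cutoff = today.year * 12 + today.month - keep_months
    old = []
    for name in names:
        if not (name.startswith(PREFIX) and name.endswith(SUFFIX)):
            continue
        try:
            stamp = date.fromisoformat(name[len(PREFIX):-len(SUFFIX)])
        except ValueError:
            continue  # not a YYYY-MM-DD archive, leave it alone
        if stamp.year * 12 + stamp.month < cutoff:
            old.append(name)
    return old

print(archive_name(date(2025, 2, 16)))
```

The function only reports candidates; deleting or moving them is left to the caller so nothing is removed by accident.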
Advanced: Storing Recommendations
Using a Master Spreadsheet
Instead of relying on CSV alone, create a master Google Sheet:
Columns:
- Post ID
- Title
- Current Score
- Issues
- Improvements Needed
- Status (Not Started / In Progress / Done)
- Completed Date
- New Score
Process:
- Run analyzer: python scripts/multi_site_seo_analyzer.py
- Copy relevant rows to master spreadsheet
- As you optimize: update "Status" and "New Score"
- Track progress visually
Performance Considerations
Fetch Time
- Published only: ~10-30 seconds (262 posts)
- Published + drafts: ~10-30 seconds (+X seconds per 100 drafts)
Drafts don't significantly affect speed, since published posts and drafts are fetched in the same API call.
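To make the single-request idea concrete: the WordPress REST API posts endpoint accepts several statuses in one status parameter, so one paginated request can return both published posts and drafts. The sketch below only builds such a URL. The helper build_posts_url is hypothetical, and how the analyzer actually constructs its requests is an assumption here.

```python
from urllib.parse import urlencode

def build_posts_url(site, include_drafts=False, page=1, per_page=100):
    """Hypothetical helper: build a WordPress REST API posts URL for one page."""
    statuses = ["publish", "draft"] if include_drafts else ["publish"]
    query = urlencode(
        {"status": ",".join(statuses), "per_page": per_page, "page": page},
        safe=",",  # keep the comma readable instead of %2C
    )
    return f"https://{site}/wp-json/wp/v2/posts?{query}"

print(build_posts_url("mistergeek.net", include_drafts=True))
```

Requests that ask for drafts must be authenticated as a user who is allowed to see them; unauthenticated requests only return published posts.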
Analysis Time
- Without AI: ~1-2 minutes
- With AI (10 posts): ~5-10 minutes
- With AI (50 posts): ~20-30 minutes
AI recommendations add most of the time (not the fetching).
Memory Usage
- 262 posts: ~20-30 MB
- 262 posts + 100 drafts: ~35-50 MB
No memory issues for typical WordPress sites.
Troubleshooting
"No drafts found"
Problem: You're using --include-drafts but get the same result as without it.
Solutions:
- Verify that the site actually has draft posts
- Check that the API user has permission to view drafts (needs the edit_posts capability)
- Log in to WordPress with that user and confirm the drafts are visible there
CSV Encoding Issues
Problem: CSV opens with weird characters in Excel.
Solution: Open with UTF-8 encoding:
- Excel: Data → From Text/CSV → Select CSV → set File Origin to "65001: Unicode (UTF-8)" → Load
- Sheets: Upload CSV, let Google handle encoding
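If you post-process or re-export the CSV yourself, one common way to avoid the problem is to write the file with a UTF-8 byte-order mark, which Excel uses to detect the encoding. The sketch below shows this with Python's "utf-8-sig" codec. Whether the analyzer itself writes a BOM is not stated in this guide, and the function name and sample row are illustrative only.

```python
import csv
import os
import tempfile

def write_excel_friendly_csv(path, header, rows):
    """Write a CSV with a UTF-8 byte-order mark so Excel detects the encoding."""
    # "utf-8-sig" prepends the BOM; newline="" lets the csv module control line endings.
    with open(path, "w", newline="", encoding="utf-8-sig") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)

# Illustrative row with accented characters, written to a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "seo_analysis_example.csv")
write_excel_friendly_csv(
    path, ["site", "title"], [["webscroll.fr", "Guide complet des trackers privés"]]
)
with open(path, "rb") as handle:
    print(handle.read(3))  # the three BOM bytes
```

When reading such a file back in Python, open it with encoding="utf-8-sig" as well so the BOM is not treated as part of the first column name.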
Want to Use a Database Later?
If you outgrow CSV files, consider:
SQLite (built-in, no installation):
import sqlite3
conn = sqlite3.connect('seo_analysis.db')
# Insert results into database
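A slightly fuller sketch of the SQLite route is shown below. The table name seo_results, its columns, and the save_results helper are assumptions chosen for illustration. Including run_date in the primary key keeps one row per post per run, which is what makes tracking over time possible.

```python
import sqlite3

def save_results(conn, run_date, rows):
    """Create the table if needed and store one row per (run_date, site, post_id)."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS seo_results (
               run_date TEXT NOT NULL,
               site TEXT NOT NULL,
               post_id INTEGER NOT NULL,
               status TEXT,
               title TEXT,
               overall_score INTEGER,
               PRIMARY KEY (run_date, site, post_id)
           )"""
    )
    conn.executemany(
        "INSERT OR REPLACE INTO seo_results VALUES "
        "(:run_date, :site, :post_id, :status, :title, :overall_score)",
        [{**row, "run_date": run_date} for row in rows],
    )
    conn.commit()

conn = sqlite3.connect(":memory:")  # use 'seo_analysis.db' to persist to disk
save_results(conn, "2025-02-16", [
    {"site": "mistergeek.net", "post_id": 2845, "status": "publish",
     "title": "Best VPN", "overall_score": 90},
    {"site": "webscroll.fr", "post_id": 1235, "status": "draft",
     "title": "Draft Tracker Review", "overall_score": 30},
])
low = conn.execute(
    "SELECT site, post_id FROM seo_results WHERE overall_score < 50"
).fetchall()
print(low)
```

Re-running save_results for the same date replaces that day's rows instead of duplicating them.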
PostgreSQL (professional option):
import psycopg2
conn = psycopg2.connect("dbname=seo_db user=postgres")
# Insert results
But for now, CSV is perfect for your needs.
Summary
Storage
| Aspect | Implementation |
|---|---|
| Database? | No - CSV files |
| Location | output/reports/ |
| Format | CSV (Excel/Sheets compatible) |
| Persistence | Permanent (until deleted) |
Draft Posts
| Aspect | Usage |
|---|---|
| Default | Published only |
| Include drafts | --include-drafts flag |
| Output column | status (publish/draft) |
| Use case | Optimize before publishing, recover removed content |
Commands
# Published only
python scripts/multi_site_seo_analyzer.py
# Published + Drafts
python scripts/multi_site_seo_analyzer.py --include-drafts
# Published + Drafts + AI
python scripts/multi_site_seo_analyzer.py --include-drafts --top-n 20
# Skip AI (faster)
python scripts/multi_site_seo_analyzer.py --no-ai
Next Steps
- First run (published only): python scripts/multi_site_seo_analyzer.py --no-ai
- Analyze results: open output/reports/seo_analysis_*.csv
- Optimize published posts with score < 50
- Second run (include drafts): python scripts/multi_site_seo_analyzer.py --include-drafts
- Decide on drafts: Publish, improve, or delete
- Track progress: Re-run monthly and compare scores
Ready? Start with: python scripts/multi_site_seo_analyzer.py --include-drafts