Refactor SEO automation into unified CLI application

Major refactoring to create a clean, integrated CLI application:

### New Features:
- Unified CLI executable (./seo) with simple command structure
- All commands accept optional CSV file arguments
- Auto-detection of latest files when no arguments provided
- Simplified output directory structure (output/ instead of output/reports/)
- Cleaner export filename format (all_posts_YYYY-MM-DD.csv)

### Commands:
- export: Export all posts from WordPress sites
- analyze [csv]: Analyze posts with AI (optional CSV input)
- recategorize [csv]: Recategorize posts with AI
- seo_check: Check SEO quality
- categories: Manage categories across sites
- approve [files]: Review and approve recommendations
- full_pipeline: Run complete workflow
- analytics, gaps, opportunities, report, status

### Changes:
- Moved all scripts to scripts/ directory
- Created config.yaml for configuration
- Updated all scripts to use output/ directory
- Deprecated old seo-cli.py in favor of new ./seo
- Added AGENTS.md and CHANGELOG.md documentation
- Consolidated README.md with updated usage

### Technical:
- Added PyYAML dependency
- Removed hardcoded configuration values
- All scripts now properly integrated
- Better error handling and user feedback

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Kevin Bataille
2026-02-16 14:24:44 +01:00
parent 3b51952336
commit 8c7cd24685
57 changed files with 16095 additions and 560 deletions


@@ -0,0 +1,365 @@
# AI Analysis for Post Migration & Automation
## Complete Workflow
This guide shows you how to export posts, get AI recommendations, and automate the migrations.
---
## Step 1: Export All Posts
```bash
python scripts/export_posts_for_ai_decision.py
```
**Output:** `output/reports/all_posts_for_ai_decision_TIMESTAMP.csv`
This creates a CSV with all post details (title, content, current site, etc.)
---
## Step 2: Analyze with AI and Get Recommendations
```bash
python scripts/ai_analyze_posts_for_decisions.py \
output/reports/all_posts_for_ai_decision_TIMESTAMP.csv
```
**What happens:**
1. ✓ Reads your posts CSV
2. ✓ Sends batches to Claude via OpenRouter
3. ✓ Gets clear, actionable recommendations
4. ✓ Creates multiple output CSVs
---
## Output Files Generated
### 1. Main File: `posts_with_ai_recommendations_TIMESTAMP.csv`
Contains ALL posts with AI recommendations added:
| site | post_id | title | decision | recommended_category | reason | priority | ai_notes |
|------|---------|-------|----------|---------------------|--------|----------|----------|
| mistergeek.net | 2845 | Best VPN 2025 | Keep on mistergeek.net | VPN | High traffic, core topic | High | Already optimized |
| mistergeek.net | 1234 | YggTorrent Guide | Move to webscroll.fr | Torrenting | Torrent content | Medium | Good SEO potential |
| mistergeek.net | 5678 | Niche Post | Move to hellogeek.net | Other | Low traffic | Low | Experimental content |
### 2. Action-Specific Files
**`posts_to_move_TIMESTAMP.csv`**
- Only posts with "Move to X" decisions
- Ready for export/import automation
**`posts_to_consolidate_TIMESTAMP.csv`**
- Posts with "Consolidate with post_id:X" decisions
- Indicates which posts are duplicates
**`posts_to_delete_TIMESTAMP.csv`**
- Posts marked for deletion
- Low quality, spam, or zero traffic
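The split into action-specific files can be reproduced from the main CSV. The following is a minimal sketch, assuming the `decision` column holds the strings shown in the table above; the function name `split_by_action` is illustrative and is not taken from the script.

```python
import csv
import io
from collections import defaultdict


def split_by_action(rows):
    """Group recommendation rows by the kind of action their decision implies."""
    groups = defaultdict(list)
    for row in rows:
        decision = row["decision"].strip()
        if decision.startswith("Move to "):
            groups["move"].append(row)
        elif decision.startswith("Consolidate with "):
            groups["consolidate"].append(row)
        elif decision == "Delete":
            groups["delete"].append(row)
        else:
            # "Keep on ..." and anything unrecognized stays out of the action files.
            groups["keep"].append(row)
    return groups


# Sample rows copied from the table above; a real run would read the CSV from disk.
sample = io.StringIO(
    "site,post_id,title,decision\n"
    "mistergeek.net,2845,Best VPN 2025,Keep on mistergeek.net\n"
    "mistergeek.net,1234,YggTorrent Guide,Move to webscroll.fr\n"
    "mistergeek.net,5678,Niche Post,Move to hellogeek.net\n"
)
groups = split_by_action(csv.DictReader(sample))
print({name: len(rows) for name, rows in groups.items()})
```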
---
## Understanding Decisions
### Decision Types
| Decision | Meaning | Action |
|----------|---------|--------|
| `Keep on mistergeek.net` | High-value, optimized | Optimize & promote |
| `Move to webscroll.fr` | Torrenting/file-sharing | Export & import |
| `Move to hellogeek.net` | Low-traffic/experimental | Export & import |
| `Consolidate with post_id:2845` | Duplicate content | Merge into post 2845 |
| `Delete` | Low quality or spam | Delete from WordPress |
### Categories
AI assigns one of these categories:
- **VPN** - VPN & privacy tools
- **Software/Tools** - Software reviews & guides
- **Gaming** - Gaming content & emulation
- **Streaming** - Streaming guides & tools
- **Torrenting** - Torrent trackers & guides
- **File-Sharing** - File-sharing services
- **SEO** - SEO & marketing content
- **Content Marketing** - Marketing strategies
- **Other** - Miscellaneous
### Priority
- **High**: Act first (traffic, core content, duplicates)
- **Medium**: Act second (important but less urgent)
- **Low**: Act last (niche, experimental, low impact)
---
## Automation-Friendly Format
The recommendations are designed for automation:
```
"decision": "Move to webscroll.fr"
  → Export post from mistergeek.net
  → Import to webscroll.fr
  → Set 301 redirect

"decision": "Consolidate with post_id:2845"
  → Merge content into post 2845
  → Set 301 redirect from this post

"recommended_category": "VPN"
  → Set WordPress category to "VPN"

"decision": "Delete"
  → Remove post from WordPress
```
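Because the decision strings follow a fixed pattern, they can be turned into structured actions before any automation runs. This is a sketch only: `parse_decision` is a hypothetical helper, and the patterns assume the exact wording listed under "Decision Types".

```python
import re

# Patterns mirror the decision strings documented above.
_MOVE = re.compile(r"^Move to (?P<site>\S+)$")
_CONSOLIDATE = re.compile(r"^Consolidate with post_id:(?P<post_id>\d+)$")
_KEEP = re.compile(r"^Keep on (?P<site>\S+)$")


def parse_decision(decision):
    """Return (action, target) for a decision string, or raise ValueError."""
    decision = decision.strip()
    if decision == "Delete":
        return ("delete", None)
    match = _MOVE.match(decision)
    if match:
        return ("move", match.group("site"))
    match = _CONSOLIDATE.match(decision)
    if match:
        return ("consolidate", int(match.group("post_id")))
    match = _KEEP.match(decision)
    if match:
        return ("keep", match.group("site"))
    raise ValueError(f"Unrecognized decision: {decision!r}")


print(parse_decision("Move to webscroll.fr"))
print(parse_decision("Consolidate with post_id:2845"))
```

Raising on unrecognized strings, rather than skipping them, makes a changed AI output format visible before any post is moved or deleted.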
---
## Example: Using Recommendations
### Review Moves
```bash
open output/reports/posts_to_move_*.csv
```
Shows all posts that should move sites:
```
post_id | title | current_site | decision | reason
1234 | YggTorrent Guide | mistergeek.net | Move to webscroll.fr | Torrent content
5678 | File Sharing | mistergeek.net | Move to webscroll.fr | File-sharing focus
9012 | Experiment | mistergeek.net | Move to hellogeek.net | Very low traffic
```
### Review Consolidations
```bash
open output/reports/posts_to_consolidate_*.csv
```
Shows duplicates:
```
post_id | title | decision | reason
100 | Best VPN 2025 | Consolidate with post_id:2845 | Duplicate topic
101 | VPN Review | Consolidate with post_id:2845 | Similar content
102 | Top VPNs | Consolidate with post_id:2845 | Same theme
```
Action: Keep post 2845, merge content from 100/101/102, delete others with 301 redirects.
---
## Cost & Performance
### API Usage
For 368 posts in batches of 10:
- **Batches**: ~37 API calls
- **Tokens**: ~300-400k total
- **Cost**: ~$1.50-2.00 (well within €50 budget)
- **Time**: ~5-10 minutes
### Token Breakdown
| Operation | Tokens | Cost |
|-----------|--------|------|
| Analyze 10 posts | ~8-10k | ~$0.04-0.05 |
| Full 368 posts | ~300k | ~$1.50 |
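The figures above can be sanity-checked with a few lines of arithmetic. This sketch assumes the $3/$15 per 1M input/output token pricing quoted for Claude 3.5 Sonnet via OpenRouter elsewhere in these guides, and it reads the `8234+1456` token counts from the sample terminal output as input and output tokens respectively, which is an assumption about how the script reports them.

```python
import math

# Pricing quoted for Claude 3.5 Sonnet via OpenRouter, in USD per 1M tokens.
INPUT_USD_PER_MILLION = 3.0
OUTPUT_USD_PER_MILLION = 15.0


def batch_count(total_posts, batch_size=10):
    """Number of API calls needed to cover every post."""
    return math.ceil(total_posts / batch_size)


def call_cost(input_tokens, output_tokens):
    """Cost in USD of one API call with the given token counts."""
    return (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    ) / 1_000_000


batches = batch_count(368)
per_batch = call_cost(8234, 1456)  # token counts from the sample batch
print(f"{batches} batches, ~${per_batch:.3f} each, ~${batches * per_batch:.2f} total")
```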
---
## Complete End-to-End Workflow
```bash
# Step 1: Export all posts (5 min)
python scripts/export_posts_for_ai_decision.py
# Step 2: Analyze with AI (10 min)
python scripts/ai_analyze_posts_for_decisions.py \
output/reports/all_posts_for_ai_decision_20260216_150000.csv
# Step 3: Review recommendations
open output/reports/posts_with_ai_recommendations_*.csv
# Step 4: Create master decision sheet (in Google Sheets)
# Copy recommendations, add "Completed" column, share with team
# Step 5: Execute moves (Week 1-4)
# For each post in posts_to_move_*.csv:
# 1. Export from source site
# 2. Import to destination site
# 3. Set 301 redirect
# 4. Update internal links
# Step 6: Consolidate duplicates (Week 3-4)
# For each post in posts_to_consolidate_*.csv:
# 1. Merge content into target post
# 2. Set 301 redirect
# 3. Delete old post
# Step 7: Delete posts (Week 4)
# For each post in posts_to_delete_*.csv:
# 1. Verify no traffic
# 2. Delete post
# 3. No redirect needed
```
---
## Example Output
### Terminal Output
```
======================================================================
AI-POWERED POST ANALYSIS AND RECOMMENDATIONS
======================================================================
Loading CSV: output/reports/all_posts_for_ai_decision_20260216_150000.csv
✓ Loaded 368 posts from CSV
mistergeek.net: 328 posts
webscroll.fr: 17 posts
hellogeek.net: 23 posts
======================================================================
ANALYZING POSTS WITH AI
======================================================================
Processing 368 posts in 37 batches of 10...
Batch 1/37: Analyzing 10 posts...
Sending batch to Claude for analysis...
✓ Got recommendations (tokens: 8234+1456)
Batch 2/37: Analyzing 10 posts...
...
✓ Analysis complete!
Total recommendations: 368
API calls: 37
Estimated cost: $1.84
======================================================================
ANALYSIS SUMMARY
======================================================================
DECISIONS:
Keep on mistergeek.net: 185 posts
Move to webscroll.fr: 42 posts
Move to hellogeek.net: 89 posts
Consolidate with post_id:XX: 34 posts
Delete: 18 posts
RECOMMENDED CATEGORIES:
VPN: 52
Software/Tools: 48
Gaming: 45
Torrenting: 42
Other: 181
...
PRIORITY BREAKDOWN:
High: 95 posts
Medium: 187 posts
Low: 86 posts
======================================================================
EXPORTING RESULTS
======================================================================
✓ Main file: output/reports/posts_with_ai_recommendations_20260216_150000.csv
✓ Moves file (42 posts): output/reports/posts_to_move_20260216_150000.csv
✓ Consolidate file (34 posts): output/reports/posts_to_consolidate_20260216_150000.csv
✓ Delete file (18 posts): output/reports/posts_to_delete_20260216_150000.csv
======================================================================
NEXT STEPS
======================================================================
1. Review main file with all recommendations:
output/reports/posts_with_ai_recommendations_20260216_150000.csv
2. Execute moves (automate with script):
output/reports/posts_to_move_20260216_150000.csv
3. Consolidate duplicates:
output/reports/posts_to_consolidate_20260216_150000.csv
4. Delete low-quality posts:
output/reports/posts_to_delete_20260216_150000.csv
✓ Analysis complete!
```
---
## Integration with Other Tools
### Future: Export/Import Automation
Once you have recommendations, you could automate:
```python
# Pseudo-code for automation
for post in posts_to_move:
    # 1. Export post XML from source site
    # 2. Import to destination site
    # 3. Create 301 redirect
    # 4. Update internal links
    ...
```
### Future: Category Bulk Update
```python
# Pseudo-code for category automation
for post in all_posts:
    # 1. Read recommended_category from CSV
    # 2. Set post category via WordPress API
    # 3. Update in bulk
    ...
```
---
## Troubleshooting
### "OPENROUTER_API_KEY not set"
- Make sure .env file has OPENROUTER_API_KEY
- Verify key is valid and has credits
- Check file permissions
### "Could not find JSON array in response"
- AI response format might have changed
- Check OpenRouter API documentation
- Try again (might be temporary API issue)
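This error usually means the model wrapped its answer in extra prose. One tolerant way to pull the array out is sketched below. It is not the script's actual parser: `extract_json_array` is a hypothetical helper built on the standard library's `json.JSONDecoder.raw_decode`.

```python
import json


def extract_json_array(text):
    """Return the first JSON array found in text, or raise ValueError."""
    decoder = json.JSONDecoder()
    start = text.find("[")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at the given index.
            value, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, list):
            return value
        start = text.find("[", start + 1)
    raise ValueError("Could not find JSON array in response")


reply = 'Here you go: [{"post_id": 1234, "decision": "Delete"}] Hope that helps.'
print(extract_json_array(reply))
```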
### CSV files are empty
- Check export worked: verify `all_posts_for_ai_decision_*.csv` exists
- Verify WordPress API is working
- Check credentials in .env
### Higher cost than expected
- Check batch size (default is 10)
- Could reduce to batches of 5 for less cost
- Or use a cheaper model (GPT-3.5 instead of Claude)
---
## Next Steps
Ready to analyze?
```bash
# Step 1: Export posts
python scripts/export_posts_for_ai_decision.py
# Step 2: Get AI recommendations
python scripts/ai_analyze_posts_for_decisions.py \
output/reports/all_posts_for_ai_decision_*.csv
# Step 3: Review and execute!
```
Let me know when you're ready to start! 🚀


@@ -0,0 +1,423 @@
# AI Recommendations & Meta Description Diagnostics
## Part 1: AI Recommendations (`--top-n` Parameter)
### Understanding the Default (10 posts)
By default, the analyzer:
- **Analyzes ALL posts** (titles, meta descriptions, scores)
- **Generates AI recommendations for only top 10** worst-scoring posts
```bash
python scripts/multi_site_seo_analyzer.py
# Result:
#   ✓ Analyzes 368 posts (all of them)
#   ✓ AI recommendations: top 10 only
#   ✓ Cost: ~$0.10
```
### Why Only 10?
**Cost Control:**
| Posts | Cost | Time | Use Case |
|-------|------|------|----------|
| 10 | $0.10 | 5 min | Quick analysis, focus on worst |
| 20 | $0.20 | 8 min | More detailed, more cost |
| 50 | $0.50 | 15 min | Comprehensive, moderate cost |
| 100 | $1.00 | 25 min | Very thorough |
| 368 | $3.60+ | 60+ min | All posts (within €50 budget) |
### Changing the AI Analysis Level
```bash
# AI recommendations for the 20 worst-scoring posts
python scripts/multi_site_seo_analyzer.py --top-n 20
# AI recommendations for the 50 worst-scoring posts
python scripts/multi_site_seo_analyzer.py --top-n 50
# AI recommendations for all 368 posts (comprehensive)
python scripts/multi_site_seo_analyzer.py --top-n 368
# No AI recommendations (scores only, free)
python scripts/multi_site_seo_analyzer.py --no-ai
```
### Expected Results
**Command:** `--top-n 50`
```
Analyzing 368 posts...
CSV written with all 368 posts
Generating AI recommendations for top 50 posts...
[1/50] Post with score 12 → AI generates recommendations
[2/50] Post with score 18 → AI generates recommendations
...
[50/50] Post with score 72 → AI generates recommendations
CSV updated with 50 AI recommendations
Cost: ~$0.50
```
**Output CSV:**
- Posts 1-50: AI recommendations filled in
- Posts 51-368: AI recommendations empty
- All posts have: title_score, meta_score, overall_score
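The selection step described above amounts to sorting by score and taking the first N entries. A minimal sketch, assuming each post is a dict with an `overall_score` key (the CSV column name); `select_for_ai` is an illustrative name, not the script's function:

```python
def select_for_ai(posts, top_n=10):
    """Return the top_n lowest-scoring posts; the rest get no AI recommendation."""
    if top_n <= 0:
        return []
    return sorted(posts, key=lambda post: post["overall_score"])[:top_n]


posts = [
    {"post_id": 1, "overall_score": 72},
    {"post_id": 2, "overall_score": 12},
    {"post_id": 3, "overall_score": 18},
    {"post_id": 4, "overall_score": 90},
]
print([post["post_id"] for post in select_for_ai(posts, top_n=2)])
```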
### Workflow by Level
**Level 1: Quick Overview (--no-ai)**
```bash
python scripts/multi_site_seo_analyzer.py --no-ai
# See all scores, identify worst posts, no AI cost
# Good for: Understanding what needs work
```
**Level 2: Quick Wins (default --top-n 10)**
```bash
python scripts/multi_site_seo_analyzer.py
# Analyze top 10 worst, get AI recommendations
# Good for: Getting started, low cost (~$0.10)
```
**Level 3: Thorough Analysis (--top-n 50)**
```bash
python scripts/multi_site_seo_analyzer.py --top-n 50
# Analyze top 50 worst, comprehensive AI
# Good for: Serious optimization effort (~$0.50)
```
**Level 4: Complete Analysis (--top-n 368)**
```bash
python scripts/multi_site_seo_analyzer.py --top-n 368
# AI for every post
# Good for: Complete overhaul, fits €50 budget (~$3.60)
```
### Combined Options
```bash
# Include drafts + AI for top 30
python scripts/multi_site_seo_analyzer.py --include-drafts --top-n 30
# No AI (free, fast) + drafts
python scripts/multi_site_seo_analyzer.py --include-drafts --no-ai
# All posts + AI for all + progressive CSV
python scripts/multi_site_seo_analyzer.py --include-drafts --top-n 368
```
---
## Part 2: Meta Description Detection & Diagnostics
### The Problem
Meta descriptions aren't being found for some posts. This could be because:
1. **WordPress REST API not returning meta fields**
2. **Meta fields stored in different plugin locations**
3. **SEO plugin not properly exposing fields**
### Supported SEO Plugins
The script now looks for meta descriptions from:
| Plugin | Field Name |
|--------|------------|
| Yoast SEO | `_yoast_wpseo_metadesc` |
| Rank Math | `_rank_math_description` |
| All in One SEO | `_aioseo_description` |
| Standard | `description` |
| Alternative names | `_meta_description`, `metadesc` |
### Diagnostic Command
Check what meta fields are actually available on your site:
```bash
python scripts/multi_site_seo_analyzer.py --diagnose https://www.mistergeek.net
```
**Output example:**
```
============================================================
META FIELD DIAGNOSTIC
============================================================
Site: https://www.mistergeek.net
Checking available meta fields in first post...
Post: The Best VPN Services 2025
Available meta fields:
• _yoast_wpseo_metadesc: Discover the best VPN services...
• _yoast_wpseo_focuskw: best VPN
• _yoast_wpseo_title: Best VPN Services 2025 | mistergeek
• custom_field_1: some value
Full meta object:
{
"_yoast_wpseo_metadesc": "Discover the best VPN services...",
"_yoast_wpseo_focuskw": "best VPN",
...
}
```
### Running Diagnostics on All 3 Sites
```bash
# Mistergeek
python scripts/multi_site_seo_analyzer.py --diagnose https://www.mistergeek.net
# Webscroll
python scripts/multi_site_seo_analyzer.py --diagnose https://www.webscroll.fr
# HelloGeek
python scripts/multi_site_seo_analyzer.py --diagnose https://www.hellogeek.net
```
### What to Look For
**Good - Meta descriptions found:**
```
Available meta fields:
• _yoast_wpseo_metadesc: Discover the best VPN...
• _yoast_wpseo_focuskw: best VPN
```
✓ Meta descriptions will be detected
**Problem - No meta descriptions:**
```
Available meta fields:
(No meta fields found)
```
✗ Either:
- SEO plugin not installed
- REST API not exposing meta
- Custom field names not recognized
**Problem - Unknown field names:**
```
Available meta fields:
• custom_meta_1: some value
• my_seo_field: description text
```
✗ Custom field names - need to update script
---
## Fixing Missing Meta Descriptions
### Solution 1: Enable REST API for SEO Plugin
**For Yoast SEO:**
1. Admin → Yoast SEO → Settings → Advanced
2. Look for "REST API" option
3. Enable "Show in REST API"
4. Save
**For Rank Math:**
1. Admin → Rank Math → General Settings
2. Look for "REST API" option
3. Enable REST API fields
4. Save
**For All in One SEO:**
1. Admin → All in One SEO → Settings
2. Look for REST API option
3. Enable REST API
4. Save
### Solution 2: Add Custom Field Recognition
If your site uses custom field names, tell us and we'll add them:
```python
# Example: if site uses "my_custom_description"
meta_desc = (
    meta_dict.get('_yoast_wpseo_metadesc', '') or
    meta_dict.get('_rank_math_description', '') or
    meta_dict.get('my_custom_description', '')  # ← Add this
)
```
Run the diagnostic and send us the field name; we'll update the script.
### Solution 3: Manual Curl Request
Check API response directly:
```bash
# Replace with your site and credentials
curl -u "username:app_password" \
"https://www.mistergeek.net/wp-json/wp/v2/posts?per_page=1&status=publish" | jq '.[] | .meta'
# Output will show all meta fields available
```
### Solution 4: Check REST API is Enabled
Test if REST API works:
```bash
# Should return post data (URL quoted so the shell does not expand "?")
curl "https://www.mistergeek.net/wp-json/wp/v2/posts?per_page=1"
# Should return 404 or empty if not available
curl "https://broken-site.com/wp-json/wp/v2/posts"
```
---
## Workflow: Finding Missing Meta Descriptions
### Step 1: Run Diagnostic
```bash
python scripts/multi_site_seo_analyzer.py --diagnose https://www.mistergeek.net
```
### Step 2: Check Output
Look for meta description field names in the output.
### Step 3: If Missing
**Option A: Enable in SEO Plugin**
- Go to plugin settings
- Enable REST API field exposure
- Save
**Option B: Update Field Name**
- If custom field is shown in diagnostic
- Tell us the field name
- We'll add it to the script
**Option C: Check WordPress**
- Verify WordPress REST API is working
- Check security plugins aren't blocking
- Ensure user has read permissions
### Step 4: Re-run Analysis
```bash
python scripts/multi_site_seo_analyzer.py --include-drafts --top-n 50
```
Now meta descriptions should be found!
---
## Complete Examples
### Example 1: Quick Analysis (Cost: $0.10)
```bash
# Default: all posts analyzed, AI for top 10
python scripts/multi_site_seo_analyzer.py
# Result:
# - 368 posts analyzed (titles, meta, scores)
# - 10 posts get AI recommendations
# - Cost: ~$0.10
# - Time: 5 minutes
```
### Example 2: Comprehensive Analysis (Cost: $0.50)
```bash
# Include drafts, AI for top 50 worst posts
python scripts/multi_site_seo_analyzer.py --include-drafts --top-n 50
# Result:
# - 368 posts analyzed (all, including drafts)
# - 50 posts get AI recommendations
# - Cost: ~$0.50
# - Time: 15 minutes
```
### Example 3: Diagnostic + Complete Analysis
```bash
# First, diagnose meta fields
python scripts/multi_site_seo_analyzer.py --diagnose https://www.mistergeek.net
# Then run full analysis
python scripts/multi_site_seo_analyzer.py --include-drafts --top-n 100
# Result:
# - Understand meta situation first
# - 100 posts get AI recommendations
# - Cost: ~$1.00
# - Time: 30 minutes
```
### Example 4: Free Analysis (No AI Cost)
```bash
# Get all scores without AI
python scripts/multi_site_seo_analyzer.py --no-ai
# Result:
# - 368 posts analyzed
# - 0 posts get AI recommendations
# - Cost: $0.00
# - Time: 2 minutes
# - Then manually review CSV and optimize
```
---
## Summary
### AI Recommendations (--top-n)
```bash
--no-ai # Cost: $0 | Time: 2 min | AI: 0 posts
--top-n 10 # Cost: $0.10 | Time: 5 min | AI: 10 posts (default)
--top-n 20 # Cost: $0.20 | Time: 8 min | AI: 20 posts
--top-n 50 # Cost: $0.50 | Time: 15 min | AI: 50 posts
--top-n 100 # Cost: $1.00 | Time: 25 min | AI: 100 posts
--top-n 368 # Cost: $3.60 | Time: 60 min | AI: all posts
```
### Meta Description Detection
```bash
--diagnose URL # Check what meta fields are available
```
If meta descriptions are not found:
1. Run diagnostic
2. Check which field names are available
3. Enable in SEO plugin settings OR
4. Tell us custom field name and we'll add support
---
## Next Steps
1. **Run diagnostic:**
```bash
python scripts/multi_site_seo_analyzer.py --diagnose https://www.mistergeek.net
```
2. **Check for meta descriptions** in output
3. **If missing:**
- Enable REST API in SEO plugin, or
- Share diagnostic output so we can add custom field support
4. **Run full analysis with desired AI level:**
```bash
python scripts/multi_site_seo_analyzer.py --include-drafts --top-n 50
```
5. **Review results in CSV**
Ready? Run: `python scripts/multi_site_seo_analyzer.py --diagnose https://www.mistergeek.net`

guides/ANALYZER_SUMMARY.md Normal file

@@ -0,0 +1,382 @@
# Multi-Site SEO Analyzer - Implementation Summary
## What Was Created
### New Script: `scripts/multi_site_seo_analyzer.py`
A Python script that automatically:
1. **Connects to 3 WordPress sites** (mistergeek.net, webscroll.fr, hellogeek.net)
2. **Fetches all published posts** using WordPress REST API
3. **Analyzes titles** for:
- Length (optimal: 50-70 chars)
- Power words (best, complete, guide, etc.)
- Numbers (2025, top 10, etc.)
- Readability and special characters
4. **Analyzes meta descriptions** for:
- Presence (missing = 0 score)
- Length (optimal: 120-160 chars)
- Call-to-action language
5. **Scores each post** (0-100) based on SEO best practices
6. **Generates AI recommendations** (optional) for top priority posts using Claude via OpenRouter
7. **Exports results** to:
- CSV file with detailed analysis
- Markdown summary report
---
## Features
### Automatic Title Analysis
- Detects titles that are too short/long
- Identifies missing power words
- Checks for numbers/statistics
- Flags problematic special characters
- Scoring algorithm: 0-100
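The title checks listed above can be illustrated with a small issue detector. This is a sketch under stated assumptions: it uses the 50-70 character range and the three example power words named earlier in this summary, it only reports issues, and it does not reproduce the analyzer's 0-100 scoring, whose weights are not documented here. `title_issues` and `POWER_WORDS` are illustrative names.

```python
import re

# Illustrative subset of power words; the analyzer's full list is not shown here.
POWER_WORDS = {"best", "complete", "guide"}


def title_issues(title, min_len=50, max_len=70):
    """List the title checks from the feature summary that a title fails."""
    issues = []
    if len(title) < min_len:
        issues.append(f"Too short ({len(title)} chars)")
    elif len(title) > max_len:
        issues.append(f"Too long ({len(title)} chars)")
    words = set(re.findall(r"[a-z]+", title.lower()))
    if not words & POWER_WORDS:
        issues.append("No power words")
    if not re.search(r"\d", title):
        issues.append("No numbers")
    return issues


print(title_issues("VPN"))
```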
### Automatic Meta Description Analysis
- Detects missing meta descriptions (0 score)
- Validates length (120-160 chars optimal)
- Checks for call-to-action language
- Scoring algorithm: 0-100
### Combined SEO Scoring
```
Overall Score = (Title Score × 40%) + (Meta Description Score × 60%)
```
Meta descriptions weighted heavier because they directly impact CTR from search results.
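As a worked instance of the formula, the weighting can be written as a one-line function. The name `overall_score` is chosen here to match the CSV column; integer weights are used to avoid floating-point rounding surprises.

```python
def overall_score(title_score, meta_score):
    """Weighted SEO score: 40% title, 60% meta description."""
    return (40 * title_score + 60 * meta_score) / 100


print(overall_score(80, 50))  # 62.0
print(overall_score(80, 0))   # 32.0: a missing meta description caps the score at 40
```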
### AI-Powered Recommendations (Optional)
- Uses Claude 3.5 Sonnet via OpenRouter
- Generates specific, actionable recommendations
- Cost-optimized: Only analyzes top priority posts (default 10, configurable)
- Estimated cost: $0.10 per 10 posts analyzed
### Multi-Site Support
- Fetches from all 3 sites simultaneously
- Per-site breakdown in reports
- Identifies top 5 posts to optimize per site
- Consolidates analysis across all sites
---
## Configuration Changes
### Updated `scripts/config.py`
Added multi-site configuration support:
```python
WORDPRESS_SITES = {
    'mistergeek.net': {'url': '...', 'username': '...', 'password': '...'},
    'webscroll.fr': {'url': '...', 'username': '...', 'password': '...'},
    'hellogeek.net': {'url': '...', 'username': '...', 'password': '...'}
}
```
New methods:
- `get_site_config(site_name)` - Get config for specific site
- `get_all_sites()` - Get all configured sites
### Updated `.env.example`
Added variables for each site:
```
WORDPRESS_MISTERGEEK_URL=...
WORDPRESS_MISTERGEEK_USERNAME=...
WORDPRESS_MISTERGEEK_PASSWORD=...
WORDPRESS_WEBSCROLL_URL=...
[etc for each site]
```
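One way such variables could be turned into a per-site config dict is sketched below. This is not the code in `scripts/config.py`: `load_site_config` is a hypothetical helper, and it takes the variable prefix as an argument so that no variable names beyond those shown above are assumed.

```python
import os


def load_site_config(prefix, environ=None):
    """Build {'url', 'username', 'password'} from PREFIX_URL, PREFIX_USERNAME, PREFIX_PASSWORD."""
    environ = os.environ if environ is None else environ
    config = {}
    for field in ("url", "username", "password"):
        key = f"{prefix}_{field.upper()}"
        if key not in environ:
            raise KeyError(f"Missing environment variable {key}")
        config[field] = environ[key]
    return config


# Fake environment for illustration; real values come from .env.
fake_env = {
    "WORDPRESS_MISTERGEEK_URL": "https://www.mistergeek.net",
    "WORDPRESS_MISTERGEEK_USERNAME": "editor",
    "WORDPRESS_MISTERGEEK_PASSWORD": "app-password",
}
print(load_site_config("WORDPRESS_MISTERGEEK", fake_env)["url"])
```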
---
## Documentation Created
### 1. `guides/SEO_ANALYZER_GUIDE.md` (Comprehensive)
- Complete setup instructions
- Detailed usage examples
- How to interpret scores
- Understanding title and meta analysis
- Action plan for implementation
- Cost estimation
- Troubleshooting guide
- Advanced usage examples
- FAQ section
### 2. `guides/QUICKSTART_ANALYZER.md` (Fast Reference)
- 30-second setup
- One-liners for different scenarios
- Common commands
- Quick troubleshooting
- Cost comparison table
### 3. `guides/ANALYZER_SUMMARY.md` (This document)
- Overview of what was created
- Feature summary
- Usage instructions
- Output explanation
---
## Usage
### Basic Command
```bash
python scripts/multi_site_seo_analyzer.py
```
**What it does:**
- Fetches posts from all 3 sites
- Analyzes titles and meta descriptions
- Generates AI recommendations for top 10 worst-scoring posts
- Exports CSV and Markdown report
### Command Options
```bash
# Skip AI recommendations (free, faster)
python scripts/multi_site_seo_analyzer.py --no-ai
# AI recommendations for top 20 posts
python scripts/multi_site_seo_analyzer.py --top-n 20
# Custom output file
python scripts/multi_site_seo_analyzer.py --output my_report.csv
```
---
## Output Files
### Location: `output/reports/`
### 1. CSV File: `seo_analysis_YYYYMMDD_HHMMSS.csv`
Contains one row per post with columns:
- `site` - Website name
- `post_id` - WordPress post ID
- `title` - Post title
- `slug` - Post slug
- `url` - Full URL
- `meta_description` - Current meta description
- `title_score` - Title SEO score (0-100)
- `title_issues` - Title problems identified
- `title_recommendations` - How to improve title
- `meta_score` - Meta description SEO score (0-100)
- `meta_issues` - Meta description problems
- `meta_recommendations` - How to improve meta
- `overall_score` - Combined score (40% title + 60% meta)
- `ai_recommendations` - Claude-generated specific recommendations
**Use for:**
- Importing to Excel/Google Sheets
- Filtering and sorting
- Bulk editing preparations
- Tracking changes over time
### 2. Markdown Report: `seo_analysis_YYYYMMDD_HHMMSS_summary.md`
Contains:
- Summary statistics (total posts, average scores, cost)
- Priority issues breakdown (missing meta, weak titles, etc.)
- Per-site analysis and top 5 posts to optimize per site
- Human-readable explanations
**Use for:**
- Quick overview
- Sharing with team
- Understanding key metrics
- Decision-making
---
## Score Interpretation
### Score Ranges
| Range | Interpretation | Action |
|-------|-----------------|--------|
| 0-25 | Critical | Fix immediately - major SEO issues |
| 25-50 | Poor | Optimize soon - multiple issues |
| 50-75 | Fair | Improve when convenient - some issues |
| 75-90 | Good | Minor tweaks only - mostly optimized |
| 90-100 | Excellent | No changes needed - well optimized |
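The table can be expressed as a lookup. Because adjacent ranges share their boundary values (25, 50, 75, 90), this sketch assumes each boundary belongs to the higher band; that choice is an assumption, not something the analyzer documents, and `interpret_score` is an illustrative name.

```python
def interpret_score(score):
    """Map a 0-100 SEO score to the interpretation label from the table."""
    if not 0 <= score <= 100:
        raise ValueError(f"Score out of range: {score}")
    if score < 25:
        return "Critical"
    if score < 50:
        return "Poor"
    if score < 75:
        return "Fair"
    if score < 90:
        return "Good"
    return "Excellent"


print(interpret_score(12), interpret_score(88))
```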
### Example Scores
**Poor Post (Score: 2)**
```
Title: "VPN"
- Issues: Too short (3 chars), no power words, no numbers
- Title Score: 5/100
Meta Description: [MISSING]
- Issues: Missing entirely
- Meta Score: 0/100
Overall: 2/100 (Critical - needs work)
```
**Good Post (Score: 89)**
```
Title: "Best VPN Services 2025: Complete Review"
- Issues: None
- Title Score: 95/100
Meta Description: "Compare 50+ VPN services with speed tests, security reviews, and pricing. Find the best VPN for your needs."
- Issues: None
- Meta Score: 85/100
Overall: 89/100 (Good - minimal changes)
```
---
## Cost Breakdown
### Using AI Recommendations
**Pricing:** Claude 3.5 Sonnet via OpenRouter = $3/$15 per 1M input/output tokens
**Per run examples:**
| Posts Analyzed | Tokens | Cost |
|---|---|---|
| 10 posts | ~30k input, 5k output | ~$0.10 |
| 20 posts | ~60k input, 10k output | ~$0.20 |
| 50 posts | ~150k input, 25k output | ~$0.50 |
| 100 posts | ~300k input, 50k output | ~$1.00 |
### Monthly Budget
- **Weekly no-AI:** $0/month
- **Weekly with AI (top 10):** ~$0.40/month
- **Monthly with AI (top 50):** ~$0.50/month
- **Fits easily in €50 budget ✓**
---
## Prerequisites
Before running, ensure:
1. **WordPress credentials** for all 3 sites (API/app passwords)
2. **OpenRouter API key** (for AI recommendations)
3. **REST API enabled** on all 3 WordPress sites
4. **Python 3.8+** installed
5. **Dependencies installed:** `pip install -r requirements.txt`
---
## Workflow Example
### Week 1: Initial Analysis
```bash
# Run analyzer with AI for top 10
python scripts/multi_site_seo_analyzer.py --top-n 10
# Review results
open output/reports/seo_analysis_*_summary.md
# See top 10 posts to optimize on each site
# Note: AI cost ~$0.10
```
### Week 1-4: Implementation
For each of top 10 posts per site:
1. Open WordPress editor
2. Review AI recommendation
3. Update title (if needed)
4. Update meta description (if needed)
5. Publish changes
Average time: 2-3 minutes per post = 30-45 minutes total
### Week 5: Re-analysis
```bash
# Run analyzer again to track progress
python scripts/multi_site_seo_analyzer.py --no-ai
# Compare with Week 1 results
# Identify next batch of 10 posts to optimize
```
Repeat as needed.
---
## Expected Improvements
### Short-term (Month 1)
- **Reduced posts with score < 50:** 30-50% fewer critical issues
- **Meta descriptions added:** Most missing descriptions now present
- **Title improvements:** Clearer, more compelling titles
### Medium-term (Month 3)
- **CTR improvement:** 10-20% increase in click-through rate from search results
- **Keyword rankings:** Some keywords move up 1-3 positions
- **Organic traffic:** 5-10% increase as improved titles/descriptions increase clicks
### Long-term (Months 3-6)
- **Compound effect:** Better CTR signals boost rankings
- **Authority:** Focused content with optimized SEO
- **Traffic:** 20-30% total increase from all factors
---
## Next Steps
1. **Update .env** with your 3 site credentials
2. **Run analyzer:** `python scripts/multi_site_seo_analyzer.py`
3. **Review report:** `open output/reports/seo_analysis_*_summary.md`
4. **Implement:** Start with top 5 posts per site
5. **Re-run:** Monthly to track progress and identify next batch
---
## Troubleshooting
### Connection Issues
- Verify site URLs (https, www)
- Check WordPress credentials
- Test: `curl "https://yoursite.com/wp-json/wp/v2/posts?per_page=1"`
### No Posts Found
- Check credentials have read permissions
- Verify posts are published (not draft)
- Try disabling SSL verification (last resort)
### AI Errors
- Verify OPENROUTER_API_KEY is set
- Check key has API credits
- Use --no-ai to skip AI (still analyzes)
See `guides/SEO_ANALYZER_GUIDE.md` for detailed troubleshooting.
---
## Files Summary
| File | Purpose |
|------|---------|
| `scripts/multi_site_seo_analyzer.py` | Main analyzer script |
| `scripts/config.py` | Updated with multi-site config |
| `.env` | Your site credentials (not in repo) |
| `.env.example` | Example config (with all fields) |
| `guides/SEO_ANALYZER_GUIDE.md` | Comprehensive guide |
| `guides/QUICKSTART_ANALYZER.md` | Quick reference |
| `guides/ANALYZER_SUMMARY.md` | This file |
| `output/reports/` | Where results are saved |
---
## Questions?
See the full guide: `guides/SEO_ANALYZER_GUIDE.md`
Ready to analyze? Run: `python scripts/multi_site_seo_analyzer.py`


@@ -0,0 +1,330 @@
# API Troubleshooting - 400 Bad Request Issues
## The Problem
WordPress REST API returned **400 Bad Request** errors on pagination:
```
✓ Fetched 100 posts (page 1)
✓ Fetched 100 posts (page 2)
✓ Fetched 100 posts (page 3)
✗ Error page 4: 400 Bad Request
```
This is a **server-side limitation**, not a bug in our code.
---
## Root Causes
### 1. **API Pagination Limits**
Some WordPress configurations limit how many pages can be fetched:
- Page 1-3: OK (limit reached)
- Page 4+: 400 Bad Request
**Common causes:**
- Plugin restrictions (security, performance)
- Server configuration limits
- REST API throttling
- Custom WordPress filters
### 2. **_fields Parameter Issues**
The `_fields` parameter (to fetch only specific columns) might cause issues on:
- Specific API versions
- Custom REST API implementations
- Security plugins that filter fields
### 3. **Status Parameter Encoding**
Multi-status queries (`status=publish,draft`) can fail on pagination.
---
## The Solution
The script now:
1. **Gracefully handles 400 errors** - Treats pagination limit as end of data
2. **Retries without _fields** - Falls back to fetching all fields if needed
3. **Continues analysis** - Uses posts it was able to fetch (doesn't fail)
4. **Logs what it got** - Shows exactly how many posts were fetched
```python
# Graceful error handling
if response.status_code == 400:
    logger.info(f"API limit reached (got {status_count} posts)")
    break  # Stop pagination, use what we have
```
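The four behaviours listed above can be sketched as one small pagination loop. This is a minimal illustration, not the analyzer's actual code: `fetch_page` is a hypothetical callable standing in for the HTTP request, and its `(status_code, posts)` return value is an assumed convention. `fake_fetch` simulates a site with 228 posts.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical callable: (page, use_fields) -> (HTTP status code, list of posts).
FetchPage = Callable[[int, bool], Tuple[int, List[Dict]]]


def fetch_until_limit(fetch_page: FetchPage, max_pages: int = 1000) -> List[Dict]:
    """Collect posts page by page, treating a 400 as the end of the data."""
    posts: List[Dict] = []
    use_fields = True  # start with the lighter _fields request
    page = 1
    while page <= max_pages:
        status, batch = fetch_page(page, use_fields)
        if status == 400 and use_fields:
            # Retry the same page once without _fields before giving up.
            use_fields = False
            continue
        if status != 200 or not batch:
            # Pagination limit (or another error): keep what we have.
            break
        posts.extend(batch)
        page += 1
    return posts


def fake_fetch(page: int, use_fields: bool) -> Tuple[int, List[Dict]]:
    """Stand-in server: 228 posts, 100 per page, 400 beyond page 3."""
    sizes = {1: 100, 2: 100, 3: 28}
    if page not in sizes:
        return 400, []
    return 200, [{"id": (page - 1) * 100 + i} for i in range(sizes[page])]


print(len(fetch_until_limit(fake_fetch)))
```

Passing the fetch function in as a parameter keeps the loop testable without a live WordPress site.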
---
## What Happens Now
### Before (Failed)
```
Fetching mistergeek.net...
✓ Fetched 100 posts (page 1)
✓ Fetched 100 posts (page 2)
✗ Error page 4: 400 Bad Request
ERROR: No posts found on any site
```
### After (Works)
```
Fetching mistergeek.net...
✓ Fetched 100 publish posts (page 1)
✓ Fetched 100 publish posts (page 2)
✓ Fetched 28 publish posts (page 3)
ⓘ API limit reached (fetched 228 posts)
✓ Total publish posts: 228
```
---
## How to Check If This Affects You
### If you see:
```
✓ Fetched 100 posts (page 1)
✓ Fetched 100 posts (page 2)
✓ Fetched 28 posts (page 3)
✓ Fetched 15 posts (page 4)
✓ Total posts: 243
```
**Good!** Your API supports full pagination. All posts are being fetched.
### If you see:
```
✓ Fetched 100 posts (page 1)
ⓘ API limit reached (fetched 100 posts)
✓ Total posts: 100
```
**Limited pagination.** API only allows page 1. Script continues with 100 posts.
### If you see:
```
✓ Fetched 100 posts (page 1)
✓ Fetched 100 posts (page 2)
ⓘ API limit reached (fetched 200 posts)
✓ Total posts: 200
```
**Partial pagination.** API allows pages 1-2. Script gets 200 posts.
---
## Impact on Analysis
### Scenario 1: All Posts Fetched (Full Pagination)
```
262 posts total
262 posts analyzed ✓
100% coverage
```
**Result:** Complete analysis, no issues.
### Scenario 2: Limited to First Page (100 posts)
```
262 posts total
100 posts analyzed
38% coverage
```
**Result:** Analysis of first 100 posts only. Missing ~162 posts.
**Impact:**
- Report shows only first 100 posts
- Cannot analyze all content
- Must run analyzer multiple times or contact hosting provider
### Scenario 3: Limited to First 3 Pages (up to 300 posts)
```
262 posts total
228 posts analyzed ✓
87% coverage
```
**Result:** Analyzes most posts, misses last few.
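The coverage figures in the three scenarios follow from a simple ratio. A minimal sketch, assuming you already know the site's real post total (262 here, taken from the scenarios); the function name is illustrative:

```python
def coverage_percent(fetched: int, total: int) -> int:
    """Share of known posts that were fetched, rounded to a whole percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * fetched / total)


for fetched in (262, 100, 228):
    print(f"{fetched} of 262 posts -> {coverage_percent(fetched, 262)}% coverage")
```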
---
## Solutions If Limited
### Solution 1: Contact Hosting Provider
**Ask for:**
> "Can you increase the WordPress REST API pagination limit? Currently limited to X posts per site."
Most providers can increase this in:
- WordPress settings
- PHP configuration
- Plugin settings
### Solution 2: Fetch in Batches
If API limits to 100 posts at a time:
```bash
# Run 1: Analyze first 100
python scripts/multi_site_seo_analyzer.py
# Save the most recent results file (a bare glob fails if it matches several files)
cp "$(ls -t output/reports/seo_analysis_*.csv | head -n 1)" week1_batch1.csv
# Then manually get remaining posts another way
# (export from WordPress admin, use different tool, etc.)
```
### Solution 3: Check Security Plugins
Some plugins limit REST API access:
- Wordfence
- Sucuri
- iThemes Security
- Jetpack
Try:
1. Temporarily disable security plugins
2. Run analyzer
3. Re-enable plugins
If this works, configure the plugin to allow REST API access from your IP.
### Solution 4: Use WordPress Export Feature
If REST API is completely broken:
1. WordPress Admin → Tools → Export
2. Select posts to export
3. Download XML
4. Convert XML to CSV
5. Run analyzer on CSV (different mode)
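Step 4 (XML to CSV) could look roughly like the sketch below. It assumes the export is RSS-style XML in which each post is an `<item>` with plain `<title>` and `<link>` children. Other fields in a real export are namespaced and need extra handling, and the export may contain items that are not posts. The output column names are placeholders; match them to whatever the analyzer's CSV mode expects.

```python
import csv
import io
import xml.etree.ElementTree as ET


def wxr_items_to_csv(xml_text: str) -> str:
    """Write one CSV row (title, url) per <item> in an RSS-style export."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["title", "url"])
    for item in root.iter("item"):
        writer.writerow([
            item.findtext("title", default=""),
            item.findtext("link", default=""),
        ])
    return out.getvalue()


# Tiny hand-written sample, not a real export.
sample = """<rss><channel>
  <item><title>Best VPN 2025</title><link>https://example.com/best-vpn/</link></item>
  <item><title>Tracker Guide</title><link>https://example.com/tracker/</link></item>
</channel></rss>"""

print(wxr_items_to_csv(sample))
```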
---
## When to Worry
### No Worries If:
- API fetches 150+ posts (most content covered)
- Error message says "API limit reached" (graceful)
- Analysis completes successfully
- CSV has all/most posts
### Worth Investigating If:
- Only fetching <50 posts
- API returning other errors (401, 403, 500)
- All 3 sites have same issue
- Posts are missing from analysis
---
## Checking Your Hosting
### How to check API pagination limit:
**In browser/terminal:**
```bash
# Replace with your site. Quote the URL so the shell does not treat & as a background operator.
curl "https://www.mistergeek.net/wp-json/wp/v2/posts?per_page=100&status=publish"
# Try different pages
curl "https://www.mistergeek.net/wp-json/wp/v2/posts?page=1&per_page=100&status=publish"
curl "https://www.mistergeek.net/wp-json/wp/v2/posts?page=2&per_page=100&status=publish"
curl "https://www.mistergeek.net/wp-json/wp/v2/posts?page=3&per_page=100&status=publish"
```
**If you get:**
- 200 OK: Page works
- 400 Bad Request: Pagination limited
- 401 Unauthorized: Auth needed
- 403 Forbidden: Access denied
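The status list above can be turned into a small helper. One caveat, stated as an assumption about your setup: on a stock WordPress install, collection responses carry an `X-WP-TotalPages` header, and requesting a page beyond that number also returns 400. A 400 on page 4 of a site whose header reports 3 pages may therefore just mean there is no page 4. The function name and messages below are illustrative.

```python
from typing import Optional


def explain_status(status: int, page: int, total_pages: Optional[int] = None) -> str:
    """Map an HTTP status from the posts endpoint to a short diagnosis."""
    if status == 200:
        return "page works"
    if status == 400:
        if total_pages is not None and page > total_pages:
            return "past the last page (no more posts)"
        return "pagination limited"
    if status == 401:
        return "auth needed"
    if status == 403:
        return "access denied"
    return f"unexpected status {status}"


# total_pages would come from the X-WP-TotalPages header of a successful page.
print(explain_status(400, page=4, total_pages=3))
print(explain_status(400, page=2, total_pages=3))
```

If the header is missing or a 400 arrives before the reported last page, the hosting-side explanations in this guide are the more likely cause.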
### Common Limits by Hosting:
| Host | Typical Limit | Notes |
|------|---------------|-------|
| Shared hosting | 1-2 pages | Often limited for performance |
| WP Engine | Unlimited | Usually good |
| Kinsta | Unlimited | Usually good |
| Bluehost | Often limited | Contact support |
| GoDaddy | Limited | May need plugin adjustment |
---
## Advanced: Manual Pagination
If API pagination is broken, you can manually specify which posts to analyze:
```bash
# Fetch from Google Sheets instead of API
# Or use WordPress XML export
# Or manually create CSV of posts you want to analyze
```
(Contact us if you need help with this)
---
## Logs Explained
### New Log Messages:
```
✓ Fetched 100 publish posts (page 1)
→ Successful fetch of 100 posts
ⓘ API limit reached (fetched 228 posts)
→ API doesn't allow page 4+, got 228 total
ⓘ Retrying without _fields parameter
→ Trying again without field filtering
✓ Total publish posts: 228
→ Final count for this status
```
---
## Summary
| Issue | Impact | Solution |
|-------|--------|----------|
| Can't fetch page 2+ | Limited analysis | Contact host, check plugins |
| 400 Bad Request | Graceful handling | Script continues with what it got |
| All 3 sites fail | API-wide issue | Check WordPress REST API |
| Missing top 50 posts | Incomplete analysis | Use WordPress export as backup |
---
## Next Steps
1. **Run analyzer** and note pagination limits for each site
2. **Check logs** - see how many posts were fetched
3. **If limited:**
- Note the numbers (e.g., "Only fetched 100 of 262")
- Contact your hosting provider
- Ask about REST API pagination limits
4. **Re-run when fixed** (hosting provider increases limit)
---
## Still Having Issues?
Check:
1. ✓ WordPress credentials correct
2. ✓ REST API enabled on all 3 sites
3. ✓ User has read permissions
4. ✓ No IP blocking (firewall/security)
5. ✓ No SSL certificate issues
6. ✓ Sites are online and responding
See: `guides/SEO_ANALYZER_GUIDE.md` → Troubleshooting section


@@ -0,0 +1,684 @@
# Editorial Strategy & Content Audit Guide
**Date:** February 2026
**Status:** Strategic Planning Document
**Goal:** Transform scattered content into a coherent, profitable editorial strategy
---
## 📋 Table of Contents
1. [Your Current Situation](#your-current-situation)
2. [The Real Problem](#the-real-problem)
3. [Content Audit Strategy](#content-audit-strategy)
4. [AI-Powered Analysis Tools](#ai-powered-analysis-tools)
5. [Implementation Plan](#implementation-plan)
6. [Expected Results](#expected-results)
---
## 🎯 Your Current Situation
**What You Have:**
- 262 blog posts
- ~717 monthly organic visits
- Mix of content types (torrents, VPN, streaming, software, gaming)
- Sponsored link monetization model
- ~€50/month budget for tools
**What's Broken:**
- ✗ Incoherent articles scattered across categories
- ✗ No clear editorial line or niche focus
- ✗ Content not aligned for monetization
- ✗ Unclear which topics actually drive traffic
- ✗ Likely cannibalization (multiple posts on same topic)
- ✗ Off-brand/thin content diluting authority
---
## 🔍 The Real Problem: Editorial Chaos
### Current State (Estimated)
```
262 Posts
├─ 97 posts (37%) - Off-brand/unclear fit
├─ 65 posts (25%) - Thin/low traffic
├─ 45 posts (17%) - Duplicate/cannibalized topics
├─ 40 posts (15%) - Good, focused content
└─ 15 posts (6%) - High-performing, monetizable
```
### Why This Matters
1. **User Confusion:** Visitors can't figure out your site's purpose
2. **SEO Penalty:** Google sees incoherent content as low authority
3. **Low Monetization:** Content not aligned with high-CPM sponsor topics
4. **Wasted Effort:** Building authority in too many directions
5. **Poor ROI:** 262 posts producing ~717 visits (2.7 visits/post)
### The Opportunity
**With focused editorial line:**
- Consolidate 262 posts → 180-200 strong posts
- Improve authority in 3-4 core topics
- Target high-CPM sponsored content
- Increase traffic 30-50% (950-1,100 visits/month)
- Better sponsor rates & link opportunities
---
## 📊 Content Audit Strategy
### What We Need to Understand
For each post, analyze:
```
✓ Topic/Category - What's it about?
✓ Performance - Traffic, position, impressions
✓ Depth - Word count, comprehensiveness
✓ Monetization - CPM potential of topic
✓ Relationships - Does it duplicate other posts?
✓ Intent - User intent it targets
✓ Quality - Engagement metrics
```
### Topics to Analyze
Based on your site, expected topics:
**High-Value Topics (Keep & Expand):**
- VPN guides & reviews (High CPM: $5-10)
- Software tools & comparisons (CPM: $3-8)
- Legal streaming alternatives (CPM: $2-4)
- Gaming guides & emulation (CPM: $2-4)
**Medium-Value Topics (Keep & Consolidate):**
- Torrenting guides (Low CPM: $0.5)
- General tools & tutorials (CPM: $2-3)
**Low-Value Topics (Consolidate or Delete):**
- Unrelated content
- Thin posts (<500 words)
- Off-brand content
- Duplicate posts
---
## 🤖 AI-Powered Analysis Tools
### Tool 1: Content Audit & Topic Clustering
**What it does:**
```
Input: All 262 posts (titles + excerpts + traffic data)
AI Analysis:
• Classify each post into topics
• Group related posts
• Identify cannibalization
• Calculate topic authority scores
• Assess monetization potential
Output: Topic map showing:
• Which topics dominate your site
• Traffic distribution
• Quality of content in each topic
• Cannibalization hotspots
```
**Example Output (illustrative figures, not measured data):**
```
TOPIC CLUSTERS IDENTIFIED:
1. YggTorrent & Ratio Building
Posts: 12 | Traffic: 5,200/mo | Avg Position: 8.3
Authority: 85/100 | CPM: $0.5 | Cannibalization: HIGH
Recommendation: Consolidate into 1-2 definitive guides
2. VPN & Privacy
Posts: 22 | Traffic: 3,100/mo | Avg Position: 12.1
Authority: 72/100 | CPM: $8.0 | Cannibalization: MEDIUM
Recommendation: Expand (+5 new posts) - HIGH VALUE
3. Software & Tools
Posts: 45 | Traffic: 4,200/mo | Avg Position: 15.8
Authority: 58/100 | CPM: $5.0 | Cannibalization: HIGH
Recommendation: Consolidate, reorganize, expand
4. Streaming Guides
Posts: 38 | Traffic: 2,100/mo | Avg Position: 22.5
Authority: 45/100 | CPM: $2.0 | Cannibalization: HIGH
Recommendation: Consolidate, refocus on legal options
5. Gaming & Emulation
Posts: 18 | Traffic: 900/mo | Avg Position: 28.3
Authority: 35/100 | CPM: $3.0 | Cannibalization: LOW
Recommendation: Keep but don't expand
6. Other/Unrelated
Posts: 127 | Traffic: 2,500/mo | Avg Position: 40.1
Authority: 10/100 | CPM: $1.0 | Cannibalization: VERY HIGH
Recommendation: DELETE or radically consolidate
```
### Tool 2: Cannibalization Analysis
**Identifies:**
```
Posts competing for same keywords:
• Post #12 & #45 & #88 - "YggTorrent ratio"
• Post #34 & #67 - "Best VPN 2025"
• Post #123 & #198 - "Streaming sites"
Problem: Google doesn't know which to rank
Solution: Merge into 1 comprehensive guide
```
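A first pass at finding such groups does not need AI. The sketch below groups posts by exact focus keyword, which is much cruder than the topic-level analysis described here; the field names `post_id` and `focus_keyword` are assumptions, and the sample rows are made up around the post numbers in the example.

```python
from collections import defaultdict
from typing import Dict, List


def find_cannibalization(posts: List[Dict]) -> Dict[str, List[int]]:
    """Group post IDs by normalized focus keyword; keep groups of 2+."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for post in posts:
        keyword = post.get("focus_keyword", "").strip().lower()
        if keyword:
            groups[keyword].append(post["post_id"])
    return {kw: ids for kw, ids in groups.items() if len(ids) > 1}


# Illustrative data only.
posts = [
    {"post_id": 12, "focus_keyword": "YggTorrent ratio"},
    {"post_id": 45, "focus_keyword": "yggtorrent ratio "},
    {"post_id": 88, "focus_keyword": "YggTorrent Ratio"},
    {"post_id": 34, "focus_keyword": "Best VPN 2025"},
    {"post_id": 67, "focus_keyword": "best vpn 2025"},
    {"post_id": 99, "focus_keyword": "emulation guide"},
]
print(find_cannibalization(posts))
```

Exact matching misses near-duplicates such as "best VPN" versus "top VPN", so treat the result as a shortlist for manual review.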
### Tool 3: Monetization Potential
**Calculates for each topic:**
```
CPM (Cost Per Mille - per 1,000 impressions):
VPN: $5-10 CPM (HIGH)
→ 3,100 impressions × $0.008 = $24.80/month
→ If expanded to 10,000 impressions = $80/month
Software/Tools: $3-8 CPM (MEDIUM-HIGH)
Gaming: $2-4 CPM (MEDIUM)
Legal Streaming: $2-4 CPM (MEDIUM)
Torrents/File Sharing: $0.50 CPM (VERY LOW)
→ Sponsors avoid - seen as "piracy enabling"
Current Focus Problem:
• 37% of traffic from low/no-CPM topics
• Missing 50% of monetization potential
```
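The VPN arithmetic above is impressions divided by 1,000, times the CPM. A minimal sketch using the $8 CPM implied by the $0.008-per-impression figure; the function name is illustrative:

```python
def monthly_revenue(impressions: int, cpm_usd: float) -> float:
    """Estimated revenue: CPM is the price per 1,000 impressions."""
    return impressions / 1000 * cpm_usd


print(f"${monthly_revenue(3_100, 8.0):.2f}")
print(f"${monthly_revenue(10_000, 8.0):.2f}")
```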
### Tool 4: Editorial Coherence Score
**Analyzes:**
```
✓ Do your posts tell a coherent story?
✓ Do they align with a clear niche?
✓ Is navigation logical?
✓ Are there contradictory messages?
✓ What % of content is actually useful?
Your Current Score: 23/100 (Very scattered)
Potential Score: 85/100 (With refactoring)
```
---
## 🛠️ Implementation Plan
### Phase 1: Audit & Decision (Week 1-2)
**Step 1: Run Content Audit**
```bash
python content_audit_and_strategy.py
```
**Step 2: Review Outputs**
- Identify actual topic clusters
- See traffic distribution
- Understand cannibalization
- Calculate monetization by topic
**Step 3: Decide Editorial Line**
Choose one strategic approach:
#### Option A: "Ethical Tech & Privacy" (Recommended)
```
Core Topics:
• VPN & Privacy tools
• Software tools & comparisons
• Gaming & emulation guides
• Legal streaming alternatives
Drop:
• Torrenting/file sharing (too low CPM)
• Illegal streaming content
Benefits:
• Higher CPM sponsors
• Clearer ethical positioning
• Better advertiser fit
• Easier to build authority
CPM Average: $4-8 (Good)
Traffic Potential: +40-50%
Monetization: Excellent
```
#### Option B: "Everything Tech & Hacks" (Broader)
```
Core Topics:
• VPN & privacy
• Software tools
• File sharing optimized
• Streaming guides
• Gaming & emulation
• General hacks
Benefits:
• Broader audience
• More content flexibility
Challenges:
• Lower average CPM
• Harder to build authority
CPM Average: $2-4 (Okay)
Traffic Potential: +20-30%
Monetization: Moderate
```
#### Option C: "File Sharing & Downloads" (Original)
```
Core Topics:
• Torrent sites & trackers
• VPN for privacy
• Ratio building
• Download tools
Problems:
• Very low CPM ($0.50)
• Sponsor avoidance
• Poor monetization
Recommendation: NOT RECOMMENDED
CPM Average: $1-2 (Poor)
```
**Recommendation:** Option A ("Ethical Tech & Privacy") gives best balance of:
- Higher monetization ($4-8 CPM)
- Clearer positioning
- Better growth potential
- Easier to scale
### Phase 2: Consolidation (Week 3-6)
**Identify Posts to Merge:**
```
Create consolidated guides:
VPN Topic:
Merge: #34, #67, #145, #198 → "Best VPNs 2025: Complete Guide"
Merge: #45, #89 → "VPN Comparison: Speed vs Privacy"
Delete: #12, #56 (thin content)
Result: 22 posts → 3-4 comprehensive guides
Software Tools:
Merge: Multiple tool reviews → Category-based guides
Delete: 20 outdated tool reviews
Result: 45 posts → 12-15 focused guides
```
**WordPress Work:**
```
1. For each merge:
• Choose the post with best traffic
• Copy unique content from others
• Combine into one comprehensive post
• Update internal links
• Redirect old posts to new post
2. Delete off-brand:
• Set 301 redirects if they have links
• Remove from search console
3. Reorganize categories:
   Create structure:
   ├─ Tech Tools & Software
   │  ├─ VPN & Privacy
   │  ├─ Software Reviews
   │  └─ Tools & Utilities
   └─ Guides & Tutorials
      ├─ Gaming
      ├─ Streaming
      └─ General Tech
```
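The merge rule in step 1 (keep the post with the best traffic, redirect the others to it) can be sketched as follows. The paths, traffic numbers and `example.com` base URL are hypothetical, and the output assumes an Apache server using `Redirect 301` lines; a WordPress redirect plugin would serve the same purpose.

```python
from typing import Dict, List


def plan_merge(posts: List[Dict]) -> Dict[str, str]:
    """Pick the highest-traffic post as the target; map the rest to it."""
    target = max(posts, key=lambda p: p["traffic"])
    return {p["path"]: target["path"] for p in posts if p is not target}


def apache_redirects(mapping: Dict[str, str], base_url: str) -> List[str]:
    """Render the mapping as Apache 'Redirect 301' lines."""
    base = base_url.rstrip("/")
    return [f"Redirect 301 {old} {base}{new}" for old, new in sorted(mapping.items())]


# Hypothetical merge group.
vpn_posts = [
    {"path": "/best-vpn-2024/", "traffic": 40},
    {"path": "/best-vpn-2025/", "traffic": 120},
    {"path": "/top-vpn-list/", "traffic": 15},
]
for line in apache_redirects(plan_merge(vpn_posts), "https://example.com"):
    print(line)
```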
### Phase 3: Reorganization (Week 7-10)
**Fix Information Architecture:**
```
Before (Messy):
Home
├─ Category 1
├─ Category 2
├─ Random post
├─ Category 3
└─ ...
After (Organized):
Home
├─ VPN & Security
│ ├─ Best VPNs
│ ├─ VPN Reviews
│ └─ VPN Guides
├─ Software & Tools
│ ├─ Software Reviews
│ ├─ Comparisons
│ └─ Tutorials
├─ Gaming
│ ├─ Game Guides
│ └─ Emulation
└─ About/Resources
```
**Internal Linking Strategy:**
```
Create topic clusters with strong internal linking:
VPN Topic:
Best VPNs (Hub) → links to:
• VPN Review 1
• VPN Review 2
• VPN Comparison
• VPN Guides
Each post links back to hub
Users stay in topic cluster
Google understands topic authority
```
**Navigation Improvements:**
```
• Add breadcrumb navigation
• Create category landing pages
• Add topic-specific sidebars
• Improve internal linking
• Add "related posts" section
```
### Phase 4: Build High-Value Content (Week 11+)
**Create 15-20 New Posts in High-CPM Topics:**
```
VPN Topic (High CPM $8):
• VPN for Gaming Guide
• VPN Speed Comparison
• VPN for Streaming
• VPN Security Features
(Target: 5-8 new posts)
Software Topic (CPM $5):
• Software Comparison Guides
• Tool Tutorials
• Productivity Tools
(Target: 5-6 new posts)
Gaming Topic (CPM $3):
• Game Guides
• Emulation Tutorials
(Target: 3-4 new posts)
Legal Streaming (CPM $2-4):
• Legal Streaming Guides
• Service Comparisons
(Target: 2-3 new posts)
```
**Sponsored Link Strategy:**
```
High-CPM Content = Better sponsor fit:
VPN Posts:
• Target: VPN companies
• Affiliate links & sponsored content
• Expected: $50-100/month
Software Posts:
• Target: Tool reviews/comparisons
• Affiliate partnerships
• Expected: $30-50/month
Total Monthly Potential: $80-150 from sponsorships
(Up from current ~$20-30)
```
---
## 📈 Expected Results
### Before Refactoring
```
Posts: 262
Monthly Traffic: 717 visits
Visits/Post: 2.7
Topic Coherence: 23/100
Cannibalization: HIGH
Monetization: Low ($0.50-2 CPM avg)
Authority: Scattered across 14 topics
Monthly Revenue: ~$20-30
User Experience: Confusing
SEO Performance: Poor (scattered authority)
Growth Trajectory: Flat
```
### After Refactoring (3 months)
```
Posts: 180-200 (40-50 consolidated)
Monthly Traffic: 950-1,100 visits (+33-53%)
Visits/Post: 5-6 (doubled)
Topic Coherence: 75-85/100
Cannibalization: LOW
Monetization: Medium ($4-6 CPM avg)
Authority: Strong in 3-4 core topics
Monthly Revenue: $80-150 (3-5x increase)
User Experience: Clear & coherent
SEO Performance: Strong (focused authority)
Growth Trajectory: Upward
```
### 12-Month Projection
```
If you continue building (15-20 posts/year in high-CPM topics):
Month 12 Traffic: 1,500-2,000 visits (+110-180%)
Monthly Revenue: $200-300 from sponsorships
Topic Authority: Strong in 3-4 areas
Organic growth: Compound effect
```
---
## 🚀 Tools to Build
### Must-Have (Phase 1)
**`content_audit_and_strategy.py`**
```
Input: posts_with_analytics.csv
Outputs:
1. content_audit_report.md (strategic recommendations)
2. topic_clusters.csv (all topics with metrics)
3. consolidation_plan.csv (which posts to merge)
4. cannibalization_analysis.csv (competing posts)
```
**Input Data Needed:**
```
From your existing system:
✓ Post ID
✓ Title
✓ Content (first 1000 chars)
✓ Traffic
✓ Impressions
✓ Category/Tags
✓ URL
```
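Before running an audit, it may help to confirm that the input file has every field listed above. A minimal sketch; the column names are assumptions, since the list gives the fields only in prose, and the audit script itself is not shown here.

```python
import csv
import io
from typing import List

# Assumed column names for the fields listed above.
REQUIRED_COLUMNS = ["post_id", "title", "content", "traffic",
                    "impressions", "categories", "url"]


def missing_columns(csv_text: str) -> List[str]:
    """Return required columns absent from the CSV header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    present = set(reader.fieldnames or [])
    return [col for col in REQUIRED_COLUMNS if col not in present]


# Illustrative input with several fields missing.
sample = "post_id,title,traffic,url\n2845,Best VPN 2025,120,https://example.com/best-vpn/\n"
print(missing_columns(sample))
```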
### Nice-to-Have (Phase 2+)
**`monetization_optimizer.py`**
- Calculate CPM potential by topic
- Recommend sponsored link placement
- Estimate revenue by topic
**`content_health_checker.py`**
- Identify thin content (<500 words)
- Find outdated posts
- Detect poor engagement
**`topic_authority_tracker.py`**
- Track topical authority progress
- Monitor keyword rankings by topic
- Show growth over time
---
## 💰 Investment & ROI
### Cost
```
Tool Development: $0 (I'll build it)
Time to Audit: 2-3 hours reading reports
Time to Consolidate: 20-30 hours (WordPress work)
Time to Reorganize: 10-15 hours
Time to Create New Content: 60-80 hours (4-6 weeks)
Total Time: ~100-130 hours over 3 months
Tool Costs: $0 (using existing data)
ROI Calculation:
Current Revenue: $20-30/month
Projected Revenue: $80-150/month
Monthly Gain: $50-120/month
6-Month Gain: $300-720
12-Month Gain: $600-1,440
Time Investment ROI:
130 hours of work → $600-1,440 annual gain
= $4.60-11 per hour gain
(Ongoing passive income)
```
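The per-hour figure above is the 12-month gain divided by the hours invested. A small sketch using the $50-120 monthly gain and the 130-hour upper estimate from the block above; the function name is illustrative:

```python
def hourly_gain(annual_gain_usd: float, hours_invested: float) -> float:
    """First-year revenue gain divided by the hours invested."""
    return annual_gain_usd / hours_invested


low = hourly_gain(12 * 50, 130)
high = hourly_gain(12 * 120, 130)
print(f"${low:.2f} to ${high:.2f} per hour")
```

The result lands close to the rounded $4.60-11 range quoted above.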
### Budget Considerations
```
Month 1: $0 (audit & planning)
Month 2: $0 (consolidation work)
Month 3: $0 (reorganization)
Month 4+: $0 (you have the tools)
Optional Paid Tools (if needed):
• Ahrefs/SEMrush: $100/month (NOT needed to start)
• Tools you might not need yet
```
---
## 📋 Next Steps
### Week 1: Decide
1. **Read this document**
2. **Choose editorial direction:**
- Option A: "Ethical Tech & Privacy" (Recommended)
- Option B: "Everything Tech & Hacks"
- Option C: Keep current direction
3. **Commit to the plan** before starting
### Week 2: Audit
1. **I build:** `content_audit_and_strategy.py`
2. **You run:** The script
3. **You review:** Generated reports
4. **You finalize:** Consolidation plan
### Week 3-6: Execute
1. **Merge posts** in WordPress
2. **Set up redirects**
3. **Reorganize categories**
4. **Fix internal links**
### Week 7-12: Build
1. **Create 15-20 new posts** in high-CPM topics
2. **Develop sponsored content strategy**
3. **Track progress**
4. **Measure traffic gains**
---
## ❓ Key Questions to Answer
Before we start, decide:
1. **Which editorial direction?**
- A: Ethical Tech & Privacy (Recommended)
- B: Broader "Everything Tech"
- C: Keep current mixed approach
2. **Willing to delete content?**
- Some off-brand/thin posts will need to go
- ~20-30 posts potentially deleted
3. **Willing to consolidate?**
- Merge 40-50 posts into stronger guides
- Better user experience
- Better monetization
4. **Timeline?**
- Can you dedicate 10-15 hours/week for 3 months?
- Or spread it over 6 months with less weekly commitment?
5. **Monetization focus?**
- Maximize sponsored revenue?
- Build audience first, monetize later?
- Both equally?
---
## 🎯 Success Metrics
We'll measure success by:
```
✓ Topic coherence score (23→75+)
✓ Monthly traffic (717→1,000+)
✓ Posts (262→200 - consolidated)
✓ Average CPM ($0.50→$4+)
✓ Monthly revenue ($20→$100+)
✓ User experience (subjective improvement)
✓ Sponsor interest (easier pitches)
```
---
## 📞 Ready?
This plan gives you:
✅ Clear editorial direction
✅ Data-driven consolidation plan
✅ Higher monetization strategy
✅ Better user experience
✅ Stronger SEO authority
✅ 30-50% traffic growth potential
✅ 3-5x revenue potential
**Next action:** Let me know:
1. Which editorial direction you prefer?
2. When you can dedicate time to this?
3. If you want me to build the audit tool?
Let's transform your scattered site into a focused authority! 🚀
---
**Document Version:** 1.0
**Last Updated:** February 2026
**Status:** Ready for Implementation


@@ -0,0 +1,328 @@
# Export Posts for AI Decision Making - Complete Guide
## What This Script Does
Exports **ALL posts from all 3 WordPress sites** with complete details to CSV, so you can:
1. Upload to Claude or other AI for analysis
2. Get AI recommendations for:
- Which site each post should be on
- Which posts to consolidate (duplicates)
- Which posts to delete (low-traffic)
- Content gaps to fill
---
## Quick Start
```bash
python scripts/export_posts_for_ai_decision.py
```
**Output:** `output/reports/all_posts_for_ai_decision_TIMESTAMP.csv`
---
## What Gets Exported
### For Each Post:
- **Site**: Which website it's on (mistergeek.net, webscroll.fr, hellogeek.net)
- **Post ID**: WordPress ID
- **Status**: Published or Draft
- **Title**: Post title
- **URL**: Full post URL
- **Dates**: Published and modified dates
- **Categories & Tags**: Current categorization
- **Content Preview**: First 500 characters (for context)
- **Excerpt**: Post excerpt
- **SEO Data**:
- Rank Math title
- Meta description
- Focus keyword
- **Word Count**: Content length
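The content preview and word count can be derived from a post's HTML along these lines. This is a sketch under assumptions, not the export script's actual logic: it strips tags with a simple regular expression, which is adequate for previews but not for sanitizing HTML.

```python
import re
from html import unescape
from typing import Tuple


def strip_html(html: str) -> str:
    """Crude tag removal; adequate for previews, not for sanitizing."""
    text = re.sub(r"<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", unescape(text)).strip()


def preview_and_word_count(html: str, preview_chars: int = 500) -> Tuple[str, int]:
    """Return the first preview_chars characters of text and the word count."""
    text = strip_html(html)
    return text[:preview_chars], len(text.split())


preview, words = preview_and_word_count(
    "<p>Best <strong>VPN</strong> services &amp; tools</p>"
)
print(preview, words)
```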
---
## Complete Workflow
### Step 1: Export All Posts
```bash
python scripts/export_posts_for_ai_decision.py
```
**Output:**
```
=======================================================================
EXPORTING ALL POSTS FOR AI DECISION MAKING
=======================================================================
mistergeek.net:
Total: 328
Published: 266
Drafts: 62
webscroll.fr:
Total: 17
Published: 13
Drafts: 4
hellogeek.net:
Total: 23
Published: 20
Drafts: 3
───────────────────────────────────────────────────────────────────
Total across all sites: 368 posts
Published: 299
Drafts: 69
───────────────────────────────────────────────────────────────────
✓ CSV file: output/reports/all_posts_for_ai_decision_20260216_150000.csv
```
### Step 2: Open CSV and Review
```bash
open output/reports/all_posts_for_ai_decision_*.csv
```
You'll see all posts with their full details in a spreadsheet format.
### Step 3: Upload to AI for Analysis
**Option A: Claude (Recommended)**
1. Copy the CSV file path
2. Open https://claude.ai
3. Paste the CSV content or upload the file
4. Ask Claude to analyze and recommend:
```
Please analyze this CSV of blog posts and:
1. Categorize each by topic (VPN, software, gaming, torrenting, streaming, etc.)
2. Recommend which website each should be on:
- mistergeek.net: High-value (VPN, software, gaming, tech guides)
- webscroll.fr: Torrenting/file-sharing content
- hellogeek.net: Low-traffic, experimental, off-brand
3. Identify duplicate/similar posts that should be consolidated
4. Flag posts for deletion (very low word count or clearly spam)
5. Provide a CSV with recommendations
```
**Option B: ChatGPT**
1. Upload CSV file
2. Ask same analysis questions
**Option C: Google Sheets + Claude**
1. Import CSV to Google Sheets
2. Add column: "AI Recommendation"
3. Use Claude to fill in recommendations
4. Share sheet with team for decisions
### Step 4: Create Master Decision Spreadsheet
Based on AI recommendations, create a master sheet with decisions:
| Site | Post ID | Title | Current Site | Recommended | Action | Priority | Notes |
|------|---------|-------|--------------|------------|--------|----------|-------|
| mistergeek.net | 2845 | Best VPN 2025 | mistergeek | mistergeek | KEEP | High | High traffic, optimize |
| mistergeek.net | 1234 | YggTorrent Guide | mistergeek | webscroll.fr | MOVE | Medium | Torrent content |
| mistergeek.net | 5678 | Random Post | mistergeek | hellogeek | MOVE | Low | Very low traffic |
| webscroll.fr | 100 | Tracker Guide | webscroll | webscroll | KEEP | High | Core content |
### Step 5: Execute Moves
```bash
# Week 1: Move posts to webscroll.fr
# Week 2: Move posts to hellogeek.net
# Week 3-4: Consolidate duplicates
# Week 5: Optimize remaining posts on mistergeek.net
```
---
## CSV Columns Explained
### Identification
- **site**: Current website
- **post_id**: WordPress post ID
- **status**: "publish" or "draft"
### Content
- **title**: Post title
- **slug**: URL slug
- **url**: Full post URL
- **excerpt**: Short excerpt if available
- **content_preview**: First 500 characters of post content (for topic analysis)
- **word_count**: Number of words in post
### Metadata
- **date_published**: When published
- **date_modified**: Last update
- **author_id**: Post author
- **categories**: WordPress categories
- **tags**: WordPress tags
### SEO
- **seo_title**: Rank Math SEO title
- **meta_description**: Rank Math or Yoast meta description
- **focus_keyword**: Primary keyword
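With these columns, a quick sanity check of the export takes a few lines of standard-library Python. The sketch below counts posts per site and status using the `site` and `status` columns described above; the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter


def count_by_site_and_status(csv_text: str) -> Counter:
    """Count rows per (site, status) pair."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter((row["site"], row["status"]) for row in reader)


# Illustrative rows; a real export has many more columns.
sample = (
    "site,post_id,status,title,word_count\n"
    "mistergeek.net,2845,publish,Best VPN 2025,1800\n"
    "mistergeek.net,1234,draft,YggTorrent Guide,950\n"
    "webscroll.fr,100,publish,Tracker Guide,1200\n"
)
counts = count_by_site_and_status(sample)
for (site, status), n in sorted(counts.items()):
    print(f"{site:16} {status:8} {n}")
```

To run it on a real export, read the file with `open(path, newline="", encoding="utf-8")` and pass its contents in.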
---
## AI Prompt Examples
### Prompt 1: Basic Categorization
```
I have a CSV of 368 blog posts from 3 websites. Please:
1. Categorize each post by PRIMARY topic:
- VPN
- Software/Tools
- Gaming
- Streaming
- Torrenting
- File-Sharing
- General Tech
- Other
2. For each post, recommend which site it should be on:
- mistergeek.net: VPN, Software, Gaming, General Tech (high-value)
- webscroll.fr: Torrenting, File-Sharing (niche audience)
- hellogeek.net: Other, low-traffic experimental content
3. Return a CSV with columns:
post_id, current_site, title, recommended_site, topic, reason
```
### Prompt 2: Identify Duplicates
```
Please identify posts that cover the same or very similar topics:
1. Group similar posts together
2. For each group, identify which is the best (highest quality, most traffic)
3. Recommend keeping the best and consolidating others into it
4. Suggest which posts to delete vs merge
Return: List of duplicate groups with consolidation recommendations
```
### Prompt 3: Strategic Recommendations
```
Based on this data, provide strategic recommendations for:
1. Which topics are over-represented?
2. Which topics are under-represented?
3. What content gaps exist?
4. Which low-traffic posts should be deleted?
5. What new content should be created?
6. How to optimize each site's focus?
Consider SEO benefits of topic consolidation and site specialization.
```
---
## Using AI Recommendations
Once you get AI recommendations:
1. **Create master spreadsheet** in Google Sheets with all decisions
2. **Share with team** for final approval
3. **Document assumptions** (e.g., "Traffic = quality indicator")
4. **Plan execution** by priority and complexity
5. **Execute moves** following the [MULTI_SITE_STRATEGY.md](MULTI_SITE_STRATEGY.md) guide
---
## Expected CSV Size
- **368 posts** = ~150-200 KB CSV file
- Can be opened in:
- Excel
- Google Sheets
- Apple Numbers
- Any text editor
---
## Command Options
```bash
# Basic usage (default)
python scripts/export_posts_for_ai_decision.py
# Custom output location
python scripts/export_posts_for_ai_decision.py --output /path/to/my_export.csv
```
---
## Example AI Response Format
When you ask Claude to analyze, it might return:
```csv
post_id,current_site,title,recommended_site,topic,action,reason
2845,mistergeek.net,Best VPN 2025,mistergeek.net,VPN,KEEP,High traffic + relevance
1234,mistergeek.net,YggTorrent Guide,webscroll.fr,Torrenting,MOVE,Belongs in torrent-focused site
5678,mistergeek.net,Random Niche,hellogeek.net,Other,MOVE,Very low traffic + off-brand
...
```
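A response in that shape can be checked and grouped before anyone acts on it. A minimal sketch, assuming the column names from the example above; the set of allowed actions is an assumption based on the two actions that appear there.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List

# Actions seen in the example response; extend as needed (assumption).
ALLOWED_ACTIONS = {"KEEP", "MOVE"}


def group_by_action(csv_text: str) -> Dict[str, List[str]]:
    """Return post IDs grouped by action, rejecting unknown actions."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        action = row["action"].strip().upper()
        if action not in ALLOWED_ACTIONS:
            raise ValueError(f"post {row['post_id']}: unknown action {action!r}")
        groups[action].append(row["post_id"])
    return dict(groups)


response = (
    "post_id,current_site,title,recommended_site,topic,action,reason\n"
    "2845,mistergeek.net,Best VPN 2025,mistergeek.net,VPN,KEEP,High traffic + relevance\n"
    "1234,mistergeek.net,YggTorrent Guide,webscroll.fr,Torrenting,MOVE,Belongs in torrent-focused site\n"
    "5678,mistergeek.net,Random Niche,hellogeek.net,Other,MOVE,Very low traffic + off-brand\n"
)
print(group_by_action(response))
```

Failing loudly on an unexpected action keeps a malformed AI response from silently driving moves.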
---
## Best Practices
1. **Include context in prompts**: Tell AI your goal (improve SEO, consolidate authority)
2. **Ask for reasoning**: "Why should this post move?"
3. **Use multiple analyses**: Get 2-3 different recommendations and compare
4. **Manual review**: Don't blindly follow AI, use it to inform your decisions
5. **Test incrementally**: Move a few high-confidence posts first, measure impact, then scale
---
## Next Steps
1. **Run export:**
```bash
python scripts/export_posts_for_ai_decision.py
```
2. **Upload CSV to Claude:**
- Open https://claude.ai
- Upload CSV file or paste content
- Ask for categorization and site recommendations
3. **Review AI output** and create master decision spreadsheet
4. **Execute moves** following MULTI_SITE_STRATEGY.md guide
5. **Monitor results** for 30 days in Google Analytics
---
## Troubleshooting
### "No posts found"
- Check credentials in .env
- Verify WordPress sites are online
- Try running diagnostic: `python scripts/multi_site_seo_analyzer.py --diagnose https://www.mistergeek.net`
### "Authentication failed"
- Verify username and app password in .env
- Check user has read permission
- Re-generate app password in WordPress
### CSV is empty or missing columns
- Check that WordPress REST API is returning data
- Verify Rank Math plugin is active (for SEO fields)
- Check for errors in terminal output
---
Ready to export and analyze? Run:
```bash
python scripts/export_posts_for_ai_decision.py
```


@@ -0,0 +1,297 @@
# Install Rank Math API Manager Extended - Complete Guide
## What This Plugin Does
This extended version of the Rank Math API Manager plugin adds **GET endpoints** to read Rank Math SEO metadata (the original only had POST for updating).
### New GET Endpoints
```
GET /wp-json/rank-math-api/v2/get-meta/{post_id}
→ Retrieve Rank Math meta for a single post
GET /wp-json/rank-math-api/v2/posts?per_page=100&page=1&status=publish
→ Retrieve all posts with their Rank Math meta (paginated)
POST /wp-json/rank-math-api/v2/update-meta
→ Update Rank Math meta (original functionality)
```
---
## Installation
### Option 1: Install from File (Easiest)
**Step 1: Download the Plugin File**
The plugin file is at:
```
/Users/acid/Documents/seo/wordpress-plugins/rank-math-api-manager-extended.php
```
**Step 2: Upload to WordPress**
1. Download the file
2. In WordPress Admin:
```
Plugins → Add New → Upload Plugin
```
3. Select file: `rank-math-api-manager-extended.php`
4. Click: **Install Now**
5. Click: **Activate Plugin**
### Option 2: Install Manually via FTP
1. Connect to your server via FTP
2. Navigate to: `/wp-content/plugins/`
3. Create folder: `rank-math-api-manager-extended`
4. Upload `rank-math-api-manager-extended.php` to that folder
5. In WordPress Admin: Plugins → Activate "Rank Math API Manager Extended"
### Option 3: Install via SSH/Command Line
```bash
# SSH into your server
cd /path/to/wordpress/wp-content/plugins/
# Create plugin folder
mkdir rank-math-api-manager-extended
# Upload file (if you have it locally)
# Or create it directly:
cat > rank-math-api-manager-extended/rank-math-api-manager-extended.php << 'EOF'
[Paste the entire plugin code here]
EOF
# Then activate in WordPress Admin
```
---
## Verify Installation
### Step 1: Check Plugin is Activated
In WordPress Admin:
```
Plugins → Installed Plugins
Look for: "Rank Math API Manager Extended"
Status: Should say "Active"
```
### Step 2: Test the GET Endpoint
Run this curl command (replace credentials and domain):
```bash
curl -u "your_username:your_app_password" \
"https://www.mistergeek.net/wp-json/rank-math-api/v2/posts?per_page=1&status=publish"
```
**You should see:**
```json
[
{
"id": 2845,
"title": "Best VPN Services 2025",
"slug": "best-vpn-services",
"url": "https://www.mistergeek.net/best-vpn-services/",
"status": "publish",
"rank_math_title": "The Best VPN Services 2025",
"rank_math_description": "Discover the best VPN services...",
"rank_math_focus_keyword": "best VPN",
"rank_math_canonical_url": ""
}
]
```
If you see this: ✓ **SUCCESS!**
### Step 3: Run Diagnostic
```bash
python scripts/multi_site_seo_analyzer.py --diagnose https://www.mistergeek.net
```
**You should now see:**
```
Available meta fields:
• rank_math_description: Discover the best VPN...
• rank_math_title: The Best VPN Services 2025
• rank_math_focus_keyword: best VPN
```
---
## Available API Endpoints
### 1. GET Single Post Meta
```bash
curl -u "username:password" \
"https://www.mistergeek.net/wp-json/rank-math-api/v2/get-meta/2845"
```
**Response:**
```json
{
"post_id": 2845,
"post_title": "Best VPN Services 2025",
"post_url": "https://www.mistergeek.net/best-vpn-services/",
"rank_math_title": "The Best VPN Services 2025",
"rank_math_description": "Discover the best VPN services...",
"rank_math_focus_keyword": "best VPN",
"rank_math_canonical_url": ""
}
```
### 2. GET All Posts (Paginated)
```bash
curl -u "username:password" \
"https://www.mistergeek.net/wp-json/rank-math-api/v2/posts?per_page=100&page=1&status=publish"
```
**Query Parameters:**
- `per_page` - Number of posts per page (1-100, default: 100)
- `page` - Page number (default: 1)
- `status` - Post status: publish, draft, pending, trash (default: publish)
**Response:** Array of posts with meta fields
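Fetching every post means walking the `page` parameter until the endpoint runs out of results. The sketch below shows one way to do that in Python. The helper names are made up for illustration, and the stopping rule (a page shorter than `per_page` ends the loop) is an assumption about the plugin, not confirmed behaviour.

```python
import base64
import json
import urllib.parse
import urllib.request


def fetch_page(base_url, username, app_password, page, per_page=100, status="publish"):
    """Fetch one page from the /posts endpoint and return the decoded JSON list."""
    query = urllib.parse.urlencode({"per_page": per_page, "page": page, "status": status})
    url = f"{base_url}/wp-json/rank-math-api/v2/posts?{query}"
    # HTTP Basic Auth header, equivalent to `curl -u username:password`.
    token = base64.b64encode(f"{username}:{app_password}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def fetch_all_posts(fetch, per_page=100):
    """Call fetch(page) for page 1, 2, ... until a short or empty page comes back.

    Assumption: the last page holds fewer than per_page items.
    """
    posts = []
    page = 1
    while True:
        batch = fetch(page)
        posts.extend(batch)
        if len(batch) < per_page:
            return posts
        page += 1
```

To use it against a real site, pass a small wrapper such as `lambda page: fetch_page("https://www.mistergeek.net", "username", "password", page)` as the `fetch` argument. Keeping the network call separate makes the paging loop easy to test offline.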
### 3. POST Update Meta
```bash
curl -u "username:password" \
-X POST \
-H "Content-Type: application/json" \
-d '{
"post_id": 2845,
"rank_math_title": "New Title",
"rank_math_description": "New description"
}' \
"https://www.mistergeek.net/wp-json/rank-math-api/v2/update-meta"
```
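If you prefer to send the update from a script, the same request can be built in Python. This is a minimal sketch that mirrors the curl call above; `build_update_request` is a hypothetical helper name, and the function only builds the request so you can inspect it before sending.

```python
import base64
import json
import urllib.request


def build_update_request(base_url, username, app_password, post_id, **meta):
    """Build (but do not send) a POST request for the update-meta endpoint."""
    payload = {"post_id": post_id, **meta}
    token = base64.b64encode(f"{username}:{app_password}".encode()).decode()
    return urllib.request.Request(
        f"{base_url}/wp-json/rank-math-api/v2/update-meta",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Pass the returned object to `urllib.request.urlopen(...)` when you are ready to apply the change.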
---
## Update the SEO Analyzer Script
Now that the plugin is installed, the analyzer script can pick up the Rank Math fields:
**File:** `/Users/acid/Documents/seo/scripts/multi_site_seo_analyzer.py`
The script should automatically detect the meta fields from the REST API response. Just run:
```bash
python scripts/multi_site_seo_analyzer.py --include-drafts --top-n 50
```
The meta descriptions will now be fetched from Rank Math!
---
## Install on All 3 Sites
Repeat the same installation steps for:
- [ ] mistergeek.net ← Install here first to test
- [ ] webscroll.fr
- [ ] hellogeek.net
For each site:
1. Upload plugin via WordPress Admin
2. Activate plugin
3. Test with curl command
4. Run diagnostic
---
## Troubleshooting
### "Plugin could not be activated"
**Solutions:**
1. Check PHP syntax: `php -l rank-math-api-manager-extended.php`
2. Ensure `/wp-content/plugins/` folder exists and is writable
3. Check WordPress error log: `/wp-content/debug.log` (written only when `WP_DEBUG` and `WP_DEBUG_LOG` are enabled in `wp-config.php`)
### "Endpoint not found" (404)
**Solutions:**
1. Verify plugin is activated
2. Verify correct URL: `/wp-json/rank-math-api/v2/posts` (not v1)
3. Flush WordPress rewrite rules:
```
WordPress Admin → Settings → Permalinks → Save Changes
```
### "Unauthorized" (401)
**Solutions:**
1. Verify credentials (username and app password)
2. Verify user has `read_posts` permission (at least Author role)
3. Check if security plugin is blocking REST API
### "No meta fields returned"
**Solutions:**
1. Verify Rank Math SEO is installed and activated
2. Verify posts have Rank Math meta set (check in WordPress editor)
3. Check WordPress database: `wp_postmeta` table has `rank_math_*` entries
---
## Security Notes
This plugin respects WordPress permissions:
- **Read access:** Requires `read_posts` capability (any logged-in user)
- **Write access:** Requires `edit_posts` capability (Contributor or higher)
- Uses HTTP Basic Auth (same as original)
For production, consider:
- Using HTTPS only (not HTTP)
- Restricting API access by IP in `.htaccess` or security plugin
- Creating a separate API user with limited permissions
---
## Remove Plugin
If you need to uninstall:
1. In WordPress Admin: Plugins → Deactivate "Rank Math API Manager Extended"
2. Delete the plugin folder: `/wp-content/plugins/rank-math-api-manager-extended/`
3. Original Rank Math SEO still works
---
## Next Steps
1. **Install the plugin** on mistergeek.net
2. **Test with curl:**
```bash
curl -u "username:password" \
"https://www.mistergeek.net/wp-json/rank-math-api/v2/posts?per_page=1"
```
3. **Run diagnostic:**
```bash
python scripts/multi_site_seo_analyzer.py --diagnose https://www.mistergeek.net
```
4. **Run analyzer:**
```bash
python scripts/multi_site_seo_analyzer.py --include-drafts --top-n 50
```
5. **Install on other 2 sites** and repeat
---
## Support
If you encounter issues:
1. Check the troubleshooting section above
2. Verify curl command works (tests plugin directly)
3. Check WordPress debug log: `/wp-content/debug.log`
4. Share the error message and we can debug together
Ready to install? Download the plugin file and upload it! 🚀


@@ -0,0 +1,484 @@
# Multi-Site Content Strategy
**Status:** Strategic Framework
**Scope:** 3 websites, 260+ posts, content redistribution
**Goal:** Maximize traffic and monetization across your network
---
## 🌐 Your Website Ecosystem
### mistergeek.net (Main Site - Core Brand)
- **Focus:** Tech, software, VPN, gaming, tutorials
- **Monetization:** Sponsors, affiliate links (high CPM topics)
- **Content Type:** Quality guides, comparisons, in-depth tutorials
- **Target Traffic:** 70% of total network traffic
- **Current:** ~717 visits/month → Target: 1,200+ visits/month
### webscroll.fr (Secondary - Niche Focus)
- **Focus:** Torrenting, file-sharing, tracker guides
- **Monetization:** Limited (low CPM), but targeted audience
- **Content Type:** Tracker guides, ratio guides, tutorials
- **Target Traffic:** 20% of network traffic
- **Current:** Unknown → Target: 300-400 visits/month
### hellogeek.net (Experimental - Off-Brand)
- **Focus:** Everything else - experimental, low-traffic, niche
- **Monetization:** Secondary, experimental
- **Content Type:** Mixed, exploratory
- **Target Traffic:** 10% of network traffic
- **Current:** Unknown → Target: 100-150 visits/month
---
## 📊 Content Classification System
### By Topic (How to Categorize Posts)
```
HIGH-VALUE (Keep on mistergeek.net):
├─ VPN & Privacy (CPM: $5-10)
├─ Software & Tools (CPM: $3-8)
├─ Gaming & Emulation (CPM: $2-4)
└─ General Tech Guides (CPM: $2-5)
MEDIUM-VALUE (Move to webscroll.fr):
├─ Torrenting Guides (CPM: $0.50-1)
├─ Tracker Reviews (CPM: $0.50-1)
└─ File-Sharing Tutorials (CPM: $0.50-1)
LOW-VALUE (Move to hellogeek.net):
├─ Experimental Content
├─ Low-Traffic Posts (<20 visits)
├─ Off-Brand Content
└─ Testing/Ideas
```
### By Status
```
PUBLISHED (262 posts)
├─ High Traffic (>100 visits) → Keep on mistergeek.net
├─ Medium Traffic (20-100 visits) → Consolidate or move
├─ Low Traffic (<20 visits) → Move to hellogeek.net
└─ Extremely Low (<5 visits) → Delete or merge
DRAFTS (Unknown quantity)
├─ Complete, ready to publish → Decide which site
├─ Incomplete, needs work → Complete for high-value topics
└─ Outdated/Off-topic → Delete
```
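The topic and traffic rules above can be expressed as a small decision function. This is only a sketch: the keyword list and the action labels are assumptions, and a real classification would need manual review of each post.

```python
# Assumed keywords for spotting torrent/file-sharing posts by title.
TORRENT_KEYWORDS = ("torrent", "tracker", "file-sharing")


def recommend_action(title, monthly_visits):
    """Return a (site, action) pair using the traffic bands listed above."""
    lowered = title.lower()
    if any(keyword in lowered for keyword in TORRENT_KEYWORDS):
        return ("webscroll.fr", "MOVE")
    if monthly_visits > 100:
        return ("mistergeek.net", "KEEP")
    if monthly_visits >= 20:
        return ("mistergeek.net", "CONSOLIDATE_OR_MOVE")
    if monthly_visits >= 5:
        return ("hellogeek.net", "MOVE")
    # Fewer than 5 visits: candidate for deletion or merging.
    return (None, "DELETE_OR_MERGE")
```

The topic check runs first so that torrent content goes to webscroll.fr regardless of its traffic.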
### By Author
```
By "Expert" (Sponsored Posts)
├─ Keep on mistergeek.net
├─ Highlight as sponsored content
├─ Use for monetization
└─ Track separately for revenue
```
---
## 🎯 Distribution Strategy
### STAGE 1: Analysis (Week 1)
**Run:** `content_strategy_analyzer.py`
**What it does:**
```
Input:
• All published posts (mistergeek.net)
• All draft posts
• Post metadata (traffic, author, category)
Output:
✓ content_distribution.csv
✓ content_strategy_report.md
✓ analysis_summary.json
```
**Outputs Include:**
```
Content by Site:
• mistergeek.net: 180 posts (recommended)
• webscroll.fr: 40 posts (recommended)
• hellogeek.net: 40 posts (recommended)
Content by Action:
• KEEP & OPTIMIZE: 120 posts
• CONSOLIDATE: 45 posts
• MOVE_TO_OTHER_SITE: 60 posts
• DELETE: 25 posts
• REPUBLISH_DRAFTS: 12 posts
```
### STAGE 2: Decision Making (Week 2)
**For each post, decide:**
#### Keep on mistergeek.net (Traffic >50, High CPM topics)
```
Post #42: "VPN for Gaming 2025"
✓ Traffic: 150 visits
✓ Topic: VPN (high CPM)
✓ Decision: KEEP & OPTIMIZE
→ Action: Improve, add links, monetize
Post #156: "Best Software for Productivity"
✓ Traffic: 80 visits
✓ Topic: Software (medium CPM)
✓ Decision: KEEP & OPTIMIZE
→ Action: Improve, affiliate links
```
#### Move to webscroll.fr (Torrent/File-sharing)
```
Post #12: "YggTorrent Ratio Guide"
✓ Topic: Torrenting
✓ Decision: MOVE_TO_WEBSCROLL
→ Action: Export, import, redirect
Post #45: "Best Torrent Trackers 2025"
✓ Topic: Torrenting
✓ Decision: MOVE_TO_WEBSCROLL
→ Action: Export, import, redirect
```
#### Move to hellogeek.net (Low traffic, experimental)
```
Post #234: "Random Tech Experiment"
✓ Traffic: 3 visits
✓ Topic: Other
✓ Decision: MOVE_TO_HELLOGEEK
→ Action: Export, import, redirect
Post #289: "Niche Gaming Topic"
✓ Traffic: 15 visits
✓ Topic: Gaming (but low traffic)
✓ Decision: MOVE_TO_HELLOGEEK
→ Action: Can potentially grow here
```
#### Consolidate (Merge duplicates)
```
Posts #12, #45, #88: "YggTorrent ratio"
✓ Same topic, competing
✓ Decision: CONSOLIDATE
→ Action: Keep best, merge others, redirect
Posts #34, #67: "Best VPN"
✓ Same intent
✓ Decision: CONSOLIDATE
→ Action: Merge into one comprehensive guide
```
#### Delete (Thin, off-brand, zero traffic)
```
Post #156: "Unrelated topic"
✓ Traffic: 0
✓ Impressions: 5
✓ Decision: DELETE
→ Action: No redirects, just remove
Post #203: "Test article"
✓ Traffic: 1
✓ Too thin
✓ Decision: DELETE
→ Action: Remove
```
#### Republish Drafts
```
Draft: "Complete VPN Guide"
✓ Complete, ready
✓ Topic: VPN (high CPM)
✓ Decision: PUBLISH_ON_MISTERGEEK
→ Action: Publish, promote, monetize
Draft: "Streaming guide"
✓ Incomplete
✓ Decision: COMPLETE_OR_ABANDON
→ Action: Decide if worth completing
```
### STAGE 3: Implementation (Weeks 3-8)
#### 3.1 Content Export/Import (WordPress)
**For mistergeek.net (Keep):**
```
WordPress:
• No action - stays published
• Update internal links
• Remove links to moved posts
• Add redirects for consolidated posts
```
**For webscroll.fr (Move):**
```
WordPress (source):
1. Export posts (use WordPress export plugin)
2. Get post IDs, URLs, content
3. Set up 301 redirects
webscroll.fr (destination):
1. Import posts (WordPress import)
2. Update internal links
3. Reorganize categories
```
**For hellogeek.net (Move):**
```
Same as webscroll.fr process
```
#### 3.2 URL Redirect Strategy
**Important: SEO-friendly redirects**
```
mistergeek.net/old-post/ → hellogeek.net/old-post/
(Use 301 permanent redirects)
Why:
• Preserve SEO value
• Pass link authority
• Maintain user experience
• Allow analytics tracking
```
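If the source site runs on Apache, the redirect map can be generated from your decisions list. The sketch below assumes Apache's `Redirect 301` directive in `.htaccess` and assumes each post keeps the same slug on its new site; other servers or redirect plugins need a different output format.

```python
def redirect_rules(moves):
    """Turn {old_path: destination_host} into Apache `Redirect 301` lines.

    Assumes the post keeps the same slug on the destination site.
    """
    rules = []
    for old_path, destination_host in sorted(moves.items()):
        # Normalise to a single leading and trailing slash.
        path = "/" + old_path.strip("/") + "/"
        rules.append(f"Redirect 301 {path} https://{destination_host}{path}")
    return rules
```

For example, `redirect_rules({"old-post": "hellogeek.net"})` produces one line that sends `/old-post/` to the same path on hellogeek.net.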
#### 3.3 Sponsored Content Handling
**Posts by "Expert" (Sponsored):**
```
Rule: Keep on mistergeek.net or webscroll.fr
Reason: These drive revenue, don't move
Process:
1. Identify all "Expert" posts
2. Evaluate quality & performance
3. Keep in appropriate site
4. Highlight as sponsored
5. Track for revenue attribution
```
---
## 📈 Expected Impact
### Before Distribution
```
mistergeek.net: 262 posts
• 717 visits/month
• 2.7 visits/post
• 23/100 coherence score
• Low monetization potential
• Scattered authority
webscroll.fr: Unknown
hellogeek.net: Unknown
```
### After Distribution
```
mistergeek.net: 180-200 focused posts
• 1,000-1,200 visits/month (+40-70%)
• 5-6 visits/post (doubled)
• 80+/100 coherence score
• High monetization potential
• Strong authority in core topics
webscroll.fr: 40-50 posts
• 300-400 visits/month
• 6-8 visits/post
• Dedicated torrent audience
• Moderate monetization
hellogeek.net: 30-40 posts
• 100-150 visits/month
• Experimental content
• Testing ground for ideas
• Low monetization pressure
```
---
## 🛠️ Actionable Workflow
### Week 1: Analyze
```bash
# Run analysis
python scripts/content_strategy_analyzer.py
# Review outputs
open output/reports/content_strategy_report.md
open output/analysis/content_distribution.csv
```
### Week 2: Decide
```
For each post, decide:
✓ mistergeek.net (stay, optimize)
✓ webscroll.fr (move)
✓ hellogeek.net (move)
✓ Consolidate (merge)
✓ Delete (remove)
✓ Publish (drafts)
Create: Master spreadsheet with decisions
• Post ID
• Title
• Current site
• Recommended site
• Action
• Priority
```
### Week 3-4: Export/Import
```
For webscroll.fr:
1. Export torrent-related posts from mistergeek.net
2. Import to webscroll.fr
3. Set up 301 redirects
4. Update internal links
For hellogeek.net:
1. Export low-traffic/experimental posts
2. Import to hellogeek.net
3. Set up 301 redirects
4. Reorganize structure
```
### Week 5-6: Consolidate
```
For duplicate topics:
1. Identify duplicate posts
2. Keep the best performer
3. Merge unique content into winner
4. Set up 301 redirects from others
5. Update internal links
6. Remove thin versions
```
### Week 7-8: Optimize
```
For mistergeek.net (now focused):
1. Update internal linking
2. Reorganize navigation
3. Create topic pillars
4. Enhance monetization
5. Update category pages
6. Test user experience
```
---
## 📊 Monetization Strategy by Site
### mistergeek.net (Primary Income)
```
Topics & CPM:
• VPN: $5-10 CPM
• Software: $3-8 CPM
• Gaming: $2-4 CPM
• General Tech: $2-5 CPM
Monthly Potential:
• 1,200 visits × avg $0.005 = $6
• Better: Sponsored links = $50-100/month
• Affiliate partnerships = $20-50/month
• Total: $70-150/month
```
### webscroll.fr (Secondary Income)
```
Topics & CPM:
• Torrents: $0.50-1 CPM
• File-sharing: $0.50-1 CPM
Monthly Potential:
• 350 visits × avg $0.001 = $0.35
• Reality: Very low (sponsors avoid)
• Audience monetization: $10-20/month
• Total: $10-20/month
```
### hellogeek.net (Testing & Growth)
```
Purpose: Testing & experimental
• Low monetization pressure
• Growth playground
• Niche audience testing
• Monthly: $5-10/month
```
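The display-ad lines in the blocks above follow the usual CPM arithmetic (price per 1,000 impressions), treating each visit as one ad impression, as the figures above do. A quick sketch to rerun the numbers with your own traffic:

```python
def ad_revenue(monthly_visits, cpm_usd):
    """Display-ad revenue in USD, where CPM is the price per 1,000 visits."""
    return monthly_visits * cpm_usd / 1000
```

With 1,200 visits at a $5 CPM this gives $6, and 350 visits at a $1 CPM gives $0.35, which is why sponsored links and affiliate deals dominate the totals.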
---
## ✅ Checklist
### Pre-Migration
- [ ] Analyze all content with content_strategy_analyzer.py
- [ ] Review content_strategy_report.md
- [ ] Decide distribution for each post
- [ ] Get drafts CSV ready
- [ ] Plan the 301 redirect map for mistergeek.net
- [ ] Install WordPress import/export plugins
### Migration Phase
- [ ] Export posts from mistergeek.net
- [ ] Import to webscroll.fr
- [ ] Import to hellogeek.net
- [ ] Set up 301 redirects
- [ ] Update internal links
- [ ] Test that old URLs return 301 redirects, not 404s
### Post-Migration
- [ ] Consolidate duplicates
- [ ] Reorganize categories
- [ ] Update navigation
- [ ] Test user experience on each site
- [ ] Monitor analytics for issues
- [ ] Update XML sitemaps
---
## 🎯 Success Metrics
### Coherence
```
Before: 23/100 (scattered across 14 topics)
After: 80+/100 (focused on 3-4 topics)
```
### Traffic
```
Before: 717 visits/month
After: 1,300+ visits/month (+80%)
```
### Monetization
```
Before: $20-30/month
After: $85-180/month (roughly 4-6x)
```
### Authority
```
Before: Scattered authority
After: Strong in core topics
```
---
## 🚀 Next Action
1. **Prepare draft CSV:** Export your draft posts to `input/drafts/drafts.csv`
2. **Run analysis:** `python scripts/content_strategy_analyzer.py`
3. **Review report:** `open output/reports/content_strategy_report.md`
4. **Make decisions:** Plan your content redistribution
5. **Execute:** Follow the week-by-week workflow
Ready to transform your content ecosystem? 🌐

guides/OUTPUT_GUIDE.md Normal file

@@ -0,0 +1,61 @@
# Output Directory
Generated analysis results and logs.
## Directory Structure
```
output/
├── results/ (Analysis results)
│ ├── seo_optimization_report.md (📍 PRIMARY DELIVERABLE)
│ ├── posts_with_analytics.csv (Enriched posts dataset)
│ ├── posts_prioritized.csv (All posts ranked 0-100)
│ ├── keyword_opportunities.csv (26 optimization opportunities)
│ └── content_gaps.csv (New content ideas)
└── logs/ (Analysis logs)
├── import_log.txt
├── opportunity_analysis_log.txt
└── content_gap_analysis_log.txt
```
## Primary Deliverable
**`results/seo_optimization_report.md`**
- Executive summary with key metrics
- Top 20 posts ranked by optimization potential
- AI-generated recommendations for each post
- Keyword opportunities breakdown
- Content gap analysis
- 90-day phased action plan
- Estimated traffic gains
## Supporting Files
**`results/posts_prioritized.csv`**
- All 262 posts ranked by priority score (0-100)
- Use this to see the full ranking and select which posts to optimize next
**`results/keyword_opportunities.csv`**
- The 26 posts identified at positions 11-30
- Includes AI recommendations and estimated gains
- Sort by opportunity_score to find quick wins
**`results/posts_with_analytics.csv`**
- Enriched dataset with all metrics merged
- Use for custom analysis or future reference
## Log Files
**`logs/import_log.txt`**
- Analytics integration report
- URL matching success rate
- Any unmatched URLs for manual review
**`logs/opportunity_analysis_log.txt`**
- Keyword opportunity analysis details
- Posts processed and opportunities found
**`logs/content_gap_analysis_log.txt`**
- Content gap analysis results
- New topics identified


@@ -0,0 +1,417 @@
# Real-Time CSV Monitoring - Progressive Writing Guide
## What is Progressive CSV?
The analyzer now writes results to the CSV file **as they're analyzed** in real-time, instead of waiting until all posts are analyzed.
```
Traditional Mode:
Analyze 262 posts → Wait (2-3 min) → Write CSV
Progressive Mode (NEW):
Analyze post 1 → Write row 1
Analyze post 2 → Write row 2
Analyze post 3 → Write row 3
... (watch it grow in real-time)
```
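The progressive pattern boils down to writing and flushing one row per analyzed post. The sketch below illustrates the idea with Python's `csv` module; it is not the analyzer's actual code, and `write_progressively` and its parameters are hypothetical names.

```python
import csv


def write_progressively(posts, analyze, path, fieldnames):
    """Analyze posts one at a time, appending each result row to the CSV immediately."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        handle.flush()
        for post in posts:
            writer.writerow(analyze(post))
            # Flush so tools such as `tail -f` see the row right away.
            handle.flush()
```

Flushing after every row costs a little extra I/O but means a crash partway through still leaves all completed rows on disk.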
---
## How It Works
### Enabled by Default
```bash
python scripts/multi_site_seo_analyzer.py
```
Progressive CSV is **enabled** by default. The analyzer starts writing the CSV file as soon as analysis begins.
### Disable (Write Only at End)
```bash
python scripts/multi_site_seo_analyzer.py --no-progressive
```
Use this if you prefer to wait for final results (slightly faster, no real-time visibility).
---
## Real-Time Monitoring
### Monitor Progress in Excel/Google Sheets
**Option 1: Watch CSV grow in real-time**
```bash
# Terminal 1: Start analyzer
python scripts/multi_site_seo_analyzer.py
# Terminal 2: Watch file grow
tail -f output/reports/seo_analysis_*.csv
```
Output:
```
site,post_id,status,title,overall_score
mistergeek.net,1,publish,"VPN Guide",45
mistergeek.net,2,publish,"Best Software",72
mistergeek.net,3,publish,"Gaming Setup",38
mistergeek.net,4,draft,"Draft Post",28
[... more rows appear as analysis continues]
```
**Option 2: Open CSV in Excel while running**
1. Start analyzer: `python scripts/multi_site_seo_analyzer.py`
2. Open file: `output/reports/seo_analysis_*.csv` in Excel (read-only, so the analyzer can keep writing)
3. Excel shows a snapshot and does not refresh an opened CSV on its own; close and reopen the file to see new rows
4. Repeat as posts are analyzed
**Option 3: Open in Google Sheets**
1. Start analyzer
2. Upload CSV to Google Sheets (File → Import → Upload)
3. The sheet is a copy taken at upload time; it does not follow the file on disk
4. Re-import with "Replace spreadsheet" to pull in newer rows
---
## Examples
### Example 1: Basic Progressive Analysis
```bash
python scripts/multi_site_seo_analyzer.py
```
**Output:**
- CSV created immediately
- Rows added as posts are analyzed
- Monitor with `tail -f output/reports/seo_analysis_*.csv`
- Takes ~2-3 minutes for 262 posts
- Final step: Add AI recommendations and re-write CSV
### Example 2: Progressive + Drafts
```bash
python scripts/multi_site_seo_analyzer.py --include-drafts
```
**Output:**
- Analyzes published + draft posts
- Shows status column: "publish" or "draft"
- Rows appear in real-time
- Drafts analyzed after published posts
### Example 3: Progressive + AI Recommendations
```bash
python scripts/multi_site_seo_analyzer.py --top-n 20
```
**Output:**
- Initial CSV: ~2 minutes with all posts (no AI yet)
- Then: AI analysis for top 20 (~5-10 minutes)
- Final CSV: Includes AI recommendations for top 20
- You can see progress in two phases
### Example 4: Disable Progressive (Batch Mode)
```bash
python scripts/multi_site_seo_analyzer.py --no-progressive
```
**Output:**
- Analyzes all posts in memory
- Only writes CSV when complete (~2-3 minutes)
- Single output file at the end
- Slightly faster execution
---
## Monitoring Setup
### Terminal Monitoring
**Watch CSV as it grows:**
```bash
# In one terminal
python scripts/multi_site_seo_analyzer.py
# In another terminal (macOS/Linux)
tail -f output/reports/seo_analysis_*.csv | head -20
# Or with watch command (every 2 seconds)
watch -n 2 'wc -l output/reports/seo_analysis_*.csv'
# On Windows (PowerShell); -Wait keeps following the file
Get-Content output/reports/seo_analysis_*.csv -Tail 5 -Wait
```
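If `tail` and `watch` are not available, a few lines of Python can report progress on any platform. This is a sketch under the assumption that output files match `output/reports/seo_analysis_*.csv`; the function names are illustrative.

```python
import glob
import os


def latest_csv(pattern="output/reports/seo_analysis_*.csv"):
    """Return the most recently modified file matching pattern, or None."""
    matches = glob.glob(pattern)
    return max(matches, key=os.path.getmtime) if matches else None


def count_data_rows(path):
    """Count lines after the header row (approximate if fields contain newlines)."""
    with open(path, newline="", encoding="utf-8") as handle:
        return max(sum(1 for _ in handle) - 1, 0)
```

Call `count_data_rows(latest_csv())` in a loop with `time.sleep(2)` to get roughly the same view as the `watch` command above.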
### Spreadsheet Monitoring
**Google Sheets (recommended):**
```
1. Google Drive → New → Google Sheets
2. File → Open → Upload CSV
3. Let Google Sheets auto-import
4. File → Import → "Replace spreadsheet" (if updating)
5. Repeat step 4 whenever you want to pull in newly written rows
```
**Excel (macOS/Windows):**
```
1. Open Excel
2. File → Open → Navigate to output/reports/
3. Select seo_analysis_*.csv
4. Open it read-only so the analyzer can keep writing
5. Close and reopen the file to see newly written rows
```
---
## File Progress Examples
### Snapshot 1 (30 seconds in)
```
site,post_id,status,title,overall_score
mistergeek.net,1,publish,"Complete VPN Guide",92
mistergeek.net,2,publish,"Best VPN Services",88
mistergeek.net,3,publish,"VPN for Gaming",76
mistergeek.net,4,publish,"Streaming with VPN",72
```
### Snapshot 2 (1 minute in)
```
[Same as above, plus:]
mistergeek.net,5,publish,"Best Software Tools",85
mistergeek.net,6,publish,"Software Comparison",78
mistergeek.net,7,draft,"Incomplete Software",35
mistergeek.net,8,publish,"Gaming Setup Guide",68
webscroll.fr,1,publish,"YggTorrent Guide",45
...
```
### Snapshot 3 (Final, with AI)
```
[All 262+ posts, plus AI recommendations in last column:]
mistergeek.net,1,publish,"Complete VPN...",92,"Consider adding..."
mistergeek.net,2,publish,"Best VPN...",88,"Strong, no changes"
mistergeek.net,3,publish,"VPN for Gaming",76,"Expand meta..."
```
---
## Performance Impact
### With Progressive CSV (default)
- Disk writes: Continuous (one per post)
- CPU: Slightly higher (writing to disk)
- Disk I/O: Continuous
- Visibility: Real-time
- Time: ~2-3 minutes (262 posts) + AI
### Without Progressive CSV (--no-progressive)
- Disk writes: One large write at end
- CPU: Slightly lower (batch write)
- Disk I/O: Single large operation
- Visibility: No progress updates
- Time: ~2-3 minutes (262 posts) + AI
**Difference is negligible** (< 5% performance difference).
---
## Troubleshooting
### CSV Shows 0 Bytes
**Problem:** CSV file exists but shows 0 bytes.
**Solution:**
- Give the script a few seconds to start writing
- Check if analyzer is still running: `ps aux | grep multi_site`
- Verify directory exists: `ls -la output/reports/`
### Can't Open CSV While Writing
**Problem:** Excel says "file is in use" or "file is locked".
**Solutions:**
- Open as read-only (don't modify)
- Use Google Sheets instead (it works on an uploaded copy, so the original file is never locked)
- Use `--no-progressive` flag and wait for completion
- Wait for final CSV to be written (analyzer complete)
### File Grows Then Stops
**Problem:** CSV stops growing partway through.
**Likely cause:** Analyzer hit an error or is running AI recommendations.
**Solutions:**
- Check terminal for error messages
- If using `--top-n 20`, AI phase might be in progress (~5-10 min)
- Check file size: `ls -lh output/reports/seo_analysis_*.csv`
### Want to See Only New Rows?
Use tail to show only new additions:
```bash
# Show last 10 rows
tail -n 10 output/reports/seo_analysis_*.csv
# Watch new rows as they're added (macOS/Linux)
tail -f output/reports/seo_analysis_*.csv
# Or use watch
watch -n 1 'tail -20 output/reports/seo_analysis_*.csv'
```
---
## Workflow Examples
### Quick Monitoring (Simple)
```bash
# Terminal 1
python scripts/multi_site_seo_analyzer.py --include-drafts
# Terminal 2 (watch progress)
watch -n 2 'wc -l output/reports/seo_analysis_*.csv'
# Output every 2 seconds:
# 30 output/reports/seo_analysis_20250216_120000.csv
# 60 output/reports/seo_analysis_20250216_120000.csv
# 92 output/reports/seo_analysis_20250216_120000.csv
# [... grows to 262+]
```
### Live Dashboard (Advanced)
```bash
# Terminal 1: Run analyzer
python scripts/multi_site_seo_analyzer.py --include-drafts --top-n 20
# Terminal 2: Monitor with live stats
watch -n 1 'echo "=== CSV Status ===" && \
wc -l output/reports/seo_analysis_*.csv && \
echo "" && \
echo "=== Last 5 Rows ===" && \
tail -5 output/reports/seo_analysis_*.csv && \
echo "" && \
echo "=== Worst Scores ===" && \
tail -20 output/reports/seo_analysis_*.csv | sort -t, -k14 -n | head -5'
```
### Team Collaboration
```bash
# 1. Start analyzer with progressive CSV
python scripts/multi_site_seo_analyzer.py
# 2. Upload to Google Sheets
# File → Import → Upload CSV → Replace Spreadsheet
# 3. Share with team
# File → Share → Add team members
# 4. Re-import the CSV (step 2) to refresh the shared sheet
# as the analysis runs
```
---
## Data Quality Notes
### During Progressive Write
- Each row is **complete** when written (all analysis fields present)
- AI recommendations field is empty until AI phase completes
- Safe to view/read while running
### After Completion
- All rows updated with final data
- AI recommendations added for top N posts
- CSV fully populated and ready for import/action
### File Integrity
- Progressive CSV is **safe to view while running**
- Each row is flushed right after it is written, so viewers see only complete rows
- No risk of corruption during analysis
---
## Command Reference
```bash
# Default (progressive CSV enabled)
python scripts/multi_site_seo_analyzer.py
# Disable progressive (batch write)
python scripts/multi_site_seo_analyzer.py --no-progressive
# Progressive + drafts
python scripts/multi_site_seo_analyzer.py --include-drafts
# Progressive + AI + drafts
python scripts/multi_site_seo_analyzer.py --include-drafts --top-n 20
# Disable progressive + no AI
python scripts/multi_site_seo_analyzer.py --no-progressive --no-ai
# All options combined
python scripts/multi_site_seo_analyzer.py \
--include-drafts \
--top-n 20 \
--output my_report.csv
# (progressive enabled by default)
```
---
## Summary
| Feature | Default (progressive) | With `--no-progressive` |
|---------|---------|------|
| Progressive CSV | Enabled | Disabled |
| Write Mode | Real-time rows | Batch at end |
| Monitoring | Real-time via `tail -f` or spreadsheet re-import | Not available |
| Performance | ~2-3 min + AI | Slightly faster (negligible) |
---
## Next Steps
1. **Run with progressive CSV:**
```bash
python scripts/multi_site_seo_analyzer.py --include-drafts
```
2. **Monitor in real-time:**
```bash
# Terminal 2
tail -f output/reports/seo_analysis_*.csv
```
3. **Or upload it to Google Sheets** and re-import periodically to see new rows
4. **When complete**, review CSV and start optimizing
Ready to see it in action? Run:
```bash
python scripts/multi_site_seo_analyzer.py --include-drafts
```

guides/PROJECT_GUIDE.md Normal file

@@ -0,0 +1,310 @@
# SEO Analysis & Improvement System - Project Guide
## 📋 Overview
A complete 4-phase SEO analysis pipeline that:
1. **Integrates** Google Analytics, Search Console, and WordPress data
2. **Identifies** high-potential keywords for optimization (positions 11-30)
3. **Discovers** new content opportunities using AI
4. **Generates** a comprehensive report with 90-day action plan
## 📂 Project Structure
```
seo/
├── input/ # SOURCE DATA (your exports)
│ ├── new-propositions.csv # WordPress posts
│ ├── README.md # How to export data
│ └── analytics/
│ ├── ga4_export.csv # Google Analytics
│ └── gsc/
│ ├── Pages.csv # GSC pages (required)
│ ├── Requêtes.csv # GSC queries (optional)
│ └── ...
├── output/ # RESULTS (auto-generated)
│ ├── results/
│ │ ├── seo_optimization_report.md # 📍 PRIMARY OUTPUT
│ │ ├── posts_with_analytics.csv
│ │ ├── posts_prioritized.csv
│ │ ├── keyword_opportunities.csv
│ │ └── content_gaps.csv
│ │
│ ├── logs/
│ │ ├── import_log.txt
│ │ ├── opportunity_analysis_log.txt
│ │ └── content_gap_analysis_log.txt
│ │
│ └── README.md # Output guide
├── 🚀 run_analysis.sh # Run entire pipeline
├── analytics_importer.py # Phase 1: Merge data
├── opportunity_analyzer.py # Phase 2: Find wins
├── content_gap_analyzer.py # Phase 3: Find gaps
├── report_generator.py # Phase 4: Generate report
├── config.py
├── requirements.txt
├── .env.example
└── .gitignore
```
## 🚀 Getting Started
### Step 1: Prepare Input Data
**Place WordPress posts CSV:**
```
input/new-propositions.csv
```
**Export Google Analytics 4:**
1. Go to: Analytics > Reports > Engagement > Pages and Screens
2. Set date range: Last 90 days
3. Download CSV → Save as: `input/analytics/ga4_export.csv`
**Export Google Search Console (Pages):**
1. Go to: Performance
2. Set date range: Last 90 days
3. Export CSV → Save as: `input/analytics/gsc/Pages.csv`
### Step 2: Run Analysis
```bash
# Run entire pipeline
./run_analysis.sh
# OR run steps individually
./venv/bin/python analytics_importer.py
./venv/bin/python opportunity_analyzer.py
./venv/bin/python content_gap_analyzer.py
./venv/bin/python report_generator.py
```
### Step 3: Review Report
Open: **`output/results/seo_optimization_report.md`**
Contains:
- Executive summary with current metrics
- Top 20 posts ranked by opportunity (with AI recommendations)
- Keyword opportunities breakdown
- Content gap analysis
- 90-day phased action plan
## 📊 What Each Script Does
### `analytics_importer.py` (Phase 1)
**Purpose:** Merge analytics data with WordPress posts
**Input:**
- `input/new-propositions.csv` (WordPress posts)
- `input/analytics/ga4_export.csv` (Google Analytics)
- `input/analytics/gsc/Pages.csv` (Search Console)
**Output:**
- `output/results/posts_with_analytics.csv` (enriched dataset)
- `output/logs/import_log.txt` (matching report)
**Handles:** French and English column names, URL normalization, multi-source merging
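URL normalization matters because GA4 exports page paths while Search Console exports full URLs, so the two cannot be joined as-is. A minimal sketch of one possible normalization follows; the exact rules used by `analytics_importer.py` are not shown here, so lowercasing and dropping query strings are assumptions.

```python
from urllib.parse import urlsplit


def normalize_url(value):
    """Reduce a full URL or a bare path to a lowercase path with one trailing slash."""
    path = urlsplit(value.strip()).path.strip("/").lower()
    return f"/{path}/" if path else "/"
```

Both `https://www.mistergeek.net/best-vpn-services/` and `/Best-VPN-Services?utm_source=x` reduce to `/best-vpn-services/`, which can then serve as the join key across sources.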
### `opportunity_analyzer.py` (Phase 2)
**Purpose:** Identify high-potential optimization opportunities
**Input:**
- `output/results/posts_with_analytics.csv`
**Output:**
- `output/results/keyword_opportunities.csv` (26 opportunities)
- `output/logs/opportunity_analysis_log.txt`
**Features:**
- Filters posts at positions 11-30 (page 2-3)
- Calculates opportunity scores (0-100)
- Generates AI recommendations for top 20 posts
### `content_gap_analyzer.py` (Phase 3)
**Purpose:** Discover new content opportunities
**Input:**
- `output/results/posts_with_analytics.csv`
- `input/analytics/gsc/Requêtes.csv` (optional)
**Output:**
- `output/results/content_gaps.csv`
- `output/logs/content_gap_analysis_log.txt`
**Features:**
- Topic cluster extraction
- Gap identification
- AI-powered content suggestions
### `report_generator.py` (Phase 4)
**Purpose:** Create comprehensive report with action plan
**Input:**
- All analysis results from phases 1-3
**Output:**
- `output/results/seo_optimization_report.md` (**PRIMARY DELIVERABLE**)
- `output/results/posts_prioritized.csv`
**Features:**
- Comprehensive markdown report
- All 262 posts ranked
- 90-day action plan with estimated gains
## 📈 Understanding Your Report
### Key Metrics (Executive Summary)
- **Total Posts:** All posts analyzed
- **Monthly Traffic:** Current organic traffic
- **Total Impressions:** Search visibility (90 days)
- **Average Position:** Current ranking position
- **Opportunities:** Posts ready to optimize
### Top 20 Posts to Optimize
Each post shows:
- **Title** (the post name)
- **Current Position** (search ranking)
- **Impressions** (search visibility)
- **Traffic** (organic visits)
- **Priority Score** (0-100 opportunity rating)
- **Status** (page 1 vs page 2-3)
- **Recommendations** (how to improve)
### Priority Scoring (0-100)
Higher scores = more opportunity for gain with less effort
Calculated from:
- **Position (35%)** - How close to page 1
- **Traffic Potential (30%)** - Search impressions
- **CTR Gap (20%)** - Improvement opportunity
- **Content Quality (15%)** - Existing engagement
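The weighting can be read as a simple weighted sum. The sketch below uses the four percentages listed above; how each component is scaled to the 0.0-1.0 range is an assumption, since the guide does not spell that out.

```python
# Weights in percentage points, taken from the list above.
WEIGHTS = {
    "position": 35,
    "traffic_potential": 30,
    "ctr_gap": 20,
    "content_quality": 15,
}


def priority_score(components):
    """Combine component scores (each scaled to 0.0-1.0) into a 0-100 score."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        # Clamp each component so one bad input cannot push the score out of range.
        total += weight * min(max(components[name], 0.0), 1.0)
    return round(total, 1)
```

A post that maxes out only the position component scores 35, which shows why proximity to page 1 dominates the ranking.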
## 🎯 Action Plan
### Week 1-2: Quick Wins (+100 visits/month)
- Focus on posts at positions 11-15
- Update SEO titles and meta descriptions
- 30-60 minutes per post
### Week 3-4: Core Optimization (+150 visits/month)
- Posts 6-15 in priority list
- Add content sections
- Improve structure with headers
- 2-3 hours per post
### Week 5-8: New Content (+300 visits/month)
- Create 3-5 new posts from gap analysis
- Target high-search-demand topics
- 4-6 hours per post
### Week 9-12: Refinement (+100 visits/month)
- Monitor ranking improvements
- Refine underperforming optimizations
- Prepare next round of analysis
**Total: +650 visits/month potential gain**
## 🔧 Configuration
Edit `.env` to customize analysis:
```bash
# Position range for opportunities
ANALYSIS_MIN_POSITION=11
ANALYSIS_MAX_POSITION=30
# Minimum impressions to consider
ANALYSIS_MIN_IMPRESSIONS=50
# Posts for AI recommendations
ANALYSIS_TOP_N_POSTS=20
```
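The settings above could be read and applied roughly as follows. This is a sketch under assumptions: the function names and the `position` / `impressions` keys are hypothetical, the defaults simply mirror the example values, and the code reads the process environment (loading `.env` itself would need a separate loader).

```python
import os


def load_analysis_settings(env=None) -> dict:
    """Read the ANALYSIS_* settings, falling back to the example values above."""
    env = os.environ if env is None else env
    return {
        "min_position": int(env.get("ANALYSIS_MIN_POSITION", 11)),
        "max_position": int(env.get("ANALYSIS_MAX_POSITION", 30)),
        "min_impressions": int(env.get("ANALYSIS_MIN_IMPRESSIONS", 50)),
        "top_n_posts": int(env.get("ANALYSIS_TOP_N_POSTS", 20)),
    }


def is_opportunity(post: dict, settings: dict) -> bool:
    """True when a post sits in the configured position range with enough impressions."""
    in_range = settings["min_position"] <= post["position"] <= settings["max_position"]
    return in_range and post["impressions"] >= settings["min_impressions"]
```

Lowering `ANALYSIS_MIN_IMPRESSIONS` widens the set of posts that pass `is_opportunity`, which is why it is the first setting to try when no opportunities are found.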
## 🐛 Troubleshooting
### Missing Input Files
```
❌ Error: File not found: input/...
```
→ Check that all files are in the correct locations
### Empty Report Titles
✓ FIXED - Now correctly loads post titles from multiple column names
### No Opportunities Found
```
⚠️ No opportunities found in specified range
```
→ Try lowering `ANALYSIS_MIN_IMPRESSIONS` in `.env`
### API Errors
```
❌ AI generation failed: ...
```
→ Check `OPENROUTER_API_KEY` in `.env` and account balance
## 📚 Additional Resources
- **`input/README.md`** - How to export analytics data
- **`output/README.md`** - Output files guide
- **`QUICKSTART_ANALYSIS.md`** - Step-by-step tutorial
- **`ANALYSIS_SYSTEM.md`** - Technical documentation
## ✅ Success Checklist
- [ ] All input files placed in `input/` directory
- [ ] `.env` file configured with API key
- [ ] Ran `./run_analysis.sh` successfully
- [ ] Reviewed `output/results/seo_optimization_report.md`
- [ ] Identified 5-10 quick wins to start with
- [ ] Created action plan for first week
## 🎓 Key Learnings
### Why Positions 11-30 Matter
- **Page 1** posts are hard to move
- **Page 2-3** posts are easy wins (small improvements move them up)
- **Quick gains:** 1-2 position improvements = CTR increases 20-30%
### CTR Expectations by Position
- Position 1: ~30% CTR
- Position 5-10: 4-7% CTR
- Position 11-15: 1-2% CTR (quick wins)
- Position 16-20: 0.8-1% CTR
- Position 21-30: ~0.5% CTR
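To turn these figures into a click estimate, a lookup like the one below may help. It is a sketch only: the band values are midpoints of the approximate ranges listed above, positions 2-4 are not covered by that list so the lookup returns `None` for them, and both function names are hypothetical.

```python
# Approximate CTR per position band, using midpoints of the ranges listed above.
CTR_BANDS = [
    (1, 1, 0.30),
    (5, 10, 0.055),
    (11, 15, 0.015),
    (16, 20, 0.009),
    (21, 30, 0.005),
]


def expected_ctr(position: int):
    """Return the rough CTR for a position, or None when the list has no figure."""
    for low, high, ctr in CTR_BANDS:
        if low <= position <= high:
            return ctr
    return None


def estimated_clicks(impressions: int, position: int):
    """Estimate clicks for a given number of impressions at a position."""
    ctr = expected_ctr(position)
    return None if ctr is None else round(impressions * ctr)
```

Comparing `estimated_clicks` at the current and target positions gives a quick sense of what a ranking improvement is worth for a specific post.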
### Content Quality Signals
- Higher bounce rate = less relevant content
- Low traffic = poor CTR or position
- Low impressions = insufficient optimization
## 📞 Support
### Check Logs First
```
output/logs/import_log.txt
output/logs/opportunity_analysis_log.txt
output/logs/content_gap_analysis_log.txt
```
### Common Issues
1. **Empty titles** → Fixed with flexible column name mapping
2. **File not found** → Check file locations match structure
3. **API errors** → Verify API key and account balance
4. **No opportunities** → Lower minimum impressions threshold
## 🚀 Ready to Optimize?
1. Prepare your input data
2. Run `./run_analysis.sh`
3. Open the report
4. Start with quick wins
5. Track improvements in 4 weeks
Good luck boosting your SEO! 📈
---
**Last Updated:** February 2026
**System Status:** Production Ready ✅

@@ -0,0 +1,145 @@
# Quick Start: Multi-Site SEO Analyzer
## 30-Second Setup
### 1. Configure WordPress Access
Update `.env` with your 3 site credentials:
```bash
WORDPRESS_MISTERGEEK_URL=https://www.mistergeek.net
WORDPRESS_MISTERGEEK_USERNAME=your_username
WORDPRESS_MISTERGEEK_PASSWORD=your_app_password
WORDPRESS_WEBSCROLL_URL=https://www.webscroll.fr
WORDPRESS_WEBSCROLL_USERNAME=your_username
WORDPRESS_WEBSCROLL_PASSWORD=your_app_password
WORDPRESS_HELLOGEEK_URL=https://www.hellogeek.net
WORDPRESS_HELLOGEEK_USERNAME=your_username
WORDPRESS_HELLOGEEK_PASSWORD=your_app_password
```
### 2. Run Analyzer
```bash
# With AI recommendations (recommended)
python scripts/multi_site_seo_analyzer.py
# Without AI (faster, free)
python scripts/multi_site_seo_analyzer.py --no-ai
# Custom AI posts (top 20)
python scripts/multi_site_seo_analyzer.py --top-n 20
```
### 3. Review Results
```bash
# Markdown summary (human-friendly)
open output/reports/seo_analysis_*_summary.md
# Detailed CSV (for importing to sheets)
open output/reports/seo_analysis_*.csv
```
## What Gets Analyzed
### Title (40% of score)
- ✓ Length: 50-70 characters optimal
- ✓ Power words: "best", "complete", "guide", etc.
- ✓ Numbers: "2025", "Top 10", etc.
- ✓ Readability: No weird special chars
### Meta Description (60% of score)
- ✓ Present: Required for full score
- ✓ Length: 120-160 characters optimal
- ✓ Call-to-action: "learn", "discover", "find", etc.
- ✓ Compelling: Not just keywords
## Cost
| Command | Cost | Time |
|---------|------|------|
| `--no-ai` | $0 | 2-3 min |
| `--top-n 10` | ~$0.10 | 5-10 min |
| `--top-n 20` | ~$0.50 | 10-15 min |
| `--top-n 50` | ~$1.00 | 20-30 min |
## Understanding Output
### Score Ranges
| Score | Status | Action |
|-------|--------|--------|
| 0-25 | Critical | Fix immediately |
| 25-50 | Poor | Optimize soon |
| 50-75 | Fair | Improve when possible |
| 75-90 | Good | Minor tweaks only |
| 90-100 | Excellent | No changes needed |
### Priority Order
1. Posts with score < 50 (biggest impact)
2. Posts with missing meta description (easy fix)
3. Posts with weak titles (quick improvement)
4. High-traffic posts with any issues (traffic × improvement)
## One-Liner to Get Started
If all 3 sites use the **same credentials**:
```bash
# Just set primary site, others inherit
WORDPRESS_URL=https://www.mistergeek.net \
WORDPRESS_USERNAME=your_user \
WORDPRESS_APP_PASSWORD=your_pass \
OPENROUTER_API_KEY=your_key \
python scripts/multi_site_seo_analyzer.py --no-ai
```
## Common Commands
```bash
# Published posts only (default)
python scripts/multi_site_seo_analyzer.py
# Published + draft posts
python scripts/multi_site_seo_analyzer.py --include-drafts
# Quick scan, no AI
python scripts/multi_site_seo_analyzer.py --no-ai
# Drafts + AI recommendations
python scripts/multi_site_seo_analyzer.py --include-drafts --top-n 10
# Analyze with recommendations for top 30
python scripts/multi_site_seo_analyzer.py --top-n 30
# Save to custom location
python scripts/multi_site_seo_analyzer.py --output my_report.csv
```
## Troubleshooting
**"No posts found"**
- Check credentials in .env
- Verify site is online
- Try without auth: remove username/password from config
**"Connection refused"**
- Verify site URLs (https, www)
- Check if REST API is enabled
- Try https://yoursite.com/wp-json/ in browser
**"No AI recommendations"**
- Check OPENROUTER_API_KEY is set
- Verify key has credits
- Use --no-ai to test other features
## Next Steps
1. Run: `python scripts/multi_site_seo_analyzer.py`
2. Open: `output/reports/seo_analysis_*_summary.md`
3. Implement: Top 5 recommendations per site
4. Re-run: 30 days later to track improvement

@@ -0,0 +1,288 @@
# Rank Math REST API Configuration - Complete Guide
## The Problem
Rank Math meta fields (`rank_math_description`, `rank_math_title`, etc.) are not exposed in the WordPress REST API by default. Our SEO analyzer needs these fields to be available.
## Solution: Enable REST API in Rank Math
### Step 1: Go to Rank Math Settings
In WordPress Admin:
```
Rank Math → Settings → Advanced
```
### Step 2: Find REST API Section
Look for one of these options:
- **"REST API"** - Enable
- **"Expose in REST API"** - Check/Enable
- **"API"** - Look for REST API toggle
- **"Integrations"** - REST API section
### Step 3: Enable All Rank Math Fields
Make sure these are exposed:
- ✓ SEO Title (`rank_math_title`)
- ✓ SEO Description (`rank_math_description`)
- ✓ Focus Keyword (`rank_math_focus_keyword`)
- ✓ Canonical URL (`rank_math_canonical_url`)
### Step 4: Save Changes
Click **Save** and wait for confirmation.
---
## Verify It Works
### Test 1: Use curl
Run this command (replace credentials and domain):
```bash
curl -u "your_username:your_app_password" \
"https://www.mistergeek.net/wp-json/wp/v2/posts?per_page=1&status=publish" \
| jq '.[] | .meta | keys'
```
**You should see:**
```json
[
"rank_math_description",
"rank_math_title",
"rank_math_focus_keyword",
...other fields...
]
```
If you see Rank Math fields: ✓ **SUCCESS!**
If you don't see them: ⚠️ **Rank Math REST API not enabled yet**
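The same check can be scripted in Python with only the standard library. This is a sketch that assumes the standard `/wp-json/wp/v2/posts` endpoint and Basic authentication with an application password, as in the curl command; `rank_math_keys` and `fetch_first_post` are hypothetical helper names, not part of the analyzer.

```python
import base64
import json
import urllib.request


def rank_math_keys(post: dict) -> list:
    """Return the sorted Rank Math meta keys present in one REST API post object."""
    meta = post.get("meta")
    if not isinstance(meta, dict):  # WordPress returns [] when no meta is registered
        return []
    return sorted(key for key in meta if key.startswith("rank_math_"))


def fetch_first_post(base_url: str, username: str, app_password: str) -> dict:
    """Fetch one published post, authenticating with an application password."""
    url = f"{base_url.rstrip('/')}/wp-json/wp/v2/posts?per_page=1&status=publish"
    token = base64.b64encode(f"{username}:{app_password}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request, timeout=30) as response:
        posts = json.load(response)
    return posts[0] if posts else {}


if __name__ == "__main__":
    post = fetch_first_post("https://www.mistergeek.net", "your_username", "your_app_password")
    print(rank_math_keys(post) or "No rank_math_* fields exposed")
```

An empty result means the fields are not exposed for that user, which matches the warning case above.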
### Test 2: Run Diagnostic Again
```bash
python scripts/multi_site_seo_analyzer.py --diagnose https://www.mistergeek.net
```
**Look for:**
```
Available meta fields:
• rank_math_description: Your SEO description...
• rank_math_title: Your SEO title...
• rank_math_focus_keyword: your keyword
```
---
## If REST API Still Not Working
### Check 1: Rank Math Version
Make sure you have **latest Rank Math version**:
```
WordPress Admin → Plugins → Rank Math SEO
Check version number, update if available
```
### Check 2: WordPress Version Compatibility
Rank Math REST API support requires:
- ✓ WordPress 5.0+
- ✓ Rank Math 1.0.50+
If older: **Update both WordPress and Rank Math**
### Check 3: User Permissions
Your WordPress user must have:
- **Administrator** or **Editor** role
- `edit_posts` capability
- `read` capability
Try with **Administrator** account if unsure.
### Check 4: Security Plugin Blocking
Some security plugins block REST API:
- Wordfence
- Sucuri
- iThemes Security
- All in One WP Security
**Try temporarily disabling** to test:
```
WordPress Admin → Plugins → Deactivate [Security Plugin]
Run diagnostic
Re-enable plugin
```
If diagnostic works after disabling: **The security plugin is blocking REST API**
**Fix:** Whitelist your IP in security plugin settings, or contact plugin support.
### Check 5: Server Configuration
Some hosting limits REST API:
- GoDaddy (sometimes)
- Bluehost (sometimes)
- Cheap shared hosting
**Test with curl:**
```bash
curl "https://www.mistergeek.net/wp-json/"
```
Should return API info. If 403/404: **Contact hosting provider**
---
## Alternative: Use Rank Math API Manager Plugin
If the above doesn't work, you can use the **Rank Math API Manager** plugin:
1. **Install plugin:**
- GitHub: https://github.com/devora-as/rank-math-api-manager
- Or search "Rank Math API Manager" in WordPress plugin directory
2. **Activate plugin:**
```
WordPress Admin → Plugins → Activate Rank Math API Manager
```
3. **Configure:**
- Plugin provides custom REST API endpoints
- Our script can be updated to use these endpoints
4. **Contact us** if you want to integrate this approach
---
## Complete Checklist
Before running analyzer:
- [ ] Installed Rank Math SEO (latest version)
- [ ] WordPress 5.0+
- [ ] Rank Math 1.0.50+
- [ ] Admin/Editor user account
- [ ] Rank Math REST API enabled in settings
- [ ] Verified with diagnostic command
- [ ] Verified with curl command
- [ ] No security plugin blocking REST API
- [ ] Hosting supports REST API
## Quick Troubleshooting
| Symptom | Cause | Fix |
|---------|-------|-----|
| No `rank_math_*` fields in diagnostic | REST API not enabled | Enable in Rank Math Settings → Advanced |
| 401 Unauthorized error | Wrong credentials | Verify username and app password |
| 403 Forbidden | User lacks permissions | Use Administrator account |
| 404 error | REST API blocked | Check security plugin or hosting |
| Empty meta fields | Rank Math not setting meta | Check if posts have Rank Math data in admin |
---
## Step-by-Step Setup (Visual Guide)
### Step 1: Login to WordPress Admin
```
https://www.mistergeek.net/wp-admin/
```
### Step 2: Go to Rank Math Settings
```
Left menu → Rank Math → Settings
```
### Step 3: Find Advanced Tab
```
Tabs: Dashboard | Wizards | Analytics | Content AI | Settings | Tools
Click: Settings
Sub-tabs: General | Advanced | Integrations
Click: Advanced
```
### Step 4: Find REST API Section
```
Look for: "REST API" heading or toggle
Sub-options: "Expose in REST API" checkboxes
```
### Step 5: Enable Checkboxes
```
✓ Expose SEO Title in REST API
✓ Expose SEO Description in REST API
✓ Expose Focus Keyword in REST API
✓ Expose Canonical URL in REST API
```
### Step 6: Save
```
Click: Save button at bottom
Wait for: "Settings saved" message
```
### Step 7: Test
```
Terminal: python scripts/multi_site_seo_analyzer.py --diagnose https://www.mistergeek.net
Look for: rank_math_description in output
```
---
## If You Find Different Settings
Rank Math UI changes between versions. If the above steps don't match your screen:
**Search in Rank Math:**
```
1. Open Rank Math → Settings
2. Use browser Find (Ctrl+F or Cmd+F)
3. Search for: "REST" or "API"
4. Follow the UI from there
```
---
## After Enabling REST API
1. **Run diagnostic:**
```bash
python scripts/multi_site_seo_analyzer.py --diagnose https://www.mistergeek.net
```
2. **Should see:**
```
Available meta fields:
• rank_math_description: Your description here...
• rank_math_title: Your title...
• rank_math_focus_keyword: keyword
```
3. **Then run analyzer:**
```bash
python scripts/multi_site_seo_analyzer.py --include-drafts --top-n 50
```
4. **Meta descriptions will now be detected!**
---
## Next Steps
1. **Go to Rank Math settings** and enable REST API
2. **Run diagnostic** to verify:
```bash
python scripts/multi_site_seo_analyzer.py --diagnose https://www.mistergeek.net
```
3. **If successful**, run full analyzer:
```bash
python scripts/multi_site_seo_analyzer.py --include-drafts --top-n 50
```
4. **Share the diagnostic output** if you're still having issues
Let me know when you've enabled REST API in Rank Math! 🚀

@@ -0,0 +1,443 @@
# Multi-Site SEO Analyzer Guide
**Purpose:** Fetch posts from all 3 WordPress sites, analyze titles and meta descriptions, and provide AI-powered optimization recommendations.
**Output:** CSV with detailed analysis + Markdown summary report
---
## Overview
The Multi-Site SEO Analyzer does the following:
1. **Fetches** all published posts from your 3 WordPress sites (mistergeek.net, webscroll.fr, hellogeek.net)
2. **Analyzes** each post's:
- Title (length, power words, numbers, readability)
- Meta description (presence, length, call-to-action)
3. **Scores** posts on SEO best practices (0-100)
4. **Generates** AI recommendations for your top priority posts
5. **Exports** results to CSV for action
---
## Setup
### Step 1: Configure WordPress Access
Update your `.env` file with credentials for all 3 sites:
```bash
# Primary site (fallback for others if not specified)
WORDPRESS_URL=https://www.mistergeek.net
WORDPRESS_USERNAME=your_username
WORDPRESS_APP_PASSWORD=your_app_password
# Site 1: mistergeek.net (uses primary credentials if not specified)
WORDPRESS_MISTERGEEK_URL=https://www.mistergeek.net
WORDPRESS_MISTERGEEK_USERNAME=your_username
WORDPRESS_MISTERGEEK_PASSWORD=your_app_password
# Site 2: webscroll.fr
WORDPRESS_WEBSCROLL_URL=https://www.webscroll.fr
WORDPRESS_WEBSCROLL_USERNAME=your_username
WORDPRESS_WEBSCROLL_PASSWORD=your_app_password
# Site 3: hellogeek.net
WORDPRESS_HELLOGEEK_URL=https://www.hellogeek.net
WORDPRESS_HELLOGEEK_USERNAME=your_username
WORDPRESS_HELLOGEEK_PASSWORD=your_app_password
# OpenRouter API (for AI recommendations)
OPENROUTER_API_KEY=your_key
```
**Note:** If a site's credentials are not specified, the script uses the primary site credentials.
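The fallback described in the note can be pictured as below. This is a hedged sketch, not the contents of the project's config module: `site_credentials` is a hypothetical helper, and it assumes only the username and password fall back to the primary values while each site keeps its own URL.

```python
import os


def site_credentials(site_key: str, env=None) -> dict:
    """Resolve one site's settings, using primary credentials when site ones are unset."""
    env = os.environ if env is None else env
    prefix = f"WORDPRESS_{site_key.upper()}_"
    return {
        "url": env.get(prefix + "URL"),
        "username": env.get(prefix + "USERNAME") or env.get("WORDPRESS_USERNAME"),
        "password": env.get(prefix + "PASSWORD") or env.get("WORDPRESS_APP_PASSWORD"),
    }
```

For example, `site_credentials("webscroll")` would pick up `WORDPRESS_WEBSCROLL_USERNAME` when it is set and `WORDPRESS_USERNAME` otherwise.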
### Step 2: Verify Your .env
```bash
cat .env | grep -E "WORDPRESS|OPENROUTER"
```
---
## Usage
### Basic Usage (with AI recommendations)
```bash
python scripts/multi_site_seo_analyzer.py
```
This will:
- Fetch all posts from 3 sites
- Analyze each post
- Generate AI recommendations for top 10 worst-scoring posts
- Export results to CSV and Markdown
### Include Draft Posts
```bash
python scripts/multi_site_seo_analyzer.py --include-drafts
```
Analyzes both published and draft posts. Useful for:
- Optimizing posts before publishing
- Recovering removed content saved as drafts
- Getting full picture of all content
- CSV will show `status` column (publish/draft)
### Skip AI (Save Cost)
```bash
python scripts/multi_site_seo_analyzer.py --no-ai
```
Analyzes posts without AI recommendations. Good for:
- Quick overview
- Sites with >500 posts (AI costs add up)
- Budget testing
### Generate AI for Top 20 Posts
```bash
python scripts/multi_site_seo_analyzer.py --top-n 20
```
AI recommendations for 20 worst-scoring posts instead of 10.
### Combine Options
```bash
# Analyze published + drafts with AI for top 20
python scripts/multi_site_seo_analyzer.py --include-drafts --top-n 20
# Analyze drafts only (then filter in Excel to status=draft)
python scripts/multi_site_seo_analyzer.py --include-drafts --no-ai
```
### Custom Output File
```bash
python scripts/multi_site_seo_analyzer.py --output output/custom_report.csv
```
---
## Output Files
### 1. CSV Report: `seo_analysis_TIMESTAMP.csv`
Contains all analyzed posts with columns:
| Column | Description |
|--------|-------------|
| `site` | Website (mistergeek.net, webscroll.fr, hellogeek.net) |
| `post_id` | WordPress post ID |
| `title` | Post title |
| `slug` | Post slug |
| `url` | Full post URL |
| `meta_description` | Current meta description |
| `title_score` | Title SEO score (0-100) |
| `title_issues` | Title problems (too short, no power words, etc.) |
| `title_recommendations` | How to improve title |
| `meta_score` | Meta description SEO score (0-100) |
| `meta_issues` | Meta description problems |
| `meta_recommendations` | How to improve meta |
| `overall_score` | Combined score (40% title + 60% meta) |
| `ai_recommendations` | Claude-generated specific recommendations |
### 2. Summary Report: `seo_analysis_TIMESTAMP_summary.md`
Human-readable markdown with:
- Overall statistics (total posts, average score, cost)
- Priority issues (missing meta, weak titles, weak descriptions)
- Per-site breakdown
- Top 5 posts to optimize on each site
- Legend explaining scores
---
## Understanding Scores
### Title Score (0-100)
**What's analyzed:**
- Length (target: 50-70 characters)
- Power words (best, complete, guide, ultimate, essential, etc.)
- Numbers (top 5, 2025, etc.)
- Special characters that might break rendering
**Optimal title example:**
"The Complete 2025 Guide to VPN Services (Updated)"
- Length: 49 characters (just under the 50-character target)
- Power words: "Complete", "Guide" ✓
- Numbers: "2025" ✓
- Score: 95/100
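The title checks listed above can be sketched as a small function. This is illustrative only: the power-word set contains just the words named in this guide, the issue labels are modeled on the report's wording, and how issues translate into a 0-100 score is not specified here, so the sketch stops at listing issues.

```python
import re

# Only the power words named in this guide; the real list may be longer.
POWER_WORDS = {"best", "complete", "guide", "ultimate", "essential"}


def title_issues(title: str) -> list:
    """List title problems using the 50-70 character, power-word and number checks."""
    issues = []
    length = len(title)
    if length < 50:
        issues.append(f"Too short ({length})")
    elif length > 70:
        issues.append(f"Too long ({length})")
    words = set(re.findall(r"[a-z]+", title.lower()))
    if not words & POWER_WORDS:
        issues.append("Missing power word")
    if not re.search(r"\d", title):
        issues.append("No number")
    return issues


print(title_issues("VPN"))
```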
### Meta Description Score (0-100)
**What's analyzed:**
- Presence (missing = 0 score)
- Length (target: 120-160 characters)
- Call-to-action (learn, discover, find, check, etc.)
**Optimal meta example:**
"Discover the best VPN services for 2025. Compare 50+ options, learn about encryption, and find the perfect VPN for your needs. Updated monthly."
- Length: 143 characters ✓
- CTA: "Discover", "Compare", "learn", "find" ✓
- Score: 90/100
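The meta description checks follow the same pattern. Again a sketch under assumptions: the call-to-action set holds only the verbs named above, the issue labels are hypothetical, and no scoring formula is implied.

```python
import re

# Only the call-to-action verbs named in this guide; the real list may be longer.
CTA_WORDS = {"learn", "discover", "find", "check"}


def meta_issues(description) -> list:
    """List meta description problems: presence, 120-160 character length, CTA."""
    text = (description or "").strip()
    if not text:
        return ["Missing meta description"]
    issues = []
    if len(text) < 120:
        issues.append(f"Too short ({len(text)})")
    elif len(text) > 160:
        issues.append(f"Too long ({len(text)})")
    words = set(re.findall(r"[a-z]+", text.lower()))
    if not words & CTA_WORDS:
        issues.append("No call-to-action")
    return issues
```

Running `meta_issues` on the example description above returns an empty list, since it is within the length range and contains several call-to-action verbs.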
### Overall Score (0-100)
```
Overall = (Title Score × 40%) + (Meta Score × 60%)
```
The meta description is weighted more heavily because it directly affects click-through rate from search results.
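As a worked instance of the formula, the sketch below combines the two scores. The function name is hypothetical, and rounding to a whole number is an assumption based on the integer scores shown in the CSV examples.

```python
def overall_score(title_score: float, meta_score: float) -> int:
    """Overall = 40% title score + 60% meta score, rounded to a whole number."""
    return round(0.4 * title_score + 0.6 * meta_score)


# A strong title cannot compensate for a missing meta description.
print(overall_score(95, 0))
```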
---
## Action Plan
### 1. Review Results
```bash
# Open the summary report
open output/reports/seo_analysis_*.md
# Or open the detailed CSV
open output/reports/seo_analysis_*.csv
```
### 2. Prioritize by Score
**High Priority (Score < 50):**
- Title issues OR missing/weak meta
- Implement AI recommendations immediately
- Estimated impact: 10-20% CTR improvement
**Medium Priority (Score 50-75):**
- Minor title or meta issues
- Apply recommendations when convenient
- Estimated impact: 5-10% CTR improvement
**Low Priority (Score > 75):**
- Already optimized
- Only update if major content changes
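The three tiers can be applied to the `overall_score` column with a helper like this one. It is a sketch: the function names are hypothetical, and treating a score of exactly 50 or 75 as medium priority is an assumption, since the tiers above do not say which side the boundaries fall on.

```python
from collections import Counter


def priority_for(overall_score: float) -> str:
    """Map an overall score to the high / medium / low tiers described above."""
    if overall_score < 50:
        return "high"
    if overall_score <= 75:
        return "medium"
    return "low"


def priority_counts(scores) -> Counter:
    """Count how many posts fall into each priority tier."""
    return Counter(priority_for(score) for score in scores)
```

The counts give a quick estimate of the workload before opening the full CSV.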
### 3. Batch Implementation
**For WordPress:**
1. Go to WordPress admin
2. Edit post
3. Update title (if recommended)
4. Update meta description in Yoast SEO or All in One SEO:
- Yoast: Bottom of editor → "SEO" tab → Meta description
- AIOSEO: Right sidebar → "General" → Description
5. Save post
**OR use bulk operations** if your SEO plugin supports it.
### 4. Monitor Impact
Re-run the analyzer in 30 days:
```bash
python scripts/multi_site_seo_analyzer.py
```
Track improvements:
- Average score increase
- Fewer posts with score < 50
- Posts moved from "Missing meta" to "Strong meta"
---
## Cost Estimation
### AI Recommendation Costs
Using Claude 3.5 Sonnet via OpenRouter ($3 input / $15 output per 1M tokens):
**Scenario 1: 10 posts with AI**
- ~2,500 input tokens per post × 10 = 25,000 input tokens
- ~500 output tokens per post × 10 = 5,000 output tokens
- Cost: (25,000 × $3 + 5,000 × $15) / 1,000,000 = **$0.15** (15¢)
**Scenario 2: 50 posts with AI**
- 125,000 input + 25,000 output tokens
- Cost: (125,000 × $3 + 25,000 × $15) / 1,000,000 = **$0.75** (75¢)
**Scenario 3: No AI (--no-ai flag)**
- Cost: **$0.00**
### Monthly Scenarios
Monthly figures assume four runs per month for weekly schedules.
| Scenario | Frequency | Cost/Month |
|----------|-----------|-----------|
| No AI | Weekly | $0 |
| 10 posts/week | Weekly | ~$0.60 |
| 20 posts/week | Weekly | ~$1.20 |
| 50 posts/month | Once | ~$0.75 |
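To estimate the cost of other batch sizes, the arithmetic can be wrapped in a small function. This is a sketch: the per-post token counts are the rough figures used in the scenarios above, the prices are the per-million-token rates quoted there, and the function name is hypothetical. Actual usage depends on post length and the model's responses.

```python
INPUT_PRICE_PER_MILLION = 3.00    # USD per 1M input tokens, as quoted above
OUTPUT_PRICE_PER_MILLION = 15.00  # USD per 1M output tokens, as quoted above


def estimate_ai_cost(posts: int, input_tokens_per_post: int = 2500,
                     output_tokens_per_post: int = 500) -> float:
    """Estimate the USD cost of AI recommendations for a number of posts."""
    input_cost = posts * input_tokens_per_post * INPUT_PRICE_PER_MILLION
    output_cost = posts * output_tokens_per_post * OUTPUT_PRICE_PER_MILLION
    return (input_cost + output_cost) / 1_000_000


for batch in (10, 20, 50):
    print(batch, f"${estimate_ai_cost(batch):.2f}")
```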
---
## Troubleshooting
### "Connection refused" on a site
**Problem:** WordPress site is down or credentials are wrong.
**Solutions:**
1. Check site URL is correct (https, www vs no-www)
2. Verify credentials: Try logging in manually
3. Check if site has REST API enabled: `https://yoursite.com/wp-json/`
4. Skip that site temporarily (remove from config, re-run)
### "No posts found"
**Problem:** API returns 0 posts.
**Solutions:**
1. Verify credentials have permission to read posts
2. Check if posts exist on the site
3. Try without authentication (remove from config)
4. Check if REST API is disabled
### AI recommendations are empty
**Problem:** OpenRouter API call failed.
**Solutions:**
1. Verify OPENROUTER_API_KEY is set: `echo $OPENROUTER_API_KEY`
2. Check API key is valid (not expired, has credits)
3. Try with --no-ai flag to verify the rest works
4. Check internet connection
### Memory issues with 1000+ posts
**Problem:** Script runs out of memory.
**Solutions:**
1. Run --no-ai version first (lighter)
2. Analyze one site at a time (modify config temporarily)
3. Increase system memory or close other apps
---
## Advanced Usage
### Analyze One Site
Temporarily comment out sites in config.py or create a custom script:
```python
from scripts.multi_site_seo_analyzer import MultiSiteSEOAnalyzer
from scripts.config import Config

analyzer = MultiSiteSEOAnalyzer()

# Override to just one site
analyzer.sites_config = {
    'mistergeek.net': Config.WORDPRESS_SITES['mistergeek.net']
}
analyzer.run(use_ai=True, top_n=20)
```
### Export to Google Sheets
1. Download the CSV
2. Open Google Sheets
3. File → Import → Upload CSV
4. Share link with team
5. Filter by site or score
6. Add "Completed" checkbox column
7. Track progress as you optimize
### Integrate with WordPress via Zapier
1. Export CSV from analyzer
2. Use Zapier to trigger WordPress post updates
3. Automatically update meta descriptions for high-priority posts
4. (Advanced - requires Zapier Pro)
---
## Examples
### Example 1: Post with Low Title Score
```
Title: "VPN"
Title Issues: Too short (3), Missing power word, No number
Title Score: 10/100
Recommendation: Expand title to include benefit and year
Better Title: "Best VPN Services 2025: Complete Guide"
```
### Example 2: Post with Missing Meta
```
Meta Description: [MISSING]
Meta Score: 0/100
AI Recommendation:
"Write a meta description: 'Learn about the best VPN services for 2025.
Compare 50+ providers, understand encryption, and choose the right VPN
for your needs. Updated weekly.' (150 characters)"
```
### Example 3: Strong Post (No Changes Needed)
```
Title: "The Complete Guide to NordVPN: Features, Pricing, and Reviews"
Title Issues: None
Title Score: 95/100
Meta: "Comprehensive review of NordVPN including speed tests, security features, pricing plans, and user reviews. Find out if NordVPN is right for you."
Meta Issues: None
Meta Score: 95/100
Overall Score: 95/100
Status: No changes needed ✓
```
---
## FAQ
**Q: How often should I run this?**
A: Monthly or after publishing 10+ new posts. More frequent for highly competitive topics.
**Q: Will changing titles affect SEO?**
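A: It can, usually for the better. The URL stays the same, so existing links keep working, but search engines re-evaluate the page after a title change and rankings may shift for a while. Change titles deliberately and monitor the result.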
**Q: Should I update all weak meta descriptions?**
A: Prioritize posts with traffic. Update high-traffic posts first for maximum impact.
**Q: Can I use this on a site with 5000+ posts?**
A: Yes, but consider:
- Using --no-ai on first run (faster)
- Running once per month instead of weekly
- Focusing AI analysis on high-traffic posts only
**Q: What if my site uses a different SEO plugin?**
A: The script looks for common meta description fields. If it finds nothing, add one manually. Plugin doesn't matter; the meta description HTML is standard.
---
## Next Steps
1. **Run the analyzer:** `python scripts/multi_site_seo_analyzer.py`
2. **Review the report:** Open `output/reports/seo_analysis_*_summary.md`
3. **Prioritize:** Identify posts with score < 50
4. **Implement:** Update titles and meta descriptions
5. **Track:** Re-run in 30 days to measure improvement
6. **Monitor:** Watch Google Search Console for CTR improvements
Ready to optimize? Let's go! 🚀

@@ -0,0 +1,430 @@
# Storage & Draft Posts - Complete Guide
## Storage Architecture
### How Data is Stored
The Multi-Site SEO Analyzer **does NOT use a local database**. Instead:
1. **Fetches on-demand** from WordPress REST API
2. **Analyzes in-memory** using Python
3. **Exports to CSV files** for long-term storage and review
```
┌─────────────────────────────┐
│ 3 WordPress Sites │
│ (via REST API) │
└──────────┬──────────────────┘
├─→ Fetch posts (published + optional drafts)
┌──────────▼──────────────────┐
│ Python Analysis │
│ (in-memory processing) │
└──────────┬──────────────────┘
├─→ Analyze titles
├─→ Analyze meta descriptions
├─→ Score (0-100)
├─→ AI recommendations (optional)
┌──────────▼──────────────────┐
│ CSV File Export │
│ (persistent storage) │
└─────────────────────────────┘
```
### Why CSV Instead of Database?
**Advantages:**
- ✓ No database setup or maintenance
- ✓ Easy to import to Excel/Google Sheets
- ✓ Human-readable format
- ✓ Shareable with non-technical team members
- ✓ Version control friendly (Git-trackable)
- ✓ No dependencies on database software
**Disadvantages:**
- ✗ Each run is independent (no running total)
- ✗ No real-time updates
- ✗ Manual comparison between runs
**When to use database instead:**
- If analyzing >10,000 posts regularly
- If you need real-time dashboards
- If you want automatic tracking over time
---
## CSV Output Structure
### File Location
```
output/reports/seo_analysis_TIMESTAMP.csv
```
### Columns
| Column | Description | Example |
|--------|-------------|---------|
| `site` | WordPress site | mistergeek.net |
| `post_id` | WordPress post ID | 2845 |
| `status` | Post status | publish / draft |
| `title` | Post title | "Best VPN Services 2025" |
| `slug` | URL slug | best-vpn-services-2025 |
| `url` | Full URL | https://mistergeek.net/best-vpn-2025/ |
| `meta_description` | Meta description text | "Compare 50+ VPN..." |
| `title_score` | Title SEO score (0-100) | 92 |
| `title_issues` | Problems with title | "None" |
| `title_recommendations` | How to improve | "None" |
| `meta_score` | Meta description score (0-100) | 88 |
| `meta_issues` | Problems with meta | "None" |
| `meta_recommendations` | How to improve | "None" |
| `overall_score` | Combined score | 90 |
| `ai_recommendations` | Claude-generated tips | "Consider adding..." |
### Importing to Google Sheets
1. Download CSV from `output/reports/`
2. Open Google Sheets
3. File → Import → Upload CSV
4. Add columns for tracking:
- [ ] Status (Not Started / In Progress / Done)
- [ ] Notes
- [ ] Date Completed
5. Share with team
6. Filter and sort as needed
---
## Draft Posts Feature
### What Are Drafts?
Draft posts are unpublished WordPress posts. They're:
- Written but not published
- Not visible on the website
- Still stored in WordPress and retrievable through the REST API by an authenticated user
- Perfect for analyzing before publishing
### Using Draft Posts
**By default**, the analyzer fetches **only published posts**:
```bash
python scripts/multi_site_seo_analyzer.py
```
**To include draft posts**, use the `--include-drafts` flag:
```bash
python scripts/multi_site_seo_analyzer.py --include-drafts
```
### Output with Drafts
The CSV will include a `status` column showing which posts are published vs. draft:
```csv
site,post_id,status,title,meta_score,overall_score
mistergeek.net,2845,publish,"Best VPN",88,90
mistergeek.net,2901,draft,"New VPN Draft",45,55
webscroll.fr,1234,publish,"Torrent Guide",72,75
webscroll.fr,1235,draft,"Draft Tracker Review",20,30
```
### Use Cases for Drafts
**1. Optimize Before Publishing**
If you have draft posts ready to publish:
```bash
python scripts/multi_site_seo_analyzer.py --include-drafts
```
Review their SEO scores and improve titles/meta before publishing.
**2. Recover Previous Content**
If you have removed posts saved as drafts:
```bash
python scripts/multi_site_seo_analyzer.py --include-drafts
```
Analyze them to decide: republish, improve, or delete.
**3. Audit Unpublished Work**
See what's sitting in drafts that could be published:
```bash
python scripts/multi_site_seo_analyzer.py --include-drafts | grep "draft"
```
---
## Complete Examples
### Example 1: Analyze Published Only
```bash
python scripts/multi_site_seo_analyzer.py
```
**Output:**
- Analyzes: ~262 published posts
- Time: 2-3 minutes
- Drafts: Not included
### Example 2: Analyze Published + Drafts
```bash
python scripts/multi_site_seo_analyzer.py --include-drafts
```
**Output:**
- Analyzes: ~262 published + X drafts
- Time: 2-5 minutes (depending on draft count)
- Shows status column: "publish" or "draft"
### Example 3: Analyze Published + Drafts + AI
```bash
python scripts/multi_site_seo_analyzer.py --include-drafts --top-n 20
```
**Output:**
- Analyzes: All posts (published + drafts)
- AI recommendations: Top 20 worst-scoring posts
- Cost: ~$0.20
- Time: 10-15 minutes
### Example 4: Focus on Drafts Only
The script has no drafts-only mode: `--include-drafts` returns published and draft posts together, so filter for drafts in Excel/Sheets:
1. Run: `python scripts/multi_site_seo_analyzer.py --include-drafts`
2. Open CSV in Google Sheets
3. Filter `status` column = "draft"
4. Sort by `overall_score` (lowest first)
5. Optimize top 10 drafts before publishing
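The same filter and sort can be done in Python instead of a spreadsheet. This is a sketch that assumes the CSV has the `status` and `overall_score` columns described in this guide; `lowest_scoring_drafts` is a hypothetical helper, not part of the analyzer.

```python
import csv


def lowest_scoring_drafts(csv_path: str, limit: int = 10) -> list:
    """Return the draft rows with the lowest overall_score, worst first."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        drafts = [row for row in csv.DictReader(handle) if row.get("status") == "draft"]
    return sorted(drafts, key=lambda row: float(row["overall_score"]))[:limit]
```

Each returned row is a dictionary keyed by column name, so `row["title"]` and `row["site"]` identify the draft to work on.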
---
## Comparing Results Over Time
### Manual Comparison
Since results are exported to CSV, you can track progress manually:
```bash
# Week 1
python scripts/multi_site_seo_analyzer.py --no-ai
# Save: seo_analysis_week1.csv
# (Optimize posts for 4 weeks)
# Week 5
python scripts/multi_site_seo_analyzer.py --no-ai
# Save: seo_analysis_week5.csv
# Compare in Excel/Sheets:
# Sort both by post_id
# Compare scores: Week 1 vs Week 5
```
### Calculating Improvement
Example:
| Post | Week 1 Score | Week 5 Score | Change |
|------|--------------|--------------|--------|
| Best VPN | 45 | 92 | +47 |
| Top 10 Software | 38 | 78 | +40 |
| Streaming Guide | 52 | 65 | +13 |
| **Average** | **45** | **78** | **+33** |
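The comparison can also be automated. The sketch below assumes both files have the `site`, `post_id` and `overall_score` columns described in this guide; it keys on site plus post ID because post IDs can repeat across sites. The function names are hypothetical.

```python
import csv


def load_scores(path: str) -> dict:
    """Map (site, post_id) to overall_score for one analysis CSV."""
    with open(path, newline="", encoding="utf-8") as handle:
        return {
            (row["site"], row["post_id"]): float(row["overall_score"])
            for row in csv.DictReader(handle)
        }


def score_changes(before_path: str, after_path: str) -> list:
    """Return (key, before, after, change) for posts in both runs, biggest gain first."""
    before, after = load_scores(before_path), load_scores(after_path)
    changes = [
        (key, before[key], after[key], after[key] - before[key])
        for key in before.keys() & after.keys()
    ]
    return sorted(changes, key=lambda item: item[3], reverse=True)
```

Posts that appear in only one of the two files (newly published or deleted) are skipped rather than reported as changes.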
---
## Organizing Your CSV Files
### Naming Convention
Create a folder for historical analysis:
```
output/
├── reports/
│ ├── 2025-02-16_initial_analysis.csv
│ ├── 2025-03-16_after_optimization.csv
│ ├── 2025-04-16_follow_up.csv
│ └── seo_analysis_20250216_120000.csv (latest)
```
### Archive Strategy
1. Run analyzer monthly
2. Save result with date: `seo_analysis_2025-02-16.csv`
3. Keep 12 months of history
4. Compare trends over time
---
## Advanced: Storing Recommendations
### Using a Master Spreadsheet
Instead of relying on CSV alone, create a master Google Sheet:
**Columns:**
- Post ID
- Title
- Current Score
- Issues
- Improvements Needed
- Status (Not Started / In Progress / Done)
- Completed Date
- New Score
**Process:**
1. Run analyzer: `python scripts/multi_site_seo_analyzer.py`
2. Copy relevant rows to master spreadsheet
3. As you optimize: update "Status" and "New Score"
4. Track progress visually
---
## Performance Considerations
### Fetch Time
- **Published only:** ~10-30 seconds (262 posts)
- **Published + drafts:** ~10-30 seconds (+X seconds per 100 drafts)
Drafts don't significantly affect speed, since they are fetched in the same API call as published posts.
### Analysis Time
- **Without AI:** ~1-2 minutes
- **With AI (10 posts):** ~5-10 minutes
- **With AI (50 posts):** ~20-30 minutes
AI recommendations add most of the time (not the fetching).
### Memory Usage
- **262 posts:** ~20-30 MB
- **262 posts + 100 drafts:** ~35-50 MB
No memory issues for typical WordPress sites.
---
## Troubleshooting
### "No drafts found"
**Problem:** You're using `--include-drafts` but get same result as without it.
**Solutions:**
1. Verify you have draft posts on the site
2. Check user has permission to view drafts (needs edit_posts capability)
3. Try logging in and checking WordPress directly
### CSV Encoding Issues
**Problem:** CSV opens with weird characters in Excel.
**Solution:** Open with UTF-8 encoding:
- Excel: Data → From Text/CSV → select the file → set File Origin to "65001: Unicode (UTF-8)" → Load
- Sheets: Upload CSV, let Google handle encoding
### Want to Use a Database Later?
If you outgrow CSV files, consider:
**SQLite** (built-in, no installation):
```python
import sqlite3
conn = sqlite3.connect('seo_analysis.db')
# Insert results into database
```
**PostgreSQL** (professional option):
```python
import psycopg2
conn = psycopg2.connect("dbname=seo_db user=postgres")
# Insert results
```
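For the SQLite option, a fuller sketch of loading one analysis CSV into a table might look like this. The table layout, the `run_label` column and the `import_results` name are all hypothetical; the column names read from the CSV are the ones described in this guide.

```python
import csv
import sqlite3


def import_results(csv_path: str, db_path: str = "seo_analysis.db", run_label: str = "") -> int:
    """Append one analysis CSV to a SQLite table and return the number of rows added."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        rows = [
            (run_label, row["site"], int(row["post_id"]), row.get("status", ""),
             row["title"], float(row["overall_score"]))
            for row in csv.DictReader(handle)
        ]
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS results ("
                "run_label TEXT, site TEXT, post_id INTEGER, "
                "status TEXT, title TEXT, overall_score REAL)"
            )
            conn.executemany("INSERT INTO results VALUES (?, ?, ?, ?, ?, ?)", rows)
    finally:
        conn.close()
    return len(rows)
```

Tagging each import with a `run_label` such as a date keeps successive runs in one table, which makes score trends a single SQL query away.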
But for now, CSV is perfect for your needs.
---
## Summary
### Storage
| Aspect | Implementation |
|--------|-----------------|
| Database? | No - CSV files |
| Location | `output/reports/` |
| Format | CSV (Excel/Sheets compatible) |
| Persistence | Permanent (until deleted) |
### Draft Posts
| Aspect | Usage |
|--------|-------|
| Default | Published only |
| Include drafts | `--include-drafts` flag |
| Output column | `status` (publish/draft) |
| Use case | Optimize before publishing, recover removed content |
### Commands
```bash
# Published only
python scripts/multi_site_seo_analyzer.py
# Published + Drafts
python scripts/multi_site_seo_analyzer.py --include-drafts
# Published + Drafts + AI
python scripts/multi_site_seo_analyzer.py --include-drafts --top-n 20
# Skip AI (faster)
python scripts/multi_site_seo_analyzer.py --no-ai
```
---
## Next Steps
1. **First run (published only):**
```bash
python scripts/multi_site_seo_analyzer.py --no-ai
```
2. **Analyze results:**
```bash
open output/reports/seo_analysis_*.csv
```
3. **Optimize published posts** with score < 50
4. **Second run (include drafts):**
```bash
python scripts/multi_site_seo_analyzer.py --include-drafts
```
5. **Decide on drafts:** Publish, improve, or delete
6. **Track progress:** Re-run monthly and compare scores
Ready? Start with: `python scripts/multi_site_seo_analyzer.py --include-drafts`