SEO Analysis & Improvement System - Project Guide
📋 Overview
A complete 4-phase SEO analysis pipeline that:
- Integrates Google Analytics, Search Console, and WordPress data
- Identifies high-potential keywords for optimization (positions 11-30)
- Discovers new content opportunities using AI
- Generates a comprehensive report with 90-day action plan
📂 Project Structure
seo/
├── input/ # SOURCE DATA (your exports)
│ ├── new-propositions.csv # WordPress posts
│ ├── README.md # How to export data
│ └── analytics/
│ ├── ga4_export.csv # Google Analytics
│ └── gsc/
│ ├── Pages.csv # GSC pages (required)
│ ├── Requêtes.csv # GSC queries (optional)
│ └── ...
│
├── output/ # RESULTS (auto-generated)
│ ├── results/
│ │ ├── seo_optimization_report.md # 📍 PRIMARY OUTPUT
│ │ ├── posts_with_analytics.csv
│ │ ├── posts_prioritized.csv
│ │ ├── keyword_opportunities.csv
│ │ └── content_gaps.csv
│ │
│ ├── logs/
│ │ ├── import_log.txt
│ │ ├── opportunity_analysis_log.txt
│ │ └── content_gap_analysis_log.txt
│ │
│ └── README.md # Output guide
│
├── 🚀 run_analysis.sh # Run entire pipeline
├── analytics_importer.py # Phase 1: Merge data
├── opportunity_analyzer.py # Phase 2: Find wins
├── content_gap_analyzer.py # Phase 3: Find gaps
├── report_generator.py # Phase 4: Generate report
├── config.py
├── requirements.txt
├── .env.example
└── .gitignore
🚀 Getting Started
Step 1: Prepare Input Data
Place WordPress posts CSV:
input/new-propositions.csv
Export Google Analytics 4:
- Go to: Analytics > Reports > Engagement > Pages and Screens
- Set date range: Last 90 days
- Download CSV → Save as:
input/analytics/ga4_export.csv
Export Google Search Console (Pages):
- Go to: Performance
- Set date range: Last 90 days
- Export CSV → Save as:
input/analytics/gsc/Pages.csv
Step 2: Run Analysis
# Run entire pipeline
./run_analysis.sh
# OR run steps individually
./venv/bin/python analytics_importer.py
./venv/bin/python opportunity_analyzer.py
./venv/bin/python content_gap_analyzer.py
./venv/bin/python report_generator.py
Step 3: Review Report
Open: output/results/seo_optimization_report.md
Contains:
- Executive summary with current metrics
- Top 20 posts ranked by opportunity (with AI recommendations)
- Keyword opportunities breakdown
- Content gap analysis
- 90-day phased action plan
📊 What Each Script Does
analytics_importer.py (Phase 1)
Purpose: Merge analytics data with WordPress posts
Input:
- input/new-propositions.csv (WordPress posts)
- input/analytics/ga4_export.csv (Google Analytics)
- input/analytics/gsc/Pages.csv (Search Console)
Output:
- output/results/posts_with_analytics.csv (enriched dataset)
- output/logs/import_log.txt (matching report)
Handles: French and English column names, URL normalization, multi-source merging
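The matching step can be pictured with a minimal sketch. The alias table and both helper names are hypothetical (the real script's accepted column names and normalization rules are not listed in this guide), and matching on the lowercased path alone is an assumption that only suits a single-site dataset:

```python
from urllib.parse import urlsplit

# Hypothetical alias table: illustrative French/English headers only,
# not the actual list used by analytics_importer.py.
COLUMN_ALIASES = {
    "page": "url",
    "pages les plus populaires": "url",
    "clics": "clicks",
    "clicks": "clicks",
}


def canonical_column(name: str) -> str:
    """Map a French or English header to one canonical column name."""
    key = name.strip().lower()
    return COLUMN_ALIASES.get(key, key)


def normalize_url(url: str) -> str:
    """Reduce a full URL or a bare path to a lowercase path without
    query string or trailing slash, so rows from different exports match."""
    path = urlsplit(url.strip()).path.lower().rstrip("/")
    return path or "/"
```

With this sketch, a Search Console URL such as https://www.Example.com/Blog/Post/?utm=1 and a GA4 path such as /blog/post/ both reduce to /blog/post and can be joined on that key.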
opportunity_analyzer.py (Phase 2)
Purpose: Identify high-potential optimization opportunities
Input:
output/results/posts_with_analytics.csv
Output:
- output/results/keyword_opportunities.csv (26 opportunities)
- output/logs/opportunity_analysis_log.txt
Features:
- Filters posts at positions 11-30 (page 2-3)
- Calculates opportunity scores (0-100)
- Generates AI recommendations for top 20 posts
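The position filter described above could look roughly like the following. This is a sketch, not the script's actual code: the function name and the dictionary keys (position, impressions) are assumptions, and the default thresholds are taken from the Configuration section of this guide.

```python
def find_opportunities(posts, min_position=11, max_position=30, min_impressions=50):
    """Keep posts ranking on pages 2-3 that have enough impressions,
    sorted by impressions (highest first)."""
    selected = [
        post
        for post in posts
        if min_position <= post["position"] <= max_position
        and post["impressions"] >= min_impressions
    ]
    return sorted(selected, key=lambda post: post["impressions"], reverse=True)


# Hypothetical sample rows for illustration only.
sample = [
    {"title": "A", "position": 4.2, "impressions": 900},   # already on page 1
    {"title": "B", "position": 12.5, "impressions": 300},  # kept
    {"title": "C", "position": 18.0, "impressions": 20},   # too few impressions
    {"title": "D", "position": 27.1, "impressions": 450},  # kept
]
opportunities = find_opportunities(sample)
```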
content_gap_analyzer.py (Phase 3)
Purpose: Discover new content opportunities
Input:
- output/results/posts_with_analytics.csv
- input/analytics/gsc/Requêtes.csv (optional)
Output:
- output/results/content_gaps.csv
- output/logs/content_gap_analysis_log.txt
Features:
- Topic cluster extraction
- Gap identification
- AI-powered content suggestions
report_generator.py (Phase 4)
Purpose: Create comprehensive report with action plan
Input:
- All analysis results from phases 1-3
Output:
- output/results/seo_optimization_report.md ← PRIMARY DELIVERABLE
- output/results/posts_prioritized.csv
Features:
- Comprehensive markdown report
- All 262 posts ranked
- 90-day action plan with estimated gains
📈 Understanding Your Report
Key Metrics (Executive Summary)
- Total Posts: All posts analyzed
- Monthly Traffic: Current organic traffic
- Total Impressions: Search visibility (90 days)
- Average Position: Current ranking position
- Opportunities: Posts ready to optimize
Top 20 Posts to Optimize
Each post shows:
- Title (the post name)
- Current Position (search ranking)
- Impressions (search visibility)
- Traffic (organic visits)
- Priority Score (0-100 opportunity rating)
- Status (page 1 vs page 2-3)
- Recommendations (how to improve)
Priority Scoring (0-100)
Higher scores = more opportunity for gain with less effort
Calculated from:
- Position (35%) - How close to page 1
- Traffic Potential (30%) - Search impressions
- CTR Gap (20%) - Improvement opportunity
- Content Quality (15%) - Existing engagement
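As an illustration of how the four weights combine, here is a minimal sketch. It assumes each component has already been scaled to 0-100 before weighting; how the real script derives those component values is not described in this guide, and the names below are hypothetical.

```python
# Weights as listed in this guide; they sum to 1.0.
WEIGHTS = {
    "position": 0.35,
    "traffic_potential": 0.30,
    "ctr_gap": 0.20,
    "content_quality": 0.15,
}


def priority_score(components):
    """Weighted sum of component scores, each clamped to the 0-100 range."""
    total = sum(
        weight * min(max(components[name], 0.0), 100.0)
        for name, weight in WEIGHTS.items()
    )
    return round(total, 1)


# Hypothetical component scores for one post:
# 0.35*80 + 0.30*60 + 0.20*50 + 0.15*40 = 28 + 18 + 10 + 6
example = priority_score(
    {"position": 80, "traffic_potential": 60, "ctr_gap": 50, "content_quality": 40}
)
```

For the hypothetical post above, the weighted sum works out to 62 out of 100.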
🎯 Action Plan
Week 1-2: Quick Wins (+100 visits/month)
- Focus on posts at positions 11-15
- Update SEO titles and meta descriptions
- 30-60 minutes per post
Week 3-4: Core Optimization (+150 visits/month)
- Posts 6-15 in priority list
- Add content sections
- Improve structure with headers
- 2-3 hours per post
Week 5-8: New Content (+300 visits/month)
- Create 3-5 new posts from gap analysis
- Target high-search-demand topics
- 4-6 hours per post
Week 9-12: Refinement (+100 visits/month)
- Monitor ranking improvements
- Refine underperforming optimizations
- Prepare next round of analysis
Total: +650 visits/month potential gain
🔧 Configuration
Edit .env to customize analysis:
# Position range for opportunities
ANALYSIS_MIN_POSITION=11
ANALYSIS_MAX_POSITION=30
# Minimum impressions to consider
ANALYSIS_MIN_IMPRESSIONS=50
# Posts for AI recommendations
ANALYSIS_TOP_N_POSTS=20
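A minimal sketch of how these settings might be read, with the values above as defaults. It assumes the variables are already present in the process environment; whether the project's config.py loads .env itself (for example through a dotenv helper) is not shown in this guide, and the function name is hypothetical.

```python
import os

# Defaults mirror the values shown in this guide.
DEFAULTS = {
    "ANALYSIS_MIN_POSITION": 11,
    "ANALYSIS_MAX_POSITION": 30,
    "ANALYSIS_MIN_IMPRESSIONS": 50,
    "ANALYSIS_TOP_N_POSTS": 20,
}


def load_analysis_settings(env=None):
    """Return the analysis settings as integers, falling back to DEFAULTS
    for any variable that is not set."""
    source = os.environ if env is None else env
    return {name: int(source.get(name, default)) for name, default in DEFAULTS.items()}
```

Passing a plain dictionary as env makes the function easy to exercise without touching the real environment.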
🐛 Troubleshooting
Missing Input Files
❌ Error: File not found: input/...
→ Check that all files are in the correct locations
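A quick way to see which files are missing before running the pipeline is a small check like the one below. It is a sketch: the helper name is hypothetical, and the list contains only the three files this guide describes as pipeline inputs (Requêtes.csv is optional and therefore left out).

```python
from pathlib import Path

# Input files named in this guide, relative to the project root.
REQUIRED_INPUTS = [
    "input/new-propositions.csv",
    "input/analytics/ga4_export.csv",
    "input/analytics/gsc/Pages.csv",
]


def missing_inputs(root="."):
    """Return the required input files that do not exist under root."""
    base = Path(root)
    return [rel for rel in REQUIRED_INPUTS if not (base / rel).is_file()]


if __name__ == "__main__":
    for rel in missing_inputs():
        print(f"Missing: {rel}")
```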
Empty Report Titles
✓ FIXED - Now correctly loads post titles from multiple column names
No Opportunities Found
⚠️ No opportunities found in specified range
→ Try lowering ANALYSIS_MIN_IMPRESSIONS in .env
API Errors
❌ AI generation failed: ...
→ Check OPENROUTER_API_KEY in .env and account balance
📚 Additional Resources
- input/README.md - How to export analytics data
- output/README.md - Output files guide
- QUICKSTART_ANALYSIS.md - Step-by-step tutorial
- ANALYSIS_SYSTEM.md - Technical documentation
✅ Success Checklist
- All input files placed in input/ directory
- .env file configured with API key
- Ran ./run_analysis.sh successfully
- Reviewed output/results/seo_optimization_report.md
- Identified 5-10 quick wins to start with
- Created action plan for first week
🎓 Key Learnings
Why Positions 11-30 Matter
- Page 1 posts are hard to move
- Posts on pages 2-3 are easier wins (small improvements can move them up)
- Quick gains: moving up 1-2 positions can increase CTR by roughly 20-30%
CTR Expectations by Position
- Position 1: ~30% CTR
- Position 5-10: 4-7% CTR
- Position 11-15: 1-2% CTR (quick wins)
- Position 16-20: 0.8-1% CTR
- Position 21-30: ~0.5% CTR
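The bands above can be turned into a rough click estimate. The sketch below uses the midpoint of each range as a single CTR value, which is an assumption made here for simplicity. Positions 2-4 are not covered by the list, so the helper returns None for them instead of guessing. Both function names are hypothetical.

```python
# (first position, last position, assumed CTR) using midpoints of the listed ranges.
CTR_BANDS = [
    (1, 1, 0.30),
    (5, 10, 0.055),
    (11, 15, 0.015),
    (16, 20, 0.009),
    (21, 30, 0.005),
]


def expected_ctr(position):
    """Return the assumed CTR for a ranking position, or None if the
    position falls outside the bands listed in this guide."""
    rank = round(position)  # average positions in exports are often fractional
    for first, last, ctr in CTR_BANDS:
        if first <= rank <= last:
            return ctr
    return None


def estimated_clicks(impressions, position):
    """Estimate clicks for a period as impressions * assumed CTR."""
    ctr = expected_ctr(position)
    return None if ctr is None else round(impressions * ctr, 1)
```

Under these assumptions, a post with 1,000 impressions at position 12 would be expected to receive about 15 clicks over the same period.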
Content Quality Signals
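- A higher bounce rate often signals content that is less relevant to the search intent
- Low traffic usually points to a poor CTR or a low ranking position
- Low impressions suggest the post is not yet well optimized for its target queries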
📞 Support
Check Logs First
output/logs/import_log.txt
output/logs/opportunity_analysis_log.txt
output/logs/content_gap_analysis_log.txt
Common Issues
- Empty titles → Fixed with flexible column name mapping
- File not found → Check file locations match structure
- API errors → Verify API key and account balance
- No opportunities → Lower minimum impressions threshold
🚀 Ready to Optimize?
- Prepare your input data
- Run ./run_analysis.sh
- Open the report
- Start with quick wins
- Track improvements in 4 weeks
Good luck boosting your SEO! 📈
Last Updated: February 2026 System Status: Production Ready ✅