Major refactoring to create a clean, integrated CLI application:

### New Features:
- Unified CLI executable (./seo) with simple command structure
- All commands accept optional CSV file arguments
- Auto-detection of latest files when no arguments provided
- Simplified output directory structure (output/ instead of output/reports/)
- Cleaner export filename format (all_posts_YYYY-MM-DD.csv)

### Commands:
- export: Export all posts from WordPress sites
- analyze [csv]: Analyze posts with AI (optional CSV input)
- recategorize [csv]: Recategorize posts with AI
- seo_check: Check SEO quality
- categories: Manage categories across sites
- approve [files]: Review and approve recommendations
- full_pipeline: Run complete workflow
- analytics, gaps, opportunities, report, status

### Changes:
- Moved all scripts to scripts/ directory
- Created config.yaml for configuration
- Updated all scripts to use output/ directory
- Deprecated old seo-cli.py in favor of new ./seo
- Added AGENTS.md and CHANGELOG.md documentation
- Consolidated README.md with updated usage

### Technical:
- Added PyYAML dependency
- Removed hardcoded configuration values
- All scripts now properly integrated
- Better error handling and user feedback

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
418 lines
9.6 KiB
Markdown
# Real-Time CSV Monitoring - Progressive Writing Guide

## What is Progressive CSV?

The analyzer now writes each result to the CSV file **as soon as the post is analyzed**, instead of waiting until all posts have been analyzed.

```
Traditional Mode:
  Analyze 262 posts → Wait (2-3 min) → Write CSV

Progressive Mode (NEW):
  Analyze post 1 → Write row 1
  Analyze post 2 → Write row 2
  Analyze post 3 → Write row 3
  ... (watch it grow in real-time)
```

---

## How It Works

### Enabled by Default

```bash
python scripts/multi_site_seo_analyzer.py
```

Progressive CSV is **enabled** by default. The analyzer starts writing to the CSV file as soon as analysis begins.

### Disable (Write Only at End)

```bash
python scripts/multi_site_seo_analyzer.py --no-progressive
```

Use this if you prefer to wait for the final results (slightly faster, but no real-time visibility).

---

## Real-Time Monitoring

### Monitor Progress in a Terminal, Excel, or Google Sheets

**Option 1: Watch CSV grow in real-time**

```bash
# Terminal 1: Start analyzer
python scripts/multi_site_seo_analyzer.py

# Terminal 2: Watch file grow
tail -f output/reports/seo_analysis_*.csv
```

Output:
```
site,post_id,status,title,overall_score
mistergeek.net,1,publish,"VPN Guide",45
mistergeek.net,2,publish,"Best Software",72
mistergeek.net,3,publish,"Gaming Setup",38
mistergeek.net,4,draft,"Draft Post",28
[... more rows appear as analysis continues]
```

**Option 2: Open CSV in Excel while running**

1. Start analyzer: `python scripts/multi_site_seo_analyzer.py`
2. Open file: `output/reports/seo_analysis_*.csv` in Excel (read-only if prompted)
3. Excel does not reload a CSV opened this way on its own. Close and reopen the file, or load it through Data → From Text/CSV and use Refresh All
4. New rows appear after each refresh

**Option 3: Open in Google Sheets**

1. Start analyzer
2. Upload CSV to Google Sheets
3. The upload is a snapshot of the file at that moment. Re-import it (File → Import → "Replace spreadsheet") to pick up new rows
4. New rows appear after each re-import

---

## Examples

### Example 1: Basic Progressive Analysis

```bash
python scripts/multi_site_seo_analyzer.py
```

**Output:**
- CSV created immediately
- Rows added as posts are analyzed
- Monitor with `tail -f output/reports/seo_analysis_*.csv`
- Takes ~2-3 minutes for 262 posts
- Final step: Add AI recommendations and rewrite the CSV

### Example 2: Progressive + Drafts

```bash
python scripts/multi_site_seo_analyzer.py --include-drafts
```

**Output:**
- Analyzes published + draft posts
- Shows status column: "publish" or "draft"
- Rows appear in real-time
- Drafts analyzed after published posts

### Example 3: Progressive + AI Recommendations

```bash
python scripts/multi_site_seo_analyzer.py --top-n 20
```

**Output:**
- Initial CSV: ~2 minutes with all posts (no AI yet)
- Then: AI analysis for top 20 (~5-10 minutes)
- Final CSV: Includes AI recommendations for top 20
- You can see progress in two phases

### Example 4: Disable Progressive (Batch Mode)

```bash
python scripts/multi_site_seo_analyzer.py --no-progressive
```

**Output:**
- Analyzes all posts in memory
- Only writes CSV when complete (~2-3 minutes for 262 posts)
- Single output file at the end
- Slightly faster execution

---

## Monitoring Setup

### Terminal Monitoring

**Watch CSV as it grows:**

```bash
# In one terminal
python scripts/multi_site_seo_analyzer.py

# In another terminal (macOS/Linux): show the last 20 rows, then keep following
tail -n 20 -f output/reports/seo_analysis_*.csv

# Or with watch command (every 2 seconds)
watch -n 2 'wc -l output/reports/seo_analysis_*.csv'

# On Windows (PowerShell): show the last 5 rows, then keep following
Get-Content output/reports/seo_analysis_*.csv -Tail 5 -Wait
```

### Spreadsheet Monitoring

**Google Sheets (recommended):**

```
1. Google Drive → New → Google Sheets
2. File → Open → Upload CSV
3. Let Google Sheets auto-import
4. File → Import → "Replace spreadsheet" (if updating)
5. Repeat step 4 to pick up rows written since the last import
```

**Excel (macOS/Windows):**

```
1. Open Excel
2. Data → From Text/CSV → Navigate to output/reports/
3. Select seo_analysis_*.csv and load it
4. Data → Refresh All to pull in new rows
5. Rows appear after each refresh
```

---

## File Progress Examples

### Snapshot 1 (30 seconds in)

```
site,post_id,status,title,overall_score
mistergeek.net,1,publish,"Complete VPN Guide",92
mistergeek.net,2,publish,"Best VPN Services",88
mistergeek.net,3,publish,"VPN for Gaming",76
mistergeek.net,4,publish,"Streaming with VPN",72
```

### Snapshot 2 (1 minute in)

```
[Same as above, plus:]
mistergeek.net,5,publish,"Best Software Tools",85
mistergeek.net,6,publish,"Software Comparison",78
mistergeek.net,7,draft,"Incomplete Software",35
mistergeek.net,8,publish,"Gaming Setup Guide",68
webscroll.fr,1,publish,"YggTorrent Guide",45
...
```

### Snapshot 3 (Final, with AI)

```
[All 262+ posts, plus AI recommendations in last column:]
mistergeek.net,1,publish,"Complete VPN...",92,"Consider adding..."
mistergeek.net,2,publish,"Best VPN...",88,"Strong, no changes"
mistergeek.net,3,publish,"VPN for Gaming",76,"Expand meta..."
```

---

## Performance Impact

### With Progressive CSV (default)

- Disk writes: Continuous (one per post)
- CPU: Slightly higher (writing to disk)
- Disk I/O: Continuous
- Visibility: Real-time
- Time: ~2-3 minutes (262 posts) + AI

### Without Progressive CSV (--no-progressive)

- Disk writes: One large write at end
- CPU: Slightly lower (batch write)
- Disk I/O: Single large operation
- Visibility: No progress updates
- Time: ~2-3 minutes (262 posts) + AI

**The difference is negligible** (under 5%).

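To check the overhead on your own machine, one option is to time the two write patterns in isolation. The sketch below assumes rows are written with `csv.writer` and flushed once per row in progressive mode. That is an assumption, not the analyzer's confirmed code. It measures only the write step, and results will vary by disk and operating system.

```python
import csv
import os
import tempfile
import time


def time_write(rows, flush_each_row: bool) -> float:
    """Return the seconds spent writing `rows` to a temporary CSV."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        start = time.perf_counter()
        with open(path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            for row in rows:
                writer.writerow(row)
                if flush_each_row:
                    handle.flush()  # progressive: one flush per row
        return time.perf_counter() - start
    finally:
        os.remove(path)


if __name__ == "__main__":
    # 262 synthetic rows, matching the post count used in this guide.
    sample = [["mistergeek.net", i, "publish", f"Post {i}", 50] for i in range(262)]
    print(f"progressive: {time_write(sample, True):.4f}s")
    print(f"batch:       {time_write(sample, False):.4f}s")
```
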
---

## Troubleshooting

### CSV Shows 0 Bytes

**Problem:** CSV file exists but shows 0 bytes.

**Solution:**
- Give the script a few seconds to start writing
- Check if analyzer is still running: `ps aux | grep multi_site`
- Verify directory exists: `ls -la output/reports/`

### Can't Open CSV While Writing

**Problem:** Excel says "file is in use" or "file is locked".

**Solutions:**
- Open as read-only (don't modify)
- Upload a copy to Google Sheets instead and re-import it to refresh
- Use `--no-progressive` flag and wait for completion
- Wait for final CSV to be written (analyzer complete)

### File Grows Then Stops

**Problem:** CSV stops growing partway through.

**Likely cause:** Analyzer hit an error or is running AI recommendations.

**Solutions:**
- Check terminal for error messages
- If using `--top-n 20`, AI phase might be in progress (~5-10 min)
- Check file size: `ls -lh output/reports/seo_analysis_*.csv`

### Want to See Only New Rows?

Use `tail` to show only the most recent rows:

```bash
# Show last 10 rows
tail -n 10 output/reports/seo_analysis_*.csv

# Watch new rows as they're added (macOS/Linux)
tail -f output/reports/seo_analysis_*.csv

# Or use watch
watch -n 1 'tail -20 output/reports/seo_analysis_*.csv'
```

---

## Workflow Examples

### Quick Monitoring (Simple)

```bash
# Terminal 1
python scripts/multi_site_seo_analyzer.py --include-drafts

# Terminal 2 (watch progress)
watch -n 2 'wc -l output/reports/seo_analysis_*.csv'

# Output every 2 seconds:
# 30 output/reports/seo_analysis_20250216_120000.csv
# 60 output/reports/seo_analysis_20250216_120000.csv
# 92 output/reports/seo_analysis_20250216_120000.csv
# [... grows to 262+]
```

### Live Dashboard (Advanced)

```bash
# Terminal 1: Run analyzer
python scripts/multi_site_seo_analyzer.py --include-drafts --top-n 20

# Terminal 2: Monitor with live stats
# Note: -k14 assumes overall_score is the 14th column and that no title
# contains a comma; adjust the column number to your CSV layout.
watch -n 1 'echo "=== CSV Status ===" && \
  wc -l output/reports/seo_analysis_*.csv && \
  echo "" && \
  echo "=== Last 5 Rows ===" && \
  tail -5 output/reports/seo_analysis_*.csv && \
  echo "" && \
  echo "=== Worst Scores ===" && \
  tail -20 output/reports/seo_analysis_*.csv | sort -t, -k14 -n | head -5'
```

### Team Collaboration

```bash
# 1. Start analyzer with progressive CSV
python scripts/multi_site_seo_analyzer.py

# 2. Upload to Google Sheets
#    File → Import → Upload CSV → Replace Spreadsheet

# 3. Share with team
#    File → Share → Add team members

# 4. Team follows progress in Google Sheets
#    Repeat step 2 to pull in rows written since the last import
```

---

## Data Quality Notes

### During Progressive Write

- Each row is **complete** when written (all analysis fields present)
- The AI recommendations field stays empty until the AI phase completes
- Safe to view/read while running

### After Completion

- All rows updated with final data
- AI recommendations added for top N posts
- CSV fully populated and ready for import/action

### File Integrity

- Progressive CSV is **safe to view while running**
- Each row is flushed immediately after it is written
- No risk of corruption during analysis

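The final rewrite that adds AI recommendations is the one moment a reader could catch a half-written file. A common way to avoid that is to write to a temporary file and swap it in with a single rename. This sketch shows that technique under assumptions: the `ai_recommendations` column name and the function are hypothetical, and the analyzer may handle this differently. On Windows, `os.replace` can fail while another program such as Excel holds the target open.

```python
import csv
import os
import tempfile
from typing import Dict, Tuple


def rewrite_with_recommendations(csv_path: str,
                                 recommendations: Dict[Tuple[str, str], str],
                                 column: str = "ai_recommendations") -> None:
    """Add a recommendations column, keyed by (site, post_id), via an atomic swap."""
    with open(csv_path, newline="", encoding="utf-8") as source:
        reader = csv.DictReader(source)
        fields = list(reader.fieldnames or [])
        rows = list(reader)
    if column not in fields:
        fields.append(column)
    directory = os.path.dirname(os.path.abspath(csv_path))
    fd, temp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", newline="", encoding="utf-8") as target:
            writer = csv.DictWriter(target, fieldnames=fields)
            writer.writeheader()
            for row in rows:
                key = (row["site"], row["post_id"])
                row[column] = recommendations.get(key, row.get(column) or "")
                writer.writerow(row)
        # Readers see either the old file or the new one, never a mix.
        os.replace(temp_path, csv_path)
    except BaseException:
        os.remove(temp_path)
        raise
```
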
---

## Command Reference

```bash
# Default (progressive CSV enabled)
python scripts/multi_site_seo_analyzer.py

# Disable progressive (batch write)
python scripts/multi_site_seo_analyzer.py --no-progressive

# Progressive + drafts
python scripts/multi_site_seo_analyzer.py --include-drafts

# Progressive + AI + drafts
python scripts/multi_site_seo_analyzer.py --include-drafts --top-n 20

# Disable progressive + no AI
python scripts/multi_site_seo_analyzer.py --no-progressive --no-ai

# All options combined
python scripts/multi_site_seo_analyzer.py \
  --include-drafts \
  --top-n 20 \
  --output my_report.csv
# (progressive enabled by default)
```

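For readers extending the script, the flags above map naturally onto `argparse`. The sketch below is an assumption about how such a parser could look, not the analyzer's actual code. The flag names come from this guide, while the defaults and help texts are hypothetical.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Parser for the flags used in this guide; defaults are assumptions."""
    parser = argparse.ArgumentParser(prog="multi_site_seo_analyzer.py")
    # store_false makes `progressive` default to True, so the flag opts out.
    parser.add_argument("--no-progressive", dest="progressive", action="store_false",
                        help="write the CSV once at the end instead of row by row")
    parser.add_argument("--include-drafts", action="store_true",
                        help="analyze draft posts as well as published ones")
    parser.add_argument("--top-n", type=int, default=None,
                        help="number of posts to send to the AI phase")
    parser.add_argument("--no-ai", dest="ai", action="store_false",
                        help="skip AI recommendations")
    parser.add_argument("--output", default=None, help="path of the CSV to write")
    return parser


def parse(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse `argv`, or sys.argv[1:] when `argv` is None."""
    return build_parser().parse_args(argv)
```
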
---

## Summary

| Feature | Default | With `--no-progressive` |
|---------|---------|-------------------------|
| Progressive CSV | Enabled | Disabled |
| Write Mode | Real-time rows | Batch at end |
| Monitoring | Live via `tail -f`; on refresh in Excel/Sheets | Not available |
| Performance | ~2-3 min + AI | Slightly faster (negligible) |

---

## Next Steps

1. **Run with progressive CSV:**
   ```bash
   python scripts/multi_site_seo_analyzer.py --include-drafts
   ```

2. **Monitor in real-time:**
   ```bash
   # Terminal 2
   tail -f output/reports/seo_analysis_*.csv
   ```

3. **Or open in Google Sheets** and re-import to see new rows

4. **When complete**, review CSV and start optimizing

Ready to see it in action? Run:
```bash
python scripts/multi_site_seo_analyzer.py --include-drafts
```