diff --git a/ENHANCED_ANALYSIS_GUIDE.md b/ENHANCED_ANALYSIS_GUIDE.md new file mode 100644 index 0000000..f803529 --- /dev/null +++ b/ENHANCED_ANALYSIS_GUIDE.md @@ -0,0 +1,301 @@ +# Enhanced Analysis Guide + +## Overview + +The SEO automation tool now provides enhanced analysis capabilities with: + +1. **Selective Field Analysis** - Choose which fields to analyze (title, meta description, categories, site) +2. **In-place CSV Updates** - Update input CSV with new columns (with automatic backup) +3. **Category Proposer** - Dedicated command for AI-powered category suggestions + +## Commands + +### 1. Enhanced Analysis (`seo analyze`) + +Analyze posts with AI and add recommendation columns to your CSV. + +#### Basic Usage +```bash +# Analyze all fields (default behavior) +./seo analyze + +# Analyze specific CSV file +./seo analyze output/all_posts_2026-02-16.csv +``` + +#### Selective Field Analysis +```bash +# Analyze only titles +./seo analyze -f title + +# Analyze titles and categories +./seo analyze -f title categories + +# Analyze meta descriptions only +./seo analyze -f meta_description + +# Analyze all fields +./seo analyze -f title meta_description categories site +``` + +#### Update Input CSV (In-place) +```bash +# Update input CSV with new columns (creates backup automatically) +./seo analyze -u + +# Update with specific fields only +./seo analyze -u -f title meta_description + +# Specify custom output file +./seo analyze -o output/my_analysis.csv +``` + +#### Output Columns + +Depending on selected fields, the following columns are added: + +**Title Analysis:** +- `proposed_title` - AI-suggested improved title +- `title_reason` - Explanation for title change + +**Meta Description Analysis:** +- `proposed_meta_description` - AI-suggested meta description (120-160 chars) +- `meta_reason` - Explanation for meta description change + +**Category Analysis:** +- `proposed_category` - AI-suggested best category +- `category_reason` - Explanation for category choice 
+ +**Site Analysis:** +- `proposed_site` - AI-suggested best site +- `site_reason` - Explanation for site recommendation + +**Common Fields:** +- `ai_confidence` - AI confidence level (High/Medium/Low) +- `ai_priority` - Priority level (High/Medium/Low) + +### 2. Category Proposer (`seo category_propose`) + +Dedicated command for AI-powered category suggestions based on post content. + +#### Usage +```bash +# Propose categories for latest export +./seo category_propose + +# Propose categories for specific CSV +./seo category_propose output/all_posts_2026-02-16.csv + +# Save to custom output file +./seo category_propose -o output/my_category_proposals.csv +``` + +#### Output Columns +- `post_id` - Post identifier +- `title` - Post title +- `current_categories` - Current categories +- `proposed_category` - AI-suggested category +- `alternative_categories` - Alternative category suggestions +- `category_reason` - Explanation for suggestion +- `category_confidence` - Confidence level + +## Examples + +### Example 1: Analyze Titles Only +```bash +# Analyze only titles for SEO optimization +./seo analyze -f title + +# Output: analyzed_posts_YYYYMMDD_HHMMSS.csv +# Contains: original columns + proposed_title + title_reason + ai_confidence +``` + +### Example 2: Update CSV with Meta Descriptions +```bash +# Update input CSV with proposed meta descriptions +./seo analyze -u -f meta_description + +# Creates: +# - all_posts_2026-02-16_backup_YYYYMMDD_HHMMSS.csv (backup) +# - all_posts_2026-02-16.csv (updated with new columns) +``` + +### Example 3: Full Category Analysis +```bash +# Propose categories for all posts +./seo category_propose + +# Review proposals +open output/category_proposals_*.csv + +# Apply approved categories manually in WordPress +``` + +### Example 4: Multi-Field Analysis +```bash +# Analyze titles and categories together +./seo analyze -f title categories + +# Output includes: +# - proposed_title, title_reason +# - proposed_category, category_reason +# - 
ai_confidence, ai_priority +``` + +### Example 5: Targeted Analysis with Output +```bash +# Analyze meta descriptions, save to specific file +./seo analyze -f meta_description -o output/meta_analysis.csv +``` + +## Workflow Examples + +### Workflow 1: SEO Title Optimization +```bash +# 1. Export posts +./seo export + +# 2. Analyze titles only +./seo analyze -f title + +# 3. Review proposed titles +open output/analyzed_posts_*.csv + +# 4. Manually update best titles in WordPress +``` + +### Workflow 2: Category Reorganization +```bash +# 1. Export posts +./seo export + +# 2. Get category proposals +./seo category_propose + +# 3. Review proposals +open output/category_proposals_*.csv + +# 4. Apply approved category changes +``` + +### Workflow 3: Complete SEO Audit +```bash +# 1. Export posts +./seo export + +# 2. Analyze all fields +./seo analyze -f title meta_description categories site + +# 3. Review comprehensive analysis +open output/analyzed_posts_*.csv + +# 4. Implement changes based on AI recommendations +``` + +### Workflow 4: Incremental Analysis +```bash +# 1. Export posts +./seo export + +# 2. Analyze titles (fast, low cost) +./seo analyze -f title + +# 3. Later, analyze meta descriptions +./seo analyze -u -f meta_description + +# 4. 
Later, analyze categories +./seo analyze -u -f categories + +# Result: CSV progressively enriched with AI recommendations +``` + +## Cost Optimization + +### Reduce API Costs + +```bash +# Analyze only needed fields (saves tokens) +./seo analyze -f title # Cheaper than analyzing all fields + +# Use smaller batch sizes for better control +# (pass --batch-size when running the analyzer scripts directly) + +# Analyze in stages +./seo analyze -f title +./seo analyze -u -f meta_description +# Total cost is similar, but with better control over each step +``` + +### Token Usage by Field + +Approximate token usage per 100 posts: +- **title**: ~500 tokens (low cost) +- **meta_description**: ~800 tokens +- **categories**: ~600 tokens +- **site**: ~400 tokens (lowest cost) +- **All fields**: ~2000 tokens (best value) + +## Best Practices + +1. **Start Small**: Test with `-f title` first to gauge AI quality +2. **Review Before Applying**: Always review AI suggestions before implementing them +3. **Use Backups**: The `-u` flag creates automatic backups +4. **Batch Analysis**: Analyze related fields together for better context +5. **Confidence Matters**: Pay attention to the `ai_confidence` column +6. 
**Iterative Process**: Enrich CSVs incrementally for better control + +## Troubleshooting + +### No CSV File Found +```bash +# Error: No CSV file found +# Solution: Run export first or provide file path +./seo export +./seo analyze + +# Or specify file directly +./seo analyze path/to/your/file.csv +``` + +### API Key Not Set +```bash +# Error: OPENROUTER_API_KEY not set +# Solution: Add to .env file +echo "OPENROUTER_API_KEY=your_key_here" >> .env +``` + +### High API Costs +```bash +# Reduce costs by analyzing fewer fields +./seo analyze -f title # Instead of all fields + +# Or analyze in batches +./seo analyze -f title +./seo analyze -u -f meta_description +``` + +## File Formats + +### Input CSV Requirements + +Must contain at minimum: +- `post_id` - Unique identifier +- `title` - Post title (for title analysis) +- `meta_description` - Current meta (for meta analysis) +- `categories` - Current categories (for category analysis) +- `site` - Current site (for site analysis) +- `content_preview` or `content` - Post content (recommended for all analyses) + +### Output CSV Format + +Standard output includes: +- All original columns +- New `proposed_*` columns for analyzed fields +- `*_reason` columns with explanations +- `ai_confidence` and `ai_priority` columns + +--- + +**Version**: 1.0.0 +**Last Updated**: 2026-02-16 +**Related**: See ARCHITECTURE.md for system overview diff --git a/scripts/category_proposer.py b/scripts/category_proposer.py new file mode 100644 index 0000000..ac79298 --- /dev/null +++ b/scripts/category_proposer.py @@ -0,0 +1,239 @@ +#!/usr/bin/env python3 +""" +Category Proposer - AI-powered category suggestions +Analyzes posts and proposes optimal categories based on content. 
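+ +Typically invoked via the `seo category_propose` command; can also be run directly: + python category_proposer.py <input.csv> [-o <output.csv>] [--batch-size N]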
+""" + +import csv +import json +import logging +import sys +from pathlib import Path +from typing import Dict, List, Optional +import requests +from datetime import datetime +from config import Config + +logger = logging.getLogger(__name__) + + +class CategoryProposer: + """Propose categories for posts using AI.""" + + def __init__(self, csv_file: str): + """Initialize proposer with CSV file.""" + self.csv_file = Path(csv_file) + self.openrouter_api_key = Config.OPENROUTER_API_KEY + self.ai_model = Config.AI_MODEL + self.posts = [] + self.proposed_categories = [] + self.api_calls = 0 + self.ai_cost = 0.0 + + def load_csv(self) -> bool: + """Load posts from CSV.""" + logger.info(f"Loading CSV: {self.csv_file}") + + if not self.csv_file.exists(): + logger.error(f"CSV file not found: {self.csv_file}") + return False + + try: + with open(self.csv_file, 'r', encoding='utf-8') as f: + reader = csv.DictReader(f) + self.posts = list(reader) + + logger.info(f"✓ Loaded {len(self.posts)} posts") + return True + + except Exception as e: + logger.error(f"Error loading CSV: {e}") + return False + + def get_category_proposals(self, batch: List[Dict]) -> Optional[str]: + """Get AI category proposals for a batch of posts.""" + if not self.openrouter_api_key: + logger.error("OPENROUTER_API_KEY not set") + return None + + # Format posts for AI + formatted = [] + for i, post in enumerate(batch, 1): + text = f"{i}. ID: {post['post_id']}\n" + text += f" Title: {post.get('title', '')}\n" + text += f" Current Categories: {post.get('categories', '')}\n" + if 'content_preview' in post: + text += f" Content: {post['content_preview'][:300]}...\n" + formatted.append(text) + + posts_text = "\n".join(formatted) + + prompt = f"""Analyze these blog posts and propose optimal categories. 
+ +{posts_text} + +For EACH post, provide: +{{ + "post_id": <post_id>, + "current_categories": "<current categories>", + "proposed_category": "<single best category>", + "alternative_categories": ["<alternative 1>", "<alternative 2>"], + "reason": "<brief explanation>", + "confidence": "<High|Medium|Low>" +}} + +Return ONLY a JSON array with one object per post.""" + + try: + logger.info(" Getting category proposals...") + + response = requests.post( + "https://openrouter.ai/api/v1/chat/completions", + headers={ + "Authorization": f"Bearer {self.openrouter_api_key}", + "Content-Type": "application/json", + }, + json={ + "model": self.ai_model, + "messages": [{"role": "user", "content": prompt}], + "temperature": 0.3, + }, + timeout=60 + ) + response.raise_for_status() + + result = response.json() + self.api_calls += 1 + + usage = result.get('usage', {}) + input_tokens = usage.get('prompt_tokens', 0) + output_tokens = usage.get('completion_tokens', 0) + # Rough estimate assuming $3/M input and $15/M output tokens + self.ai_cost += (input_tokens * 3 + output_tokens * 15) / 1_000_000 + + logger.info(f" ✓ Got proposals (tokens: {input_tokens}+{output_tokens})") + return result['choices'][0]['message']['content'].strip() + + except Exception as e: + logger.error(f"Error getting proposals: {e}") + return None + + def parse_proposals(self, proposals_json: str) -> List[Dict]: + """Parse JSON proposals.""" + try: + start_idx = proposals_json.find('[') + end_idx = proposals_json.rfind(']') + 1 + + if start_idx == -1 or end_idx == 0: + return [] + + return json.loads(proposals_json[start_idx:end_idx]) + + except json.JSONDecodeError: + return [] + + def propose_categories(self, batch_size: int = 10) -> bool: + """Propose categories for all posts.""" + logger.info("\n" + "="*70) + logger.info("PROPOSING CATEGORIES WITH AI") + logger.info("="*70 + "\n") + + batches = [self.posts[i:i + batch_size] for i in range(0, len(self.posts), batch_size)] + logger.info(f"Processing {len(self.posts)} posts in {len(batches)} batches...\n") + + all_proposals = {} + + for batch_num, batch in enumerate(batches, 1): + logger.info(f"Batch {batch_num}/{len(batches)}...") + 
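+ # NOTE: a batch whose API call or JSON parse fails is skipped rather than + # aborting the run; its posts fall back to their current categories below.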
+ proposals_json = self.get_category_proposals(batch) + if not proposals_json: + continue + + proposals = self.parse_proposals(proposals_json) + + for prop in proposals: + all_proposals[str(prop.get('post_id', ''))] = prop + + logger.info(f" ✓ Got {len(proposals)} proposals") + + logger.info(f"\n✓ Proposals complete!") + logger.info(f" Total: {len(all_proposals)}") + logger.info(f" API calls: {self.api_calls}") + logger.info(f" Cost: ${self.ai_cost:.4f}") + + # Map proposals to posts + for post in self.posts: + post_id = str(post['post_id']) + proposal = all_proposals.get(post_id, {}) + + self.proposed_categories.append({ + **post, + 'proposed_category': proposal.get('proposed_category', post.get('categories', '')), + 'alternative_categories': ', '.join(proposal.get('alternative_categories', [])), + 'category_reason': proposal.get('reason', ''), + 'category_confidence': proposal.get('confidence', 'Medium'), + 'current_categories': post.get('categories', '') + }) + + return True + + def export_proposals(self, output_file: Optional[str] = None) -> str: + """Export category proposals to CSV.""" + if not output_file: + output_dir = Path(__file__).parent.parent / 'output' + output_dir.mkdir(parents=True, exist_ok=True) + timestamp = datetime.now().strftime('%Y%m%d_%H%M%S') + output_file = output_dir / f'category_proposals_{timestamp}.csv' + + output_file = Path(output_file) + output_file.parent.mkdir(parents=True, exist_ok=True) + + fieldnames = [ + 'post_id', 'title', 'site', 'current_categories', + 'proposed_category', 'alternative_categories', + 'category_reason', 'category_confidence' + ] + + logger.info(f"\nExporting to: {output_file}") + + with open(output_file, 'w', newline='', encoding='utf-8') as f: + writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction='ignore') + writer.writeheader() + writer.writerows(self.proposed_categories) + + logger.info(f"✓ Exported {len(self.proposed_categories)} proposals") + return str(output_file) + + def run(self, 
output_file: Optional[str] = None, batch_size: int = 10) -> str: + """Run complete category proposal process.""" + if not self.load_csv(): + sys.exit(1) + + if not self.propose_categories(batch_size=batch_size): + logger.error("Failed to propose categories") + sys.exit(1) + + return self.export_proposals(output_file) + + +def main(): + """Main entry point.""" + import argparse + + parser = argparse.ArgumentParser( + description='AI-powered category proposer for blog posts' + ) + parser.add_argument('csv_file', help='Input CSV file with posts') + parser.add_argument('--output', '-o', help='Output CSV file') + parser.add_argument('--batch-size', type=int, default=10, help='Batch size') + + args = parser.parse_args() + + # Make logger output visible when run as a standalone script + logging.basicConfig(level=logging.INFO, format='%(message)s') + + proposer = CategoryProposer(args.csv_file) + output_file = proposer.run(output_file=args.output, batch_size=args.batch_size) + + logger.info(f"\n✓ Category proposals saved to: {output_file}") + + +if __name__ == '__main__': + main() diff --git a/scripts/enhanced_analyzer.py b/scripts/enhanced_analyzer.py new file mode 100644 index 0000000..cf7e3ba --- /dev/null +++ b/scripts/enhanced_analyzer.py @@ -0,0 +1,375 @@ +#!/usr/bin/env python3 +""" +Enhanced AI Analyzer - Selective analysis with in-place updates +Analyzes posts and updates CSV with AI recommendations for: +- Title optimization +- Meta description optimization +- Category suggestions +- Site placement recommendations +""" + +import csv +import json +import logging +import sys +from pathlib import Path +from typing import Dict, List, Optional +import requests +from datetime import datetime +from config import Config + +logger = logging.getLogger(__name__) + + +class EnhancedPostAnalyzer: + """Enhanced analyzer with selective column analysis and in-place updates.""" + + def __init__(self, csv_file: str, analyze_fields: Optional[List[str]] = None): + """ + Initialize analyzer. 
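+ + Example: + analyzer = EnhancedPostAnalyzer('output/all_posts_2026-02-16.csv', analyze_fields=['title'])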
+ + Args: + csv_file: Path to input CSV + analyze_fields: List of fields to analyze ['title', 'meta_description', 'categories', 'site'] + If None, analyzes all fields + """ + self.csv_file = Path(csv_file) + self.openrouter_api_key = Config.OPENROUTER_API_KEY + self.ai_model = Config.AI_MODEL + self.posts = [] + self.analyzed_posts = [] + self.api_calls = 0 + self.ai_cost = 0.0 + + # Default: analyze all fields + if analyze_fields is None: + self.analyze_fields = ['title', 'meta_description', 'categories', 'site'] + else: + self.analyze_fields = analyze_fields + + logger.info(f"Fields to analyze: {', '.join(self.analyze_fields)}") + + def load_csv(self) -> bool: + """Load posts from CSV file.""" + logger.info(f"Loading CSV: {self.csv_file}") + + if not self.csv_file.exists(): + logger.error(f"CSV file not found: {self.csv_file}") + return False + + try: + with open(self.csv_file, 'r', encoding='utf-8') as f: + reader = csv.DictReader(f) + self.posts = list(reader) + + logger.info(f"✓ Loaded {len(self.posts)} posts from CSV") + return True + + except Exception as e: + logger.error(f"Error loading CSV: {e}") + return False + + def get_ai_recommendations(self, batch: List[Dict], fields: List[str]) -> Optional[str]: + """Get AI recommendations for specific fields.""" + if not self.openrouter_api_key: + logger.error("OPENROUTER_API_KEY not set") + return None + + # Format posts for AI + formatted_posts = [] + for i, post in enumerate(batch, 1): + post_text = f"{i}. 
POST ID: {post['post_id']}\n" + post_text += f" Site: {post.get('site', '')}\n" + + if 'title' in fields: + post_text += f" Title: {post.get('title', '')}\n" + + if 'meta_description' in fields: + post_text += f" Meta Description: {post.get('meta_description', '')}\n" + + if 'categories' in fields: + post_text += f" Categories: {post.get('categories', '')}\n" + + if 'content_preview' in post: + post_text += f" Content Preview: {post.get('content_preview', '')[:300]}...\n" + + formatted_posts.append(post_text) + + posts_text = "\n".join(formatted_posts) + + # Build prompt based on requested fields + prompt_parts = ["Analyze these blog posts and provide recommendations.\n\n"] + + if 'site' in fields: + prompt_parts.append("""Website Strategy: +- mistergeek.net: High-value topics (VPN, Software, Gaming, General Tech, SEO, Content Marketing) +- webscroll.fr: Torrenting, File-Sharing, Tracker guides +- hellogeek.net: Low-traffic, experimental, off-brand content + +""") + + prompt_parts.append(posts_text) + prompt_parts.append("\nFor EACH post, provide a JSON object with:\n{\n") + # post_id is required so responses can be mapped back to posts + prompt_parts.append(' "post_id": <post_id>,\n') + + if 'title' in fields: + prompt_parts.append(' "proposed_title": "<improved title>",\n') + prompt_parts.append(' "title_reason": "<brief explanation>",\n') + + if 'meta_description' in fields: + prompt_parts.append(' "proposed_meta_description": "<120-160 char description>",\n') + prompt_parts.append(' "meta_reason": "<brief explanation>",\n') + + if 'categories' in fields: + prompt_parts.append(' "proposed_category": "<single best category>",\n') + prompt_parts.append(' "category_reason": "<brief explanation>",\n') + + if 'site' in fields: + prompt_parts.append(' "proposed_site": "<best site>",\n') + prompt_parts.append(' "site_reason": "<brief explanation>",\n') + + prompt_parts.append(' "confidence": "<High|Medium|Low>",\n') + prompt_parts.append(' "priority": "<High|Medium|Low>"\n}') + + prompt_parts.append("\nReturn ONLY a JSON array of objects, one per post.") + + prompt = "".join(prompt_parts) + + try: + logger.info(" Sending batch to AI for analysis...") + + response = requests.post( + "https://openrouter.ai/api/v1/chat/completions", + headers={ + "Authorization": 
f"Bearer {self.openrouter_api_key}", + "Content-Type": "application/json", + }, + json={ + "model": self.ai_model, + "messages": [{"role": "user", "content": prompt}], + "temperature": 0.3, + }, + timeout=60 + ) + response.raise_for_status() + + result = response.json() + self.api_calls += 1 + + # Track cost + usage = result.get('usage', {}) + input_tokens = usage.get('prompt_tokens', 0) + output_tokens = usage.get('completion_tokens', 0) + self.ai_cost += (input_tokens * 3 + output_tokens * 15) / 1_000_000 + + recommendations_text = result['choices'][0]['message']['content'].strip() + logger.info(f" ✓ Got recommendations (tokens: {input_tokens}+{output_tokens})") + + return recommendations_text + + except Exception as e: + logger.error(f"Error getting AI recommendations: {e}") + return None + + def parse_recommendations(self, recommendations_json: str) -> List[Dict]: + """Parse JSON recommendations from AI.""" + try: + start_idx = recommendations_json.find('[') + end_idx = recommendations_json.rfind(']') + 1 + + if start_idx == -1 or end_idx == 0: + logger.error("Could not find JSON array in response") + return [] + + json_str = recommendations_json[start_idx:end_idx] + recommendations = json.loads(json_str) + + return recommendations + + except json.JSONDecodeError as e: + logger.error(f"Error parsing JSON recommendations: {e}") + return [] + + def analyze_posts(self, batch_size: int = 10) -> bool: + """Analyze all posts in batches.""" + logger.info("\n" + "="*70) + logger.info("ANALYZING POSTS WITH AI") + logger.info("="*70 + "\n") + + batches = [self.posts[i:i + batch_size] for i in range(0, len(self.posts), batch_size)] + logger.info(f"Processing {len(self.posts)} posts in {len(batches)} batches...\n") + + all_recommendations = {} + + for batch_num, batch in enumerate(batches, 1): + logger.info(f"Batch {batch_num}/{len(batches)}: Analyzing {len(batch)} posts...") + + recommendations_json = self.get_ai_recommendations(batch, self.analyze_fields) + + if not 
recommendations_json: + logger.error(f" Failed to get recommendations for batch {batch_num}") + continue + + recommendations = self.parse_recommendations(recommendations_json) + + for rec in recommendations: + all_recommendations[str(rec.get('post_id', ''))] = rec + + logger.info(f" ✓ Got {len(recommendations)} recommendations") + + logger.info(f"\n✓ Analysis complete!") + logger.info(f" Total recommendations: {len(all_recommendations)}") + logger.info(f" API calls: {self.api_calls}") + logger.info(f" Estimated cost: ${self.ai_cost:.4f}") + + # Map recommendations to posts + for post in self.posts: + post_id = str(post['post_id']) + if post_id in all_recommendations: + rec = all_recommendations[post_id] + + # Add only requested fields + if 'title' in self.analyze_fields: + post['proposed_title'] = rec.get('proposed_title', post.get('title', '')) + post['title_reason'] = rec.get('title_reason', '') + + if 'meta_description' in self.analyze_fields: + post['proposed_meta_description'] = rec.get('proposed_meta_description', post.get('meta_description', '')) + post['meta_reason'] = rec.get('meta_reason', '') + + if 'categories' in self.analyze_fields: + post['proposed_category'] = rec.get('proposed_category', post.get('categories', '')) + post['category_reason'] = rec.get('category_reason', '') + + if 'site' in self.analyze_fields: + post['proposed_site'] = rec.get('proposed_site', post.get('site', '')) + post['site_reason'] = rec.get('site_reason', '') + + # Common fields + post['ai_confidence'] = rec.get('confidence', 'Medium') + post['ai_priority'] = rec.get('priority', 'Medium') + else: + # Add empty fields for consistency + if 'title' in self.analyze_fields: + post['proposed_title'] = post.get('title', '') + post['title_reason'] = 'No AI recommendation' + + if 'meta_description' in self.analyze_fields: + post['proposed_meta_description'] = post.get('meta_description', '') + post['meta_reason'] = 'No AI recommendation' + + if 'categories' in self.analyze_fields: + 
post['proposed_category'] = post.get('categories', '') + post['category_reason'] = 'No AI recommendation' + + if 'site' in self.analyze_fields: + post['proposed_site'] = post.get('site', '') + post['site_reason'] = 'No AI recommendation' + + post['ai_confidence'] = 'Unknown' + post['ai_priority'] = 'Medium' + + self.analyzed_posts.append(post) + + return len(self.analyzed_posts) > 0 + + def export_results(self, output_file: Optional[str] = None, update_input: bool = False) -> str: + """ + Export results to CSV. + + Args: + output_file: Custom output path + update_input: If True, update the input CSV file (creates backup) + + Returns: + Path to exported file + """ + if update_input: + # Create backup of original file + backup_file = self.csv_file.parent / f"{self.csv_file.stem}_backup_{datetime.now().strftime('%Y%m%d_%H%M%S')}.csv" + import shutil + shutil.copy2(self.csv_file, backup_file) + logger.info(f"✓ Created backup: {backup_file}") + + output_file = self.csv_file + elif not output_file: + output_dir = Path(__file__).parent.parent / 'output' + output_dir.mkdir(parents=True, exist_ok=True) + timestamp = datetime.now().strftime('%Y%m%d_%H%M%S') + output_file = output_dir / f'analyzed_posts_{timestamp}.csv' + + output_file = Path(output_file) + output_file.parent.mkdir(parents=True, exist_ok=True) + + if not self.analyzed_posts: + logger.error("No analyzed posts to export") + return "" + + # Build fieldnames - original fields + new fields + original_fields = list(self.analyzed_posts[0].keys()) + + # Determine which new fields were added + new_fields = [] + if 'title' in self.analyze_fields: + new_fields.extend(['proposed_title', 'title_reason']) + if 'meta_description' in self.analyze_fields: + new_fields.extend(['proposed_meta_description', 'meta_reason']) + if 'categories' in self.analyze_fields: + new_fields.extend(['proposed_category', 'category_reason']) + if 'site' in self.analyze_fields: + new_fields.extend(['proposed_site', 'site_reason']) + + 
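+ # ai_confidence and ai_priority are always appended, regardless of + # which fields were selected for analysis.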
new_fields.extend(['ai_confidence', 'ai_priority']) + + # analyze_posts() already added these keys to each post dict, so they + # may appear in original_fields; de-duplicate to avoid repeated columns + fieldnames = original_fields + [f for f in new_fields if f not in original_fields] + + logger.info(f"\nExporting results to: {output_file}") + + with open(output_file, 'w', newline='', encoding='utf-8') as f: + writer = csv.DictWriter(f, fieldnames=fieldnames) + writer.writeheader() + writer.writerows(self.analyzed_posts) + + logger.info(f"✓ Exported {len(self.analyzed_posts)} posts") + return str(output_file) + + def run(self, output_file: Optional[str] = None, update_input: bool = False, batch_size: int = 10) -> str: + """Run complete analysis.""" + if not self.load_csv(): + sys.exit(1) + + if not self.analyze_posts(batch_size=batch_size): + logger.error("Failed to analyze posts") + sys.exit(1) + + return self.export_results(output_file=output_file, update_input=update_input) + + +def main(): + """Main entry point with argument parsing.""" + import argparse + + parser = argparse.ArgumentParser( + description='Enhanced AI analyzer with selective field analysis' + ) + parser.add_argument('csv_file', help='Input CSV file') + parser.add_argument('--output', '-o', help='Output CSV file (default: creates new file in output/)') + parser.add_argument('--update', '-u', action='store_true', help='Update input CSV file (creates backup)') + parser.add_argument('--fields', '-f', nargs='+', + choices=['title', 'meta_description', 'categories', 'site'], + help='Fields to analyze (default: all fields)') + parser.add_argument('--batch-size', type=int, default=10, help='Batch size for AI analysis') + + args = parser.parse_args() + + # Make logger output visible when run as a standalone script + logging.basicConfig(level=logging.INFO, format='%(message)s') + + analyzer = EnhancedPostAnalyzer(args.csv_file, analyze_fields=args.fields) + output_file = analyzer.run( + output_file=args.output, + update_input=args.update, + batch_size=args.batch_size + ) + + logger.info(f"\n✓ Analysis complete! 
Results saved to: {output_file}") + + +if __name__ == '__main__': + main() diff --git a/src/seo/cli.py b/src/seo/cli.py index 04b269f..dd05e49 100644 --- a/src/seo/cli.py +++ b/src/seo/cli.py @@ -41,6 +41,11 @@ Examples: parser.add_argument('--verbose', '-v', action='store_true', help='Verbose output') parser.add_argument('--dry-run', action='store_true', help='Show what would be done') parser.add_argument('--top-n', type=int, default=10, help='Number of top posts for AI analysis') + parser.add_argument('--fields', '-f', nargs='+', + choices=['title', 'meta_description', 'categories', 'site'], + help='Fields to analyze (for analyze command)') + parser.add_argument('--update', '-u', action='store_true', help='Update input file (creates backup)') + parser.add_argument('--output', '-o', help='Output file path') args = parser.parse_args() @@ -65,6 +70,7 @@ Examples: 'recategorize': cmd_recategorize, 'seo_check': cmd_seo_check, 'categories': cmd_categories, + 'category_propose': cmd_category_propose, 'approve': cmd_approve, 'full_pipeline': cmd_full_pipeline, 'status': cmd_status, @@ -110,7 +116,33 @@ def cmd_analyze(app, args): return 0 csv_file = args.args[0] if args.args else None - app.analyze(csv_file) + + # Use enhanced analyzer if fields are specified or update flag is set + if args.fields or args.update: + from pathlib import Path + import sys + scripts_dir = Path(__file__).parent.parent.parent / 'scripts' + sys.path.insert(0, str(scripts_dir)) + + from enhanced_analyzer import EnhancedPostAnalyzer + + if not csv_file: + csv_file = app._find_latest_export() + + if not csv_file: + print("❌ No CSV file found. Provide one or run export first.") + return 1 + + print(f"Using enhanced analyzer with fields: {args.fields or 'all'}") + analyzer = EnhancedPostAnalyzer(csv_file, analyze_fields=args.fields) + output_file = analyzer.run( + output_file=args.output, + update_input=args.update + ) + print(f"✅ Analysis completed! 
Results: {output_file}") + else: + app.analyze(csv_file) + + return 0 @@ -145,6 +177,37 @@ def cmd_categories(app, args): return 0 +def cmd_category_propose(app, args): + """Propose categories for posts.""" + if args.dry_run: + print("Would propose categories for posts using AI") + return 0 + + csv_file = args.args[0] if args.args else None + + if not csv_file: + csv_file = app._find_latest_export() + + if not csv_file: + print("❌ No CSV file found. Provide one or run export first.") + print(" Usage: seo category_propose [csv_file]") + return 1 + + from pathlib import Path + import sys + scripts_dir = Path(__file__).parent.parent.parent / 'scripts' + sys.path.insert(0, str(scripts_dir)) + + from category_proposer import CategoryProposer + + print(f"Proposing categories for: {csv_file}") + proposer = CategoryProposer(csv_file) + output_file = proposer.run(output_file=args.output) + + print(f"✅ Category proposals saved to: {output_file}") + return 0 + + +def cmd_approve(app, args): + """Approve recommendations.""" + if args.dry_run: @@ -192,10 +255,13 @@ SEO Automation CLI - Available Commands Basic Commands: export Export all posts from WordPress sites - analyze [csv_file] Analyze posts with AI (optional CSV input) - recategorize [csv_file] Recategorize posts with AI (optional CSV input) + analyze [csv_file] Analyze posts with AI + analyze -f title categories Analyze specific fields only + analyze -u Update input CSV with new columns + recategorize [csv_file] Recategorize posts with AI seo_check Check SEO quality of titles/descriptions - categories Manage categories across all sites + categories Manage categories across sites + category_propose [csv] Propose categories based on content approve [files...] 
Review and approve recommendations full_pipeline Run complete workflow: export → analyze → seo_check @@ -207,12 +273,18 @@ Options: --verbose, -v Enable verbose logging --dry-run Show what would be done without doing it --top-n N Number of top posts for AI analysis (default: 10) + --fields, -f Fields to analyze: title, meta_description, categories, site + --update, -u Update input CSV file (creates backup) + --output, -o Output file path Examples: seo export seo analyze seo analyze output/all_posts_2026-02-16.csv - seo approve output/category_assignments_*.csv + seo analyze -f title categories + seo analyze -u -f meta_description + seo category_propose + seo approve output/category_proposals_*.csv seo full_pipeline seo status """)