Add enhanced analysis with selective field analysis and category proposer

New Features:
- Selective field analysis: Choose which fields to analyze (title, meta_description, categories, site)
- In-place CSV updates: Update input CSV with new columns (automatic backup created)
- Category proposer: Dedicated command for AI-powered category suggestions

New Commands:
- seo analyze -f title categories: Analyze specific fields only
- seo analyze -u: Update input CSV with recommendations
- seo category_propose: Propose categories based on content

New Scripts:
- enhanced_analyzer.py: Enhanced AI analyzer with selective analysis
- category_proposer.py: Dedicated category proposal tool

CLI Options:
- --fields, -f: Specify fields to analyze
- --update, -u: Update input CSV (creates backup)
- --output, -o: Custom output file path

Output Columns:
- proposed_title, title_reason (for title analysis)
- proposed_meta_description, meta_reason (for meta analysis)
- proposed_category, category_reason (for category analysis)
- proposed_site, site_reason (for site analysis)
- ai_confidence, ai_priority (common to all)

Documentation:
- ENHANCED_ANALYSIS_GUIDE.md: Complete guide with examples

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Author: Kevin Bataille
Date: 2026-02-16 14:57:42 +01:00
Parent: 9d0a2c77eb
Commit: 1744d8e7db
4 changed files with 992 additions and 5 deletions

ENHANCED_ANALYSIS_GUIDE.md Normal file

@@ -0,0 +1,301 @@
# Enhanced Analysis Guide
## Overview
The SEO automation tool now provides enhanced analysis capabilities with:
1. **Selective Field Analysis** - Choose which fields to analyze (title, meta description, categories, site)
2. **In-place CSV Updates** - Update input CSV with new columns (with automatic backup)
3. **Category Proposer** - Dedicated command for AI-powered category suggestions
## Commands
### 1. Enhanced Analysis (`seo analyze`)
Analyze posts with AI and add recommendation columns to your CSV.
#### Basic Usage
```bash
# Analyze all fields (default behavior)
./seo analyze
# Analyze specific CSV file
./seo analyze output/all_posts_2026-02-16.csv
```
#### Selective Field Analysis
```bash
# Analyze only titles
./seo analyze -f title
# Analyze titles and categories
./seo analyze -f title categories
# Analyze meta descriptions only
./seo analyze -f meta_description
# Analyze all fields
./seo analyze -f title meta_description categories site
```
#### Update Input CSV (In-place)
```bash
# Update input CSV with new columns (creates backup automatically)
./seo analyze -u
# Update with specific fields only
./seo analyze -u -f title meta_description
# Specify custom output file
./seo analyze -o output/my_analysis.csv
```
#### Output Columns
Depending on selected fields, the following columns are added:
**Title Analysis:**
- `proposed_title` - AI-suggested improved title
- `title_reason` - Explanation for title change
**Meta Description Analysis:**
- `proposed_meta_description` - AI-suggested meta description (120-160 chars)
- `meta_reason` - Explanation for meta description change
**Category Analysis:**
- `proposed_category` - AI-suggested best category
- `category_reason` - Explanation for category choice
**Site Analysis:**
- `proposed_site` - AI-suggested best site
- `site_reason` - Explanation for site recommendation
**Common Fields:**
- `ai_confidence` - AI confidence level (High/Medium/Low)
- `ai_priority` - Priority level (High/Medium/Low)
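Once a run finishes, the recommendation columns can be post-processed like any other CSV. A minimal sketch (the sample rows and helper name here are hypothetical) that pulls out the rows worth reviewing first:

```python
import csv
import io

# Hypothetical excerpt of an analyzed_posts_*.csv (columns match this guide)
sample = """post_id,title,proposed_title,title_reason,ai_confidence,ai_priority
101,Old VPN Guide,Best VPNs in 2026: Tested and Ranked,Adds year and benefit,High,High
102,Misc Notes,Misc Notes,Title already adequate,Medium,Low
"""

def rows_to_review(csv_text: str):
    """Rows where the AI is both confident and the change is high priority."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [r for r in reader
            if r["ai_confidence"] == "High" and r["ai_priority"] == "High"]

review = rows_to_review(sample)
print(len(review), review[0]["proposed_title"])
```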
### 2. Category Proposer (`seo category_propose`)
Dedicated command for AI-powered category suggestions based on post content.
#### Usage
```bash
# Propose categories for latest export
./seo category_propose
# Propose categories for specific CSV
./seo category_propose output/all_posts_2026-02-16.csv
# Save to custom output file
./seo category_propose -o output/my_category_proposals.csv
```
#### Output Columns
- `post_id` - Post identifier
- `title` - Post title
- `current_categories` - Current categories
- `proposed_category` - AI-suggested category
- `alternative_categories` - Alternative category suggestions
- `category_reason` - Explanation for suggestion
- `category_confidence` - Confidence level
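To see how many proposals actually differ from the current taxonomy before touching WordPress, something like this works (sample data and helper name are hypothetical; the column names are the ones listed above):

```python
import csv
import io

# Hypothetical excerpt of a category_proposals_*.csv
sample = """post_id,title,current_categories,proposed_category,category_confidence
1,Post A,Tech,Gaming,High
2,Post B,Gaming,Gaming,Medium
3,Post C,Misc,SEO,Low
"""

def actionable_changes(csv_text: str, accept=("High", "Medium")):
    """Proposals that change the category and meet the confidence bar."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [r for r in reader
            if r["proposed_category"] != r["current_categories"]
            and r["category_confidence"] in accept]

changes = actionable_changes(sample)
print([r["post_id"] for r in changes])
```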
## Examples
### Example 1: Analyze Titles Only
```bash
# Analyze only titles for SEO optimization
./seo analyze -f title
# Output: analyzed_posts_YYYYMMDD_HHMMSS.csv
# Contains: original columns + proposed_title + title_reason + ai_confidence
```
### Example 2: Update CSV with Meta Descriptions
```bash
# Update input CSV with proposed meta descriptions
./seo analyze -u -f meta_description
# Creates:
# - all_posts_2026-02-16_backup_YYYYMMDD_HHMMSS.csv (backup)
# - all_posts_2026-02-16.csv (updated with new columns)
```
### Example 3: Full Category Analysis
```bash
# Propose categories for all posts
./seo category_propose
# Review proposals
open output/category_proposals_*.csv
# Apply approved categories manually in WordPress
```
### Example 4: Multi-Field Analysis
```bash
# Analyze titles and categories together
./seo analyze -f title categories
# Output includes:
# - proposed_title, title_reason
# - proposed_category, category_reason
# - ai_confidence, ai_priority
```
### Example 5: Targeted Analysis with Output
```bash
# Analyze meta descriptions, save to specific file
./seo analyze -f meta_description -o output/meta_analysis.csv
```
## Workflow Examples
### Workflow 1: SEO Title Optimization
```bash
# 1. Export posts
./seo export
# 2. Analyze titles only
./seo analyze -f title
# 3. Review proposed titles
open output/analyzed_posts_*.csv
# 4. Manually update best titles in WordPress
```
### Workflow 2: Category Reorganization
```bash
# 1. Export posts
./seo export
# 2. Get category proposals
./seo category_propose
# 3. Review proposals
open output/category_proposals_*.csv
# 4. Apply approved category changes
```
### Workflow 3: Complete SEO Audit
```bash
# 1. Export posts
./seo export
# 2. Analyze all fields
./seo analyze -f title meta_description categories site
# 3. Review comprehensive analysis
open output/analyzed_posts_*.csv
# 4. Implement changes based on AI recommendations
```
### Workflow 4: Incremental Analysis
```bash
# 1. Export posts
./seo export
# 2. Analyze titles (fast, low cost)
./seo analyze -f title
# 3. Later, analyze meta descriptions
./seo analyze -u -f meta_description
# 4. Later, analyze categories
./seo analyze -u -f categories
# Result: CSV progressively enriched with AI recommendations
```
## Cost Optimization
### Reduce API Costs
```bash
# Analyze only needed fields (saves tokens)
./seo analyze -f title # Cheaper than analyzing all fields
# Use smaller batch sizes for better control
# (the standalone scripts accept a --batch-size option, default 10)
# Analyze in stages
./seo analyze -f title
./seo analyze -u -f meta_description
# Total cost similar, but better control over each step
```
### Token Usage by Field
Approximate token usage per 100 posts:
- **title**: ~500 tokens
- **meta_description**: ~800 tokens
- **categories**: ~600 tokens
- **site**: ~400 tokens (lowest cost)
- **All fields**: ~2000 tokens (best value)
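The scripts in this commit price calls at $3 per million input tokens and $15 per million output tokens (a hardcoded assumption; actual OpenRouter pricing depends on the model). That makes back-of-envelope estimates easy:

```python
PRICE_IN, PRICE_OUT = 3, 15  # USD per 1M tokens, as hardcoded in the scripts

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Mirror of the cost formula the analyzers accumulate per API call."""
    return (input_tokens * PRICE_IN + output_tokens * PRICE_OUT) / 1_000_000

# Title-only pass over 1,000 posts at ~500 input tokens per 100 posts,
# assuming (hypothetically) output is about the same size as input
tokens = 500 * (1000 // 100)
print(f"${estimate_cost(tokens, tokens):.2f}")
```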
## Best Practices
1. **Start Small**: Test with `-f title` first to see AI quality
2. **Review Before Applying**: Always review AI suggestions before implementing
3. **Use Backups**: The `-u` flag creates automatic backups
4. **Batch Analysis**: Analyze related fields together for better context
5. **Confidence Matters**: Pay attention to `ai_confidence` column
6. **Iterative Process**: Enrich CSVs incrementally for better control
## Troubleshooting
### No CSV File Found
```bash
# Error: No CSV file found
# Solution: Run export first or provide file path
./seo export
./seo analyze
# Or specify file directly
./seo analyze path/to/your/file.csv
```
### API Key Not Set
```bash
# Error: OPENROUTER_API_KEY not set
# Solution: Add to .env file
echo "OPENROUTER_API_KEY=your_key_here" >> .env
```
### High API Costs
```bash
# Reduce costs by analyzing fewer fields
./seo analyze -f title # Instead of all fields
# Or analyze in batches
./seo analyze -f title
./seo analyze -u -f meta_description
```
## File Formats
### Input CSV Requirements
Must contain at minimum:
- `post_id` - Unique identifier
- `title` - Post title (for title analysis)
- `meta_description` - Current meta (for meta analysis)
- `categories` - Current categories (for category analysis)
- `site` - Current site (for site analysis)
- `content_preview` or `content` - Post content (recommended for all analyses)
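A quick pre-flight check for these requirements, sketched with the standard library (the header strings are the ones listed above; the sample CSVs are hypothetical):

```python
import csv
import io

REQUIRED = {"post_id", "title", "meta_description", "categories", "site"}

def missing_columns(csv_text: str):
    """Return the required columns absent from a CSV's header row."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return sorted(REQUIRED - set(header))

ok_header = "post_id,title,meta_description,categories,site,content_preview\n1,t,m,c,s,p\n"
bad_header = "post_id,title\n1,t\n"
print(missing_columns(ok_header), missing_columns(bad_header))
```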
### Output CSV Format
Standard output includes:
- All original columns
- New `proposed_*` columns for analyzed fields
- `*_reason` columns with explanations
- `ai_confidence` and `ai_priority` columns
---
**Version**: 1.0.0
**Last Updated**: 2026-02-16
**Related**: See ARCHITECTURE.md for system overview

scripts/category_proposer.py Normal file

@@ -0,0 +1,239 @@
#!/usr/bin/env python3
"""
Category Proposer - AI-powered category suggestions

Analyzes posts and proposes optimal categories based on content.
"""
import csv
import json
import logging
import sys
from datetime import datetime
from pathlib import Path
from typing import Dict, List, Optional

import requests

from config import Config

logger = logging.getLogger(__name__)


class CategoryProposer:
    """Propose categories for posts using AI."""

    def __init__(self, csv_file: str):
        """Initialize proposer with CSV file."""
        self.csv_file = Path(csv_file)
        self.openrouter_api_key = Config.OPENROUTER_API_KEY
        self.ai_model = Config.AI_MODEL
        self.posts = []
        self.proposed_categories = []
        self.api_calls = 0
        self.ai_cost = 0.0

    def load_csv(self) -> bool:
        """Load posts from CSV."""
        logger.info(f"Loading CSV: {self.csv_file}")
        if not self.csv_file.exists():
            logger.error(f"CSV file not found: {self.csv_file}")
            return False
        try:
            with open(self.csv_file, 'r', encoding='utf-8') as f:
                reader = csv.DictReader(f)
                self.posts = list(reader)
            logger.info(f"✓ Loaded {len(self.posts)} posts")
            return True
        except Exception as e:
            logger.error(f"Error loading CSV: {e}")
            return False

    def get_category_proposals(self, batch: List[Dict]) -> Optional[str]:
        """Get AI category proposals for a batch of posts."""
        if not self.openrouter_api_key:
            logger.error("OPENROUTER_API_KEY not set")
            return None

        # Format posts for AI
        formatted = []
        for i, post in enumerate(batch, 1):
            text = f"{i}. ID: {post['post_id']}\n"
            text += f"   Title: {post.get('title', '')}\n"
            text += f"   Current Categories: {post.get('categories', '')}\n"
            if 'content_preview' in post:
                text += f"   Content: {post['content_preview'][:300]}...\n"
            formatted.append(text)
        posts_text = "\n".join(formatted)

        prompt = f"""Analyze these blog posts and propose optimal categories.

{posts_text}

For EACH post, provide:
{{
  "post_id": <id>,
  "current_categories": "<current>",
  "proposed_category": "<best category>",
  "alternative_categories": ["<alt1>", "<alt2>"],
  "reason": "<brief explanation>",
  "confidence": "<High|Medium|Low>"
}}

Return ONLY a JSON array with one object per post."""

        try:
            logger.info("  Getting category proposals...")
            response = requests.post(
                "https://openrouter.ai/api/v1/chat/completions",
                headers={
                    "Authorization": f"Bearer {self.openrouter_api_key}",
                    "Content-Type": "application/json",
                },
                json={
                    "model": self.ai_model,
                    "messages": [{"role": "user", "content": prompt}],
                    "temperature": 0.3,
                },
                timeout=60,
            )
            response.raise_for_status()
            result = response.json()
            self.api_calls += 1
            usage = result.get('usage', {})
            input_tokens = usage.get('prompt_tokens', 0)
            output_tokens = usage.get('completion_tokens', 0)
            self.ai_cost += (input_tokens * 3 + output_tokens * 15) / 1_000_000
            logger.info(f"  ✓ Got proposals (tokens: {input_tokens}+{output_tokens})")
            return result['choices'][0]['message']['content'].strip()
        except Exception as e:
            logger.error(f"Error getting proposals: {e}")
            return None

    def parse_proposals(self, proposals_json: str) -> List[Dict]:
        """Parse the JSON array out of the model's reply."""
        try:
            start_idx = proposals_json.find('[')
            end_idx = proposals_json.rfind(']') + 1
            if start_idx == -1 or end_idx == 0:
                return []
            return json.loads(proposals_json[start_idx:end_idx])
        except json.JSONDecodeError:
            return []

    def propose_categories(self, batch_size: int = 10) -> bool:
        """Propose categories for all posts."""
        logger.info("\n" + "=" * 70)
        logger.info("PROPOSING CATEGORIES WITH AI")
        logger.info("=" * 70 + "\n")

        batches = [self.posts[i:i + batch_size]
                   for i in range(0, len(self.posts), batch_size)]
        logger.info(f"Processing {len(self.posts)} posts in {len(batches)} batches...\n")

        all_proposals = {}
        for batch_num, batch in enumerate(batches, 1):
            logger.info(f"Batch {batch_num}/{len(batches)}...")
            proposals_json = self.get_category_proposals(batch)
            if not proposals_json:
                continue
            proposals = self.parse_proposals(proposals_json)
            for prop in proposals:
                all_proposals[str(prop.get('post_id', ''))] = prop
            logger.info(f"  ✓ Got {len(proposals)} proposals")

        logger.info("\n✓ Proposals complete!")
        logger.info(f"  Total: {len(all_proposals)}")
        logger.info(f"  API calls: {self.api_calls}")
        logger.info(f"  Cost: ${self.ai_cost:.4f}")

        # Map proposals to posts
        for post in self.posts:
            post_id = str(post['post_id'])
            proposal = all_proposals.get(post_id, {})
            self.proposed_categories.append({
                **post,
                'proposed_category': proposal.get('proposed_category', post.get('categories', '')),
                'alternative_categories': ', '.join(proposal.get('alternative_categories', [])),
                'category_reason': proposal.get('reason', ''),
                'category_confidence': proposal.get('confidence', 'Medium'),
                'current_categories': post.get('categories', ''),
            })
        return True

    def export_proposals(self, output_file: Optional[str] = None) -> str:
        """Export category proposals to CSV."""
        if not output_file:
            output_dir = Path(__file__).parent.parent / 'output'
            output_dir.mkdir(parents=True, exist_ok=True)
            timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
            output_file = output_dir / f'category_proposals_{timestamp}.csv'
        output_file = Path(output_file)
        output_file.parent.mkdir(parents=True, exist_ok=True)

        fieldnames = [
            'post_id', 'title', 'site', 'current_categories',
            'proposed_category', 'alternative_categories',
            'category_reason', 'category_confidence',
        ]
        logger.info(f"\nExporting to: {output_file}")
        with open(output_file, 'w', newline='', encoding='utf-8') as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction='ignore')
            writer.writeheader()
            writer.writerows(self.proposed_categories)
        logger.info(f"✓ Exported {len(self.proposed_categories)} proposals")
        return str(output_file)

    def run(self, output_file: Optional[str] = None, batch_size: int = 10) -> str:
        """Run complete category proposal process."""
        if not self.load_csv():
            sys.exit(1)
        if not self.propose_categories(batch_size=batch_size):
            logger.error("Failed to propose categories")
            sys.exit(1)
        return self.export_proposals(output_file)


def main():
    """Main entry point."""
    import argparse

    logging.basicConfig(level=logging.INFO, format='%(message)s')
    parser = argparse.ArgumentParser(
        description='AI-powered category proposer for blog posts'
    )
    parser.add_argument('csv_file', help='Input CSV file with posts')
    parser.add_argument('--output', '-o', help='Output CSV file')
    parser.add_argument('--batch-size', type=int, default=10, help='Batch size')
    args = parser.parse_args()

    proposer = CategoryProposer(args.csv_file)
    output_file = proposer.run(output_file=args.output, batch_size=args.batch_size)
    logger.info(f"\n✓ Category proposals saved to: {output_file}")


if __name__ == '__main__':
    main()
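The `parse_proposals` method above relies on slicing from the first `[` to the last `]` to dig the JSON array out of a possibly chatty model reply. The same extraction technique in isolation (function name and sample reply are illustrative):

```python
import json

def extract_json_array(text: str):
    """Pull the outermost [...] JSON array out of surrounding prose."""
    start, end = text.find('['), text.rfind(']') + 1
    if start == -1 or end == 0:
        return []
    try:
        return json.loads(text[start:end])
    except json.JSONDecodeError:
        return []

reply = 'Here are the proposals:\n[{"post_id": 1, "proposed_category": "Gaming"}]\nLet me know!'
print(extract_json_array(reply))
```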

scripts/enhanced_analyzer.py Normal file

@@ -0,0 +1,375 @@
#!/usr/bin/env python3
"""
Enhanced AI Analyzer - Selective analysis with in-place updates

Analyzes posts and updates CSV with AI recommendations for:
- Title optimization
- Meta description optimization
- Category suggestions
- Site placement recommendations
"""
import csv
import json
import logging
import shutil
import sys
from datetime import datetime
from pathlib import Path
from typing import Dict, List, Optional

import requests

from config import Config

logger = logging.getLogger(__name__)


class EnhancedPostAnalyzer:
    """Enhanced analyzer with selective column analysis and in-place updates."""

    def __init__(self, csv_file: str, analyze_fields: Optional[List[str]] = None):
        """
        Initialize analyzer.

        Args:
            csv_file: Path to input CSV
            analyze_fields: Fields to analyze, any of
                ['title', 'meta_description', 'categories', 'site'];
                if None, analyzes all fields
        """
        self.csv_file = Path(csv_file)
        self.openrouter_api_key = Config.OPENROUTER_API_KEY
        self.ai_model = Config.AI_MODEL
        self.posts = []
        self.analyzed_posts = []
        self.api_calls = 0
        self.ai_cost = 0.0
        # Default: analyze all fields
        if analyze_fields is None:
            self.analyze_fields = ['title', 'meta_description', 'categories', 'site']
        else:
            self.analyze_fields = analyze_fields
        logger.info(f"Fields to analyze: {', '.join(self.analyze_fields)}")

    def load_csv(self) -> bool:
        """Load posts from CSV file."""
        logger.info(f"Loading CSV: {self.csv_file}")
        if not self.csv_file.exists():
            logger.error(f"CSV file not found: {self.csv_file}")
            return False
        try:
            with open(self.csv_file, 'r', encoding='utf-8') as f:
                reader = csv.DictReader(f)
                self.posts = list(reader)
            logger.info(f"✓ Loaded {len(self.posts)} posts from CSV")
            return True
        except Exception as e:
            logger.error(f"Error loading CSV: {e}")
            return False

    def get_ai_recommendations(self, batch: List[Dict], fields: List[str]) -> Optional[str]:
        """Get AI recommendations for specific fields."""
        if not self.openrouter_api_key:
            logger.error("OPENROUTER_API_KEY not set")
            return None

        # Format posts for AI
        formatted_posts = []
        for i, post in enumerate(batch, 1):
            post_text = f"{i}. POST ID: {post['post_id']}\n"
            post_text += f"   Site: {post.get('site', '')}\n"
            if 'title' in fields:
                post_text += f"   Title: {post.get('title', '')}\n"
            if 'meta_description' in fields:
                post_text += f"   Meta Description: {post.get('meta_description', '')}\n"
            if 'categories' in fields:
                post_text += f"   Categories: {post.get('categories', '')}\n"
            if 'content_preview' in post:
                post_text += f"   Content Preview: {post.get('content_preview', '')[:300]}...\n"
            formatted_posts.append(post_text)
        posts_text = "\n".join(formatted_posts)

        # Build prompt based on requested fields
        prompt_parts = ["Analyze these blog posts and provide recommendations.\n\n"]
        if 'site' in fields:
            prompt_parts.append("""Website Strategy:
- mistergeek.net: High-value topics (VPN, Software, Gaming, General Tech, SEO, Content Marketing)
- webscroll.fr: Torrenting, File-Sharing, Tracker guides
- hellogeek.net: Low-traffic, experimental, off-brand content

""")
        prompt_parts.append(posts_text)
        prompt_parts.append("\nFor EACH post, provide a JSON object with:\n{\n")
        if 'title' in fields:
            prompt_parts.append('  "proposed_title": "<Improved SEO title>",\n')
            prompt_parts.append('  "title_reason": "<Reason for title change>",\n')
        if 'meta_description' in fields:
            prompt_parts.append('  "proposed_meta_description": "<Improved meta description (120-160 chars)>",\n')
            prompt_parts.append('  "meta_reason": "<Reason for meta description change>",\n')
        if 'categories' in fields:
            prompt_parts.append('  "proposed_category": "<Best category>",\n')
            prompt_parts.append('  "category_reason": "<Reason for category change>",\n')
        if 'site' in fields:
            prompt_parts.append('  "proposed_site": "<Best site for this post>",\n')
            prompt_parts.append('  "site_reason": "<Reason for site recommendation>",\n')
        prompt_parts.append('  "confidence": "<High|Medium|Low>",\n')
        prompt_parts.append('  "priority": "<High|Medium|Low>"\n}')
        prompt_parts.append("\nReturn ONLY a JSON array of objects, one per post.")
        prompt = "".join(prompt_parts)

        try:
            logger.info("  Sending batch to AI for analysis...")
            response = requests.post(
                "https://openrouter.ai/api/v1/chat/completions",
                headers={
                    "Authorization": f"Bearer {self.openrouter_api_key}",
                    "Content-Type": "application/json",
                },
                json={
                    "model": self.ai_model,
                    "messages": [{"role": "user", "content": prompt}],
                    "temperature": 0.3,
                },
                timeout=60,
            )
            response.raise_for_status()
            result = response.json()
            self.api_calls += 1

            # Track cost
            usage = result.get('usage', {})
            input_tokens = usage.get('prompt_tokens', 0)
            output_tokens = usage.get('completion_tokens', 0)
            self.ai_cost += (input_tokens * 3 + output_tokens * 15) / 1_000_000

            recommendations_text = result['choices'][0]['message']['content'].strip()
            logger.info(f"  ✓ Got recommendations (tokens: {input_tokens}+{output_tokens})")
            return recommendations_text
        except Exception as e:
            logger.error(f"Error getting AI recommendations: {e}")
            return None

    def parse_recommendations(self, recommendations_json: str) -> List[Dict]:
        """Parse JSON recommendations from AI."""
        try:
            start_idx = recommendations_json.find('[')
            end_idx = recommendations_json.rfind(']') + 1
            if start_idx == -1 or end_idx == 0:
                logger.error("Could not find JSON array in response")
                return []
            return json.loads(recommendations_json[start_idx:end_idx])
        except json.JSONDecodeError as e:
            logger.error(f"Error parsing JSON recommendations: {e}")
            return []

    def analyze_posts(self, batch_size: int = 10) -> bool:
        """Analyze all posts in batches."""
        logger.info("\n" + "=" * 70)
        logger.info("ANALYZING POSTS WITH AI")
        logger.info("=" * 70 + "\n")

        batches = [self.posts[i:i + batch_size]
                   for i in range(0, len(self.posts), batch_size)]
        logger.info(f"Processing {len(self.posts)} posts in {len(batches)} batches...\n")

        all_recommendations = {}
        for batch_num, batch in enumerate(batches, 1):
            logger.info(f"Batch {batch_num}/{len(batches)}: Analyzing {len(batch)} posts...")
            recommendations_json = self.get_ai_recommendations(batch, self.analyze_fields)
            if not recommendations_json:
                logger.error(f"  Failed to get recommendations for batch {batch_num}")
                continue
            recommendations = self.parse_recommendations(recommendations_json)
            for rec in recommendations:
                all_recommendations[str(rec.get('post_id', ''))] = rec
            logger.info(f"  ✓ Got {len(recommendations)} recommendations")

        logger.info("\n✓ Analysis complete!")
        logger.info(f"  Total recommendations: {len(all_recommendations)}")
        logger.info(f"  API calls: {self.api_calls}")
        logger.info(f"  Estimated cost: ${self.ai_cost:.4f}")

        # Map recommendations to posts
        for post in self.posts:
            post_id = str(post['post_id'])
            if post_id in all_recommendations:
                rec = all_recommendations[post_id]
                # Add only requested fields
                if 'title' in self.analyze_fields:
                    post['proposed_title'] = rec.get('proposed_title', post.get('title', ''))
                    post['title_reason'] = rec.get('title_reason', '')
                if 'meta_description' in self.analyze_fields:
                    post['proposed_meta_description'] = rec.get('proposed_meta_description', post.get('meta_description', ''))
                    post['meta_reason'] = rec.get('meta_reason', '')
                if 'categories' in self.analyze_fields:
                    post['proposed_category'] = rec.get('proposed_category', post.get('categories', ''))
                    post['category_reason'] = rec.get('category_reason', '')
                if 'site' in self.analyze_fields:
                    post['proposed_site'] = rec.get('proposed_site', post.get('site', ''))
                    post['site_reason'] = rec.get('site_reason', '')
                # Common fields
                post['ai_confidence'] = rec.get('confidence', 'Medium')
                post['ai_priority'] = rec.get('priority', 'Medium')
            else:
                # Add empty fields for consistency
                if 'title' in self.analyze_fields:
                    post['proposed_title'] = post.get('title', '')
                    post['title_reason'] = 'No AI recommendation'
                if 'meta_description' in self.analyze_fields:
                    post['proposed_meta_description'] = post.get('meta_description', '')
                    post['meta_reason'] = 'No AI recommendation'
                if 'categories' in self.analyze_fields:
                    post['proposed_category'] = post.get('categories', '')
                    post['category_reason'] = 'No AI recommendation'
                if 'site' in self.analyze_fields:
                    post['proposed_site'] = post.get('site', '')
                    post['site_reason'] = 'No AI recommendation'
                post['ai_confidence'] = 'Unknown'
                post['ai_priority'] = 'Medium'
            self.analyzed_posts.append(post)
        return len(self.analyzed_posts) > 0

    def export_results(self, output_file: Optional[str] = None, update_input: bool = False) -> str:
        """
        Export results to CSV.

        Args:
            output_file: Custom output path
            update_input: If True, update the input CSV file (creates backup)

        Returns:
            Path to exported file
        """
        if update_input:
            # Create backup of original file
            backup_file = self.csv_file.parent / (
                f"{self.csv_file.stem}_backup_{datetime.now().strftime('%Y%m%d_%H%M%S')}.csv"
            )
            shutil.copy2(self.csv_file, backup_file)
            logger.info(f"✓ Created backup: {backup_file}")
            output_file = self.csv_file
        elif not output_file:
            output_dir = Path(__file__).parent.parent / 'output'
            output_dir.mkdir(parents=True, exist_ok=True)
            timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
            output_file = output_dir / f'analyzed_posts_{timestamp}.csv'
        output_file = Path(output_file)
        output_file.parent.mkdir(parents=True, exist_ok=True)

        if not self.analyzed_posts:
            logger.error("No analyzed posts to export")
            return ""

        # Build fieldnames: original columns plus the recommendation columns.
        # analyze_posts() already added the new keys to each row, so only
        # append the ones not present yet to avoid duplicate CSV headers.
        original_fields = list(self.analyzed_posts[0].keys())
        new_fields = []
        if 'title' in self.analyze_fields:
            new_fields.extend(['proposed_title', 'title_reason'])
        if 'meta_description' in self.analyze_fields:
            new_fields.extend(['proposed_meta_description', 'meta_reason'])
        if 'categories' in self.analyze_fields:
            new_fields.extend(['proposed_category', 'category_reason'])
        if 'site' in self.analyze_fields:
            new_fields.extend(['proposed_site', 'site_reason'])
        new_fields.extend(['ai_confidence', 'ai_priority'])
        fieldnames = original_fields + [f for f in new_fields if f not in original_fields]

        logger.info(f"\nExporting results to: {output_file}")
        with open(output_file, 'w', newline='', encoding='utf-8') as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(self.analyzed_posts)
        logger.info(f"✓ Exported {len(self.analyzed_posts)} posts")
        return str(output_file)

    def run(self, output_file: Optional[str] = None, update_input: bool = False, batch_size: int = 10) -> str:
        """Run complete analysis."""
        if not self.load_csv():
            sys.exit(1)
        if not self.analyze_posts(batch_size=batch_size):
            logger.error("Failed to analyze posts")
            sys.exit(1)
        return self.export_results(output_file=output_file, update_input=update_input)


def main():
    """Main entry point with argument parsing."""
    import argparse

    logging.basicConfig(level=logging.INFO, format='%(message)s')
    parser = argparse.ArgumentParser(
        description='Enhanced AI analyzer with selective field analysis'
    )
    parser.add_argument('csv_file', help='Input CSV file')
    parser.add_argument('--output', '-o', help='Output CSV file (default: creates new file in output/)')
    parser.add_argument('--update', '-u', action='store_true', help='Update input CSV file (creates backup)')
    parser.add_argument('--fields', '-f', nargs='+',
                        choices=['title', 'meta_description', 'categories', 'site'],
                        help='Fields to analyze (default: all fields)')
    parser.add_argument('--batch-size', type=int, default=10, help='Batch size for AI analysis')
    args = parser.parse_args()

    analyzer = EnhancedPostAnalyzer(args.csv_file, analyze_fields=args.fields)
    output_file = analyzer.run(
        output_file=args.output,
        update_input=args.update,
        batch_size=args.batch_size,
    )
    logger.info(f"\n✓ Analysis complete! Results saved to: {output_file}")


if __name__ == '__main__':
    main()
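The `-u` path in `export_results` above copies the input aside before overwriting it in place. The backup pattern on its own, with hypothetical file names and a temporary directory standing in for the real CSV location:

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path

def backup_then_overwrite(csv_path: Path) -> Path:
    """Copy csv_path to a timestamped sibling so the original survives an in-place update."""
    stamp = datetime.now().strftime('%Y%m%d_%H%M%S')
    backup = csv_path.parent / f"{csv_path.stem}_backup_{stamp}.csv"
    shutil.copy2(csv_path, backup)  # copy2 also preserves file metadata
    return backup

with tempfile.TemporaryDirectory() as d:
    src = Path(d) / "all_posts.csv"
    src.write_text("post_id,title\n1,Hello\n")
    backup = backup_then_overwrite(src)
    same_content = backup.read_text() == src.read_text()
    named_ok = backup.name.startswith("all_posts_backup_")
print(named_ok, same_content)
```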


@@ -41,6 +41,11 @@ Examples:
    parser.add_argument('--verbose', '-v', action='store_true', help='Verbose output')
    parser.add_argument('--dry-run', action='store_true', help='Show what would be done')
    parser.add_argument('--top-n', type=int, default=10, help='Number of top posts for AI analysis')
    parser.add_argument('--fields', '-f', nargs='+',
                        choices=['title', 'meta_description', 'categories', 'site'],
                        help='Fields to analyze (for analyze command)')
    parser.add_argument('--update', '-u', action='store_true', help='Update input file (creates backup)')
    parser.add_argument('--output', '-o', help='Output file path')

    args = parser.parse_args()
@@ -65,6 +70,7 @@ Examples:
        'recategorize': cmd_recategorize,
        'seo_check': cmd_seo_check,
        'categories': cmd_categories,
        'category_propose': cmd_category_propose,
        'approve': cmd_approve,
        'full_pipeline': cmd_full_pipeline,
        'status': cmd_status,
@@ -110,7 +116,33 @@ def cmd_analyze(app, args):
        return 0

    csv_file = args.args[0] if args.args else None

    # Use enhanced analyzer if fields are specified or update flag is set
    if args.fields or args.update:
        from pathlib import Path
        import sys
        scripts_dir = Path(__file__).parent.parent.parent / 'scripts'
        sys.path.insert(0, str(scripts_dir))
        from enhanced_analyzer import EnhancedPostAnalyzer

        if not csv_file:
            csv_file = app._find_latest_export()
            if not csv_file:
                print("❌ No CSV file found. Provide one or run export first.")
                return 1

        print(f"Using enhanced analyzer with fields: {args.fields or 'all'}")
        analyzer = EnhancedPostAnalyzer(csv_file, analyze_fields=args.fields)
        output_file = analyzer.run(
            output_file=args.output,
            update_input=args.update
        )
        print(f"✅ Analysis completed! Results: {output_file}")
    else:
        app.analyze(csv_file)
    return 0
@@ -145,6 +177,37 @@ def cmd_categories(app, args):
    return 0


def cmd_category_propose(app, args):
    """Propose categories for posts."""
    if args.dry_run:
        print("Would propose categories for posts using AI")
        return 0

    csv_file = args.args[0] if args.args else None
    if not csv_file:
        csv_file = app._find_latest_export()
        if not csv_file:
            print("❌ No CSV file found. Provide one or run export first.")
            print("   Usage: seo category_propose <csv_file>")
            return 1

    from pathlib import Path
    import sys
    scripts_dir = Path(__file__).parent.parent.parent / 'scripts'
    sys.path.insert(0, str(scripts_dir))
    from category_proposer import CategoryProposer

    print(f"Proposing categories for: {csv_file}")
    proposer = CategoryProposer(csv_file)
    output_file = proposer.run(output_file=args.output)
    print(f"✅ Category proposals saved to: {output_file}")
    return 0
def cmd_approve(app, args):
    """Approve recommendations."""
    if args.dry_run:
@@ -192,10 +255,13 @@ SEO Automation CLI - Available Commands
Basic Commands:
  export                       Export all posts from WordPress sites
  analyze [csv_file]           Analyze posts with AI
  analyze -f title categories  Analyze specific fields only
  analyze -u                   Update input CSV with new columns
  recategorize [csv_file]      Recategorize posts with AI
  seo_check                    Check SEO quality of titles/descriptions
  categories                   Manage categories across sites
  category_propose [csv]       Propose categories based on content
  approve [files...]           Review and approve recommendations
  full_pipeline                Run complete workflow: export → analyze → seo_check
@@ -207,12 +273,18 @@ Options:
  --verbose, -v    Enable verbose logging
  --dry-run        Show what would be done without doing it
  --top-n N        Number of top posts for AI analysis (default: 10)
  --fields, -f     Fields to analyze: title, meta_description, categories, site
  --update, -u     Update input CSV file (creates backup)
  --output, -o     Output file path

Examples:
  seo export
  seo analyze
  seo analyze output/all_posts_2026-02-16.csv
  seo analyze -f title categories
  seo analyze -u -f meta_description
  seo category_propose
  seo approve output/category_proposals_*.csv
  seo full_pipeline
  seo status
""")