🔍 Code Extractor

function main_v1

Maturity: 49

Orchestrates the conversion of an improved markdown file containing warranty disclosures into multiple tabular formats (CSV, Excel, Word) with timestamp-based file naming.

File: /tf/active/vicechatdev/improved_convert_disclosures_to_table.py
Lines: 421 - 480
Complexity: moderate

Purpose

This function is the main entry point for processing Project Victoria warranty disclosure data. It reads a markdown file, extracts warranty information and references, writes the outputs in several formats (summary CSV, detailed CSV, Excel workbook, Word document), and prints a summary of the conversion. It checks that the input file exists, logs errors, and degrades gracefully when the optional dependencies (openpyxl, python-docx) are unavailable.

Source Code

def main():
    """Main function to convert improved markdown to tabular formats."""
    # Input and output paths
    input_file = Path('/tf/active/project_victoria_disclosures_improved.md')
    output_dir = Path('/tf/active')
    
    # Check if input file exists
    if not input_file.exists():
        logger.error(f"Input file not found: {input_file}")
        return
    
    # Read markdown content
    logger.info(f"Reading improved markdown file: {input_file}")
    with open(input_file, 'r', encoding='utf-8') as f:
        content = f.read()
    
    # Extract warranty data with references
    logger.info("Extracting warranty data from improved format...")
    warranties, references_section = extract_warranty_data_improved(content)
    
    if not warranties:
        logger.error("No warranties extracted from markdown file")
        return
    
    # Generate timestamp for output files
    timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
    
    # Create CSV report
    csv_file = output_dir / f'project_victoria_disclosures_improved_table_{timestamp}.csv'
    create_csv_report_improved(warranties, csv_file)
    
    # Create Excel report
    excel_file = output_dir / f'project_victoria_disclosures_improved_table_{timestamp}.xlsx'
    excel_created = create_excel_report_improved(warranties, references_section, excel_file)
    
    # Create Word report
    word_file = output_dir / f'project_victoria_disclosures_improved_{timestamp}.docx'
    word_created = create_word_report_improved(warranties, references_section, word_file)
    
    # Print summary
    print("\n" + "="*70)
    print("PROJECT VICTORIA - IMPROVED DISCLOSURE TABLE CONVERSION COMPLETE")
    print("="*70)
    print(f"Total warranties processed: {len(warranties)}")
    if references_section:
        ref_count = len(re.findall(r'\*\*\[(\d+)\]\*\*', references_section))
        print(f"Total references processed: {ref_count}")
    print(f"Output files created:")
    print(f"  - Summary CSV: {csv_file}")
    print(f"  - Detailed CSV: {csv_file.with_name(csv_file.stem + '_detailed.csv')}")
    if excel_created:
        print(f"  - Excel report: {excel_file}")
    else:
        print(f"  - Excel report: SKIPPED (openpyxl not available)")
    if word_created:
        print(f"  - Word report: {word_file}")
    else:
        print(f"  - Word report: SKIPPED (python-docx not available)")
    print("\nFiles are ready for review and analysis!")
    print("="*70)
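
The reference count in the summary comes from a regular expression that matches bold, bracketed numbers such as **[12]**. The following minimal sketch shows what that pattern does and does not count. The sample references text is invented for illustration and is not taken from the actual disclosure file.

```python
import re

# Same pattern as in main(): a bold, bracketed number such as **[12]**
REFERENCE_PATTERN = r'\*\*\[(\d+)\]\*\*'

# Hypothetical references section, invented for this example
sample_references = (
    "**[1]** Share purchase agreement, clause 4.2\n"
    "**[2]** Data room index, folder 7\n"
    "See also [3], which is not bold and is therefore not counted.\n"
)

matches = re.findall(REFERENCE_PATTERN, sample_references)
ref_count = len(matches)

print(matches)    # ['1', '2']
print(ref_count)  # 2
```

Because re.findall counts every occurrence, a marker that appears twice in the references section is counted twice; the figure is a count of markers, not of unique reference numbers.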

Return Value

Returns None. The function works entirely through side effects: it writes output files and prints status information to the console. It also returns None early if the input file is not found or if no warranties are extracted, so the return value does not distinguish success from failure.
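
Since the return value is always None, a caller that needs to know whether anything was produced has to look at the output directory (or the logs). One possible approach, sketched below, compares the directory listing before and after the call. The helper run_and_list_new_files and the two stand-in functions are hypothetical and are not part of the module; the stand-ins only imitate main() so the example runs on its own.

```python
import tempfile
from pathlib import Path


def run_and_list_new_files(main_func, output_dir):
    """Call main_func and return the files that appeared in output_dir."""
    output_dir = Path(output_dir)
    before = set(output_dir.iterdir())
    main_func()  # returns None whether it succeeded or bailed out early
    after = set(output_dir.iterdir())
    return sorted(after - before)


# Demonstration with stand-ins for main(), run in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)

    def fake_main_success():
        # Imitates a run that writes one report file
        (tmp_path / "report_20240101_120000.csv").write_text("id,text\n")

    def fake_main_early_return():
        return  # imitates the "input file not found" branch

    created = [p.name for p in run_and_list_new_files(fake_main_success, tmp_path)]
    skipped = run_and_list_new_files(fake_main_early_return, tmp_path)

print(created)  # ['report_20240101_120000.csv']
print(skipped)  # []
```

An empty result suggests an early return; the log output then indicates which check failed.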

Dependencies

  • re
  • csv
  • pandas
  • pathlib
  • html
  • logging
  • datetime
  • openpyxl
  • python-docx

Required Imports

import re
import csv
import pandas as pd
from pathlib import Path
import html
import logging
from datetime import datetime

Conditional/Optional Imports

These imports are optional and only needed under specific conditions:

from docx import Document

Condition (optional): only if creating Word document reports (function degrades gracefully if unavailable)

from docx.shared import Inches

Condition (optional): only if creating Word document reports with formatted tables

from docx.enum.style import WD_STYLE_TYPE

Condition (optional): only if creating Word document reports with custom styles

from docx.enum.text import WD_PARAGRAPH_ALIGNMENT

Condition (optional): only if creating Word document reports with text alignment

import openpyxl

Condition (optional): only if creating Excel reports (function degrades gracefully if unavailable)
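
The source shown here does not reveal how the module detects the optional packages. A common pattern, sketched below as an assumption rather than as the module's actual code, is to probe for each package once and have the report function return False instead of raising. The names is_available, DOCX_AVAILABLE, OPENPYXL_AVAILABLE, and create_word_report_sketch are hypothetical.

```python
import importlib.util


def is_available(module_name):
    """Return True if the named top-level module can be imported."""
    return importlib.util.find_spec(module_name) is not None


# python-docx is imported as "docx"; openpyxl is imported under its own name
DOCX_AVAILABLE = is_available("docx")
OPENPYXL_AVAILABLE = is_available("openpyxl")


def create_word_report_sketch(path):
    """Hypothetical stand-in: report False rather than fail when docx is missing."""
    if not DOCX_AVAILABLE:
        return False
    from docx import Document  # imported lazily, only when available
    document = Document()
    document.add_paragraph("placeholder")
    document.save(str(path))
    return True


print(f"python-docx available: {DOCX_AVAILABLE}")
print(f"openpyxl available: {OPENPYXL_AVAILABLE}")
```

A boolean result of this kind is what main() relies on when it prints either the file path or "SKIPPED" for the Excel and Word reports.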

Usage Example

# Ensure logger is configured
import logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Ensure required helper functions are defined
# (extract_warranty_data_improved, create_csv_report_improved, etc.)

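# Make sure the input directory exists. The markdown file itself must
# already be present at this path; otherwise main() logs an error and returns.
from pathlib import Path
input_file = Path('/tf/active/project_victoria_disclosures_improved.md')
input_file.parent.mkdir(parents=True, exist_ok=True)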

# Run the main function
if __name__ == '__main__':
    main()

# Output files will be created in /tf/active/ with timestamps:
# - project_victoria_disclosures_improved_table_YYYYMMDD_HHMMSS.csv
# - project_victoria_disclosures_improved_table_YYYYMMDD_HHMMSS_detailed.csv
# - project_victoria_disclosures_improved_table_YYYYMMDD_HHMMSS.xlsx
# - project_victoria_disclosures_improved_YYYYMMDD_HHMMSS.docx

Best Practices

  • Ensure the input markdown file exists at the hardcoded path before calling this function
  • Verify write permissions on the output directory '/tf/active' before execution
  • Configure the module-level logger before calling main() to capture all log messages
  • Install the optional dependencies (openpyxl, python-docx) for full functionality; without them the function still runs but skips the Excel and Word reports
  • The function uses hardcoded input and output paths - consider refactoring it to accept them as parameters for production use
  • Each run writes a new set of timestamped files; the one-second timestamps prevent filename collisions between runs, but files accumulate, so review and clean up old outputs regularly
  • Ensure all helper functions (extract_warranty_data_improved, create_csv_report_improved, etc.) are properly defined in the module
  • The function performs early returns on errors - check logs if output files are not created
  • Consider wrapping the function call in try-except blocks for production environments to handle unexpected errors
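
As an illustration of the refactoring suggested above, the sketch below moves the hardcoded output directory and the timestamp into parameters while keeping the file naming scheme used in main(). The helper build_output_paths and the example directory are hypothetical; they are not part of the module.

```python
from datetime import datetime
from pathlib import Path


def build_output_paths(output_dir, timestamp=None,
                       base="project_victoria_disclosures_improved"):
    """Return the four output paths using the naming scheme seen in main()."""
    if timestamp is None:
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    output_dir = Path(output_dir)
    csv_file = output_dir / f"{base}_table_{timestamp}.csv"
    return {
        "summary_csv": csv_file,
        # Same derivation as in main(): stem plus "_detailed.csv"
        "detailed_csv": csv_file.with_name(csv_file.stem + "_detailed.csv"),
        "excel": output_dir / f"{base}_table_{timestamp}.xlsx",
        "word": output_dir / f"{base}_{timestamp}.docx",
    }


# Example directory and fixed timestamp, chosen only for illustration
paths = build_output_paths("/tmp/reports", timestamp="20240101_120000")
print(paths["summary_csv"].name)
print(paths["detailed_csv"].name)
print(paths["word"].name)
```

Passing the timestamp in explicitly also makes the naming easy to test, since the result no longer depends on the current time.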

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function main_v5 94.2% similar

    Converts a markdown file containing warranty disclosure data into multiple tabular formats (CSV, Excel, Word) with timestamped output files.

    From: /tf/active/vicechatdev/convert_disclosures_to_table.py
  • function main_v2 82.5% similar

    Main orchestration function that reads an improved markdown file and converts it to an enhanced Word document with comprehensive formatting, including table of contents, proper heading hierarchy, and bibliography.

    From: /tf/active/vicechatdev/enhanced_word_converter_fixed.py
  • function create_enhanced_word_document 77.0% similar

    Converts markdown-formatted warranty disclosure content into a formatted Microsoft Word document with hierarchical headings, styled text, lists, and special formatting for block references.

    From: /tf/active/vicechatdev/improved_word_converter.py
  • function main_v12 76.8% similar

    Main entry point function that reads a markdown file, converts it to an enhanced Word document with preserved heading structure, and saves it with a timestamped filename.

    From: /tf/active/vicechatdev/improved_word_converter.py
  • function create_enhanced_word_document_v1 75.2% similar

    Converts markdown content into a formatted Microsoft Word document with proper styling, table of contents, warranty sections, and reference handling for Project Victoria warranty disclosures.

    From: /tf/active/vicechatdev/enhanced_word_converter_fixed.py