function main_v5
Converts a markdown file containing warranty disclosure data into multiple tabular formats (CSV, Excel, Word) with timestamped output files.
File: /tf/active/vicechatdev/convert_disclosures_to_table.py
Lines: 373 - 429
Complexity: moderate
Purpose
This function serves as the main entry point for a markdown-to-table conversion pipeline specifically designed for Project Victoria disclosure documents. It reads a markdown file, extracts warranty data using a helper function, and generates three types of reports: CSV (summary and detailed), Excel workbook, and Word document. The function includes error handling for missing files and missing data, logs progress, and provides a comprehensive summary of the conversion process.
Source Code
def main():
    """Main function to convert markdown to tabular formats."""
    # Input and output paths
    input_file = Path('/tf/active/project_victoria_disclosures.md')
    output_dir = Path('/tf/active')

    # Check if input file exists
    if not input_file.exists():
        logger.error(f"Input file not found: {input_file}")
        return

    # Read markdown content
    logger.info(f"Reading markdown file: {input_file}")
    with open(input_file, 'r', encoding='utf-8') as f:
        content = f.read()

    # Extract warranty data
    logger.info("Extracting warranty data...")
    warranties = extract_warranty_data(content)
    if not warranties:
        logger.error("No warranties extracted from markdown file")
        return

    # Generate timestamp for output files
    timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')

    # Create CSV report
    csv_file = output_dir / f'project_victoria_disclosures_table_{timestamp}.csv'
    create_csv_report(warranties, csv_file)

    # Create Excel report
    excel_file = output_dir / f'project_victoria_disclosures_table_{timestamp}.xlsx'
    excel_created = create_excel_report(warranties, excel_file)

    # Create Word report
    word_file = output_dir / f'project_victoria_disclosures_{timestamp}.docx'
    word_created = create_word_report(warranties, word_file)

    # Print summary
    print("\n" + "="*60)
    print("PROJECT VICTORIA - DISCLOSURE TABLE CONVERSION COMPLETE")
    print("="*60)
    print(f"Total warranties processed: {len(warranties)}")
    print(f"Output files created:")
    print(f" - Summary CSV: {csv_file}")
    print(f" - Detailed CSV: {csv_file.with_name(csv_file.stem + '_detailed.csv')}")
    if excel_created:
        print(f" - Excel report: {excel_file}")
    else:
        print(f" - Excel report: SKIPPED (openpyxl not available)")
    if word_created:
        print(f" - Word report: {word_file}")
    else:
        print(f" - Word report: SKIPPED (python-docx not available)")
    print("\nFiles are ready for review and analysis!")
    print("="*60)
Return Value
Returns None. The function performs side effects by creating output files and printing status messages to console. It may exit early (return None) if the input file is not found or if no warranties are extracted.
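Because every path through main() returns None, a caller cannot tell a successful run from an early exit without reading the log. The following is a minimal sketch of one way to surface the outcome. It assumes a hypothetical wrapper named run_conversion and hypothetical exit codes, neither of which exists in the original module, and it takes the extraction step as a parameter so that it stays self-contained.

```python
from pathlib import Path
from typing import Callable, List

# Hypothetical exit codes; the original main() returns None in every case.
EXIT_OK = 0
EXIT_NO_INPUT = 1
EXIT_NO_DATA = 2


def run_conversion(input_file: Path, extract: Callable[[str], List[dict]]) -> int:
    """Mirror main()'s two early exits, but report them as status codes."""
    if not input_file.exists():
        return EXIT_NO_INPUT
    content = input_file.read_text(encoding="utf-8")
    warranties = extract(content)
    if not warranties:
        return EXIT_NO_DATA
    # Report generation (CSV, Excel, Word) would follow here.
    return EXIT_OK
```

A script entry point could then pass the result to sys.exit() so that schedulers and shell pipelines see a non-zero status when nothing was converted.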
Dependencies
pathlib, logging, datetime, re, csv, pandas, html, openpyxl, python-docx
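The summary printed by main() reports the Excel and Word steps as SKIPPED when openpyxl or python-docx is unavailable. As a sketch, availability can be checked up front without importing the packages. The helper name optional_dependency_status is hypothetical and not part of the original module. Note that python-docx is imported under the module name docx.

```python
from importlib.util import find_spec

# Maps the pip package name to the module name it is imported as.
OPTIONAL_PACKAGES = {"openpyxl": "openpyxl", "python-docx": "docx"}


def optional_dependency_status() -> dict:
    """Return {package name: True/False} without importing the packages."""
    return {
        package: find_spec(module) is not None
        for package, module in OPTIONAL_PACKAGES.items()
    }


if __name__ == "__main__":
    for package, available in optional_dependency_status().items():
        print(f"{package}: {'available' if available else 'missing'}")
```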
Required Imports
import re
import csv
import pandas as pd
from pathlib import Path
import html
import logging
from datetime import datetime
Conditional/Optional Imports
These imports are only needed under specific conditions:
from docx import Document
Condition: required for Word document generation (create_word_report function)
Optional
from docx.shared import Inches
Condition: required for Word document formatting
Optional
from docx.enum.style import WD_STYLE_TYPE
Condition: required for Word document styling
Optional
import openpyxl
Condition: required for Excel file generation (create_excel_report function)
Optional
Usage Example
# Ensure logger is configured
import logging
from datetime import datetime  # main() calls datetime.now() for the timestamp
from pathlib import Path

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Define required helper functions (simplified examples)
def extract_warranty_data(content):
    # Extract warranty data from markdown
    return [{'id': '1', 'title': 'Sample Warranty'}]

def create_csv_report(warranties, csv_file):
    # Create CSV report
    pass

def create_excel_report(warranties, excel_file):
    # Create Excel report
    return True

def create_word_report(warranties, word_file):
    # Create Word report
    return True

# Ensure input file exists (write sample content only if no file is present,
# so a real disclosure file is never overwritten)
input_file = Path('/tf/active/project_victoria_disclosures.md')
input_file.parent.mkdir(parents=True, exist_ok=True)
if not input_file.exists():
    input_file.write_text('# Sample markdown content', encoding='utf-8')

# Run the main function
main()
Best Practices
- Ensure the input markdown file exists at the hardcoded path before calling this function
- Configure logging before calling main() to capture progress and error messages
- Verify that all helper functions (extract_warranty_data, create_csv_report, create_excel_report, create_word_report) are defined in the same module
- Install optional dependencies (openpyxl, python-docx) for full functionality; without them the corresponding report is reported as SKIPPED in the summary and the remaining outputs are still produced
- Ensure sufficient disk space in the output directory as multiple files will be created
- The function uses hardcoded paths - consider refactoring to accept parameters for production use
- Timestamped filenames (one-second resolution) prevent collisions between runs but accumulate over time, so review generated files promptly
- The function does not clean up old output files - implement a cleanup strategy if running repeatedly
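Two of the points above, the hardcoded paths and the accumulation of output files, can be sketched as follows. The names build_output_paths and prune_old_outputs are hypothetical and not part of the original module. The file name patterns and the timestamp format are copied from the source code, and the pruning rule (keep the newest N runs) is an assumption about what a cleanup strategy might look like.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import Dict, List, Optional

# Matches the YYYYMMDD_HHMMSS stamp that main() embeds in output file names.
TIMESTAMP = re.compile(r"_(\d{8}_\d{6})")


def build_output_paths(output_dir: Path, now: Optional[datetime] = None) -> Dict[str, Path]:
    """Build the timestamped output paths from a caller-supplied directory."""
    timestamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return {
        "csv": output_dir / f"project_victoria_disclosures_table_{timestamp}.csv",
        "excel": output_dir / f"project_victoria_disclosures_table_{timestamp}.xlsx",
        "word": output_dir / f"project_victoria_disclosures_{timestamp}.docx",
    }


def prune_old_outputs(output_dir: Path, keep: int = 5) -> List[Path]:
    """Delete files from all but the newest `keep` runs; return deleted paths.

    Files are grouped by the timestamp in their name, so the summary CSV,
    detailed CSV, Excel, and Word files of one run are kept or removed together.
    """
    files = [
        p for p in output_dir.glob("project_victoria_disclosures_*")
        if TIMESTAMP.search(p.name)
    ]
    # Timestamps in this format sort chronologically as plain strings.
    stamps = sorted({TIMESTAMP.search(p.name).group(1) for p in files})
    stale = set(stamps[:-keep]) if keep > 0 else set(stamps)
    deleted = []
    for path in files:
        if TIMESTAMP.search(path.name).group(1) in stale:
            path.unlink()
            deleted.append(path)
    return deleted
```

The input file project_victoria_disclosures.md carries no timestamp and does not match the glob pattern, so this pruning sketch leaves it in place.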
Similar Components
AI-powered semantic similarity - components with related functionality:
- function main_v1: 94.2% similar
- function main_v2: 77.7% similar
- function create_enhanced_word_document: 76.1% similar
- function create_enhanced_word_document_v1: 75.7% similar
- function main_v12: 74.4% similar