generate_action_report - Code Extractor

function generate_action_report

Maturity: 44

Generates a comprehensive corrective action report for data quality issues in treatment records, categorizing actions by urgency and providing impact assessment.

File:
/tf/active/vicechatdev/data_quality_dashboard.py

Lines:
323 - 373

Complexity:
moderate

Purpose

This function analyzes data quality issues in veterinary treatment records and prints a formatted report to the console. It groups corrective actions into immediate, short-term, and long-term tiers, covering treatments dated in the year 1900 (typically 1900-01-01 placeholder dates), flocks where every treatment has a timing issue, and treatments recorded more than a year after the flock end date. The report lists the flocks affected by 1900 dates, counts the records in each category, computes an overall error rate against a hardcoded treatment total, and closes with a fixed impact assessment to guide data quality improvement efforts.

Source Code

def generate_action_report(before_start, after_end, severe_cases, flocks_issues):
    """Generate a corrective action report."""
    print("\nCORRECTIVE ACTION REPORT")
    print("=" * 40)
    
    # Immediate actions
    print("IMMEDIATE ACTIONS (Within 1 week):")
    
    # 1900 date fixes
    errors_1900 = before_start[before_start['AdministeredDate'].dt.year == 1900]
    if len(errors_1900) > 0:
        print(f"1. Fix {len(errors_1900)} treatments with 1900-01-01 dates")
        print("   Action: Update AdministeredDate to correct values")
        print("   Affected flocks:")
        for flock in errors_1900['FlockCD'].unique():
            print(f"     - {flock}")
    
    # Perfect timing issue flocks
    perfect_issues = flocks_issues[flocks_issues['TimingIssueRate'] == 1.0]
    print(f"\n2. Review {len(perfect_issues)} flocks with 100% timing issues")
    print("   Action: Check if flock dates or treatment dates are incorrect")
    
    # Short-term actions
    print(f"\nSHORT-TERM ACTIONS (Within 1 month):")
    print("1. Implement data validation rules")
    print("   - Treatment dates must be within flock lifespan ± 7 days")
    print("   - Flag dates before 2000 or after current date + 1 year")
    print("   - Require confirmation for treatments outside normal range")
    
    extreme_future = after_end[after_end['DaysAfterEnd'] > 365]
    if len(extreme_future) > 0:
        print(f"\n2. Review {len(extreme_future)} treatments >1 year after flock end")
        print("   Action: Verify if these are data entry errors")
    
    # Long-term actions
    print(f"\nLONG-TERM ACTIONS (Within 3 months):")
    print("1. Implement automated data quality monitoring")
    print("2. Create monthly data quality reports")
    print("3. Train staff on proper date entry procedures")
    print("4. Review and update data entry interfaces")
    
    # Cost/benefit analysis
    total_issues = len(before_start) + len(after_end)
    total_treatments = 247640  # From previous analysis
    error_rate = (total_issues / total_treatments) * 100
    
    print(f"\nIMPACT ASSESSMENT:")
    print(f"- Current error rate: {error_rate:.2f}% of all treatments")
    print(f"- Data quality improvement potential: High")
    print(f"- Estimated effort: Medium (primarily data validation setup)")
    print(f"- Business impact: Improved analysis accuracy and regulatory compliance")

Parameters

Name	Type	Default	Kind
`before_start`	-	-	positional_or_keyword
`after_end`	-	-	positional_or_keyword
`severe_cases`	-	-	positional_or_keyword
`flocks_issues`	-	-	positional_or_keyword

Parameter Details

before_start: A pandas DataFrame containing treatment records that occurred before the flock start date. Must have columns 'AdministeredDate' (datetime) and 'FlockCD' (flock identifier). Used to identify and report on pre-start timing issues.

after_end: A pandas DataFrame containing treatment records that occurred after the flock end date. Must have a 'DaysAfterEnd' column (numeric) indicating how many days after flock end the treatment occurred. Used to identify post-end timing issues and extreme future dates.

severe_cases: A pandas DataFrame containing severe data quality cases. This parameter is accepted but not currently used in the function implementation.

flocks_issues: A pandas DataFrame containing flock-level summary statistics. Must have a 'TimingIssueRate' column (float between 0 and 1) representing the proportion of treatments with timing issues per flock. Used to identify flocks whose timing issue rate is exactly 1.0 (100%).
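
The column requirements above can be checked before the report is generated. The following is a minimal sketch, not part of the original module: the helper name check_report_inputs is hypothetical, and it only covers the columns that the function actually reads.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype, is_numeric_dtype


def check_report_inputs(before_start, after_end, flocks_issues):
    """Return a list of problems found in the inputs (empty if none).

    Hypothetical helper, not part of data_quality_dashboard.py.
    """
    problems = []

    # Columns that generate_action_report reads from each frame.
    required = {
        "before_start": (before_start, ["AdministeredDate", "FlockCD"]),
        "after_end": (after_end, ["DaysAfterEnd"]),
        "flocks_issues": (flocks_issues, ["TimingIssueRate"]),
    }
    for name, (frame, columns) in required.items():
        for column in columns:
            if column not in frame.columns:
                problems.append(f"{name} is missing column '{column}'")

    # The .dt accessor only works on datetime-like columns.
    if "AdministeredDate" in before_start.columns:
        if not is_datetime64_any_dtype(before_start["AdministeredDate"]):
            problems.append("before_start['AdministeredDate'] is not datetime")

    # The > 365 comparison expects numeric day counts.
    if "DaysAfterEnd" in after_end.columns:
        if not is_numeric_dtype(after_end["DaysAfterEnd"]):
            problems.append("after_end['DaysAfterEnd'] is not numeric")

    return problems
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue at once; raising an exception instead would be an equally reasonable choice.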

Return Value

This function returns None. It produces output by printing a formatted report directly to the console (stdout). The report includes sections for immediate actions, short-term actions, long-term actions, and impact assessment.
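
Because the report is only printed, a caller that needs the text (for a log file or a test) has to capture stdout. One way to do that with the standard library is sketched below; the wrapper name capture_printed_output is hypothetical and works with any callable that prints.

```python
import io
from contextlib import redirect_stdout


def capture_printed_output(func, *args, **kwargs):
    """Call func and return everything it printed to stdout as a string.

    Hypothetical wrapper, not part of data_quality_dashboard.py.
    """
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        func(*args, **kwargs)
    return buffer.getvalue()


# Demonstration with the built-in print; generate_action_report and its
# four DataFrame arguments could be passed the same way.
text = capture_printed_output(print, "CORRECTIVE ACTION REPORT")
```

The function's own return value (None) is discarded here, which matches how generate_action_report behaves.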

Dependencies

pandas

Required Imports

import pandas as pd

Usage Example

import pandas as pd

# Assumes data_quality_dashboard.py is on the import path
from data_quality_dashboard import generate_action_report

# Create sample data
before_start = pd.DataFrame({
    'AdministeredDate': pd.to_datetime(['1900-01-01', '2020-01-15', '1900-01-01']),
    'FlockCD': ['FLOCK001', 'FLOCK002', 'FLOCK003']
})

after_end = pd.DataFrame({
    'DaysAfterEnd': [10, 400, 500],
    'FlockCD': ['FLOCK004', 'FLOCK005', 'FLOCK006']
})

severe_cases = pd.DataFrame()  # Not used but required

flocks_issues = pd.DataFrame({
    'FlockCD': ['FLOCK001', 'FLOCK002', 'FLOCK003'],
    'TimingIssueRate': [1.0, 0.5, 1.0]
})

# Generate the report
generate_action_report(before_start, after_end, severe_cases, flocks_issues)

# Output will be printed to console with formatted sections
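
To check the printed counts against the sample data, the three filters the function applies can be run on their own. This sketch repeats the filter expressions from the source code on the same illustrative frames; it does not call the function itself.

```python
import pandas as pd

# Same illustrative sample data as the usage example
before_start = pd.DataFrame({
    "AdministeredDate": pd.to_datetime(["1900-01-01", "2020-01-15", "1900-01-01"]),
    "FlockCD": ["FLOCK001", "FLOCK002", "FLOCK003"],
})
after_end = pd.DataFrame({"DaysAfterEnd": [10, 400, 500]})
flocks_issues = pd.DataFrame({"TimingIssueRate": [1.0, 0.5, 1.0]})

# Filter expressions copied from generate_action_report
errors_1900 = before_start[before_start["AdministeredDate"].dt.year == 1900]
perfect_issues = flocks_issues[flocks_issues["TimingIssueRate"] == 1.0]
extreme_future = after_end[after_end["DaysAfterEnd"] > 365]

print(len(errors_1900), list(errors_1900["FlockCD"].unique()))
print(len(perfect_issues))
print(len(extreme_future))
```

With this data the report should list two treatments dated in 1900 (flocks FLOCK001 and FLOCK003), two flocks with 100% timing issues, and two treatments more than one year after flock end.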

Best Practices

Ensure all input DataFrames have the required columns with the correct data types before calling this function.
The 'AdministeredDate' column in before_start must be a datetime type; otherwise the .dt.year lookup raises an AttributeError.
The hardcoded total_treatments value (247,640) should become a parameter so that the error rate is correct for other datasets.
The severe_cases parameter is unused; remove it or implement its handling in a future version.
Validate that DaysAfterEnd values are numeric to avoid errors in the extreme_future comparison.
The function prints directly to the console; consider returning a string or structured data instead, which makes the report easier to test, redirect to a file, and use programmatically.
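
The recommendation to parameterize total_treatments can be illustrated by pulling the error rate arithmetic into its own function. This is a sketch under that assumption, not the module's code: the name timing_error_rate and the positive-total check are additions for illustration.

```python
def timing_error_rate(n_before_start, n_after_end, total_treatments):
    """Percentage of all treatments flagged as before-start or after-end.

    Hypothetical helper; mirrors the formula in generate_action_report
    but takes the treatment total as an argument instead of hardcoding it.
    """
    if total_treatments <= 0:
        raise ValueError("total_treatments must be positive")
    total_issues = n_before_start + n_after_end
    return (total_issues / total_treatments) * 100


# Six flagged rows against the total hardcoded in the original function
rate = timing_error_rate(3, 3, 247640)
print(f"- Current error rate: {rate:.2f}% of all treatments")
```

With six flagged rows and the hardcoded total of 247,640, the rate is about 0.0024%, which the two-decimal format displays as 0.00%. For small samples, more decimal places or a total that matches the dataset gives a more meaningful figure.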

Similar Components

AI-powered semantic similarity - components with related functionality:

function show_critical_errors 74.2% similar

Displays critical data quality errors in treatment records, focusing on date anomalies including 1900 dates, extreme future dates, and extreme past dates relative to flock lifecycles.
From: /tf/active/vicechatdev/data_quality_dashboard.py

function create_data_quality_dashboard 70.8% similar

Creates an interactive command-line dashboard for analyzing data quality issues in treatment timing data, specifically focusing on treatments administered outside of flock lifecycle dates.
From: /tf/active/vicechatdev/data_quality_dashboard.py

function create_data_quality_dashboard_v1 68.4% similar

Creates an interactive data quality dashboard for analyzing treatment timing issues in poultry flock management data by loading and processing CSV files containing timing anomalies.
From: /tf/active/vicechatdev/data_quality_dashboard.py

function show_problematic_flocks 67.8% similar

Analyzes and displays problematic flocks from a dataset by identifying those with systematic timing issues in their treatment records, categorizing them by severity and volume.
From: /tf/active/vicechatdev/data_quality_dashboard.py

function analyze_temporal_trends 66.6% similar

Analyzes and prints temporal trends in timing issues for treatments that occur before flock start dates or after flock end dates, breaking down occurrences by year and month.
From: /tf/active/vicechatdev/data_quality_dashboard.py
