
function parse_log_line

Maturity: 44

Parses a structured log line string and extracts timestamp, logger name, log level, and message components into a dictionary.

File: /tf/active/vicechatdev/SPFCsync/monitor.py
Lines: 15 - 34
Complexity: simple

Purpose

This function parses log lines in the format "timestamp - logger_name - level - message" and converts them into structured data. It is useful for log analysis, monitoring systems, and log aggregation tools, where raw log strings must become queryable data structures. Malformed lines do not raise: the function returns None when the line does not match the pattern, or when the timestamp matches the pattern but is not a valid date (for example, a month of 13).
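The two failure paths described above can be sketched as follows. The function body repeats the one in the Source Code section so the snippet runs on its own; the sample lines are invented for illustration.

```python
import re
from datetime import datetime

def parse_log_line(line):
    """Same logic as the documented function, repeated so this sketch is runnable."""
    pattern = r'(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - (.*?) - (\w+) - (.*)'
    match = re.match(pattern, line.strip())
    if match:
        timestamp_str, logger_name, level, message = match.groups()
        try:
            timestamp = datetime.strptime(timestamp_str, '%Y-%m-%d %H:%M:%S,%f')
            return {'timestamp': timestamp, 'logger': logger_name,
                    'level': level, 'message': message}
        except ValueError:
            pass
    return None

# Illustrative inputs (not taken from a real log file).
no_match = 'Traceback (most recent call last):'                   # regex does not match
bad_date = '2024-13-45 14:30:45,123 - myapp - INFO - started'     # regex matches, date is invalid
good = '2024-01-15 14:30:45,123 - myapp - INFO - started'

print(parse_log_line(no_match))        # None
print(parse_log_line(bad_date))        # None
print(parse_log_line(good)['level'])   # INFO
```

The second case is the one guarded by the try/except: the digits satisfy the regular expression, but strptime rejects month 13.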

Source Code

def parse_log_line(line):
    """Parse a log line and extract information."""
    # Expected format: timestamp - name - level - message
    pattern = r'(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - (.*?) - (\w+) - (.*)'
    match = re.match(pattern, line.strip())
    
    if match:
        timestamp_str, logger_name, level, message = match.groups()
        try:
            timestamp = datetime.strptime(timestamp_str, '%Y-%m-%d %H:%M:%S,%f')
            return {
                'timestamp': timestamp,
                'logger': logger_name,
                'level': level,
                'message': message
            }
        except ValueError:
            pass
    
    return None

Parameters

Name Type Default Kind
line - - positional_or_keyword

Parameter Details

line: A string containing a single log line in the format 'YYYY-MM-DD HH:MM:SS,mmm - logger_name - LEVEL - message'. Leading and trailing whitespace is stripped before matching. Any string is accepted, but only lines that match this format are parsed; everything else yields None. A non-string argument such as None raises AttributeError, because the function calls line.strip().
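As a rough illustration of how the expected format maps onto the four capture groups, the sketch below applies the same regular expression directly. The constant name PATTERN and the sample line are assumptions made for this example only.

```python
import re

# Same expression as in the function; the name PATTERN is introduced here for illustration.
PATTERN = r'(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - (.*?) - (\w+) - (.*)'

# Sample line with surrounding whitespace and a ' - ' inside the message.
line = '  2024-01-15 14:30:45,123 - myapp.module - ERROR - Retry - attempt 2\n'

# Without stripping, re.match fails because the line does not start with a digit.
print(re.match(PATTERN, line))   # None

groups = re.match(PATTERN, line.strip()).groups()
print(groups)
# ('2024-01-15 14:30:45,123', 'myapp.module', 'ERROR', 'Retry - attempt 2')
```

The message group is greedy, so any further ' - ' separators after the level stay inside the message.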

Return Value

Returns a dictionary with keys 'timestamp' (datetime object), 'logger' (string), 'level' (string), and 'message' (string) if the line matches the expected format and timestamp is valid. Returns None if the line doesn't match the pattern or if timestamp parsing fails. The timestamp is converted from string to a datetime object using the format '%Y-%m-%d %H:%M:%S,%f'.
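A small sketch of the timestamp conversion on its own, using the same format string. It shows that the three digits after the comma end up in the datetime's microsecond field; the sample value is illustrative.

```python
from datetime import datetime

# %f accepts one to six digits and pads on the right, so ',123' means 123 ms.
ts = datetime.strptime('2024-01-15 14:30:45,123', '%Y-%m-%d %H:%M:%S,%f')

print(ts.microsecond)                          # 123000
print(ts.isoformat(timespec='milliseconds'))   # 2024-01-15T14:30:45.123
```

The resulting datetime is naive (no timezone attached), since the log format carries no offset.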

Dependencies

  • re
  • datetime

Required Imports

import re
from datetime import datetime

Usage Example

import re
from datetime import datetime

def parse_log_line(line):
    pattern = r'(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - (.*?) - (\w+) - (.*)'
    match = re.match(pattern, line.strip())
    
    if match:
        timestamp_str, logger_name, level, message = match.groups()
        try:
            timestamp = datetime.strptime(timestamp_str, '%Y-%m-%d %H:%M:%S,%f')
            return {
                'timestamp': timestamp,
                'logger': logger_name,
                'level': level,
                'message': message
            }
        except ValueError:
            pass
    
    return None

# Example usage
log_line = '2024-01-15 14:30:45,123 - myapp.module - ERROR - Connection timeout'
result = parse_log_line(log_line)
if result:
    print(f"Timestamp: {result['timestamp']}")
    print(f"Logger: {result['logger']}")
    print(f"Level: {result['level']}")
    print(f"Message: {result['message']}")
else:
    print("Failed to parse log line")

# Example with invalid line
invalid_line = 'This is not a valid log line'
result = parse_log_line(invalid_line)
print(result)  # Output: None

Best Practices

  • Always check whether the return value is None before accessing dictionary keys; subscripting None raises TypeError
  • The function expects exactly three digits (milliseconds) after the comma in the timestamp, so make sure your log format matches
  • The logger name is captured with a non-greedy group (.*?) so that it stops at the first ' - ' separator instead of swallowing the level and message; a logger name that itself contains ' - ' will therefore be split incorrectly
  • The function strips whitespace from input, so leading/trailing spaces are handled automatically
  • For batch processing of log files, skip or count lines that return None (for example, traceback continuation lines) so that one unparseable line does not stop the run
  • The level field is expected to be a word character sequence (\w+), typically values like DEBUG, INFO, WARNING, ERROR, CRITICAL
  • If you need to parse logs with different formats, consider parameterizing the regex pattern or creating format-specific variants
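The batch-processing advice above could look roughly like the sketch below. The helper summarize_levels and the sample lines are hypothetical and not part of monitor.py; parse_log_line is repeated so the snippet runs on its own.

```python
import re
from collections import Counter
from datetime import datetime

def parse_log_line(line):
    """Same logic as the documented function, repeated so this sketch is runnable."""
    pattern = r'(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - (.*?) - (\w+) - (.*)'
    match = re.match(pattern, line.strip())
    if match:
        timestamp_str, logger_name, level, message = match.groups()
        try:
            timestamp = datetime.strptime(timestamp_str, '%Y-%m-%d %H:%M:%S,%f')
            return {'timestamp': timestamp, 'logger': logger_name,
                    'level': level, 'message': message}
        except ValueError:
            pass
    return None

def summarize_levels(lines):
    """Hypothetical helper: count parsed entries per level and tally unparsed lines."""
    levels = Counter()
    skipped = 0
    for line in lines:
        entry = parse_log_line(line)
        if entry is None:        # blank lines, tracebacks, other formats
            skipped += 1
            continue
        levels[entry['level']] += 1
    return levels, skipped

# Illustrative input; in practice this would be an open file object.
sample = [
    '2024-01-15 14:30:45,123 - myapp.module - ERROR - Connection timeout',
    'Traceback (most recent call last):',
    '2024-01-15 14:30:46,001 - myapp.module - INFO - Retry - attempt 2',
    '',
]
levels, skipped = summarize_levels(sample)
print(dict(levels))   # {'ERROR': 1, 'INFO': 1}
print(skipped)        # 2
```

Counting the skipped lines, rather than silently dropping them, makes it easier to notice when a log file uses a different format than expected.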

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function parse_references_section 44.9% similar

    Parses a formatted references section string and extracts structured data including reference numbers, sources, and content previews using regular expressions.

    From: /tf/active/vicechatdev/improved_convert_disclosures_to_table.py
  • function main 42.3% similar

    Command-line interface function that orchestrates pattern-based extraction of poultry flock data, including data loading, pattern classification, geocoding, and export functionality.

    From: /tf/active/vicechatdev/pattern_based_extraction.py
  • function analyze_logs 40.8% similar

    Parses and analyzes log files to extract synchronization statistics, error counts, and performance metrics for a specified time period.

    From: /tf/active/vicechatdev/SPFCsync/monitor.py
  • function tail_logs 39.2% similar

    Reads and displays the last N lines from a specified log file, with error handling for missing files and read failures.

    From: /tf/active/vicechatdev/SPFCsync/monitor.py
  • function main_v6 38.4% similar

    Command-line interface function that orchestrates the generation of meeting minutes from a transcript file using either GPT-4o or Gemini LLM models.

    From: /tf/active/vicechatdev/advanced_meeting_minutes_generator.py