🔍 Code Extractor

function parse_directory_listing_debug

Maturity: 46

A debug version of a directory listing parser that extracts and categorizes file entries with detailed console output for troubleshooting.

File: /tf/active/vicechatdev/e-ink-llm/cloudtest/debug_rm_parsing.py
Lines: 8 - 66
Complexity: moderate

Purpose

This function parses a directory listing string containing file entries in a specific format (hash:flags:uuid_component:type:size). It separates entries into child objects (pure UUIDs) and data components (UUIDs with file extensions), while printing detailed debug information about the parsing process. The function is specifically designed for debugging parsing logic, particularly for identifying .rm files and other component types.
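To make the entry format concrete, here is a minimal sketch that applies the same regular expression as the source to two made-up entries and classifies them the way the function does. The helper name `classify` and the sample lines are illustrative assumptions, not part of the original module.

```python
import re

# Same pattern string as in the function's source.
ENTRY_PATTERN = r'^([a-f0-9]{64}):([0-9a-fA-F]+):([a-f0-9-/]+(?:\.[^:]+)?):(\d+):(\d+)$'

# Made-up sample entries, for illustration only.
data_line = "b" * 64 + ":0002:87654321-4321-4321-4321-cba987654321.rm:2:2048"
child_line = "c" * 64 + ":0003:abcdef12-3456-7890-abcd-ef1234567890:3:512"


def classify(line: str) -> str:
    """Return 'data_component', 'child_object', or 'no_match' for one line."""
    match = re.match(ENTRY_PATTERN, line, re.IGNORECASE)
    if not match:
        return "no_match"
    uuid_component = match.group(3)
    # A dot anywhere in the third field marks a data component.
    return "data_component" if "." in uuid_component else "child_object"


for sample in (data_line, child_line, "not an entry"):
    print(classify(sample))
```

The dot test is the only thing that separates the two categories, so any third field containing a dot lands in 'data_components'.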

Source Code

def parse_directory_listing_debug(content: str):
    """Debug version of parse_directory_listing"""
    result = {
        'child_objects': [],
        'data_components': []
    }
    
    lines = content.split('\n')
    if lines and lines[0].strip().isdigit():
        lines = lines[1:]  # Skip count line
    
    entry_pattern = r'^([a-f0-9]{64}):([0-9a-fA-F]+):([a-f0-9-/]+(?:\.[^:]+)?):(\d+):(\d+)$'
    
    print("Parsing lines:")
    for line in lines:
        line = line.strip()
        if not line:
            continue
            
        print(f"  Line: {repr(line)}")
        match = re.match(entry_pattern, line, re.IGNORECASE)
        if match:
            hash_val, flags, uuid_component, type_val, size_val = match.groups()
            print(f"    Match: hash={hash_val[:16]}..., uuid_component={repr(uuid_component)}")
            
            entry_info = {
                'hash': hash_val,
                'flags': flags,
                'uuid_component': uuid_component,
                'type': type_val,
                'size': int(size_val)
            }
            
            if '.' in uuid_component:
                # Data component (.content, .metadata, .pdf, .rm, etc.)
                component_type = uuid_component.split('.')[-1]
                print(f"      Initial component_type: {repr(component_type)}")
                
                if '/' in component_type:  # Slash after the last dot: keep only the final path segment
                    component_type = component_type.split('/')[-1]
                    print(f"      After slash handling: {repr(component_type)}")
                
                entry_info['component_type'] = component_type
                result['data_components'].append(entry_info)
                print(f"      Final component_type: {repr(component_type)}")
                
                # Test the matching condition
                if component_type == 'rm' or component_type.endswith('.rm'):
                    print(f"      ✅ WOULD MATCH .rm condition!")
                else:
                    print(f"      ❌ Would NOT match .rm condition")
            else:
                # Child object (pure UUID)
                result['child_objects'].append(entry_info)
                print(f"      Child object (no dot)")
        else:
            print(f"    NO MATCH for pattern")
    
    return result
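
The extension handling in the source above can be exercised in isolation. The sketch below mirrors those two steps in a hypothetical helper, `extract_component_type`, and runs it on made-up inputs. Note that for a path such as "uuid/filename.rm" the split on the last dot already yields 'rm'; the slash branch only changes the result when a slash appears after the last dot.

```python
def extract_component_type(uuid_component: str) -> str:
    """Mirror the source's extension logic (hypothetical helper name)."""
    # Step 1: take everything after the last dot.
    component_type = uuid_component.split('.')[-1]
    # Step 2: if a slash remains, keep only the final path segment.
    if '/' in component_type:
        component_type = component_type.split('/')[-1]
    return component_type


# Made-up inputs, for illustration only.
samples = [
    "12345678-1234-1234-1234-123456789abc.content",
    "12345678-1234-1234-1234-123456789abc/0a1b2c3d.rm",
    "12345678-1234-1234-1234-123456789abc.thumbnails/0a1b2c3d",
]
for sample in samples:
    print(f"{sample!r} -> {extract_component_type(sample)!r}")
```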

Parameters

Name Type Default Kind
content str - positional_or_keyword

Parameter Details

content: A string containing a directory listing, one entry per line, in the format 'hash:flags:uuid_component:type:size'. The first line may optionally be an entry count; it is skipped when it consists only of digits (any number of digits, since the check uses str.isdigit()). Each entry consists of: a 64-character hex hash, hex flags, a UUID-style component (hex digits, hyphens, and slashes) with an optional file extension, a decimal type value, and a decimal size value.
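
The count-line handling can be sketched on its own. The helper name `strip_count_line` is an assumption for illustration; the check itself is the one used in the source, and it accepts a count of any length because it relies on `str.isdigit()`.

```python
def strip_count_line(content: str) -> list:
    """Split content into lines, dropping a leading all-digit count line."""
    lines = content.split('\n')
    if lines and lines[0].strip().isdigit():
        lines = lines[1:]  # Skip count line
    return lines


print(strip_count_line("12\nentry-a\nentry-b"))  # count line removed
print(strip_count_line("entry-a\nentry-b"))      # nothing to remove
```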

Return Value

Returns a dictionary with two keys: 'child_objects' (list of entries whose uuid_component contains no dot, representing pure UUIDs) and 'data_components' (list of entries with file extensions like .content, .metadata, .pdf, .rm). Each entry in both lists is a dictionary containing 'hash', 'flags', 'uuid_component', 'type', and 'size' fields; 'size' is converted to int, while the other fields remain strings. Data component entries additionally include a 'component_type' field extracted from the file extension. Lines that do not match the entry pattern appear in neither list.
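
As a rough illustration of working with this structure, the sketch below builds a dictionary by hand in the documented shape (all values are made up) and then filters and aggregates it. It does not call the function itself.

```python
# Hand-built dictionary in the documented return shape; values are made up.
result = {
    'child_objects': [
        {'hash': 'c' * 64, 'flags': '0003',
         'uuid_component': 'abcdef12-3456-7890-abcd-ef1234567890',
         'type': '3', 'size': 512},
    ],
    'data_components': [
        {'hash': 'a' * 64, 'flags': '0001',
         'uuid_component': '12345678-1234-1234-1234-123456789abc.content',
         'type': '1', 'size': 1024, 'component_type': 'content'},
        {'hash': 'b' * 64, 'flags': '0002',
         'uuid_component': '87654321-4321-4321-4321-cba987654321.rm',
         'type': '2', 'size': 2048, 'component_type': 'rm'},
    ],
}

# Select only the .rm components.
rm_components = [e for e in result['data_components']
                 if e['component_type'] == 'rm']

# 'size' is an int, so it can be summed directly; 'type' is still a string.
total_size = sum(e['size'] for e in result['data_components'])

print(f"{len(rm_components)} .rm component(s), {total_size} bytes of data components")
```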

Dependencies

  • re

Required Imports

import re

Usage Example

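# Assumption: debug_rm_parsing.py (the file listed above) is importable from the
# current directory and is safe to import. If not, paste the function from
# "Source Code" here instead, together with "import re".
from debug_rm_parsing import parse_directory_listing_debug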

# Example usage
listing = """3
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa:0001:12345678-1234-1234-1234-123456789abc.content:1:1024
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb:0002:87654321-4321-4321-4321-cba987654321.rm:2:2048
cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc:0003:abcdef12-3456-7890-abcd-ef1234567890:3:512"""

result = parse_directory_listing_debug(listing)
print(f"\nChild objects: {len(result['child_objects'])}")
print(f"Data components: {len(result['data_components'])}")
for component in result['data_components']:
    print(f"  - {component['component_type']}: {component['size']} bytes")

Best Practices

  • This is a debug function that prints a line of output for nearly every parsing step; use it only for troubleshooting
  • The function expects exactly 5 colon-separated fields per entry (hash:flags:uuid_component:type:size); lines that do not match are reported as "NO MATCH for pattern" and left out of the result
  • The regex is applied with re.IGNORECASE, so hex values may be upper or lower case, but the hash must be exactly 64 hex characters and the type and size must be decimal digits
  • An optional count line at the beginning is skipped when the first line consists only of digits (any length, not just a single digit)
  • Component types are taken from the text after the last dot; if that text still contains a slash, only the part after the last slash is kept
  • Empty lines in the input are automatically skipped
  • In production code, prefer the non-debug version (parse_directory_listing, per the docstring) to avoid the overhead of the print statements
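
When the debug output is wanted only occasionally, one option is to capture it rather than let it reach the console. The sketch below uses the standard library's `contextlib.redirect_stdout`; `noisy_parser` is a hypothetical stand-in that only imitates the printing behaviour, and `run_quietly` is an illustrative helper name, neither of which exists in the original module.

```python
import contextlib
import io


def noisy_parser(content: str) -> dict:
    """Hypothetical stand-in for the debug parser: prints, then returns a dict."""
    print("Parsing lines:")
    for line in content.split('\n'):
        print(f"  Line: {line!r}")
    return {'child_objects': [], 'data_components': []}


def run_quietly(func, content: str):
    """Call func(content) and return (result, captured stdout text)."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        result = func(content)
    return result, buffer.getvalue()


result, debug_log = run_quietly(noisy_parser, "1\nexample")
print(f"Captured {len(debug_log.splitlines())} debug line(s)")
```

The captured text can then be written to a log file or discarded, while the parsed result is used as normal.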

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function show_directory_tree 57.7% similar

    Recursively displays a visual tree structure of a directory and its contents, showing files with sizes and subdirectories up to a specified depth.

    From: /tf/active/vicechatdev/e-ink-llm/cloudtest/test_complete_suite.py
  • class FolderDebugger 55.7% similar

    A debugging utility class for analyzing and troubleshooting folder structure and visibility issues in the reMarkable cloud sync system.

    From: /tf/active/vicechatdev/e-ink-llm/cloudtest/debug_gpt_in_folder.py
  • function test_extraction_debugging 54.2% similar

    A test function that validates the extraction debugging functionality of a DocumentProcessor by creating test files, simulating document extraction, and verifying debug log creation.

    From: /tf/active/vicechatdev/vice_ai/test_extraction_debug.py
  • function main_v85 52.5% similar

    Diagnostic function that debugs visibility issues with the 'gpt_in' folder in a reMarkable tablet's file system by analyzing folder metadata, document contents, and sync status.

    From: /tf/active/vicechatdev/e-ink-llm/cloudtest/debug_gpt_in_folder.py
  • function test_root_finding 52.5% similar

    A test function that analyzes a reMarkable tablet replica database JSON file to identify and list all root-level entries (folders and documents without parent nodes).

    From: /tf/active/vicechatdev/e-ink-llm/cloudtest/debug_root.py