function parse_directory_listing_debug
A debug version of a directory listing parser that extracts and categorizes file entries with detailed console output for troubleshooting.
/tf/active/vicechatdev/e-ink-llm/cloudtest/debug_rm_parsing.py
8 - 66
moderate
Purpose
This function parses a directory listing string containing file entries in a specific format (hash:flags:uuid_component:type:size). It separates entries into child objects (pure UUIDs) and data components (UUIDs with file extensions), while printing detailed debug information about the parsing process. The function is specifically designed for debugging parsing logic, particularly for identifying .rm files and other component types.
Source Code
def parse_directory_listing_debug(content: str):
"""Debug version of parse_directory_listing"""
result = {
'child_objects': [],
'data_components': []
}
lines = content.split('\n')
if lines and lines[0].strip().isdigit():
lines = lines[1:] # Skip count line
entry_pattern = r'^([a-f0-9]{64}):([0-9a-fA-F]+):([a-f0-9-/]+(?:\.[^:]+)?):(\d+):(\d+)$'
print("Parsing lines:")
for line in lines:
line = line.strip()
if not line:
continue
print(f" Line: {repr(line)}")
match = re.match(entry_pattern, line, re.IGNORECASE)
if match:
hash_val, flags, uuid_component, type_val, size_val = match.groups()
print(f" Match: hash={hash_val[:16]}..., uuid_component={repr(uuid_component)}")
entry_info = {
'hash': hash_val,
'flags': flags,
'uuid_component': uuid_component,
'type': type_val,
'size': int(size_val)
}
if '.' in uuid_component:
# Data component (.content, .metadata, .pdf, .rm, etc.)
component_type = uuid_component.split('.')[-1]
print(f" Initial component_type: {repr(component_type)}")
if '/' in component_type: # Handle .rm files like "uuid/filename.rm"
component_type = component_type.split('/')[-1]
print(f" After slash handling: {repr(component_type)}")
entry_info['component_type'] = component_type
result['data_components'].append(entry_info)
print(f" Final component_type: {repr(component_type)}")
# Test the matching condition
if component_type == 'rm' or component_type.endswith('.rm'):
print(f" ✅ WOULD MATCH .rm condition!")
else:
print(f" ❌ Would NOT match .rm condition")
else:
# Child object (pure UUID)
result['child_objects'].append(entry_info)
print(f" Child object (no dot)")
else:
print(f" NO MATCH for pattern")
return result
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
content |
str | - | positional_or_keyword |
Parameter Details
content: A string containing a directory listing with entries in the format 'hash:flags:uuid_component:type:size', one per line. The first line may optionally be a count of entries (a single digit). Each entry consists of: a 64-character hex hash, hex flags, a UUID with optional file extension, a numeric type value, and a size value.
Return Value
Returns a dictionary with two keys: 'child_objects' (list of entries without file extensions, representing pure UUIDs) and 'data_components' (list of entries with file extensions like .content, .metadata, .pdf, .rm). Each entry in both lists is a dictionary containing 'hash', 'flags', 'uuid_component', 'type', and 'size' fields. Data component entries additionally include a 'component_type' field extracted from the file extension.
Dependencies
re
Required Imports
import re
Usage Example
import re
def parse_directory_listing_debug(content: str):
# ... (function code as provided)
pass
# Example usage
listing = """3
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa:0001:12345678-1234-1234-1234-123456789abc.content:1:1024
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb:0002:87654321-4321-4321-4321-cba987654321.rm:2:2048
cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc:0003:abcdef12-3456-7890-abcd-ef1234567890:3:512"""
result = parse_directory_listing_debug(listing)
print(f"\nChild objects: {len(result['child_objects'])}")
print(f"Data components: {len(result['data_components'])}")
for component in result['data_components']:
print(f" - {component['component_type']}: {component['size']} bytes")
Best Practices
- This is a debug function that prints extensive output to console - use only for troubleshooting, not in production code
- The function expects a very specific entry format with 5 colon-separated fields; ensure input data matches the pattern
- The regex pattern is case-insensitive for hex values but strict about format - validate input data structure before parsing
- The function handles optional count lines at the beginning (single digit on first line)
- Component types are extracted from file extensions after the last dot, with special handling for paths containing slashes
- Empty lines in the input are automatically skipped
- Consider using the non-debug version of this function in production to avoid performance overhead from print statements
Tags
Similar Components
AI-powered semantic similarity - components with related functionality:
-
function show_directory_tree 57.7% similar
-
class FolderDebugger 55.7% similar
-
function test_extraction_debugging 54.2% similar
-
function main_v85 52.5% similar
-
function test_root_finding 52.5% similar