function main_v113

Maturity: 34

Analyzes and compares .content files for PDF documents stored in reMarkable cloud storage, identifying differences between working and non-working documents.

File: /tf/active/vicechatdev/e-ink-llm/cloudtest/analyze_content_files.py
Lines: 6-118
Complexity: moderate

Purpose

This diagnostic function authenticates with the reMarkable cloud service, downloads the .content metadata files for four hardcoded PDF documents (two working, two not visible), parses each file as JSON, and prints selected fields. It then compares the first working document against the first broken one, key by key, to show which fields differ. It is intended for debugging document upload and visibility issues in the reMarkable ecosystem.
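The per-document field report described above can be gathered by a small helper. The following is a minimal sketch, not code from analyze_content_files.py: the name `summarize_content` and the sample payload are hypothetical, and the field names are simply the ones that main() prints.

```python
import json

# Fields that main() prints for every .content file.
REPORTED_FIELDS = (
    "fileType",
    "pageCount",
    "originalPageCount",
    "sizeInBytes",
    "formatVersion",
    "orientation",
    "redirectionPageMap",
)


def summarize_content(content_text):
    """Parse .content JSON text and return the reported fields as a dict.

    Absent fields are recorded as the string "MISSING", mirroring main().
    Raises json.JSONDecodeError if the text is not valid JSON.
    """
    content = json.loads(content_text)
    summary = {field: content.get(field, "MISSING") for field in REPORTED_FIELDS}
    pages = content.get("pages") or []  # tolerate a missing or null pages array
    summary["pages_count"] = len(pages)
    summary["first_page"] = pages[0] if pages else None
    return summary


# Hypothetical payload for illustration only; not taken from a real document.
sample = '{"fileType": "pdf", "pageCount": 2, "pages": ["page-a", "page-b"]}'
summary = summarize_content(sample)
print(summary["fileType"], summary["pages_count"], summary["orientation"])
```

Returning a dictionary instead of printing keeps the parsing step separate from the presentation, so the same summary could feed either the console report or a later comparison.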

Source Code

def main():
    auth = RemarkableAuth()
    session = auth.get_authenticated_session()
    
    if not session:
        print("❌ Authentication failed")
        return
    
    print("📄 COMPARING .CONTENT FILES FOR ALL 4 PDF DOCUMENTS")
    print("=" * 70)
    
    # Document info from the log file
    documents = {
        'invoice_poulpharm': {
            'name': 'invoice poulpharm june 2025',
            'content_hash': '4843f8d18f154198752eef85dbefb3c8d2d9984fe84e70d13857f5a7d61dcff3',
            'working': True,
            'size': 720
        },
        'pylontech': {
            'name': 'Pylontech force H3 datasheet',
            'content_hash': 'feb1654a645e7d42eea63bb8f87a1888026fd3ac197aa725fa3d77ae8b3e1e8c',
            'working': True,
            'size': 831
        },
        'upload_test_1': {
            'name': 'UploadTest_1753969395',
            'content_hash': '1ea64a7fb8fdd227cff533ea190a74d5111656f57699db714d33f69aba4404d5',
            'working': False,
            'size': 741
        },
        'upload_test_2': {
            'name': 'UploadTest_1753968602',
            'content_hash': 'ddc9459da5fc01058d854c85e3879b05c145e82189f4dd409bdc5d88014ad5e5',
            'working': False,
            'size': 741
        }
    }
    
    content_data = {}
    
    for doc_key, doc_info in documents.items():
        print(f"\n🔍 {doc_info['name']} ({'✅ WORKING' if doc_info['working'] else '❌ NOT VISIBLE'})")
        print(f"   Content hash: {doc_info['content_hash']}")
        print(f"   Expected size: {doc_info['size']} bytes")
        
        try:
            # Download the .content file
            content_response = session.get(f"https://eu.tectonic.remarkable.com/sync/v3/files/{doc_info['content_hash']}")
            content_response.raise_for_status()
            content_text = content_response.text
            
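            # Measure the raw payload: len() of the decoded text counts characters, not bytes
            actual_size = len(content_response.content)
            print(f"   Actual size: {actual_size} bytes")
            print(f"   Size match: {'✅' if actual_size == doc_info['size'] else '❌'}")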
            
            # Parse JSON
            try:
                content_json = json.loads(content_text)
                content_data[doc_key] = content_json
                
                print(f"   📊 JSON Content:")
                print(f"      fileType: {content_json.get('fileType', 'MISSING')}")
                print(f"      pageCount: {content_json.get('pageCount', 'MISSING')}")
                print(f"      originalPageCount: {content_json.get('originalPageCount', 'MISSING')}")
                print(f"      sizeInBytes: {content_json.get('sizeInBytes', 'MISSING')}")
                print(f"      formatVersion: {content_json.get('formatVersion', 'MISSING')}")
                print(f"      orientation: {content_json.get('orientation', 'MISSING')}")
                print(f"      pages array: {len(content_json.get('pages', []))} items")
                if content_json.get('pages'):
                    print(f"         First page UUID: {content_json['pages'][0]}")
                print(f"      redirectionPageMap: {content_json.get('redirectionPageMap', 'MISSING')}")
                
            except json.JSONDecodeError as e:
                print(f"   ❌ Invalid JSON: {e}")
                print(f"   Raw content: {repr(content_text[:200])}")
                
        except Exception as e:
            print(f"   ❌ Failed to download: {e}")
        
        print("-" * 50)
    
    # Compare working vs non-working
    print("\n🔍 DETAILED COMPARISON: WORKING vs NON-WORKING")
    print("=" * 70)
    
    working_docs = [k for k, v in documents.items() if v['working']]
    broken_docs = [k for k, v in documents.items() if not v['working']]
    
    print(f"Working documents: {[documents[k]['name'] for k in working_docs]}")
    print(f"Broken documents: {[documents[k]['name'] for k in broken_docs]}")
    
    if working_docs and broken_docs:
        print("\n🔍 Key Differences Analysis:")
        
        # Compare first working vs first broken
        working_content = content_data.get(working_docs[0], {})
        broken_content = content_data.get(broken_docs[0], {})
        
        print(f"\nComparing {documents[working_docs[0]]['name']} (working) vs {documents[broken_docs[0]]['name']} (broken):")
        
        all_keys = set(working_content.keys()) | set(broken_content.keys())
        for key in sorted(all_keys):
            working_val = working_content.get(key, "MISSING")
            broken_val = broken_content.get(key, "MISSING")
            
            if working_val != broken_val:
                print(f"   🔄 DIFFERENCE - {key}:")
                print(f"      Working: {working_val}")
                print(f"      Broken:  {broken_val}")
            else:
                print(f"   ✅ SAME - {key}: {working_val}")
    
    print(f"\n💾 Content files analysis complete!")

Return Value

This function returns None (implicitly). It performs side effects by printing diagnostic information to stdout, including authentication status, document metadata, JSON content structure, and comparative analysis between working and broken documents.
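Because the function only prints, its comparison cannot be reused or asserted on. The sketch below shows the same key-by-key comparison as a function that returns its result. It is an illustration under assumptions, not the author's code: `diff_content` is a hypothetical name and the sample dictionaries are invented.

```python
MISSING = "MISSING"  # same sentinel main() prints for absent keys


def diff_content(working, broken):
    """Return {key: (working_value, broken_value)} for every key that differs.

    Keys present in only one dictionary are reported with MISSING on the
    other side, as in main().
    """
    differences = {}
    for key in sorted(set(working) | set(broken)):
        working_val = working.get(key, MISSING)
        broken_val = broken.get(key, MISSING)
        if working_val != broken_val:
            differences[key] = (working_val, broken_val)
    return differences


# Invented sample data; real .content files carry many more fields.
working = {"fileType": "pdf", "pageCount": 3, "orientation": "portrait"}
broken = {"fileType": "pdf", "pageCount": 0}

for key, (working_val, broken_val) in diff_content(working, broken).items():
    print(f"{key}: working={working_val!r} broken={broken_val!r}")
```

A returned mapping like this could be printed by main() exactly as before, while also being usable from a test or from a loop over every working/broken pair.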

Dependencies

  • auth
  • json

Required Imports

from auth import RemarkableAuth
import json

Usage Example

# Assumes auth.py (providing the RemarkableAuth class) and
# analyze_content_files.py are both importable from the working directory.
from analyze_content_files import main

if __name__ == '__main__':
    main()

# Output will be printed to console showing:
# - Authentication status
# - Document metadata for 4 PDFs
# - JSON content structure for each document
# - Comparative analysis between working and broken documents

Best Practices

  • The document hashes and metadata are hardcoded for one specific debugging scenario; move them to a configuration file or to function arguments before reusing the function
  • The function issues four HTTP requests sequentially and passes no explicit timeout to session.get; consider adding a timeout and rate limiting for production use
  • Authentication credentials should be properly secured in the RemarkableAuth implementation
  • The function prints directly to stdout; consider using the logging module for better control in production environments
  • The download step catches the broad Exception class, which may hide specific issues; consider more granular exception handling for production code
  • Consider adding retry logic for network requests to handle transient failures
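Several of the points above (retry on transient failures, narrower exception handling, logging instead of print) can be combined in one small helper. What follows is a sketch under stated assumptions rather than a recommended implementation: `fetch_with_retry`, its parameters, the logger name, and the `flaky_fetch` stand-in are all hypothetical, and the helper accepts any zero-argument callable so that it does not assume a particular HTTP client.

```python
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
logger = logging.getLogger("content_analysis")  # hypothetical logger name


def fetch_with_retry(fetch, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fetch() until it succeeds or the attempts are exhausted.

    Only OSError (which includes ConnectionError and TimeoutError) is
    treated as transient; any other exception propagates immediately.
    The delay doubles after each failed attempt.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except OSError as exc:
            if attempt == attempts:
                logger.error("giving up after %d attempts: %s", attempt, exc)
                raise
            delay = base_delay * 2 ** (attempt - 1)
            logger.warning(
                "attempt %d failed (%s); retrying in %.1fs", attempt, exc, delay
            )
            sleep(delay)
    # Reached only when the loop body never ran.
    raise ValueError("attempts must be at least 1")


# Stand-in for a network call: fails twice, then returns a payload.
calls = {"count": 0}
failures_left = [ConnectionError("simulated transient failure")] * 2


def flaky_fetch():
    calls["count"] += 1
    if failures_left:
        raise failures_left.pop()
    return '{"fileType": "pdf"}'


# sleep is stubbed out so the demonstration runs instantly.
result = fetch_with_retry(flaky_fetch, sleep=lambda seconds: None)
print(result)
```

In real use, `fetch` might wrap the `session.get(...)` call and `raise_for_status()` from main(), assuming the session behaves like a `requests.Session`; the exception types worth retrying should then be checked against that client.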

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function analyze_pylontech_document 79.6% similar

    Performs deep forensic analysis of a specific Pylontech document stored in reMarkable Cloud, examining all document components (content, metadata, pagedata, PDF) to identify patterns and differences between app-uploaded and API-uploaded documents.

    From: /tf/active/vicechatdev/e-ink-llm/cloudtest/analyze_pylontech_details.py
  • class RealAppUploadAnalyzer 72.3% similar

    Analyzes documents uploaded by the real reMarkable app by fetching and examining their structure, metadata, and components from the reMarkable cloud sync service.

    From: /tf/active/vicechatdev/e-ink-llm/cloudtest/test_real_app_upload.py
  • function main_v15 71.9% similar

    A test function that uploads a PDF document to reMarkable cloud, syncs the local replica, and validates the upload with detailed logging and metrics.

    From: /tf/active/vicechatdev/e-ink-llm/cloudtest/test_raw_upload.py
  • class DocumentComparator 71.6% similar

    A class that compares reMarkable cloud documents to analyze and identify structural differences between them, particularly useful for debugging document upload issues.

    From: /tf/active/vicechatdev/e-ink-llm/cloudtest/compare_documents.py
  • function verify_document_status 70.1% similar

    Verifies the current status and metadata of a specific test document in the reMarkable cloud sync system by querying the sync API endpoints and analyzing the document's location and properties.

    From: /tf/active/vicechatdev/e-ink-llm/cloudtest/verify_document_status.py