function compute_crc32c_header

Maturity: 50

Computes a CRC32C checksum for binary content and returns it as a base64-encoded string formatted for Google Cloud Storage x-goog-hash headers.

File: /tf/active/vicechatdev/e-ink-llm/cloudtest/force_web_app_refresh.py
Lines: 30 - 47
Complexity: simple

Purpose

This function calculates a CRC32C checksum for data integrity verification, formatted for Google Cloud Storage API requests. It uses the crc32c library when available and otherwise falls back to zlib.crc32. The fallback implements standard CRC32, which uses a different polynomial from CRC32C and therefore produces a different value, so a server that validates the header as CRC32C should be expected to report a mismatch. The checksum is converted to a 4-byte big-endian representation, base64-encoded, and formatted as 'crc32c={encoded_value}' for use in HTTP headers.
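The difference between the two checksums, and the encoding steps described above, can be sketched without any third-party dependency. The helper names crc32c_bitwise and to_header are hypothetical and are not part of the documented module; the bitwise loop is a slow illustration of CRC32C, not what the crc32c package does internally.

```python
import base64
import zlib

CASTAGNOLI_REFLECTED = 0x82F63B78  # CRC32C polynomial in bit-reflected form


def crc32c_bitwise(data: bytes) -> int:
    """Slow, dependency-free CRC32C (hypothetical helper, illustration only)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right one bit; apply the polynomial when the low bit was set
            crc = (crc >> 1) ^ (CASTAGNOLI_REFLECTED if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def to_header(checksum: int) -> str:
    """Encode a 32-bit checksum as 4 big-endian bytes, base64, 'crc32c=' prefix."""
    raw = checksum.to_bytes(4, byteorder="big")
    return "crc32c=" + base64.b64encode(raw).decode("ascii")


sample = b"123456789"  # conventional check input for CRC algorithms
crc32c_value = crc32c_bitwise(sample)
crc32_value = zlib.crc32(sample) & 0xFFFFFFFF

print(hex(crc32c_value))        # 0xe3069283
print(hex(crc32_value))         # 0xcbf43926
print(to_header(crc32c_value))  # crc32c=4waSgw==
```

The two integers differ for the same input, which is why the zlib fallback cannot stand in for a real CRC32C value.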

Source Code

def compute_crc32c_header(content: bytes) -> str:
    """Compute CRC32C checksum and return as x-goog-hash header value"""
    try:
        # Use proper crc32c library if available
        if HAS_CRC32C:
            checksum = crc32c.crc32c(content)
        else:
            # Fallback to standard CRC32 (not ideal but better than nothing)
            checksum = zlib.crc32(content) & 0xffffffff
        
        # Convert to bytes and base64 encode
        checksum_bytes = checksum.to_bytes(4, byteorder='big')
        checksum_b64 = base64.b64encode(checksum_bytes).decode('ascii')
        
        return f"crc32c={checksum_b64}"
    except Exception as e:
        print(f"❌ Error computing CRC32C: {e}")
        return None

Parameters

Name     Type   Default  Kind
content  bytes  -        positional_or_keyword

Parameter Details

content: Binary data (bytes object) for which to compute the CRC32C checksum. This should be the complete content that needs integrity verification, such as file contents being uploaded to Google Cloud Storage.

Return Value

Type: str, or None on error (the signature is annotated as str, but the except branch returns None)

Returns a string formatted as 'crc32c={base64_encoded_checksum}' suitable for use as the value of an x-goog-hash HTTP header. The base64-encoded portion represents the 4-byte CRC32C checksum in big-endian byte order. Returns None if an exception occurs during computation, with an error message printed to stdout.
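To inspect or compare a returned header value, the encoding can be reversed. The following is a minimal sketch; parse_crc32c_header is a hypothetical helper, not part of the documented module, and it assumes the value holds a single 'crc32c=' entry exactly as compute_crc32c_header produces it.

```python
import base64
from typing import Optional

PREFIX = "crc32c="


def parse_crc32c_header(header_value: Optional[str]) -> Optional[int]:
    """Return the 32-bit checksum from a 'crc32c=<base64>' string, or None."""
    # Handles the None that compute_crc32c_header returns on error
    if not header_value or not header_value.startswith(PREFIX):
        return None
    try:
        raw = base64.b64decode(header_value[len(PREFIX):], validate=True)
    except ValueError:
        # binascii.Error (a ValueError subclass) signals malformed base64
        return None
    if len(raw) != 4:
        return None
    # The header stores the checksum in big-endian byte order
    return int.from_bytes(raw, byteorder="big")


print(hex(parse_crc32c_header("crc32c=4waSgw==")))  # 0xe3069283
print(parse_crc32c_header(None))                     # None
```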

Dependencies

  • base64
  • zlib
  • crc32c

Required Imports

import base64
import zlib

Conditional/Optional Imports

These imports are only needed under specific conditions:

import crc32c

Condition (optional import): Required for proper CRC32C computation. If not available, the function falls back to zlib.crc32 (standard CRC32, not CRC32C). The code checks for availability via the HAS_CRC32C flag.

Usage Example

# Setup: Define HAS_CRC32C flag
try:
    import crc32c
    HAS_CRC32C = True
except ImportError:
    HAS_CRC32C = False

import base64
import zlib

# Example usage
file_content = b"Hello, World! This is test data."
header_value = compute_crc32c_header(file_content)
if header_value:
    print(f"x-goog-hash header: {header_value}")
    # Prints "x-goog-hash header: crc32c=<8 base64 characters>";
    # the value differs when the zlib.crc32 fallback is used
    
    # Use in HTTP request headers
    headers = {
        'x-goog-hash': header_value,
        'Content-Type': 'application/octet-stream'
    }
else:
    print("Failed to compute checksum")

Best Practices

  • Install the 'crc32c' package (pip install crc32c) for proper CRC32C computation rather than relying on the zlib.crc32 fallback, as they produce different results
  • Always check if the return value is None before using it, as the function returns None on errors
  • The function prints errors to stdout; for production use, consider replacing print with the logging module
  • Ensure HAS_CRC32C is properly initialized before calling this function, typically at module level with a try/except import block
  • This function is specifically designed for Google Cloud Storage API compatibility; the output format matches GCS expectations
  • For large files, consider reading and processing content in chunks if memory is a concern, though this function expects the full content as input
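
Several of the points above can be combined in one variant. This is a sketch under stated assumptions, not the code from force_web_app_refresh.py: the name compute_crc32c_header_logged and the logger setup are hypothetical, and the only change in behavior is that messages go through logging and the fallback is announced with a warning.

```python
import base64
import logging
import zlib
from typing import Optional

# Module-level availability flag, as recommended above
try:
    import crc32c
    HAS_CRC32C = True
except ImportError:
    HAS_CRC32C = False

logger = logging.getLogger(__name__)


def compute_crc32c_header_logged(content: bytes) -> Optional[str]:
    """Hypothetical variant: same output format, logging instead of print."""
    try:
        if HAS_CRC32C:
            checksum = crc32c.crc32c(content)
        else:
            # Make the fallback visible: this value is CRC32, not CRC32C
            logger.warning("crc32c not installed; using zlib.crc32 fallback")
            checksum = zlib.crc32(content) & 0xFFFFFFFF
        encoded = base64.b64encode(checksum.to_bytes(4, byteorder="big"))
        return f"crc32c={encoded.decode('ascii')}"
    except Exception:
        # logger.exception records the traceback along with the message
        logger.exception("Error computing CRC32C")
        return None


header_value = compute_crc32c_header_logged(b"Hello, World! This is test data.")
if header_value is None:
    raise SystemExit("Failed to compute checksum")
headers = {"x-goog-hash": header_value}
```

Callers still need the None check, because the error path is unchanged.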

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function calculate_crc32c 78.3% similar

    Calculates a CRC32 checksum of input data and returns it as a base64-encoded string.

    From: /tf/active/vicechatdev/e-ink-llm/cloudtest/simple_clean_root.py
  • function calculate_file_hash_v1 49.9% similar

    Calculates the MD5 hash of a file by reading it in chunks to handle large files efficiently.

    From: /tf/active/vicechatdev/mailsearch/enhanced_document_comparison.py
  • function generate_header_examples 45.6% similar

    Prints formatted examples of HTTP headers required for different types of file uploads to a reMarkable cloud sync service, including PDFs, metadata, content, and schema files.

    From: /tf/active/vicechatdev/e-ink-llm/cloudtest/analyze_headers.py
  • function test_upload_endpoint 39.3% similar

    A test function that validates the reMarkable Cloud API file upload endpoint by attempting to upload a test JSON file and verifying the GET endpoint functionality.

    From: /tf/active/vicechatdev/e-ink-llm/cloudtest/test_upload_endpoint.py
  • function calculate_file_hash 38.3% similar

    Calculates the MD5 hash of a file by reading it in chunks to handle large files efficiently.

    From: /tf/active/vicechatdev/mailsearch/compare_documents.py