function convert_document_to_pdf

Maturity: 67

Converts a document version from an editable format (e.g., Word) to PDF without changing the document's status, uploading the result to FileCloud and updating the version record.

File: /tf/active/vicechatdev/document_controller_backup.py
Lines: 1280 - 1436
Complexity: complex

Purpose

This function generates PDF renditions of controlled documents for viewing and distribution. It resolves the requested document version (or the current version if none is specified) and, if that version already has a PDF path, returns it without converting again. Otherwise it downloads the editable file from FileCloud, converts it to PDF with ControlledDocumentConverter, uploads the PDF back to FileCloud alongside the source file, and sets the PDF path on the version record. The conversion is logged in the audit trail for compliance tracking.
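
The download, convert, upload sequence described above can be sketched with the three external operations passed in as callables. This is only an illustration under stated assumptions: `convert_via_temp_dir` and its `download`, `convert`, and `upload` parameters are hypothetical names, not part of CDocs, and the sketch uses `tempfile.TemporaryDirectory` where the listing uses `mkdtemp` with a `finally` block.

```python
import os
import tempfile
from typing import Callable


def convert_via_temp_dir(
    source_path: str,
    pdf_path: str,
    download: Callable[[str], bytes],
    convert: Callable[[str, str], None],
    upload: Callable[[str, bytes], None],
) -> int:
    """Download source_path, convert it locally, upload the result to pdf_path.

    All three callables are hypothetical stand-ins for the FileCloud client
    and the document converter. Returns the size of the uploaded PDF in bytes.
    """
    ext = os.path.splitext(source_path)[1]
    # TemporaryDirectory removes the directory on exit, even if a step raises.
    with tempfile.TemporaryDirectory() as tmp:
        local_src = os.path.join(tmp, f"document{ext}")
        local_pdf = os.path.join(tmp, "document.pdf")
        with open(local_src, "wb") as f:
            f.write(download(source_path))
        convert(local_src, local_pdf)
        with open(local_pdf, "rb") as f:
            pdf_bytes = f.read()
    # Upload after the temporary files are gone; only the bytes are needed.
    upload(pdf_path, pdf_bytes)
    return len(pdf_bytes)
```

Passing the operations in keeps the temporary-file handling testable without a FileCloud server or an office converter installed.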

Source Code

def convert_document_to_pdf(
    user: DocUser,
    document_uid: str,
    version_uid: Optional[str] = None
) -> Dict[str, Any]:
    """
    Convert a document version to PDF without changing status
    
    Parameters
    ----------
    user : DocUser
        User performing the conversion
    document_uid : str
        ID of the document
    version_uid : str, optional
        ID of a specific version (default is current version)
        
    Returns
    -------
    Dict[str, Any]
        Dictionary with conversion results
    """
    try:
        # Get document instance
        document = ControlledDocument(uid=document_uid)
        if not document.uid:
            raise ResourceNotFoundError(f"Document not found: {document_uid}")
            
        # Get version
        version = None
        if version_uid:
            version = DocumentVersion(uid=version_uid)
            if not version or version.document_uid != document_uid:
                raise ResourceNotFoundError(f"Version not found: {version_uid}")
        else:
            version = document.current_version
            if not version:
                raise ResourceNotFoundError(f"No versions found for document: {document_uid}")
                
        # Check if the version has an editable file
        if not version.word_file_path:
            raise BusinessRuleError("Version has no editable document to convert")
            
        # Check if PDF already exists
        if version.pdf_file_path:
            return {
                'success': True,
                'message': 'PDF version already exists',
                'document_uid': document_uid,
                'version_uid': version.uid,
                'pdf_path': version.pdf_file_path
            }
            
        # Create a temporary directory for processing
        temp_dir = tempfile.mkdtemp()
        
        try:
            # Download the editable file - without requiring user for direct file access
            editable_file_path = version.word_file_path
            
            # Use internal file download method without permission check
            # FIX: Get the FileCloud client properly
            try:
                # Initialize FileCloud client
                filecloud_client = get_filecloud_client()
                
                # Download file content
                file_content = filecloud_client.download_file(editable_file_path)
                if not isinstance(file_content, bytes):
                    raise BusinessRuleError("Failed to download editable document")
            except Exception as download_err:
                raise BusinessRuleError(f"Failed to download editable document: {str(download_err)}")
                
            # Save to temp file
            file_ext = os.path.splitext(editable_file_path)[1]
            temp_file_path = os.path.join(temp_dir, f"document{file_ext}")
            
            with open(temp_file_path, 'wb') as f:
                f.write(file_content)
                
            # Initialize the document converter
            converter = ControlledDocumentConverter()
            
            # Convert to PDF (simple conversion without signature page or audit trail)
            output_pdf_path = os.path.join(temp_dir, "document.pdf")
            
            try:
                converter.convert_to_pdf(temp_file_path, output_pdf_path)
            except Exception as convert_err:
                raise BusinessRuleError(f"Failed to convert document to PDF: {str(convert_err)}")
                
            # Upload PDF to FileCloud
            # Calculate the FileCloud path for the PDF
            editable_dir = os.path.dirname(editable_file_path)
            pdf_filename = f"{os.path.splitext(os.path.basename(editable_file_path))[0]}.pdf"
            pdf_file_path = os.path.join(editable_dir, pdf_filename)
            
            # Upload PDF to FileCloud
            with open(output_pdf_path, 'rb') as pdf_file:
                upload_result = upload_document_to_filecloud(
                    user=user,
                    file_content=pdf_file.read(),
                    document=document_uid,
                    file_path=pdf_file_path,
                    metadata={
                        'docNumber': document.doc_number,
                        'version': version.version_number,
                        'status': document.status,
                        'convertedBy': user.username,
                        'convertedDate': datetime.now().isoformat()
                    }
                )
                
            if not upload_result.get('success', False):
                raise BusinessRuleError(f"Failed to upload PDF to FileCloud: {upload_result.get('message', 'Unknown error')}")
                
            # Update document version with PDF path
            version.pdf_file_path = pdf_file_path
            
            # Log conversion event
            audit_trail.log_document_lifecycle_event(
                event_type="DOCUMENT_CONVERTED_TO_PDF",
                user=user,
                document_uid=document_uid,
                details={
                    'version_uid': version.uid,
                    'version_number': version.version_number,
                    'pdf_path': pdf_file_path
                }
            )
            
            return {
                'success': True,
                'message': 'Document successfully converted to PDF',
                'document_uid': document_uid,
                'version_uid': version.uid,
                'version_number': version.version_number,
                'pdf_path': pdf_file_path
            }
            
        except Exception as e:
            logger.error(f"Error in document conversion process: {str(e)}")
            raise BusinessRuleError(f"Failed to convert document to PDF: {str(e)}")
        finally:
            # Clean up temporary directory
            try:
                if os.path.exists(temp_dir):
                    shutil.rmtree(temp_dir)
            except:
                logger.warning(f"Failed to remove temporary directory: {temp_dir}")
                
    except (ResourceNotFoundError, ValidationError, PermissionError, BusinessRuleError) as e:
        # Re-raise known errors
        raise
    except Exception as e:
        logger.error(f"Error converting document to PDF: {str(e)}")
        raise BusinessRuleError(f"Failed to convert document to PDF: {str(e)}")
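
One consequence of the nested handlers in the listing: the inner `except Exception` also catches the BusinessRuleError raised by the download, conversion, and upload steps, so the final message carries two prefixes. The sketch below reproduces that behavior in isolation and shows one possible alternative. `BusinessRuleError` here is a local stand-in (assumed to subclass Exception), and all function names are hypothetical.

```python
class BusinessRuleError(Exception):
    """Local stand-in for CDocs.controllers.BusinessRuleError (assumption)."""


def download_step() -> None:
    # Mirrors the listing: a low-level failure is wrapped in BusinessRuleError.
    try:
        raise TimeoutError("connection timed out")
    except Exception as download_err:
        raise BusinessRuleError(
            f"Failed to download editable document: {download_err}"
        )


def convert_like_listing() -> None:
    # Mirrors the inner handler: every exception is wrapped, including
    # BusinessRuleError, so the message gains a second prefix.
    try:
        download_step()
    except Exception as e:
        raise BusinessRuleError(f"Failed to convert document to PDF: {e}")


def convert_preserving_domain_errors() -> None:
    # Alternative: let BusinessRuleError through and wrap only the rest.
    try:
        download_step()
    except BusinessRuleError:
        raise
    except Exception as e:
        raise BusinessRuleError(f"Failed to convert document to PDF: {e}")


def message_of(fn) -> str:
    """Run fn and return the message of the BusinessRuleError it raises."""
    try:
        fn()
    except BusinessRuleError as exc:
        return str(exc)
    return ""
```

With these definitions, `message_of(convert_like_listing)` yields the doubled prefix, while `message_of(convert_preserving_domain_errors)` yields only the download message.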

Parameters

Name          Type           Default  Kind
user          DocUser        -        positional_or_keyword
document_uid  str            -        positional_or_keyword
version_uid   Optional[str]  None     positional_or_keyword

Parameter Details

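user: DocUser object representing the authenticated user performing the conversion. Used for the FileCloud upload, the upload metadata ('convertedBy'), and audit logging. The 'CONVERT_DOCUMENT' permission is reportedly enforced by a decorator; that decorator is not part of the listing above, and the function body itself performs no permission check (the download step uses the FileCloud client directly).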

document_uid: String identifier (UID) of the controlled document to convert. Must correspond to an existing ControlledDocument in the system. Used to retrieve the document and its versions.

version_uid: Optional string identifier (UID) of a specific document version to convert. If omitted or empty, the document's current version is converted. When provided, the version must belong to the specified document; otherwise ResourceNotFoundError is raised.
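
The version-selection rule can be sketched as a small standalone function. This is an illustration only: the `Version` dataclass, the `versions` lookup table, and `LookupError` are hypothetical stand-ins for DocumentVersion, the database lookup, and ResourceNotFoundError.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Version:
    """Hypothetical stand-in for DocumentVersion."""
    uid: str
    document_uid: str


def resolve_version(
    document_uid: str,
    version_uid: Optional[str],
    versions: Dict[str, Version],
    current: Optional[Version],
) -> Version:
    """Pick the explicit version if given, otherwise the current one."""
    if version_uid:  # truthiness test, as in the listing: "" behaves like None
        version = versions.get(version_uid)
        # A version that exists but belongs to another document is rejected.
        if version is None or version.document_uid != document_uid:
            raise LookupError(f"Version not found: {version_uid}")
        return version
    if current is None:
        raise LookupError(f"No versions found for document: {document_uid}")
    return current
```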

Return Value

Type: Dict[str, Any]

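Returns a dictionary with conversion results. After a fresh conversion: {'success': True, 'message': str, 'document_uid': str, 'version_uid': str, 'version_number': str, 'pdf_path': str}, where 'pdf_path' is the FileCloud path of the generated PDF. If a PDF already exists, the function returns early with 'message': 'PDF version already exists' and the same keys except 'version_number', which is omitted. Failures are never reported as 'success': False; the function raises instead. ResourceNotFoundError and BusinessRuleError come from the function's own checks. ValidationError and PermissionError pass through only when raised before the temporary directory is created; any error raised after that point is re-raised as BusinessRuleError.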
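
In the listing, the early-return dictionary has no 'version_number' key, so code that consumes the result should not index it unconditionally. A minimal sketch of a consumer that tolerates both shapes (`summarize_conversion` is a hypothetical helper, not part of CDocs):

```python
from typing import Any, Dict


def summarize_conversion(result: Dict[str, Any]) -> str:
    """Build a one-line summary from either result shape."""
    version = result.get("version_number")  # absent on the early return
    if version is not None:
        label = f"version {version}"
    else:
        label = f"version uid {result['version_uid']}"
    return f"{result['message']} ({label}): {result['pdf_path']}"
```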

Dependencies

  • logging
  • uuid
  • os
  • tempfile
  • typing
  • datetime
  • io
  • panel
  • shutil
  • traceback
  • CDocs

Required Imports

import logging
import os
import tempfile
import shutil
from typing import Dict, Any, Optional
from datetime import datetime
from CDocs.models.document import ControlledDocument, DocumentVersion
from CDocs.models.user_extensions import DocUser
from CDocs.utils import audit_trail
from CDocs.controllers import require_permission, log_controller_action
from CDocs.controllers import ResourceNotFoundError, ValidationError, PermissionError, BusinessRuleError
from CDocs.controllers.filecloud_controller import upload_document_to_filecloud, get_filecloud_client
from CDocs.utils.document_converter import ControlledDocumentConverter

Usage Example

from CDocs.models.user_extensions import DocUser
from CDocs.controllers.document_controller import convert_document_to_pdf

# Get authenticated user
user = DocUser(uid='user123')

# Convert current version of a document
result = convert_document_to_pdf(
    user=user,
    document_uid='doc-12345'
)

if result['success']:
    print(f"PDF available at: {result['pdf_path']}")
    # 'version_number' is missing when an existing PDF is reused (early return)
    print(f"Version: {result.get('version_number', 'n/a')}")

# Convert a specific version
result = convert_document_to_pdf(
    user=user,
    document_uid='doc-12345',
    version_uid='version-67890'
)

if result['success']:
    print(f"Conversion complete: {result['message']}")
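
Because the function signals failure by raising rather than by returning 'success': False, callers that want a uniform result may wrap the call. The sketch below is one way to do that under stated assumptions: the two exception classes are local stand-ins for the CDocs classes of the same name, and `safe_convert` and its `error` codes are hypothetical.

```python
from typing import Any, Callable, Dict


class ResourceNotFoundError(Exception):
    """Local stand-in for the CDocs exception of the same name (assumption)."""


class BusinessRuleError(Exception):
    """Local stand-in for the CDocs exception of the same name (assumption)."""


def safe_convert(convert: Callable[..., Dict[str, Any]], **kwargs: Any) -> Dict[str, Any]:
    """Call convert(**kwargs) and turn known failures into result dictionaries."""
    try:
        return convert(**kwargs)
    except ResourceNotFoundError as exc:
        # Unknown document or version: usually a caller error.
        return {"success": False, "error": "not_found", "message": str(exc)}
    except BusinessRuleError as exc:
        # Missing editable file, or a download/conversion/upload failure.
        return {"success": False, "error": "conversion_failed", "message": str(exc)}
```

In real use, `convert` would be `convert_document_to_pdf` and the exception classes would be imported from CDocs.controllers instead of being redefined.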

Best Practices

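  • The 'CONVERT_DOCUMENT' permission is described as enforced by a decorator, but the decorator is not shown in the listing; verify that it is applied before relying on it
  • Handle ResourceNotFoundError and BusinessRuleError at minimum - per the listing, any error raised after the temporary directory is created (download, conversion, upload, audit logging) is re-raised as BusinessRuleError, while ValidationError and PermissionError pass through only if raised earlier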
  • The function checks if PDF already exists before conversion to avoid redundant processing
  • Temporary files are automatically cleaned up in the finally block, but ensure sufficient disk space for conversion
  • The function does not change document status - it only creates a PDF version
  • All conversions are logged in the audit trail for compliance tracking
  • The editable file (word_file_path) must exist on the version before conversion can proceed
  • FileCloud must be accessible and properly configured for both download and upload operations
  • The function uses a temporary directory for processing - ensure write permissions in the system temp directory
  • PDF is uploaded to the same FileCloud directory as the source editable file with .pdf extension
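
The last point, deriving the PDF path from the editable file's path, can be sketched as follows. The listing uses `os.path`, whose separator depends on the host operating system; this sketch uses `posixpath` instead, on the assumption that FileCloud paths always use forward slashes. `derive_pdf_path` is a hypothetical helper name.

```python
import posixpath


def derive_pdf_path(editable_path: str) -> str:
    """Return the sibling .pdf path for a forward-slash remote path."""
    directory, filename = posixpath.split(editable_path)
    stem, _ext = posixpath.splitext(filename)
    # Only the final extension is replaced, so dots in the stem survive.
    return posixpath.join(directory, f"{stem}.pdf")
```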

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function download_document_version 75.9% similar

    Downloads a specific version of a controlled document from FileCloud storage, with optional audit trail and watermark inclusion, and logs the download event.

    From: /tf/active/vicechatdev/document_controller_backup.py
  • function get_document_download_url 73.8% similar

    Retrieves a download URL for a controlled document, automatically selecting between editable (Word) and PDF formats based on document status or explicit request.

    From: /tf/active/vicechatdev/document_controller_backup.py
  • function create_document_v1 71.2% similar

    Creates a new version of an existing document in a document management system, storing the file in FileCloud and tracking version metadata in Neo4j graph database.

    From: /tf/active/vicechatdev/document_controller_backup.py
  • function get_document_edit_url 68.1% similar

    Generates an online editing URL for a document stored in FileCloud, allowing users to edit documents that are in editable states.

    From: /tf/active/vicechatdev/document_controller_backup.py
  • function publish_document 68.0% similar

    Publishes an approved controlled document by converting it to PDF with signatures and audit trail, uploading to FileCloud, and updating the document status to PUBLISHED.

    From: /tf/active/vicechatdev/document_controller_backup.py