function msg_to_eml_alternative

Maturity: 51

Converts Microsoft Outlook .msg files to .eml (email) format using the extract_msg library, with support for headers, body content (plain text and HTML), and attachments.

File: /tf/active/vicechatdev/msg_to_eml.py
Lines: 152 - 259
Complexity: complex

Purpose

This function provides an alternative way to convert .msg files to .eml format. If the loaded extract_msg Message object exposes a save_email method, the function calls it and returns. Otherwise it writes the EML file by hand as a MIME multipart/mixed message: the From, To, Cc, Subject, and Date headers, the plain text body, the HTML body if one is present, and each attachment base64-encoded. This is useful for email migration, archival, or systems that require the standard .eml format instead of the proprietary .msg format.
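
For comparison, a message of the same shape can be assembled with Python's standard email package instead of by writing strings. The sketch below is not the function documented here: build_eml_bytes and its parameters are hypothetical names, and unlike the manual fallback it nests the plain text and HTML bodies inside multipart/alternative.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_eml_bytes(sender, to, subject, plain, html=None, attachments=()):
    """Hypothetical helper: build EML bytes with the standard library.

    `attachments` is an iterable of (filename, bytes) pairs.
    """
    message = EmailMessage()
    message["From"] = sender
    message["To"] = to
    message["Subject"] = subject
    message.set_content(plain)  # text/plain body
    if html:
        # Turns the message into multipart/alternative (plain + HTML)
        message.add_alternative(html, subtype="html")
    for filename, data in attachments:
        # Wraps everything in multipart/mixed and base64-encodes the bytes
        message.add_attachment(
            data,
            maintype="application",
            subtype="octet-stream",
            filename=filename,
        )
    return message.as_bytes()


# Round trip: build a message, then parse it back to inspect the structure
raw = build_eml_bytes(
    "alice@example.com",
    "bob@example.com",
    "Hi",
    "plain body",
    html="<p>html body</p>",
    attachments=[("data.bin", b"\x00\x01\x02")],
)
parsed = message_from_bytes(raw, policy=policy.default)
attachments = list(parsed.iter_attachments())
```

Letting EmailMessage generate the boundaries and transfer encodings avoids hand-written delimiter and encoding mistakes, at the cost of less control over the exact bytes written.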

Source Code

def msg_to_eml_alternative(msg_path, eml_path):
    """Alternative conversion approach using extract_msg's built-in functionality"""
    try:
        if not os.path.exists(msg_path):
            logger.error(f"Input file not found: {msg_path}")
            return False
            
        # Load the .msg file
        logger.info(f"Using alternative conversion method for: {msg_path}")
        msg = extract_msg.Message(msg_path)
        
        # Try direct raw EML content extraction if available
        if hasattr(msg, 'save_email'):
            msg.save_email(eml_path)
            logger.info(f"Successfully converted '{msg_path}' to '{eml_path}' using built-in save_email")
            return True
            
        # Use extract_msg's built-in properties to manually create the EML
        with open(eml_path, 'w', encoding='utf-8') as f:
            # Write basic headers
            f.write(f"From: {msg.sender}\n")
            f.write(f"To: {msg.to}\n")
            if msg.cc:
                f.write(f"Cc: {msg.cc}\n")
            f.write(f"Subject: {msg.subject or ''}\n")
            
            # Add date
            if hasattr(msg, 'date') and msg.date:
                try:
                    f.write(f"Date: {msg.date}\n")
                except Exception:
                    # Fall back to the current time if the date cannot be written
                    f.write(f"Date: {formatdate(localtime=True)}\n")
            else:
                f.write(f"Date: {formatdate(localtime=True)}\n")
                
            # Add content type header for MIME message
            f.write("MIME-Version: 1.0\n")
            
            # Create a simple multipart message
            boundary = "----=_NextPart_" + os.urandom(16).hex()
            f.write(f'Content-Type: multipart/mixed; boundary="{boundary}"\n\n')
            
            # Add message separator
            f.write(f"--{boundary}\n")
            
            # Add plain text body
            f.write('Content-Type: text/plain; charset="utf-8"\n')
            f.write('Content-Transfer-Encoding: quoted-printable\n\n')
            f.write(msg.body or '')
            f.write(f"\n\n--{boundary}\n")
            
            # Add HTML body if available
            html_content = None
            if hasattr(msg, 'htmlBody') and msg.htmlBody:
                html_content = msg.htmlBody
            elif hasattr(msg, 'html') and msg.html:
                html_content = msg.html
                
            if html_content:
                f.write('Content-Type: text/html; charset="utf-8"\n')
                f.write('Content-Transfer-Encoding: quoted-printable\n\n')
                f.write(html_content)
                f.write(f"\n\n--{boundary}\n")
            
            # Add attachments
            for attachment in msg.attachments:
                try:
                    # Get filename
                    filename = getattr(attachment, 'longFilename', None) or getattr(attachment, 'shortFilename', None) or 'attachment'
                    
                    # Determine content type
                    content_type = None
                    if hasattr(attachment, 'mimetype') and attachment.mimetype:
                        content_type = attachment.mimetype
                    else:
                        content_type, _ = mimetypes.guess_type(filename)
                        
                    if not content_type:
                        content_type = 'application/octet-stream'
                        
                    # Write attachment headers
                    f.write(f'Content-Type: {content_type}; name="{filename}"\n')
                    f.write('Content-Transfer-Encoding: base64\n')
                    f.write(f'Content-Disposition: attachment; filename="{filename}"\n\n')
                    
                    # Write base64 encoded attachment data
                    import base64
                    if attachment.data:
                        encoded_data = base64.b64encode(attachment.data).decode('ascii')
                        # Write in chunks of 76 characters for proper base64 format
                        for i in range(0, len(encoded_data), 76):
                            f.write(encoded_data[i:i+76] + '\n')
                    
                    f.write(f"\n--{boundary}\n")
                    
                except Exception as e:
                    logger.error(f"Error processing attachment {filename}: {str(e)}")
            
            # Close the multipart message
            f.write(f"--{boundary}--\n")
            
        logger.info(f"Successfully converted '{msg_path}' to '{eml_path}' using manual alternative method")
        return True
            
    except Exception as e:
        logger.error(f"Error in alternative conversion of {msg_path} to EML: {str(e)}")
        logger.error(traceback.format_exc())
        return False
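
In the manual path above, a delimiter line is written after every part and the closing delimiter follows immediately, so the last two delimiter lines enclose an empty part. A minimal sketch of the more usual layout, with one delimiter before each part and a single closing delimiter at the end, is shown here. join_mime_parts is a hypothetical helper and is not part of msg_to_eml.py.

```python
from email import message_from_string


def join_mime_parts(parts, boundary):
    """Hypothetical helper: join pre-formatted MIME parts.

    Each part is a string holding its own headers, a blank line, and a body.
    """
    chunks = []
    for part in parts:
        chunks.append(f"--{boundary}\n")         # delimiter before each part
        chunks.append(part.rstrip("\n") + "\n")  # exactly one trailing newline
    chunks.append(f"--{boundary}--\n")           # single closing delimiter
    return "".join(chunks)


boundary = "example-boundary"
parts = [
    'Content-Type: text/plain; charset="utf-8"\n\nHello\n',
    'Content-Type: text/html; charset="utf-8"\n\n<p>Hello</p>\n',
]
document = (
    "MIME-Version: 1.0\n"
    f'Content-Type: multipart/mixed; boundary="{boundary}"\n\n'
    + join_mime_parts(parts, boundary)
)

# Parse the result back to confirm that only the two intended parts exist
parsed = message_from_string(document)
part_types = [p.get_content_type() for p in parsed.get_payload()]
```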

Parameters

Name      Type  Default  Kind
msg_path  -     -        positional_or_keyword
eml_path  -     -        positional_or_keyword

Parameter Details

msg_path: String path to the input Microsoft Outlook .msg file. If no file exists at this path, the function logs an error and returns False.

eml_path: String path where the output .eml file is written. The parent directory must already exist and be writable; the manual path opens the file directly and does not create directories. If the file exists, it will be overwritten.
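
Because the output directory has to exist before the call, a caller may want to create it first. The following is a minimal sketch; prepare_output_path is a hypothetical helper name and the directory layout in the demo is made up.

```python
import os
import tempfile


def prepare_output_path(eml_path):
    """Hypothetical helper: create the parent directory of eml_path if needed.

    Returns True when the parent directory exists and is writable.
    """
    parent = os.path.dirname(os.path.abspath(eml_path))
    os.makedirs(parent, exist_ok=True)
    return os.access(parent, os.W_OK)


# Demo in a temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "archive", "2024", "output.eml")
    writable = prepare_output_path(target)
    created = os.path.isdir(os.path.dirname(target))
```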

Return Value

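Returns a boolean: True if the conversion completed (either through the built-in save_email or through manual construction), False if the input file was not found or any exception was raised while loading or writing (parsing errors, write errors, etc.). An error on an individual attachment is logged and the loop moves on to the next attachment without changing the return value, so True does not guarantee that every attachment was written.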

Dependencies

  • extract_msg
  • os
  • mimetypes
  • logging
  • email
  • traceback
  • base64

Required Imports

import extract_msg
import os
import mimetypes
import logging
import traceback
from email.utils import formatdate

Conditional/Optional Imports

These imports are only needed under specific conditions:

import base64

Condition: required only when the manual conversion path processes attachments; the function imports it inside the attachment loop.

Usage Example

import extract_msg
import os
import mimetypes
import logging
import traceback
from email.utils import formatdate

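# Set up the module-level logger; msg_to_eml_alternative looks up the global
# name `logger`, so this assumes the function is defined in this same module
logger = logging.getLogger(__name__)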
logging.basicConfig(level=logging.INFO)

# Convert a .msg file to .eml
msg_file = '/path/to/email.msg'
eml_file = '/path/to/output.eml'

success = msg_to_eml_alternative(msg_file, eml_file)

if success:
    print(f'Successfully converted {msg_file} to {eml_file}')
else:
    print(f'Conversion failed for {msg_file}')
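
The same call pattern extends to a whole directory. The sketch below is an illustration only: convert_directory is a hypothetical helper that takes the converter as an argument, and the demo uses a stand-in converter so that it runs without extract_msg installed.

```python
import os
import tempfile


def convert_directory(src_dir, dst_dir, convert):
    """Hypothetical helper: run convert(msg_path, eml_path) on every .msg file.

    Returns a dict mapping each input file name to the converter's result.
    """
    os.makedirs(dst_dir, exist_ok=True)
    results = {}
    for name in sorted(os.listdir(src_dir)):
        if not name.lower().endswith(".msg"):
            continue  # ignore anything that is not a .msg file
        msg_path = os.path.join(src_dir, name)
        eml_path = os.path.join(dst_dir, os.path.splitext(name)[0] + ".eml")
        results[name] = bool(convert(msg_path, eml_path))
    return results


def fake_convert(msg_path, eml_path):
    """Stand-in converter for the demo; pass msg_to_eml_alternative in practice."""
    with open(eml_path, "w", encoding="utf-8") as f:
        f.write("Subject: placeholder\n\n")
    return True


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "in")
    dst = os.path.join(tmp, "out")
    os.makedirs(src)
    for name in ("a.msg", "b.MSG", "notes.txt"):
        open(os.path.join(src, name), "w").close()
    results = convert_directory(src, dst, fake_convert)
    written = sorted(os.listdir(dst))
```

Collecting the boolean results per file makes it easy to retry or report only the conversions that returned False.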

Best Practices

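  • Define a module-level logger before calling: the function looks up the global name logger, and if it is missing the call raises NameError instead of returning False
  • The function itself checks that msg_path exists and returns False if it does not, so a separate check is only needed to tell a missing file apart from other failures
  • Ensure the output directory for eml_path exists and has write permissions
  • Handle the boolean return value to determine if conversion succeeded
  • The manual path writes the file as UTF-8 and labels both text bodies quoted-printable, but it writes msg.body and the HTML unencoded, so non-ASCII characters appear as raw 8-bit text under a header that declares otherwise
  • The manual path creates a MIME multipart/mixed message with a random boundary string; it writes a delimiter after every part, so an empty part precedes the closing delimiter
  • Attachments are base64-encoded and wrapped at 76 characters per line, the limit set by RFC 2045
  • If the msg file has both plain text and HTML bodies, both are included as sibling parts of multipart/mixed rather than inside multipart/alternative, so a mail client may display both
  • All exceptions are caught and logged rather than raised, so check the return value
  • save_email is used whenever the Message object has that attribute; manual construction runs only when it is absent, and an exception raised inside save_email returns False without falling back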
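
The manual path wraps base64 data at 76 characters and labels the text parts quoted-printable. A short sketch of both transfer encodings using only the standard library follows; wrap_base64 and encode_quoted_printable are hypothetical helper names, and the quoted-printable step is something the documented function does not perform.

```python
import base64
import quopri


def wrap_base64(data, width=76):
    """Hypothetical helper: base64-encode bytes and wrap at `width` characters."""
    encoded = base64.b64encode(data).decode("ascii")
    return "".join(
        encoded[i:i + width] + "\n" for i in range(0, len(encoded), width)
    )


def encode_quoted_printable(text):
    """Hypothetical helper: encode text so it matches a quoted-printable header."""
    return quopri.encodestring(text.encode("utf-8")).decode("ascii")


# Base64: manual chunking gives the same result as base64.encodebytes
payload = bytes(range(256))
wrapped = wrap_base64(payload)

# Quoted-printable: non-ASCII characters and "=" become =XX escapes
body = "Café au lait = 5€"
qp_body = encode_quoted_printable(body)
decoded_body = quopri.decodestring(qp_body.encode("ascii")).decode("utf-8")
```

Encoding the body this way keeps the written bytes consistent with a Content-Transfer-Encoding: quoted-printable header; the alternative is to leave the text unencoded and declare 8bit instead.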

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function msg_to_eml 90.4% similar

    Converts Microsoft Outlook .msg files to standard .eml format, preserving email headers, body content (plain text and HTML), and attachments.

    From: /tf/active/vicechatdev/msg_to_eml.py
  • function msg_to_pdf_improved 80.0% similar

    Converts a Microsoft Outlook .msg file to PDF format using EML as an intermediate format for improved reliability, with fallback to direct conversion if needed.

    From: /tf/active/vicechatdev/msg_to_eml.py
  • function msg_to_pdf 75.5% similar

    Converts a Microsoft Outlook .msg email file to a single PDF document, including the email body and all attachments merged together.

    From: /tf/active/vicechatdev/msg_to_eml.py
  • function generate_simple_html_from_eml 66.9% similar

    Converts an email.message.Message object into a clean, styled HTML representation with embedded inline images and attachment listings.

    From: /tf/active/vicechatdev/msg_to_eml.py
  • function generate_html_from_msg 66.4% similar

    Converts an email message object into a formatted HTML representation with styling, headers, body content, and attachment information.

    From: /tf/active/vicechatdev/msg_to_eml.py