test_markdown_link_parsing

function test_markdown_link_parsing

Maturity: 42

A print-based test function that exercises markdown link parsing: it extracts the link text and URL from a markdown link in the format produced by the Quill editor, then percent-encodes the URL, which contains special characters.

File:
/tf/active/vicechatdev/test_complex_hyperlink.py

Lines:
50 - 80

Complexity:
simple

Purpose

This function is a smoke test for parsing markdown links whose URLs contain special characters (such as &, commas, + signs, encoded spaces, and a URL fragment). It splits the markdown link syntax with a regular expression, extracts the link text and URL, and rebuilds the URL by percent-encoding everything after the domain while leaving the delimiters /, ?, &, =, :, # and existing % escapes untouched. It contains no assertions, so it reports its results by printing them rather than by passing or failing.
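To make the parsing step concrete, the following minimal sketch shows what `re.split` returns when the pattern contains two capture groups. The sample string and its `example.com` URL are made-up placeholders, not values from the original test.

```python
import re

# Same pattern as the test function: group 1 is the link text, group 2 the URL.
LINK_PATTERN = r'\[([^\]]+)\]\(([^)]+)\)'

# Hypothetical input, shortened for readability.
sample = "See [Cost model.xlsx](https://example.com/files/a%20b/?filter=x&y#frag) for details"

parts = re.split(LINK_PATTERN, sample)

# re.split keeps captured groups in its result, so the list holds:
# [text before the link, link text, URL, text after the link]
before, text, url, after = parts
print(parts)
```

Because the captured groups land at indices 1 and 2, the original function's `len(link_parts) >= 3` check is enough to tell whether a link was found.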

Source Code

def test_markdown_link_parsing():
    """Test markdown link parsing with complex URLs"""
    print("\nTesting markdown link parsing...")
    
    # Test the exact format that would come from Quill editor
    markdown_text = "[3.5.1 Cost model for WBPK022&K024,K034_20240624.xlsx](https://filecloud.vicebio.com/ui/core/index.html?filter=3.5.1+Cost+model+for+WBPK022&K024,K034_20240624.xlsx#expl-tabl./SHARED/vicebio_shares/Wuxi/3%20WO-CO%20&%20invoice%20plan/3.5%20Cost%20Model/)"
    
    print(f"Input markdown: {markdown_text}")
    
    import re
    # Test URL extraction
    link_parts = re.split(r'\[([^\]]+)\]\(([^)]+)\)', markdown_text)
    print(f"Parsed parts: {link_parts}")
    
    if len(link_parts) >= 3:
        text = link_parts[1]
        url = link_parts[2] 
        print(f"Extracted text: '{text}'")
        print(f"Extracted URL: '{url}'")
        
        # Test URL encoding
        import urllib.parse
        if '://' in url:
            scheme_and_domain, path_part = url.split('://', 1)
            if '/' in path_part:
                domain, path = path_part.split('/', 1)
                encoded_path = urllib.parse.quote(path, safe='/?&=:#%')
                clean_url = f"{scheme_and_domain}://{domain}/{encoded_path}"
                print(f"Cleaned URL: '{clean_url}'")
    
    print("✅ URL parsing test completed")

Return Value

This function does not return any value (implicitly returns None). It prints test results and status messages to stdout, including the input markdown, parsed parts, extracted text and URL, and the cleaned/encoded URL.
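Because nothing is returned or asserted, a test runner cannot detect a regression in this function. One possible restructuring, shown here only as a sketch, moves the logic into a helper that returns its results and checks them with assertions. The names `parse_markdown_link` and `test_parse_markdown_link` and the `example.com` URL are hypothetical and do not appear in the original file.

```python
import re
import urllib.parse

LINK_PATTERN = re.compile(r'\[([^\]]+)\]\(([^)]+)\)')


def parse_markdown_link(markdown_text):
    """Hypothetical helper: return (text, url, clean_url), or None if no link is found."""
    match = LINK_PATTERN.search(markdown_text)
    if match is None:
        return None
    text, url = match.group(1), match.group(2)
    clean_url = url  # fall back to the raw URL when it cannot be split
    if '://' in url:
        scheme, rest = url.split('://', 1)
        if '/' in rest:
            domain, path = rest.split('/', 1)
            # Same safe set as the original function.
            encoded = urllib.parse.quote(path, safe='/?&=:#%')
            clean_url = f"{scheme}://{domain}/{encoded}"
    return text, url, clean_url


def test_parse_markdown_link():
    # Assumed sample input with a literal space in the path.
    result = parse_markdown_link("[a b.xlsx](https://example.com/dir/a b.xlsx?x=1#top)")
    assert result is not None
    text, url, clean_url = result
    assert text == "a b.xlsx"
    assert url == "https://example.com/dir/a b.xlsx?x=1#top"
    assert clean_url == "https://example.com/dir/a%20b.xlsx?x=1#top"
```

With this shape, a runner such as pytest would report a failure if the extraction or encoding changed, instead of relying on someone reading the printed output.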

Dependencies

re
urllib.parse

Required Imports

import re
import urllib.parse

Usage Example

import re
import urllib.parse

def test_markdown_link_parsing():
    """Test markdown link parsing with complex URLs"""
    print("\nTesting markdown link parsing...")
    
    markdown_text = "[3.5.1 Cost model for WBPK022&K024,K034_20240624.xlsx](https://filecloud.vicebio.com/ui/core/index.html?filter=3.5.1+Cost+model+for+WBPK022&K024,K034_20240624.xlsx#expl-tabl./SHARED/vicebio_shares/Wuxi/3%20WO-CO%20&%20invoice%20plan/3.5%20Cost%20Model/)"
    
    print(f"Input markdown: {markdown_text}")
    
    link_parts = re.split(r'\[([^\]]+)\]\(([^)]+)\)', markdown_text)
    print(f"Parsed parts: {link_parts}")
    
    if len(link_parts) >= 3:
        text = link_parts[1]
        url = link_parts[2] 
        print(f"Extracted text: '{text}'")
        print(f"Extracted URL: '{url}'")
        
        if '://' in url:
            scheme_and_domain, path_part = url.split('://', 1)
            if '/' in path_part:
                domain, path = path_part.split('/', 1)
                encoded_path = urllib.parse.quote(path, safe='/?&=:#%')
                clean_url = f"{scheme_and_domain}://{domain}/{encoded_path}"
                print(f"Cleaned URL: '{clean_url}'")
    
    print("✅ URL parsing test completed")

# Run the test
test_markdown_link_parsing()

Best Practices

This is a test function meant for validation, not production use.
The regex pattern r'\[([^\]]+)\]\(([^)]+)\)' assumes well-formed markdown links; it does not handle nested brackets, escaped characters, or URLs that contain a closing parenthesis.
The safe set passed to urllib.parse.quote ('/?&=:#%') keeps URL delimiters and existing percent-escapes intact, but other characters such as '+' and ',' are percent-encoded (as %2B and %2C); adjust it to the URL's requirements.
The function assumes the URL contains a '://' scheme separator and at least one '/' after the domain; otherwise no cleaned URL is printed. Note that the variable scheme_and_domain holds only the scheme.
For production code, consider using a dedicated markdown parsing library instead of regex.
The function prints directly to stdout and makes no assertions; consider using logging or returning results for better testability.
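The point about the safe character set can be illustrated with a sketch that encodes each URL component separately using `urllib.parse.urlsplit` and `urlunsplit`. This is an alternative technique, not what the original function does. The helper name `encode_url_components`, the per-component safe sets, and the sample URL are assumptions chosen for illustration.

```python
from urllib.parse import urlsplit, urlunsplit, quote


def encode_url_components(url):
    """Hypothetical helper: percent-encode path, query and fragment separately."""
    parts = urlsplit(url)
    # '%' stays safe so existing escapes such as %20 are not double-encoded.
    path = quote(parts.path, safe='/%')
    # '+' and ',' stay literal in the query; '&' and '=' remain delimiters.
    query = quote(parts.query, safe='&=+,%')
    fragment = quote(parts.fragment, safe='/&%')
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))


# Assumed sample URL with '+', ',' and literal spaces.
url = "https://example.com/ui/index.html?filter=a+b,c d#tab./x y/"

# The original safe set encodes '+' and ',' because they are not listed in it.
original_style = quote("index.html?filter=a+b,c", safe='/?&=:#%')
print(original_style)

component_wise = encode_url_components(url)
print(component_wise)
```

Whether '+' should survive in the query depends on how the receiving server decodes it, so the safe sets above are a starting point rather than a recommendation.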

Similar Components

AI-powered semantic similarity - components with related functionality:

function test_complex_url_hyperlink 64.7% similar

A test function that validates the creation of Word documents with complex FileCloud URLs containing special characters, query parameters, and URL fragments as clickable hyperlinks.
From: /tf/active/vicechatdev/test_complex_hyperlink.py

function test_fixes 55.2% similar

A comprehensive test function that validates email template rendering and CDocs application link presence in a document management system's email notification templates.
From: /tf/active/vicechatdev/test_comprehensive_fixes.py

function test_document_extractor 53.7% similar

A test function that validates the DocumentExtractor class by testing file type support detection, text extraction from various document formats, and error handling.
From: /tf/active/vicechatdev/leexi/test_document_extractor.py

function test_multiple_files 52.8% similar

A test function that validates the extraction of text content from multiple document files using a DocumentExtractor instance, displaying extraction results and simulating combined content processing.
From: /tf/active/vicechatdev/leexi/test_multiple_files.py

function test_mixed_previous_reports 51.9% similar

A test function that validates the DocumentExtractor's ability to extract text content from multiple file formats (text and markdown) and combine them into a unified previous reports summary.
From: /tf/active/vicechatdev/leexi/test_enhanced_reports.py
