function main_v98
Command-line application that uploads PDF files without WUXI coding from a local directory to a FileCloud server, with support for dry-run mode and customizable file patterns.
/tf/active/vicechatdev/mailsearch/upload_non_wuxi_coded.py
157 - 246
moderate
Purpose
This is the main entry point for a file upload utility that filters PDF files based on WUXI coding patterns and uploads them to a specific FileCloud location. It's designed for document management workflows where files need to be categorized and uploaded to a shared cloud storage system. The function handles authentication, file filtering, batch uploading, and provides detailed progress reporting with success/error summaries.
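The filtering step described above can be sketched as follows. `has_wuxi_coding()` is not shown in this listing, so the predicate below is a hypothetical stand-in: the regular expression `WUXI_PATTERN` and the helper names `has_wuxi_coding_stub` and `select_files` are assumptions made only for illustration, not the project's actual rule.

```python
import re
import tempfile
from pathlib import Path

# ASSUMPTION: illustrative pattern only; the real has_wuxi_coding() rule
# is not shown in the source.
WUXI_PATTERN = re.compile(r"wuxi", re.IGNORECASE)


def has_wuxi_coding_stub(filename: str) -> bool:
    """Hypothetical stand-in for has_wuxi_coding()."""
    return WUXI_PATTERN.search(filename) is not None


def select_files(source: Path, pattern: str = "*.pdf") -> list[Path]:
    """Return files in `source` matching `pattern` that the stub does not flag."""
    return sorted(f for f in source.glob(pattern) if not has_wuxi_coding_stub(f.name))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name in ("report.pdf", "WUXI-123_batch.pdf", "notes.txt"):
            (root / name).touch()
        # Only the PDF whose name is not flagged by the stub is selected.
        print([f.name for f in select_files(root)])
```

As in `main()`, the glob pattern is applied first and the name-based filter second, so files that do not match the pattern are never checked for coding.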
Source Code
def main():
    parser = argparse.ArgumentParser(
        description="Upload non-WUXI coded files from output folder to FileCloud"
    )
    parser.add_argument(
        '--source',
        default='./output',
        help='Source directory (default: ./output)'
    )
    parser.add_argument(
        '--target',
        default='/SHARED/vicebio_shares/03_CMC/e-sign - document to approve/Extract docusign - not Wuxi coded',
        help='Target folder in FileCloud'
    )
    parser.add_argument(
        '--dry-run',
        action='store_true',
        help='Show what would be uploaded without actually uploading'
    )
    parser.add_argument(
        '--pattern',
        default='*.pdf',
        help='File pattern to match (default: *.pdf)'
    )
    args = parser.parse_args()

    # Setup timezone
    cet_timezone = ZoneInfo("Europe/Brussels")

    # Find all PDF files without WUXI coding
    source_path = Path(args.source)
    all_files = list(source_path.glob(args.pattern))
    non_wuxi_files = [f for f in all_files if not has_wuxi_coding(f.name)]

    print(f"Found {len(all_files)} total files")
    print(f"Filtered to {len(non_wuxi_files)} files without WUXI coding")
    print(f"Target folder: {args.target}")
    print("=" * 80)

    if not non_wuxi_files:
        print("No files to upload")
        return

    if args.dry_run:
        print("\nDRY RUN MODE - No files will be uploaded")
        print("=" * 80)
        for file_path in sorted(non_wuxi_files):
            print(f"\n{file_path.name}")
            print(f"  → Would upload to: {args.target}/{file_path.name}")
        return

    # Login to FileCloud
    print("\nLogging in to FileCloud...")
    Headers = {'Accept': 'application/json'}
    Creds = {'userid': 'wim@vicebio.com', 'password': 'Studico01!'}
    ServerURL = 'https://filecloud.vicebio.com/'
    LoginEndPoint = 'core/loginguest'
    s = requests.session()
    LoginCall = s.post(ServerURL + LoginEndPoint, data=Creds, headers=Headers).json()
    print("✓ Logged in successfully")
    print("=" * 80)

    # Upload files
    success_count = 0
    error_count = 0
    for file_path in sorted(non_wuxi_files):
        try:
            if upload_file_to_filecloud(str(file_path), args.target, s, cet_timezone, args.dry_run):
                success_count += 1
            else:
                error_count += 1
        except Exception as e:
            print(f"\n{file_path.name}")
            print(f"  ✗ Error: {e}")
            error_count += 1

    # Summary
    print("\n" + "=" * 80)
    print("SUMMARY")
    print("=" * 80)
    print(f"Total files: {len(non_wuxi_files)}")
    print(f"Successful: {success_count}")
    print(f"Errors: {error_count}")
Return Value
Returns None implicitly. The function works entirely through side effects (file uploads and console output). It returns early when no files remain after filtering, or after listing the planned uploads in dry-run mode.
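Because `main()` returns `None` and catches per-file exceptions, the process exits with status 0 even when some uploads fail, so a calling script cannot detect failures from the exit code. A minimal sketch of one way to surface them, assuming a hypothetical variant of `main()` that returns its error counter; `exit_code_for` is a name introduced here and is not part of the original module.

```python
def exit_code_for(error_count: int) -> int:
    """Map the upload error counter to a process exit status (hypothetical helper)."""
    return 1 if error_count > 0 else 0


# ASSUMPTION: in the entry point this could be wired up as
#     sys.exit(exit_code_for(error_count))
# once main() is changed to return error_count.
print(exit_code_for(0), exit_code_for(2))
```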
Dependencies
- argparse
- pathlib
- requests
- xmltodict
- datetime
- zoneinfo
- os
- re
Required Imports
import argparse
from pathlib import Path
import requests
import xmltodict
from datetime import datetime
from zoneinfo import ZoneInfo
import os
import re
Usage Example
# Run with default settings
if __name__ == '__main__':
    main()
# Command-line usage examples:
# python script.py
# python script.py --source ./my_pdfs --pattern '*.pdf'
# python script.py --dry-run
# python script.py --target '/SHARED/custom_folder' --source ./docs
# python script.py --pattern '*.docx' --dry-run
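To see how these invocations resolve, the options defined in `main()` can be exercised directly by passing an argument list to `parse_args()`. This is a small sketch; `build_parser` is a helper name introduced here for illustration, and the help strings are omitted for brevity.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Rebuild the options defined in main() (helper name is illustrative)."""
    parser = argparse.ArgumentParser(
        description="Upload non-WUXI coded files from output folder to FileCloud"
    )
    parser.add_argument("--source", default="./output")
    parser.add_argument(
        "--target",
        default="/SHARED/vicebio_shares/03_CMC/e-sign - document to approve/Extract docusign - not Wuxi coded",
    )
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--pattern", default="*.pdf")
    return parser


if __name__ == "__main__":
    # Passing a list to parse_args() avoids reading sys.argv.
    args = build_parser().parse_args(["--pattern", "*.docx", "--dry-run"])
    # argparse exposes --dry-run as the attribute dry_run.
    print(args.source, args.pattern, args.dry_run)
```

Note that `--pattern` changes only the glob: the WUXI filter is still applied to whatever files match, and the default target folder is used unless `--target` is given.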
Best Practices
- SECURITY WARNING: Credentials are hardcoded in the source code. Use environment variables or secure credential management instead.
- The function depends on external functions 'has_wuxi_coding()' and 'upload_file_to_filecloud()' which must be defined in the same module.
- Use --dry-run flag first to verify which files will be uploaded before performing actual uploads.
- Ensure the source directory exists and contains files matching the pattern before running.
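- The function uses a persistent session object (requests.session()) so that the login is reused for subsequent FileCloud API calls. Note that the login response (LoginCall) is never inspected, so the "Logged in successfully" message is printed even if authentication failed.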
- Error handling is implemented per-file, so one failure won't stop the entire batch.
- The timezone is hardcoded to Europe/Brussels (CET in winter, CEST in summer); adjust it if the script is used in a different region.
- Consider implementing retry logic for network failures in production use.
- The function prints progress to stdout - redirect or capture if logging to file is needed.
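Two of the recommendations above, reading credentials from the environment and retrying transient failures, can be sketched as below. The variable names `FILECLOUD_USER` and `FILECLOUD_PASSWORD` and the helpers `load_credentials` and `with_retries` are assumptions for illustration and are not part of the original module; the returned dictionary uses the same `userid`/`password` keys as the login call in the source.

```python
import os
import time
from typing import Callable, Mapping, TypeVar

T = TypeVar("T")


def load_credentials(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Build the login payload from environment variables (names are hypothetical)."""
    try:
        return {"userid": env["FILECLOUD_USER"], "password": env["FILECLOUD_PASSWORD"]}
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing.args[0]}") from None


def with_retries(func: Callable[[], T], attempts: int = 3, delay: float = 1.0) -> T:
    """Call func(), retrying on any exception with a linearly growing pause."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay * attempt)
    raise ValueError("attempts must be at least 1")


if __name__ == "__main__":
    creds = load_credentials(
        {"FILECLOUD_USER": "user@example.com", "FILECLOUD_PASSWORD": "secret"}
    )
    print(sorted(creds))

    calls = {"n": 0}

    def flaky() -> str:
        # Fails twice, then succeeds, to exercise the retry path.
        calls["n"] += 1
        if calls["n"] < 3:
            raise ConnectionError("temporary failure")
        return "uploaded"

    print(with_retries(flaky, attempts=3, delay=0.0), calls["n"])
```

Whether wrapping each upload in such a helper is useful depends on how `upload_file_to_filecloud()` reports network problems: the listing shows that it can return a falsy value, and a retry wrapper of this kind only reacts to raised exceptions.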