class OneDriveProcessor
OneDriveProcessor is a class that monitors a OneDrive folder for new files, processes them using an E-Ink LLM Assistant, and uploads the results back to OneDrive.
/tf/active/vicechatdev/e-ink-llm/onedrive_client.py
510 - 625
moderate
Purpose
This class provides automated file processing integration with OneDrive for the E-Ink LLM Assistant. It continuously watches a specified OneDrive folder for new files (PDFs and images), downloads them, processes them through the LLM assistant, and uploads the processed results to an output folder. It supports configurable polling intervals, automatic folder creation, and optional deletion of processed files.
Source Code
class OneDriveProcessor:
    """OneDrive file processor for E-Ink LLM Assistant"""

    def __init__(self, onedrive_config: Dict[str, Any], api_key: str):
        """
        Initialize OneDrive processor

        Args:
            onedrive_config: OneDrive configuration dictionary
            api_key: OpenAI API key
        """
        self.client = OneDriveClient(onedrive_config)
        self.api_key = api_key
        self.config = onedrive_config

        # Configuration
        self.watch_folder = onedrive_config.get('watch_folder_path', '/E-Ink LLM Input')
        self.output_folder = onedrive_config.get('output_folder_path', '/E-Ink LLM Output')
        self.poll_interval = onedrive_config.get('poll_interval', 60)
        self.processed_files = set()

        # Supported file types
        self.supported_extensions = ['.pdf', '.jpg', '.jpeg', '.png', '.gif', '.bmp', '.tiff', '.webp']

        print(f"OneDrive watch folder: {self.watch_folder}")
        print(f"OneDrive output folder: {self.output_folder}")

    async def initialize(self) -> bool:
        """Initialize OneDrive connection"""
        success = await self.client.authenticate()
        if success:
            # Ensure folders exist
            await self.client.create_folder(self.watch_folder)
            await self.client.create_folder(self.output_folder)
        return success

    async def start_watching(self) -> None:
        """Start watching OneDrive folder for new files"""
        if not await self.initialize():
            print("Failed to initialize OneDrive connection")
            return

        print(f"Watching OneDrive folder: {self.watch_folder}")
        print(f"Poll interval: {self.poll_interval} seconds")
        print("Press Ctrl+C to stop")

        try:
            while True:
                await self._check_for_new_files()
                await asyncio.sleep(self.poll_interval)
        except KeyboardInterrupt:
            print("\nOneDrive watching stopped")

    async def _check_for_new_files(self) -> None:
        """Check for new files in OneDrive watch folder"""
        try:
            files = await self.client.list_files_in_folder(
                self.watch_folder,
                self.supported_extensions
            )
            new_files = [f for f in files if f['id'] not in self.processed_files]

            if new_files:
                print(f"Found {len(new_files)} new files in OneDrive")
                for file_info in new_files:
                    await self._process_file(file_info)
                    self.processed_files.add(file_info['id'])
        except Exception as e:
            print(f"Error checking for new files: {e}")

    async def _process_file(self, file_info: Dict[str, Any]) -> None:
        """Process a single file from OneDrive"""
        print(f"Processing OneDrive file: {file_info['name']}")
        try:
            # Create temporary directory for processing
            temp_dir = Path("temp_onedrive")
            temp_dir.mkdir(exist_ok=True)

            # Download file
            local_input_path = temp_dir / file_info['name']
            if not await self.client.download_file(file_info, str(local_input_path)):
                return

            # Process with E-Ink LLM
            from processor import process_single_file
            result_path = await process_single_file(str(local_input_path), self.api_key)

            if result_path:
                # Upload result to OneDrive
                result_file = Path(result_path)
                upload_success = await self.client.upload_file(
                    str(result_file),
                    self.output_folder,
                    result_file.name
                )
                if upload_success:
                    print(f"Processed and uploaded: {file_info['name']} -> {result_file.name}")
                    # Optional: delete original file from input folder
                    if self.config.get('delete_after_processing', False):
                        await self.client.delete_file(file_info)

                # Clean up local files
                local_input_path.unlink(missing_ok=True)
                result_file.unlink(missing_ok=True)
            else:
                print(f"Failed to process: {file_info['name']}")
        except Exception as e:
            print(f"Error processing {file_info['name']}: {e}")
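The listing references `result_file` in its cleanup step, so the local files appear to be removed only after a result has been produced. As a possible variation (not the author's implementation), the sketch below does the work inside `tempfile.TemporaryDirectory`, which removes the scratch directory on every exit path. `fake_process` and `process_in_scratch_dir` are hypothetical stand-ins for `process_single_file` and `_process_file`, and the download is simulated by writing bytes to disk.

```python
import asyncio
import tempfile
from pathlib import Path
from typing import Optional


async def fake_process(input_path: str) -> Optional[str]:
    """Hypothetical stand-in for process_single_file: writes a result file."""
    source = Path(input_path)
    result = source.with_name(source.stem + "_result.txt")
    result.write_text("processed", encoding="utf-8")
    return str(result)


async def process_in_scratch_dir(name: str, payload: bytes) -> Optional[str]:
    """Process one file inside a scratch directory that is always removed."""
    # The directory is deleted when the with-block exits, including on errors.
    with tempfile.TemporaryDirectory(prefix="onedrive_") as scratch:
        local_input = Path(scratch) / name
        local_input.write_bytes(payload)  # stands in for client.download_file
        result_path = await fake_process(str(local_input))
        if result_path is None:
            return None
        # An upload would happen here, while the files still exist on disk.
        return Path(result_path).name


uploaded_name = asyncio.run(process_in_scratch_dir("scan.pdf", b"%PDF-"))
print(uploaded_name)
```

With this shape there is no need for explicit `unlink` calls, at the cost of creating a fresh directory per file.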
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
| bases | - | - | |
Parameter Details
onedrive_config: Dictionary containing OneDrive configuration settings. Expected keys include: 'watch_folder_path' (default: '/E-Ink LLM Input'), 'output_folder_path' (default: '/E-Ink LLM Output'), 'poll_interval' (default: 60 seconds), 'delete_after_processing' (boolean, optional). Also contains authentication credentials passed to OneDriveClient.
api_key: OpenAI API key string used for processing files through the LLM. Required for the process_single_file function to work.
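To illustrate how the defaults listed above resolve, the sketch below mirrors the `dict.get` lookups from the constructor. `resolve_settings` is a hypothetical helper written for this example; it is not part of the class.

```python
from typing import Any, Dict


def resolve_settings(onedrive_config: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: apply the same defaults the constructor uses."""
    return {
        "watch_folder": onedrive_config.get("watch_folder_path", "/E-Ink LLM Input"),
        "output_folder": onedrive_config.get("output_folder_path", "/E-Ink LLM Output"),
        "poll_interval": onedrive_config.get("poll_interval", 60),
        "delete_after_processing": onedrive_config.get("delete_after_processing", False),
    }


# Only poll_interval is overridden; every other key falls back to its default.
settings = resolve_settings({"poll_interval": 15})
print(settings)
```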
Return Value
The constructor returns an instance of OneDriveProcessor. The initialize() method returns a boolean indicating success/failure of OneDrive connection. The start_watching() method returns None and runs indefinitely until interrupted. Internal methods _check_for_new_files() and _process_file() return None and perform side effects (file processing and uploads).
Class Interface
Methods
__init__(self, onedrive_config: Dict[str, Any], api_key: str)
Purpose: Initialize the OneDrive processor with configuration and API credentials
Parameters:
onedrive_config: Dictionary containing OneDrive settings and authentication credentials
api_key: OpenAI API key for LLM processing
Returns: None (constructor)
async initialize(self) -> bool
Purpose: Authenticate with OneDrive and ensure required folders exist
Returns: Boolean indicating whether initialization was successful
async start_watching(self) -> None
Purpose: Start continuous monitoring of OneDrive folder for new files to process
Returns: None (runs indefinitely until KeyboardInterrupt)
async _check_for_new_files(self) -> None
Purpose: Check OneDrive watch folder for new files and process any found
Returns: None (performs side effects: processes files and updates processed_files set)
async _process_file(self, file_info: Dict[str, Any]) -> None
Purpose: Download a file from OneDrive, process it through LLM, and upload result
Parameters:
file_info: Dictionary containing file metadata including 'id', 'name', and download information
Returns: None (performs side effects: downloads, processes, uploads files)
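The interplay between `_check_for_new_files` and the `processed_files` set can be sketched without a OneDrive connection. In the example below, `FakeClient` is a hypothetical stand-in for `OneDriveClient`, and its extension filtering is an assumption about how `list_files_in_folder` behaves; `check_once` imitates a single polling pass.

```python
import asyncio
from typing import Any, Dict, List, Set


class FakeClient:
    """Hypothetical stand-in for OneDriveClient that returns a fixed listing."""

    def __init__(self, files: List[Dict[str, Any]]):
        self._files = files

    async def list_files_in_folder(
        self, folder: str, extensions: List[str]
    ) -> List[Dict[str, Any]]:
        # Assumption: the real client filters the listing by file extension.
        return [
            f for f in self._files
            if any(f["name"].lower().endswith(ext) for ext in extensions)
        ]


async def check_once(
    client: FakeClient, folder: str, extensions: List[str], processed: Set[str]
) -> List[str]:
    """One polling pass: handle files whose IDs have not been seen before."""
    files = await client.list_files_in_folder(folder, extensions)
    new_files = [f for f in files if f["id"] not in processed]
    handled = []
    for info in new_files:
        handled.append(info["name"])  # the class calls _process_file here
        processed.add(info["id"])
    return handled


async def demo():
    client = FakeClient([
        {"id": "1", "name": "a.pdf"},
        {"id": "2", "name": "b.png"},
        {"id": "3", "name": "notes.txt"},
    ])
    processed: Set[str] = set()
    first = await check_once(client, "/E-Ink LLM Input", [".pdf", ".png"], processed)
    second = await check_once(client, "/E-Ink LLM Input", [".pdf", ".png"], processed)
    return first, second


first_pass, second_pass = asyncio.run(demo())
print(first_pass, second_pass)
```

The second pass finds nothing to do because both matching IDs are already in the set.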
Attributes
| Name | Type | Description | Scope |
|---|---|---|---|
| client | OneDriveClient | OneDrive client instance for API interactions | instance |
| api_key | str | OpenAI API key for LLM processing | instance |
| config | Dict[str, Any] | Full OneDrive configuration dictionary | instance |
| watch_folder | str | OneDrive folder path to monitor for new files (default: '/E-Ink LLM Input') | instance |
| output_folder | str | OneDrive folder path where processed files are uploaded (default: '/E-Ink LLM Output') | instance |
| poll_interval | int | Seconds between checks for new files (default: 60) | instance |
| processed_files | set | Set of file IDs that have already been processed to avoid reprocessing | instance |
| supported_extensions | List[str] | List of file extensions that can be processed: ['.pdf', '.jpg', '.jpeg', '.png', '.gif', '.bmp', '.tiff', '.webp'] | instance |
Dependencies
- msal
- requests
- asyncio
- pathlib
Required Imports
import os
import json
import time
import asyncio
from pathlib import Path
from typing import Dict, List, Optional, Any
import hashlib
import msal
import requests
from datetime import datetime, timedelta
Conditional/Optional Imports
These imports are only needed under specific conditions:
from processor import process_single_file
Condition: Required when _process_file method is called to process downloaded files through the E-Ink LLM Assistant
Required (conditional)
Usage Example
import asyncio
from onedrive_client import OneDriveProcessor  # class is defined in onedrive_client.py

# Configuration
onedrive_config = {
    'client_id': 'your-client-id',
    'client_secret': 'your-client-secret',
    'tenant_id': 'your-tenant-id',
    'watch_folder_path': '/E-Ink LLM Input',
    'output_folder_path': '/E-Ink LLM Output',
    'poll_interval': 60,
    'delete_after_processing': False
}
api_key = 'your-openai-api-key'

# Create processor instance
processor = OneDriveProcessor(onedrive_config, api_key)

# Start watching (runs indefinitely)
async def main():
    await processor.start_watching()

asyncio.run(main())
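Because `start_watching()` loops until interrupted, it can be useful to bound a run, for instance in a smoke test or a scheduled job. The sketch below shows one way to do that with `asyncio.wait_for`. `fake_watch` is a hypothetical stand-in for `processor.start_watching()` so that the example runs without OneDrive credentials.

```python
import asyncio
from typing import List


async def fake_watch(poll_interval: float, ticks: List[str]) -> None:
    """Hypothetical stand-in for start_watching(): polls forever."""
    while True:
        ticks.append("poll")
        await asyncio.sleep(poll_interval)


async def run_for(seconds: float, ticks: List[str]) -> str:
    """Run the watcher for at most `seconds`, then cancel it."""
    try:
        await asyncio.wait_for(fake_watch(0.01, ticks), timeout=seconds)
    except asyncio.TimeoutError:
        return "stopped after timeout"
    return "watcher exited on its own"


ticks: List[str] = []
outcome = asyncio.run(run_for(0.05, ticks))
print(outcome, len(ticks))
```

In real use, the coroutine passed to `asyncio.wait_for` would be `processor.start_watching()`; cancellation lands at the next `await`, so a file that is mid-processing may be interrupted.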
Best Practices
- Always call initialize() or start_watching() (which calls initialize internally) before attempting to process files
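- The class tracks handled files in the in-memory processed_files set of file IDs. The set is lost on restart, so files still present in the watch folder are processed again unless delete_after_processing is enabled. A file's ID is added even when its processing fails, so failed files are not retried within the same run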
- Use start_watching() for continuous monitoring; it handles initialization automatically
- Ensure OneDriveClient is properly configured with valid authentication credentials before instantiation
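- The class writes temporary files to a 'temp_onedrive' directory in the current working directory. The downloaded input and the result file are deleted once a result has been produced; the directory itself is not removed, and a download whose processing fails can leave its input file behind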
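- start_watching() catches KeyboardInterrupt and prints a stop message. On Python 3.11 and later, asyncio.run() handles Ctrl+C by cancelling the main task and may raise KeyboardInterrupt from asyncio.run() itself, so wrap that call as well if a clean shutdown matters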
- The poll_interval should be set appropriately to balance responsiveness and API rate limits
- Supported file extensions are set in the constructor; to change them, assign a new list to the supported_extensions attribute on the instance before calling start_watching()
- Set delete_after_processing to True in config only if you want original files removed after successful processing
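Since `processed_files` lives only in memory, one option for avoiding reprocessing after a restart is to persist the IDs to disk. The sketch below is an illustration under that assumption, not something the class does today. `load_processed`, `save_processed`, and the `processed_ids.json` file name are all hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Set


def load_processed(path: Path) -> Set[str]:
    """Return previously saved file IDs, or an empty set on first run."""
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def save_processed(path: Path, ids: Set[str]) -> None:
    """Write the IDs as a sorted JSON list so the file is stable across runs."""
    path.write_text(json.dumps(sorted(ids)), encoding="utf-8")


# Demonstration in a throwaway directory; a real deployment would pick a
# durable location for the state file.
with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "processed_ids.json"
    ids = load_processed(state_file)      # empty on first run
    ids.add("file-123")
    save_processed(state_file, ids)
    restored = load_processed(state_file)  # what a restarted process would see

print(restored)
```

A processor could assign the loaded set to `processed_files` after construction and save it after each polling pass.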
Similar Components
AI-powered semantic similarity - components with related functionality:
- class MixedCloudProcessor 68.9% similar
- class OneDriveClient 61.1% similar
- class RemarkableEInkProcessor 60.9% similar
- class EInkLLMProcessor 56.7% similar
- class DocumentProcessor_v2 55.7% similar