šŸ” Code Extractor

function search_and_locate

Maturity: 47

Searches for specific numbered folders (01-08) in a SharePoint site and traces their locations, contents, and file distributions by type.

File: /tf/active/vicechatdev/SPFCsync/search_detailed.py
Lines: 10 - 125
Complexity: moderate

Purpose

This diagnostic function searches a SharePoint site through the Microsoft Graph API for eight expected organizational folders (01 UCJ Research, 02 Toxicology, 03 CMC, 04 Quality, 05 Clinical, 06 Regulatory, 07 Marketing, 08 Manufacturing). For every match it prints the item's name, parent path, and web URL; for each matching folder it also prints the number of subfolders and files directly inside it. A second pass searches for common document types (docx, pdf, xlsx, pptx) and reports how many matches fall under each parent folder. It is primarily used to troubleshoot missing folders or to understand how a SharePoint site is organized.

Source Code

def search_and_locate():
    """Search for specific folders and trace their locations"""
    config = Config()
    
    try:
        client = SharePointGraphClient(
            site_url=config.SHAREPOINT_SITE_URL,
            client_id=config.AZURE_CLIENT_ID,
            client_secret=config.AZURE_CLIENT_SECRET
        )
        
        print("✅ SharePoint Graph client initialized successfully")

    except Exception as e:
        print(f"❌ Failed to initialize client: {e}")
        return

    print("🔍 DETAILED FOLDER LOCATION SEARCH")
    print("=" * 60)
    
    # List of folders we expect to find
    expected_folders = [
        "01 UCJ Research",
        "02 Toxicology", 
        "03 CMC",
        "04 Quality",
        "05 Clinical",
        "06 Regulatory",
        "07 Marketing",
        "08 Manufacturing"
    ]
    
    for folder_name in expected_folders:
        print(f"\n📁 Searching for: '{folder_name}'")
        print("-" * 40)
        
        # Search using Graph API
        search_url = f"{client.graph_base_url}/sites/{client.site_id}/drive/root/search(q='{folder_name}')"
        
        try:
            response = client.session.get(search_url)
            if response.status_code == 200:
                results = response.json()
                
                if 'value' in results and results['value']:
                    print(f"✅ Found {len(results['value'])} items")

                    for item in results['value']:
                        item_type = "📁 Folder" if 'folder' in item else "📄 File"
                        name = item.get('name', 'Unknown')
                        parent_path = item.get('parentReference', {}).get('path', 'Unknown path')
                        web_url = item.get('webUrl', 'No URL')
                        
                        print(f"  {item_type}: {name}")
                        print(f"    Path: {parent_path}")
                        print(f"    URL: {web_url}")
                        
                        # If it's a folder, try to get its contents
                        if 'folder' in item:
                            folder_id = item['id']
                            children_url = f"{client.graph_base_url}/sites/{client.site_id}/drive/items/{folder_id}/children"
                            try:
                                children_response = client.session.get(children_url)
                                if children_response.status_code == 200:
                                    children = children_response.json()
                                    if 'value' in children:
                                        child_count = len(children['value'])
                                        folders = sum(1 for child in children['value'] if 'folder' in child)
                                        files = sum(1 for child in children['value'] if 'file' in child)
                                        print(f"    Contents: {folders} folders, {files} files (total: {child_count})")
                            except Exception as e:
                                print(f"    ⚠️  Couldn't access folder contents: {e}")
                        
                        print()
                else:
                    print("❌ No results found")
            else:
                print(f"❌ Search failed: {response.status_code} - {response.text}")

        except Exception as e:
            print(f"❌ Search error: {e}")
    
    # Also try searching for common file types to see if we find files from missing folders
    print("\n\n🔍 SEARCHING FOR FILES BY TYPE")
    print("=" * 40)
    
    file_types = ['docx', 'pdf', 'xlsx', 'pptx']
    
    for file_type in file_types:
        print(f"\n📄 Searching for .{file_type} files...")
        search_url = f"{client.graph_base_url}/sites/{client.site_id}/drive/root/search(q='*.{file_type}')"
        
        try:
            response = client.session.get(search_url)
            if response.status_code == 200:
                results = response.json()
                
                if 'value' in results and results['value']:
                    print(f"✅ Found {len(results['value'])} .{file_type} files")
                    
                    # Group by parent path to see folder distribution
                    path_counts = {}
                    for item in results['value']:
                        parent_path = item.get('parentReference', {}).get('path', 'Unknown')
                        path_counts[parent_path] = path_counts.get(parent_path, 0) + 1
                    
                    print("  Distribution by folder:")
                    for path, count in sorted(path_counts.items()):
                        print(f"    {count} files in: {path}")
                else:
                    print("❌ No files found")
            else:
                print(f"❌ Search failed: {response.status_code}")

        except Exception as e:
            print(f"❌ Search error: {e}")
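
The file-type pass in the source groups search hits by their parent path using a plain dict. As a minimal sketch of that same tally in isolation, the snippet below uses collections.Counter on a hand-written sample. The helper name count_by_parent_path and the sample items are hypothetical; they only mimic the shape of the 'value' array the function reads and are not taken from a real site.

```python
from collections import Counter


def count_by_parent_path(items):
    """Tally drive items by parentReference.path, defaulting to 'Unknown'."""
    return Counter(
        item.get("parentReference", {}).get("path", "Unknown") for item in items
    )


# Hypothetical sample shaped like the 'value' array of a search response.
sample_items = [
    {"name": "a.docx", "parentReference": {"path": "/drive/root:/03 CMC"}},
    {"name": "b.docx", "parentReference": {"path": "/drive/root:/03 CMC"}},
    {"name": "c.docx", "parentReference": {"path": "/drive/root:/04 Quality"}},
    {"name": "d.docx"},  # no parentReference, so it is counted under "Unknown"
]

counts = count_by_parent_path(sample_items)
for path, count in sorted(counts.items()):
    print(f"{count} files in: {path}")
```

Counter behaves like the dict in the original loop, so the sorted report at the end is unchanged; it only removes the manual get-and-increment step.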

Return Value

This function returns None in every case, including the early return when the client cannot be initialized. All results are side effects: it prints folder locations, web URLs, content counts, and file type distributions to stdout.
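
Because everything is reported through print, a caller that wants to inspect or store the report has to capture stdout. A minimal sketch using the standard library's contextlib.redirect_stdout follows. The names run_and_capture and fake_diagnostic are hypothetical; fake_diagnostic merely stands in for search_and_locate, which needs live credentials to run.

```python
import contextlib
import io


def run_and_capture(func):
    """Call func() and return everything it printed to stdout as one string."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func()
    return buffer.getvalue()


def fake_diagnostic():
    # Hypothetical stand-in for search_and_locate().
    print("Searching for: '03 CMC'")
    print("No results found")


report = run_and_capture(fake_diagnostic)

# The captured text can now be searched, logged, or written to a file.
found_missing = "No results found" in report
```

With the real function, the call would be run_and_capture(search_and_locate), assuming it is importable in the calling module.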

Dependencies

  • json (imported, but not used in the function body shown above)
  • sharepoint_graph_client
  • config

Required Imports

import json
from sharepoint_graph_client import SharePointGraphClient
from config import Config

Usage Example

# Ensure config.py exists with required settings:
# class Config:
#     SHAREPOINT_SITE_URL = 'https://yourtenant.sharepoint.com/sites/yoursite'
#     AZURE_CLIENT_ID = 'your-client-id'
#     AZURE_CLIENT_SECRET = 'your-client-secret'

# Ensure sharepoint_graph_client.py exists with SharePointGraphClient class

# Import the function from its module instead of redefining it
# (assumes search_detailed.py is on the import path)
from search_detailed import search_and_locate

# Run the search and diagnostic
search_and_locate()

# Output will be printed to console with:
# - Folder search results for 8 expected folders
# - Folder contents (subfolder and file counts)
# - File type distribution analysis
# - Web URLs for located items

Best Practices

  • Ensure the Azure AD app registration has been granted a Microsoft Graph permission that allows reading the site (Sites.Read.All or Sites.ReadWrite.All)
  • This function is designed for diagnostic/troubleshooting purposes and prints extensive output to the console
  • The function searches for hardcoded folder names - modify the expected_folders list to match your SharePoint structure
  • Handle rate limiting when searching large SharePoint sites - Microsoft Graph signals throttling with HTTP 429 and a Retry-After header, so consider adding delays or retries between requests
  • The function does not handle pagination - it reads only the first page of each response and never follows @odata.nextLink, so large result sets may be truncated
  • Errors are caught and printed rather than raised, so a caller cannot detect failures; this makes the function unsuitable for automated pipelines
  • Consider redirecting output to a log file when running in production environments
  • The folder name is interpolated directly into search(q='...') - names containing a single quote or other special characters may need escaping
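
Two of the points above, pagination and escaping, can be sketched as small helpers. This is an illustration under stated assumptions, not part of search_detailed.py: escape_search_term and iter_all_results are hypothetical names, the session is assumed to expose get(url) returning an object with status_code and json() (as client.session is used in the source), and doubling single quotes follows the usual OData string-literal convention. FakeSession and FakeResponse exist only so the sketch runs without network access.

```python
def escape_search_term(term):
    """Double single quotes so the term can sit inside search(q='...')."""
    return term.replace("'", "''")


def iter_all_results(session, first_url):
    """Yield every item across pages by following '@odata.nextLink'."""
    url = first_url
    while url:
        response = session.get(url)
        if response.status_code != 200:
            # Raise instead of printing so callers can detect the failure.
            raise RuntimeError(f"Request failed: {response.status_code}")
        payload = response.json()
        yield from payload.get("value", [])
        url = payload.get("@odata.nextLink")  # None on the last page


class FakeResponse:
    """Hypothetical stand-in for an HTTP response object."""

    def __init__(self, payload, status_code=200):
        self._payload = payload
        self.status_code = status_code

    def json(self):
        return self._payload


class FakeSession:
    """Hypothetical stand-in that serves canned pages keyed by URL."""

    def __init__(self, pages):
        self._pages = pages

    def get(self, url):
        return FakeResponse(self._pages[url])


# Two canned pages; the first points at the second via '@odata.nextLink'.
pages = {
    "page1": {"value": [{"name": "a"}, {"name": "b"}], "@odata.nextLink": "page2"},
    "page2": {"value": [{"name": "c"}]},
}

names = [item["name"] for item in iter_all_results(FakeSession(pages), "page1")]
escaped = escape_search_term("O'Brien reports")
```

In the real function, the first URL would be the search_url already built for each folder, and the escaped term would replace folder_name inside the q='...' expression. A term with other reserved characters would likely also need URL encoding, which this sketch does not attempt.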

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function search_for_folders 80.4% similar

    Searches for specific predefined folders in a SharePoint site using Microsoft Graph API and prints the search results with their locations.

    From: /tf/active/vicechatdev/SPFCsync/diagnostic_comprehensive.py
  • function analyze_structure 77.2% similar

    Analyzes and reports on the folder structure of a SharePoint site, displaying folder paths, file counts, and searching for expected folder patterns.

    From: /tf/active/vicechatdev/SPFCsync/analyze_structure.py
  • function test_folder_structure 71.7% similar

    Tests SharePoint folder structure by listing root-level folders, displaying their contents, and providing a summary of total folders and documents.

    From: /tf/active/vicechatdev/SPFCsync/test_folder_structure.py
  • function main_v24 71.7% similar

    A diagnostic function that explores SharePoint site structure to investigate why only 2 folders are visible when more are expected in the web interface.

    From: /tf/active/vicechatdev/SPFCsync/diagnostic_comprehensive.py
  • function explore_alternative_endpoints 69.6% similar

    Tests multiple Microsoft Graph API endpoints to locate missing folders in a SharePoint drive by trying different URL patterns and searching for expected folders.

    From: /tf/active/vicechatdev/SPFCsync/diagnostic_comprehensive.py