function search_documents
Searches for documents in a Neo4j graph database based on multiple optional filter criteria including text query, document type, department, status, and owner.
/tf/active/vicechatdev/document_controller_backup.py
308 - 402
moderate
Purpose
This function provides a flexible document search capability for a controlled document management system. It constructs and executes a Cypher query against a Neo4j database to retrieve documents matching specified criteria. The function supports text-based searching across document titles and descriptions, as well as filtering by metadata fields. It includes logging via a decorator and returns results as a list of dictionaries, making it suitable for API endpoints or UI search interfaces.
Source Code
def search_documents(query=None, doc_type=None, department=None, status=None, owner=None, limit=100, user=None):
    """
    Search for documents based on criteria.

    Parameters
    ----------
    query : str, optional
        Text search query
    doc_type : str, optional
        Document type to filter by
    department : str, optional
        Department to filter by
    status : str, optional
        Status to filter by
    owner : str, optional
        Owner UID to filter by
    limit : int, optional
        Maximum number of results to return
    user : DocUser, optional
        The current user (for permission filtering)

    Returns
    -------
    List[Dict[str, Any]]
        List of document dictionaries matching the search criteria
    """
    try:
        from CDocs.db import db_operations

        logger.info("Controller action: search_documents")

        # Build the Cypher query
        cypher_query = """
        MATCH (d:Document)
        """

        # Add optional filters
        where_clauses = []
        params = {}

        if query:
            where_clauses.append("(d.title CONTAINS $query OR d.description CONTAINS $query)")
            params["query"] = query

        if doc_type:
            where_clauses.append("d.doc_type = $doc_type")
            params["doc_type"] = doc_type

        if department:
            where_clauses.append("d.department = $department")
            params["department"] = department

        if status:
            where_clauses.append("d.status = $status")
            params["status"] = status

        if owner:
            where_clauses.append("d.owner_id = $owner")
            params["owner"] = owner

        # Add WHERE clause if we have any conditions
        if where_clauses:
            cypher_query += "WHERE " + " AND ".join(where_clauses)

        # Add permission filtering if user is provided
        # This is commented out for now as it depends on schema details
        # if user and hasattr(user, 'uid') and user.role != 'ADMIN':
        #     # Only add more WHERE conditions if we already have some
        #     connector = "AND" if where_clauses else "WHERE"
        #     cypher_query += f" {connector} (d.owner_id = $user_id OR d.is_public = true)"
        #     params["user_id"] = user.uid

        # Add RETURN clause with LIMIT
        cypher_query += f"""
        RETURN d
        ORDER BY d.created_date DESC
        LIMIT {int(limit)}
        """

        # Execute query
        result = db_operations.run_query(cypher_query, params)

        # Process results into a list of document dictionaries
        documents = []
        if result:
            for record in result:
                if 'd' in record:
                    document = dict(record['d'])
                    documents.append(document)

        return documents

    except Exception as e:
        logger.error(f"Error in controller action search_documents: {e}")
        raise e
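The clause-building portion of the function above can be pulled into a pure helper, which makes it testable without a database connection. The following is a minimal sketch under that assumption: `build_search_query` is a hypothetical name, not part of CDocs, and it mirrors the filter logic shown in the source rather than replacing it.

```python
from typing import Any, Dict, Tuple


def build_search_query(query=None, doc_type=None, department=None,
                       status=None, owner=None, limit=100) -> Tuple[str, Dict[str, Any]]:
    """Return (cypher, params) for the given filters (hypothetical helper)."""
    # Each entry: (Cypher condition, parameter name, supplied value)
    filters = [
        ("(d.title CONTAINS $query OR d.description CONTAINS $query)", "query", query),
        ("d.doc_type = $doc_type", "doc_type", doc_type),
        ("d.department = $department", "department", department),
        ("d.status = $status", "status", status),
        ("d.owner_id = $owner", "owner", owner),
    ]

    where_clauses = []
    params: Dict[str, Any] = {}
    for clause, name, value in filters:
        if value:  # falsy values (None, "") skip the filter, as in the source
            where_clauses.append(clause)
            params[name] = value

    lines = ["MATCH (d:Document)"]
    if where_clauses:
        lines.append("WHERE " + " AND ".join(where_clauses))
    # LIMIT is interpolated, so it is forced through int() first
    lines += ["RETURN d", "ORDER BY d.created_date DESC", f"LIMIT {int(limit)}"]
    return "\n".join(lines), params
```

Filter values travel as query parameters and only the integer limit is interpolated into the string, which is the same split the source function uses.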
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
| query | - | None | positional_or_keyword |
| doc_type | - | None | positional_or_keyword |
| department | - | None | positional_or_keyword |
| status | - | None | positional_or_keyword |
| owner | - | None | positional_or_keyword |
| limit | - | 100 | positional_or_keyword |
| user | - | None | positional_or_keyword |
Parameter Details
query: Optional text string to search within document titles and descriptions using case-sensitive CONTAINS matching. Can be any string value or None to skip text search filtering.
doc_type: Optional string to filter documents by their type classification (e.g., 'policy', 'procedure', 'form'). Must match the exact value stored in the document's doc_type property.
department: Optional string to filter documents by the department they belong to. Must match the exact department name stored in the document's department property.
status: Optional string to filter documents by their current status (e.g., 'DRAFT', 'PUBLISHED', 'ARCHIVED'). Must match one of the valid status values defined in the system.
owner: Optional string representing the owner's unique identifier (UID) to filter documents by ownership. Must match the owner_id property stored on document nodes.
limit: Integer specifying the maximum number of documents to return. Defaults to 100. The value is passed through int() before being interpolated into the LIMIT clause, so it should be a positive integer or a value convertible to one.
user: Optional DocUser object representing the current user making the search request. Intended for permission-based filtering (currently commented out in implementation). Should have 'uid' and 'role' attributes.
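The permission filtering that the `user` parameter is intended for could be added to the clause list before the WHERE clause is joined, which avoids the separate AND/WHERE connector logic in the commented-out code. The sketch below is an illustration only: `add_permission_filter` is a hypothetical helper, and the `d.is_public` property and the `'ADMIN'` role value are taken from the commented-out code, which the source notes depends on schema details.

```python
from typing import Any, Dict, List, Optional


def add_permission_filter(where_clauses: List[str], params: Dict[str, Any],
                          user: Optional[Any]) -> None:
    """Append an owner-or-public restriction for non-admin users.

    Mutates where_clauses and params in place (hypothetical helper).
    """
    # No user, or a user object without a uid: leave the query unrestricted,
    # matching the behavior of the commented-out code in the source.
    if user is None or not hasattr(user, "uid"):
        return
    # Admins see everything; 'ADMIN' is the role value assumed by the source.
    if getattr(user, "role", None) == "ADMIN":
        return
    # Assumes Document nodes carry owner_id and is_public properties.
    where_clauses.append("(d.owner_id = $user_id OR d.is_public = true)")
    params["user_id"] = user.uid
```

Note that a missing user leaves results unrestricted in this sketch; a stricter design would treat the absence of a user as "public documents only".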
Return Value
Returns a List[Dict[str, Any]] containing document dictionaries. Each dictionary represents a document node from the database with all its properties (e.g., title, description, doc_type, department, status, owner_id, created_date). Returns an empty list if no documents match the criteria or if the query yields no result. Errors are not converted to an empty list: any exception raised during query construction or execution is logged and re-raised to the caller. Documents are ordered by created_date in descending order (newest first).
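Because the function re-raises exceptions, a caller that prefers an empty result on failure has to add that behavior itself. The following is a minimal sketch of one way to do so; `safe_search` is a hypothetical wrapper, and the search function is passed in as an argument so the example stays self-contained.

```python
import logging
from typing import Any, Callable, Dict, List

logger = logging.getLogger(__name__)


def safe_search(search_fn: Callable[..., List[Dict[str, Any]]],
                **criteria: Any) -> List[Dict[str, Any]]:
    """Call search_fn(**criteria), returning [] instead of raising on failure."""
    try:
        return search_fn(**criteria)
    except Exception:
        # logger.exception records the traceback along with the message
        logger.exception("Document search failed for criteria %r", criteria)
        return []
```

A call such as `safe_search(search_documents, query='safety', limit=20)` would then return either the matching documents or an empty list. Swallowing errors this way hides database outages from the caller, so it is likely better suited to UI code than to API endpoints that should report failures.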
Dependencies
- logging
- CDocs.db.db_operations
- CDocs.controllers (for log_controller_action decorator)
- CDocs.models.user_extensions (for DocUser type hint)
Required Imports
import logging
from CDocs.controllers import log_controller_action
Conditional/Optional Imports
These imports are only needed under specific conditions:
from CDocs.db import db_operations
Condition: imported lazily inside the function at runtime, required for all executions
Required (conditional)
Usage Example
# Basic text search
from CDocs.controllers import log_controller_action
import logging

logger = logging.getLogger(__name__)

@log_controller_action('search_documents')
def search_documents(query=None, doc_type=None, department=None, status=None, owner=None, limit=100, user=None):
    # ... function implementation ...
    pass

# Search for documents with 'safety' in title or description
results = search_documents(query='safety')

# Search for published policies in Engineering department
results = search_documents(
    doc_type='policy',
    department='Engineering',
    status='PUBLISHED',
    limit=50
)

# Search for documents owned by specific user
results = search_documents(owner='user-123-uid', limit=20)

# Combined search with multiple filters
results = search_documents(
    query='quality',
    doc_type='procedure',
    status='EFFECTIVE',
    department='QA',
    limit=10
)

# Process results
for doc in results:
    print(f"Title: {doc.get('title')}, Status: {doc.get('status')}")
Best Practices
- Always handle the returned list safely as it may be empty if no documents match the criteria
- Be aware that the text search using CONTAINS is case-sensitive; consider normalizing query strings if case-insensitive search is needed
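- The limit parameter is cast to int before being interpolated into the Cypher string, which guards against Cypher injection (all other filter values are passed as query parameters); still validate it before calling if it comes from user input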
- The user-based permission filtering is currently commented out; implement it if role-based access control is required
- Catch and handle exceptions appropriately as the function re-raises any errors encountered
- Consider the performance impact of text searches on large document collections; indexing title and description fields in Neo4j is recommended
- The function uses lazy import of db_operations which may hide import errors until runtime; ensure the module is available
- When using multiple filters, understand they are combined with AND logic, which may return fewer results than expected
- The ORDER BY created_date DESC ensures newest documents appear first, but this may not be suitable for all use cases
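Two of the points above, case-insensitive text search and validating `limit`, can be sketched as small helpers. Both names (`text_clause`, `normalize_limit`) and the `maximum` cap of 1000 are assumptions for illustration and are not part of CDocs. The case-insensitive variant relies on Cypher's built-in `toLower()` function.

```python
def text_clause(case_insensitive: bool = True) -> str:
    """Return the Cypher condition used for the text filter (hypothetical helper)."""
    if case_insensitive:
        # Lower-case both sides so 'Safety' and 'safety' match each other
        return ("(toLower(d.title) CONTAINS toLower($query) "
                "OR toLower(d.description) CONTAINS toLower($query))")
    # Same condition the source function uses (case-sensitive)
    return "(d.title CONTAINS $query OR d.description CONTAINS $query)"


def normalize_limit(raw, default: int = 100, maximum: int = 1000) -> int:
    """Coerce untrusted input to a usable LIMIT value (hypothetical helper)."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return default  # not a number at all
    if value < 1:
        return default  # zero and negatives fall back to the default
    return min(value, maximum)  # cap very large requests
```

Wrapping a property in `toLower()` may prevent Neo4j from using an index on that property, so the case-insensitive form is likely to cost more on large collections; storing a pre-lowercased copy of the field is a common alternative.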
Similar Components
AI-powered semantic similarity - components with related functionality:
- function get_documents 82.1% similar
- function get_document_audit_trail 55.1% similar
- function create_document_v2 50.3% similar
- function get_document 50.2% similar
- function run_query 49.1% similar