class SessionInfo
A dataclass that stores session information extracted from PDF documents, including conversation ID, exchange number, confidence level, and source of extraction.
/tf/active/vicechatdev/e-ink-llm/session_detector.py
26 - 31
simple
Purpose
SessionInfo serves as a structured data container for metadata about conversation sessions extracted from PDF files. It tracks the conversation identifier, the exchange number within that conversation, the confidence level of the extraction (0.0 to 1.0), and the source method used to extract this information (metadata, footer, filename, or content). This class is typically used in PDF parsing workflows to maintain structured session data with quality metrics.
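Because each SessionInfo carries a confidence score, a caller that collects several candidates (for example one parsed from the filename and one read from the PDF metadata) can keep the most reliable one. The sketch below assumes that workflow. `pick_best` is a hypothetical helper, not part of session_detector.py, and the sample values are invented for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class SessionInfo:
    """Session information extracted from PDF"""
    conversation_id: str
    exchange_number: int
    confidence: float  # 0.0 to 1.0
    source: str  # 'metadata', 'footer', 'filename', 'content'


def pick_best(candidates: Iterable[SessionInfo]) -> Optional[SessionInfo]:
    """Hypothetical helper: return the highest-confidence candidate, or None if empty."""
    return max(candidates, key=lambda info: info.confidence, default=None)


# Made-up candidates for the same PDF, obtained from two extraction sources
candidates = [
    SessionInfo("conv_12345", 3, 0.60, "filename"),
    SessionInfo("conv_12345", 3, 0.95, "metadata"),
]
best = pick_best(candidates)
print(best.source)  # metadata
```

Returning None for an empty input keeps the "nothing detected" case explicit instead of raising from max().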
Source Code
@dataclass
class SessionInfo:
    """Session information extracted from PDF"""
    conversation_id: str
    exchange_number: int
    confidence: float  # 0.0 to 1.0
    source: str  # 'metadata', 'footer', 'filename', 'content'
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
| bases | - | - | |
Parameter Details
conversation_id: A string identifier for the conversation session. This uniquely identifies a conversation thread or session within the PDF document.
exchange_number: An integer representing the sequential number of the exchange within the conversation. Used to order multiple exchanges in a single conversation.
confidence: A float value between 0.0 and 1.0 indicating the confidence level of the extraction. Higher values indicate more reliable extraction. 1.0 represents complete confidence, 0.0 represents no confidence.
source: A string indicating where the session information was extracted from. Valid values are 'metadata' (PDF metadata fields), 'footer' (page footer text), 'filename' (PDF filename parsing), or 'content' (document body content).
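The class as documented does not enforce the confidence range or the set of source values. One way to enforce them, sketched here as an assumption and not as the project's implementation, is a `__post_init__` hook. `ValidatedSessionInfo` and `VALID_SOURCES` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass

# The four source values listed in the parameter details
VALID_SOURCES = {"metadata", "footer", "filename", "content"}


@dataclass
class ValidatedSessionInfo:
    """Hypothetical variant of SessionInfo that checks its fields on creation."""
    conversation_id: str
    exchange_number: int
    confidence: float
    source: str

    def __post_init__(self) -> None:
        # Runs right after the generated __init__ has assigned all fields
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError(f"confidence must be in [0.0, 1.0], got {self.confidence}")
        if self.source not in VALID_SOURCES:
            raise ValueError(f"unknown source: {self.source!r}")


ok = ValidatedSessionInfo("conv_12345", 3, 0.95, "metadata")

try:
    ValidatedSessionInfo("conv_12345", 3, 1.5, "metadata")
except ValueError as exc:
    print(exc)  # confidence must be in [0.0, 1.0], got 1.5
```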
Return Value
Instantiation returns a SessionInfo object with all four attributes set. As a dataclass, it automatically provides __init__, __repr__, and __eq__. Instances are mutable by default; they become immutable only if frozen=True is passed to the @dataclass decorator.
Class Interface
Methods
__init__(conversation_id: str, exchange_number: int, confidence: float, source: str) -> None
Purpose: Initialize a SessionInfo instance with conversation metadata. Automatically generated by @dataclass decorator.
Parameters:
conversation_id: String identifier for the conversation
exchange_number: Integer representing the exchange sequence number
confidence: Float between 0.0 and 1.0 indicating extraction confidence
source: String indicating extraction source ('metadata', 'footer', 'filename', or 'content')
Returns: None - initializes the instance
__repr__() -> str
Purpose: Return a string representation of the SessionInfo instance. Automatically generated by @dataclass decorator.
Returns: String representation in format: SessionInfo(conversation_id='...', exchange_number=..., confidence=..., source='...')
__eq__(other: object) -> bool
Purpose: Compare two SessionInfo instances for equality. Automatically generated by @dataclass decorator.
Parameters:
other: Another object to compare with
Returns: True if all attributes match, False otherwise
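The generated `__eq__` compares field by field, and only against instances of the same class. A side effect worth knowing: with the decorator's default settings (eq=True, frozen=False), `__hash__` is set to None, so instances cannot be used as dict keys or set members. A short illustration, with the class redefined here so the snippet runs on its own:

```python
from dataclasses import dataclass


@dataclass
class SessionInfo:
    """Session information extracted from PDF"""
    conversation_id: str
    exchange_number: int
    confidence: float
    source: str


a = SessionInfo("conv_12345", 3, 0.95, "metadata")
b = SessionInfo("conv_12345", 3, 0.95, "metadata")
c = SessionInfo("conv_12345", 4, 0.95, "metadata")

print(a == b)  # True: every field matches
print(a == c)  # False: exchange_number differs
print(a == ("conv_12345", 3, 0.95, "metadata"))  # False: not a SessionInfo

# Default dataclasses define __eq__ but are unhashable
try:
    hash(a)
except TypeError:
    print("SessionInfo instances are unhashable")
```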
Attributes
| Name | Type | Description | Scope |
|---|---|---|---|
| conversation_id | str | Unique identifier for the conversation session extracted from the PDF | instance |
| exchange_number | int | Sequential number of the exchange within the conversation, used for ordering | instance |
| confidence | float | Confidence level of the extraction ranging from 0.0 (no confidence) to 1.0 (complete confidence) | instance |
| source | str | Source of the extraction, valid values are 'metadata', 'footer', 'filename', or 'content' | instance |
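The attributes listed above can also be inspected programmatically with the standard `dataclasses.fields` and `dataclasses.asdict` helpers, which is convenient for logging or serializing a result. A minimal sketch with the class redefined so it runs standalone:

```python
from dataclasses import asdict, dataclass, fields


@dataclass
class SessionInfo:
    """Session information extracted from PDF"""
    conversation_id: str
    exchange_number: int
    confidence: float
    source: str


session = SessionInfo("conv_12345", 3, 0.95, "metadata")

# Field names in declaration order
print([f.name for f in fields(session)])
# ['conversation_id', 'exchange_number', 'confidence', 'source']

# Plain dict copy of the instance, e.g. for JSON serialization
print(asdict(session))
# {'conversation_id': 'conv_12345', 'exchange_number': 3, 'confidence': 0.95, 'source': 'metadata'}
```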
Dependencies
dataclasses
Required Imports
from dataclasses import dataclass
Usage Example
from dataclasses import dataclass

@dataclass
class SessionInfo:
    conversation_id: str
    exchange_number: int
    confidence: float
    source: str

# Create a SessionInfo instance
session = SessionInfo(
    conversation_id="conv_12345",
    exchange_number=3,
    confidence=0.95,
    source="metadata"
)

# Access attributes
print(session.conversation_id)  # Output: conv_12345
print(session.exchange_number)  # Output: 3
print(session.confidence)  # Output: 0.95
print(session.source)  # Output: metadata

# Dataclass provides automatic __repr__
print(session)  # Output: SessionInfo(conversation_id='conv_12345', exchange_number=3, confidence=0.95, source='metadata')

# Dataclass provides automatic equality comparison
session2 = SessionInfo("conv_12345", 3, 0.95, "metadata")
print(session == session2)  # Output: True
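Since exchange_number exists to order exchanges within a conversation, a collection of SessionInfo objects can be sorted by conversation and then by exchange. The following is a sketch with invented sample data, not code from session_detector.py:

```python
from dataclasses import dataclass


@dataclass
class SessionInfo:
    conversation_id: str
    exchange_number: int
    confidence: float
    source: str


# Made-up results from several PDFs, in arbitrary order
detected = [
    SessionInfo("conv_12345", 3, 0.95, "metadata"),
    SessionInfo("conv_99", 2, 0.70, "footer"),
    SessionInfo("conv_12345", 1, 0.80, "filename"),
]

# Group by conversation, then order exchanges within each conversation
ordered = sorted(detected, key=lambda info: (info.conversation_id, info.exchange_number))
print([(info.conversation_id, info.exchange_number) for info in ordered])
# [('conv_12345', 1), ('conv_12345', 3), ('conv_99', 2)]
```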
Best Practices
- Always ensure confidence values are between 0.0 and 1.0 when creating instances
- Use one of the four valid source values: 'metadata', 'footer', 'filename', or 'content'
- Consider adding validation logic if extending this class to enforce confidence range and source value constraints
- This is a data container class - it should not contain business logic, only data storage
- The dataclass decorator automatically generates __init__, __repr__, and __eq__ methods, so no need to define them manually
- If immutability is desired, add frozen=True to the @dataclass decorator
- Exchange numbers should typically start at 1 or 0 and increment sequentially
- When comparing SessionInfo objects, all four attributes must match for equality
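The immutability suggestion above can be sketched as follows. `FrozenSessionInfo` is a hypothetical variant introduced for this example; the documented class is not frozen. Assigning to a field of a frozen instance raises `dataclasses.FrozenInstanceError`, and `dataclasses.replace` is the standard way to derive a modified copy.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class FrozenSessionInfo:
    """Hypothetical immutable variant of SessionInfo."""
    conversation_id: str
    exchange_number: int
    confidence: float
    source: str


session = FrozenSessionInfo("conv_12345", 3, 0.95, "metadata")

try:
    session.confidence = 0.5  # not allowed on a frozen dataclass
except FrozenInstanceError:
    print("cannot assign to a frozen instance")

# Build a new object instead of mutating the existing one
next_exchange = replace(session, exchange_number=session.exchange_number + 1)
print(next_exchange.exchange_number)  # 4

# Frozen dataclasses with eq=True are hashable, so they work in sets
print(len({session, next_exchange}))  # 2
```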
Similar Components
AI-powered semantic similarity - components with related functionality:
- class SessionDetector 74.0% similar
- class DataAnalysisSession 64.2% similar
- class StatisticalSession 64.1% similar
- class AnnotationInfo 63.4% similar
- class DataAnalysisSession_v1 63.2% similar