class SessionInfo

Maturity: 41

A dataclass that stores session information extracted from PDF documents, including conversation ID, exchange number, confidence level, and source of extraction.

File: /tf/active/vicechatdev/e-ink-llm/session_detector.py
Lines: 26 - 31
Complexity: simple

Purpose

SessionInfo serves as a structured data container for metadata about conversation sessions extracted from PDF files. It tracks the conversation identifier, the exchange number within that conversation, the confidence level of the extraction (0.0 to 1.0), and the source method used to extract this information (metadata, footer, filename, or content). This class is typically used in PDF parsing workflows to maintain structured session data with quality metrics.
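As an illustration of how the confidence field can act as a quality metric, the sketch below picks the most reliable of several candidate extractions. The helper `pick_best` and the sample values are hypothetical and are not taken from session_detector.py; the actual SessionDetector may resolve candidates differently.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class SessionInfo:
    """Session information extracted from PDF"""
    conversation_id: str
    exchange_number: int
    confidence: float  # 0.0 to 1.0
    source: str  # 'metadata', 'footer', 'filename', 'content'


def pick_best(candidates: Iterable[SessionInfo]) -> Optional[SessionInfo]:
    """Hypothetical helper: return the highest-confidence candidate, or None if empty."""
    return max(candidates, key=lambda c: c.confidence, default=None)


# Made-up candidates, one per extraction method that produced a result
candidates = [
    SessionInfo("conv_12345", 3, 0.60, "filename"),
    SessionInfo("conv_12345", 3, 0.95, "metadata"),
    SessionInfo("conv_12345", 2, 0.40, "content"),
]

best = pick_best(candidates)
print(best.source)  # Output: metadata
```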

Source Code

from dataclasses import dataclass


@dataclass
class SessionInfo:
    """Session information extracted from PDF"""
    conversation_id: str
    exchange_number: int
    confidence: float  # 0.0 to 1.0
    source: str  # 'metadata', 'footer', 'filename', 'content'

Parameters

Name Type Default Kind
bases - -

Parameter Details

conversation_id: A string identifier for the conversation session. This uniquely identifies a conversation thread or session within the PDF document.

exchange_number: An integer representing the sequential number of the exchange within the conversation. Used to order multiple exchanges in a single conversation.

confidence: A float value between 0.0 and 1.0 indicating the confidence level of the extraction. Higher values indicate more reliable extraction. 1.0 represents complete confidence, 0.0 represents no confidence.

source: A string indicating where the session information was extracted from. Valid values are 'metadata' (PDF metadata fields), 'footer' (page footer text), 'filename' (PDF filename parsing), or 'content' (document body content).
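To show how conversation_id and exchange_number work together, here is a minimal sketch that groups detected sessions by conversation and orders each group by exchange number. The sample data and the `by_conversation` mapping are assumptions made for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class SessionInfo:
    """Session information extracted from PDF"""
    conversation_id: str
    exchange_number: int
    confidence: float  # 0.0 to 1.0
    source: str  # 'metadata', 'footer', 'filename', 'content'


# Made-up detection results, deliberately out of order
detected = [
    SessionInfo("conv_a", 2, 0.90, "footer"),
    SessionInfo("conv_b", 1, 0.80, "filename"),
    SessionInfo("conv_a", 1, 0.95, "metadata"),
]

# Group by conversation, then sort each group by exchange number
by_conversation = defaultdict(list)
for info in detected:
    by_conversation[info.conversation_id].append(info)
for exchanges in by_conversation.values():
    exchanges.sort(key=lambda info: info.exchange_number)

order_a = [info.exchange_number for info in by_conversation["conv_a"]]
print(order_a)  # Output: [1, 2]
```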

Return Value

Instantiation returns a SessionInfo object with all four attributes set. As a dataclass, it automatically provides __init__, __repr__, and __eq__. Instances are mutable by default; passing frozen=True to the dataclass decorator makes them immutable.
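The mutability behaviour of dataclasses can be checked directly. In the sketch below, `FrozenSessionInfo` is a hypothetical variant defined only for comparison; it is not part of session_detector.py.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass
class SessionInfo:
    """Session information extracted from PDF"""
    conversation_id: str
    exchange_number: int
    confidence: float  # 0.0 to 1.0
    source: str  # 'metadata', 'footer', 'filename', 'content'


@dataclass(frozen=True)
class FrozenSessionInfo:
    """Hypothetical immutable variant, for comparison only"""
    conversation_id: str
    exchange_number: int
    confidence: float
    source: str


# A plain @dataclass allows attribute assignment after construction
session = SessionInfo("conv_12345", 3, 0.95, "metadata")
session.confidence = 0.5

# With frozen=True, assignment raises FrozenInstanceError
frozen = FrozenSessionInfo("conv_12345", 3, 0.95, "metadata")
try:
    frozen.confidence = 0.5
    frozen_rejected = False
except FrozenInstanceError:
    frozen_rejected = True

print(session.confidence)  # Output: 0.5
print(frozen_rejected)  # Output: True
```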

Class Interface

Methods

__init__(conversation_id: str, exchange_number: int, confidence: float, source: str) -> None

Purpose: Initialize a SessionInfo instance with conversation metadata. Automatically generated by the @dataclass decorator.

Parameters:

  • conversation_id: String identifier for the conversation
  • exchange_number: Integer representing the exchange sequence number
  • confidence: Float between 0.0 and 1.0 indicating extraction confidence
  • source: String indicating extraction source ('metadata', 'footer', 'filename', or 'content')

Returns: None - initializes the instance

__repr__() -> str

Purpose: Return a string representation of the SessionInfo instance. Automatically generated by the @dataclass decorator.

Returns: String representation in format: SessionInfo(conversation_id='...', exchange_number=..., confidence=..., source='...')

__eq__(other: object) -> bool

Purpose: Compare two SessionInfo instances for equality. Automatically generated by the @dataclass decorator.

Parameters:

  • other: Another object to compare with

Returns: True if other is also a SessionInfo and all four attributes compare equal, False otherwise
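A short sketch of the generated equality behaviour follows, including two edge cases: comparison against a non-SessionInfo object, and hashing. With the default dataclass settings (eq=True, frozen=False), __hash__ is set to None, so instances cannot be used as dict keys or set members. The variable names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class SessionInfo:
    """Session information extracted from PDF"""
    conversation_id: str
    exchange_number: int
    confidence: float  # 0.0 to 1.0
    source: str  # 'metadata', 'footer', 'filename', 'content'


a = SessionInfo("conv_12345", 3, 0.95, "metadata")
b = SessionInfo("conv_12345", 3, 0.95, "metadata")
c = SessionInfo("conv_12345", 3, 0.90, "metadata")

same = a == b      # all four fields match
differs = a == c   # confidence differs
# A tuple holding the same values is not a SessionInfo, so this is False
vs_tuple = a == ("conv_12345", 3, 0.95, "metadata")

# Default (non-frozen) dataclasses with __eq__ are unhashable
try:
    hash(a)
    hashable = True
except TypeError:
    hashable = False

print(same, differs, vs_tuple, hashable)  # Output: True False False False
```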

Attributes

Name | Type | Description | Scope
conversation_id | str | Unique identifier for the conversation session extracted from the PDF | instance
exchange_number | int | Sequential number of the exchange within the conversation, used for ordering | instance
confidence | float | Confidence level of the extraction ranging from 0.0 (no confidence) to 1.0 (complete confidence) | instance
source | str | Source of the extraction, valid values are 'metadata', 'footer', 'filename', or 'content' | instance

Dependencies

  • dataclasses

Required Imports

from dataclasses import dataclass

Usage Example

from dataclasses import dataclass

@dataclass
class SessionInfo:
    conversation_id: str
    exchange_number: int
    confidence: float
    source: str

# Create a SessionInfo instance
session = SessionInfo(
    conversation_id="conv_12345",
    exchange_number=3,
    confidence=0.95,
    source="metadata"
)

# Access attributes
print(session.conversation_id)  # Output: conv_12345
print(session.exchange_number)  # Output: 3
print(session.confidence)  # Output: 0.95
print(session.source)  # Output: metadata

# Dataclass provides automatic __repr__
print(session)  # Output: SessionInfo(conversation_id='conv_12345', exchange_number=3, confidence=0.95, source='metadata')

# Dataclass provides automatic equality comparison
session2 = SessionInfo("conv_12345", 3, 0.95, "metadata")
print(session == session2)  # Output: True

Best Practices

  • Always ensure confidence values are between 0.0 and 1.0 when creating instances; the class itself does not enforce this range
  • Use one of the four expected source values: 'metadata', 'footer', 'filename', or 'content'; other strings are accepted without error
  • Consider adding validation logic if extending this class to enforce confidence range and source value constraints
  • This is a data container class - it should not contain business logic, only data storage
  • The dataclass decorator automatically generates __init__, __repr__, and __eq__, so there is no need to define them manually
  • If immutability is desired, add frozen=True to the @dataclass decorator
  • Exchange numbers should typically start at 1 or 0 and increment sequentially
  • When comparing SessionInfo objects, all four attributes must match for equality
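The validation suggested above could be sketched with the dataclass __post_init__ hook. The names `ValidatedSessionInfo` and `VALID_SOURCES` are hypothetical and do not appear in session_detector.py; this is one possible approach under those assumptions, not the project's implementation.

```python
from dataclasses import dataclass

# Assumed set of allowed sources, taken from the comment on the source field
VALID_SOURCES = frozenset({"metadata", "footer", "filename", "content"})


@dataclass
class ValidatedSessionInfo:
    """Hypothetical variant of SessionInfo that checks its fields on creation"""
    conversation_id: str
    exchange_number: int
    confidence: float  # 0.0 to 1.0
    source: str  # one of VALID_SOURCES

    def __post_init__(self) -> None:
        # Runs after the generated __init__ has assigned all fields
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError(
                f"confidence must be in [0.0, 1.0], got {self.confidence}"
            )
        if self.source not in VALID_SOURCES:
            raise ValueError(f"unknown source: {self.source!r}")


ok = ValidatedSessionInfo("conv_12345", 3, 0.95, "metadata")

# An out-of-range confidence is rejected at construction time
try:
    ValidatedSessionInfo("conv_12345", 3, 1.5, "metadata")
    error = None
except ValueError as exc:
    error = str(exc)

print(error)  # Output: confidence must be in [0.0, 1.0], got 1.5
```

Because the checks run only in __post_init__, later attribute assignments on a non-frozen instance are not re-validated; combining this with frozen=True would close that gap.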

Similar Components

AI-powered semantic similarity - components with related functionality:

  • class SessionDetector 74.0% similar

    Detects session information (conversation ID and exchange number) from PDF files using multiple detection methods including metadata, filename, footer, and content analysis.

    From: /tf/active/vicechatdev/e-ink-llm/session_detector.py
  • class DataAnalysisSession 64.2% similar

    A dataclass representing a data analysis session that is linked to a specific text section within a document, managing conversation messages, analysis results, plots, and configuration.

    From: /tf/active/vicechatdev/vice_ai/models.py
  • class StatisticalSession 64.1% similar

    A dataclass representing a statistical analysis session that tracks metadata, configuration, and status of data analysis operations.

    From: /tf/active/vicechatdev/vice_ai/smartstat_models.py
  • class AnnotationInfo 63.4% similar

    A dataclass that stores comprehensive information about a detected annotation in a PDF document, including its type, visual properties, location, and associated text content.

    From: /tf/active/vicechatdev/e-ink-llm/annotation_detector.py
  • class DataAnalysisSession_v1 63.2% similar

    A dataclass representing a statistical analysis session that is linked to specific document sections, managing analysis state, messages, plots, and configuration.

    From: /tf/active/vicechatdev/vice_ai/models.py