class Document_v1
Document class represents a reMarkable document file, extending the Item class to provide document-specific operations like content extraction, uploading, and rendering with annotations.
/tf/active/vicechatdev/rmcl/items.py
214 - 281
complex
Purpose
The Document class manages reMarkable document files (PDF, EPUB, notes). It handles document content extraction from ZIP archives, uploading new documents with proper metadata structure, and rendering annotated versions using the rmrl library. It caches annotated document sizes for performance optimization and provides both synchronous and asynchronous interfaces for all operations.
Source Code
class Document(Item):

    def __init__(self, *args, **kw):
        super().__init__(*args, **kw)
        self._annotated_size = datacache.get_property(self.id, self.version, 'annotated_size')

    @add_sync
    async def contents(self):
        if await self.type() in (FileType.notes, FileType.unknown):
            return await self.raw()

        zf = zipfile.ZipFile(await self.raw(), 'r')
        for f in zf.filelist:
            if f.filename.endswith(str(await self.type())):
                return zf.open(f)
        return io.BytesIO(b'Unable to load file contents')

    @add_sync
    async def upload(self, new_contents, type_):
        if type_ not in (FileType.pdf, FileType.epub):
            raise TypeError(f"Cannot upload file of type {type_}")

        content = {
            'extraMetadata': {},
            'fileType': str(type_),
            'lastOpenedPage': 0,
            'lineHeight': -1,
            'margins': 100,
            'pageCount': 0,
            'textScale': 1,
            'transform': {},
        }

        f = io.BytesIO()
        with zipfile.ZipFile(f, 'w', zipfile.ZIP_DEFLATED) as zf:
            zf.writestr(f'{self.id}.pagedata','')
            zf.writestr(f'{self.id}.content', json.dumps(content))
            zf.writestr(f'{self.id}.{type_}', new_contents.read())
        f.seek(0)

        return await self.upload_raw(f)

    @add_sync
    async def annotated(self, **render_kw):
        if render is None:
            raise ImportError("rmrl must be installed to get annotated documents")

        if 'progress_cb' not in render_kw:
            render_kw['progress_cb'] = (
                lambda pct: log.info(f"Rendering {self}: {pct:0.1f}%"))

        zf = zipfile.ZipFile(await self.raw(), 'r')
        # run_sync doesn't accept keyword arguments to be passed to the sync
        # function, so we'll assemble to function to call out here.
        render_func = lambda: render(sources.ZipSource(zf), **render_kw)
        contents = (await trio.to_thread.run_sync(render_func))
        # Seek to end to get the length of this file.
        contents.seek(0, 2)
        self._annotated_size = contents.tell()
        datacache.set_property(self.id, self.version, 'annotated_size', self._annotated_size)
        contents.seek(0)
        return contents

    @add_sync
    async def annotated_size(self):
        if self._annotated_size is not None:
            return self._annotated_size
        return await self.size()
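The archive layout that upload() writes and contents() reads can be tried out without a reMarkable account. The following is a minimal standalone sketch, not the library's code: `build_archive` and `extract_payload` are hypothetical helper names, the metadata dictionary is abbreviated, and the file type is assumed to stringify to a plain suffix such as "pdf".

```python
import io
import json
import zipfile


def build_archive(doc_id, payload, file_type="pdf"):
    """Pack payload in the three-member layout that upload() writes."""
    content = {"fileType": file_type, "pageCount": 0}  # abbreviated metadata
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(f"{doc_id}.pagedata", "")
        zf.writestr(f"{doc_id}.content", json.dumps(content))
        zf.writestr(f"{doc_id}.{file_type}", payload)
    buf.seek(0)  # rewind so the caller can read the archive from the start
    return buf


def extract_payload(archive, file_type="pdf"):
    """Return the bytes of the first member whose name ends with file_type."""
    with zipfile.ZipFile(archive, "r") as zf:
        for info in zf.filelist:
            if info.filename.endswith(file_type):
                return zf.read(info)
    return None  # no member matched the requested type


archive = build_archive("doc-123", b"%PDF-1.4 example")
names = zipfile.ZipFile(archive).namelist()
archive.seek(0)
payload = extract_payload(archive)
```

Matching on the filename suffix is what lets contents() skip the `.pagedata` and `.content` members and return only the document itself.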
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
| bases | Item | - | |
Parameter Details
*args: Variable positional arguments passed to the parent Item class constructor for basic item initialization
**kw: Variable keyword arguments passed to the parent Item class constructor, typically including item metadata like id, version, and other Item properties
Return Value
Instantiation returns a Document object that represents a reMarkable document. Key method returns: contents() returns a file-like object with document contents; upload() returns the result of upload_raw(); annotated() returns a BytesIO object containing the rendered PDF with annotations; annotated_size() returns an integer representing the size in bytes.
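The size reported after rendering comes from seeking to the end of the returned stream and reading the position, as annotated() does in the source above. A minimal sketch of that technique on an in-memory stream follows; `stream_size` is a hypothetical helper name, not part of the library.

```python
import io


def stream_size(stream):
    """Return the total length of a seekable stream and rewind it to the start."""
    stream.seek(0, io.SEEK_END)  # equivalent to seek(0, 2)
    size = stream.tell()         # position at the end equals the length in bytes
    stream.seek(0)               # rewind so the next read starts at the beginning
    return size


rendered = io.BytesIO(b"x" * 1024)
size = stream_size(rendered)
```

Rewinding before returning matters: a caller that reads the stream without it would get no data.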
Class Interface
Methods
__init__(self, *args, **kw)
Purpose: Initialize a Document instance, loading cached annotated size from datacache
Parameters:
*args: Positional arguments passed to parent Item class
**kw: Keyword arguments passed to parent Item class
Returns: None (constructor)
async contents(self) -> io.BytesIO
Purpose: Extract and return the document contents from the ZIP archive, handling different file types
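Returns: File-like object containing the document contents (raw for notes/unknown types, extracted from ZIP for PDF/EPUB). If no archive member matches the file type, the method does not raise; it returns a BytesIO holding the message b'Unable to load file contents'.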
contents_sync(self) -> io.BytesIO
Purpose: Synchronous version of contents() method
Returns: File-like object containing the document contents
async upload(self, new_contents, type_: FileType)
Purpose: Upload a new document file (PDF or EPUB) with proper reMarkable metadata structure
Parameters:
new_contents: File-like object with read() method containing the document data to upload
type_: FileType enum value, must be FileType.pdf or FileType.epub
Returns: Result from upload_raw() method call
upload_sync(self, new_contents, type_: FileType)
Purpose: Synchronous version of upload() method
Parameters:
new_contents: File-like object with read() method containing the document data
type_: FileType enum value (PDF or EPUB)
Returns: Result from upload_raw() method call
async annotated(self, **render_kw) -> io.BytesIO
Purpose: Render the document with annotations using rmrl library, caching the resulting size
Parameters:
**render_kw: Keyword arguments passed to rmrl.render(), such as progress_cb for progress callbacks
Returns: BytesIO object containing the rendered PDF with annotations, seeked to position 0
annotated_sync(self, **render_kw) -> io.BytesIO
Purpose: Synchronous version of annotated() method
Parameters:
**render_kw: Keyword arguments for rendering
Returns: BytesIO object containing the rendered PDF with annotations
async annotated_size(self) -> int
Purpose: Get the size of the annotated document, using cached value if available or falling back to regular size
Returns: Integer representing the size in bytes of the annotated document
annotated_size_sync(self) -> int
Purpose: Synchronous version of annotated_size() method
Returns: Integer representing the size in bytes
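The fallback behaviour of annotated_size() can be illustrated with a dictionary standing in for datacache. This is a sketch under assumptions: the `_cache` dictionary and the module-level functions below are hypothetical stand-ins, and keying on (id, version, name) is inferred from the arguments the source passes to get_property() and set_property().

```python
_cache = {}  # hypothetical stand-in for datacache: {(id, version, name): value}


def get_property(id_, version, name):
    """Return the cached value, or None if nothing has been stored."""
    return _cache.get((id_, version, name))


def set_property(id_, version, name, value):
    """Store a value under the (id, version, name) key."""
    _cache[(id_, version, name)] = value


def annotated_size(id_, version, raw_size):
    """Prefer the cached annotated size; otherwise fall back to the raw size."""
    cached = get_property(id_, version, "annotated_size")
    return cached if cached is not None else raw_size


before = annotated_size("doc-123", 1, raw_size=500)       # nothing rendered yet
set_property("doc-123", 1, "annotated_size", 750)         # what annotated() records
after = annotated_size("doc-123", 1, raw_size=500)        # cached value wins
new_version = annotated_size("doc-123", 2, raw_size=500)  # different version key
```

Under this assumption, a size recorded for one version is not reused for another, so the reported size falls back to the regular size until annotated() runs again.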
Attributes
| Name | Type | Description | Scope |
|---|---|---|---|
| _annotated_size | int or None | Cached size in bytes of the annotated document, loaded from datacache on initialization and updated when annotated() is called | instance |
| id | str | Unique identifier for the document, inherited from Item class | instance |
| version | int | Version number of the document, inherited from Item class | instance |
Dependencies
trio
rmrl
Required Imports
import functools
import io
import json
import logging
import uuid
import zipfile
import trio
from const import ROOT_ID
from const import TRASH_ID
from const import FileType
from datacache import datacache
from exceptions import DocumentNotFound
from exceptions import VirtualItemError
from sync import add_sync
from utils import now
from utils import parse_datetime
Conditional/Optional Imports
These imports are only needed under specific conditions:
from rmrl import render
Condition: only needed when calling the annotated() method to render documents with annotations
Optional
from rmrl import sources
Condition: only needed when calling the annotated() method to provide ZIP source for rendering
Optional
Usage Example
import trio

# Instantiate a Document (typically done internally by the library;
# the constructor arguments shown here are illustrative)
doc = Document(id='some-uuid', version=1, parent='parent-id')

async def main():
    # Get document contents (async)
    contents = await doc.contents()
    # Upload a new PDF document
    with open('new_document.pdf', 'rb') as f:
        await doc.upload(f, FileType.pdf)
    # Get annotated version with a custom progress callback
    annotated_pdf = await doc.annotated(progress_cb=lambda pct: print(f'{pct:0.1f}%'))
    # Get size of annotated document
    size = await doc.annotated_size()

# await is only valid inside an async function, so run the coroutine with trio
trio.run(main)

# In synchronous code, use the _sync versions instead
contents = doc.contents_sync()
annotated_pdf = doc.annotated_sync()
size = doc.annotated_size_sync()
Best Practices
- Check the document type before calling upload(): only FileType.pdf and FileType.epub are accepted, and any other type raises TypeError
- The annotated() method requires rmrl library to be installed; handle ImportError appropriately
- Use async methods (await) in async contexts, or use the _sync versions in synchronous code
- The class caches annotated_size in datacache for performance; this persists across instances
- When uploading documents, provide a file-like object with a read() method
- The annotated() method can be resource-intensive and runs the render in a worker thread; pass a progress_cb to track it, otherwise progress is logged at INFO level
- Document contents are stored in ZIP format internally; the class handles extraction automatically
- The _annotated_size attribute is read from datacache once, in __init__, and is None until annotated() has been run for that document id and version
- Methods decorated with @add_sync automatically generate synchronous versions with _sync suffix
- Always seek file-like objects to position 0 before reading if you need to read multiple times
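The pairing of async methods with `_sync` counterparts can be sketched as follows. This is not the library's add_sync implementation, which is not shown here: `make_sync` and `Demo` are hypothetical names, and the sketch uses the standard library's asyncio rather than trio so that it runs without extra dependencies.

```python
import asyncio
import functools


def make_sync(afunc):
    """Return a blocking wrapper that runs the coroutine function to completion."""
    @functools.wraps(afunc)
    def wrapper(*args, **kw):
        # Start an event loop, run the coroutine, and return its result.
        return asyncio.run(afunc(*args, **kw))
    wrapper.__name__ = afunc.__name__ + "_sync"
    return wrapper


class Demo:
    async def size(self):
        await asyncio.sleep(0)  # placeholder for real asynchronous work
        return 42

    # Expose a blocking twin next to the async method.
    size_sync = make_sync(size)


result = Demo().size_sync()
```

A wrapper of this kind starts its own event loop, so it is meant for plain synchronous code; inside an async function, await the async method directly.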
Similar Components
AI-powered semantic similarity - components with related functionality:
- class Item (69.2% similar)
- class Document (65.2% similar)
- class RemarkableCloudManager (64.7% similar)
- class RemarkableNode (64.5% similar)
- class RemarkableCloudWatcher (63.7% similar)