Text Extractor

A tool for extracting the full text from a digital document.

To use it in an API class, follow the standard convention of accessing the tools property that Sermos injects.

Example:

class DemoApiClass(object):
    def post(self):
        # TODO: Example
        pass

To use it in a worker method, follow the standard convention of accessing the tools argument that Sermos injects.

Example:

def demo_worker_task(event, tools):
    # TODO: Example
    pass

class sermos_tools.catalog.text_extractor.text_extractor.TextExtractor(document_bytes: bytes, mimetype: Optional[str] = None, full_text: str = '', word_count: int = 0, page_count: int = 0)

Extract the full text from supported file types.

Usage:

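from sermos_tools.catalog.text_extractor.text_extractor import TextExtractor

# blob_bytes holds the raw bytes of the document to process.
extractor = TextExtractor(
    document_bytes=blob_bytes
)
fulltext = extractor.full_text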
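
The constructor accepts raw bytes and an optional mimetype, so a caller typically needs to read the document and, where possible, supply a mimetype hint. The sketch below shows one way to prepare both values using only the Python standard library. The helper name load_document and the file name report.pdf are hypothetical and not part of sermos_tools. Whether TextExtractor detects the mimetype on its own when none is given is not stated here, and the commented usage assumes word_count and page_count are exposed as attributes in the same way as full_text.

```python
import mimetypes
from pathlib import Path
from typing import Optional, Tuple


def load_document(path: str) -> Tuple[bytes, Optional[str]]:
    """Read a file and guess its mimetype from the filename extension.

    Hypothetical helper for illustration; not part of sermos_tools.
    """
    document_bytes = Path(path).read_bytes()
    # guess_type returns (type, encoding); type is None for unknown extensions.
    mimetype, _ = mimetypes.guess_type(path)
    return document_bytes, mimetype


# Assumed usage, requires sermos_tools to be installed:
# from sermos_tools.catalog.text_extractor.text_extractor import TextExtractor
# document_bytes, mimetype = load_document("report.pdf")
# extractor = TextExtractor(document_bytes=document_bytes, mimetype=mimetype)
# print(extractor.full_text, extractor.word_count, extractor.page_count)
```

Guessing from the extension is only a hint; if the file name is unreliable, passing mimetype=None (the default in the signature above) leaves the decision to the extractor.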