Text Extractor
A tool for extracting the full text from a digital document.
To use it in an API class, follow the standard convention of accessing the tools property that Sermos injects.
Example:
    class DemoApiClass(object):
        def post(self):
            # TODO: Example
To use it in a worker method, follow the standard convention of accessing the tools argument that Sermos injects.
Example:
    def demo_worker_task(event, tools):
        # TODO: Example
class sermos_tools.catalog.text_extractor.text_extractor.TextExtractor(document_bytes: bytes, mimetype: Optional[str] = None, full_text: str = '', word_count: int = 0, page_count: int = 0)
Extract the full text from supported file types.
Usage:
    extractor = TextExtractor(document_bytes=blob_bytes)
    fulltext = extractor.full_text
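The documented signature suggests that extraction results are exposed as attributes on the instance. The sketch below illustrates that calling pattern with a hypothetical stand-in class, StubTextExtractor, which is not part of Sermos and only decodes plain UTF-8 text. It assumes extraction happens at construction time and that word_count and page_count are filled in alongside full_text; neither assumption is confirmed by this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StubTextExtractor:
    """Hypothetical stand-in mirroring the documented TextExtractor fields.

    Not the Sermos implementation: it only decodes plain UTF-8 text so the
    calling pattern can be run without sermos_tools installed.
    """
    document_bytes: bytes
    mimetype: Optional[str] = None
    full_text: str = ''
    word_count: int = 0
    page_count: int = 0

    def __post_init__(self):
        # Assumption: extraction runs at construction time, because the
        # documented usage reads full_text right after instantiation.
        self.full_text = self.document_bytes.decode('utf-8', errors='replace')
        self.word_count = len(self.full_text.split())
        # Assumption: form feeds mark page breaks in plain text.
        self.page_count = self.full_text.count('\f') + 1


blob_bytes = b'Hello from a plain text document.'
extractor = StubTextExtractor(document_bytes=blob_bytes, mimetype='text/plain')
fulltext = extractor.full_text
```

With the real class, the same pattern would import TextExtractor from sermos_tools.catalog.text_extractor.text_extractor and pass the raw bytes of a supported document.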