Text

text

Utilities for working with text, markdown & rich text in Notion.

BASE_URL_PATTERN = 'https://(www.)?notion.so/' module-attribute

BLOCK_URL_LONG_RE = re.compile(f'^{BASE_URL_PATTERN}(?P<username>.*)/(?P<title>.*)-(?P<page_id>{UUID_PATTERN})\#(?P<block_id>{UUID_PATTERN})$', flags=re.IGNORECASE | re.VERBOSE) module-attribute

MAX_TEXT_OBJECT_SIZE = 2000 module-attribute

The max text size according to the Notion API is 2000 characters.

MD_STYLES = ('bold', 'italic', 'strikethrough', 'code', 'link') module-attribute

Markdown styles supported by Notion.

MD_STYLE_MAP = {'bold': '**', 'italic': '*', 'strikethrough': '~~', 'code': '`'} module-attribute

Mapping from markdown style to markdown symbol.
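As a rough illustration of how such a mapping could be used, the sketch below wraps a plain string in the symbols for a list of styles. The helper name wrap_with_styles is hypothetical and not part of the module. The 'link' style has no entry in the map, so the sketch simply skips it.

```python
# Copied from the constant documented above.
MD_STYLE_MAP = {"bold": "**", "italic": "*", "strikethrough": "~~", "code": "`"}


def wrap_with_styles(text: str, styles: list[str]) -> str:
    """Hypothetical helper: wrap text in the markdown symbol of each style."""
    for style in styles:
        symbol = MD_STYLE_MAP.get(style)
        if symbol is not None:  # e.g. 'link' has no simple symbol
            text = f"{symbol}{text}{symbol}"
    return text


print(wrap_with_styles("hello", ["bold", "strikethrough"]))  # ~~**hello**~~
```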

PAGE_URL_LONG_RE = re.compile(f'^{BASE_URL_PATTERN}(?P<title>.*)-(?P<page_id>{UUID_PATTERN})$', flags=re.IGNORECASE | re.VERBOSE) module-attribute

PAGE_URL_SHORT_RE = re.compile(f'^{BASE_URL_PATTERN}(?P<page_id>{UUID_PATTERN})$', flags=re.IGNORECASE | re.VERBOSE) module-attribute

UUID_PATTERN = '[0-9a-f]{8}-?[0-9a-f]{4}-?[0-9a-f]{4}-?[0-9a-f]{4}-?[0-9a-f]{12}' module-attribute

UUID_RE = re.compile(f'^(?P<id>{UUID_PATTERN})$') module-attribute
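To see how the named groups in these patterns behave, the following sketch rebuilds the patterns from the definitions above and matches them against made-up URLs. The IDs, the user name alice and the page title are placeholders, and raw f-strings are used here only to keep the \# escape free of warnings.

```python
import re

BASE_URL_PATTERN = r"https://(www.)?notion.so/"
UUID_PATTERN = r"[0-9a-f]{8}-?[0-9a-f]{4}-?[0-9a-f]{4}-?[0-9a-f]{4}-?[0-9a-f]{12}"
FLAGS = re.IGNORECASE | re.VERBOSE

PAGE_URL_LONG_RE = re.compile(
    rf"^{BASE_URL_PATTERN}(?P<title>.*)-(?P<page_id>{UUID_PATTERN})$", flags=FLAGS
)
BLOCK_URL_LONG_RE = re.compile(
    rf"^{BASE_URL_PATTERN}(?P<username>.*)/(?P<title>.*)-(?P<page_id>{UUID_PATTERN})"
    rf"\#(?P<block_id>{UUID_PATTERN})$",
    flags=FLAGS,
)

# Placeholder IDs, not real Notion objects.
PAGE_ID = "0123456789abcdef0123456789abcdef"
BLOCK_ID = "fedcba9876543210fedcba9876543210"

page_match = PAGE_URL_LONG_RE.match(f"https://www.notion.so/My-Page-{PAGE_ID}")
block_match = BLOCK_URL_LONG_RE.match(
    f"https://www.notion.so/alice/My-Page-{PAGE_ID}#{BLOCK_ID}"
)

print(page_match.group("title"), page_match.group("page_id"))
print(block_match.group("username"), block_match.group("block_id"))
```

Because the hyphens in UUID_PATTERN are optional, both the dashed and the compact 32-character form of an ID are accepted.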

camel_case(string: str) -> str

Make a Python identifier in CamelCase.

Attention: This may result in an empty string and a CamelCase string will be capitalized!
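The two caveats can be reproduced with a small stand-in. This is only a sketch of one plausible approach, not the module's code: the name camel_case_sketch is hypothetical, and the rules used (ASCII only, drop invalid leading characters, capitalize each alphanumeric word) are assumptions.

```python
import re


def camel_case_sketch(string: str) -> str:
    """Hypothetical stand-in for camel_case, restricted to ASCII."""
    # Assumption: characters that cannot start an identifier are dropped first.
    string = re.sub(r"^[^A-Za-z_]+", "", string)
    words = re.findall(r"[A-Za-z0-9]+", string)
    # str.capitalize lowercases everything after the first character.
    return "".join(word.capitalize() for word in words)


print(camel_case_sketch("my page title"))  # MyPageTitle
print(camel_case_sketch("CamelCase"))      # Camelcase
print(repr(camel_case_sketch("123")))      # ''
```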

chunky(text: str, length: int = MAX_TEXT_OBJECT_SIZE) -> Iterator[str]

Break the given text into chunks whose size is at most length.
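A minimal sketch of this kind of chunking, assuming plain slicing by character count; chunk_text is a hypothetical name and the module's own implementation may differ in details such as where it splits.

```python
from collections.abc import Iterator

MAX_TEXT_OBJECT_SIZE = 2000  # value documented above


def chunk_text(text: str, length: int = MAX_TEXT_OBJECT_SIZE) -> Iterator[str]:
    """Hypothetical stand-in: yield consecutive slices of at most `length` characters."""
    for start in range(0, len(text), length):
        yield text[start:start + length]


print(list(chunk_text("abcdefg", 3)))  # ['abc', 'def', 'g']
```

With the default length, a 4500-character string would come out as three chunks of 2000, 2000 and 500 characters.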

decapitalize(string: str) -> str

Inverse of capitalize.
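One way to read this, sketched under the assumption that only the first character is lowercased and the rest is left untouched (decapitalize_sketch is a hypothetical name):

```python
def decapitalize_sketch(string: str) -> str:
    """Hypothetical stand-in: lowercase the first character only."""
    return string[:1].lower() + string[1:]


print(decapitalize_sketch("Hello"))  # hello
print(decapitalize_sketch("HTML"))   # hTML
```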

extract_id(text: str) -> str | None

Examine the given text to find a valid Notion object ID.
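The sketch below shows one plausible way such a lookup could work: try a bare ID first, then the short and long page URL forms. It is not the module's implementation. The name extract_id_sketch is hypothetical, block URLs are left out, and returning the ID in dashed form via uuid.UUID is an assumption.

```python
from __future__ import annotations

import re
import uuid

BASE_URL_PATTERN = r"https://(www.)?notion.so/"
UUID_PATTERN = r"[0-9a-f]{8}-?[0-9a-f]{4}-?[0-9a-f]{4}-?[0-9a-f]{4}-?[0-9a-f]{12}"

CANDIDATES = [
    re.compile(rf"^(?P<id>{UUID_PATTERN})$", re.IGNORECASE),
    re.compile(rf"^{BASE_URL_PATTERN}(?P<id>{UUID_PATTERN})$", re.IGNORECASE),
    re.compile(rf"^{BASE_URL_PATTERN}.*-(?P<id>{UUID_PATTERN})$", re.IGNORECASE),
]


def extract_id_sketch(text: str) -> str | None:
    """Hypothetical stand-in: return a dashed ID, or None if nothing matches."""
    for pattern in CANDIDATES:
        match = pattern.match(text.strip())
        if match is not None:
            # Assumption: normalize to the canonical dashed UUID form.
            return str(uuid.UUID(match.group("id")))
    return None


print(extract_id_sketch("https://www.notion.so/0123456789abcdef0123456789abcdef"))
print(extract_id_sketch("not an id"))
```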

html_img(url: str, size: float) -> str

Create an img tag in HTML.

md_comment(text: str) -> str

Create a markdown comment.

md_renderer() -> Markdown

Create a markdown renderer.

md_spans(rich_texts: list[RichTextBase]) -> np.ndarray

Convert rich text to markdown spans.

A span is a sequence of rich texts with the same markdown style, expressed as a row in the returned array. The value k of the j-th column corresponds to the length of the current span rich_texts[j-k:j].
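The layout can be pictured with a small worked example. The sketch below is an assumption-laden stand-in: it uses nested lists instead of a NumPy array, takes one set of active style names per rich text instead of RichTextBase objects, assumes one row per entry of MD_STYLES, and counts the current column as part of the span. The name span_lengths is hypothetical.

```python
MD_STYLES = ("bold", "italic", "strikethrough", "code", "link")


def span_lengths(active_styles: list[set[str]]) -> list[list[int]]:
    """Hypothetical stand-in: one row per style, one column per rich text.

    Each cell holds the running length of the span that ends at that column,
    or 0 if the style is not active there.
    """
    rows = []
    for style in MD_STYLES:
        row, run = [], 0
        for styles in active_styles:
            run = run + 1 if style in styles else 0
            row.append(run)
        rows.append(row)
    return rows


# Four rich texts: bold, bold+italic, italic, plain.
table = span_lengths([{"bold"}, {"bold", "italic"}, {"italic"}, set()])
print(table[0])  # bold row:   [1, 2, 0, 0]
print(table[1])  # italic row: [0, 1, 2, 0]
```

Under these assumptions, a value of 2 in the second column of the bold row says that the bold span covers the first two rich texts.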

python_identifier(string: str) -> str

Make a valid Python identifier.

This will remove any leading characters that are not valid and change all invalid interior sequences to underscore.

Attention: This may result in an empty string!
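The two steps described above can be sketched with regular expressions. This is an illustration under assumptions, not the module's code: python_identifier_sketch is a hypothetical name, only ASCII letters, digits and underscores are treated as valid, and trailing invalid characters are handled like interior ones.

```python
import re


def python_identifier_sketch(string: str) -> str:
    """Hypothetical stand-in for python_identifier, restricted to ASCII."""
    # Step 1: drop leading characters that cannot start an identifier.
    string = re.sub(r"^[^A-Za-z_]+", "", string)
    # Step 2: collapse each run of invalid characters into one underscore.
    return re.sub(r"[^A-Za-z0-9_]+", "_", string)


print(python_identifier_sketch("1st place"))      # st_place
print(python_identifier_sketch("my-page title"))  # my_page_title
print(repr(python_identifier_sketch("42")))       # ''
```

The last call shows the empty-string case: every character is removed in the first step.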

rich_texts_to_markdown(rich_texts: list[RichTextBase]) -> str

Convert a list of rich texts to markdown.

snake_case(string: str) -> str

Make a Python identifier in snake_case.

Attention: This may result in an empty string!

sorted_md_spans(md_spans: np.ndarray) -> Iterator[tuple[int, int, str]]

Yield the spans of the given markdown span array in the order required for rendering.

We have to iterate from the smallest spans to the largest spans and from left to right.
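The ordering can be sketched as follows. Several things here are assumptions rather than documented behavior: the input is a nested list with one row per style holding running span lengths, the yielded tuple is read as (start index, end index, style), and ties between equally long spans at the same position are broken by style name. The name sorted_spans_sketch is hypothetical.

```python
from collections.abc import Iterator

MD_STYLES = ("bold", "italic", "strikethrough", "code", "link")


def sorted_spans_sketch(lengths: list[list[int]]) -> Iterator[tuple[int, int, str]]:
    """Hypothetical stand-in: yield spans from smallest to largest, left to right."""
    spans = []
    for style, row in zip(MD_STYLES, lengths):
        for j, k in enumerate(row):
            ends_here = j + 1 == len(row) or row[j + 1] == 0
            if k > 0 and ends_here:
                # Store (length, start, style) so plain tuple sorting gives the order.
                spans.append((k, j - k + 1, style))
    for k, start, style in sorted(spans):
        yield start, start + k - 1, style


lengths = [
    [1, 2, 0, 0],  # bold
    [0, 1, 2, 0],  # italic
    [0, 0, 0, 0],  # strikethrough
    [0, 0, 0, 1],  # code
    [0, 0, 0, 0],  # link
]
print(list(sorted_spans_sketch(lengths)))
# [(3, 3, 'code'), (0, 1, 'bold'), (1, 2, 'italic')]
```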