app.reporter.markdown
Markdown-to-HTML conversion and confidence-token highlighting.
Provides functions for converting AI-produced Markdown (headings, lists, bold, italic, code spans, tables, fenced code blocks) into safe HTML suitable for embedding in forensic reports. Also includes Jinja2 filter wrappers for direct use in templates.
Key capabilities:
- Inline formatting -- bold, italic, backtick code spans.
- Block elements -- headings, ordered/unordered lists, fenced code blocks, paragraphs with <br> line breaks.
- Tables -- pipe-delimited Markdown tables rendered as <table> elements.
- Confidence highlighting -- severity tokens (CRITICAL, HIGH, MEDIUM, LOW) wrapped in coloured <span> elements.
Attributes:
- CONFIDENCE_PATTERN: Regex matching severity tokens (case-insensitive).
- CONFIDENCE_CLASS_MAP: Maps severity label strings to CSS class names.
- MARKDOWN_HEADING_PATTERN: Regex matching ATX-style headings (# through ######).
- MARKDOWN_ORDERED_LIST_PATTERN: Regex matching ordered list items.
- MARKDOWN_UNORDERED_LIST_PATTERN: Regex matching unordered list items.
- MARKDOWN_BOLD_STAR_PATTERN: Regex matching **bold** syntax.
- MARKDOWN_BOLD_UNDERSCORE_PATTERN: Regex matching __bold__ syntax.
- MARKDOWN_ITALIC_STAR_PATTERN: Regex matching *italic* syntax.
- MARKDOWN_ITALIC_UNDERSCORE_PATTERN: Regex matching _italic_ syntax.
- MARKDOWN_TABLE_SEPARATOR_CELL_PATTERN: Regex matching table separator cells.
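To get a feel for what some of these patterns accept, the sketch below recompiles four of them with the same expressions used in the module source and probes them with sample strings. The `probe` helper and the sample inputs are illustrative assumptions, not part of the module.

```python
import re

# Expressions copied from the module source; the short names are local aliases.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")
BOLD_STAR = re.compile(r"\*\*(.+?)\*\*")
ITALIC_STAR = re.compile(r"(?<!\*)\*(?!\*)(.+?)(?<!\*)\*(?!\*)")
SEPARATOR_CELL = re.compile(r"^:?-{3,}:?$")


def probe() -> dict[str, object]:
    """Hypothetical helper: collect a few match results for inspection."""
    heading = HEADING.match("### Findings")
    return {
        # Heading level is the number of leading '#' characters.
        "heading_level": len(heading.group(1)) if heading else None,
        # Seven hashes exceed the {1,6} bound, so nothing matches.
        "seven_hashes": HEADING.match("####### too deep"),
        "bold": BOLD_STAR.findall("a **b** c"),
        # The lookarounds keep single-star italics from matching '**'.
        "italic": ITALIC_STAR.findall("**bold** and *it*"),
        "separator_ok": bool(SEPARATOR_CELL.match(":---:")),
        # Fewer than three dashes is not a separator cell.
        "separator_short": bool(SEPARATOR_CELL.match("--")),
    }


if __name__ == "__main__":
    for name, value in probe().items():
        print(f"{name}: {value!r}")
```

The lookbehind and lookahead assertions in the italic pattern are what allow bold and italic markers to coexist in one line of text.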
```python
"""Markdown-to-HTML conversion and confidence-token highlighting.

Provides functions for converting AI-produced Markdown (headings, lists, bold,
italic, code spans, tables, fenced code blocks) into safe HTML suitable for
embedding in forensic reports.  Also includes Jinja2 filter wrappers for
direct use in templates.

Key capabilities:

* **Inline formatting** -- bold, italic, backtick code spans.
* **Block elements** -- headings, ordered/unordered lists, fenced code blocks,
  paragraphs with ``<br>`` line breaks.
* **Tables** -- pipe-delimited Markdown tables rendered as ``<table>`` elements.
* **Confidence highlighting** -- severity tokens (``CRITICAL``, ``HIGH``,
  ``MEDIUM``, ``LOW``) wrapped in coloured ``<span>`` elements.

Attributes:
    CONFIDENCE_PATTERN: Regex matching severity tokens (case-insensitive).
    CONFIDENCE_CLASS_MAP: Maps severity label strings to CSS class names.
    MARKDOWN_HEADING_PATTERN: Regex matching ATX-style headings (``#``--``######``).
    MARKDOWN_ORDERED_LIST_PATTERN: Regex matching ordered list items.
    MARKDOWN_UNORDERED_LIST_PATTERN: Regex matching unordered list items.
    MARKDOWN_BOLD_STAR_PATTERN: Regex matching ``**bold**`` syntax.
    MARKDOWN_BOLD_UNDERSCORE_PATTERN: Regex matching ``__bold__`` syntax.
    MARKDOWN_ITALIC_STAR_PATTERN: Regex matching ``*italic*`` syntax.
    MARKDOWN_ITALIC_UNDERSCORE_PATTERN: Regex matching ``_italic_`` syntax.
    MARKDOWN_TABLE_SEPARATOR_CELL_PATTERN: Regex matching table separator cells.
"""

from __future__ import annotations

from collections.abc import Sequence
import html
import re
from typing import Any

from markupsafe import Markup, escape

__all__ = [
    "CONFIDENCE_CLASS_MAP",
    "CONFIDENCE_PATTERN",
    "format_block",
    "format_markdown_block",
    "highlight_confidence_tokens",
    "markdown_to_html",
    "render_inline_markdown",
]

CONFIDENCE_PATTERN = re.compile(r"\b(CRITICAL|HIGH|MEDIUM|LOW)\b", re.IGNORECASE)
MARKDOWN_HEADING_PATTERN = re.compile(r"^(#{1,6})\s+(.*)$")
MARKDOWN_ORDERED_LIST_PATTERN = re.compile(r"^\d+\.\s+(.*)$")
MARKDOWN_UNORDERED_LIST_PATTERN = re.compile(r"^[-*]\s+(.*)$")
MARKDOWN_BOLD_STAR_PATTERN = re.compile(r"\*\*(.+?)\*\*")
MARKDOWN_BOLD_UNDERSCORE_PATTERN = re.compile(r"__(.+?)__")
MARKDOWN_ITALIC_STAR_PATTERN = re.compile(r"(?<!\*)\*(?!\*)(.+?)(?<!\*)\*(?!\*)")
MARKDOWN_ITALIC_UNDERSCORE_PATTERN = re.compile(r"(?<!_)_(?!_)(.+?)(?<!_)_(?!_)")
MARKDOWN_TABLE_SEPARATOR_CELL_PATTERN = re.compile(r"^:?-{3,}:?$")

CONFIDENCE_CLASS_MAP = {
    "CRITICAL": "confidence-critical",
    "HIGH": "confidence-high",
    "MEDIUM": "confidence-medium",
    "LOW": "confidence-low",
}


def _stringify(value: Any, default: str = "") -> str:
    """Convert *value* to a stripped string, returning *default* if empty.

    Args:
        value: Any value to convert to string.
        default: Fallback when *value* is None or empty after stripping.

    Returns:
        The stripped string representation, or *default*.
    """
    if value is None:
        return default
    text = str(value).strip()
    return text if text else default


def highlight_confidence_tokens(text: str) -> str:
    """Wrap severity tokens in coloured ``<span>`` elements.

    Matches ``CRITICAL``, ``HIGH``, ``MEDIUM``, and ``LOW``
    (case-insensitive) and wraps each in a ``<span>`` with the
    corresponding CSS class from :data:`CONFIDENCE_CLASS_MAP`.

    Args:
        text: Pre-escaped HTML string to scan for severity tokens.

    Returns:
        The input string with severity tokens wrapped in spans.
    """
    def _replace_confidence(match: re.Match[str]) -> str:
        """Replace a confidence token match with a styled span."""
        token = match.group(1).upper()
        css_class = CONFIDENCE_CLASS_MAP.get(token, "confidence-unknown")
        return f'<span class="confidence-inline {css_class}">{token}</span>'

    return CONFIDENCE_PATTERN.sub(_replace_confidence, text)


def render_inline_markdown(value: str) -> str:
    """Render inline Markdown formatting to HTML.

    Handles backtick code spans, bold (``**`` and ``__``), italic
    (``*`` and ``_``), and confidence-token highlighting.  Code-span
    contents are emitted unchanged.  This function does not HTML-escape
    its input; callers must pass pre-escaped text, as
    :func:`markdown_to_html` does.

    Args:
        value: Inline Markdown text, already HTML-escaped.

    Returns:
        An HTML string with inline formatting applied.
    """
    source = str(value or "")
    if not source:
        return ""

    parts = re.split(r"(`[^`\n]*`)", source)
    output: list[str] = []
    for part in parts:
        if not part:
            continue
        if part.startswith("`") and part.endswith("`"):
            output.append(f"<code>{part[1:-1]}</code>")
            continue

        # The text is expected to be escaped by the caller already.
        escaped = part
        escaped = MARKDOWN_BOLD_STAR_PATTERN.sub(r"<strong>\1</strong>", escaped)
        escaped = MARKDOWN_BOLD_UNDERSCORE_PATTERN.sub(r"<strong>\1</strong>", escaped)
        escaped = MARKDOWN_ITALIC_STAR_PATTERN.sub(r"<em>\1</em>", escaped)
        escaped = MARKDOWN_ITALIC_UNDERSCORE_PATTERN.sub(r"<em>\1</em>", escaped)
        escaped = highlight_confidence_tokens(escaped)
        output.append(escaped)
    return "".join(output)


def _split_table_row(value: str) -> list[str]:
    """Split a Markdown table row into cell strings.

    Strips leading and trailing pipe characters before splitting on
    the remaining pipes.  Returns an empty list when *value* does
    not contain a pipe.

    Args:
        value: A single Markdown table row line.

    Returns:
        A list of stripped cell strings, or an empty list.
    """
    row_text = str(value or "")
    if "|" not in row_text:
        return []

    trimmed = row_text.strip()
    if not trimmed or "|" not in trimmed:
        return []

    if trimmed.startswith("|"):
        trimmed = trimmed[1:]
    if trimmed.endswith("|"):
        trimmed = trimmed[:-1]

    return [cell.strip() for cell in trimmed.split("|")]


def _is_table_separator_row(cells: Sequence[str]) -> bool:
    """Determine whether *cells* represent a Markdown table separator row.

    A separator row consists entirely of cells matching the pattern
    ``:?-{3,}:?`` (e.g. ``---``, ``:---:``, ``---:``).

    Args:
        cells: List of cell strings from a split table row.

    Returns:
        *True* when every cell matches the separator pattern.
    """
    if not cells:
        return False
    return all(MARKDOWN_TABLE_SEPARATOR_CELL_PATTERN.match(str(cell).strip()) for cell in cells)


def _normalize_table_row_cells(cells: Sequence[str], expected_count: int) -> list[str]:
    """Pad or truncate *cells* to exactly *expected_count* entries.

    Cells beyond *expected_count* are discarded; missing cells are
    filled with empty strings.

    Args:
        cells: Raw cell values from a split table row.
        expected_count: The desired number of columns.

    Returns:
        A list of exactly *expected_count* stripped cell strings.
    """
    normalized = [str(cell).strip() for cell in cells[:expected_count]]
    if len(normalized) < expected_count:
        normalized.extend([""] * (expected_count - len(normalized)))
    return normalized


def _render_table_html(header_cells: Sequence[str], body_rows: Sequence[Sequence[str]]) -> str:
    """Render a parsed Markdown table as an HTML ``<table>`` element.

    Each cell value is processed through :func:`render_inline_markdown`
    so that inline formatting (bold, italic, code spans) is preserved.

    Args:
        header_cells: List of header cell strings.
        body_rows: List of body row lists, each containing cell
            strings matching the header column count.

    Returns:
        An HTML string containing the complete ``<table>`` element.
    """
    header_html = "".join(f"<th>{render_inline_markdown(cell)}</th>" for cell in header_cells)
    table_html = [f"<table><thead><tr>{header_html}</tr></thead>"]

    if body_rows:
        rows_html: list[str] = []
        for row in body_rows:
            row_html = "".join(f"<td>{render_inline_markdown(cell)}</td>" for cell in row)
            rows_html.append(f"<tr>{row_html}</tr>")
        table_html.append(f"<tbody>{''.join(rows_html)}</tbody>")

    table_html.append("</table>")
    return "".join(table_html)


def markdown_to_html(value: str) -> str:
    """Convert a complete Markdown text block to HTML.

    Supports headings (``#`` through ``######``), ordered and
    unordered lists, fenced code blocks (triple backticks), tables,
    inline formatting (bold, italic, code spans), and
    confidence-token highlighting.  Paragraphs are wrapped in
    ``<p>`` tags with ``<br>`` line breaks.

    Args:
        value: Raw Markdown text (may contain multiple blocks).

    Returns:
        An HTML string with all recognised Markdown constructs
        converted to their HTML equivalents.
    """
    value = html.escape(str(value))
    lines = value.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    blocks: list[str] = []
    paragraph_lines: list[str] = []
    list_items: list[str] = []
    list_type = ""
    in_code_fence = False
    code_lines: list[str] = []

    def flush_paragraph() -> None:
        """Flush accumulated paragraph lines into a ``<p>`` block."""
        nonlocal paragraph_lines
        if not paragraph_lines:
            return
        paragraph_text = "\n".join(paragraph_lines)
        rendered = render_inline_markdown(paragraph_text).replace("\n", "<br>\n")
        blocks.append(f"<p>{rendered}</p>")
        paragraph_lines = []

    def flush_list() -> None:
        """Flush accumulated list items into an ``<ol>`` or ``<ul>`` block."""
        nonlocal list_items, list_type
        if not list_items or not list_type:
            list_items = []
            list_type = ""
            return
        items_html = "".join(f"<li>{item}</li>" for item in list_items)
        blocks.append(f"<{list_type}>{items_html}</{list_type}>")
        list_items = []
        list_type = ""

    def flush_code_fence() -> None:
        """Flush accumulated code lines into a ``<pre><code>`` block."""
        nonlocal code_lines
        code_text = "\n".join(code_lines)
        blocks.append(f"<pre><code>{code_text}</code></pre>")
        code_lines = []

    index = 0
    while index < len(lines):
        line = lines[index]
        stripped = line.strip()

        if in_code_fence:
            if stripped.startswith("```"):
                in_code_fence = False
                flush_code_fence()
            else:
                code_lines.append(line)
            index += 1
            continue

        if stripped.startswith("```"):
            flush_paragraph()
            flush_list()
            in_code_fence = True
            code_lines = []
            index += 1
            continue

        if not stripped:
            flush_paragraph()
            flush_list()
            index += 1
            continue

        header_cells = _split_table_row(line)
        if header_cells and index + 1 < len(lines):
            separator_cells = _split_table_row(lines[index + 1])
            if (
                separator_cells
                and len(header_cells) == len(separator_cells)
                and _is_table_separator_row(separator_cells)
            ):
                flush_paragraph()
                flush_list()

                expected_columns = len(header_cells)
                normalized_header = _normalize_table_row_cells(header_cells, expected_columns)
                body_rows: list[list[str]] = []

                index += 2
                while index < len(lines):
                    body_line = lines[index]
                    body_stripped = body_line.strip()
                    if not body_stripped:
                        break

                    parsed_cells = _split_table_row(body_line)
                    if not parsed_cells:
                        break

                    body_rows.append(_normalize_table_row_cells(parsed_cells, expected_columns))
                    index += 1

                blocks.append(_render_table_html(normalized_header, body_rows))
                continue

        heading_match = MARKDOWN_HEADING_PATTERN.match(stripped)
        if heading_match:
            flush_paragraph()
            flush_list()
            level = len(heading_match.group(1))
            heading_text = render_inline_markdown(heading_match.group(2))
            blocks.append(f"<h{level}>{heading_text}</h{level}>")
            index += 1
            continue

        ordered_match = MARKDOWN_ORDERED_LIST_PATTERN.match(stripped)
        if ordered_match:
            flush_paragraph()
            if list_type != "ol":
                flush_list()
                list_type = "ol"
                list_items = []
            list_items.append(render_inline_markdown(ordered_match.group(1)))
            index += 1
            continue

        unordered_match = MARKDOWN_UNORDERED_LIST_PATTERN.match(stripped)
        if unordered_match:
            flush_paragraph()
            if list_type != "ul":
                flush_list()
                list_type = "ul"
                list_items = []
            list_items.append(render_inline_markdown(unordered_match.group(1)))
            index += 1
            continue

        flush_list()
        paragraph_lines.append(line.strip())
        index += 1

    if in_code_fence:
        flush_code_fence()
    flush_paragraph()
    flush_list()

    return "\n".join(blocks)


def format_block(value: Any) -> Markup:
    """Escape plain text and convert it to safe HTML with line breaks.

    Applies confidence-token highlighting (CRITICAL, HIGH, etc.)
    and replaces newline characters with ``<br>`` tags.  Intended for
    use as a Jinja2 template filter.

    Args:
        value: Raw text to format.

    Returns:
        A :class:`~markupsafe.Markup` string safe for Jinja2
        rendering, or an N/A placeholder when *value* is empty.
    """
    text = _stringify(value, default="")
    if not text:
        return Markup('<span class="empty-value">N/A</span>')

    escaped = str(escape(text.replace("\r\n", "\n").replace("\r", "\n")))
    highlighted = highlight_confidence_tokens(escaped)
    with_line_breaks = highlighted.replace("\n", "<br>\n")
    return Markup(with_line_breaks)


def format_markdown_block(value: Any) -> Markup:
    """Convert Markdown text to HTML via :func:`markdown_to_html`.

    Intended for use as a Jinja2 template filter.

    Args:
        value: Raw Markdown text to render.

    Returns:
        A :class:`~markupsafe.Markup` string of rendered HTML, or
        an N/A placeholder when *value* is empty.
    """
    text = _stringify(value, default="")
    if not text:
        return Markup('<span class="empty-value">N/A</span>')
    return Markup(markdown_to_html(text))
```
def format_block(value: Any) -> Markup:
Escape plain text and convert it to safe HTML with line breaks.
Applies confidence-token highlighting (CRITICAL, HIGH, etc.)
and replaces newline characters with <br> tags. Intended for
use as a Jinja2 template filter.
Arguments:
- value: Raw text to format.
Returns:
A markupsafe.Markup string safe for Jinja2 rendering, or an N/A placeholder when value is empty.
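The order of operations matters here: the text is escaped first, and only afterwards is trusted markup (the highlight spans and <br> tags) added. The sketch below illustrates that ordering under stated assumptions: it uses the standard library's `html.escape` as a stand-in for `markupsafe.escape`, returns a plain `str` rather than `Markup`, and the name `format_block_sketch` is hypothetical.

```python
import html
import re

# Same token expression as CONFIDENCE_PATTERN in the module source.
TOKEN = re.compile(r"\b(CRITICAL|HIGH|MEDIUM|LOW)\b", re.IGNORECASE)


def _span(match: re.Match) -> str:
    """Wrap one matched token in a span; class names mirror the module's map."""
    token = match.group(1).upper()
    return f'<span class="confidence-inline confidence-{token.lower()}">{token}</span>'


def format_block_sketch(value: object) -> str:
    """Illustrative stand-in for format_block (not the module's function)."""
    text = "" if value is None else str(value).strip()
    if not text:
        return '<span class="empty-value">N/A</span>'
    # 1. Normalise line endings, then escape the untrusted text.
    escaped = html.escape(text.replace("\r\n", "\n").replace("\r", "\n"))
    # 2. Add trusted markup only after escaping, so the spans survive intact.
    highlighted = TOKEN.sub(_span, escaped)
    # 3. Turn newlines into visible line breaks.
    return highlighted.replace("\n", "<br>\n")


if __name__ == "__main__":
    print(format_block_sketch("<b>x</b>\r\nRisk: high"))
```

Reversing steps 1 and 2 would escape the generated spans along with the user text, which is why the module escapes before highlighting.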
def format_markdown_block(value: Any) -> Markup:
Convert Markdown text to HTML via markdown_to_html().
Intended for use as a Jinja2 template filter.
Arguments:
- value: Raw Markdown text to render.
Returns:
A markupsafe.Markup string of rendered HTML, or an N/A placeholder when value is empty.
def highlight_confidence_tokens(text: str) -> str:
Wrap severity tokens in coloured <span> elements.
Matches CRITICAL, HIGH, MEDIUM, and LOW
(case-insensitive) and wraps each in a <span> with the
corresponding CSS class from CONFIDENCE_CLASS_MAP.
Arguments:
- text: Pre-escaped HTML string to scan for severity tokens.
Returns:
The input string with severity tokens wrapped in spans.
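Because the pattern is case-insensitive and bounded by \b word boundaries, it is worth checking which words it actually picks up. The sketch below reuses the expression from CONFIDENCE_PATTERN; the `matched_tokens` helper and the sample sentences are assumptions made for illustration.

```python
import re

# Same expression as CONFIDENCE_PATTERN in the module source.
CONFIDENCE_PATTERN = re.compile(r"\b(CRITICAL|HIGH|MEDIUM|LOW)\b", re.IGNORECASE)


def matched_tokens(text: str) -> list[str]:
    """Hypothetical helper: list the tokens that would be wrapped, upper-cased."""
    return [match.group(1).upper() for match in CONFIDENCE_PATTERN.finditer(text)]


if __name__ == "__main__":
    # A lower-case token still matches and is emitted in upper case.
    print(matched_tokens("Confidence: High"))
    # Longer words that merely contain a token do not match.
    print(matched_tokens("HIGHLY unusual, lowest score"))
    # A hyphen is a word boundary, so both halves match.
    print(matched_tokens("a low-medium risk"))
```

One consequence is that ordinary prose such as "low disk space" is highlighted too, so the function is best applied to text where these words are used as severity labels.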
def markdown_to_html(value: str) -> str:
Convert a complete Markdown text block to HTML.
Supports headings (# through ######), ordered and
unordered lists, fenced code blocks (triple backticks), tables,
inline formatting (bold, italic, code spans), and
confidence-token highlighting. Paragraphs are wrapped in
<p> tags with <br> line breaks.
Arguments:
- value: Raw Markdown text (may contain multiple blocks).
Returns:
An HTML string with all recognised Markdown constructs converted to their HTML equivalents.
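Table detection relies on a one-line lookahead: a row containing pipes only starts a table when the next line is a separator row with the same number of cells. The sketch below is a simplified, standalone version of that check; `split_row` and `starts_table` are hypothetical names that loosely mirror the module's private helpers.

```python
import re

# Same expression as MARKDOWN_TABLE_SEPARATOR_CELL_PATTERN in the module source.
SEPARATOR_CELL = re.compile(r"^:?-{3,}:?$")


def split_row(line: str) -> list[str]:
    """Split a pipe-delimited row into stripped cells; [] if there is no pipe."""
    trimmed = line.strip()
    if "|" not in trimmed:
        return []
    # Drop one optional leading and one optional trailing pipe.
    if trimmed.startswith("|"):
        trimmed = trimmed[1:]
    if trimmed.endswith("|"):
        trimmed = trimmed[:-1]
    return [cell.strip() for cell in trimmed.split("|")]


def starts_table(lines: list[str], index: int) -> bool:
    """Return True when lines[index] is a header row followed by a separator."""
    header = split_row(lines[index])
    if not header or index + 1 >= len(lines):
        return False
    separator = split_row(lines[index + 1])
    return (
        bool(separator)
        and len(header) == len(separator)
        and all(SEPARATOR_CELL.match(cell) for cell in separator)
    )


if __name__ == "__main__":
    sample = ["| Host | Risk |", "| --- | :---: |", "| srv1 | HIGH |"]
    print(starts_table(sample, 0), starts_table(sample, 1))
```

A line with pipes that is not followed by a matching separator falls through to the heading, list, and paragraph checks, so stray pipes in prose do not produce tables.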
def render_inline_markdown(value: str) -> str:
Render inline Markdown formatting to HTML.
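Handles backtick code spans, bold (** and __), italic
(* and _), and confidence-token highlighting. Code-span contents
are emitted unchanged inside <code> elements. The function does not
HTML-escape its input, so callers must pass pre-escaped text, as
markdown_to_html() does.
Arguments:
- value: Inline Markdown text, already HTML-escaped.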
Returns:
An HTML string with inline formatting applied.
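The code-span handling depends on `re.split` with a capturing group, which keeps the backtick-delimited pieces in the result so they can bypass the bold and italic substitutions. The sketch below shows that mechanism in isolation; it applies only the `**bold**` rule, and `render_inline_sketch` is a hypothetical name rather than the module's function.

```python
import html
import re

# Expressions taken from the module source.
CODE_SPAN_SPLIT = re.compile(r"(`[^`\n]*`)")
BOLD_STAR = re.compile(r"\*\*(.+?)\*\*")


def render_inline_sketch(escaped_text: str) -> str:
    """Illustrative sketch: expects text that has already been HTML-escaped."""
    output: list[str] = []
    # The capturing group makes re.split return the code spans as well.
    for part in CODE_SPAN_SPLIT.split(escaped_text):
        if not part:
            continue
        if part.startswith("`") and part.endswith("`"):
            # Code spans skip Markdown substitution entirely.
            output.append(f"<code>{part[1:-1]}</code>")
        else:
            output.append(BOLD_STAR.sub(r"<strong>\1</strong>", part))
    return "".join(output)


if __name__ == "__main__":
    print(render_inline_sketch("run `ls -l` **now**"))
    print(render_inline_sketch("`**x**`"))
    # Escape untrusted text before rendering, as markdown_to_html() does.
    print(render_inline_sketch(html.escape("<i>**hi**</i>")))
```

Since `html.escape` leaves backticks and asterisks untouched, escaping first does not interfere with the Markdown markers that are matched afterwards.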