# md-stream

Incremental zero-copy markdown parser for streaming LLM output.

Designed for chat interfaces where markdown arrives token-by-token and
needs to be rendered progressively. Zero dependencies.

## Design

All parsed output uses `Span { start, end }` byte indices into the
parser's internal buffer rather than owned `String`s. This means:

- **Zero heap allocations** in the parsing hot path
- Spans are `Copy` (16 bytes vs String's 24 + heap, on 64-bit targets)
- Buffer is append-only: pushed content never changes offset, so spans stay valid
- Resolve spans to `&str` via `span.resolve(parser.buffer())`

## Usage

```rust
use md_stream::{StreamParser, MdElement, Span};

let mut parser = StreamParser::new();

// Push tokens as they arrive from the LLM
parser.push("# Hello ");
parser.push("World\n\n");
parser.push("Some **bold** text\n\n");

// Read completed elements
for element in parser.parsed() {
    match element {
        MdElement::Heading { level, content } => {
            // `content` is a Span; resolve it against the buffer to get a &str
            println!("H{}: {}", level, content.resolve(parser.buffer()));
        }
        MdElement::Paragraph(_inlines) => {
            // render inline elements...
        }
        _ => {}
    }
}

// Call finalize when the stream ends to flush any partial state
parser.finalize();
```

## Supported elements

- Headings (`# ` through `###### `)
- Paragraphs with inline formatting:
  - **Bold** (`**text**`)
  - *Italic* (`*text*`)
  - ***Bold italic*** (`***text***`)
  - ~~Strikethrough~~ (`~~text~~`)
  - `Inline code` (`` `code` ``)
  - Links (`[text](url)`)
  - Images (`![alt](url)`)
- Fenced code blocks (`` ``` `` and `~~~`) with language tags
- Tables (`| header | header |` with separator row)
- Thematic breaks (`---`, `***`, `___`)

## Streaming behavior

The parser handles partial input gracefully:

- **Partial state**: Access `parser.partial()` to speculatively render
  in-progress elements (e.g. a code block still receiving content)
- **Ambiguous prefixes**: Single `` ` `` is deferred until the parser
  can confirm it's not the start of `` ``` ``
- **Split boundaries**: Double newlines split across `push()` calls
  are detected correctly
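
To make the span and split-boundary ideas concrete, here is a minimal standalone sketch. It is not md-stream's implementation: `BlockSplitter` and its fields are hypothetical names invented for illustration, and only the `Span { start, end }` shape and the `resolve` idea come from this README. It shows one way an append-only buffer can emit byte-index spans and still catch a `\n\n` that arrives split across two `push()` calls.

```rust
/// Byte range into a buffer, mirroring the README's `Span { start, end }`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Span {
    start: usize,
    end: usize,
}

impl Span {
    /// Borrow the text this span covers; no allocation.
    fn resolve<'a>(&self, buffer: &'a str) -> &'a str {
        &buffer[self.start..self.end]
    }
}

/// Hypothetical toy block splitter (an assumption, not the library's API).
struct BlockSplitter {
    buffer: String,     // append-only, so byte offsets never change
    block_start: usize, // where the current unfinished block begins
    scanned: usize,     // bytes already checked for "\n\n"
    blocks: Vec<Span>,  // completed blocks, as spans into `buffer`
}

impl BlockSplitter {
    fn new() -> Self {
        Self { buffer: String::new(), block_start: 0, scanned: 0, blocks: Vec::new() }
    }

    fn push(&mut self, chunk: &str) {
        self.buffer.push_str(chunk);
        // Back up one byte so a "\n" that ended the previous chunk can pair
        // with a "\n" that starts this one. Searching bytes (not &str)
        // avoids slicing in the middle of a multi-byte character.
        let mut from = self.scanned.saturating_sub(1).max(self.block_start);
        while let Some(offset) = self.buffer.as_bytes()[from..]
            .windows(2)
            .position(|w| w == b"\n\n")
        {
            let end = from + offset;
            if end > self.block_start {
                self.blocks.push(Span { start: self.block_start, end });
            }
            self.block_start = end + 2; // skip the blank-line separator
            from = self.block_start;
        }
        self.scanned = self.buffer.len();
    }
}
```

With this sketch, pushing `"# Hello "`, then `"World\n"`, then `"\nSome text\n\n"` produces two spans that resolve to `# Hello World` and `Some text`, even though the first blank line straddles two pushes. The spans stay valid if the `String` reallocates, because they store offsets rather than pointers.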