# Tokenator Developer Documentation

This document provides detailed information for developers who want to use the Tokenator library in their projects or contribute to its development.

## Core Concepts

Tokenator works with two primary concepts:

1. **Token Parsing**: Converting a sequence of string tokens into structured data
2. **Token Serialization**: Converting structured data into a sequence of string tokens

The library is designed to be simple, efficient, and flexible for working with delimited string formats.

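For example, with a `":"` delimiter a record such as `SET:name:alice` corresponds to the token sequence `["SET", "name", "alice"]`. The following is only a conceptual sketch using the standard library, not the Tokenator API, to show what parsing and serialization mean here:

```rust
// Parsing direction: a delimited record becomes a sequence of tokens.
let record = "SET:name:alice";
let tokens: Vec<&str> = record.split(':').collect();
assert_eq!(tokens, ["SET", "name", "alice"]);

// Serialization direction: tokens are joined back with the delimiter.
let rebuilt = tokens.join(":");
assert_eq!(rebuilt, record);
```
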
## API Reference

### TokenParser

`TokenParser` is responsible for parsing tokens from a slice of string references.

```rust
pub struct TokenParser<'a> {
    tokens: &'a [&'a str],
    index: usize,
}
```

Key methods:

- `new(tokens: &'a [&'a str]) -> Self`: Creates a new parser from a slice of string tokens
- `pull_token() -> Result<&'a str, ParseError<'a>>`: Gets the next token and advances the index
- `peek_token() -> Result<&'a str, ParseError<'a>>`: Looks at the next token without advancing the index
- `parse_token(expected: &'static str) -> Result<&'a str, ParseError<'a>>`: Checks that the next token matches the expected value, consuming it on success
- `alt<R>(parser: &mut TokenParser<'a>, routes: &[fn(&mut TokenParser<'a>) -> Result<R, ParseError<'a>>]) -> Result<R, ParseError<'a>>`: Tries each parser in `routes` until one succeeds
- `parse_all<R>(&mut self, parse_fn: impl FnOnce(&mut Self) -> Result<R, ParseError<'a>>) -> Result<R, ParseError<'a>>`: Ensures all tokens are consumed after parsing
- `try_parse<R>(&mut self, parse_fn: impl FnOnce(&mut Self) -> Result<R, ParseError<'a>>) -> Result<R, ParseError<'a>>`: Attempts to parse and backtracks on failure
- `is_eof() -> bool`: Returns `true` when all tokens have been consumed (no tokens left to parse)

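As a rough sketch of how these methods fit together (the `parse_user` helper, the token values, and the record layout here are made up for illustration):

```rust
use tokenator::{TokenParser, ParseError};

fn parse_user<'a>(parser: &mut TokenParser<'a>) -> Result<(&'a str, &'a str), ParseError<'a>> {
    // Expect the literal token "user", then pull the two fields after it.
    parser.parse_token("user")?;
    let name = parser.pull_token()?;
    let pubkey = parser.pull_token()?;
    Ok((name, pubkey))
}

// Tokens would typically come from splitting a delimited string.
let tokens = ["user", "alice", "abcd1234"];
let mut parser = TokenParser::new(&tokens);
let (name, pubkey) = parser.parse_all(|p| parse_user(p))?;
```
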
### TokenWriter

`TokenWriter` is responsible for serializing tokens into a string with the specified delimiter.

```rust
pub struct TokenWriter {
    delim: &'static str,
    tokens_written: usize,
    buf: Vec<u8>,
}
```

Key methods:

- `new(delim: &'static str) -> Self`: Creates a new writer with the specified delimiter
- `default() -> Self`: Creates a new writer with ":" as the delimiter
- `write_token(token: &str)`: Appends a token to the buffer, preceded by the delimiter when it is not the first token
- `str() -> &str`: Gets the current buffer as a string
- `buffer() -> &[u8]`: Gets the current buffer as a byte slice

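For instance, a minimal sketch using the default delimiter (the exact output shown assumes the `":"` default described above):

```rust
use tokenator::TokenWriter;

let mut writer = TokenWriter::default();
writer.write_token("user");
writer.write_token("alice");
// With the default ":" delimiter the buffer now reads "user:alice".
let serialized: &str = writer.str();
```
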
### TokenSerializable

`TokenSerializable` is a trait that types can implement to be serialized to and parsed from tokens.

```rust
pub trait TokenSerializable: Sized {
    fn parse_from_tokens<'a>(parser: &mut TokenParser<'a>) -> Result<Self, ParseError<'a>>;
    fn serialize_tokens(&self, writer: &mut TokenWriter);
}
```

### Error Handling

The library provides detailed error types:

- `ParseError<'a>`: Represents errors that can occur during parsing
  - `Incomplete`: Not done parsing yet
  - `AltAllFailed`: All parsing options failed
  - `DecodeFailed`: General decoding failure
  - `HexDecodeFailed`: Hex decoding failure
  - `UnexpectedToken`: Encountered an unexpected token
  - `EOF`: No more tokens

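How a caller reacts usually depends on the variant. A hedged sketch (the exact payload of most variants is not shown here; `UnexpectedToken` wrapping a struct with `expected` and `found` fields follows the integration example later in this document):

```rust
use tokenator::{TokenParser, ParseError};

let tokens = ["SET", "name"];
let mut parser = TokenParser::new(&tokens);

match parser.parse_token("GET") {
    Ok(_) => { /* token matched, keep parsing */ }
    Err(ParseError::UnexpectedToken(err)) => {
        // `expected` and `found` mirror the fields used in the
        // integration example below.
        eprintln!("expected {}, found {}", err.expected, err.found);
    }
    Err(_) => eprintln!("parse failed"),
}
```
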
## Advanced Usage

### Backtracking and Alternative Parsing

One of the powerful features of Tokenator is its support for backtracking and alternative parsing paths:

```rust
// Try multiple parsing strategies
let result = TokenParser::alt(&mut parser, &[
    |p| parse_strategy_a(p),
    |p| parse_strategy_b(p),
    |p| parse_strategy_c(p),
]);

// Attempt to parse but backtrack on failure
let result = parser.try_parse(|p| {
    let token = p.parse_token("specific_token")?;
    // More parsing...
    Ok(token)
});
```

### Parsing Hex Data

The library includes utilities for parsing hexadecimal data:

```rust
use tokenator::parse_hex_id;

// Parse a 32-byte hex string from the next token
let hash: [u8; 32] = parse_hex_id(&mut parser)?;
```

### Custom Delimiters

You can use custom delimiters when serializing tokens:

```rust
// Create a writer with a custom delimiter
let mut writer = TokenWriter::new("|");
writer.write_token("user");
writer.write_token("alice");
// Result: "user|alice"
```

## Best Practices

1. **Implement TokenSerializable for your types**: This ensures consistency between parsing and serialization logic.

2. **Use try_parse for speculative parsing**: When trying different parsing strategies, wrap them in `try_parse` to ensure proper backtracking.

3. **Handle all error cases**: The detailed error types provided by Tokenator help identify and handle specific parsing issues.

4. **Consider memory efficiency**: The parser works with string references to avoid unnecessary copying.

5. **Validate input**: Always validate input tokens before attempting to parse them into your data structures.

## Integration Examples

### Custom Protocol Parser

```rust
use tokenator::{TokenParser, TokenWriter, TokenSerializable, ParseError};

enum Command {
    Get { key: String },
    Set { key: String, value: String },
    Delete { key: String },
}

impl TokenSerializable for Command {
    fn parse_from_tokens<'a>(parser: &mut TokenParser<'a>) -> Result<Self, ParseError<'a>> {
        let cmd = parser.pull_token()?;

        match cmd {
            "GET" => {
                let key = parser.pull_token()?.to_string();
                Ok(Command::Get { key })
            },
            "SET" => {
                let key = parser.pull_token()?.to_string();
                let value = parser.pull_token()?.to_string();
                Ok(Command::Set { key, value })
            },
            "DEL" => {
                let key = parser.pull_token()?.to_string();
                Ok(Command::Delete { key })
            },
            _ => Err(ParseError::UnexpectedToken(tokenator::UnexpectedToken {
                expected: "GET, SET, or DEL",
                found: cmd,
            })),
        }
    }

    fn serialize_tokens(&self, writer: &mut TokenWriter) {
        match self {
            Command::Get { key } => {
                writer.write_token("GET");
                writer.write_token(key);
            },
            Command::Set { key, value } => {
                writer.write_token("SET");
                writer.write_token(key);
                writer.write_token(value);
            },
            Command::Delete { key } => {
                writer.write_token("DEL");
                writer.write_token(key);
            },
        }
    }
}
```
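
A hypothetical round trip with this `Command` type might look like the following sketch (the input string and the `":"` delimiter of `TokenWriter::default()` are assumptions for illustration):

```rust
// Continuing from the example above, with the same imports in scope.
// Split a delimited command string into tokens and parse it.
let input = "SET:name:alice";
let tokens: Vec<&str> = input.split(':').collect();
let mut parser = TokenParser::new(&tokens);
let cmd = Command::parse_from_tokens(&mut parser)?;

// Serialize it back out with the default ":" delimiter.
let mut writer = TokenWriter::default();
cmd.serialize_tokens(&mut writer);
assert_eq!(writer.str(), "SET:name:alice");
```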

## Contributing

Contributions to Tokenator are welcome! Here are some areas that could be improved:

- Additional parsing utilities
- Performance optimizations
- More comprehensive test coverage
- Example implementations for common use cases
- Documentation improvements

When submitting a pull request, please ensure:

1. All tests pass
2. New functionality includes appropriate tests
3. Documentation is updated to reflect changes
4. Code follows the existing style conventions