Ubuntu, macOS and Windows: [![Build Status](https://github.com/dvidelabs/flatcc/actions/workflows/ci.yml/badge.svg)](https://github.com/dvidelabs/flatcc/actions/workflows/ci.yml)
Windows: [![Windows Build Status](https://ci.appveyor.com/api/projects/status/github/dvidelabs/flatcc?branch=master&svg=true)](https://ci.appveyor.com/project/dvidelabs/flatcc)
Weekly: [![Build Status](https://github.com/dvidelabs/flatcc/actions/workflows/weekly.yml/badge.svg)](https://github.com/dvidelabs/flatcc/actions/workflows/weekly.yml)


_The JSON parser may change the interface for parsing union vectors in a
future release, which would require generated code to match the library
version._

# FlatCC FlatBuffers in C for C

`flatcc` has no external dependencies except for build and compiler
tools, and the C runtime library. With concurrent Ninja builds, a small client
project can build flatcc with libraries, generate schema code, link the project
and execute a test case in a few seconds, produce binaries between 15K and 60K,
read small buffers in 30ns, build FlatBuffers in about 600ns, and with a larger
executable also handle optional JSON parsing or printing in less than 2 us for a
10 field mixed type message.

<!-- vim-markdown-toc GFM -->

* [Online Forums](#online-forums)
* [Introduction](#introduction)
* [Project Details](#project-details)
* [Poll on Meson Build](#poll-on-meson-build)
* [Reporting Bugs](#reporting-bugs)
* [Status](#status)
* [Main features supported as of 0.6.1](#main-features-supported-as-of-061)
* [Supported platforms (CI tested)](#supported-platforms-ci-tested)
* [Platforms reported to work by users](#platforms-reported-to-work-by-users)
* [Portability](#portability)
* [Time / Space / Usability Tradeoff](#time--space--usability-tradeoff)
* [Generated Files](#generated-files)
* [Use of Macros in Generated Code](#use-of-macros-in-generated-code)
* [Extracting Documentation](#extracting-documentation)
* [Using flatcc](#using-flatcc)
* [Trouble Shooting](#trouble-shooting)
* [Quickstart](#quickstart)
* [Reading a Buffer](#reading-a-buffer)
* [Compiling for Read-Only](#compiling-for-read-only)
* [Building a Buffer](#building-a-buffer)
* [Verifying a Buffer](#verifying-a-buffer)
* [Potential Name Conflicts](#potential-name-conflicts)
* [Debugging a Buffer](#debugging-a-buffer)
* [File and Type Identifiers](#file-and-type-identifiers)
* [File Identifiers](#file-identifiers)
* [Type Identifiers](#type-identifiers)
* [JSON Parsing and Printing](#json-parsing-and-printing)
* [Base64 Encoding](#base64-encoding)
* [Fixed Length Arrays](#fixed-length-arrays)
* [Runtime Flags](#runtime-flags)
* [Generic Parsing and Printing.](#generic-parsing-and-printing)
* [Performance Notes](#performance-notes)
* [Global Scope and Included Schema](#global-scope-and-included-schema)
* [Required Fields and Duplicate Fields](#required-fields-and-duplicate-fields)
* [Fast Buffers](#fast-buffers)
* [Types](#types)
* [Unions](#unions)
* [Union Scope Resolution](#union-scope-resolution)
* [Fixed Length Arrays](#fixed-length-arrays-1)
* [Optional Fields](#optional-fields)
* [Endianness](#endianness)
* [Pitfalls in Error Handling](#pitfalls-in-error-handling)
* [Searching and Sorting](#searching-and-sorting)
* [Null Values](#null-values)
* [Portability Layer](#portability-layer)
* [Building](#building)
* [Unix Build (OS-X, Linux, related)](#unix-build-os-x-linux-related)
* [Windows Build (MSVC)](#windows-build-msvc)
* [Docker](#docker)
* [Cross-compilation](#cross-compilation)
* [Custom Allocation](#custom-allocation)
* [Custom Asserts](#custom-asserts)
* [Shared Libraries](#shared-libraries)
* [Distribution](#distribution)
* [Unix Files](#unix-files)
* [Windows Files](#windows-files)
* [Running Tests on Unix](#running-tests-on-unix)
* [Running Tests on Windows](#running-tests-on-windows)
* [Configuration](#configuration)
* [Using the Compiler and Builder library](#using-the-compiler-and-builder-library)
* [FlatBuffers Binary Format](#flatbuffers-binary-format)
* [Security Considerations](#security-considerations)
* [Style Guide](#style-guide)
* [Benchmarks](#benchmarks)

<!-- vim-markdown-toc -->

## Online Forums

- [Google Groups - FlatBuffers](https://groups.google.com/forum/#!forum/flatbuffers)
- [Discord - FlatBuffers](https://discord.gg/6qgKs3R)
- [Gitter - FlatBuffers](https://gitter.im/google/flatbuffers)


## Introduction

This project builds flatcc, a compiler that generates FlatBuffers code for
C given a FlatBuffer schema file. This introduction also creates a separate test
project with the traditional monster example, here in a C version.

For now assume a Unix like system although that is not a general requirement -
see also [Building](#building). You will need git, cmake, bash, a C compiler,
and either the ninja build system, or make.

    git clone https://github.com/dvidelabs/flatcc.git
    cd flatcc
    # scripts/initbuild.sh ninja
    scripts/initbuild.sh make
    scripts/setup.sh -a ../mymonster
    ls bin
    ls lib
    cd ../mymonster
    ls src
    scripts/build.sh
    ls generated

`scripts/initbuild.sh` is optional and chooses the build backend, which defaults
to ninja.

The setup script builds flatcc using CMake, then creates a test project
directory with the monster example, and a build script which is just a small
shell script. The headers and libraries are symbolically linked into the test
project. You do not need CMake to build your own projects once flatcc is
compiled.

To create another test project named foobar, call `scripts/setup.sh -s -x
../foobar`. This will avoid rebuilding the flatcc project from scratch.


## Project Details

NOTE: see
[CHANGELOG](https://github.com/dvidelabs/flatcc/blob/master/CHANGELOG.md).
There are occasionally minor breaking changes as API inconsistencies
are discovered. Unless clearly stated, breaking changes will not affect
the compiled runtime library, only the header files. In case of trouble,
make sure the `flatcc` tool is the same version as the `include/flatcc`
path.

The project includes:

- an executable `flatcc` FlatBuffers schema compiler for C and a
  corresponding library `libflatcc.a`. The compiler generates C header
  files or a binary flatbuffers schema.
- a typeless runtime library `libflatccrt.a` for building and verifying
  flatbuffers from C. Generated builder headers depend on this library.
  It may also be useful for other language interfaces. The library
  maintains a stack state to make it easy to build buffers from a parser
  or similar.
- a small `flatcc/portable` header only library for non-C11 compliant
  compilers, and small helpers for all compilers including endian
  handling and numeric printing and parsing.


See also:

- [Reporting Bugs](https://github.com/dvidelabs/flatcc#reporting-bugs)

- [Google FlatBuffers](http://google.github.io/flatbuffers/)

- [Build Instructions](https://github.com/dvidelabs/flatcc#building)

- [Quickstart](https://github.com/dvidelabs/flatcc#quickstart)

- [Builder Interface Reference]

- [Benchmarks]

The `flatcc` compiler is implemented as a standalone tool instead of
extending Google's `flatc` compiler in order to have a pure portable C
library implementation of the schema compiler that is designed to fail
gracefully on abusive input in long running processes. It is also
believed a C version may help provide schema parsing to other language
interfaces that find interfacing with C easier than C++. The FlatBuffers
team at Google's FPL lab has been very helpful in providing feedback and
answering many questions to help ensure the best possible compatibility.
Notice the name `flatcc` (FlatBuffers C Compiler) vs Google's `flatc`.

The JSON format is compatible with Google's `flatc` tool. The `flatc`
tool converts JSON from the command line using a schema and a buffer as
input. `flatcc` generates schema specific code to read and write JSON
at runtime. While the `flatcc` approach is likely much faster and also
easier to deploy, the `flatc` approach is likely more convenient when
manually working with JSON such as editing game scenes. Both tools have
their place.

**NOTE: Big-endian platforms are only supported as of release 0.4.0.**


## Poll on Meson Build

Adding support for the Meson build system is being considered. Feedback
is welcome via
[issue #56](https://github.com/dvidelabs/flatcc/issues/56).


## Reporting Bugs

If possible, please provide a short reproducible schema and source file
with a main program that returns 1 on error and 0 on success, and a small
build script. Preferably generate a hexdump and call the buffer verifier
to ensure the input is valid, and link with the debug library
`flatccrt_d`.

See also [Debugging a Buffer](#debugging-a-buffer), and [readfile.h]
useful for reading an existing buffer for verification.

Example:

[samples/bugreport](samples/bugreport)

eclectic.fbs :

```c
namespace Eclectic;

enum Fruit : byte { Banana = -1, Orange = 42 }
table FooBar {
    meal : Fruit = Banana;
    density : long (deprecated);
    say : string;
    height : short;
}
file_identifier "NOOB";
root_type FooBar;
```

myissue.c :

```c
/* Minimal test with all headers generated into a single file. */
#include <stdio.h>
#include "build/myissue_generated.h"
#include "flatcc/support/hexdump.h"

int main(int argc, char *argv[])
{
    int ret;
    void *buf;
    size_t size;
    flatcc_builder_t builder, *B;

    (void)argc;
    (void)argv;

    B = &builder;
    flatcc_builder_init(B);

    Eclectic_FooBar_start_as_root(B);
    Eclectic_FooBar_say_create_str(B, "hello");
    Eclectic_FooBar_meal_add(B, Eclectic_Fruit_Orange);
    Eclectic_FooBar_height_add(B, -8000);
    Eclectic_FooBar_end_as_root(B);
    buf = flatcc_builder_get_direct_buffer(B, &size);
#if defined(PROVOKE_ERROR) || 0
    /* Provoke error for testing. */
    ((char*)buf)[0] = 42;
#endif
    /* The verifier returns 0 on success and an error code otherwise. */
    ret = Eclectic_FooBar_verify_as_root(buf, size);
    if (ret) {
        hexdump("Eclectic.FooBar buffer for myissue", buf, size, stdout);
        printf("could not verify Eclectic.FooBar table, got %s\n", flatcc_verify_error_string(ret));
    }
    flatcc_builder_clear(B);
    return ret;
}
```
build.sh :
```sh
#!/bin/sh
cd $(dirname $0)

FLATBUFFERS_DIR=../..
NAME=myissue
SCHEMA=eclectic.fbs
OUT=build

FLATCC_EXE=$FLATBUFFERS_DIR/bin/flatcc
FLATCC_INCLUDE=$FLATBUFFERS_DIR/include
FLATCC_LIB=$FLATBUFFERS_DIR/lib

mkdir -p $OUT
$FLATCC_EXE --outfile $OUT/${NAME}_generated.h -a $SCHEMA || exit 1
cc -I$FLATCC_INCLUDE -g -o $OUT/$NAME $NAME.c -L$FLATCC_LIB -lflatccrt_d || exit 1
echo "running $OUT/$NAME"
if $OUT/$NAME; then
    echo "success"
else
    echo "failed"
    exit 1
fi
```

## Status

Release 0.6.2 (in development) is primarily a bug fix release, refer
to CHANGELOG for details. A long standing bug has been fixed where
objects created before a call to `_create_as_root` would not be
properly aligned, and buffer end is now also padded to largest object
seen within the buffer.
Note that for clang debug builds, `-fsanitize=undefined` has been
added and this may require dependent source code to also use
that flag to avoid missing linker symbols. The feature can be disabled
in CMakeLists.txt.

Release 0.6.1 contains primarily bug fixes and numerous contributions
from the community to handle platform edge cases. Additionally,
pedantic GCC warnings are disabled, relying instead on clang, since GCC
is too aggressive, breaks builds frequently and works against
portability. An existing C++ test case ensures that C code also works
with common C++ compilers, but it can break some environments, so there
is now a flag to disable that test without disabling all tests. Support
for Optional Scalar Values in the FlatBuffer format has been added.
There is also improved support for abstracting memory allocation on
various platforms. `<table>_identifier` has been deprecated in favor of
`<table>_file_identifier` in generated code due to `identifier` easily
leading to name conflicts. `file_extension` constant in generated code
is now without prefixed dot (.).

Release 0.6.0 introduces a "primary" attribute to be used together with
a key attribute to choose the default key for finding and sorting. If
primary is absent, the key with the lowest id becomes primary. Tables and
vectors can now be sorted recursively on primary keys. BREAKING:
previously the first listed, not the lowest id, would be the primary
key. Also introduces fixed length scalar arrays in struct fields (struct
and enum elements are not supported). Structs support fixed length array
fields, including char arrays. Empty structs never fully worked and are
no longer supported, they are also no longer supported by flatc.
NOTE: char arrays are not currently part of Google's flatc compiler -
int8 arrays may be used instead. BREAKING: empty structs are no longer
supported - they are also not valid in Google's flatc compiler. See
CHANGELOG for additional changes. DEPRECATED: low-level `cast_to/from`
functions in `flatcc_accessors.h` will be removed in favor of
`read/write_from/to` because the cast interface breaks float conversion
on some uncommon platforms. This should not affect normal use but
remains valid in this release.

Release 0.5.3 includes various bug fixes (see changelog) and one
breaking but likely low impact change: BREAKING: 0.5.3 changes behavior
of builder create calls so arguments are always ordered by field id when
id attributes are being used, for example
`MyGame_Example_Monster_create()` in `monster_test.fbs`
([#81](https://github.com/dvidelabs/flatcc/issues/81)). Fixes undefined
behavior when sorting tables by a numeric key field.

Release 0.5.2 introduces optional `_get` suffix to reader methods. By
using `flatcc -g` only `_get` methods are valid. This removes potential
name conflicts for some field names. 0.5.2 also introduces the long
awaited clone operation for tables and vectors. A C++ smoketest was
added to reduce the number of void pointer assignment errors that kept
sneaking in. The runtime library now needs an extra file `refmap.c`.

Release 0.5.1 fixes a buffer overrun in the JSON printer and improves
the portable library's `<stdalign.h>` compatibility with C++ and the
embedded `newlib` standard library. JSON printing and parsing has been
made more consistent to help parse and print tables other than the
schema root as seen in the test driver in [test_json.c]. The
[monster_test.fbs] file has been reorganized to keep the Monster table
more consistent with Google's flatc version and a minor schema namespace
inconsistency has been resolved as a result. Explicit references to
portable headers have been moved out of generated source. `extern "C"` C++
guards added around generated headers. 0.5.1 also cleaned up the
low-level union interface so the terms { type, value } are used
consistently over { type, member } and { types, members }.


### Main features supported as of 0.6.1

- generated FlatBuffers reader and builder headers for C
- generated FlatBuffers verifier headers for C
- generated FlatBuffers JSON parser and printer for C
- ability to concatenate all output into one file, or to stdout
- robust dependency file generation for build systems
- binary schema (.bfbs) generation
- pre-generated reflection headers for handling .bfbs files
- cli schema compiler and library for compiling schema
- runtime library for builder, verifier and JSON support
- thorough test cases
- monster sample project
- fast build times
- support for big endian platforms (as of 0.4.0)
- support for big endian encoded flatbuffers on both le and be platforms. Enabled on `be` branch.
- size prefixed buffers - see also [Builder Interface Reference]
- flexible configuration of malloc alternatives and runtime
  aligned_alloc/free support in builder library.
- feature parity with C++ FlatBuffers schema features added in 2017
  adding support for union vectors and mixed type unions of strings,
  structs, and tables, and type aliases for uint8, ..., float64.
- base64(url) encoded binary data in JSON.
- sort fields by primary key (as of 0.6.0)
- char arrays (as of 0.6.0)
- optional scalar values (as of 0.6.1)

There are no plans to make frequent updates once the project becomes
stable, but input from the community will always be welcome and included
in releases where relevant, especially with respect to testing on
different target platforms.


### Supported platforms (CI tested)

This list is somewhat outdated: more recent compiler versions are added and
some old ones are removed when CI platforms no longer support them, but
largely the supported targets remain unchanged. MSVC 2010 might become
deprecated in the future.

The ci-more branch tests additional compilers:

- Ubuntu Trusty gcc 4.4, 4.6-4.9, 5, 6, 7 and clang 3.6, 3.8
- OS-X current clang / gcc
- Windows MSVC 2010, 2013, 2015, 2015 Win64, 2017, 2017 Win64
- C++11/C++14 user code on the above platforms.

C11/C++11 is the reference that is expected to always work.

The GCC `--pedantic` compiler option is not supported as of GCC-8+
because it forces non-portable code changes and because it tends to
break the code base with each new GCC release.

MSVC 2017 is not always tested because the CI environment then won't
support MSVC 2010.

Older/non-standard versions of C++ compilers cause problems because
`static_assert` and `alignas` behave in strange ways where they are
neither absent nor fully working as expected. There are often
workarounds, but it is more reliable to use `-std=c++11` or
`-std=c++14`.

The portable library does not support GCC C++ pre 4.7 because the
portable library does not work around C++ limitations in stdalign.h and
stdint.h before GCC 4.7. This could be fixed but is not a priority.

Some previously tested compiler versions may have been retired as the
CI environment gets updated. See `.travis.yml` and `appveyor.yml` in
the `ci-more` branch for the current configuration.

The monster sample does not work with MSVC 2010 because it intentionally
uses C99 style code to better follow the C++ version.

The build option `FLATCC_TEST` can be used to disable all tests which
might make flatcc compile on platforms that are otherwise problematic.
The build option `FLATCC_CXX_TEST` can be disabled specifically for C++
tests (a simple C++ file that includes generated C code).

### Platforms reported to work by users

- ESP32 SoC SDK with FreeRTOS and newlib has been reported to compile
  cleanly with C++ 14 using flatcc generated JSON parsers, as of flatcc
  0.5.1.
- FreeRTOS when using custom memory allocation methods.
- Arduino (at least reading buffers)
- IBM XLC on AIX big endian Power PC has been tested for release 0.4.0
  but is not part of regular release tests.

### Portability

There is no reason why other or older compilers cannot be supported, but
it may require some work in the build configuration and possibly
updates to the portable library. The above is simply what has been
tested and configured.

The portability layer has some features that are generally important for
things like endian handling, and others to provide compatibility for
optional and missing C11 features. Together this should support most C
compilers around, but relies on community feedback for maturity.

The necessary size of the runtime include files can be reduced
significantly by using -std=c11 and avoiding JSON (which needs a lot of
numeric parsing support), and by removing `include/flatcc/reflection`
which is present to support handling of binary schema files and can be
generated from `reflection/reflection.fbs`, and removing
`include/flatcc/support` which is only used for tests and samples. The
exact set of required files may change from release to release, and it
doesn't really matter with respect to the compiled code size.


## Time / Space / Usability Tradeoff

The priority has been to design an easy to use C builder interface that
is reasonably fast, suitable for both servers and embedded devices, but
with usability over absolute performance - still the small buffer output
rate is measured in millions per second and read access 10-100 million
buffers per second from a rough estimate. Reading FlatBuffers is more
than an order of magnitude faster than building them.

For 100MB buffers with 1000 monsters, dynamically extended monster
names, monster vector, and inventory vector, the bandwidth reaches about
2.2GB/s and 45ms/buffer on 2.2GHz Haswell Core i7 CPU. This includes
reading back and validating all data. Reading only a few key fields
increases bandwidth to 2.7GB/s and 37ms/op. For 10MB buffers bandwidth
may be higher but eventually smaller buffers will be hit by call
overhead and thus we get down to 300MB/s at about 150ns/op encoding
small buffers. These numbers are just a rough guideline - they obviously
depend on hardware, compiler, and data encoded. Measurements are
excluding an initial warmup step.

The generated JSON parsers are roughly 4 times slower than building a
FlatBuffer directly in C or C++, or about 2200ns vs 600ns for a 700 byte
JSON message. JSON parsing is thus roughly two orders of magnitude faster
than reading the equivalent Protocol Buffer, as reported on the [Google
FlatBuffers
Benchmarks](http://google.github.io/flatbuffers/flatbuffers_benchmarks.html)
page. LZ4 compression is estimated to double the overall processing time
of JSON parsing. JSON printing is faster than parsing but not very
significantly so. JSON compresses to roughly half the size of compressed
FlatBuffers on large buffers, but compresses worse on small buffers (not
to mention when not compressing at all).

It should be noted that FlatBuffer read performance excludes verification,
which JSON parsers and Protocol Buffers inherently include by their
nature. Verification has not been benchmarked, but would presumably add
less than 50% read overhead unless only a fraction of a large buffer is to
be read.

See also [Benchmarks].

The client C code can avoid almost any kind of allocation to build
buffers as a builder stack provides an extensible arena before
committing objects - for example appending strings or vectors piecemeal.
The stack is mostly bypassed when a complete object can be constructed
directly such as a vector from integer array on little endian platforms.

The reader interface should be pretty fast as is with less room for
improvement performance wise. It is also much simpler than the builder.

Usability has also been prioritized over smallest possible generated
source code and compile time. It shouldn't affect the compiled size
by much.

The compiled binary output should be reasonably small for everything but
the most restrictive microcontrollers. A 33K monster source test file
(in addition to the generated headers and the builder library) results
in a less than 50K optimized binary executable file including overhead
for printf statements and other support logic, or a 30K object file
excluding the builder library.

Read-only binaries are smaller but not necessarily much smaller than
builders considering they do less work: The compatibility test reads a
pre-generated binary `monsterdata_test.golden` monster file and verifies
that all content is as expected. This results in a 13K optimized binary
executable or a 6K object file. The source for this check is 5K
excluding header files. Readers do not need to link with a library.

JSON parsers bloat the compiled C binary compared to pure Flatbuffer
usage because they inline the parser decision tree. A JSON parser for
monster.fbs may add 100K +/- optimization settings to the executable
binary.


## Generated Files

The generated code for building flatbuffers,
and for parsing and printing flatbuffers, all need access to
`include/flatcc`. The reader does not rely on any library but all other
generated files rely on the `libflatccrt.a` runtime library. Note that
`libflatcc.a` is only required if the flatcc compiler itself is required
as a library.

The reader and builder rely on generated common reader and builder
header files. These common files make it possible to change the global
namespace and redefine basic types (`uoffset_t` etc.). In the future
this _might_ move into library code and use macros for these
abstractions and eventually have a set of predefined files for types
beyond the standard 32-bit unsigned offset (`uoffset_t`). The runtime
library is specific to one set of type definitions.

Refer to [monster_test.c] and the generated files for detailed guidance
on use. The monster schema used in this project is a slight adaptation
to the original to test some additional edge cases.

For building flatbuffers a separate builder header file is generated per
schema. It requires a `flatbuffers_common_builder.h` file also generated
by the compiler and a small runtime library `libflatccrt.a`. It is
because of this requirement that the reader and builder generated code
is kept separate. Typical uses can be seen in the [monster_test.c] file.
The builder allows for repeated pushing of content to a vector or a
string while a containing table is being updated which simplifies
parsing of external formats. It is also possible to build nested buffers
in-line - at first this may sound excessive but it is useful when
wrapping a union of buffers in a network interface and it ensures proper
alignment of all buffer levels.

For verifying flatbuffers, a `myschema_verifier.h` is generated. It
depends on the runtime library and the reader header.

JSON parsers and printers are generated as one file per schema file.
Included schemas get their own parsers and printers, which the including
parsers and printers depend upon, much like builders do.

Low level note: the builder generates all vtables at the end of the
buffer instead of ad-hoc in front of each table but otherwise does the
same deduplication of vtables. This makes it possible to cluster vtables
in hot cache or to make sure all vtables are available when partially
transmitting a buffer. This behavior can be disabled by a runtime flag.

Because some use cases may include very constrained embedded devices,
the builder library can be customized with an allocator object and a
buffer emitter object. The separate emitter ensures a buffer can be
constructed without requiring a full buffer to be present in memory at
once, if so desired.

The typeless builder library is documented in [flatcc_builder.h] and
[flatcc_emitter.h] while the generated typed builder api for C is
documented in [Builder Interface Reference].


### Use of Macros in Generated Code

Occasionally a concern is raised about the dense nature of the macros
used in the generated code. These macros make it difficult to understand
which functions are actually available. The [Builder Interface Reference]
attempts to document the operations in a general fashion. To get more
detailed information, generated function prototypes can be extracted
with the `scripts/flatcc-doc.sh` script.

Some are also concerned with macros being "unsafe". Macros are not
unsafe when used with FlatCC because they generate static or static
inline functions. These will trigger compile time errors if used
incorrectly to the same extent that they would in direct C code.

The expansion compresses the generated output by more than a factor of 10
ensuring that code under source control does not explode and making it
possible to compare versions of generated code in a meaningful manner
and see if it matches the intended schema. The macros are also important
for dealing with platform abstractions via the portable headers.

Still, it is possible to see the generated output although not supported
directly by the build system. As an example,
`include/flatcc/reflection` contains pre-generated header files for the
reflection schema. To see the expanded output using the `clang` compiler
tool chain, run:

    clang -E -DNDEBUG -I include \
        include/flatcc/reflection/reflection_reader.h | \
        clang-format

Other similar commands are likely available on platforms not supporting
clang.

Note that the compiler will optimize out nearly all of the generated
code and only use the logic actually referenced by end-user code because
the functions are static or static inline. The remaining parts generally
inline efficiently into the application code resulting in a reasonably
small binary code size.

More details can be found in
[#88](https://github.com/dvidelabs/flatcc/issues/88)


### Extracting Documentation

The expansion of generated code can be used to get documentation for
a specific object type.

The following script automates this process:

    scripts/flatcc-doc.sh <schema-file> <name-prefix> [<outdir>]

writing function prototypes to `<outdir>/<name-prefix>.doc`.

Note that the script requires the clang compiler and the clang-format
tool, but the script could likely be adapted for other tool chains as well.

The principle behind the script can be illustrated using the reflection
schema as an example, where documentation for the Object table is
extracted:

    bin/flatcc reflection/reflection.fbs -a --json --stdout | \
        clang - -E -DNDEBUG -I include | \
        clang-format -style="WebKit" | \
        grep "^static.* reflection_Object_\w*(" | \
        cut -f 1 -d '{' | \
        grep -v deprecated | \
        grep -v ");" | \
        sed 's/__tmp//g' | \
        sed 's/)/);/g'

The WebKit style of clang-format ensures that parameters and the return
type are all placed on the same line. Grep extracts the function headers
and cut strips function bodies starting on the same line. Sed strips
`__tmp` suffix from parameter names used to avoid macro name conflicts.
Grep strips `);` to remove redundant forward declarations and sed then
adds ; to make each line a valid C prototype.

The above is not guaranteed to always work as output may change, but it
should go a long way.

A small extract of the output, as of flatcc-v0.5.2:

    static inline size_t reflection_Object_vec_len(reflection_Object_vec_t vec);
    static inline reflection_Object_table_t reflection_Object_vec_at(reflection_Object_vec_t vec, size_t i);
    static inline reflection_Object_table_t reflection_Object_as_root_with_identifier(const void* buffer, const char* fid);
    static inline reflection_Object_table_t reflection_Object_as_root_with_type_hash(const void* buffer, flatbuffers_thash_t thash);
    static inline reflection_Object_table_t reflection_Object_as_root(const void* buffer);
    static inline reflection_Object_table_t reflection_Object_as_typed_root(const void* buffer);
    static inline flatbuffers_string_t reflection_Object_name_get(reflection_Object_table_t t);
    static inline flatbuffers_string_t reflection_Object_name(reflection_Object_table_t t);
    static inline int reflection_Object_name_is_present(reflection_Object_table_t t);
    static inline size_t reflection_Object_vec_scan_by_name(reflection_Object_vec_t vec, const char* s);
    static inline size_t reflection_Object_vec_scan_n_by_name(reflection_Object_vec_t vec, const char* s, int n);
    ...


Examples are provided in the following scripts using the reflection and monster schema:

    scripts/reflection-doc-example.sh
    scripts/monster-doc-example.sh

The monster doc example essentially calls:

    scripts/flatcc-doc.sh samples/monster/monster.fbs MyGame_Sample_Monster_

resulting in the file `MyGame_Sample_Monster_.doc`:

    static inline size_t MyGame_Sample_Monster_vec_len(MyGame_Sample_Monster_vec_t vec);
    static inline MyGame_Sample_Monster_table_t MyGame_Sample_Monster_vec_at(MyGame_Sample_Monster_vec_t vec, size_t i);
    static inline MyGame_Sample_Monster_table_t MyGame_Sample_Monster_as_root_with_identifier(const void* buffer, const char* fid);
    static inline MyGame_Sample_Monster_table_t MyGame_Sample_Monster_as_root_with_type_hash(const void* buffer, flatbuffers_thash_t thash);
    static inline MyGame_Sample_Monster_table_t MyGame_Sample_Monster_as_root(const void* buffer);
    static inline MyGame_Sample_Monster_table_t MyGame_Sample_Monster_as_typed_root(const void* buffer);
    static inline MyGame_Sample_Vec3_struct_t MyGame_Sample_Monster_pos_get(MyGame_Sample_Monster_table_t t);
    static inline MyGame_Sample_Vec3_struct_t MyGame_Sample_Monster_pos(MyGame_Sample_Monster_table_t t);
    static inline int MyGame_Sample_Monster_pos_is_present(MyGame_Sample_Monster_table_t t);
    static inline int16_t MyGame_Sample_Monster_mana_get(MyGame_Sample_Monster_table_t t);
    static inline int16_t MyGame_Sample_Monster_mana(MyGame_Sample_Monster_table_t t);
    static inline const int16_t* MyGame_Sample_Monster_mana_get_ptr(MyGame_Sample_Monster_table_t t);
    static inline int MyGame_Sample_Monster_mana_is_present(MyGame_Sample_Monster_table_t t);
    static inline size_t MyGame_Sample_Monster_vec_scan_by_mana(MyGame_Sample_Monster_vec_t vec, int16_t key);
    static inline size_t MyGame_Sample_Monster_vec_scan_ex_by_mana(MyGame_Sample_Monster_vec_t vec, size_t begin, size_t end, int16_t key);
    ...


FlatBuffer native types can also be extracted, for example string operations:

    scripts/flatcc-doc.sh samples/monster/monster.fbs flatbuffers_string_

resulting in `flatbuffers_string_.doc`:

    static inline size_t flatbuffers_string_len(flatbuffers_string_t s);
    static inline size_t flatbuffers_string_vec_len(flatbuffers_string_vec_t vec);
    static inline flatbuffers_string_t flatbuffers_string_vec_at(flatbuffers_string_vec_t vec, size_t i);
    static inline flatbuffers_string_t flatbuffers_string_cast_from_generic(const flatbuffers_generic_t p);
    static inline flatbuffers_string_t flatbuffers_string_cast_from_union(const flatbuffers_union_t u);
    static inline size_t flatbuffers_string_vec_find(flatbuffers_string_vec_t vec, const char* s);
    static inline size_t flatbuffers_string_vec_find_n(flatbuffers_string_vec_t vec, const char* s, size_t n);
    static inline size_t flatbuffers_string_vec_scan(flatbuffers_string_vec_t vec, const char* s);
    static inline size_t flatbuffers_string_vec_scan_n(flatbuffers_string_vec_t vec, const char* s, size_t n);
    static inline size_t flatbuffers_string_vec_scan_ex(flatbuffers_string_vec_t vec, size_t begin, size_t end, const char* s);
    ...

## Using flatcc

Refer to `flatcc -h` for details.

An online version is listed here: [flatcc-help.md] but please use
`flatcc -h` for an up to date reference.


The compiler can either generate a single header file or headers for all
included schema and a common file and with or without support for both
reading (default) and writing (-w) flatbuffers. The simplest option is
to use (-a) for all and include the `myschema_builder.h` file.

The (-a) or (-v) option also generates a verifier file.

Make sure `flatcc` under the `include` folder is visible in the C
compiler's include path when compiling flatbuffer builders.

The `flatcc` (-I) include path will assume all schema files with the same
base name (case insensitive) are identical and will only include the
first. All generated files use the input basename and will land in the
working directory or the path set by (-o).

Files can be generated to stdout using (--stdout). C headers will be
ordered and concatenated, but are otherwise identical to the separate
file output. Each include statement is guarded so this will not lead to
missing include files.

The generated code, especially when all of it is combined with --stdout,
may appear large, but only the parts actually used will take up space in
the final executable or object file. Modern compilers inline and include
only necessary parts of the statically linked builder library.

JSON printer and parser can be generated using the --json flag or
--json-printer or --json-parser if only one of them is required. There are
certain runtime library compile time flags that can optimize out
printing symbolic enums, but these can also be disabled at runtime.

## Trouble Shooting

Make sure to link with `libflatccrt` (rt for runtime) and not `libflatcc` (the schema compiler), otherwise the builder will not be available. Also make sure to have the `include` directory of the flatcc project root in the include path.

Flatcc will by default expect a `file_identifier` in the buffer when reading or
verifying a buffer.

A buffer can have an unexpected 4-byte identifier at offset 4, or the identifier
might be absent.

Not all language interfaces support file identifiers in buffers, and if they do, they might not do so in an older version. Users have reported problems with both Python and Lua interfaces but this is easily resolved.
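
When an identifier mismatch is suspected, it can help to look at the four bytes
at offset 4 directly. The following is a minimal sketch and not part of the
flatcc API: `peek_identifier` is a hypothetical helper that copies those bytes
into a zero terminated string. It cannot tell whether the bytes really are an
identifier, since a buffer without one simply has other content at that
position, and the bytes may be non-printable when a type hash is used.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of flatcc): copies the 4 bytes stored at
 * offset 4 of a buffer into `out` and zero terminates the result.
 * Returns 0 on success, or -1 if the buffer is missing or too small to
 * hold the root offset followed by an identifier.
 */
static int peek_identifier(const void *buf, size_t size, char out[5])
{
    if (buf == NULL || size < 8) {
        return -1;
    }
    /* Bytes 0..3 hold the root offset, bytes 4..7 the optional identifier. */
    memcpy(out, (const unsigned char *)buf + 4, 4);
    out[4] = '\0';
    return 0;
}
```

Printing the result before calling a verifier shows which identifier, if any,
the producing language interface actually wrote.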

Check the return value of the verifier:

    int ret;
    const char *s;

    ret = MyTable_verify_as_root(buf, size);
    if (ret) {
        s = flatcc_verify_error_string(ret);
        printf("buffer failed: %s\n", s);
    }

To verify a buffer with no identifier, or to ignore a different identifier,
use the `_with_identifier` version of the verifier with a null identifier:

    const char *identifier = 0;

    MyTable_verify_as_root_with_identifier(buf, size, identifier);

To read a buffer use:

    MyTable_as_root_with_identifier(buf, 0);

And to build a buffer without an identifier use:

    MyTable_start_as_root_with_identifier(builder, 0);
    ...
    MyTable_end_as_root_with_identifier(builder, 0);

Several other `as_root` calls have an `as_root_with_identifier` version,
including JSON printing.

## Quickstart

After [building](https://github.com/dvidelabs/flatcc#building) the `flatcc` tool,
binaries are located in the `bin` and `lib` directories under the
`flatcc` source tree.

You can either jump directly to the [monster
example](https://github.com/dvidelabs/flatcc/tree/master/samples/monster)
that follows
[Google's FlatBuffers Tutorial](https://google.github.io/flatbuffers/flatbuffers_guide_tutorial.html), or you can read along the quickstart guide below. If you follow
the monster tutorial, you may want to clone and build flatcc and copy
the source to a separate project directory as follows:

    git clone https://github.com/dvidelabs/flatcc.git
    flatcc/scripts/setup.sh -a mymonster
    cd mymonster
    scripts/build.sh
    build/mymonster

`scripts/setup.sh` will as a minimum link the library and tool into a
custom directory, here `mymonster`. With (-a) it also adds a simple
build script, copies the example, and updates `.gitignore` - see
`scripts/setup.sh -h`.
Setup can also build flatcc, but you still have
to ensure the build environment is configured for your system.

To write your own schema files please follow the main FlatBuffers
project documentation on [writing schema
files](https://google.github.io/flatbuffers/flatbuffers_guide_writing_schema.html).

The [Builder Interface Reference] may be useful after studying the
monster sample and quickstart below.

When looking for advanced examples such as sorting vectors and finding
elements by a key, you should find these in the
[`test/monster_test`](https://github.com/dvidelabs/flatcc/tree/master/test/monster_test) project.

The following quickstart guide is a broad simplification of the
`test/monster_test` project - note that the schema is slightly different
from the tutorial. Focus is on the C specific framework rather
than general FlatBuffers concepts.

You can still use the setup tool to create an empty project and
follow along, but there are no assumptions about that in the text below.

### Reading a Buffer

Here we provide a quick example of read-only access to a Monster flatbuffer -
it is an adapted extract of the [monster_test.c] file.

First we compile the schema read-only with the common (-c) support header and we
add the recursion (-r) because [monster_test.fbs] includes other files.

    flatcc -cr --reader test/monster_test/monster_test.fbs

For simplicity we assume you build an example project in the project
root folder, but in practice you would want to change some paths, for
example:

    mkdir -p build/example
    flatcc -cr --reader -o build/example test/monster_test/monster_test.fbs
    cd build/example

We get:

    flatbuffers_common_reader.h
    include_test1_reader.h
    include_test2_reader.h
    monster_test_reader.h

(There is also the simpler `samples/monster/monster.fbs` but then you won't get
included schema files).

Namespaces can be long so we optionally use a macro to manage this.

    #include <stdio.h>
    #include "monster_test_reader.h"

    #undef ns
    #define ns(x) FLATBUFFERS_WRAP_NAMESPACE(MyGame_Example, x)

    int verify_monster(void *buffer)
    {
        ns(Monster_table_t) monster;
        /* This is a read-only reference to a flatbuffer encoded struct. */
        ns(Vec3_struct_t) vec;
        flatbuffers_string_t name;
        size_t offset;

        if (!(monster = ns(Monster_as_root(buffer)))) {
            printf("Monster not available\n");
            return -1;
        }
        if (ns(Monster_hp(monster)) != 80) {
            printf("Health points are not as expected\n");
            return -1;
        }
        if (!(vec = ns(Monster_pos(monster)))) {
            printf("Position is absent\n");
            return -1;
        }

        /* -3.2f is actually -3.20000005 and not -3.2 due to representation loss. */
        if (ns(Vec3_z(vec)) != -3.2f) {
            printf("Position failing on z coordinate\n");
            return -1;
        }

        /* Verify force_align relative to buffer start. */
        offset = (char *)vec - (char *)buffer;
        if (offset & 15) {
            printf("Force align of Vec3 struct not correct\n");
            return -1;
        }

        /*
         * If we retrieved the buffer using `flatcc_builder_finalize_aligned_buffer` or
         * `flatcc_builder_get_direct_buffer` the struct should also
         * be aligned without subtracting the buffer.
         */
        /* A pointer must be converted to an integer before masking. */
        if ((uintptr_t)vec & 15) {
            printf("warning: buffer not aligned in memory\n");
        }

        /* ... */
        return 0;
    }
    /* main() {...} */


### Compiling for Read-Only

Assuming our above file is `monster_example.c` the following are a few
ways to compile the project for read-only - compilation with the runtime
library is shown later on.

    cc -I include monster_example.c -o monster_example

    cc -std=c11 -I include monster_example.c -o monster_example

    cc -D FLATCC_PORTABLE -I include monster_example.c -o monster_example

The include path or source path is likely different. Some files in
`include/flatcc/portable` are always used, but the `-D FLATCC_PORTABLE`
flag includes additional files to support compilers lacking C11
features.

NOTE: on some clang/gcc platforms it may be necessary to use -std=gnu99 or
-std=gnu11 if the linker is unable to find `posix_memalign`, see also comments in
[paligned_alloc.h].


### Building a Buffer

Here we provide a very limited example of how to build a buffer - only a few
fields are updated. Please refer to [monster_test.c] and the doc directory
for more information.

First we must generate the files:

    flatcc -a monster_test.fbs

This produces:

    flatbuffers_common_reader.h
    flatbuffers_common_builder.h
    include_test1_reader.h
    include_test1_builder.h
    include_test1_verifier.h
    include_test2_reader.h
    include_test2_builder.h
    include_test2_verifier.h
    monster_test_reader.h
    monster_test_builder.h
    monster_test_verifier.h

Note: we wouldn't actually do the read-only generation shown earlier
unless we only intend to read buffers - the builder generation always
generates read access too.

By including `"monster_test_builder.h"` all other files are included
automatically.
The C compiler needs the `-I include` directive to access
`flatcc/flatcc_builder.h`, `flatcc/flatcc_verifier.h`, and other files
depending on specifics, assuming the project root is the current
directory.

The verifiers are not required and are just created because we lazily chose
the -a option.

The builder must be initialized first to set up the runtime environment
we need for building buffers efficiently - the builder depends on an
emitter object to construct the actual buffer - here we implicitly use
the default. Once we have that, we can just consider the builder a
handle and focus on the FlatBuffers generated API until we finalize the
buffer (i.e. access the result). For non-trivial uses it is recommended
to provide a custom emitter and for example emit pages over the network
as soon as they complete rather than merging all pages into a single
buffer using `flatcc_builder_finalize_buffer`, or the simplistic
`flatcc_builder_get_direct_buffer` which returns null if the buffer is
too large. See also documentation comments in [flatcc_builder.h] and
[flatcc_emitter.h]. See also `flatcc_builder_finalize_aligned_buffer` in
`flatcc_builder.h` and the [Builder Interface Reference] when malloc aligned
buffers are insufficient.


    #include "monster_test_builder.h"

    /* See [monster_test.c] for more advanced examples. */
    void build_monster(flatcc_builder_t *B)
    {
        ns(Vec3_t *vec);

        /* Here we use a table, but structs can also be roots. */
        ns(Monster_start_as_root(B));

        ns(Monster_hp_add(B, 80));
        /* The vec struct is zero-initialized. */
        vec = ns(Monster_pos_start(B));
        /* Native endian. */
        vec->x = 1, vec->y = 2, vec->z = -3.2f;
        /* _end call converts to protocol endian format - for LE it is a nop. */
        ns(Monster_pos_end(B));

        /* Name is required, or we get an assertion in debug builds.
         */
        ns(Monster_name_create_str(B, "MyMonster"));

        ns(Monster_end_as_root(B));
    }

    #include <assert.h>
    #include "flatcc/support/hexdump.h"

    int main(int argc, char *argv[])
    {
        flatcc_builder_t builder;
        void *buffer;
        size_t size;

        flatcc_builder_init(&builder);

        build_monster(&builder);
        /* We could also use `flatcc_builder_finalize_buffer` and free the buffer later. */
        buffer = flatcc_builder_get_direct_buffer(&builder, &size);
        assert(buffer);
        verify_monster(buffer);

        /* Visualize what we got ... */
        hexdump("monster example", buffer, size, stdout);

        /*
         * Here we can call `flatcc_builder_reset(&builder)` if
         * we wish to build more buffers before deallocating
         * internal memory with `flatcc_builder_clear`.
         */

        flatcc_builder_clear(&builder);
        return 0;
    }

Compile the example project:

    cc -std=c11 -I include monster_example.c lib/libflatccrt.a -o monster_example

Note that the runtime library is required for building buffers, but not
for reading them. If it is inconvenient to distribute the runtime library
for a given target, source files may be used instead. Each feature has
its own source file, so not all runtime files are needed for building a
buffer:

    cc -std=c11 -I include monster_example.c \
        src/runtime/emitter.c src/runtime/builder.c \
        -o monster_example

Other features such as the verifier and the JSON printer and parser
would each need a different file in src/runtime. Which file should be
obvious from the filenames except that JSON parsing also requires the
builder and emitter source files.


### Verifying a Buffer

A buffer can be verified to ensure it does not contain any ranges that
point outside the given buffer size, that all data structures are
aligned according to the flatbuffer principles, that strings are zero
terminated, and that required fields are present.

In the builder example above, we can apply a verifier to the output:

    #include "monster_test_builder.h"
    #include "monster_test_verifier.h"
    int ret;
    ...
    ... finalize
    if ((ret = ns(Monster_verify_as_root_with_identifier(buffer, size,
            "MONS")))) {
        printf("Monster buffer is invalid: %s\n",
                flatcc_verify_error_string(ret));
    }

The [readfile.h] utility may also be helpful in reading an existing
buffer for verification.

Flatbuffers can optionally leave out the identifier, here "MONS". Use a
null pointer as identifier argument to ignore any existing identifiers
and allow for missing identifiers.

Nested flatbuffers are always verified with a null identifier, but it
may be checked later when accessing the buffer.

The verifier does NOT verify that two data structures are not
overlapping. Sometimes this is indeed valid, such as a DAG (directed
acyclic graph) where for example two string references refer to the same
string in the buffer. In other cases an attacker may maliciously
construct overlapping data structures such that in-place updates may
cause subsequent invalid buffers. Therefore an untrusted buffer should
never be updated in-place without first rewriting it to a new buffer.

The CMake build system has a build option to enable assertions in the
verifier. This will break debug builds and is not usually what is desired,
but it can be very useful when debugging why a buffer is invalid. Traces
can also be enabled so table offset and field id can be reported.

See also `include/flatcc/flatcc_verifier.h`.

When verifying buffers returned directly from the builder, it may be
necessary to use the `flatcc_builder_finalize_aligned_buffer` to ensure
proper alignment and use `aligned_free` to free the buffer (or as of
v0.5.0 also `flatcc_builder_aligned_free`), see also the
[Builder Interface Reference]. Buffers may also be copied into aligned
memory via mmap or using the portable layer's `paligned_alloc.h` feature
which is available when including generated headers.
`test/flatc_compat/flatc_compat.c` is an example of how this can be
done. For the majority of use cases, standard allocation would be
sufficient, but for example standard 32-bit Windows only allocates on an
8-byte boundary and can break the monster schema because it has 16-byte
aligned fields.


### Potential Name Conflicts

If unfortunate, it is possible to have a read accessor method conflict
with other generated methods and typenames. Usually a small change in
the schema will resolve this issue.

As of flatcc 0.5.2 read accessors are generated with and without a `_get`
suffix so it is also possible to use `Monster_pos_get(monster)` instead
of `Monster_pos(monster)`. When calling flatcc with option `-g` the
read accessors will only be generated with the `_get` suffix. This avoids
potential name conflicts. An example of a conflict is a field name
like `pos_add` when there is also a `pos` field because the builder
interface generates the `add` suffix. Using the -g option avoids this
problem, but it is preferable to choose another name such as `added_pos`
when the schema can be modified.

The `-g` option only changes the content of the
`flatbuffers_common_reader.h` file, so it is technically possible to
use different versions of this file if they are not mixed.

If an external code generator depends on flatcc output, it should use
the `_get` suffix because it will work with and without the -g option,
but only as of version 0.5.2 or later. For human readable code it is
probably simpler to stick to the original naming convention without the
`_get` suffix.

Even with the above, it is still possible to have a conflict with the
union type field. If a union field is named `foo`, an additional field
is automatically added - this field is named `foo_type` and holds,
unsurprisingly, the type of the union.

Namespaces can also cause conflicts. If a schema has the namespace
Foo.Bar and a table named MyTable with a field named hello, then a
read accessor will be named: `Foo_Bar_MyTable_hello_get`. It
is also possible to have a table named `Bar_MyTable` because `_` is
allowed in FlatBuffers schema names, but in this case we have a name
conflict in the generated C code. FlatCC does not attempt to avoid
such conflicts so such schema are considered invalid.

Notably several users have experienced conflicts with a table or struct
field named 'identifier' because `<table-name>_identifier` has been
defined to be the file identifier to be used when creating a buffer with
that table (or struct) as root. As of 0.6.1, the name is
`<table-name>_file_identifier` to reduce the risk of conflicts. The old
form is deprecated but still generated for tables without a field named
'identifier' for backwards compatibility. Mostly this macro is used for
higher level functions such as `mytable_create_as_root` which need to
know what identifier to use.


### Debugging a Buffer

When reading a FlatBuffer does not provide the expected results, the
first line of defense is to ensure that the code being tested is linked
against `flatccrt_d`, the debug build of the runtime library.
This will
raise an assertion if calls to the builder are not properly balanced or
if required fields are not being set.

To dig further into a buffer, call the buffer verifier and see if the
buffer is actually valid with respect to the expected buffer type.

Strings and tables will be returned as null pointers when their
corresponding field is not set in the buffer. User code should test for
this but it might also be helpful to temporarily or permanently set the
`required` attribute in the schema. The builder will then detect missing fields
when creating buffers and the verifier will detect their absence in
an existing buffer.

If the verifier rejects a buffer, the error can be printed (see
[Verifying a Buffer](#verifying-a-buffer)), but it will not say exactly
where the problem was found. To go further, the verifier can be made to
assert where the problem is encountered so the buffer content can be
analyzed. This is enabled with:

    -DFLATCC_DEBUG_VERIFY=1

Note that this will break test cases where a buffer is expected to fail
verification.

To dump detailed contents of a valid buffer, or the valid contents up to
the point of failure, use:

    -DFLATCC_TRACE_VERIFY=1

Both of these options can be set as CMake options, or in the
[flatcc_rtconfig.h] file.

When reporting bugs, output from the above might also prove helpful.

The JSON parser and printer can also be used to create and display
buffers. The parser will use the builder API correctly or issue a syntax
error or an error on a missing required field. This can rule out some
uncertainty about using the API correctly. The [test_json.c] file and
[test_json_parser.c] have
test functions that can be adapted for custom tests.

For advanced debugging the [hexdump.h] file can be used to dump the buffer
contents.
It is used in [test_json.c] and also in [monster_test.c].
See also [FlatBuffers Binary Format].

As of April 2022, Google's flatc tool has implemented an `--annotate` feature.
This provides an annotated hex dump given a binary buffer and a schema. The
output can be used to troubleshoot and rule out or confirm suspected encoding
bugs in the buffer at hand. The eclectic example in the [FlatBuffers Binary
Format] document contains a hand written annotated example which inspired the
`--annotate` feature, but it is not the exact same output format. Note also that
`flatc` generated buffers tend to have each vtable before the table that
references it, while flatcc normally packs all vtables at the end of the buffer for
better padding and cache efficiency.

See also [flatc --annotate].

Note: There is experimental support for text editors that support the
clangd language server or similar. You can edit `CMakeLists.txt`
to generate `build/Debug/compile_commands.json`, at least when
using clang as a compiler, and copy or symlink it from root.
Better suggestions are welcome. There are `.gitignore` entries for
`compile_flags.txt` and `compile_commands.json` in project root.


## File and Type Identifiers

There are two ways to identify the content of a FlatBuffer. The first is
to use file identifiers which are defined in the schema. The second is
to use `type identifiers` which are calculated hashes based on each
table's name prefixed with its namespace, if any. In either case the
identifier is stored at offset 4 in binary FlatBuffers, when present.
Type identifiers are not to be confused with union types.

### File Identifiers

The FlatBuffers schema language has the optional `file_identifier`
declaration which accepts a 4 character ASCII string. It is intended to be
human readable.
When absent, the buffer potentially becomes 4 bytes
shorter (depending on padding).

The `file_identifier` is intended to match the `root_type` schema
declaration, but this does not take into account that it is convenient
to create FlatBuffers for other types as well. `flatcc` makes no special
distinction for the `root_type` while Google's `flatc` JSON parser uses
it to determine the JSON root object type.

As a consequence, the file identifier is ambiguous. Included schema may
have separate `file_identifier` declarations. To at least make sure each
type is associated with its own schema's `file_identifier`, a symbol is
defined for each type. If the schema has no such identifier, the symbol
will be defined as the null identifier.

The generated code defines the identifiers for a given table:

    #ifndef MyGame_Example_Monster_file_identifier
    #define MyGame_Example_Monster_file_identifier "MONS"
    #endif

The user can now override the identifier for a given type, for example:

    #define MyGame_Example_Vec3_file_identifier "VEC3"
    #include "monster_test_builder.h"

    ...
    MyGame_Example_Vec3_create_as_root(B, ...);

The `create_as_root` method uses the identifier for the type in question,
and so do other `_as_root` methods.

The `file_extension` is handled in a similar manner:

    #ifndef MyGame_Example_Monster_file_extension
    #define MyGame_Example_Monster_file_extension "mon"
    #endif

### Type Identifiers

To better deal with the ambiguities of file identifiers, type
identifiers have been introduced as an alternative 4 byte buffer
identifier. The hash is standardized on FNV-1a for interoperability.

The type identifier uses a type hash which maps a fully qualified type
name into a 4 byte hash.
The type hash is a 32-bit native value and the
type identifier is a 4 character little endian encoded string of the
same value.

In this example the type hash is derived from the string
"MyGame.Example.Monster" and is the same for all FlatBuffer code
generators that support type hashes.

The value 0 is used to indicate that one does not care about the
identifier in the buffer.

    ...
    MyGame_Example_Monster_create_as_typed_root(B, ...);
    buffer = flatcc_builder_get_direct_buffer(B, &size);
    MyGame_Example_Monster_verify_as_typed_root(buffer, size);
    // read back
    monster = MyGame_Example_Monster_as_typed_root(buffer);

    switch (flatbuffers_get_type_hash(buffer)) {
    case MyGame_Example_Monster_type_hash:
        ...

    }
    ...
    if (flatbuffers_get_type_hash(buffer) ==
        flatbuffers_type_hash_from_name("Some.Old.Buffer")) {
        printf("Buffer is the old version, not supported.\n");
    }

More API calls are available to naturally extend the existing API. See
[monster_test.c] for more.

The type identifiers are defined like:

    #define MyGame_Example_Monster_type_hash ((flatbuffers_thash_t)0x330ef481)
    #define MyGame_Example_Monster_type_identifier "\x81\xf4\x0e\x33"

The `type_identifier` can be used anywhere the original 4 character
file identifier would be used, but a buffer must choose which system, if any,
to use. This will not affect the `file_extension`.

NOTE: The generated `_type_identifier` strings should not normally be
used when an identifier string is expected in the generated API because
it may contain null bytes which will be zero padded after the first null
before comparison. Use the API calls that take a type hash instead.
The
`type_identifier` can be used in low level [flatcc_builder.h] calls
because it handles identifiers as a fixed byte array and handles type
hashes and strings the same.

NOTE: it is possible to compile the flatcc runtime to encode buffers in
big endian format rather than the standard little endian format
regardless of the host platform's endianness. If this is done, the
identifier field in the buffer is always byte swapped regardless of the
identifier method chosen. The API calls make this transparent, so "MONS"
will be stored as "SNOM" but should still be verified as "MONS" in API
calls. This safeguards against mixing little- and big-endian buffers.
Likewise, type hashes are always tested in native (host) endian format.


The
[`flatcc/flatcc_identifier.h`](https://github.com/dvidelabs/flatcc/blob/master/include/flatcc/flatcc_identifier.h)
file contains an implementation of the FNV-1a hash used. The hash was
chosen for simplicity, availability, and collision resistance. For
better distribution, and for internal use only, a dispersion function is
also provided, mostly to discourage use of alternative hashes in
transmission since the type hash is normally good enough as is.

_Note: there is a potential for collisions in the type hash values
because the hash is only 4 bytes._


## JSON Parsing and Printing

JSON support files are generated with `flatcc --json`.

This section is not a tutorial on JSON printing and parsing, it merely
covers some non-obvious aspects.
The best source to get started quickly
is the test file:

    test/json_test/json_test.c

For detailed usage, please refer to:

    test/json_test/test_json_printer.c
    test/json_test/test_json_parser.c
    test/json_test/json_test.c
    test/benchmark/benchflatccjson

See also the JSON parsing section in Google's FlatBuffers [schema
documentation](https://google.github.io/flatbuffers/flatbuffers_guide_writing_schema.html).

By using the flatbuffer schema it is possible to generate schema
specific JSON printers and parsers. This differs for better and worse
from Google's `flatc` tool which takes a binary schema as input and
processes JSON input and output. Here the parser and printer only rely
on the `flatcc` runtime library and are faster (probably significantly so),
but require recompilation when new JSON formats are to be supported.
This is not as bad as it sounds: it would for example not be difficult
to create a Docker container to process a specific schema in a web
server context.

The parser always takes a text buffer as input and produces output
according to how the builder object is initialized. The printer has
different init functions: one for printing to a file pointer, including
stdout, one for printing to a fixed length external buffer, and one for
printing to a dynamically growing buffer. The dynamic buffer may be
reused between prints via the reset function. See `flatcc_json_parser.h`
for details.

The parser will accept unquoted names (not strings) and trailing commas,
i.e. non-strict JSON, and also allows for hex `\x03` in strings. Strict
mode must be enabled by a compile time flag.
In addition the parser accepts
schema specific symbolic enum values that can optionally be unquoted
where a numeric value is expected:

    color: Green
    color: Color.Green
    color: MyGame.Example.Color.Green
    color: 2

The symbolic values do not have to be quoted (unless required by runtime
or compile time configuration), but can be, while numeric values cannot
be quoted. If no namespace is provided, like `color: Green`, the symbol
must match the receiving enum type. Any scalar value may receive a
symbolic value either in a relative namespace like `hp: Color.Green`, or
an absolute namespace like `hp: MyGame.Example.Color.Green`, but not
`hp: Green` (since `hp` in the monster example schema is not an enum
type with a `Green` value). A namespace is relative to the namespace of
the receiving object.

It is also possible to have multiple values, but these always have to be
quoted in order to be compatible with Google's flatc tool for FlatBuffers
1.1:

    color: "Green Red"

_Unquoted multi-valued enums can be enabled at compile time but this is
deprecated because it is incompatible with both Google's flatc JSON and
also with other possible future extensions: `color: Green Red`_

These multi-valued expressions were originally intended for enums that
have the bit flag attribute defined (which Color does have), but this is
tricky to process, so any symbolic value can be listed in a
sequence with or without namespace as appropriate. Because this further
causes problems with signed symbols, the exact definition is that all
symbols are first coerced to the target type (or fail), then added to
the target type if not the first. This results in:

    color: "Green Blue Red Blue"
    color: 19

Because Green is 2, Red is 1, Blue is 8 and repeated.
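
The summation rule above can be illustrated with a small self-contained
sketch. The constants and the helper `sum_symbols` are hypothetical stand-ins
and not part of the flatcc API; the values only mirror the Color values
quoted above (Red = 1, Green = 2, Blue = 8).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the Color values quoted above; not generated code. */
enum { Color_Red = 1, Color_Green = 2, Color_Blue = 8 };

/*
 * Adds every listed symbol value to the result, as described for quoted
 * multi-valued enums. A repeated symbol is added again; it is not merged
 * with bitwise or.
 */
static unsigned int sum_symbols(const unsigned int *values, size_t count)
{
    unsigned int result = 0;
    size_t i;

    assert(values != NULL || count == 0);
    for (i = 0; i < count; ++i) {
        result += values[i];
    }
    return result;
}
```

Passing the sequence Green, Blue, Red, Blue to this helper yields
2 + 8 + 1 + 8 = 19, which matches the `color: 19` result shown above.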

__NOTE__: Duplicate values should be considered implementation dependent
as it cannot be guaranteed that all flatbuffer JSON parsers will handle
this the same. It may also be that this implementation will change in
the future, for example to use bitwise or when all members and target
are of bit flag type.

It is not valid to specify an empty set like:

    color: ""

because it might be understood as 0 or the default value, and it does
not unquote very well.

The printer will by default print valid json without any spaces and
everything quoted. Use the non-strict formatting option (see headers and
test examples) to produce pretty printing. It is possible to disable
symbolic enum values using the `noenum` option.

Only enums will print symbolic values as there is no history of any
parsed symbolic values at all. Furthermore, symbolic values are only
printed if the stored value maps cleanly to one value, or in the case of
bit-flags, cleanly to multiple values. For example if parsing
`color: Green Red` it will print as `"color":"Red Green"` by default, while
`color: Green Blue Red Blue` will print as `color:19`.

Both printer and parser are limited to roughly 100 table nesting levels
and an additional 100 nested struct depths. This can be changed by
configuration flags but must fit in the runtime stack since the
operation is recursive descent. Exceeding the limits will result in an
error.

Numeric values are coerced to the receiving type. Integer types will
fail if the assignment does not fit the target while floating point
values may lose precision silently. Integer types never accept
floating point values. Strings only accept strings.

Nested flatbuffers may either be arrays of byte sized integers, or a
table or a struct of the target type. See test cases for details.

The parser will by default fail on unknown fields, but these can also be
skipped silently with a runtime option.

Unions are difficult to parse. A union is two json fields: a table as
usual, and an enum to indicate the type which has the same name with a
`_type` suffix and accepts a numeric or symbolic type code:

    {
      name: "Container Monster",
      test_type: Monster,
      test: { name: "Contained Monster" }
    }

based on the schema defined in [monster_test.fbs].

Because other json processors may sort fields, it is possible to receive
the type field after the test field. The parser does not store temporary
data structures. It constructs a flatbuffer directly. This is not
possible when the type is late. This is handled by parsing the field as
a skipped field on a first pass, followed by a typed back-tracking
second pass once the type is known (only the table is parsed twice, but
for nested unions this can still expand). Needless to say this slows down
parsing. It is an error to provide only the table field or the type
field alone, except if the type is `NONE` or `0` in which case the table
is not allowed to be present.

Union vectors are supported as of v0.5.0. A union vector is represented
as two vectors, one with a vector of tables and one with a vector of
types, similar to ordinary unions. It is more efficient to place the
type vector first because it avoids backtracking. Because a union of
type NONE cannot be represented by absence of table field when dealing
with vectors of unions, a table must have the value `null` if its type
is NONE in the corresponding type vector. In other cases a table should
be absent, and not null.

Here is an example of JSON containing a Monster root table with a union
vector field named `manyany` which is a vector of `Any` unions in the
[monster_test.fbs] schema:

    {
        "name": "Monster",
        "manyany_type": [ "Monster", "NONE" ],
        "manyany": [{"name": "Joe"}, null]
    }

### Base64 Encoding

As of v0.5.0 it is possible to encode and decode a vector of type
`[uint8]` (aka `[ubyte]`) as a base64 encoded string or a base64url
encoded string as documented in RFC 4648. Any other type, notably the
string type, does not handle base64 encoding.

Limiting the support to `[uint8]` avoids introducing binary data into
strings and also avoids dealing with sign and endian encoding of binary
data of other types. Furthermore, array encoding of values larger than 8
bits is not necessarily less efficient than base64.

Base64 padding is always printed and is optional when parsed. Spaces,
linebreaks, JSON string escape character '\\', or any other character
not in the base64(url) alphabet are rejected as a parse error.

The schema must add the attribute `(base64)` or `(base64url)` to the
field holding the vector, for example:

    table Monster {
        name: string;
        sprite: [uint8] (base64);
        token: [uint8] (base64url);
    }

If more complex data needs to be encoded as base64 such as vectors of
structs, this can be done via nested FlatBuffers which are also of type
`[uint8]`.

Note that for some use cases it might be desirable to read binary data as
base64 into memory aligned to more than 8 bits. This is not currently
possible. A `(force_align: n)` attribute on `[ubyte]` vectors could be
useful, but the same effect can be achieved via nested flatbuffers which
also align data.
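
For readers who want to produce test input for a `(base64)` field by hand,
the following is a minimal sketch of the standard padded base64 encoding from
RFC 4648. The helper `b64_encode` is hypothetical and is not how flatcc
implements its own printer; it only shows the encoding that the text above
refers to, with padding always emitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard base64 alphabet from RFC 4648 (not the base64url variant). */
static const char b64_alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/*
 * Hypothetical helper, not part of flatcc. Encodes `len` bytes from `src`
 * into `dst` with '=' padding and zero termination. `dst` must hold at
 * least 4 * ((len + 2) / 3) + 1 bytes. Returns the encoded length.
 */
static size_t b64_encode(const uint8_t *src, size_t len, char *dst)
{
    size_t i = 0, n = 0;
    uint32_t w;

    assert(dst != NULL && (src != NULL || len == 0));
    /* Full groups: 3 input bytes become 4 output characters. */
    while (len - i >= 3) {
        w = ((uint32_t)src[i] << 16) | ((uint32_t)src[i + 1] << 8) | src[i + 2];
        dst[n++] = b64_alphabet[(w >> 18) & 63];
        dst[n++] = b64_alphabet[(w >> 12) & 63];
        dst[n++] = b64_alphabet[(w >> 6) & 63];
        dst[n++] = b64_alphabet[w & 63];
        i += 3;
    }
    /* Tail: 2 bytes give 3 characters and one '=', 1 byte gives 2 and '=='. */
    if (len - i == 2) {
        w = ((uint32_t)src[i] << 16) | ((uint32_t)src[i + 1] << 8);
        dst[n++] = b64_alphabet[(w >> 18) & 63];
        dst[n++] = b64_alphabet[(w >> 12) & 63];
        dst[n++] = b64_alphabet[(w >> 6) & 63];
        dst[n++] = '=';
    } else if (len - i == 1) {
        w = (uint32_t)src[i] << 16;
        dst[n++] = b64_alphabet[(w >> 18) & 63];
        dst[n++] = b64_alphabet[(w >> 12) & 63];
        dst[n++] = '=';
        dst[n++] = '=';
    }
    dst[n] = '\0';
    return n;
}
```

The base64url variant differs only in the last two alphabet characters,
which are `-` and `_` instead of `+` and `/`.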

### Fixed Length Arrays

Fixed length arrays introduced in 0.6.0 allow for structs containing arrays
of fixed length scalars, structs and chars. Arrays are parsed like vectors
of similar type but are zero padded if shorter than expected and fail
if longer than expected. The flag `reject_array_underflow` will error if an
array is shorter than expected instead of zero padding. The flag
`skip_array_overflow` will allow overlong arrays and simply drop extra elements.

Char arrays are parsed like strings and zero padded if shorter than expected, but
they are not zero terminated. A string like "hello" will exactly fit into a
field of type `[char:5]`. Trailing zero characters are not printed, but embedded
zero characters are. This allows for loss-less roundtrips without having to zero
pad strings. Note that other arrays are always printed in full. If the flag
`skip_array_overflow` is set, a string might be truncated in the middle of a
multi-byte character. This is not checked nor enforced by the verifier.

### Runtime Flags

Both the printer and the parser have the ability to accept runtime flags that
modify their behavior. Please refer to header file comments for documentation
and test cases for examples. Notably it is possible to print unquoted symbols
and to ignore unknown fields when parsing instead of generating an error.

Note that deprecated fields are considered unknown fields during parsing so in
order to process JSON from an old schema version with deprecated fields present,
unknown symbols must be skipped.

### Generic Parsing and Printing.

As of v0.5.1 [test_json.c] demonstrates how a single parser driver can be used
to parse different table types without changes to the driver or to the schema.

For example, the following layout can be used to configure a generic parser or printer.

    struct json_scope {
        const char *identifier;
        flatcc_json_parser_table_f *parser;
        flatcc_json_printer_table_f *printer;
        flatcc_table_verifier_f *verifier;
    };

    static const struct json_scope Monster = {
        /* This is the schema global file identifier. */
        ns(Monster_identifier),
        ns(Monster_parse_json_table),
        ns(Monster_print_json_table),
        ns(Monster_verify_table)
    };

The `Monster` scope can now be used by a driver or replaced with a new scope as needed:

    /* Abbreviated ... */
    const struct json_scope *scope = &Monster;
    flatcc_json_parser_table_as_root(B, &parser_ctx, json, strlen(json), parse_flags,
            scope->identifier, scope->parser);
    /* Printing and verifying works roughly the same. */

The generated function `MyGame_Example_Monster_parse_json_as_root` is a thin
convenience wrapper roughly implementing the above.

The generated `monster_test_parse_json` is a higher level convenience wrapper named
after the schema file itself, not any specific table. It parses the `root_type` configured
in the schema. This is how the `test_json.c` test driver operated prior to v0.5.1 but
it made it hard to test parsing and printing distinct table types.

Note that verification is not really needed for JSON parsing because a
generated JSON parser is supposed to build buffers that always verify (except
for binary encoded nested buffers), but it is useful for testing.


### Performance Notes

Note that json parsing and printing is very fast, reaching 500 MB/s for
printing and about 300 MB/s for parsing. Floating point parsing can
significantly skew these numbers. The integer and floating point parsing
and printing are handled via support functions in the portable library.
In addition the floating point `include/flatcc/portable/grisu3_*` library
is used unless explicitly disabled by a compile time flag.
Disabling
`grisu3` will revert to `sprintf` and `strtod`. Grisu3 will fall back to
`strtod` and `sprintf` in some rare special cases. Due to the reliance on
`strtod` and because `strtod` cannot efficiently handle
non-zero-terminated buffers, it is recommended to zero terminate
buffers. Alternatively, grisu3 can be compiled with a flag that allows
errors in conversion. These errors are very small and still correct, but
may break some checksums. Allowing for these errors can significantly
improve parsing speed and moves the benchmark from below half a million
parses to above half a million parses per second on a 700 byte json
string, on a 2.2 GHz core-i7.

While unquoted strings may sound more efficient due to the compact size,
they are actually slower to process. Furthermore, large flatbuffer
generated JSON files may compress by a factor 8 using gzip or a factor
4 using LZ4 so this is probably the better place to optimize. For small
buffers it may be more efficient to compress flatbuffer binaries, but
for large files, json may actually compress significantly better due to
the absence of pointers in the format.

SSE 4.2 has been experimentally added, but the gains are limited
because it works best when parsing space, and the space parsing is
already fast without SSE 4.2 and because one might just leave out the
spaces if in a hurry. For parsing strings, trivial use of SSE 4.2 string
scanning doesn't work well because all the escape codes below ASCII 32
must be detected rather than just searching for `\` and `"`. That is not
to say there are no gains, they just don't seem worthwhile.

The parser is heavily optimized for 64-bit because it implements an
8-byte wide trie directly in code. It might work well for 32-bit
compilers too, but this hasn't been tested. The large trie does put some
strain on compile time.
Optimizing beyond -O2 leads to overly large
binaries, which offsets any speed gains.


## Global Scope and Included Schema

Attributes included in the schema are viewed in a global namespace and
each include file adds to this namespace so a schema file can use
included attributes without namespace prefixes.

Each included schema will also add types to a global scope until it sees
a `namespace` declaration. An included schema does not inherit the
namespace of an including file or an earlier included file, so all
schema files start in the global scope. An included file can, however,
see other types previously defined in the global scope. Because include
statements always appear first in a schema, this can only be earlier
included files, not types from a containing schema.

The generated output for any included schema is independent of how it was
included, but it might not compile without the earlier included files
being present and included first. By including the toplevel `myschema.h`
or `myschema_builder.h` all these dependencies are handled correctly.

Note: `libflatcc.a` can only parse a single schema when the schema is
given as a memory buffer, but can handle the above when given a
filename. It is possible to concatenate schema files, but a `namespace;`
declaration must be inserted as a separator to revert to global
namespace at the start of each included file. This can lead to subtle
errors because if one parent schema includes two child schema `a.fbs`
and `b.fbs`, then `b.fbs` should not be able to see anything in `a.fbs`
even if they share namespaces. This would rarely be a problem in practice,
but it means that schema compilation from memory buffers cannot
authoritatively validate a schema.
The reason the schema must be isolated
is that otherwise code generation for a given schema could change with
how it is being used leading to very strange errors in user code.


## Required Fields and Duplicate Fields

If a required field such as Monster.name is not set, the table end call will
assert in debug mode and create incorrect tables in non-debug builds.
The assertion may not be easy to decipher as it happens in library code
and it will not tell which field is missing.

When reading the name from such a table, debug mode will again assert and
non-debug builds will return a default value.

Writing the same field twice will also trigger an assertion in debug
builds.


## Fast Buffers

Buffers can be used for high speed communication by using the ability to
create buffers with structs as root. In addition the default emitter
supports `flatcc_emitter_direct_buffer` for small buffers so no extra copy
step is required to get a linear buffer in memory. Preliminary
measurements suggest there is a limit to how fast this can go (about
6-7 million buffers/sec) because the builder object must be reset between
buffers which involves zeroing allocated buffers. Small tables with a
simple vector achieve roughly half that speed. For really high speed a
dedicated builder for structs would be needed. See also
[monster_test.c].


## Types

All types stored in a buffer have a type suffix such as `Monster_table_t`
or `Vec3_struct_t` (and namespace prefix which we leave out here). These
types are read-only pointers into endian encoded data. Enum types are
just constants easily grasped from the generated code. Tables are dense so
they are never accessed directly.

Enums support schema evolution meaning that more names can be added to
the enumeration in a future schema version.
As of v0.5.0 the function
`_is_known_value` can be used to check if an enum value is known to the
current schema version.

Structs have a dual purpose because they are also valid types in native
format, yet the native representation has a slightly different purpose.
Thus the convention is that a const pointer to a struct encoded in a
flatbuffer has the type `Vec3_struct_t` whereas a writeable pointer to
a native struct has the type `Vec3_t *` or `struct Vec3 *`.

All types have a vector type with a `_vec_t` suffix which is a const
pointer to the underlying type. For example `Monster_table_t` has the vector type
`Monster_vec_t`. There is also a non-const variant with suffix
`_mutable_vec_t` which is rarely used. However, it is possible to sort
vectors in-place in a buffer, and for this to work, the vector must be
cast to mutable first. A vector (or string) type points to the element
with index 0 in the buffer, just after the length field, and it may be
cast to a native type for direct access with attention to endian
encoding. (Note that `table_t` types do point to the header field unlike
vectors.) These types are all for the reader interface. Corresponding
types with a `_ref_t` suffix such as `_vec_ref_t` are used during
the construction of buffers.

Native scalar types are mapped from the FlatBuffers schema type names
such as ubyte to `uint8_t` and so forth. These types also have vector
types provided in the common namespace (default `flatbuffers_`) so
a `[ubyte]` vector has type `flatbuffers_uint8_vec_t` which is defined
as `const uint8_t *`.

The FlatBuffers boolean type is strictly 8 bits wide so we cannot use or
emulate `<stdbool.h>` where `sizeof(bool)` is implementation dependent.
Therefore `flatbuffers_bool_t` is defined as `uint8_t` and used to
represent FlatBuffers boolean values and the constants of same type:
`flatbuffers_true = 1` and `flatbuffers_false = 0`. Even so,
`pstdbool.h` is available in the `include/flatcc/portable` directory if
`bool`, `true`, and `false` are desired in user code and `<stdbool.h>`
is unavailable.

`flatbuffers_string_t` is `const char *` but implies the returned pointer
has a length prefix just before the pointer. `flatbuffers_string_vec_t`
is a vector of strings. The `flatbuffers_string_t` type guarantees that a
length field is present, accessible via `flatbuffers_string_len(s)`, and that the
string is zero terminated. It also suggests that it is in utf-8 format
according to the FlatBuffers specification, but no checks are done and
the `flatbuffers_string_create(B, s, n)` call explicitly allows for
storing embedded null characters and other binary data.

All vector types have operations defined as the typename with `_vec_t`
replaced by `_vec_at` and `_vec_len`. For example
`flatbuffers_uint8_vec_at(inv, 1)` or `Monster_vec_len(inv)`. The length
or `_vec_len` will be 0 if the vector is missing whereas `_vec_at` will
assert in debug or behave undefined in release builds following out of
bounds access. This also applies to related string operations.

The FlatBuffers schema uses the following scalar types: `ubyte`, `byte`,
`ushort`, `short`, `uint`, `int`, `ulong`, and `long` to represent
unsigned and signed integer types of length 8, 16, 32, and 64
respectively. The schema syntax has been updated to also support the
type aliases `uint8`, `int8`, `uint16`, `int16`, `uint32`, `int32`,
`uint64`, `int64` to represent the same basic types.
Likewise, the
schema uses the types `float` and `double` to represent IEEE-754
binary32 and binary64 floating point formats where the updated syntax
also supports the type aliases `float32` and `float64`.

The C interface uses the standard C types such as `uint8_t` and `double` to
represent scalar types and this is unaffected by the schema type name
used, so the schema vector type `[float64]` is represented as
`flatbuffers_double_vec_t` the same as `[double]` would be.

Note that the C standard does not guarantee that the C types `float` and
`double` are represented by the IEEE-754 binary32 single precision
format and the binary64 double precision format respectively, although
they usually are. If this is not the case FlatCC cannot work correctly
with FlatBuffers floating point values. (If someone really has this
problem, it would be possible to fix).

Unions are represented with two table fields, one with a table field
and one with a type field. See separate section on Unions. As of flatcc
v0.5.0 union vectors are also supported.

## Unions

A union represents one of several possible tables. A table with a union
field such as `Monster.equipped` in the samples schema will have two
accessors: `MyGame_Sample_Monster_equipped(t)` of type
`flatbuffers_generic_t` and `MyGame_Sample_Monster_equipped_type(t)` of
type `MyGame_Sample_Equipment_union_type_t`. A generic type is just a
const void pointer that can be assigned to the expected table type,
struct type, or string type. The enumeration has a type code for each member
of the union and also `MyGame_Sample_Equipment_NONE` which has the value
0.
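
The pattern of reading the type code first and then assigning the generic
pointer to the expected member type can be sketched without any generated
headers. Everything in the block below is a hypothetical stand-in:
`equipment_type_t` plays the role of `MyGame_Sample_Equipment_union_type_t`,
`generic_t` plays the role of `flatbuffers_generic_t`, and `weapon_t` is a
plain struct used only for illustration, whereas real tables are accessed
through generated accessor functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for generated types; real names come from flatcc. */
typedef enum { Equipment_NONE = 0, Equipment_Weapon = 1 } equipment_type_t;
typedef struct { int damage; } weapon_t;
typedef const void *generic_t;

/* Returns the weapon damage, or -1 when the union holds no known member. */
static int equipped_damage(equipment_type_t type, generic_t value)
{
    switch (type) {
    case Equipment_Weapon: {
        /* The generic pointer is assigned to the expected member type. */
        const weapon_t *weapon = value;
        assert(weapon != NULL);
        return weapon->damage;
    }
    case Equipment_NONE:
    default:
        /* NONE, or a type code added by a newer schema version. */
        return -1;
    }
}
```

The `default` branch matters in practice because a newer schema may add
union members that older reader code does not know about.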

The union interface was changed in 0.5.0 and 0.5.1 to use a consistent
{ type, value } naming convention for both unions and union vectors
in all interfaces and to support unions and union vectors of multiple
types.

A union can be accessed by its field name, like
`MyGame_Sample_Monster_equipped(t)`, and its type is given by
`MyGame_Sample_Monster_equipped_type(t)`, or a `flatbuffers_union_t` struct
can be returned with `MyGame_Sample_Monster_equipped_union(t)` with the fields
{ type, value }. A union vector is accessed in the same way but {
type, value } represents a type vector and a vector of the given type,
e.g. a vector of Monster tables or a vector of strings.

There is a test in [monster_test.c] covering union vectors and a
separate test focusing on mixed type unions that also has union vectors.


### Union Scope Resolution

Google's `monster_test.fbs` schema has the union (details left out):

    namespace MyGame.Example2;
    table Monster{}

    namespace MyGame.Example;
    table Monster{}

    union Any { Monster, MyGame.Example2.Monster }

where the two Monster tables are defined in separate namespaces.

`flatcc` rejects this schema due to a name conflict because it uses the
basename of a union type, here `Monster`, to generate the union member names
which are also used in JSON parsing. This can be resolved by adding an
explicit name such as `Monster2`:

    union Any { Monster, Monster2: MyGame.Example2.Monster }

This syntax is accepted by both `flatc` and `flatcc`.

Both versions will implement the same union with the same type codes in the
binary format but generated code will differ in how the types are referred to.

In JSON the monster type values are now identified by
`MyGame.Example.Any.Monster`, or just `Monster`, when assigning the first
monster type to an Any union field, and `MyGame.Example.Any.Monster2`, or just
`Monster2` when assigning the second monster type. C uses the usual enum
namespace prefixed symbols like `MyGame_Example_Any_Monster2`.

## Fixed Length Arrays

Fixed Length Arrays is a late feature to the FlatBuffers format introduced in
flatc and flatcc mid 2019. Currently only scalar arrays are supported, and only
as struct fields. To use a fixed length array as a table field, wrap it in a
struct first. It would make sense to support struct elements and enum elements,
but that has not been implemented. Char arrays are more controversial due to
verification and zero termination and are also not supported. Arrays are aligned
to the size of the first field and are equivalent to repeating elements within
the struct.

The schema syntax is:

```
struct MyStruct {
    my_array : [float:10];
}
```

See `test_fixed_array` in [monster_test.c] for an example of how to work with
these arrays.

Flatcc opts to allow arbitrary length fixed length arrays but limits the entire
struct to 2^16-1 bytes. Tables cannot hold larger structs, and the C language
does not guarantee support for larger structs. Other implementations might have
different limits on maximum array size. Arrays of 0 length are not permitted.


## Optional Fields

Optional scalar table fields were introduced to FlatBuffers mid 2020 in order to
better handle null values also for scalar data types, as is common in SQL
databases. Before describing optional values, first understand how ordinary
scalar values work in FlatBuffers:

Imagine a FlatBuffer table with a `mana` field from the monster sample schema.
Ordinarily a scalar table field has an implicit default value of 0 like
`mana : uint8;`, or an explicit default value specified in the schema like
`mana : uint8 = 100;`. When a value is absent from a table field, the default value is
returned, and when a value is added during buffer construction, it will not
actually be stored if the value matches the default value, unless the
`force_add` option is used to write a value even if it matches the default
value. Likewise the `is_present` method can be used to test if a field was
actually stored in the buffer when reading it.

When a table has many fields, most of which just hold default settings,
significant space can be saved using default values, but it also means that an
absent value does not indicate null. Field absence is essentially just a data
compression technique, not a semantic change to the data. However, it is
possible to use `force_add` and `is_present` to interpret values as null when
not present, except that this is not a standardized technique. Optional fields
represent a standardized way to achieve this.

Scalar fields can be marked as optional by assigning `null` as a default
value. For example, some objects might not have a meaningful `mana`
value, so it could be represented as `lifeforce : uint8 = null`. Now the
`lifeforce` field has become an optional field. In the FlatCC implementation
this means that if the field is written, it will always be stored, also if the
value is 0 or any other representable value. It also means that the `force_add`
method is not available for the field because `force_add` is essentially always
in effect for the field. On the read side, optional scalar fields behave
exactly as ordinary scalar fields that have not specified a default value,
that is, if the field is absent, 0 will be returned and `is_present` will
return false.
In addition, optional scalar fields get a new accessor method with the suffix
`_option()` which returns a struct with two fields: `{ is_null, value }` where
`_option().is_null == !is_present()` and `_option().value` is the same value
as returned by the `_get()` method, which will be 0 if `is_null` is true. The
option struct is named after the type similar to unions, for example
`flatbuffers_uint8_option_t` or `MyGame_Example_Color_option_t`, and the
option accessor method also works similar to unions. Note that `_get()` will
also return 0 for optional enum values that are null (i.e. absent), even if
the enum does not have an enumerated element with the value 0. Normally enums
without a 0 element are not allowed in the schema unless a default value is
specified, but in this case it is null, and `_get()` needs some value to
return in this case.

By keeping the original accessors, read logic can be made simpler and faster
when it is not important whether a value is null or 0 and at the same time the
option value can be returned and stored.

Note that struct fields cannot be optional. Also note that non-scalar table
fields are not declared optional because these types can already represent
null via a null pointer or a NONE union type.

JSON parsing and printing change behavior for scalar fields by treating absent
fields differently according to the optional semantics. For example parsing a
missing field will not store a default value even if the parser is executed with
a flag to force default values to be stored and the printer will not print
absent optional fields even if otherwise flagged to print default values.
Currently the JSON printers and parsers do not print or parse JSON null and can
only represent null by absence of a field.

For an example of reading and writing, as well as printing and parsing JSON,
optional scalar fields, please refer to [optional_scalars_test.fbs] and [optional_scalars_test.c].
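
The relationship between `_get()`, `is_present` and `_option()` described in
this section can be modelled with a small self-contained sketch. All names
below are hypothetical: `uint8_option_t` only mirrors the shape of
`flatbuffers_uint8_option_t`, and `lifeforce_field_t` is a stand-in that
models whether the field was stored, not an actual flatcc table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of an option struct such as flatbuffers_uint8_option_t. */
typedef struct {
    int is_null;
    uint8_t value;
} uint8_option_t;

/* Hypothetical field model: `present` tells whether a value was stored. */
typedef struct {
    int present;
    uint8_t stored;
} lifeforce_field_t;

/* Plain accessor: returns 0 when the optional field is absent. */
static uint8_t lifeforce_get(const lifeforce_field_t *f)
{
    assert(f != NULL);
    return f->present ? f->stored : 0;
}

/* Option accessor: is_null == !is_present and value == get(). */
static uint8_option_t lifeforce_option(const lifeforce_field_t *f)
{
    uint8_option_t opt;

    assert(f != NULL);
    opt.is_null = !f->present;
    opt.value = lifeforce_get(f);
    return opt;
}
```

With this model an absent field and a stored 0 both read as 0 through the
plain accessor, and only the option accessor tells them apart.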


## Endianness

The `include/flatcc/portable/pendian_detect.h` file detects endianness
for popular compilers and provides a runtime fallback detection for
others. In most cases even the runtime detection will be optimized out
at compile time in release builds.

The `FLATBUFFERS_LITTLEENDIAN` flag is respected for compatibility with
Google's `flatc` compiler, but it is recommended to avoid its use and
work with the mostly standard flags defined and/or used in
`pendian_detect.h`, or to provide for additional compiler support.

As of flatcc 0.4.0 there is support for flatbuffers running natively on
big endian hosts. This has been tested on IBM AIX. However, always run
tests against the system of interest - the release process does not cover
automated tests on any BE platform.

As of flatcc 0.4.0 there is also support for compiling the flatbuffers
runtime library with flatbuffers encoded in big endian format regardless
of the host platform's endianness. Longer term this should probably be
placed in a separate library with separate name prefixes or suffixes,
but it is usable as is. Redefine `FLATBUFFERS_PROTOCOL_IS_LE/BE`
accordingly in `include/flatcc/flatcc_types.h`. This is already done in
the `be` branch. This branch is not maintained but the master branch can
be merged into it as needed.

Note that standard flatbuffers are always encoded in little endian but
in situations where all buffer producers and consumers are big endian,
the non standard big endian encoding may be faster, depending on
intrinsic byteswap support. As a curiosity, the `load_test` actually
runs faster with big endian buffers on a little endian macOS platform
for reasons only the optimizer will know, but read performance of small
buffers drops to 40% while writing buffers generally drops to 80-90%
performance.
For platforms without compiler intrinsics for byteswapping,
this can be much worse.

Flatbuffers encoded in big endian will have the optional file identifier
byteswapped. The interface should make this transparent, but details
are still being worked out. For example, a monster buffer should always
verify as having the identifier "MONS", but internally the buffer
will store the identifier as "SNOM" in big endian encoded buffers.

Because buffers can be encoded in two ways, `flatcc` uses the terms
`native` endianness and `protocol` endianness. `_pe` is a suffix used in
various low level API calls to convert between native and protocol
endianness without caring about whether host or buffer is little or big
endian.

If it is necessary to write application code that behaves differently if
the native encoding differs from protocol encoding, use
`flatbuffers_is_pe_native()`. This is a function, not a define, but for
all practical purposes it will have the same efficiency while also
supporting runtime endian detection where necessary.

The flatbuffer environment only supports reading either big or little
endian for the time being. To test which is supported, use the define
`FLATBUFFERS_PROTOCOL_IS_LE` or `FLATBUFFERS_PROTOCOL_IS_BE`. By default
they are defined as 1 and 0 respectively.


## Pitfalls in Error Handling

The builder API often returns a reference or a pointer where null is
considered an error or at least a missing object default. However, some
operations do not have a meaningful object or value to return. These
follow the convention of 0 for success and non-zero for failure.
Also, if anything fails, it is not safe to proceed with building a
buffer. However, to avoid overheads, there is no hand holding here.
On the upside, failures only happen with incorrect use or allocation
failure and since the allocator can be customized, it is possible to
provide a central error state there or to guarantee no failure will
happen depending on use case, assuming the API is otherwise used
correctly. By not checking error codes, this logic also optimizes out
for better performance.


## Searching and Sorting

The builder API does not support sorting due to the complexity of
customizable emitters, but the reader API does support sorting so a
buffer can be sorted at a later stage. This requires casting a vector to
mutable and calling the sort method available for fields with keys.

The sort uses heap sort and can sort a vector in-place without using
external memory or recursion. Due to the lack of external memory, the
sort is not stable. The corresponding find operation returns the lowest
index of any matching key, or `flatbuffers_not_found`.

When configured in `config.h` (the default), the `flatcc` compiler
allows multiple keyed fields unlike Google's `flatc` compiler. This works
transparently by providing `<table_name>_vec_sort_by_<field_name>` and
`<table_name>_vec_find_by_<field_name>` methods for all keyed fields.
The first field maps to `<table_name>_vec_sort` and
`<table_name>_vec_find`. Obviously the chosen find method must match
the chosen sort method. The find operation is O(logN).

As of v0.6.0 the default key used for find and sort without the `by_name`
suffix is the field with the smaller id instead of the first listed in the
schema which is often but not always the same thing.

v0.6.0 also introduces the `primary_key` attribute that can be used instead of
the `key` attribute on at most one field. The two attributes are mutually
exclusive.
This can be used if a key field with a higher id should be the
default key. There is no difference when only one field has a `key` or
`primary_key` attribute, so in that case choose `key` for compatibility.
Google's `flatc` compiler does not recognize the `primary_key` attribute.

As of v0.6.0 a `sorted` attribute has been introduced together with the sort
operations `<table_name>_sort` and `<union_name>_sort`. If a table or a union,
directly or indirectly, contains a vector with the `sorted` attribute, then the
sort operation is made available. The sort will recursively visit all children
with vectors marked sorted. The sort operation will use the default (primary)
key. A table or union must first be cast to mutable, for example
`ns(Monster_sort)((ns(Monster_mutable_table_t))monster)`. The actual vector
sort operations are the same as before, they are just called automatically.
The `sorted` attribute can only be set on vectors that are not unions. The
vector can be of scalar, string, struct, or table type. `sorted` is only valid
for a struct or table vector if the struct or table has a field with a `key`
or `primary_key` attribute.

NOTE: A FlatBuffer can reference the same object
multiple times. The sort operation will be repeated if this is the case.
Sometimes that is OK, but if it is a concern, remove the `sorted` attribute
and sort the vector manually. Note that sharing can also happen via a shared
containing object.

The sort operations are generated in `_reader.h` files
and only for objects directly or indirectly affected by the `sorted` attribute.
Unions have a new mutable cast operator for use with sorting unions:
`ns(Any_sort)(ns(Any_mutable_cast)(my_any_union))`. Usually unions will be
sorted via a containing table which performs this cast automatically. See also
`test_recursive_sort` in [monster_test.c].
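
The properties described for sort and find (in-place heap sort without
recursion or external memory, and an O(logN) find that returns the lowest
matching index) can be illustrated on a plain `uint32_t` array. The following
is a minimal sketch and not the flatcc implementation: the `model_` names and
`MODEL_NOT_FOUND` are hypothetical stand-ins, and real code would call the
generated `_vec_sort` and `_vec_find` operations on a buffer vector instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for flatbuffers_not_found. */
#define MODEL_NOT_FOUND ((size_t)-1)

/* Restore the max-heap property below `root`; `end` is exclusive. */
static void model_sift_down(uint32_t *v, size_t root, size_t end)
{
    for (;;) {
        size_t child = 2 * root + 1;
        uint32_t tmp;

        if (child >= end) return;
        if (child + 1 < end && v[child] < v[child + 1]) ++child;
        if (v[root] >= v[child]) return;
        tmp = v[root];
        v[root] = v[child];
        v[child] = tmp;
        root = child;
    }
}

/* In-place heap sort: iterative, no extra memory, not stable. */
static void model_heap_sort(uint32_t *v, size_t n)
{
    size_t i;
    uint32_t tmp;

    if (n < 2) return;
    /* Build a max-heap from the last parent down to the root. */
    for (i = n / 2; i > 0; --i) model_sift_down(v, i - 1, n);
    /* Repeatedly move the largest element to the end of the range. */
    for (i = n - 1; i > 0; --i) {
        tmp = v[0];
        v[0] = v[i];
        v[i] = tmp;
        model_sift_down(v, 0, i);
    }
}

/* Binary search on a sorted array, returning the lowest matching index. */
static size_t model_find(const uint32_t *v, size_t n, uint32_t key)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (v[mid] < key) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return (lo < n && v[lo] == key) ? lo : MODEL_NOT_FOUND;
}
```

Because the search keeps narrowing towards the left on equality, duplicate keys
resolve to the first occurrence, which is why the find must be paired with a
sort on the same key.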

As of v0.4.1 `<table_name>_vec_scan_by_<field_name>` and the default
`<table_name>_vec_scan` are also provided, similar to `find`, but as a
linear search that does not require the vector to be sorted. This is
especially useful for searching by a secondary key (multiple keys are a
non-standard flatcc feature). `_scan_ex` searches a sub-range [a, b)
where b is an exclusive index. `b = flatbuffers_end == flatbuffers_not_found
== (size_t)-1` may be used when searching from a position to the end,
and `b` can also conveniently be the result of a previous search.

`rscan` searches in the opposite direction starting from the last
element. `rscan_ex` accepts the same range arguments as `scan_ex`. If
`a >= b or a >= len` the range is considered empty and
`flatbuffers_not_found` is returned. `[r]scan[_ex]_n[_by_name]` is for
length terminated string keys. See [monster_test.c] for examples.

Note that `find` requires the `key` attribute in the schema. `scan` is also
available on keyed fields. By default `flatcc` will also enable scan by
any other field but this can be disabled by a compile time flag.

Basic types such as `uint8_vec` also have search operations.

See also [Builder Interface Reference] and [monster_test.c].


## Null Values

The FlatBuffers format does not fully distinguish between default values
and missing or null values but it is possible to force values to be
written to the buffer. This is discussed further in the
[Builder Interface Reference]. For SQL data roundtrips this may be more
important than having compact data.

The `_is_present` suffix on table access methods can be used to detect if
a value is present in a vtable, for example `Monster_hp_is_present`. Unions
return true if the type field is present, even if it holds the value
None.
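
How default values, forced writes and presence detection interact can be
modelled without the flatcc headers. The sketch below is an assumption-laden
illustration, not the builder API: the `model_hp_` functions and the default of
100 are hypothetical, and the `present` flag stands in for a vtable entry.

```c
#include <stdint.h>

/* Hypothetical schema default for the modelled field. */
#define MODEL_HP_DEFAULT 100

/* Hypothetical model of one scalar table field. */
typedef struct {
    int present;     /* stands in for "the vtable has an entry" */
    uint8_t stored;  /* only meaningful when present is non-zero */
} model_hp_field_t;

/* Models `add`: a value equal to the default is not stored. */
static void model_hp_add(model_hp_field_t *f, uint8_t v)
{
    if (v != MODEL_HP_DEFAULT) {
        f->stored = v;
        f->present = 1;
    }
}

/* Models `force_add`: the value is stored even if it is the default. */
static void model_hp_force_add(model_hp_field_t *f, uint8_t v)
{
    f->stored = v;
    f->present = 1;
}

/* Models the plain accessor: absent fields read as the default. */
static uint8_t model_hp_get(const model_hp_field_t *f)
{
    return f->present ? f->stored : MODEL_HP_DEFAULT;
}

/* Models the `_is_present` accessor. */
static int model_hp_is_present(const model_hp_field_t *f)
{
    return f->present;
}
```

In this model both ways of adding the default value read back as 100, but only
the forced write is detectable as present, which is what a null-aware roundtrip
would rely on.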

The `add` methods have corresponding `force_add` methods for scalar and enum
values to force storing the value even if it is default and thus making
it detectable by `is_present`.


## Portability Layer

The portable library is placed under `include/flatcc/portable` and is
required by flatcc, but isn't strictly part of the `flatcc` project. It
is intended as an independent light-weight header-only library to deal
with compiler and platform variations. It is placed under the flatcc
include path to simplify flatcc runtime distribution and to avoid
name and versioning conflicts if used by other projects.

The license of portable is different from `flatcc`. It is mostly MIT or
Apache depending on the original source of the various parts.

A larger set of portable files is included if `FLATCC_PORTABLE` is
defined by the user when building.

    cc -D FLATCC_PORTABLE -I include monster_test.c -o monster_test

Otherwise a targeted subset is
included by `flatcc_flatbuffers.h` in order to deal with non-standard
behavior of some C11 compilers.

`pwarnings.h` is also always included so compiler specific warnings can
be disabled where necessary.

The portable library includes the essential parts of the grisu3 library
found in `external/grisu3`, but excludes the test cases. The JSON
printer and parser rely on fast portable numeric print and parse
operations based mostly on grisu3.

If a specific platform has been tested, feedback and possibly patches to
the portability layer would be appreciated so these can be made
available to other users.


## Building

### Unix Build (OS-X, Linux, related)

To initialize and run the build (see required build tools below):

    scripts/build.sh

The `bin` and `lib` folders will be created with debug and release
build products.

The build depends on `CMake`. By default the `Ninja` build tool is also required,
but alternatively `make` can be used.

Optionally switch to a different build tool by choosing one of:

    scripts/initbuild.sh make
    scripts/initbuild.sh make-concurrent
    scripts/initbuild.sh ninja

where `ninja` is the default and `make-concurrent` is `make` with the `-j` flag.

To enforce a 32-bit build on a 64-bit machine the following configuration
can be used:

    scripts/initbuild.sh make-32bit

which uses `make` and provides the `-m32` flag to the compiler.
A custom build configuration `X` can be added by adding a
`scripts/build.cfg.X` file.

`scripts/initbuild.sh` cleans the build if a specific build
configuration is given as argument. Without arguments it only ensures
that CMake is initialized and is therefore fast to run on subsequent
calls. This is used by all test scripts.

To install build tools on OS-X, and build:

    brew update
    brew install cmake ninja
    git clone https://github.com/dvidelabs/flatcc.git
    cd flatcc
    scripts/build.sh

To install build tools on Ubuntu, and build:

    sudo apt-get update
    sudo apt-get install cmake ninja-build
    git clone https://github.com/dvidelabs/flatcc.git
    cd flatcc
    scripts/build.sh

To install build tools on CentOS, and build:

    sudo yum group install "Development Tools"
    sudo yum install cmake
    git clone https://github.com/dvidelabs/flatcc.git
    cd flatcc
    scripts/initbuild.sh make # there is no ninja build tool
    scripts/build.sh


OS-X also has a HomeBrew package:

    brew update
    brew install flatcc

or for the bleeding edge:

    brew update
    brew install flatcc --HEAD


### Windows Build (MSVC)

Install CMake, MSVC, and git (tested with MSVC 14 2015).

In PowerShell:

    git clone https://github.com/dvidelabs/flatcc.git
    cd flatcc
    mkdir build\MSVC
    cd build\MSVC
    cmake -G "Visual Studio 14 2015" ..\..

Optionally also build from the command line (in build\MSVC):

    cmake --build . --config Debug
    cmake --build . --config Release

In Visual Studio:

    open flatcc\build\MSVC\FlatCC.sln
    build solution
    choose Release build configuration menu
    rebuild solution

*Note that `flatcc\CMakeLists.txt` sets the `-DFLATCC_PORTABLE` flag and
that `include\flatcc\portable\pwarnings.h` disables certain warnings for
warning level -W3.*

### Docker

Docker image:

- <https://github.com/neomantra/docker-flatbuffers>


### Cross-compilation

Users have been reporting some degree of success using cross compiles
from Linux x86 host to embedded ARM Linux devices.

For this to work, the `FLATCC_TEST` option should be disabled in part
because cross-compilation cannot run the cross-compiled flatcc tool, and
in part because there appear to be some issues with CMake custom build
steps needed when building test and sample projects.

The option `FLATCC_RTONLY` will disable tests and only build the runtime
library.

The following is not well tested, but may be a starting point:

    mkdir -p build/xbuild
    cd build/xbuild
    cmake ../.. -DBUILD_SHARED_LIBS=on -DFLATCC_RTONLY=on \
      -DCMAKE_BUILD_TYPE=Release

Overall, it may be simpler to create a separate Makefile and just
compile the few `src/runtime/*.c` files into a library and distribute the
headers as for other platforms, unless `flatcc` is also required for the
target. Or to simply include the runtime source and header files in the user
project.

Note that no tests will be built nor run with `FLATCC_RTONLY` enabled.
It is highly recommended to at least run the `tests/monster_test`
project on a new platform.


### Custom Allocation

Some target systems will not work with Posix `malloc`, `realloc`, `free`
and C11 `aligned_alloc`. Or they might, but more allocation control is
desired. The best approach is to use `flatcc_builder_custom_init` to
provide a custom allocator and emitter object, but for simpler cases or
while piloting a new platform
[flatcc_alloc.h](include/flatcc/flatcc_alloc.h) can be used to override
runtime allocation functions. _Carefully_ read the comments in this file
if doing so. There is a test case implementing a new emitter, and a
custom allocator can be copied from the one embedded in the builder
library source.
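
As a rough illustration of what replacement allocation functions might look
like, the following sketch wraps `malloc` and `free` with a request size limit
and a sticky failure flag, in the spirit of the central error state mentioned
under Pitfalls in Error Handling. Everything named `model_` is hypothetical and
not part of flatcc; how such functions are hooked into the runtime is governed
by the comments in `flatcc_alloc.h` and is deliberately not shown here.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical central allocator state; not part of flatcc. */
typedef struct {
    size_t live;         /* number of outstanding allocations */
    size_t max_request;  /* largest single request that is accepted */
    int failed;          /* sticky: set once any request has failed */
} model_alloc_state_t;

static model_alloc_state_t model_alloc_state = { 0, (size_t)1 << 20, 0 };

/* Allocation wrapper that records failures in the central state. */
static void *model_alloc(size_t n)
{
    void *p;

    /* Avoid malloc(0), which may legitimately return a null pointer. */
    if (n == 0) n = 1;
    if (n > model_alloc_state.max_request) {
        model_alloc_state.failed = 1;
        return 0;
    }
    p = malloc(n);
    if (!p) {
        model_alloc_state.failed = 1;
        return 0;
    }
    ++model_alloc_state.live;
    return p;
}

/* Matching free wrapper; a null pointer is ignored like free(0). */
static void model_free(void *p)
{
    if (!p) return;
    --model_alloc_state.live;
    free(p);
}
```

With a sticky flag like this, an application could build a buffer without
checking every intermediate return code and inspect `model_alloc_state.failed`
once before using the result, provided the API is otherwise used correctly.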


### Custom Asserts

On systems where the default POSIX `assert` call is unavailable, or when
a different assert behaviour is desirable, it is possible to override
the default behaviour in the runtime part of the flatcc library via logic
defined in [flatcc_assert.h](include/flatcc/flatcc_assert.h).

By default POSIX `assert` is used. It can be changed by a preprocessor
definition:

    -DFLATCC_ASSERT=own_assert

but it will not override assertions used in the portable library, notably the
Grisu3 fast numerical conversion library used with JSON parsing.

Runtime assertions can be disabled using:

    -DFLATCC_NO_ASSERT

This will also disable Grisu3 assertions. See
[flatcc_assert.h](include/flatcc/flatcc_assert.h) for details.

The `<assert.h>` file will in all cases remain a dependency for C11 style static
assertions. Static assertions are needed to ensure the generated structs have
the correct physical layout on all compilers. The portable library has a generic
static assert implementation for older compilers.


### Shared Libraries

By default libraries are built statically.

Occasionally there are requests
[#42](https://github.com/dvidelabs/flatcc/pull/42) for also building shared
libraries. It is not clear how to build both static and shared libraries
at the same time without choosing some unconventional naming scheme that
might affect install targets unexpectedly.

CMake supports building shared libraries out of the box using the
standard library name using the following option:

    cmake ... -DBUILD_SHARED_LIBS=ON ...

See also [CMake Gold: Static + shared](http://cgold.readthedocs.io/en/latest/tutorials/libraries/static-shared.html).


## Distribution

Install targets may be built with:

    mkdir -p build/install
    cd build/install
    cmake ../.. -DBUILD_SHARED_LIBS=on -DFLATCC_RTONLY=on \
      -DCMAKE_BUILD_TYPE=Release -DFLATCC_INSTALL=on
    make install

However, this is not well tested and should be seen as a starting point.
The normal `scripts/build.sh` places files in `bin` and `lib` of the source tree.

By default lib files are built into the `lib` subdirectory of the project. This
can be changed, for example like `-DFLATCC_INSTALL_LIB=lib64`.


### Unix Files

To distribute the compiled binaries the following files are
required:

Compiler:

    bin/flatcc                 (command line interface to schema compiler)
    lib/libflatcc.a            (optional, for linking with schema compiler)
    include/flatcc/flatcc.h    (optional, header and doc for libflatcc.a)

Runtime:

    include/flatcc/**          (runtime header files)
    include/flatcc/reflection  (optional)
    include/flatcc/support     (optional, only used for test and samples)
    lib/libflatccrt.a          (runtime library)

In addition the runtime library source files may be used instead of
`libflatccrt.a`. This may be handy when packaging the runtime library
along with schema specific generated files for a foreign target that is
not binary compatible with the host system:

    src/runtime/*.c

### Windows Files

The build products from MSVC are placed in the bin and lib subdirectories:

    flatcc\bin\Debug\flatcc.exe
    flatcc\lib\Debug\flatcc_d.lib
    flatcc\lib\Debug\flatccrt_d.lib
    flatcc\bin\Release\flatcc.exe
    flatcc\lib\Release\flatcc.lib
    flatcc\lib\Release\flatccrt.lib

The runtime `include\flatcc` directory is distributed as on other platforms.


## Running Tests on Unix

Run

    scripts/test.sh [--no-clean]

**NOTE:** The test script will clean everything in the build directory before
initializing CMake with the chosen or default build configuration, then
build Debug and Release builds, and run tests for both.

The script must end with `TEST PASSED`, or it didn't pass.

To make sure everything works, also run the benchmarks:

    scripts/benchmark.sh


## Running Tests on Windows

In Visual Studio the test can be run as follows: first build the main
project, then right click the `RUN_TESTS` target and choose build. See
the output window for test results.

It is also possible to run tests from the command line after the project has
been built:

    cd build\MSVC
    ctest

Note that the monster example is disabled for MSVC 2010.

Be aware that tests copy and generate certain files which are not
automatically cleaned by Visual Studio. Close the solution and wipe the
`MSVC` directory, and start over to get a guaranteed clean build.

Please also observe that the file `.gitattributes` is used to prevent
certain files from getting CRLF line endings. Using another source
control system might break tests, notably
`test/flatc_compat/monsterdata_test.golden`.


*Note: Benchmarks have not been ported to Windows.*


## Configuration

The configuration

    config/config.h

drives the permitted syntax and semantics of the schema compiler and
code generator. These generally default to be compatible with
Google's `flatc` compiler. It also sets things like permitted nesting
depth of structs and tables.

The runtime library has a separate configuration file

    include/flatcc/flatcc_rtconfig.h

This file can modify certain aspects of JSON parsing and printing such
as disabling the Grisu3 library or requiring that all names in JSON are
quoted.

For most users, it should not be relevant to modify these configuration
settings. If changes are required, they can be given in the build
system - it is not necessary to edit the config files, for example
to disable trailing comma in the JSON parser:

    cc -DFLATCC_JSON_PARSE_ALLOW_TRAILING_COMMA=0 ...


## Using the Compiler and Builder library

The compiler library `libflatcc.a` can compile schemas provided
in a memory buffer or as a filename. When given as a buffer, the schema
cannot contain include statements - these will cause a compile error.

When given a filename the behavior is similar to the commandline
`flatcc` interface, but with more options - see `flatcc.h` and
`config/config.h`.

`libflatcc.a` supports functions named `flatcc_...`. `reflection...`
functions may also be available; these are simply the generated C interface
for the binary schema. The builder library is also included. These last two
interfaces are only present because the library supports binary schema
generation.

The standalone runtime library `libflatccrt.a` is a collection of the
`src/runtime/*.c` files. This supports the generated C headers for
various features. It is also possible to distribute and compile with the
source files directly. For debugging, it is useful to use the
`libflatccrt_d.a` version because it catches a lot of incorrect API use
in assertions.

The runtime library may also be used by other languages. See comments
in [flatcc_builder.h].
JSON parsing is one example of an
alternative use of the builder library so it may help to inspect the
generated JSON parser source and runtime source.

## FlatBuffers Binary Format

Mostly for implementers: [FlatBuffers Binary Format]


## Security Considerations

See [Security Considerations].


## Style Guide

FlatCC coding style is largely similar to the [WebKit Style], with the following notable exceptions:

* Syntax requiring C99 or later is avoided, except `<stdint.h>` types are made available.
* If conditions always use curly brackets, or single line statements without linebreak: `if (err) return -1;`.
* NULL and nullptr are generally just represented as `0`.
* Comments are old-school C-style (pre C99). Text is generally cased with punctuation: `/* A comment. */`
* `true` and `false` keywords are not used (pre C99).
* In code generation there is essentially no formatting to avoid excessive bloat.
* Struct names and other types are lower case since this is C, not C++.
* `snake_case` is used over `camelCase`.
* Header guards are used over `#pragma once` because it is non-standard and not always reliable in filesystems with ambiguous paths.
* Comma is not placed first in multi-line calls (but maybe that would be a good idea for diff stability).
* `config.h` inclusion might be handled differently in that `flatbuffers.h` includes the config file.
* `unsigned` is not used without `int` for historical reasons. Generally a type like `uint32_t` is preferred.
* Use `TODO:` instead of `FIXME:` in comments for historical reasons.

All the main source code in compiler and runtime aims to be C11 compatible and
uses many C11 constructs. This is made possible through the included portable
library such that older compilers can also function.
Therefore any platform specific adaptations will be provided by updating
the portable library rather than introducing compile time flags in the main
source code.


## Benchmarks

See [Benchmarks]

[Builder Interface Reference]: https://github.com/dvidelabs/flatcc/blob/master/doc/builder.md
[FlatBuffers Binary Format]: https://github.com/dvidelabs/flatcc/blob/master/doc/binary-format.md
[Benchmarks]: https://github.com/dvidelabs/flatcc/blob/master/doc/benchmarks.md
[monster_test.c]: https://github.com/dvidelabs/flatcc/blob/master/test/monster_test/monster_test.c
[monster_test.fbs]: https://github.com/dvidelabs/flatcc/blob/master/test/monster_test/monster_test.fbs
[optional_scalars_test.fbs]: https://github.com/dvidelabs/flatcc/blob/optional/test/optional_scalars_test/optional_scalars_test.fbs
[optional_scalars_test.c]: https://github.com/dvidelabs/flatcc/blob/optional/test/optional_scalars_test/optional_scalars_test.c
[paligned_alloc.h]: https://github.com/dvidelabs/flatcc/blob/master/include/flatcc/portable/paligned_alloc.h
[test_json.c]: https://github.com/dvidelabs/flatcc/blob/master/test/json_test/test_json.c
[test_json_parser.c]: https://github.com/dvidelabs/flatcc/blob/master/test/json_test/test_json_parser.c
[flatcc_builder.h]: https://github.com/dvidelabs/flatcc/blob/master/include/flatcc/flatcc_builder.h
[flatcc_emitter.h]: https://github.com/dvidelabs/flatcc/blob/master/include/flatcc/flatcc_emitter.h
[flatcc-help.md]: https://github.com/dvidelabs/flatcc/blob/master/doc/flatcc-help.md
[flatcc_rtconfig.h]: https://github.com/dvidelabs/flatcc/blob/master/include/flatcc/flatcc_rtconfig.h
[hexdump.h]: https://github.com/dvidelabs/flatcc/blob/master/include/flatcc/support/hexdump.h
[readfile.h]: include/flatcc/support/readfile.h
[Security Considerations]: https://github.com/dvidelabs/flatcc/blob/master/doc/security.md
[flatc --annotate]: https://github.com/google/flatbuffers/tree/master/tests/annotated_binary
[WebKit Style]: https://webkit.org/code-style-guidelines/