Your file format should have a simple parser in C or Rust
Disclaimer
Quick Note
If AI fully automates software engineering, this whole post becomes invalid. If it partially automates it, I guess my point could still hold.
I am pretty mediocre at software, so take my advice with a grain of salt.
Main
Good formats that satisfy this criterion: cmark for .md, jq for .json, htmlq for .html, d2 for .d2
Bad formats that (arguably) don't satisfy this criterion: .epub, MathJax parser for .tex, TikZ for .tex diagrams, mermaid browser engine for .mmd
Formats that aren't even worth considering: MS Word for .docx, Google Docs
Ideally, your parser should not require a Python environment or an entire browser JS engine.
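A minimal sketch of what "simple" could look like, assuming a made-up line-oriented format with one `key: value` pair per line. The format and the name parse_record are hypothetical, invented only for illustration. It uses nothing beyond the Rust standard library, so it compiles to a single binary with no interpreter or browser around it.

```rust
use std::collections::BTreeMap;

/// Parse a hypothetical line-oriented format: one `key: value` pair per line.
/// Blank lines and lines starting with `#` are ignored.
/// On failure, the error names the 1-based number of the first malformed line.
pub fn parse_record(input: &str) -> Result<BTreeMap<String, String>, String> {
    let mut fields = BTreeMap::new();
    for (index, raw_line) in input.lines().enumerate() {
        let line = raw_line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        // Split on the first colon only, so values may themselves contain colons.
        let (key, value) = line
            .split_once(':')
            .ok_or_else(|| format!("line {}: expected `key: value`", index + 1))?;
        let key = key.trim();
        if key.is_empty() {
            return Err(format!("line {}: empty key", index + 1));
        }
        // If a key repeats, the later value overwrites the earlier one.
        fields.insert(key.to_string(), value.trim().to_string());
    }
    Ok(fields)
}
```

Calling parse_record on the text "title: My Notes" followed by a line "url: https://example.com" gives a map with two entries. The whole parser fits on one screen, which is roughly the bar I have in mind.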
A formal spec of your file format is not the same thing as an actual parser. In practice, the code usually becomes the spec, and the words of the spec are just words nobody follows. Example: nobody follows the EPUB spec.
It is also generally a good idea for your format to be at least somewhat human-readable. Prefer UTF-8 fields over binary blobs wherever possible.
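To make the UTF-8 versus binary trade-off concrete, here is a rough sketch that encodes the same two numbers both ways. The Header struct and its width and height fields are hypothetical, chosen only for illustration. The binary form is 8 bytes that mean nothing without knowing the field order, field size and endianness. The text form can be read, and fixed, in any editor.

```rust
/// Hypothetical header, used only to compare the two encodings.
#[derive(Debug, PartialEq)]
pub struct Header {
    pub width: u32,
    pub height: u32,
}

/// Binary blob: two little-endian u32 values packed back to back.
pub fn to_binary(header: &Header) -> [u8; 8] {
    let mut out = [0u8; 8];
    out[..4].copy_from_slice(&header.width.to_le_bytes());
    out[4..].copy_from_slice(&header.height.to_le_bytes());
    out
}

/// UTF-8 fields: one self-describing `name=value` pair per line.
pub fn to_text(header: &Header) -> String {
    format!("width={}\nheight={}\n", header.width, header.height)
}

/// Read the text form back. Returns None if a line has no `=`,
/// or if either field is missing or not a valid u32.
pub fn from_text(input: &str) -> Option<Header> {
    let mut width: Option<u32> = None;
    let mut height: Option<u32> = None;
    for line in input.lines() {
        if line.trim().is_empty() {
            continue;
        }
        let (key, value) = line.split_once('=')?;
        match key.trim() {
            "width" => width = value.trim().parse().ok(),
            "height" => height = value.trim().parse().ok(),
            _ => {} // Unknown fields are skipped, which leaves room to add fields later.
        }
    }
    Some(Header { width: width?, height: height? })
}
```

The text form costs a few more bytes, but someone who finds the file years later can still work out what it says without the original tooling.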
My guess is that having a simple parser is one of the biggest predictors of whether your file format still exists 10 years later.