But why? It's roughly the same as JSON, but incompatible. Please, stop writing your own JSON variants.
Rust's serialization macro system allows you to write back ends for any format. I've written back ends for all three serialization formats used by Second Life.[1] (There's a binary form, an XML form, and something called "notation".) But other than for compatibility with existing code, there's no reason to use them.
I know of one format that serde's system doesn't support well. D-Bus requires empty arrays to be aligned according to the type of their elements, but serde's system has no way to tell the serializer any metadata about an array's element type other than by handing it an actual element to serialize. All the serializer gets to know is that an array has started and then ended; it learns nothing about element alignment unless it receives at least one element.
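To make the limitation concrete, here's a minimal sketch (assuming serde and serde_json as dependencies): serializing an empty Vec only hands the backend a begin/end pair, never the element type.

    // When `empty` is serialized, the format backend only receives a
    // serialize_seq(Some(0)) call followed by end(); it never learns that
    // the elements would have been i64, which is exactly the metadata a
    // D-Bus serializer needs to pick the right alignment.
    fn main() {
        let empty: Vec<i64> = Vec::new();
        let json = serde_json::to_string(&empty).expect("serialization failed");
        assert_eq!(json, "[]");
    }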
There is a sort of workaround - preload the serializer with what it is expected to serialize, e.g. by passing the D-Bus signature string into the serializer's constructor. But a) this is a somewhat unclean solution, because everything other than the arrays is redundant information the serializer already gets from the serde::Serialize impl, and b) it's manual work for the user to specify this, and it's easy to make a mistake and let the two get out of sync.
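A rough sketch of what that workaround looks like; the DBusSerializer type and its constructor here are purely illustrative, not any real crate's API:

    // Hypothetical illustration of the "preload the signature" workaround.
    struct DBusSerializer {
        // D-Bus signature of the value about to be serialized,
        // e.g. "ai" for an array of i32, "a{sv}" for a dict of string -> variant.
        signature: String,
        out: Vec<u8>,
    }

    impl DBusSerializer {
        fn with_signature(signature: &str) -> Self {
            Self { signature: signature.to_owned(), out: Vec::new() }
        }
    }

    fn main() {
        // The signature duplicates what the Serialize impl already describes,
        // except for empty arrays, and nothing stops it drifting out of sync
        // with the actual Rust type being serialized.
        let _ser = DBusSerializer::with_signature("ai");
    }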
In my D-Bus library I decided to use my own Serializer setup as a workaround. serde's Deserializer setup still works, though, so that's not a problem.
> But why?
It's literally in the README...
• Less syntactic noise, more intuitive look.
• Allow comments and trailing commas.
• Write KEON almost like you write Rust:
- Humanized optional type annotation.
- Distinguish between tuples and lists (seqs).
- Arbitrary type as dictionary (map) keys.
...
• Shorthand for newtypes.
• Support using Base64, Base32 and Base16 to represent bytes.
• Provide paragraphs, which may be helpful when writing something by hand.
JSON is pretty terrible. Its only real pro is that it is widely used. As a comparison, here's the "Why RON" section for the native RON/Rust format:
Note the following advantages of RON over JSON:
* trailing commas allowed
* single- and multi-line comments
* field names aren't quoted, so it's less verbose
* optional struct names improve readability
* enums are supported (and less verbose than their JSON representation)
I feel like they are close enough that it would be better to just use RON, which has existing uptake/tooling.

So basically JSON5, while not being a standard.
> • Allow comments
Ok, but whence the comments when serializing, and what does the codec do with comments when deserializing?
Adding comments to JSON wouldn't be hard syntactically speaking, but semantically it's a real problem not just for JSON but for every encoding where the encoded data isn't the source of truth.
Put another way, if I maintain XML, JSON, etc. by hand with $EDITOR, then embedded comments are sensible, but if XML, JSON, etc. are generated from an internal representation, then we have to wonder: what elements of the internal representation are the comments associated with, how do we ensure we represent that association when encoding, and how do we ensure that on decode we get the comments associated with the same elements?
/* Is this comment associated
with the following value?
Surely not with the preceding
one since there isn't one! */
{ "foo" /* what about this one? */:
/* or this one? */ "bar" /* or... */
}
/* And what about this one? */
Just drop the comments. They are comments, which means they should be ignored when parsing. I just want to be able to parse files with comments instead of getting an error. Same for trailing commas.
Sure, no problem. But then your users will ask you to preserve comments. How do I know? Because we've seen that request in jq multiple times.
This has just enough type information in the encoded form. You can do that in JSON too, by convention, of course. There's also an aesthetics argument. Idk, I think this is fine, and probably desirable in some contexts, but yes, it's a bit of Nth-system syndrome.
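For example, one common convention for carrying the type name in plain JSON is serde's enum tagging attributes; a minimal sketch (the Item enum and its fields here are illustrative, assuming serde with the derive feature plus serde_json):

    use serde::Serialize;

    // Internally tagged: the variant name travels as an ordinary "type" field,
    // so this serializes to {"type":"IdCard","number":101}.
    #[derive(Serialize)]
    #[serde(tag = "type")]
    enum Item {
        IdCard { number: u32 },
        Note { text: String },
    }

    fn main() {
        let json = serde_json::to_string(&Item::IdCard { number: 101 }).unwrap();
        println!("{}", json);
    }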
Because JSON is explicitly meant for machines only?
Where in RFC 8259 is this stated?
Any encoding meant to be decoded by machines also needs to be encoded by machines, because humans are bad at hand-coding. This would be true for KEON as much as for JSON.
"data interchange format" is synonymous with what I said. Also consider there are no comments.
That's not how reality works.
Looks nice enough. How does it compare with RON? https://github.com/ron-rs/ron
More than a syntax, what I look for nowadays in a descriptive language is tooling, including a schema system, IDE plugins and library / build tool integration. Convenience makes it hard to dislodge the incumbents (toml, yaml, json).
I think UCL is probably the best of all. I wish it had more than one implementation and a formal specification.
RON uses parentheses to represent structs. "This is not Rusty at all!", I thought to myself. That is where the story begins: a project written out of OCD. In the end, KEON differs from RON in the following ways:
- Use braces `{}` to represent structs and maps.
- `macro_rules!` tells us that `expr` and `stmt` may only be followed by one of `=>`, `,`, or `;`. RON uses only `:` even though the left-hand side can be arbitrary. KEON adds `=>`, so now we have two ways to write key-to-value. This is why structs and maps can be unified: structs can be regarded as maps with strings as keys, and `ident: ...` is basically syntactic sugar for `"ident" => ...`.
- Since parentheses are freed up, we can use `()` to represent tuples and `[]` to represent vectors. Although they are all `seq`s to Serde, having this certainty in the output reassures me: a tuple's length is fixed, unlike a vector's.
- Serde allows some weird shapes, such as `struct AwfulNullary()`, which must go through `visit_tuple` rather than `visit_unit`, and likewise `enum Foo { AwfulNullary() }`. Even though these hardly ever come up, I insisted on getting them sorted out (see the sketch after this list).
- In RON, both types output `AwfulNullary()` when showing struct names, and only the backend knows whether it is an enum or a struct; that's unsettling to me.
- In KEON, the pretty form outputs `(AwfulNullary)()` and `Foo::AwfulNullary()`, and the minimal form outputs `()()` and `AwfulNullary()`, respectively. You can tell what's going on at a glance.
- Variants can be written anywhere as `Enum::Variant` or just `Variant`, exactly as in Rust. The redundant annotations help you quickly figure out what's there and jump to the corresponding location without relying too much on an LSP.
- Type annotations for structs are written as `(TheStruct)`, like a type cast in C, implying that the backend doesn't care what's inside. If the parentheses were omitted, `TheStruct` would be treated as a variant in most places (compare the turbofish problem), and I would not be able to write a usable parser at all. Although this isn't Rusty, it shouldn't be too obtrusive.
- The fact that RON doesn't guarantee it works with `deserialize_any` may have to do with these details. I believe KEON can support it, but more comprehensive tests are needed.
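To make the shapes above concrete, here is a small Rust sketch of the types in question; the comments note what I'd expect a conventional JSON backend (serde_json) to print for each:

    use serde::Serialize;

    // A fixed-size tuple struct and a Vec both look like sequences, but the
    // backend receives distinct serialize_tuple_struct / serialize_seq calls,
    // so a format *can* render them differently.
    #[derive(Serialize)]
    struct Pair(u32, u32);

    // Zero-field tuple struct: serde derive goes through
    // serialize_tuple_struct with length 0, not serialize_unit_struct.
    #[derive(Serialize)]
    struct AwfulNullary();

    // The same awkward shape as an enum variant.
    #[derive(Serialize)]
    enum Foo {
        AwfulNullary(),
    }

    fn main() {
        // JSON flattens all of these; a Rust-flavoured format has to decide
        // how much of the original type information stays visible.
        println!("{}", serde_json::to_string(&Pair(1, 2)).unwrap());          // [1,2]
        println!("{}", serde_json::to_string(&AwfulNullary()).unwrap());      // []
        println!("{}", serde_json::to_string(&Foo::AwfulNullary()).unwrap()); // {"AwfulNullary":[]}
    }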
Some other less Rusty things:
- `Option<T>` doesn't accept `visit_enum`; it only accepts `visit_some`/`visit_none`. I didn't want to special-case `Some(..)` and `None`, so I went looking on the keyboard and found the question mark `?` for it to use.
- Serde provides `visit_newtype_struct`; I figure this *must* have a purpose, so we'd better have corresponding syntactic sugar, namely `>`. Of course, things like `Item::IdCard(101)` are also legal. (Both the `Option` and newtype shapes are sketched right after this list.)
- Raw strings. KEON uses Backtick-Quote instead of r-Pound-Quote. This is because, when I want to turn a string into a raw string, I can't simply select it and hit `#` -- the selection gets overwritten directly, which annoys me. A backtick, on the other hand, can almost always enclose the selection automatically without worrying about ambiguity, requires less typing, and is just as intuitive.
- Correspondingly, raw identifiers use a backtick instead of r-Pound.
- Paragraphs, added purely out of preference. I wanted to see how much hand-writing would benefit from having this syntax in an indent-insensitive language.
- BaseXX. Well, they're free.
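And a similarly small sketch of the two shapes behind `?` and `>` (Option goes through serialize_some/serialize_none rather than the usual enum path, and a one-field tuple struct goes through serialize_newtype_struct); again assuming serde derive plus serde_json:

    use serde::Serialize;

    // Newtype struct: serde derive routes this through serialize_newtype_struct,
    // which is the shape KEON's `>` shorthand is sugar for.
    #[derive(Serialize)]
    struct IdCard(u32);

    fn main() {
        // Option<T> is special-cased via serialize_some / serialize_none,
        // not treated as an ordinary two-variant enum, which is why a text
        // format ends up wanting dedicated syntax for it (KEON picked `?`).
        println!("{}", serde_json::to_string(&Some(101u32)).unwrap()); // 101
        println!("{}", serde_json::to_string(&None::<u32>).unwrap());  // null
        println!("{}", serde_json::to_string(&IdCard(101)).unwrap());  // 101
    }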
I didn't expect KEON to spark so much interest. (Replying a bit late due to the time difference :P) (Edit: formatting)
Interesting, thanks! IMO this detailed motivation/design rationale should appear in the README or close to it.
> a project written out of OCD
The best ones. I have a few myself...
I too am curious. I like RON quite a bit for its similarity to Rust.
> human-readable
> syntactically similar to Rust
Choose one.
I kid, I kid, but Rust is not the easiest-reading language out there. It has the same problem as C++: a syntax that's not terribly straightforward to begin with, then liberally sprinkled with nearly every bit of punctuation that can be typed on a US QWERTY layout.
Rust is extremely unreadable if you're not used to it, but becomes one of the most readable languages once you are. My subjective experience, at least.
With some notable exceptions. I'll never love the turbofish [1], for example.
[1] https://github.com/rust-lang/rust/blob/master/tests/ui/parse...
I love Rust, but I would disagree. It is always somewhat difficult to read, partly because there is so much boilerplate. impl blocks and where clauses cause a lot of noise in some codebases, for example. Granted, it probably could never be as easy to read as, say, a subset of Python, simply because it is a very strongly typed systems language and therefore needs to describe a lot more.
Some of this is just preference at the end of the day.
I find modern TypeScript to be utter spaghetti and arguably worse on the points you've listed here.
Well, this project isn't the full Rust syntax, it's just Rust's object notation. I don't think it's any harder to read than JSON.
I find it really straightforward until lifetimes are involved. I used Rust for a few weeks last January.
This is mostly just confusing to me. It's almost Rust syntax, but not quite, and the divergences from Rust syntax don't really make sense (like the newtype syntax). What are the benefits of this over RON?
Feels like there are three different ways that JSON objects can be represented in KEON.
It might be more concise, e.g. in the newtype case, but the sacrifice seems to be quite a bit more cognitive complexity, and I would personally value simplicity over conciseness.
Item::IdCard > 101, // <- newtype variant.
That looks really weird.