I remember searching for a JSON library with minimal dependencies a while ago, and came across this:
https://rawgit.com/miloyip/nativejson-benchmark/master/sampl...
The variance in feature set, design and performance is huge across all of them. I ultimately landed on libjson, written in C: https://github.com/vincenthz/libjson
It does a lot for you, but it notably does not build a tree for you and does not try to interpret numbers, which I found perfect for adding to languages with C FFI that have their own collection and number types. It’s also great for partial parsing if you need to do any sort of streaming.
It looks like this one can’t currently do partial parsing, but it looks great if C++ maps/vectors are your target.
If you want to go extremely lightweight, there’s jsmn: https://github.com/zserge/jsmn
It does no dynamic memory allocation, which is a plus in constrained IoT/embedded applications. But it’s really only a tokenizer. For example, if you want to parse fields out of a map, you have to write your own wrappers to iterate over key/value pairs. Since no data is copied out of the original buffer, all the “tokens” are given as byte offsets and lengths, not null-terminated strings, so you can’t just do printf("%s").
If you can’t (or don’t want to) malloc, it gets the job done. Not sure I’d recommend it for other applications though.
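To make the "offsets, not C strings" point concrete, here is a minimal sketch of the kind of wrapper you end up writing. The Token struct is hypothetical; it only mimics the start/end byte offsets jsmn reports (as I understand it, end is one past the last byte), and token_text/token_equals are illustrative names, not part of jsmn.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical token type, modeled on the start/end byte offsets a
// jsmn-style tokenizer reports. Not the real jsmntok_t.
struct Token {
    int start;  // offset of the first byte of the token
    int end;    // offset one past the last byte of the token
};

// Return a view of the token's bytes inside the original buffer.
// Nothing is copied and nothing is null-terminated.
std::string_view token_text(std::string_view json, Token tok) {
    assert(tok.start >= 0 && tok.end >= tok.start &&
           static_cast<std::size_t>(tok.end) <= json.size());
    return json.substr(static_cast<std::size_t>(tok.start),
                       static_cast<std::size_t>(tok.end - tok.start));
}

// Compare a token against an expected key without building a C string.
bool token_equals(std::string_view json, Token tok, std::string_view key) {
    return token_text(json, tok) == key;
}
```

With a helper like this, looking up a field becomes a loop over the token array that calls token_equals on each key token, and the value token right after it is converted however your application needs.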
I actually evaluated and used jsmn and almost mentioned it in my comment. It was really quite cool, but I believe I couldn’t use it due to the lack of UTF-8 validation. Because UTF-8 validation is in the state machine for libjson, I can actually ignore incomplete UTF-8 escape sequences in incomplete JSON strings when streaming.
Is there a reason you can't do printf("%.*s", strlen, strptr); ?
The terminating character is still the closing double quote and not a null, since the library neither copies out nor alters the input. For example, tiny_json replaces the closing quotes with nulls to create C strings, but that needs the full file to be in a mutable buffer, which can be prohibitive for small controllers that only read some config from flash.
With the "%.*s" format you need no null at the end. It just counts out the characters:
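#include <stdio.h>
int main(void)
{
    /* Exactly 10 chars, so there is no terminating '\0' in this buffer. */
    char buff[10] = {'R', 'o', 'b', 'o', 't', 't', 'y', 'p', 'e', 's'};
    /* The precision argument (4) caps how many chars printf reads. */
    printf("=>%.*s<=", 4, buff + 3);
    return 0;
}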
prints =>otty<=
Ah. Ok. Scanning the length before printing is mandatory then.
You have to do (a variant of) one or the other, no?
Yes, that’s exactly what I’ve done.
I settled on this one too.
Far better than nlohmann's monster build times.
Somehow I didn't run across that one in my searching - I'll check it out. I've been working on a json C library myself:
https://github.com/nwpierce/jsb
My goal was to convert a stream of JSON to/from a binary stream that is easier to traverse and manipulate.
Did you try cJSON? Works well for me. https://github.com/DaveGamble/cJSON
Compile time is largely a "developer problem", but so is the usability of a library. nlohmann/json's main selling point is that its interface is usable. Whether a developer values usability at typing time over compile time is an interesting thing to ponder for sure.
Compile time is a collective problem and usability is an individual problem. I work with llama.cpp. The files in that codebase that were made using nlohmann json take about a minute to compile using g++ -O3 -g, all because one guy who originally wrote it wanted to type fewer keystrokes on his keyboard by using a more magical library, and the rest of us have to suffer for it every time we experiment with a one-line code change to those files.
> (...) and the rest of us have to suffer for it every time we experiment with a 1 line of code change to those files.
If you feel this is an issue then why don't you move it to an independent submodule that can be compiled independently? That means you can build it in parallel along with the whole project, and in the end you just link the resulting binaries.
> If you feel this is an issue then why don't you move it to an independent submodule that can be compiled independently?
If it’s a header, you necessarily can’t. Header gets included every time you want to compile code that depends on a header.
Compilers may offer precompilation etc but if the code you want to change has direct dependency to a large header you need to recompile all of the dependencies.
This is one of the pain points of C++.
It is a pain point of build management regardless of the language. Even in a language with proper modules you can get a cascading rebuild if the public interface or module ABI is affected.
C++ modules are here. Unfortunately, outside the latest VC++ and clang, plus MSBuild or CMake/ninja, they are not an option.
Are they? According to some people (GitHub issue to support C++ modules in VSCode) the standard is a mess and is likely to go away. VSCode doesn't support modules atm.
Visual Studio is what matters.
VSCode is never going to be as good; you are better off with CLion then.
This!
As a Windows/Mac user, Visual Studio is my preferred tool, but on macOS CLion (with VSCode for a few random workflow things not supported in CLion) is an adequate replacement (but Visual Studio remains king).
VSCode can be used as an industrial editor if one likes to, but if it does not feel right, it’s not a skill issue.
I just wrote a new server instead. There's nothing I won't do, no lengths I'm not willing to go, when it comes to cutting back on build latency.
I follow the same philosophy, to the point where I barely use the STL; most of that template-heavy junk has been replaced in most of my projects. For instance, most of what I typically used <iostream> for was replaced with a 150-line .h (plus a 50-line .cpp that uses explicit template instantiation and a <charconv> include). {fmt} was too heavy for me. And I'm locked into C++17 because C++20 seems to double down on the 20k-line header madness.
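For anyone who hasn't used that trick, here is a rough sketch of the explicit-instantiation pattern described above. The to_text name and the file split are hypothetical, not the poster's actual code; in a real project the <charconv> include would live only in the .cpp part, so files that include the header never see it.

```cpp
#include <cassert>
#include <charconv>
#include <string>
#include <system_error>

// ---- to_text.h (hypothetical): declaration only, cheap to include ----
template <typename T>
std::string to_text(T value);

// ---- to_text.cpp (hypothetical): definition + explicit instantiations ----
// Only this file would need <charconv>; callers just link against it.
template <typename T>
std::string to_text(T value) {
    char buf[32];  // large enough for any 64-bit integer in base 10
    std::to_chars_result r = std::to_chars(buf, buf + sizeof buf, value);
    assert(r.ec == std::errc());
    return std::string(buf, r.ptr);
}

// The template is compiled once, here, for the types the project uses.
// Calling it with any other type fails at link time, which is the trade-off.
template std::string to_text<int>(int);
template std::string to_text<long long>(long long);
```

The cost is that every supported type has to be listed by hand; the benefit is that the template body is parsed and instantiated in one translation unit instead of in every file that formats a number.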
When I was stuck with C++ codebases that forced me to take a mandatory coffee break every time I needed to run a bit of new code, it made me a little bit insane! Never again.
> I just wrote a new server instead.
I'm sorry, this makes no sense at all. Why would anyone write a new server just because a small component was taking a minute to build?
It makes no sense at all that a person concerned with slow build times rewrote a slow component to compile faster?
While the pursuit of faster build times is definitely a worthy cause, I feel like there's something I'm not quite seeing here. Does the JSON-code change frequently enough to incur build cache misses and the full minute penalty? Is there something inherent about the structure of the library that makes it unable to have its compilation be cached? Is the code structured in such a way that editing other code requires also invalidating the cache for the JSON-related code? I guess one way would be to break out the JSON parsing code to its own module and have it produce language-specific structs to be interacted with by the rest of the program.
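A rough sketch of what "break it out into its own module" could look like, under the assumption that the rest of the program only needs a couple of plain fields. All names here (ServerConfig, parse_server_config, raw_value) are hypothetical, and raw_value is a deliberately naive stand-in for a call into a real JSON library so that the sketch stays self-contained.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// ---- server_config.h (hypothetical): cheap to include everywhere ----
struct ServerConfig {
    std::string host;
    int port = 0;
};
// Declared here, defined in exactly one .cpp file.
ServerConfig parse_server_config(const std::string& json_text);

// ---- server_config.cpp (hypothetical): the only file that would ----
// ---- include the heavy JSON header                              ----
namespace {
// Stand-in for a real JSON library: finds "key": and returns the raw value
// text. It ignores nesting and escapes; it is NOT a JSON parser.
std::string raw_value(const std::string& text, const std::string& key) {
    const std::string needle = "\"" + key + "\"";
    std::size_t pos = text.find(needle);
    if (pos == std::string::npos) return "";
    pos = text.find(':', pos + needle.size());
    if (pos == std::string::npos) return "";
    const std::size_t begin = text.find_first_not_of(" \t\n\"", pos + 1);
    if (begin == std::string::npos) return "";
    const std::size_t end = text.find_first_of(",}\"", begin);
    return text.substr(begin, end == std::string::npos ? end : end - begin);
}
}  // namespace

ServerConfig parse_server_config(const std::string& json_text) {
    ServerConfig cfg;
    cfg.host = raw_value(json_text, "host");
    cfg.port = std::atoi(raw_value(json_text, "port").c_str());
    return cfg;
}
```

Editing any caller then recompiles only the small header, not the JSON library. Whether that is workable depends on how much of the program genuinely needs to touch JSON values directly.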
Programming is the process of manipulating data structures, so if you're building a JSON server, then every piece of code in your server is going to be dealing with and operating on JSON data structures. It can't be neatly tucked away in a corner. Because it would be foolish to design a server that makes needless copies of all its inputs and outputs. This truth would be the same if you were using something like protobuf instead. Therefore it's important that your fundamental data structures be something that (a) you can control, and (b) doesn't make everything it touches take forever to build. Do you feel in control of someone else's 24000 line header full of template magic? If that thing is sitting between me and my data structures, then I will wipe it out of existence.
It seems like nlohmann/json is a header-only library, meaning the entire library has to be compiled once for every source file which uses it, any time that source file or one of its includes is updated.
So I guess in a JSON-heavy code base or a code base where nlohmann/json has leaked into common headers, you may end up recompiling the library a few dozen times per build where a few dozen of your C++ source files must be recompiled (e.g. due to common header changes)...
(But don't worry, the linker will then spend a bunch of time throwing away almost all of that work so you only get one copy of the library in your binary)
I missed that part. That is a pretty significant downside in that case.
> Does the JSON-code change frequently enough to incur build cache misses and the full minute penalty?
The moment you switch branches - it changes.
If you develop for Android, it generates a build directory whose name is a hash of some CMake/Gradle variables. The moment one of those changes (like the AGP version) you get a new build dir and essentially have to compile from scratch.
If you're on something reasonably smart like Bazel it will be able to determine whether the module itself has been changed and requires recompilation instead of running from cache.
Nice.
We, and the majority of Android projects, aren’t on Bazel, though.
This is true, and it's kind of a bummer to be honest. There's some serious time being wasted on recompilation that could be avoided with a really sharp build system.
Bazel comes with its own bag of sharp edges though so it's unfortunately not like you can just adopt it and be on your merry way.
As a prolific contributor to open source yourself, I’d have expected you to be a little more sympathetic to other open source developers giving up their time freely.
Some contributors have a day job, a family and other personal commitments, so writing open source code is a luxury they don’t have a lot of time for. I know this because I fall exactly into that camp myself.
I'm defending open source developers. We can't freely modify open source code if it has glacial build times. It's specifically because people are volunteering that we should aim to be as conscientious as possible when it comes to build latency. Someone who volunteers to contribute code that compiles slowly is not being respectful of the time of all the other volunteers, which is like pumping the brakes on the open source movement. So I will make my views clear that development practices need to improve.
Just because they give up their time freely, does it make their decisions immune to criticism?
Constructive feedback is fine. jart's comment wasn’t that.
We will never know if jart's comment was constructive or not until we know the original developer's decision process.
If the original decision process was indeed “fewer keystrokes”, then how is that not constructive criticism?
I don't think a supposedly bad decision has to be answered with being snarky. A pull request, or a fork focused on reducing build times are actual net gains. From that poster's original name, seems like they went on and did just that, which is great I believe.
At the very least, giving the original developer the benefit of doubt, or assuming their decision made sense under the circumstances they were in at the time, is IMO a better start than just public criticism.
The developer's motives don’t change the snarky way jart wrote their comment.
And if you felt their comment was acceptable then I question how much you’ve contributed to open source yourself. Snarky comments like jart's are all too common and really demotivate people from maintaining popular projects.
But don’t just take my word on it, there’s a plethora of other contributors who’ve talked about this topic as well.
Where's the snark though? jart's comment reads true literally.
Compile times are a big deal, and 'jart is right about individual vs. collective problems. And unlike most other critics on the Internet, 'jart actually provided a solution along with the criticism. If that kind of behavior "demotivates [some] people from maintaining popular projects", I still feel it's a net win.
> Compile time is largely a "developer problem", but so is the usability of a library.
Compile time is way more than a "developer problem". It's an operational problem that ends up permeating software architecture and development practices, and ultimately affects how the whole project is delivered and deployed.
Significantly faster compilation means less friction to iterate ideas, try things, which in the end lead to more polished results.
A nice interface is agreeable, but maybe there are diminishing returns when you pay for it with long compile times. I remember pondering that when working with the Eigen math library, which is very nice but such a resource hog when you compile a project using it.
On the other end of the spectrum there is [1]. It's both performance and usability oriented, although compile times are probably higher.
Nlohmann is the slowest out of the popular libraries, AFAIK, and not particularly more usable than rapidjson, in my experience. So "better than nlohmann" is not very novel.
Really interesting that nlohmann isn't fully compliant. What cases are these?
It seems to me though that if you're encountering the edges of json where nlohmann or simple parsing doesn't work properly, a binary format might be better. And if you're trying to serialize so much data that speed actually becomes an issue, then again, binary format might be what you really want.
The killer feature of nlohmann is the NLOHMANN_DEFINE_TYPE_INTRUSIVE and NLOHMANN_DEFINE_TYPE_NON_INTRUSIVE macros, which handle all of the ??? -> json -> ??? steps for you. That alone makes it my default go-to unless the above reasons force me to go another direction.
The moment nlohmann's library came out, I switched to it and I never looked back.
I loved the interface and it's exactly how I would've designed a JSON library with modern C++.
Just maybe turn off the implicit conversion option, that can get a bit messy ;)
"This project is a reaction agains..." is such a punk move I can't do anything but appreciate.
jart is such a good programmer. a lot of people already know this but i just have to give props where it's due.
What does “Classic C++” mean?
This library is nicely concise, and the code is mostly readable (although there are some non-obvious tricks that could be better documented).
The Makefile could use some work:
json_test.cpp:360:23: warning: missing terminating '"' character [-Winvalid-pp-token]
{ Json::success, R"({
^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
9 warnings and 20 errors generated.
make: *** [json_test.o] Error 1
% c++ --version
Apple clang version 15.0.0 (clang-1500.1.0.2.5)
Target: arm64-apple-darwin22.6.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
Compiling directly with c++ --std=c++11 -c json.cpp
works fine, though.
There are approximately three major dialects of C++. They are distinguished by major changes in what idiomatic code looks like, enabled by the addition of core features to the language that made it more efficient and type-safe to express many things.
The era of so-called “modern” C++ started with C++11, which was a radical reworking of the language. All prior versions of C++ are “legacy” or “classic”. Idiomatic code in “modern” and “classic” dialects almost look like different languages.
C++20 arguably marks a new dialect break but it doesn’t have a colloquial label to distinguish it from “legacy” and “modern” AFAIK. Idiomatic C++20 looks pretty foreign from a C++11 perspective (but is unambiguously an improvement).
This library supports building with C++11. I haven't tried compiling it with an older standard, but I imagine it might work. One thing I like about the C++11 compilers like GCC 4.9 is they build code magnificently faster than recent editions. See https://x.com/JustineTunney/status/1795427808631758936
> This library supports building with C++11. I haven't tried compiling it with an older standard, but I imagine it might work.
I believe it does require C++11, due to std::nullptr_t and r-value references (&&), but that might be it. It's not a show stopper though since everyone should have a c++11 compiler now (even Ubuntu 14.04 LTS, which still has paid support I believe).
> One thing I like about the C++11 compilers like GCC 4.9 is they build code magnificently faster than recent editions
Kind of reminds me of gcc 2.95 which people kept around for the compiler speed. They would use gcc 3.x for the warning support and then compile with gcc 2.95 after fixing the warnings :).
Yes they'd be very trivial to remove locally. It might also be nice to have #ifdef statements around them like we're already doing for std::string_view. If we consider that many big name C projects like curl are still on C89 then there's surely got to be people still out there using 2000's era C++.
> It's not a show stopper though since everyone should have a c++11 compiler now (...)
I think the point of pointing out it's C++11 is that it's not "classic C++" as it's using "modern C++" features. Thus it's a mystery why it would be referred to as classic C++.
Just because I included an rvalue constructor doesn't make it C++11. This library was originally written in C. It hasn't changed a whole lot since Gautham and I originally wrote it: https://github.com/jart/cosmopolitan/blob/master/tool/net/lj... I feel perfectly comfortable calling C++11 "classic" or even "baroque" compared to what people are doing with C++ in 2024. However if you disagree with me, and feel that classic means C++03, then I've made certain that your preferences are supported by this library too. Just remove the rvalue and nullptr_t constructors. I'll probably add #ifdefs soon to automate that too.
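For illustration, this is roughly what guarding a C++11-only constructor with #ifdefs could look like. It is only a sketch on a made-up Value class, not the actual json.cpp source, and it assumes the compiler reports its language level through __cplusplus (MSVC keeps reporting 199711L unless /Zc:__cplusplus is passed, so a real guard may need an extra check).

```cpp
#include <string>
#include <utility>

// Hypothetical class used only to show the guard; not json.cpp's Json type.
class Value {
  public:
    // Available in every dialect, including C++03.
    explicit Value(const std::string& s) : text_(s) {}

#if __cplusplus >= 201103L
    // Only compiled when the compiler advertises C++11 or newer:
    // steals the buffer of a temporary instead of copying it.
    explicit Value(std::string&& s) : text_(std::move(s)) {}
#endif

    const std::string& text() const { return text_; }

  private:
    std::string text_;
};
```

On an older compiler the temporary simply binds to the const reference overload and gets copied, so callers do not have to change.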
"Classic C++" and "Modern C++" refer to the language before and after C++11, respectively.
Some of the key differences are use of standard library and its containers, smart pointers, and other language features that look less like C. In this specific library, this refers to some of the techniques like bit manipulation, manual memory management and string parsing, and using things like enums to improve speed and reduce complexity.
An example of a more robust (but still "classic") library would be something like https://github.com/Tencent/rapidjson.
https://github.com/jart/json.cpp/blob/4f0a02dab1af7d81888cf5...
The response doesn't tell you the location of the problem in the input.
That might actually be the explanation for why json.cpp benchmarks 39x faster than nlohmann's library if I include the failure test cases.
Code in jart's version is refreshingly clean and easy to read compared to nlohmann's version.
As an aside, I wonder: what are the ThomPike* set of macros actually doing in jart's implementation?
Also, a speed comparison of this vs the other one would be very welcome: conformance and simplicity are certainly important criteria when picking a JSON parser, but speed is rather crucial.
Thompson Pike encoding. It predates the UTF-8 standard and was invented on a napkin in a New Jersey diner. It allows the full spectrum of 32-bit numbers to be encoded, rather than restricting characters to only those also present in UTF-16. The json.cpp library enforces UTF-8 restrictions on parsing, because we have no choice. But you're allowed to serialize anything you want, thanks to the ThomPike macros.
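For the curious, a minimal sketch of the classic scheme, assuming the original 1-to-6-byte layout that UTF-8 later restricted to 4 bytes. This is not the ThomPike* macros from json.cpp; encode_thompson_pike is a hypothetical helper, and it only covers values up to 0x7FFFFFFF (31 bits), returning an empty string for anything larger.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: encode a value of up to 31 bits as 1 to 6 bytes.
std::string encode_thompson_pike(std::uint32_t c) {
    if (c < 0x80) return std::string(1, static_cast<char>(c));  // plain ASCII
    if (c > 0x7FFFFFFFu) return std::string();  // not representable here

    int n;  // number of continuation bytes that follow the lead byte
    if (c < 0x800) n = 1;
    else if (c < 0x10000) n = 2;
    else if (c < 0x200000) n = 3;
    else if (c < 0x4000000) n = 4;
    else n = 5;

    std::string out(static_cast<std::size_t>(n) + 1, '\0');
    // Fill continuation bytes from the end: 10xxxxxx, six payload bits each.
    for (int i = n; i > 0; --i) {
        out[static_cast<std::size_t>(i)] = static_cast<char>(0x80 | (c & 0x3F));
        c >>= 6;
    }
    // Whatever is left must fit in the lead byte's payload bits.
    assert(c < (1u << (6 - n)));
    // Lead byte: n+1 one bits, a zero bit, then the remaining payload.
    out[0] = static_cast<char>(((0xFFu << (7 - n)) & 0xFFu) | c);
    return out;
}
```

For code points that UTF-8 allows, this produces the same bytes as UTF-8 (U+20AC becomes E2 82 AC); the difference is that it keeps going past U+10FFFF with 5- and 6-byte forms.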
What are the performance numbers? nlohmann/json is no speed demon.
I've added benchmarks to the readme. https://github.com/jart/json.cpp?tab=readme-ov-file#benchmar... You're looking at a 2x or 3x performance advantage across the board. If you include invalid JSON handling, then 10x or more.
This is a fine library, but I use nlohmann extensively and haven't experienced any considerable compilation slowdown once I added it to the project.
Overloading from_json to modularize parsing is really useful, I think that should be a part of every templated C++ json parser library.
That said, I have seen these ThomPike* macros in cosmopolitan.h before, I wonder what the origin is.
Sounds like there's a backlash to modern C++.
Interesting approach, but not providing a Conan/vcpkg package at the end of 2024 only adds friction.
We are not living in the 90s anymore.
Dunking on nlohmann for performance is pretty easy. I’m interested in what the value proposition is over one of rapidjson, glaze, or simdjson (all of which have some amount of SIMD or SWAR optimization, and more importantly SAX and the use of something other than std::map)