Technium Adeptus: July 2019

JSON is really popular for passing small to medium amounts of data around the web. It is commonly used in AJAX applications for communication between the client and server. JSON can be really handy, because it can represent objects in a human readable form. This is great for debugging, and makes certain aspects of writing web applications a breeze. Unfortunately, it is also terribly wasteful, and most of the time it is unnecessary.

JSON is not the only communication protocol that has problems. In fact, most popular protocols designed to be human readable have serious problems. The two most common ones are JSON and XML. The primary problem in both of these is the passing of unnecessary metadata. For example, most JSON applications pass data that has a well defined, static format. Metadata does not need to be passed with this kind of data, because the metadata is implied. Imagine passing a C struct or a Java object with metadata about the variable types and names. Perhaps for regular JSON users this seems normal and reasonably, but it is not. The metadata in C-like languages will take up several times the memory, bandwidth, and processing power of the data itself, and this is true even if making it all human readable is not a priority. C structs are a really good example here, because they are one of the most well defined, statically formatted, composite data types one can use. In all but the rarest cases, a struct can be passed across a network or some other interprocess communication media, and as long as the struct definition is identical on both sides, the end product will be identical, without the need for any metadata to consume bandwidth. Adding metadata consumes more memory, and it does not just consume a little bit more memory. A standard integer might be 4 to 8 bytes. If human readability is not important, type metadata can use as little as 1 byte. The attribute name could be anywhere between 4 bytes and 16 bytes, depending on coding style (yes, it could be 1 byte, but then your code is no longer human readable, and that is a serious problem). The metadata will generally fall between two and four times the size of the data itself. If the type metadata is human readable, it will likely be between 3 and 6 bytes, making the metadata between three and five times the size of the data itself. JSON avoids sending much type data, because JavaScript is so loosely typed that the parser can just infer the type from the data, but it still has the attribute names, which are often several times the length of the data, even with the data being in human readable format (which is its own problem). XML is far worse. XML is designed to track metadata about metadata. In fact, it is designed to be as deeply nestable as desired. It can track metadata about metadata about metadata and so on as deeply as one may wish to plunge. XML attributes also add yet another form of metadata on top of the hierarchical metadata. It is so human readable that it is actually quite difficult to read without an editor that will collapse the hierarchy to avoid displaying all of the meaningless metadata all at once. In the vast majority of applications, there is no need to pass metadata. In many cases, having the metadata does not even improve convenience for the programmer. The metadata can be useful for debugging, but well designed debugging and testing tools can abstract that need away quite easily.

The other major problem with JSON, XML, and other human readable protocols is general verbosity. Technically this is also a metadata issue, but it exists at a higher level than attribute names. Structural metadata includes braces, brackets, parenthesis, commas, quotation marks, colons, equal signs, and anything else defining the structure of the data or the relationship between data and metadata. Unsurprisingly, this stuff also takes up a lot of memory, bandwidth, and processing power. XML is the biggest offender here, by far, but JSON is not that far behind. The minimal JSON example looks something like this:

{"var":12}

The data being passed is the number 12. The attribute name is much smaller than typical, and the only structural metadata is the mandatory outer braces, the quotation mares, and the colon. This is 10 bytes total. To parse it, one must find the location of the two braces, find the colon, check for commas to see how many attributes the object has, find the location of the quotes between the opening brace and the colon, compare the characters of the text between the quotes with the desired attribute name, and then read the characters between the colon and the closing brace and convert it to a numerical value. Yes, JavaScript and any JSON library will do this for you. No, this does not made it free. Parsing JSON consumes a lot of processor power. The above process takes nine steps. The steps for comparing the attribute name and parsing the value both take a number of steps equal to their length, and the conversion of the number to a numerical data type requires multiple steps per character. Complex JSON with multiple attributes, arrays, and nested objects gets far more complex. If the value will always be fairly small, this value could be send in a single byte. The client knows what to expect, because it requested the data. The metadata is unnecessarily, because the client already knows the metadata. Wrapping it in a JSON object is wasteful. Send as a single byte, no parsing is necessary. A single byte uses a tenth of the bandwidth and memory. Even if the client does not know exactly what to expect, in the vast majority of cases, a single byte is enough to tell it. That's 256 possible things the client could expect, and few applications have more than that. No, it is not as human readable, but ultimately what is the purpose of writing the application? Is the goal maximum convenience for the developers or is it getting a decent quality product to the consumers? Yes, there is some flexibility here, but using JSON in an application like this is absurd. It is not even lazy, because writing the code to send and process the JSON is more work than sending and receiving one byte of data. How does XML measure up? Let's assume we are using raw XML that is non-compliant, so we can skip things like the doctype line, the xml version line, and such. The above expression, in its simplest form, might be rendered this way:

<var>12</var>

Of course, compliant XML would include an xml version line and a doctype line, the above would be enclosed in a root tag of some sort, and the tags might have attributes. Alternatively, the root tag could have an attribute named "var" set to 12. Whatever the case, this non-compliant, simplest form is 13 bytes total, which is 3 bytes longer than the JSON version. Parsing is also a problem, though perhaps simpler than the JSON. First, the opening tag must be parsed, by finding the bracketing greater-than/less-than symbols. The "var" text is then compared with the expected attribute name. The parser must then search for the closing "var" tag. If the parser is a full XML parser, it will also have to put some CPU time into making sure the closing tag belongs to the opening tag and another tag at a different nesting level. If not, all it has to do is verify that the closing tag matches the opening tag. At this point it will already know where the opening tag ends and closing tag begins, so reading the value and converting it to an integer is all that is left. In the best case, we are looking at seven steps with a few sub-steps. In this case, the non-compliant XML takes up 30% more memory and bandwidth than the JSON, but it costs around 20% less processing power to parse. If we enforced standards compliance, the XML would be several times larger and it would take much more CPU time to parse as well. Again, just passing the number 12 as a single byte would be far cheaper, eliminating the parsing and minimizing memory and bandwidth. In terms of natural language, this is like the difference between the answer, "I saw twelve cars" and "Twelve" when someone asks how many cars you saw. The first answer is more complete, but the second answer is entirely appropriate, because the metadata is already included in the context of the question. Similarly, if a client asks the server for the value of a particular variable, it might be more complete to respond with an object containing the context, but doing so is unnecessarily verbose, because the context was already included in the request.

There are cases where passing large amounts of metadata may be appropriate. Most reasons for using protocols like this, however, are actually just laziness. For example, JSON and XML do not care much about order. An object with multiple attributes can be passed with the attributes in any order. This isn't a valid excuse though. It's noting more than nesting a poorly defined protocol inside of another protocol that contains metadata, so you can avoid the consequences of lazy coding. A well defined protocol would define length, order, type, and meaning of each piece of data being sent, eliminating the need to send any metadata. In the case where length cannot be predefined or would be unacceptably inefficient to predefine, the protocol would provide a dedicated place to provide the length metadata, but it would not include any unnecessary metadata. The only place where large amounts of metadata are necessary are places where it is impossible or infeasible to use a well defined protocol. Several examples of this are found in web sites. HTTP and HTML are weakly defined protocols. HTTP has a few somewhat positional elements, but the rest is defined by data not by position or length. HTTP headers must be parsed, they cannot be treated like well defined, statically formatted structs. This is necessary though, because applications of HTTP vary so dramatically. Every server cannot define its own unique protocol and expect every client to understand it, nor can every client define its own unique protocol and expect every server to understand it, and the W3C cannot be expected to create a separate standard for every single web site. Some servers may need to communicate different information from others. Some web sites may need to communicate different information from others. Thus a flexible and dynamic approach is necessary. The same is true of HTML at the presentation layer. Every web pages doesn't need the same organization. CSS must by dynamic, because every web page doesn't need the same style and layout. But unless your web application is going to be communicating with a lot of different systems that might have a lot of different needs, there is no reason to use such a flexible and dynamic protocol. Flexible and dynamic protocols are wasteful and inefficient, and they are far more prone to bugs than compact, well defined, statically formatted protocols. Less well defined protocols are also prone to bloat, because they typically ignore extraneous white space. For the applications where they are needed, this is a useful feature, but for well defined applications, this is wasteful and potentially harmful to clients. Bandwidth, memory, and time are not free, after all.

JSON isn't a useless protocol. Dynamic protocols like JSON and XML have their place. That place, however, is not in applications with well defined communication needs. When the need can be filled by designing a much simpler, much more elegant, and much more efficient protocol, then designing and using such a protocol should be the solution. Only when a simple, elegant, static protocol is impossible or unfeasible should a stock dynamic protocol like JSON or XML be used. And even in those cases, it may still be appropriate to use a simplified version, if the parser you are using can handle it.

The real reason I don't like JSON isn't that JSON is bad. The real reason is that it is used as a default in applications where it really shouldn't be and where it actually wastes time instead of saving it.

Technium Adeptus

Sunday, July 21, 2019

Why Reusable Software is Bad

Why I don't Like JSON