
Almost every program, particularly in compiled languages, needs some type of config file to store database connection strings,

It’s become very common for JSON to be used by package managers and tools for configuration, perhaps due to its simplicity or ubiquity, but is it really a good choice?

To answer this, let’s first list the features we want in a good configuration language:

  1. Easy to be understood and edited by humans
  2. Simple to parse
  3. Expressive power to cleanly and accurately represent the data
  4. Simplicity (along with #1)

Let’s explore the ways that JSON does not meet these criteria:

Limited Type Support

JSON supports only six types: string, number, boolean, null, object, and array. While this generally lets you mark up almost anything, a critical item is missing: a way to represent a date or time unambiguously and in a format that is easy for humans to understand.

Yes, you can just agree to use a timestamp, or a string containing an ISO 8601 date, but these conventions require extra steps that the parser cannot handle, and they make it unclear to the user how to work with the format.

Other noteworthy omissions include the ability to differentiate between integer and floating-point numbers, and the ability to specify date ranges.
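As a minimal sketch of that extra step in Python (the `expires` key and its value are made up for illustration): the standard `json` module hands back a plain string, and the application has to know, out of band, that the string should be parsed as a date.

```python
import json
from datetime import datetime

# Hypothetical config; the "expires" key is an illustrative assumption.
raw = '{"expires": "2024-03-01T12:00:00+00:00"}'

config = json.loads(raw)

# The parser cannot know this is a date, so the value stays a string.
value = config["expires"]
print(type(value).__name__)

# The application has to convert it by convention, as a separate step.
expires = datetime.fromisoformat(value)
print(expires.year)
```

Nothing in the document itself tells a reader, or another program, that this conversion is expected.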

Rigidity

JSON has almost none of the conveniences that make a format easier to use, such as underscores as number separators, trailing commas in lists, or hexadecimal literals.
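A quick way to see this, sketched with Python’s standard `json` module (the keys and values below are invented for illustration, and `is_valid_json` is a hypothetical helper, not part of any library):

```python
import json

# Each snippet uses a convenience that strict JSON does not allow.
snippets = {
    "trailing comma": '{"ports": [80, 443,]}',
    "underscore separator": '{"max_bytes": 1_000_000}',
    "hex literal": '{"mask": 0xFF}',
}


def is_valid_json(text: str) -> bool:
    """Return True if the text parses as strict JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Map each convenience to whether the parser accepted it.
results = {name: is_valid_json(text) for name, text in snippets.items()}
print(results)
```

All three snippets are rejected, while the same data written without the conveniences parses fine.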

Comment Support

This is by far the largest of the problems. When building a configuration file, particularly for others to use, you’ll almost certainly want to document some of the parameters and what they do or how they work, or at least have the ability to comment things out temporarily for testing purposes.

There is simply no way to do this in JSON. Sometimes this is worked around by adding comments directly to the markup:

{
    "dependencies": {
        "This is a comment to explain why we need the dependency": "repo/thing"
    }
}

But this is far from ideal, and it’s not easy to understand at first glance what’s happening. Comments are essential in config files. Consider how much clearer this would be if JSON allowed it:

{
    "dependencies": [
        "repo/thing" // needed for feature X
    ]
}
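The commented version above reads well, but a strict parser will not accept it. A small sketch with Python’s standard `json` module, feeding it the same text:

```python
import json

# The commented example from above, verbatim.
commented = """
{
    "dependencies": [
        "repo/thing" // needed for feature X
    ]
}
"""

try:
    json.loads(commented)
    parsed = True
except json.JSONDecodeError as err:
    parsed = False
    # Report where the parser gave up.
    print(f"rejected at line {err.lineno}: {err.msg}")
```

Some tools work around this with their own comment-tolerant dialects, but a file written that way is no longer portable JSON.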

Cleanliness

There’s a lot of “noise” in JSON documents. Almost everything must be quoted, and representing almost anything requires a pile of nested data structures, usually coupled with a lot of indentation. This hurts readability.

What should you use instead?

If you’re using npm, yarn or composer there’s really nothing you can do. You’re stuck, because the developers of these tools made a bad choice for you. However, if you have the ability to use something else for your program, try one of these:

YAML

As of version 1.2, YAML is actually a superset of JSON, so you can improve the situation immediately by switching to YAML: you get instant access to comments and the additional data types without making any other changes to the config document.

However, YAML still has major drawbacks: it has a ton of complex features and depends heavily on spacing for structure, which makes it challenging to work with once the config file is complex enough that you can no longer eyeball the nesting level.

The language of your program

You may not need a config file at all! If you’re working in a scripting language like PHP or JavaScript, you can just use the language itself. WordPress uses this technique with wp-config.php, which is easy to understand and familiar to any PHP developer:

/** The name of the database for WordPress */
define( 'DB_NAME', 'blog_wp' );

/** MySQL database username */
define( 'DB_USER', 'blog_wp' );

/** MySQL database password */
define( 'DB_PASSWORD', 'hunter2' );

/** MySQL hostname */
define( 'DB_HOST', 'localhost' );

/** Database Charset to use in creating database tables. */
define( 'DB_CHARSET', 'utf8' );

/** The Database Collate type. Don't change this if in doubt. */
define( 'DB_COLLATE', '' );
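The same idea carries over to other scripting languages. As a rough sketch, here is what an equivalent might look like in Python: a hypothetical `settings.py` module that the application simply imports. The constant names mirror the WordPress example; `DB_PORT` and `DB_URL` are additions made up for illustration.

```python
# settings.py: a hypothetical config module mirroring the constants in
# the wp-config.php example. The application would `import settings`.

# The name of the database
DB_NAME = "blog_wp"

# Database username
DB_USER = "blog_wp"

# Database hostname
DB_HOST = "localhost"

# Assumed port for illustration (3306 is the MySQL default).
DB_PORT = 3306

# Values can be derived from other values, which JSON cannot express.
DB_URL = f"mysql://{DB_USER}@{DB_HOST}:{DB_PORT}/{DB_NAME}"

print(DB_URL)
```

Comments, native types, and computed values all come for free, at the cost of the config file being executable code.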

Custom purpose-built Format

If you’re building a configuration-heavy system, you’re probably doing it wrong: you should probably just allow the user to write code instead.

On the slim chance you’re not doing it wrong, you’ll probably want to create your own config format, as a well-designed purpose-built format will be much easier to work with than a generic language.

Consider the Nginx config format, for example.

TOML

TOML is a great all-around generic option for a config format, combining simplicity with some convenience factors. It’s effectively a modern take on the INI file, and addresses nearly all of the criteria I mentioned.

The main drawback of TOML is that, depending on your programming language, it’s not as well supported as some of the more popular options.

TOML is more often than not the tool I reach for when I get to choose the configuration language.
