Why YAML Beats JSON for Configuration: A Guide for Python Developers


Do you find JSON tedious to write? Does it feel like a format built for machines rather than for people?

Let's be clear: If you've been using Docker Compose files or setting up Kubernetes configuration for some time, you're probably familiar with YAML. It's widely used, and there are good reasons behind its use.

YAML looks natural to the human eye and is easy enough for a 10-year-old to understand. There are no weird brackets or complex punctuation, just keys and values cleanly organized with indentation. It's much more readable than JSON, which often looks like something a machine would write.

In this guide, we'll explore how to work with YAML in Python. You'll learn why YAML is better than JSON for configuration files and managing complex data structures. Let's skip the boring "Hello World" examples and dive straight into useful programs for real projects.

What Do You Need to Learn YAML?

Before you work with YAML, make sure you have:

  • Python 3.8 or newer installed on your computer
  • A good text editor or IDE (VS Code, PyCharm, or even Sublime Text 3)
  • Basic Python knowledge, since we'll use variables, dictionaries, lists, and basic file handling
  • Familiarity with installing packages using pip and running Python scripts
  • A general idea of how data is stored and shared between programs

How do you check if Python is installed on your computer?

Open your terminal and enter:

python --version
# or
python3 --version

Setting Up PyYAML

PyYAML is the most widely used YAML library for Python.

Install it with pip:

pip install pyyaml

Want to verify that it worked? Run this in your terminal:

python -c "import yaml; print(yaml.__version__)"

If you see a version number, you can go to the next step.

YAML vs JSON: What's the Difference?

Here's a simple example to show you why developers keep reaching for YAML.

Data in JSON format:

{
  "products": {
    "wireless_mouse": {
      "price": 29.99,
      "stock": 47,
      "category": "electronics",
      "sku": "WM-XST7AQ",
      "brand": "TechNova"
    }
  }
}

The same data in YAML:

products:
  wireless_mouse:
    price: 29.99
    stock: 47
    category: electronics
    sku: WM-XST7AQ
    brand: TechNova

Do you see the difference? The YAML version is cleaner: there are no curly braces ({}) and no quotation marks around keys or simple values. That matters when you configure an application or set up infrastructure, because there is less punctuation to get wrong.

Feature          | JSON                                          | YAML
-----------------|-----------------------------------------------|------------------------------------------------------------
Readability      | Harder                                        | Easier
Writing          | Verbose and strict syntax                     | Simple
Comment support  | Not supported                                 | Supported using #
Error proneness  | Higher                                        | Lower
Trailing commas  | Not allowed                                   | Not required
Data types       | String, number, boolean, null, array, object  | The same, plus extras such as timestamps and multi-line strings
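
To make the comparison concrete, here is a minimal sketch that parses both versions of the product data in Python and checks that they produce the same dictionary. It assumes PyYAML is installed; the variable names are only for illustration.

```python
import json
import yaml  # provided by the PyYAML package

json_text = """
{
  "products": {
    "wireless_mouse": {
      "price": 29.99,
      "stock": 47,
      "category": "electronics",
      "sku": "WM-XST7AQ",
      "brand": "TechNova"
    }
  }
}
"""

yaml_text = """
products:
  wireless_mouse:
    price: 29.99
    stock: 47
    category: electronics
    sku: WM-XST7AQ
    brand: TechNova
"""

from_json = json.loads(json_text)
from_yaml = yaml.safe_load(yaml_text)

# Both formats describe the same nested dictionary.
print(from_json == from_yaml)

# Unquoted YAML values still get sensible Python types.
mouse = from_yaml["products"]["wireless_mouse"]
print(type(mouse["price"]).__name__, type(mouse["sku"]).__name__)
```

The only thing that changes between the two formats is how the data is written, not what your Python code receives.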

Why Use YAML, and How Does It Work?

YAML was created with a clear goal: put humans first and machines second, so that developers can easily read and write configuration files.

Here are the data types you will commonly find in YAML files:

# Strings
app_name: MyCodehemuApp
description: "This string includes special characters and spaces"

# Numbers
max_connections: 100
timeout_seconds: 30.5

# Booleans (true/false, yes/no)
debug_mode: true
enable_caching: false

# Null values
deprecated_setting: null

Collections

# Lists (indented items)
allowed_regions:
  - us-east-1
  - us-west-2
  - eu-central-1

# Nested structures
features:
  authentication:
    enabled: true
    timeout: 300
  logging:
    level: INFO
    format: json
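
If you want to see how these values arrive in Python, the following sketch loads a shortened version of the examples above with PyYAML and prints the Python type of each top-level key. The sample text is an abbreviated copy made for illustration.

```python
import yaml

sample = """
app_name: MyCodehemuApp
max_connections: 100
timeout_seconds: 30.5
debug_mode: true
deprecated_setting: null
allowed_regions:
  - us-east-1
  - us-west-2
features:
  logging:
    level: INFO
"""

settings = yaml.safe_load(sample)

# Print the Python type that PyYAML chose for each top-level value.
for key, value in settings.items():
    print(f"{key}: {type(value).__name__}")
```

Strings, numbers, booleans, and null map to str, int or float, bool, and None, while lists and nested structures become Python lists and dictionaries.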

Real-World Problems with JSON

Let's look at some real-world cases where JSON's strict syntax caused problems. In each of them, I think using YAML instead of JSON would have helped.

Trailing Comma Error:

This happens when a developer deletes the last entry of a JSON object or array but forgets to remove the comma from the previous line, like this:

{
  "name": "codehemu",
  "age": 21,
}

This is not valid JSON, and a strict parser rejects it with a syntax error. Here's the correct JSON:

{
  "name": "codehemu",
  "age": 21
}

Here's the same data in YAML:

name: codehemu
age: 21

No commas are needed, and it is easy to read.
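
You can reproduce the trailing-comma error with Python's built-in json module. This is a small sketch; try_parse is a hypothetical helper written for this example, and the exact error message depends on your Python version.

```python
import json

broken = '{"name": "codehemu", "age": 21,}'  # trailing comma after 21
fixed = '{"name": "codehemu", "age": 21}'


def try_parse(text):
    """Return the parsed object, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        print(f"Invalid JSON: {err.msg} (line {err.lineno}, column {err.colno})")
        return None


print(try_parse(broken))
print(try_parse(fixed))
```

The first call reports a syntax error and returns None, while the second returns the expected dictionary.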

Based on some GitHub issues (e.g., #141 from 2017) and Stack Overflow posts, my understanding is that a few years ago, trailing-comma errors in the VS Code (a code editor) settings.json file left users with corrupted settings. The issue reportedly affected many VS Code users in versions before 1.18.

Problems like this can cause IDE extensions to fail and user preferences to be lost. YAML's block style avoids this class of error, because mapping entries and list items are separated by line breaks rather than commas.

Lack of Comments:

JSON does not support comments, so developers cannot explain important changes inside the file itself. This limitation becomes a real problem in large files, such as AWS CloudFormation templates written in JSON.

For reasons like these, many AWS teams have migrated from JSON to YAML for CloudFormation and other infrastructure-as-code workflows. YAML supports comments that start with #, either on their own line or at the end of a line. This lets developers explain their intentions right alongside the configuration.

Here's an example of YAML comments:

# Enable auto-scaling for high-traffic periods
auto_scaling:
  enabled: true
  min_instances: 2
  max_instances: 10
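
As a quick check, the sketch below loads the commented YAML with PyYAML and then feeds a commented JSON document to Python's json module. The //-style comment in the JSON text is an assumption for illustration, since standard JSON defines no comment syntax at all.

```python
import json
import yaml

yaml_text = """
# Enable auto-scaling for high-traffic periods
auto_scaling:
  enabled: true   # end-of-line comments work too
  min_instances: 2
  max_instances: 10
"""

# Comments are skipped by the parser and never reach your data.
config = yaml.safe_load(yaml_text)
print(config)

json_text = """
{
  // Enable auto-scaling for high-traffic periods
  "auto_scaling": {"enabled": true}
}
"""

try:
    json.loads(json_text)
    json_accepts_comments = True
except json.JSONDecodeError:
    json_accepts_comments = False

print(json_accepts_comments)
```

The YAML loads cleanly, while the strict JSON parser rejects the document as soon as it meets the comment.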

Comma Conflict:

When multiple developers edit the same line of a file tracked in Git (software that tracks file changes), a merge conflict can appear. With JSON, such conflicts are often caused by nothing more than a comma, and developers still have to resolve them manually.

Here's a "comma conflict" example:

Imagine two developers, Ram and Sam, who both want to add a new dependency to the end of the list. The original file ends with "expo": "54.0.0", like this:

"dependencies": {
    "react": "18.2.0",
    "expo": "54.0.0"
  }

Developer Ram adds axios:

"dependencies": {
    "react": "18.2.0",
    "expo": "54.0.0",  <-  Ram needs to add a comma here
+   "axios": "1.13.2"
  }

Developer Sam adds lodash:

"dependencies": {
    "react": "18.2.0",
    "expo": "54.0.0",  <-  Sam needs to add a comma here
+   "lodash": "4.17.21"
  }

Git merge conflict (the branch name sam-branch is only an example):

"dependencies": {
    "react": "18.2.0",
<<<<<<< HEAD
    "expo": "54.0.0",
    "axios": "1.13.2"
=======
    "expo": "54.0.0",
    "lodash": "4.17.21"
>>>>>>> sam-branch
  }

Because both developers modified the same line ("expo": "54.0.0") by adding a trailing comma, Git sees overlapping changes and reports a merge conflict.

In YAML, an entry does not depend on the previous line ending with a comma, so each developer only adds one new line.

Merged result in YAML format:

dependencies:
  react: 18.2.0
  expo: 54.0.0
  axios: 1.13.2
  lodash: 4.17.21

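Since neither developer touched the expo: 54.0.0 line, each change is a pure one-line addition. Git may still ask you to resolve a conflict when both additions land at the same spot, but the fix is simply to keep both lines, with no commas to repair.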

How to Read and Write YAML in Python

Now let's get to work. We'll use the PyYAML library to work with configuration files.

Reading a YAML File

Reading a YAML file is easy, but there is one catch. Avoid calling yaml.load() with an unsafe loader on input you do not trust, because it can construct arbitrary Python objects and end up executing harmful code. My advice is to always use safe_load().
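
To illustrate that warning, here is a minimal sketch of how safe_load() reacts to a document that tries to build a Python object. The untrusted string is a made-up example for this demonstration. safe_load() only knows the standard YAML tags, so it refuses the Python-specific tag instead of acting on it.

```python
import yaml

# A made-up document that asks the loader to call a Python function.
# safe_load() does not recognize this tag and refuses to construct it.
untrusted = "command: !!python/object/apply:os.system ['echo unsafe']"

try:
    yaml.safe_load(untrusted)
    rejected = False
except yaml.YAMLError as err:
    rejected = True
    print(f"Rejected: {type(err).__name__}")

print(rejected)
```

Nothing from the document is executed: the load fails with an exception that you can catch and log.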

Create a file named config.yaml:

app_name: CodeHemuApp
version: 1.0.5
database:
  host: localhost
  port: 3000
features:
  - dark_mode
  - beta_access

Now, run this Python script:

import yaml

with open('config.yaml', 'r') as file:
    config = yaml.safe_load(file)

print(f"App Name: {config['app_name']}")
print(f"Database Port: {config['database']['port']}")
print(f"Features: {config['features']}")

Output:

App Name: CodeHemuApp
Database Port: 3000
Features: ['dark_mode', 'beta_access']
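
In a real project, the file may be missing or malformed, so it helps to wrap the loading step. The sketch below is one possible approach, not the only one: load_config is a hypothetical helper, and the demo writes a temporary config.yaml so that the example is self-contained.

```python
import os
import tempfile
import yaml


def load_config(path):
    """Load a YAML config file and return a dict (hypothetical helper)."""
    try:
        with open(path, "r", encoding="utf-8") as file:
            data = yaml.safe_load(file)
    except FileNotFoundError:
        print(f"Config file not found: {path}")
        return {}
    except yaml.YAMLError as err:
        print(f"Could not parse {path}: {err}")
        return {}
    # An empty file loads as None, so fall back to an empty dict.
    return data or {}


# Demo: write a small config to a temporary directory, then load it.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.yaml")
    with open(path, "w", encoding="utf-8") as file:
        file.write("app_name: CodeHemuApp\ndatabase:\n  port: 3000\n")
    config = load_config(path)
    missing = load_config(os.path.join(tmp, "does_not_exist.yaml"))

print(config["database"]["port"])
# Use .get() with a default for keys that may be absent.
print(config.get("database", {}).get("host", "localhost"))
print(missing)
```

Returning an empty dictionary on failure is a design choice made for this sketch. In a stricter application you may prefer to let the exception propagate so that a broken configuration stops the program early.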

Writing to a YAML File

You can also generate configuration files dynamically from Python dictionaries, like this:

import yaml

data = {
    'server': {
        'ip': '192.128.1.10',
        'status': 'active'
    },
    'users': ['ram', 'sam', 'jodu']
}

with open('output.yaml', 'w') as file:
    yaml.dump(data, file, sort_keys=False)

print("YAML file created successfully!")
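
To inspect what gets written without opening the file, you can call yaml.dump() with no file argument, in which case it returns the YAML text as a string. The sketch below also loads that text back to confirm that nothing was lost along the way.

```python
import yaml

data = {
    "server": {"ip": "192.128.1.10", "status": "active"},
    "users": ["ram", "sam", "jodu"],
}

# With no stream argument, yaml.dump() returns the YAML text as a string.
# sort_keys=False keeps the keys in the order they were inserted.
text = yaml.dump(data, sort_keys=False)
print(text)

# Loading the text back should give an equal dictionary.
restored = yaml.safe_load(text)
print(restored == data)
```

A round trip like this is a cheap sanity check whenever you generate configuration files from code.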

YAML is used across many platforms and tools. Here are some of the most common:

Category                  | Platforms & Tools
--------------------------|----------------------------------------------
Container orchestration   | Kubernetes, Docker Compose, OpenShift
Infrastructure as code    | Ansible, CloudFormation (AWS)
CI/CD pipelines           | GitHub Actions, GitLab CI/CD, Azure Pipelines
Development tools         | VS Code, Prettier, ESLint (YAML support)

Conclusion

If you are already familiar with JSON in Python, working with YAML will feel effortless.

In this article, you learned how to read YAML files safely in Python applications and how to write configurations that are still clear and easy to understand a year later.

Start small: pick one program and move only a small part of its configuration to YAML. As you become more comfortable, use YAML in more places. Learning step by step makes the process easier and more enjoyable.

Many software projects now use YAML to manage settings, so knowing how to work with it will serve you well.

Happy Python Development Journey!
