JSON and YAML

JSON (JavaScript Object Notation) and YAML (YAML Ain't Markup Language) are popular data serialization formats used for configuration files, data exchange, and structured data storage.

JSON (JavaScript Object Notation)

JSON is a lightweight, text-based data interchange format that's easy for humans to read and write, and easy for machines to parse and generate.

JSON Syntax

{
  "name": "John Doe",
  "age": 30,
  "isActive": true,
  "address": {
    "street": "123 Main St",
    "city": "New York",
    "zipCode": "10001"
  },
  "hobbies": ["reading", "swimming", "coding"],
  "spouse": null
}

Key Features

  • Data types: strings, numbers, booleans, arrays, objects, null
  • No comments: the JSON grammar has no comment syntax
  • Strict syntax: strings and property names require double quotes, and trailing commas are not allowed
  • Widely supported: native support in JavaScript and most programming languages
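
The strictness of the syntax can be checked with Python's standard json module. The following is a minimal sketch: the sample strings and the is_valid_json helper are made up for illustration, and the samples also cover a trailing comma and an unquoted key, which JSON rejects as well.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON (hypothetical helper for this demo)."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Illustrative inputs: only the first one is valid JSON.
samples = {
    "double quotes": '{"name": "John Doe"}',
    "single quotes": "{'name': 'John Doe'}",
    "trailing comma": '{"name": "John Doe",}',
    "comment": '{"name": "John Doe"} // note',
    "unquoted key": '{name: "John Doe"}',
}

results = {label: is_valid_json(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'valid' if ok else 'rejected'}")
```

Only the "double quotes" sample should be reported as valid; the others raise json.JSONDecodeError inside the helper.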

YAML (YAML Ain't Markup Language)

YAML is a human-readable data serialization standard often used for configuration files and data exchange.

YAML Syntax

name: John Doe
age: 30
isActive: true
address:
  street: 123 Main St
  city: New York
  zipCode: "10001"
hobbies:
  - reading
  - swimming
  - coding
spouse: null

# Comments are supported in YAML
# This is a configuration example
database:
  host: localhost
  port: 5432
  credentials:
    username: admin
    password: secret123
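
The example above quotes zipCode but not port, and the difference matters: a YAML parser infers the type of an unquoted scalar. A small sketch using PyYAML (the yaml module used elsewhere in this document) shows the effect; the unquotedZip key is hypothetical and added only for comparison.

```python
import yaml  # PyYAML

doc = """\
zipCode: "10001"
unquotedZip: 10001
port: 5432
city: New York
"""

parsed = yaml.safe_load(doc)

# Print each value with its Python type to show what the parser inferred.
for key, value in parsed.items():
    print(f"{key}: {value!r} ({type(value).__name__})")
```

Quoting is therefore a reasonable default for values such as ZIP codes or version numbers that look numeric but should stay strings.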

Key Features

  • Indentation-based: uses spaces (not tabs) for structure
  • Comments supported: # starts a comment that runs to the end of the line
  • More readable: minimal punctuation, which is why it is often preferred for configuration files
  • Data types: strings, numbers, booleans, sequences (arrays), mappings (objects), null
  • Multi-line strings: supports literal (|) and folded (>) styles

# Literal style (preserves line breaks)
description: |
  This is a multi-line string
  that preserves line breaks
  exactly as written.

# Folded style (converts line breaks to spaces)
summary: >
  This is a long description
  that will be folded into
  a single line with spaces.
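
To see what the two styles actually produce, the same document can be loaded with PyYAML and the resulting strings inspected. This is a sketch that assumes PyYAML is installed; it reuses the description and summary keys from the example above.

```python
import yaml  # PyYAML

doc = """\
description: |
  This is a multi-line string
  that preserves line breaks
  exactly as written.
summary: >
  This is a long description
  that will be folded into
  a single line with spaces.
"""

parsed = yaml.safe_load(doc)

# repr() makes the newline characters visible.
print(repr(parsed["description"]))
print(repr(parsed["summary"]))
```

The literal block keeps every line break, while the folded block joins its lines with spaces. Both keep a single trailing newline under the default chomping behaviour.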

Using jq to Parse JSON

jq is a powerful command-line JSON processor that allows you to slice, filter, and transform JSON data.

Basic jq Examples

Sample JSON file (data.json):

{
  "users": [
    {"id": 1, "name": "Alice", "department": "Engineering", "salary": 85000},
    {"id": 2, "name": "Bob", "department": "Marketing", "salary": 65000},
    {"id": 3, "name": "Carol", "department": "Engineering", "salary": 90000}
  ],
  "company": "TechCorp",
  "founded": 2010
}

Common jq Commands:

# Pretty print JSON
jq '.' data.json

# Extract specific field
jq '.company' data.json
# Output: "TechCorp"

# Extract the first array element
jq '.users[0]' data.json
# Output (shown compact; jq pretty-prints by default):
# {"id": 1, "name": "Alice", "department": "Engineering", "salary": 85000}

# Extract specific field from all array elements
jq '.users[].name' data.json
# Output (one value per line):
# "Alice"
# "Bob"
# "Carol"

# Filter array elements
jq '.users[] | select(.department == "Engineering")' data.json
# Output: the objects for Alice and Carol

# Map over array elements
jq '.users | map(.name)' data.json
# Output: ["Alice", "Bob", "Carol"]

# Calculate average salary
jq '.users | map(.salary) | add / length' data.json
# Output: 80000

# Create new structure
jq '{company_name: .company, employee_count: (.users | length)}' data.json
# Output: {"company_name": "TechCorp", "employee_count": 3}
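
For comparison, the same queries can be written in Python with the standard json module. This is only a sketch of rough equivalents, not a description of how jq works internally; the sample data is embedded as a string so the snippet runs without data.json.

```python
import json

raw = """
{
  "users": [
    {"id": 1, "name": "Alice", "department": "Engineering", "salary": 85000},
    {"id": 2, "name": "Bob", "department": "Marketing", "salary": 65000},
    {"id": 3, "name": "Carol", "department": "Engineering", "salary": 90000}
  ],
  "company": "TechCorp",
  "founded": 2010
}
"""
data = json.loads(raw)
users = data["users"]

# jq '.users | map(.name)'
names = [u["name"] for u in users]

# jq '.users[] | select(.department == "Engineering")'
engineers = [u for u in users if u["department"] == "Engineering"]

# jq '.users | map(.salary) | add / length'
average_salary = sum(u["salary"] for u in users) / len(users)

# jq '{company_name: .company, employee_count: (.users | length)}'
summary = {"company_name": data["company"], "employee_count": len(users)}

print(names)
print([u["name"] for u in engineers])
print(average_salary)
print(summary)
```

jq tends to be the quicker option for one-off shell pipelines, while the Python form is easier to extend once the logic grows.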

Parsing JSON in Python

Using Pydantic

Pydantic validates and parses data using Python type annotations. The example below uses the Pydantic v2 API (model_validate_json):

from pydantic import BaseModel
from typing import List, Optional

class Address(BaseModel):
    street: str
    city: str
    zipCode: str

class User(BaseModel):
    name: str
    age: int
    isActive: bool
    address: Address
    hobbies: List[str]
    spouse: Optional[str] = None

# Parse JSON string
json_data = '''
{
  "name": "John Doe",
  "age": 30,
  "isActive": true,
  "address": {
    "street": "123 Main St",
    "city": "New York",
    "zipCode": "10001"
  },
  "hobbies": ["reading", "swimming", "coding"],
  "spouse": null
}
'''

user = User.model_validate_json(json_data)
print(f"Name: {user.name}, Age: {user.age}")
print(f"City: {user.address.city}")
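
The validation side is easiest to see with input that does not match the model. The sketch below assumes Pydantic v2 and repeats the two models so that it runs on its own; the bad_json payload is invented for illustration.

```python
from typing import List, Optional

from pydantic import BaseModel, ValidationError


class Address(BaseModel):
    street: str
    city: str
    zipCode: str


class User(BaseModel):
    name: str
    age: int
    isActive: bool
    address: Address
    hobbies: List[str]
    spouse: Optional[str] = None


# Invalid input: "age" is not a number and "address.city" is missing.
bad_json = '''
{
  "name": "John Doe",
  "age": "thirty",
  "isActive": true,
  "address": {"street": "123 Main St", "zipCode": "10001"},
  "hobbies": ["reading"]
}
'''

try:
    User.model_validate_json(bad_json)
    error_fields = []
except ValidationError as exc:
    # Each error records the path ("loc") of the offending field.
    error_fields = [".".join(str(part) for part in err["loc"]) for err in exc.errors()]

print(error_fields)
```

All problems are collected into a single ValidationError, so a caller can report every bad field at once instead of stopping at the first.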

Using Dataclass

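Python's built-in dataclasses module has no JSON support of its own, so nested objects such as Address are built by hand from the dictionary that json.loads returns: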

from dataclasses import dataclass
from typing import List, Optional
import json

@dataclass
class Address:
    street: str
    city: str
    zipCode: str

@dataclass
class User:
    name: str
    age: int
    isActive: bool
    address: Address
    hobbies: List[str]
    spouse: Optional[str] = None

    @classmethod
    def from_dict(cls, data: dict):
        return cls(
            name=data['name'],
            age=data['age'],
            isActive=data['isActive'],
            address=Address(**data['address']),
            hobbies=data['hobbies'],
            spouse=data.get('spouse')
        )

# Parse JSON (json_data is the string defined in the Pydantic example above)
data = json.loads(json_data)
user = User.from_dict(data)
print(f"User: {user.name} from {user.address.city}")
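
Going the other way, from a dataclass back to JSON, can be done with dataclasses.asdict. The following sketch repeats the class definitions so that it is self-contained and constructs the user directly instead of parsing it.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class Address:
    street: str
    city: str
    zipCode: str


@dataclass
class User:
    name: str
    age: int
    isActive: bool
    address: Address
    hobbies: List[str]
    spouse: Optional[str] = None


user = User(
    name="John Doe",
    age=30,
    isActive=True,
    address=Address(street="123 Main St", city="New York", zipCode="10001"),
    hobbies=["reading", "swimming", "coding"],
)

# asdict() recurses into nested dataclasses, so Address becomes a plain dict.
as_json = json.dumps(asdict(user), indent=2)
print(as_json)

# Round trip: the parsed JSON matches the dictionary form of the dataclass.
round_tripped = json.loads(as_json)
```

Unlike Pydantic, nothing here validates types: a dataclass will accept age="thirty" without complaint, so any checks have to be written by hand.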

Loading and Parsing in Python

JSON in Python

import json

# Reading JSON from file
with open('data.json', 'r') as f:
    data = json.load(f)

# Parsing JSON string
json_string = '{"name": "Alice", "age": 30}'
data = json.loads(json_string)

# Writing JSON to file
data = {"name": "Bob", "age": 25, "city": "NYC"}
with open('output.json', 'w') as f:
    json.dump(data, f, indent=2)

# Converting to JSON string
json_string = json.dumps(data, indent=2)

YAML in Python

import yaml  # provided by the third-party PyYAML package (pip install pyyaml)

# Reading YAML from file
with open('config.yaml', 'r') as f:
    config = yaml.safe_load(f)

# Parsing YAML string
yaml_string = """
name: Alice
age: 30
hobbies:
  - reading
  - coding
"""
data = yaml.safe_load(yaml_string)

# Writing YAML to file
data = {"name": "Bob", "age": 25, "hobbies": ["gaming", "hiking"]}
with open('output.yaml', 'w') as f:
    yaml.dump(data, f, default_flow_style=False)

# Converting to YAML string
yaml_string = yaml.dump(data, default_flow_style=False)
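
Because both libraries map to the same plain Python objects (dicts, lists, strings, numbers), converting between the two formats is a matter of loading with one and dumping with the other. The sketch below assumes a PyYAML release recent enough to accept the sort_keys argument.

```python
import json

import yaml  # PyYAML

json_text = '{"name": "Bob", "age": 25, "hobbies": ["gaming", "hiking"]}'

# JSON text -> Python objects -> YAML text
data = json.loads(json_text)
yaml_text = yaml.safe_dump(data, default_flow_style=False, sort_keys=False)
print(yaml_text)

# YAML text -> Python objects -> JSON text
restored = yaml.safe_load(yaml_text)
json_again = json.dumps(restored)
print(json_again)
```

Passing sort_keys=False keeps the keys in their original order; by default PyYAML sorts them alphabetically. Comments in a YAML file are lost in such a round trip because they are not part of the loaded data.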

Loading and Parsing in Julia

JSON in Julia

using JSON

# Reading JSON from file
data = JSON.parsefile("data.json")

# Parsing JSON string
json_string = """{"name": "Alice", "age": 30}"""
data = JSON.parse(json_string)

# Writing JSON to file
data = Dict("name" => "Bob", "age" => 25, "city" => "NYC")
open("output.json", "w") do f
    JSON.print(f, data, 2)
end

# Converting to JSON string
json_string = JSON.json(data, 2)

YAML in Julia

using YAML

# Reading YAML from file
config = YAML.load_file("config.yaml")

# Parsing YAML string
yaml_string = """
name: Alice
age: 30
hobbies:
  - reading
  - coding
"""
data = YAML.load(yaml_string)

# Writing YAML to file
data = Dict("name" => "Bob", "age" => 25, "hobbies" => ["gaming", "hiking"])
YAML.write_file("output.yaml", data)

# Converting to YAML string
yaml_string = YAML.write(data)

Both formats are widely used in modern development. JSON is the usual choice for APIs and data exchange because its strict syntax is simple to parse, while YAML is often chosen for configuration files because comments and indentation-based structure make it easier to read and edit by hand.