JSON and YAML
JSON (JavaScript Object Notation) and YAML (YAML Ain't Markup Language) are popular data serialization formats used for configuration files, data exchange, and structured data storage.
JSON (JavaScript Object Notation)
JSON is a lightweight, text-based data interchange format that's easy for humans to read and write, and easy for machines to parse and generate.
JSON Syntax
{
  "name": "John Doe",
  "age": 30,
  "isActive": true,
  "address": {
    "street": "123 Main St",
    "city": "New York",
    "zipCode": "10001"
  },
  "hobbies": ["reading", "swimming", "coding"],
  "spouse": null
}
Key Features
- Data types: strings, numbers, booleans, arrays, objects, null
- No comments: JSON doesn't support comments
- Strict syntax: requires double quotes for strings and property names
- Widely supported: native support in JavaScript and most programming languages
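The strict-syntax and no-comments rules above can be checked with a small sketch using Python's standard-library json module. The sample strings and the names `invalid_samples` and `rejected` are made up for illustration. The trailing-comma case is not in the list above, but it is also disallowed by JSON.

```python
import json

# Valid JSON: double-quoted keys and strings, no comments, no trailing commas.
valid = '{"name": "John Doe", "age": 30, "spouse": null}'
parsed = json.loads(valid)
print(parsed)  # JSON null is mapped to Python None

# Hypothetical samples, each breaking one JSON syntax rule.
invalid_samples = {
    "single quotes": "{'name': 'John Doe'}",
    "comment": '{"age": 30 // years}',
    "trailing comma": '{"age": 30,}',
}

rejected = []
for label, text in invalid_samples.items():
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # The parser refuses the input instead of guessing what was meant.
        rejected.append(label)
        print(f"{label}: rejected ({err.msg})")
```

Formats that relax these rules exist, but a standard JSON parser is expected to reject all three samples.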
YAML (YAML Ain't Markup Language)
YAML is a human-readable data serialization standard often used for configuration files and data exchange.
YAML Syntax
name: John Doe
age: 30
isActive: true
address:
  street: 123 Main St
  city: New York
  zipCode: "10001"
hobbies:
  - reading
  - swimming
  - coding
spouse: null
# Comments are supported in YAML
# This is a configuration example
database:
  host: localhost
  port: 5432
  credentials:
    username: admin
    password: secret123
Key Features
- Indentation-based: uses spaces (not tabs) for structure
- Comments supported: lines starting with #
- More readable: often preferred for configuration files
- Data types: strings, numbers, booleans, sequences (arrays), mappings (objects), null
- Multi-line strings: supports literal and folded styles
# Literal style (preserves line breaks)
description: |
  This is a multi-line string
  that preserves line breaks
  exactly as written.
# Folded style (converts line breaks to spaces)
summary: >
  This is a long description
  that will be folded into
  a single line with spaces.
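To see the difference between the two styles in practice, the following sketch loads both keys with PyYAML and inspects the resulting strings. It assumes PyYAML is installed (the `yaml` package used later in this document). The variable name `doc` is illustrative.

```python
import yaml  # PyYAML, assumed to be installed

doc = """
description: |
  This is a multi-line string
  that preserves line breaks
  exactly as written.
summary: >
  This is a long description
  that will be folded into
  a single line with spaces.
"""

data = yaml.safe_load(doc)

# Literal style (|): every line break is kept.
print(repr(data["description"]))

# Folded style (>): line breaks inside the text become single spaces.
print(repr(data["summary"]))
```

With the default chomping behaviour, both values keep one trailing newline. Writing `|-` or `>-` instead strips it.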
Using jq to Parse JSON
jq is a powerful command-line JSON processor that allows you to slice, filter, and transform JSON data.
Basic jq Examples
Sample JSON file (data.json):
{
  "users": [
    {"id": 1, "name": "Alice", "department": "Engineering", "salary": 85000},
    {"id": 2, "name": "Bob", "department": "Marketing", "salary": 65000},
    {"id": 3, "name": "Carol", "department": "Engineering", "salary": 90000}
  ],
  "company": "TechCorp",
  "founded": 2010
}
Common jq Commands:
# Pretty print JSON
jq '.' data.json
# Extract specific field
jq '.company' data.json
# Output: "TechCorp"
# Extract array elements
jq '.users[0]' data.json
# Output: {"id": 1, "name": "Alice", "department": "Engineering", "salary": 85000}
# Extract specific field from all array elements
jq '.users[].name' data.json
# Output (one string per line):
# "Alice"
# "Bob"
# "Carol"
# Filter array elements
jq '.users[] | select(.department == "Engineering")' data.json
# Map over array elements
jq '.users | map(.name)' data.json
# Output: ["Alice", "Bob", "Carol"]
# Calculate average salary
jq '.users | map(.salary) | add / length' data.json
# Output: 80000
# Create new structure
jq '{company_name: .company, employee_count: (.users | length)}' data.json
# Output: {"company_name": "TechCorp", "employee_count": 3}
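For comparison, the same queries can be approximated in plain Python with the standard-library json module. This is a rough sketch, not a replacement for jq. The sample data is inlined so the block runs on its own, and the variable names (`names`, `engineers`, `average_salary`, `summary`) are illustrative.

```python
import json

# Same content as the sample data.json, inlined to keep the sketch self-contained.
raw = """
{
  "users": [
    {"id": 1, "name": "Alice", "department": "Engineering", "salary": 85000},
    {"id": 2, "name": "Bob", "department": "Marketing", "salary": 65000},
    {"id": 3, "name": "Carol", "department": "Engineering", "salary": 90000}
  ],
  "company": "TechCorp",
  "founded": 2010
}
"""
data = json.loads(raw)
users = data["users"]

# jq '.users | map(.name)'
names = [u["name"] for u in users]

# jq '.users[] | select(.department == "Engineering")'
engineers = [u for u in users if u["department"] == "Engineering"]

# jq '.users | map(.salary) | add / length'
average_salary = sum(u["salary"] for u in users) / len(users)

# jq '{company_name: .company, employee_count: (.users | length)}'
summary = {"company_name": data["company"], "employee_count": len(users)}

print(names)
print([u["name"] for u in engineers])
print(average_salary)
print(summary)
```

One difference worth noting: Python's `/` always returns a float, so the average here is `80000.0`, whereas jq prints `80000`.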
Parsing JSON in Python
Using Pydantic
Pydantic provides data validation and parsing using Python type annotations. The example uses model_validate_json, which is the Pydantic v2 API (v1 used parse_raw):
from pydantic import BaseModel
from typing import List, Optional
import json
class Address(BaseModel):
    street: str
    city: str
    zipCode: str
class User(BaseModel):
    name: str
    age: int
    isActive: bool
    address: Address
    hobbies: List[str]
    spouse: Optional[str] = None
# Parse JSON string
json_data = '''
{
  "name": "John Doe",
  "age": 30,
  "isActive": true,
  "address": {
    "street": "123 Main St",
    "city": "New York",
    "zipCode": "10001"
  },
  "hobbies": ["reading", "swimming", "coding"],
  "spouse": null
}
'''
user = User.model_validate_json(json_data)
print(f"Name: {user.name}, Age: {user.age}")
print(f"City: {user.address.city}")
Using Dataclass
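Python's standard-library dataclasses module can model the same structure. Unlike Pydantic, it does not validate types or convert nested dictionaries, so the mapping from parsed JSON is written by hand: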
from dataclasses import dataclass
from typing import List, Optional
import json
@dataclass
class Address:
    street: str
    city: str
    zipCode: str
@dataclass
class User:
    name: str
    age: int
    isActive: bool
    address: Address
    hobbies: List[str]
    spouse: Optional[str] = None
    @classmethod
    def from_dict(cls, data: dict):
        return cls(
            name=data['name'],
            age=data['age'],
            isActive=data['isActive'],
            address=Address(**data['address']),
            hobbies=data['hobbies'],
            spouse=data.get('spouse')
        )
# Parse JSON (json_data is the JSON string defined in the Pydantic example)
data = json.loads(json_data)
user = User.from_dict(data)
print(f"User: {user.name} from {user.address.city}")
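Going the other way, from a dataclass instance back to JSON, can be sketched with `dataclasses.asdict`, which recursively converts nested dataclasses into plain dictionaries. The classes are repeated here only so the block runs on its own. The names `as_json` and `round_tripped` are illustrative.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class Address:
    street: str
    city: str
    zipCode: str


@dataclass
class User:
    name: str
    age: int
    isActive: bool
    address: Address
    hobbies: List[str]
    spouse: Optional[str] = None


user = User(
    name="John Doe",
    age=30,
    isActive=True,
    address=Address(street="123 Main St", city="New York", zipCode="10001"),
    hobbies=["reading", "swimming", "coding"],
)

# asdict() recurses into the nested Address, so json.dumps sees only
# dicts, lists, strings, numbers, booleans, and None.
as_json = json.dumps(asdict(user), indent=2)
print(as_json)

# Parsing the string again yields the same plain-dict structure.
round_tripped = json.loads(as_json)
```

The round trip returns dictionaries, not dataclass instances. Rebuilding a `User` still needs a constructor step such as the `from_dict` method shown earlier.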
Loading and Parsing in Python
JSON in Python
import json
# Reading JSON from file
with open('data.json', 'r') as f:
    data = json.load(f)
# Parsing JSON string
json_string = '{"name": "Alice", "age": 30}'
data = json.loads(json_string)
# Writing JSON to file
data = {"name": "Bob", "age": 25, "city": "NYC"}
with open('output.json', 'w') as f:
    json.dump(data, f, indent=2)
# Converting to JSON string
json_string = json.dumps(data, indent=2)
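The calls above assume well-formed input and JSON-compatible values. The sketch below shows one possible way to handle the two common failure cases: malformed text (`json.JSONDecodeError`) and values that `json.dumps` cannot encode on its own, here a `date`. The helpers `parse_or_none` and `encode_extra` are hypothetical names introduced for this example, not part of the json module.

```python
import json
from datetime import date


def parse_or_none(text: str):
    """Return the parsed value, or None if text is not valid JSON.

    Note: the JSON literal null also parses to None, so this helper
    is only suitable when null is not an expected top-level value.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        print(f"Invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}")
        return None


def encode_extra(obj):
    """Fallback for json.dumps: turn date objects into ISO 8601 strings."""
    if isinstance(obj, date):
        return obj.isoformat()
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


good = parse_or_none('{"name": "Alice", "age": 30}')
bad = parse_or_none('{"name": "Alice", "age": }')  # missing value after "age"

# default= is called only for objects the encoder does not know how to handle.
encoded = json.dumps({"name": "Bob", "joined": date(2024, 1, 15)}, default=encode_extra)
print(good, bad, encoded)
```

Storing dates as ISO 8601 strings is a convention chosen for this sketch. JSON itself has no date type, so the reading side has to convert the string back explicitly.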
YAML in Python
import yaml
# Reading YAML from file
with open('config.yaml', 'r') as f:
    config = yaml.safe_load(f)
# Parsing YAML string
yaml_string = """
name: Alice
age: 30
hobbies:
- reading
- coding
"""
data = yaml.safe_load(yaml_string)
# Writing YAML to file
data = {"name": "Bob", "age": 25, "hobbies": ["gaming", "hiking"]}
with open('output.yaml', 'w') as f:
    yaml.dump(data, f, default_flow_style=False)
# Converting to YAML string
yaml_string = yaml.dump(data, default_flow_style=False)
Loading and Parsing in Julia
JSON in Julia
using JSON
# Reading JSON from file
data = JSON.parsefile("data.json")
# Parsing JSON string
json_string = """{"name": "Alice", "age": 30}"""
data = JSON.parse(json_string)
# Writing JSON to file
data = Dict("name" => "Bob", "age" => 25, "city" => "NYC")
open("output.json", "w") do f
    JSON.print(f, data, 2)
end
# Converting to JSON string
json_string = JSON.json(data, 2)
YAML in Julia
using YAML
# Reading YAML from file
config = YAML.load_file("config.yaml")
# Parsing YAML string
yaml_string = """
name: Alice
age: 30
hobbies:
- reading
- coding
"""
data = YAML.load(yaml_string)
# Writing YAML to file
data = Dict("name" => "Bob", "age" => 25, "hobbies" => ["gaming", "hiking"])
YAML.write_file("output.yaml", data)
# Converting to YAML string
yaml_string = YAML.write(data)
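Both formats are widely used in modern development. JSON is generally preferred for APIs and data exchange because of its strict syntax and broad parser support, while YAML is often chosen for configuration files because comments and indentation-based structure make it easier to read and edit by hand.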