Parsing Nested JSON Args in Python CLIs

Parse and validate nested JSON arguments in Python CLIs using Click custom types, Pydantic models, and shell-safe encoding for complex config objects.

Some CLI inputs are genuinely structured: a deploy spec with nested resource limits, a filter object with arrays of conditions, an env map. Flattening those into a dozen flags (--cpu, --mem, --env-key, --env-value...) gets unwieldy fast. The pragmatic alternative is to accept a single JSON argument and validate it. This article builds a Click custom ParamType that calls json.loads and then validates the result against a Pydantic v2 model, handles @file and stdin input so you stay shell-safe, and reports errors clearly with the right exit code.

TL;DR

  • Accept nested input as one JSON string and parse it in the convert() method of a custom Click ParamType.
  • Validate the parsed object with Model.model_validate(data) so a single type owns shape, ranges, and cross-field rules.
  • Support @file.json and @- (stdin) conventions to dodge shell-quoting nightmares for big payloads.
  • On bad JSON or a ValidationError, call self.fail(...) so Click exits with status 2 and prints a usage error — never a traceback.

Flow diagram: a JSON --config string runs through json.loads into a Python dict, then Pydantic model_validate into a validated nested object; a failure branches to a ValidationError on resources.cpu with exit code 2.

Define the schema first

Start with the data contract. A Pydantic v2 model is the single source of truth — nesting, types, bounds, and invariants all live here:

# models.py
from __future__ import annotations
from typing import Annotated
from pydantic import BaseModel, Field, field_validator, model_validator

class Resources(BaseModel):
    cpu: Annotated[int, Field(ge=1, le=64)]
    memory_mb: Annotated[int, Field(ge=128)]

class DeployConfig(BaseModel):
    name: Annotated[str, Field(min_length=1, max_length=63)]
    replicas: Annotated[int, Field(ge=1, le=100)]
    resources: Resources
    env: dict[str, str] = Field(default_factory=dict)
    canary_percent: Annotated[int, Field(ge=0, le=100)] = 0

    @field_validator("name")
    @classmethod
    def name_is_dns_safe(cls, v: str) -> str:
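        # str.isalnum() alone also accepts non-ASCII letters and digits, which are not DNS-safe.
        if not all((c.isascii() and c.isalnum()) or c == "-" for c in v):
            raise ValueError("name must contain only ASCII alphanumerics and hyphens")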
        return v.lower()

    @model_validator(mode="after")
    def canary_needs_replicas(self) -> "DeployConfig":
        if self.canary_percent > 0 and self.replicas < 2:
            raise ValueError("canary_percent requires at least 2 replicas")
        return self

The Resources submodel makes the nesting explicit, and Pydantic recurses into it automatically when you validate the parent.
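To see that recursion in action, here is a minimal sketch that validates a nested dict and records where the error landed. The two models are trimmed, self-contained copies of the ones above (only the fields the demo needs), and the sample payload is made up for illustration.

```python
from typing import Annotated

from pydantic import BaseModel, Field, ValidationError


# Trimmed copies of the article's models so the snippet runs on its own.
class Resources(BaseModel):
    cpu: Annotated[int, Field(ge=1, le=64)]
    memory_mb: Annotated[int, Field(ge=128)]


class DeployConfig(BaseModel):
    name: str
    replicas: Annotated[int, Field(ge=1, le=100)]
    resources: Resources


# Illustrative payload: cpu=0 violates the ge=1 bound on the submodel.
payload = {"name": "api", "replicas": 2, "resources": {"cpu": 0, "memory_mb": 256}}

try:
    DeployConfig.model_validate(payload)
    locations = []
except ValidationError as exc:
    # Each error carries a tuple path into the nested structure.
    locations = [err["loc"] for err in exc.errors()]

print(locations)
```

The only entry in locations is the path ("resources", "cpu"): the parent model reports the failure at the submodel's field, with no extra wiring on your side.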

A Click ParamType that parses and validates

A custom click.ParamType is the right seam. Click calls convert() once per argument; we do the json.loads, then model_validate, then hand back a fully typed object. The same type also handles @file.json and @- for stdin:

# cli.py
from __future__ import annotations
import json
import sys
import click
from pydantic import BaseModel, ValidationError
from models import DeployConfig

class JSONModel(click.ParamType):
    """Load JSON from a string, @file, or @- (stdin), then validate via Pydantic."""
    name = "json"

    def __init__(self, model: type[BaseModel]) -> None:
        self.model = model

    def convert(self, value, param, ctx):
        if isinstance(value, self.model):   # already converted (e.g. a default)
            return value
        raw = self._read(value, param, ctx)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            self.fail(
                f"invalid JSON: {exc.msg} (line {exc.lineno}, column {exc.colno})",
                param, ctx,
            )
        try:
            return self.model.model_validate(data)
        except ValidationError as exc:
            self.fail(self._format(exc), param, ctx)

    def _read(self, value: str, param, ctx) -> str:
        if value == "@-":
            return sys.stdin.read()
        if value.startswith("@"):
            path = value[1:]
            try:
                with open(path, "r", encoding="utf-8") as fh:
                    return fh.read()
            except OSError as exc:
                self.fail(f"cannot read {path!r}: {exc.strerror}", param, ctx)
        return value

    @staticmethod
    def _format(exc: ValidationError) -> str:
        lines = []
        for err in exc.errors():
            loc = ".".join(str(p) for p in err["loc"]) or "(root)"
            lines.append(f"  {loc}: {err['msg']}")
        return "validation failed:\n" + "\n".join(lines)


@click.command()
@click.option("--config", type=JSONModel(DeployConfig), required=True,
              help="Deploy config as inline JSON, @file.json, or @- for stdin.")
def deploy(config: DeployConfig) -> None:
    click.echo(
        f"deploying {config.name} x{config.replicas} "
        f"(cpu={config.resources.cpu}, mem={config.resources.memory_mb}MB)"
    )

if __name__ == "__main__":
    deploy()

Calling self.fail() is the key detail: it raises click.BadParameter under the hood, which Click renders as a usage error and exits with status 2. The command body receives a validated DeployConfig and nothing else.
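If you want to confirm what self.fail() raises without running a whole command, a small sketch like the following works. PositiveInt is a hypothetical toy type invented for this demo, and convert() is called directly with no parameter or context, which Click allows.

```python
import click


class PositiveInt(click.ParamType):
    """Hypothetical toy type, used only to show what self.fail() raises."""

    name = "positive-int"

    def convert(self, value, param, ctx):
        try:
            number = int(value)
        except ValueError:
            self.fail(f"{value!r} is not an integer", param, ctx)
        if number <= 0:
            self.fail(f"{number} is not positive", param, ctx)
        return number


# Calling convert() directly, outside any command, exposes the raw exception.
try:
    PositiveInt().convert("-3", None, None)
    caught = None
except click.BadParameter as exc:
    caught = exc

print(type(caught).__name__, caught.exit_code, caught.message)
```

BadParameter is a subclass of click.UsageError, and its exit_code attribute is where the status 2 comes from when Click handles the exception in a real invocation.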

Shell-safe input: quoting, files, and stdin

The hardest part of passing JSON on the command line is not Python — it's the shell. JSON is full of characters your shell wants to interpret: double quotes, spaces, {, }, $, and backticks. A few rules keep you sane:

  • Single-quote the whole payload. In bash/zsh, '...' is literal, so --config '{"name": "api", "replicas": 3}' passes through untouched. Double quotes would let the shell expand $ and backticks inside the JSON.
  • For dynamic payloads, never build the string by hand. If you are launching the CLI from another Python program, build the args as a list and use subprocess.run([...], shell=False) so there is no shell to quote for at all. If you must produce a shell command string, run each piece through shlex.quote().
  • For anything non-trivial, read from a file or stdin. That is exactly what the @file.json and @- conventions above are for: they sidestep quoting entirely. --config @spec.json reads the file; cat spec.json | mytool deploy --config @- pipes it in. The leading @ mirrors the convention curl uses for -d @file (and -d @- for stdin), so it will feel familiar.

The file/stdin path also scales: a 4 KB nested config is miserable to inline but trivial as @spec.json, and it keeps secrets out of your shell history.
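The two programmatic options from the list above can be sketched as follows. The payload is illustrative, "mytool" is a placeholder program name, and the child process is a stand-in Python one-liner that echoes its first argument so the snippet runs without the real CLI.

```python
import json
import shlex
import subprocess
import sys

# Illustrative payload containing characters a shell would otherwise interpret.
payload = {"name": "api", "env": {"GREETING": "it's $HOME and `whoami`"}}
raw = json.dumps(payload)

# Stand-in child process: a Python one-liner that echoes its first argument.
child = [sys.executable, "-c", "import sys; sys.stdout.write(sys.argv[1])", raw]

# 1) List form with shell=False (the default): no shell runs, so nothing needs quoting.
echoed = subprocess.run(child, capture_output=True, text=True, check=True).stdout

# 2) If a shell command string is unavoidable, quote every piece.
#    "mytool" is a placeholder, not a real program.
command = " ".join(
    shlex.quote(part) for part in ["mytool", "deploy", "--config", raw]
)

# shlex.split applies POSIX word-splitting rules, recovering the original argv.
recovered = shlex.split(command)

print(command)
```

Note that shlex.quote targets POSIX shells only; it does not produce correct quoting for cmd.exe or PowerShell, which is one more reason to prefer the list form.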

Verifying it with click.testing

Validation code earns its keep only if you test both the happy path and the failures. CliRunner invokes the command in-process and captures the exit code and output:

# test_cli.py
import json
import pytest
from click.testing import CliRunner
from pydantic import ValidationError
from cli import deploy
from models import DeployConfig

VALID = {
    "name": "API-Gateway",
    "replicas": 3,
    "resources": {"cpu": 4, "memory_mb": 512},
    "env": {"LOG_LEVEL": "info"},
    "canary_percent": 25,
}

def test_model_lowercases_name():
    assert DeployConfig.model_validate(VALID).name == "api-gateway"

def test_cross_field_rule():
    with pytest.raises(ValidationError) as exc:
        DeployConfig.model_validate({**VALID, "replicas": 1, "canary_percent": 10})
    assert "at least 2 replicas" in str(exc.value)

def test_cli_valid_inline():
    result = CliRunner().invoke(deploy, ["--config", json.dumps(VALID)])
    assert result.exit_code == 0
    assert "deploying api-gateway x3" in result.output

def test_cli_invalid_json():
    result = CliRunner().invoke(deploy, ["--config", "{not json}"])
    assert result.exit_code == 2
    assert "invalid JSON" in result.output

def test_cli_validation_error():
    bad = json.dumps({**VALID, "resources": {"cpu": 0, "memory_mb": 1}})
    result = CliRunner().invoke(deploy, ["--config", bad])
    assert result.exit_code == 2
    assert "resources.cpu" in result.output

def test_cli_from_file(tmp_path):
    p = tmp_path / "cfg.json"
    p.write_text(json.dumps(VALID), encoding="utf-8")
    result = CliRunner().invoke(deploy, ["--config", f"@{p}"])
    assert result.exit_code == 0

def test_cli_from_stdin():
    result = CliRunner().invoke(deploy, ["--config", "@-"], input=json.dumps(VALID))
    assert result.exit_code == 0

Run it with pytest -q. Note how test_cli_validation_error asserts both the exit code and the field path resources.cpu — that nested location string is what makes the error actually useful, and it comes straight from Pydantic's err["loc"].

Production notes

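  • Size and depth limits. json.loads puts no cap on payload size, and pathologically deep nesting makes its recursive parser raise RecursionError rather than JSONDecodeError, so convert() as written would let that escape as a traceback. If the JSON comes from an untrusted source, cap the payload size before parsing (len(raw) < MAX) and report RecursionError through self.fail() as well.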
  • extra fields. By default Pydantic ignores unknown keys. Set model_config = ConfigDict(extra="forbid") if a typo'd key like "replcas" should be a hard error rather than silently dropped — usually the right call for a CLI.
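  • @ collisions. A valid JSON document never begins with @, so the prefix is unambiguous for this type. If you reuse the pattern for a type whose literal values could legitimately begin with @, document the convention and offer an explicit --config-file option as an alternative, much as curl pairs -d @file with --data-raw for payloads where @ must stay literal.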
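  • Cross-platform stdin. @- works on Windows too, but cmd.exe and PowerShell handle pipes differently, and sys.stdin.read() decodes with the platform's default encoding whereas the @file path always reads UTF-8. Prefer @file.json in cross-platform docs and CI.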
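For the untrusted-input case, a guard along these lines is one way to bound both size and nesting before the data reaches Pydantic. It is a sketch, not part of the CLI above: load_bounded and MAX_BYTES are hypothetical names, and the 64 KiB cap is an arbitrary example value.

```python
import json

MAX_BYTES = 64 * 1024  # hypothetical cap; tune it for your tool


def load_bounded(raw: str, max_bytes: int = MAX_BYTES):
    """Parse JSON, rejecting oversized or pathologically nested input."""
    size = len(raw.encode("utf-8"))
    if size > max_bytes:
        raise ValueError(f"payload is {size} bytes; limit is {max_bytes}")
    try:
        return json.loads(raw)
    except RecursionError:
        # Deep nesting exhausts the parser's recursion budget.
        raise ValueError("JSON is nested too deeply") from None


print(load_bounded('{"resources": {"cpu": 4}}'))
```

Because json.JSONDecodeError is itself a subclass of ValueError, a caller such as convert() could catch ValueError around this helper and route every failure through self.fail().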