Every time your CLI exits, it hands the shell a single integer. That number is the only thing &&, ||, set -e, and CI gates ever look at — the text you printed is invisible to them. This guide covers which numbers to use: the 0/1/2 baseline, the sysexits.h set and when it earns its keep, the reserved range above 125, and how to model your codes as one IntEnum you can document and test.
TL;DR
- 0 is success. Everything non-zero is failure. Never exit 0 on an error path.
- 1 is a general error; 2 is a usage error (bad flags/args). argparse and Click already use 2 for this, so match them.
- The sysexits.h codes (EX_USAGE 64, EX_DATAERR 65, …) give richer semantics. Adopt them only if your callers actually branch on them; otherwise they add noise.
- Values 126, 127, 128, and 128+N are reserved by the shell (not executable, not found, killed by signal N). Stay out of that range.
- Define your codes once as an IntEnum, document them in --help or the man page, and assert them in tests with CliRunner or subprocess.
Start with 0, 1, and 2
The bedrock convention is older than Python and universally understood:
- 0: success. The command did what was asked.
- 1: a general, catch-all failure. Something went wrong and you have nothing more specific to say.
- 2: a usage error, such as an unknown option, a missing required argument, or a malformed value. The user needs to fix the command line, not the world.
That 2-means-usage split is not arbitrary. Both argparse and Click exit 2 when parsing fails, so if you invent your own error handling you should keep 2 reserved for the same meaning. Otherwise a script that treats 2 as "retry with different args" gets confused when your tool returns 2 for a network error.
$ mytool deploy --nonsuch
Usage: mytool deploy [OPTIONS]
Try 'mytool deploy --help' for help.
Error: No such option: --nonsuch
$ echo $?
2
For a great many tools, 0/1/2 is the entire vocabulary you need. Reach for more only when a caller genuinely needs to distinguish failure modes programmatically.
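The claim that argparse exits 2 on a parse failure is easy to check for yourself. The sketch below uses a hypothetical parser with one required `--target` option (both the program name and the option are made up for illustration) and captures the `SystemExit` that argparse raises instead of letting it end the process.

```python
import argparse
import contextlib
import io


def parse_exit_code(argv: list[str]) -> int:
    """Return the exit code argparse would use for argv (0 if parsing succeeds)."""
    parser = argparse.ArgumentParser(prog="mytool")  # hypothetical tool name
    parser.add_argument("--target", required=True)   # hypothetical option
    try:
        # On bad input argparse writes usage text to stderr and raises SystemExit(2).
        with contextlib.redirect_stderr(io.StringIO()):
            parser.parse_args(argv)
    except SystemExit as exc:
        return int(exc.code)
    return 0


print(parse_exit_code(["--target", "prod"]))  # 0
print(parse_exit_code(["--nonsuch"]))         # 2
```

If you replace argparse's error handling with your own, a check like this is a cheap way to confirm you have not changed what 2 means.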
When distinct failure modes earn distinct codes
Sometimes "it failed" is not enough. A backup tool might want CI to retry on a transient network failure but hard-stop on corrupted data. That is a real reason to hand out different numbers:
raise SystemExit(3) # network unreachable — safe to retry
raise SystemExit(4) # data integrity check failed — do NOT retry
Now a wrapper script can branch:
mytool backup
case $? in
  0) echo "ok" ;;
  3) echo "transient — retrying"; retry ;;
  4) echo "corruption — paging oncall"; page ;;
  *) echo "unknown failure"; exit 1 ;;
esac
The test for whether a custom code is worth it is simple: will a caller ever behave differently because of it? If yes, define it. If the only consumer is a human reading the message, a plain 1 with good error text is enough — see friendly error messages and tracebacks for making that text actionable.
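The same branching can live in a Python wrapper instead of a shell script. This is a minimal sketch, assuming the hypothetical scheme from the example above where 3 means "retryable"; `run_with_retry` and `RETRYABLE` are illustrative names, and the child process is a stand-in that exits with a fixed code.

```python
import subprocess
import sys

RETRYABLE = {3}  # assumption: 3 = network unreachable, as in the example above


def run_with_retry(cmd: list[str], attempts: int = 3) -> int:
    """Run cmd, retrying only while it exits with a code marked retryable."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    code = 0
    for _ in range(attempts):
        code = subprocess.run(cmd).returncode
        if code not in RETRYABLE:
            break  # success or a hard failure: stop either way
    return code


# Stand-in child process that always reports "data integrity check failed".
fake_corruption = [sys.executable, "-c", "import sys; sys.exit(4)"]
print(run_with_retry(fake_corruption))  # 4, after a single attempt
```

A real wrapper would probably add a delay between attempts; the point here is only that the caller's behaviour depends on the number, which is what justifies defining it.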
The sysexits.h codes
BSD's <sysexits.h> defines a standard set of codes in the 64–78 range, meant to give failures a shared vocabulary across tools:
| Code | Name | Meaning |
|---|---|---|
| 64 | EX_USAGE | Command used incorrectly (bad args) |
| 65 | EX_DATAERR | Input data was incorrect |
| 66 | EX_NOINPUT | An input file did not exist or was unreadable |
| 69 | EX_UNAVAILABLE | A required service is unavailable |
| 70 | EX_SOFTWARE | Internal software error |
| 73 | EX_CANTCREAT | Cannot create an output file |
| 74 | EX_IOERR | An I/O error occurred |
| 77 | EX_NOPERM | Permission denied |
| 78 | EX_CONFIG | Something is misconfigured |
Python exposes these as os.EX_USAGE, os.EX_DATAERR, and so on, but only on Unix. If your tool also has to run on Windows, define the ones you use yourself:
EX_USAGE = 64
EX_DATAERR = 65
EX_NOINPUT = 66
EX_CONFIG = 78
When to bother: adopt sysexits.h if your tool lives in an ecosystem that already reads them — mail delivery agents, some init systems, tools invoked by xargs pipelines that inspect specific codes. For a typical developer CLI, most callers only distinguish 0 from non-zero, and the 64–78 numbers are more obscure than a documented 1/2/3 scheme of your own. Consistency and documentation beat conformance to a table nobody reads. Pick one convention and hold it across every subcommand.
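If you do adopt the sysexits.h numbers, one way to stay portable is to prefer the `os.EX_*` constants where the platform provides them and fall back to the literal values elsewhere. A small sketch, using only the four codes listed above:

```python
import os

# os.EX_* are available on Unix only, so fall back to the sysexits.h values elsewhere.
EX_USAGE = getattr(os, "EX_USAGE", 64)
EX_DATAERR = getattr(os, "EX_DATAERR", 65)
EX_NOINPUT = getattr(os, "EX_NOINPUT", 66)
EX_CONFIG = getattr(os, "EX_CONFIG", 78)

print(EX_USAGE, EX_DATAERR, EX_NOINPUT, EX_CONFIG)  # 64 65 66 78
```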
Reserved values above 125
Some codes are not yours to assign — the shell claims them, and reusing them creates ambiguity:
- 126: the command was found but is not executable (permission problem).
- 127: command not found.
- 128: often documented as "invalid argument to exit", although shells differ on what they actually return in that case. Avoid it regardless, since it is the base of the signal range.
- 128 + N: the process was killed by signal N. So 130 = 128 + 2 (SIGINT, a Ctrl-C), 137 = 128 + 9 (SIGKILL), 143 = 128 + 15 (SIGTERM).
- Above 255: impossible. Exit status is 8 bits, so codes wrap modulo 256; exit(256) reports as 0 and exit(257) as 1. A code over 255 is a silent bug.
The practical rule: keep your own meaningful codes in the 1–125 range. If a caller sees 130, they should be able to conclude "someone hit Ctrl-C," not "the backup's canary check failed." Matching Ctrl-C to 130 is a nicety worth implementing:
try:
    main()  # placeholder for your CLI's real entry point
except KeyboardInterrupt:
    raise SystemExit(130)  # 128 + 2 (SIGINT)
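To make the reserved range concrete, here is a rough sketch of how a caller might interpret a status reported by the shell. `describe_status` is a hypothetical helper, and the result is a heuristic: a program can also return 130 on its own, so the description is a best guess based on convention.

```python
import signal


def describe_status(code: int) -> str:
    """Describe a shell-reported exit status using the conventions above."""
    if code == 126:
        return "found but not executable"
    if code == 127:
        return "command not found"
    if 128 < code <= 255:
        number = code - 128
        try:
            name = signal.Signals(number).name
        except ValueError:  # no signal with that number on this platform
            name = f"signal {number}"
        return f"killed by {name}"
    return "success" if code == 0 else f"application error {code}"


for status in (0, 3, 127, 130, 143):
    print(status, describe_status(status))
```

Because everything from 126 upward already has a meaning here, an application code in that range would be misreported by any caller that applies the same conventions.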
One IntEnum as the single source of truth
Scattering magic numbers like raise SystemExit(4) across a codebase is how a 4 comes to mean two different things in two commands. Centralize them:
from enum import IntEnum


class ExitCode(IntEnum):
    OK = 0
    ERROR = 1    # general failure
    USAGE = 2    # bad invocation
    NETWORK = 3  # transient, retryable
    DATA = 4     # corrupt input, do not retry
    CONFIG = 5   # misconfiguration

    # Before Python 3.11, str() of an IntEnum member gave "ExitCode.CONFIG";
    # this override makes str() and f-strings show the number on every version.
    def __str__(self) -> str:
        return str(self.value)
Because IntEnum is an int, you can hand it straight to sys.exit or return it from main:
import sys


def main() -> ExitCode:
    # config_ok, reachable, and do_work stand in for your application's own functions.
    if not config_ok():
        print("error: config invalid; see 'mytool config --check'", file=sys.stderr)
        return ExitCode.CONFIG
    if not reachable():
        print("error: registry unreachable", file=sys.stderr)
        return ExitCode.NETWORK
    do_work()
    return ExitCode.OK


if __name__ == "__main__":
    sys.exit(main())  # IntEnum → int, cleanly
The enum becomes the one place you look to answer "what does 5 mean?" — and the names make the call sites self-documenting. This pairs naturally with the top-level error boundary described in the error handling and exit codes overview, where each caught exception maps to one ExitCode.
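One possible shape for that boundary is a lookup from exception type to code. This is a sketch under assumptions, not the approach from the overview: `ConfigError`, `EXIT_FOR`, and `run_guarded` are hypothetical names, and the enum is a trimmed copy of the one above.

```python
import sys
from enum import IntEnum


class ExitCode(IntEnum):  # trimmed copy of the enum above
    OK = 0
    ERROR = 1
    NETWORK = 3
    CONFIG = 5


class ConfigError(Exception):
    """Hypothetical: raised when configuration is invalid."""


# Checked in order, so list more specific exception types first.
EXIT_FOR = (
    (ConfigError, ExitCode.CONFIG),
    (ConnectionError, ExitCode.NETWORK),
)


def run_guarded(func) -> ExitCode:
    """Call func and translate any exception into exactly one ExitCode."""
    try:
        func()
    except Exception as exc:
        print(f"error: {exc}", file=sys.stderr)
        for exc_type, code in EXIT_FOR:
            if isinstance(exc, exc_type):
                return code
        return ExitCode.ERROR  # anything unmapped is a general failure
    return ExitCode.OK
```

The entry point then becomes `sys.exit(run_guarded(do_work))`, and adding a new failure mode means adding one enum member and one row in the table.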
Documenting your codes
An exit code nobody can look up might as well be random. Publish the table where callers will find it — in --help epilog text, a man page, or the README:
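import click

# The line containing only \b tells Click not to rewrap the paragraph that
# follows; without it the table is reflowed into a single block of text.
EPILOG = """\
\b
Exit codes:
  0  success
  1  general error
  2  usage error
  3  network error (retryable)
  4  data error (do not retry)
  5  configuration error
"""


@click.command(epilog=EPILOG)
def cli() -> None:
    ...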
Keeping the table next to the IntEnum — ideally generated from it — means the docs cannot drift from the code.
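Generating the table can be as simple as iterating over the enum. The sketch below assumes a hypothetical `DESCRIPTIONS` mapping kept next to a trimmed copy of the enum; indexing it directly means a member without a description fails loudly instead of silently vanishing from the docs.

```python
from enum import IntEnum


class ExitCode(IntEnum):  # trimmed copy of the enum above
    OK = 0
    ERROR = 1
    USAGE = 2
    NETWORK = 3


# Hypothetical descriptions, one per member.
DESCRIPTIONS = {
    ExitCode.OK: "success",
    ExitCode.ERROR: "general error",
    ExitCode.USAGE: "usage error",
    ExitCode.NETWORK: "network error (retryable)",
}


def exit_code_epilog() -> str:
    """Build the 'Exit codes' help text from the enum itself."""
    lines = ["Exit codes:"]
    for code in ExitCode:
        lines.append(f"  {code.value}  {DESCRIPTIONS[code]}")  # KeyError if undocumented
    return "\n".join(lines)


print(exit_code_epilog())
```

When passing the result to Click as an epilog, prefix it with a line containing only `\b` so the table is not rewrapped.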
Testing exit codes
An exit code is a promise; test it like one. With Click's CliRunner you get the code without spawning a process:
from click.testing import CliRunner

# Assumes ExitCode is importable from the same module as the command.
from mytool.cli import ExitCode, cli


def test_bad_flag_is_usage_error() -> None:
    result = CliRunner().invoke(cli, ["--nonsuch"])
    assert result.exit_code == 2


def test_missing_config_returns_config_code() -> None:
    result = CliRunner().invoke(cli, ["run"])
    assert result.exit_code == ExitCode.CONFIG
For an end-to-end check that includes your real entry point and sys.exit, drive it as a subprocess:
import subprocess


def test_network_failure_exit_code() -> None:
    proc = subprocess.run(
        ["mytool", "backup", "--registry", "http://127.0.0.1:1"],
        capture_output=True,
        text=True,
    )
    assert proc.returncode == 3
    assert "unreachable" in proc.stderr
Note that the subprocess test also asserts that the message went to stderr, not stdout. The right exit code and a message on the right stream are the two promises a good failure keeps. The entry-point mechanics themselves are covered in best practices for Python CLI entry points.
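Beyond testing individual commands, it can be worth testing the code table itself. The following sketch, written against a trimmed copy of the enum, checks two invariants from earlier sections: no two names share a value, and every value stays below the shell-reserved range.

```python
from enum import IntEnum, unique


@unique  # raises ValueError at class creation if two names share a value
class ExitCode(IntEnum):  # trimmed copy of the enum above
    OK = 0
    ERROR = 1
    USAGE = 2
    NETWORK = 3
    DATA = 4
    CONFIG = 5


def test_codes_stay_out_of_reserved_range() -> None:
    for code in ExitCode:
        assert 0 <= code <= 125, f"{code.name} collides with shell-reserved values"


def test_only_ok_is_zero() -> None:
    assert [code for code in ExitCode if code == 0] == [ExitCode.OK]
```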
Production notes
- Wrapping bites silently. sys.exit(256) exits 0. If codes are computed, clamp or assert they stay in 0–125.
- Click and Typer own some codes. Click exits 2 on parse errors and 1 for ClickException; Abort exits 1. Don't reassign those meanings in the same app.
- set -e and pipelines. In a | b, the shell reports b's code by default. Callers who need a's status use set -o pipefail; document that your meaningful codes may be masked mid-pipe.
- Cross-platform. Signal-derived codes (130, 143) are a Unix convention; Windows reports different values. Keep the codes you rely on in the low range for portability.
- CI gates. Most CI systems treat any non-zero as a failed step. If you use codes like 3 for "retryable," make sure the wrapper, not the raw CI step, is what interprets them.
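As an illustration of the wrapping note, consider a tool that is tempted to exit with a computed value such as a failure count. The helper below is a hypothetical sketch of the "clamp" option: it caps the value at 125 so that a count of 256 cannot wrap around and report success.

```python
def to_exit_code(value: int) -> int:
    """Map a computed value (for example, a failure count) to a safe exit code."""
    if value < 0:
        raise ValueError(f"exit code cannot be negative: {value}")
    return min(value, 125)  # stay below the shell-reserved range


print(to_exit_code(0))    # 0
print(to_exit_code(7))    # 7
print(to_exit_code(256))  # 125, where an unclamped 256 would wrap to 0 on Unix
```

Clamping loses the exact count, so print it to stderr as well if callers might need it.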