Once your CLI has more than a handful of teams depending on it, every new feature request becomes a merge into the core. A plugin architecture breaks that bottleneck: third parties (or other teams in your org) ship their own commands as separate, independently versioned packages that snap into your tool at runtime. This article shows how to build one with Python's native entry-points mechanism, a typing.Protocol contract, and a loader that isolates failures so a single broken plugin can't take down the whole CLI.
TL;DR
- Declare plugins as entry points in each plugin package's pyproject.toml under [project.entry-points."<your.group>"].
- Discover them at runtime with importlib.metadata.entry_points(group="<your.group>"): no scanning, no import-by-string-name hacks.
- Define the contract with a typing.Protocol so plugins depend on an interface, not your internals.
- Wrap each plugin's load and registration in try/except, and gate on a declared api_version. One bad plugin should log and skip, never crash.
Why plugins, and what "stable core API" means
The point of a plugin system is independent shipping. Your core CLI exposes a small, frozen surface — a way to register commands and maybe a shared context object — and everything else is built against it. A data-science team ships mycli-plugin-pandas; the SRE team ships mycli-plugin-deploy. Neither needs a PR into your repo, neither blocks the other's release, and your core stays small.
That only works if the contract between core and plugin is genuinely stable. If plugins reach into your internal modules, every refactor breaks the ecosystem. So the core API is whatever you promise not to break: the entry-point group name, the protocol methods plugins implement, and the type of any context you pass in. Treat it like a public API with semantic versioning.
If you haven't yet structured the core CLI itself, read Structuring multi-command Python CLIs first — the plugin layer sits on top of a clean command tree.
The entry-points mechanism in pyproject.toml
Entry points are metadata that a package advertises at install time. Python's packaging tooling records them, and any other process can query them by group. A plugin package declares its plugin like this:
# pyproject.toml of a plugin package, e.g. "mycli-plugin-greet"
[project]
name = "mycli-plugin-greet"
version = "0.2.0"
dependencies = ["mycli-core>=1.4,<2"] # depend on the stable core API
# The group name is YOUR namespace — pick something unique to your CLI.
[project.entry-points."mycli.plugins"]
greet = "mycli_plugin_greet:GreetPlugin"
The group "mycli.plugins" is the rendezvous point: your core CLI will look it up by exactly this string. The key (greet) is the plugin's advertised name; the value (mycli_plugin_greet:GreetPlugin) is an import.path:object reference. The :object part can point at a class, a factory function, or anything callable/loadable. Note the dependencies line — the plugin pins the core package's version range, which is how compatibility gets enforced at install time before your runtime checks even run. This is the same [project.entry-points] table that powers console scripts; for the mechanics of that side, see Best practices for Python CLI entry points.
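To see how the import.path:object value is split, the standard library's EntryPoint class can be constructed by hand. This is only a sketch for inspection: in real use these objects are built from installed package metadata, and the .module and .attr properties assume Python 3.9 or newer.

```python
from importlib.metadata import EntryPoint

# Hand-built entry point mirroring the pyproject.toml declaration above.
# Normally importlib.metadata builds these from installed package metadata.
ep = EntryPoint(
    name="greet",
    value="mycli_plugin_greet:GreetPlugin",
    group="mycli.plugins",
)

print(ep.group)   # the rendezvous string the core CLI queries
print(ep.name)    # the plugin's advertised name
print(ep.module)  # import path: the part before the colon
print(ep.attr)    # object inside that module: the part after the colon
```

Nothing is imported here; the module is only touched when ep.load() is called.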
Discovering plugins at runtime
On the core side, discovery is a single standard-library call. Since Python 3.10, entry_points() accepts a group= keyword and returns a filtered EntryPoints collection:
from importlib.metadata import entry_points

def discover(group: str = "mycli.plugins"):
    for ep in entry_points(group=group):
        plugin_cls = ep.load()  # imports the module and resolves the object
        yield ep.name, plugin_cls()
ep.load() performs the import lazily: nothing in a plugin package is imported until you ask for it. That keeps startup fast and means a broken module or a missing dependency only fails when you actually try to load it (which is where error isolation comes in). The call shape is exactly what we validated:
>>> from importlib.metadata import entry_points
>>> eps = entry_points(group="mycli.plugins")
>>> type(eps).__name__
'EntryPoints'
>>> [ep.name for ep in eps] # empty until plugin packages are installed
[]
Compatibility note: on Python 3.9 the group= keyword does not exist; you call entry_points() with no args and index the returned dict with eps.get("mycli.plugins", []). If you support 3.9, branch on sys.version_info. From 3.10 onward the keyword form above is canonical.
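A minimal sketch of that version branch follows. The helper name plugin_entry_points is hypothetical, and the group name used in the demo line is deliberately one that no package registers.

```python
import sys
from importlib.metadata import entry_points


def plugin_entry_points(group: str = "mycli.plugins") -> list:
    """Return the entry points registered under `group` on Python 3.9 and 3.10+."""
    if sys.version_info >= (3, 10):
        # Keyword form: returns an already-filtered EntryPoints collection.
        return list(entry_points(group=group))
    # Python 3.9: entry_points() returns a dict mapping group name -> entry points.
    return list(entry_points().get(group, []))


# A group that no installed package registers yields an empty list.
print(plugin_entry_points("mycli.plugins.no-such-group"))
```

Returning a plain list in both branches gives callers one shape to iterate over regardless of interpreter version.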
Defining the plugin contract with Protocol
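Rather than a base class plugins must subclass, use a typing.Protocol. Structural typing means a plugin only has to have the right shape. It never imports your class, which keeps coupling minimal and makes plugins testable in isolation. Mark it @runtime_checkable so the loader can verify the shape with isinstance. That runtime check only confirms the members exist; it does not validate their types or the signature of register, so a static type checker remains the stronger guarantee.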
from typing import Protocol, runtime_checkable

import typer

API_VERSION = 1

@runtime_checkable
class CLIPlugin(Protocol):
    name: str
    api_version: int

    def register(self, app: typer.Typer) -> None: ...
Three things make up the stable contract here: an identifying name, a declared api_version the loader can gate on, and a single register(app) method that receives the Typer (or Click) application and wires in commands. Plugins never touch your internals — they only call app.command(). This is also where the Typer-vs-Click choice matters: Typer's decorator-based registration is ergonomic for plugin authors, while Click gives you add_command/Group objects that are easier to compose programmatically. See Typer vs Click: when to use each for the trade-off.
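What the isinstance check does and does not catch can be illustrated with a few stand-in classes. This is a sketch under assumptions: the three example classes are hypothetical, and Any replaces typer.Typer so the snippet runs without Typer installed.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class CLIPlugin(Protocol):
    name: str
    api_version: int

    # `Any` stands in for typer.Typer so the sketch needs no third-party install.
    def register(self, app: Any) -> None: ...


class Conforming:
    name = "conforming"
    api_version = 1

    def register(self, app: Any) -> None:
        pass


class MissingRegister:  # has the data members but no register()
    name = "incomplete"
    api_version = 1


class WrongSignature:  # register() exists but takes no `app` argument
    name = "wrong-signature"
    api_version = 1

    def register(self) -> None:
        pass


print(isinstance(Conforming(), CLIPlugin))       # True: all three members present
print(isinstance(MissingRegister(), CLIPlugin))  # False: register is missing
print(isinstance(WrongSignature(), CLIPlugin))   # True: signatures are not checked
```

The last case is why registration still needs its own try/except: a plugin can pass the structural check and still fail when register(app) is actually called.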
Loading and registering into a Typer app — with isolation
Here is the complete, validated loader. It checks the protocol, gates on version, and isolates registration failures. In production the candidate list comes from discover() above; in this self-contained example we hand it the objects directly so it runs with no installed packages.
from __future__ import annotations

from typing import Protocol, runtime_checkable

import typer

@runtime_checkable
class CLIPlugin(Protocol):
    name: str
    api_version: int

    def register(self, app: typer.Typer) -> None: ...

API_VERSION = 1

class GreetPlugin:
    name = "greet"
    api_version = 1

    def register(self, app: typer.Typer) -> None:
        @app.command()
        def greet(who: str = "world") -> None:
            print(f"hello, {who}")

class StatsPlugin:
    name = "stats"
    api_version = 1

    def register(self, app: typer.Typer) -> None:
        @app.command()
        def stats(n: int) -> None:
            print(f"sum 0..{n} = {sum(range(n + 1))}")

class BrokenPlugin:  # raises during register
    name = "broken"
    api_version = 1

    def register(self, app: typer.Typer) -> None:
        raise RuntimeError("boom during register")

class StalePlugin:  # built for an incompatible API version
    name = "stale"
    api_version = 99

    def register(self, app: typer.Typer) -> None:
        @app.command()
        def stale() -> None:
            print("should never load")

# In production: DISCOVERED = [obj for _, obj in discover("mycli.plugins")]
DISCOVERED = [GreetPlugin(), StatsPlugin(), BrokenPlugin(), StalePlugin()]

def load_plugins(app: typer.Typer, candidates) -> list[str]:
    loaded: list[str] = []
    for plugin in candidates:
        if not isinstance(plugin, CLIPlugin):  # contract check
            print(f"[skip] {plugin!r}: does not satisfy CLIPlugin protocol")
            continue
        if plugin.api_version != API_VERSION:  # version/compat gate
            print(f"[skip] {plugin.name}: api_version {plugin.api_version} != {API_VERSION}")
            continue
        try:  # error isolation
            plugin.register(app)
        except Exception as exc:  # noqa: BLE001
            print(f"[error] {plugin.name}: failed to register: {exc}")
            continue
        loaded.append(plugin.name)
        print(f"[ok] loaded plugin: {plugin.name}")
    return loaded

if __name__ == "__main__":
    app = typer.Typer()
    loaded = load_plugins(app, DISCOVERED)
    print("ACTIVE PLUGINS:", loaded)
    print("COMMANDS:", sorted(c.callback.__name__ for c in app.registered_commands))
Running this against Typer 0.26 / Click 8.4 on Python 3.14 produces:
[ok] loaded plugin: greet
[ok] loaded plugin: stats
[error] broken: failed to register: boom during register
[skip] stale: api_version 99 != 1
ACTIVE PLUGINS: ['greet', 'stats']
COMMANDS: ['greet', 'stats']
The two good plugins load, the broken one is caught and logged, the version-mismatched one is skipped — and the CLI keeps running with a coherent command set. For Click, the shape is identical: swap the parameter type to click.Group and have register call app.add_command(some_command).
Version, compatibility, and error isolation
Two failure modes dominate plugin systems, and both are visible above.
Compatibility drift. When you change the core API, old plugins built against the previous contract may call methods that no longer exist or pass the wrong context shape. The defenses layer up: the plugin's pyproject.toml pins mycli-core>=1.4,<2, so a major bump won't even install together; and the runtime api_version gate refuses anything that slips through. Bump API_VERSION only on breaking changes, and treat it like the major component of a semver contract.
Crash propagation. The cardinal rule: one bad plugin must never crash the CLI. Discovery (ep.load()) and registration (plugin.register(app)) are the two points where arbitrary third-party code runs, so both belong inside try/except. Catching broad Exception is the correct call here even though linters flag it — you genuinely cannot predict what a third-party plugin will raise, and the goal is graceful degradation. Log the failure (with a real logger and a --debug-gated traceback in production rather than a bare print), skip the plugin, and carry on.
A robust loader wraps discovery the same way:
import logging
from importlib.metadata import entry_points

log = logging.getLogger(__name__)

def discover(group="mycli.plugins"):
    for ep in entry_points(group=group):
        try:
            plugin = ep.load()()  # import the module, then instantiate the plugin
        except Exception as exc:  # bad import, missing dep, syntax error
            log.warning("plugin %s failed to load: %s", ep.name, exc)
            continue
        yield ep.name, plugin
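The isolation behaviour can be exercised without installing any plugin by feeding the loader hand-built entry points. The sketch below is illustrative only: load_isolated, the debug flag, and the module name mycli_no_such_module are hypothetical, and collections:OrderedDict merely stands in for a loadable plugin class.

```python
import logging
from importlib.metadata import EntryPoint

log = logging.getLogger("mycli.plugins")


def load_isolated(eps, debug: bool = False):
    """Load each entry point, skipping (and logging) any that fail."""
    loaded = []
    for ep in eps:
        try:
            obj = ep.load()()  # import the module, then call the resolved object
        except Exception as exc:  # third-party code can raise anything
            # exc_info=True attaches the full traceback; keep it behind --debug.
            log.warning("plugin %s failed to load: %s", ep.name, exc, exc_info=debug)
            continue
        loaded.append((ep.name, obj))
    return loaded


# Hand-built entry points stand in for installed plugins (illustration only).
fake_eps = [
    EntryPoint(name="ok", value="collections:OrderedDict", group="mycli.plugins"),
    EntryPoint(name="broken", value="mycli_no_such_module:Plugin", group="mycli.plugins"),
]
result = load_isolated(fake_eps)
print([name for name, _ in result])  # only the loadable entry point survives
```

Passing debug=True would attach the traceback to the warning, which matches the idea of gating verbose diagnostics behind a --debug flag.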
Security considerations
Entry-point plugins are arbitrary code that runs in your process with your user's privileges. There is no sandbox. Installing a plugin is exactly as dangerous as pip install of any package — the moment ep.load() imports the module, its top-level code executes. Treat the plugin ecosystem as a supply-chain surface:
- Don't auto-install plugins. Let users opt in explicitly, and surface which plugins are active (a mycli plugins list command that prints names, versions, and distribution origins builds trust).
- Pin and audit. Plugins are dependencies; lock them and review them like any other. A typosquatted lookalike name (a misspelling of mycli-plugin-greet, for instance) is a real attack vector. Note that PyPI treats hyphens and underscores in project names as equivalent, so mycli-plugin-greet and mycli_plugin_greet resolve to the same project.
- Be wary of confused-deputy escalation: if your CLI runs with elevated rights (a deploy tool, say), every loaded plugin inherits them. Document this loudly.
- A version gate is a compatibility control, not a security one: it stops accidental breakage, not malicious code. Don't conflate the two.
For most internal tools the right posture is: plugins are trusted because you control the install, and the loader's job is robustness (isolation, versioning), not defense against hostile code. If you ever need to load genuinely untrusted plugins, an in-process entry-point system is the wrong tool — reach for subprocess isolation or a real sandbox.
Related
- Modern Python CLI Frameworks & Architecture: the pillar this fits into.
- Structuring multi-command Python CLIs: the command tree your plugins extend.
- Best practices for Python CLI entry points: the [project.entry-points] mechanics in depth.
- Typer vs Click: when to use each (choosing the framework your plugins register against).