Return non-zero exit from get on query failure#2313
Merged
Conversation
osism get hostvars and osism get hosts logged an error but then
returned None when the underlying ansible-inventory subprocess
failed. cliff turns a take_action return value of None into exit
code 0 (return_code = self.take_action(parsed_args) or 0), so these
commands exited successfully even though the inventory query could
not be run at all. That makes them unsafe to use in set -e scripts
or && chains, where a failed lookup silently looks like success.
Both commands now return 1 when subprocess.CalledProcessError is
raised, i.e. when the query itself could not be executed.
A missing host is one such failure. With the pinned ansible-core
(2.18.9, release latest/base.yml), ansible-inventory --host on a
host that is not in the inventory does not return an empty result;
it refuses the query with "ERROR! You must pass a single valid host
to --host parameter" and exits 5. osism get hostvars for an unknown
host therefore now exits 1, which matches its existing "Host not
found in inventory." error message.
Empty results are deliberately left as exit 0: a host that exists
but has no host vars (ansible-inventory prints {} and exits 0), a
requested variable or fact that is absent from an otherwise
successfully retrieved data set, or an inventory that lists no
matching hosts. These are valid answers to a query that did run.
This keeps the behaviour consistent with the convention used
elsewhere in the CLI, where a non-zero status signals an operational
failure rather than an empty result set.
Adds unit tests for both commands covering the failure path (a
raised CalledProcessError yields exit code 1) and the empty-result
path (a successful query with nothing to show yields exit code 0).
AI-assisted: Claude Code
Signed-off-by: Roger Luethi <luethi@osism.tech>
Contributor
Author
|
Follow-up: #2314 catalogs other |
berendt
pushed a commit
that referenced
this pull request
Jun 9, 2026
Several commands wait for a Celery task's output, hit TimeoutError, log the timeout, and then fall through to the end of take_action with an implicit None return. cliff turns that into exit code 0 (return_code = self.take_action(parsed_args) or 0), so a command that gave up waiting for its task still reports success. That makes the affected commands unsafe in set -e scripts and && chains, where a timed-out sync silently looks like it completed. This is the same class of bug PR #2313 fixed for osism get, but on the timeout path rather than the inventory-query path. The affected sites: - reconciler.Sync - validate.Run._handle_task (the value take_action returns) - netbox.Ironic - netbox.Sync Each now returns 1 from the TimeoutError handler, so a timeout yields a non-zero exit. Adds unit tests asserting exit 1 when fetch_task_output raises TimeoutError, mirroring PR #2313's failure-path tests. Related-Bug: #2314 AI-assisted: Claude Code Signed-off-by: Roger Luethi <luethi@osism.tech>
berendt
pushed a commit
that referenced
this pull request
Jun 9, 2026
osism report memory, lldp, bgp and status each load the Ansible
inventory by running ansible-inventory via subprocess.run. On a
non-zero return code ("Error loading inventory.") and on
subprocess.TimeoutExpired ("Timeout loading inventory.") they logged
the error and then bare-returned None. cliff turns that into exit
code 0 (return_code = self.take_action(parsed_args) or 0), so a report
whose inventory could not be loaded at all still exited successfully.
That makes these commands unsafe in set -e scripts and && chains.
These are the direct twins of the bug PR #2313 fixed for osism get.
All eight inventory-load failure paths now return 1.
The neighbouring "No hosts found in inventory." branch is deliberately
left at exit 0: the query ran fine and simply returned no hosts, which
is a valid empty result rather than an operational failure, matching
the convention established in PR #2313.
Adds unit tests for all four commands covering the failure path (a
non-zero ansible-inventory return code yields exit 1), the timeout
path (TimeoutExpired yields exit 1) and the empty-result path (an empty
inventory yields exit 0).
Part of issue #2314 (Category B).
Related-Bug: #2314
AI-assisted: Claude Code
Signed-off-by: Roger Luethi <luethi@osism.tech>
berendt
pushed a commit
that referenced
this pull request
Jun 9, 2026
Several commands look up a resource, log an error or warning when it cannot be found, and then bare-return None. cliff turns that into exit code 0 (return_code = self.take_action(parsed_args) or 0), so a failed lookup silently looks like success and the commands are unsafe in set -e scripts and && chains. All the affected not-found / unresolved-lookup paths now return 1: - compute.ComputeMigrationList: no user domain, no user, no project domain, no project, multiple servers matched, no server found. - server.ServerList: no user domain, no user, domain not found, project domain not found, project not found. - volume.VolumeList: domain not found, project domain not found, project not found. - netbox.Console / netbox.Dump: NetBox integration not configured (the operation cannot run without it), and Dump's device not found. - baremetal Deploy, Undeploy, BurnIn, Clean, Provide, MaintenanceSet, MaintenanceUnset, PowerOn, PowerOff and Delete: the single named node could not be found (conn.baremetal.find_node returned nothing). Adds unit tests for each command driving the not-found path and asserting exit 1, mirroring PR #2313. Part of issue #2314 (Category C and Category D not-found lookups). Related-Bug: #2314 AI-assisted: Claude Code Signed-off-by: Roger Luethi <luethi@osism.tech>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
osism get hostvarsandosism get hostslogged an error but then returnedNonewhen the underlyingansible-inventorysubprocess failed. cliff turns atake_actionreturn value ofNoneinto exit code 0 (return_code = self.take_action(parsed_args) or 0), so these commands exited successfully even when the inventory query could not be run at all — making them unsafe inset -escripts and&&chains.Both commands now
return 1onsubprocess.CalledProcessError, i.e. when the query itself could not be executed.Missing host
With the pinned
ansible-core(2.18.9 —releaserepolatest/base.yml),ansible-inventory --host <unknown>does not return an empty result. It refuses the query withERROR! You must pass a single valid host to --host parameterand exits 5. Soosism get hostvars <unknown-host>now exits 1, matching its existingHost not found in inventory.error message.What stays exit 0 (empty results)
Empty results are deliberately left as success — a query that ran fine but had nothing to show:
ansible-inventoryprints{}, exits 0)A note on the convention
There is no documented exit-code contract for the
osismCLI — the behaviour is emergent and inconsistent across commands. This PR does not attempt to define or enforce one. It only tries to makeosism getmatch what looks like the prevailing convention elsewhere in the CLI: a non-zero exit signals an operational failure (the query could not be run), while an empty result set is treated as a valid, successful answer.Tests
Adds unit tests for both commands covering the failure path (raised
CalledProcessError→ exit 1) and the empty-result path (successful query with nothing to show → exit 0).