Skip to content

fix(installer): implement auto-fallback for port 8080 conflicts (#51)#703

Closed
hmyousuf2010 wants to merge 8 commits into
NVIDIA:mainfrom
hmyousuf2010:fix-port-8080-conflict
Closed

fix(installer): implement auto-fallback for port 8080 conflicts (#51)#703
hmyousuf2010 wants to merge 8 commits into
NVIDIA:mainfrom
hmyousuf2010:fix-port-8080-conflict

Conversation

@hmyousuf2010

@hmyousuf2010 hmyousuf2010 commented Mar 23, 2026

Copy link
Copy Markdown

Summary

This PR implements an automatic port fallback mechanism for the NemoClaw onboarder. If the default port 8080 (or the port specified by NEMOCLAW_PORT) is already in use, the installer will automatically increment to the next available port, preventing silent failures or cryptic errors during installation.

Related Issue

Fixes #51

Changes

  • Implemented findFreePort utility in bin/lib/onboard.js using Node's net module with a 100-attempt limit.
  • Updated the preflight check in bin/lib/onboard.js to automatically find an alternative port if 8080 is occupied.
  • Added support for the NEMOCLAW_PORT environment variable to allow user-defined base port overrides.
  • Updated the openshell gateway start command to use the dynamically determined port via the --port flag.
  • Adjusted the E2E test test/e2e/test-double-onboard.sh to match the new warning output format for port conflicts.
  • Updated docs/reference/commands.md to document the new environment variable and fallback behavior.

Type of Change

  • Code change with doc updates.

Testing

  • npm test passes (verified locally).
  • Verified auto-fallback and stale session recovery via test/e2e/test-double-onboard.sh.

Checklist

General

Code Changes

  • Tests updated for changed behavior.
  • No secrets, API keys, or credentials committed.
  • Doc pages updated for user-facing behavior changes (documented NEMOCLAW_PORT and auto-increment behavior).

Summary by CodeRabbit

  • New Features

    • Gateway port configurable via NEMOCLAW_PORT (default 8080) and propagated through startup
    • Automatic fallback: if chosen gateway port is occupied, the wizard picks the next available port and logs a warning
  • Bug Fixes

    • Dashboard port (18789) now fails fast with clearer messaging when unavailable
  • Documentation

    • Onboarding docs updated to describe gateway port behavior and override
  • Tests

    • End-to-end test updated to detect and accept the new fallback log message

@coderabbitai

coderabbitai Bot commented Mar 23, 2026

Copy link
Copy Markdown
Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Added runtime gateway port discovery: findFreePort() probes successive TCP ports when NEMOCLAW_PORT (default 8080) is occupied. preflight() now returns { gpu, gatewayPort }; startGateway(gpu, gatewayPort) starts OpenShell gateway with the selected port. Dashboard port (18789) remains fail-hard.

Changes

Cohort / File(s) Summary
Port discovery & onboarding logic
bin/lib/onboard.js
Added findFreePort(startPort, maxAttempts=100). preflight() reads/validates NEMOCLAW_PORT (default 8080), auto-selects the next free gateway port if occupied (logs warning), returns { gpu, gatewayPort }. startGateway signature changed to startGateway(gpu, gatewayPort) and launches openshell gateway with --port <gatewayPort>. Dashboard port (18789) kept as a fail-hard check.
Docs
docs/reference/commands.md
Documented the dynamic gateway port handling, fallback behavior, and the NEMOCLAW_PORT environment override.
E2E tests
test/e2e/test-double-onboard.sh
Updated assertions to detect the new "Port 8080 in use, trying" log line, added explicit failure when the old "Port 8080 is not available" appears, and adjusted success/failure messaging. Minor header comment added.
CI/status report
commit_output.txt
Added lint/format check output capturing multiple repository checks; notes an ESLint failure due to missing @eslint/js module.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant User as User/CLI
  participant Onboard as Onboard script
  participant OS as OS/Network
  participant Gateway as OpenShell Gateway

  rect rgba(200,230,201,0.5)
    User->>Onboard: run `nemoclaw onboard`
    Onboard->>Onboard: call preflight()
    Onboard->>OS: check NEMOCLAW_PORT (default 8080)
    alt port free
      OS-->>Onboard: port available
    else port in use
      OS-->>Onboard: port occupied
      Onboard->>Onboard: call findFreePort(start+1)
      Onboard->>OS: probe successive ports
      OS-->>Onboard: returns free port
      Onboard-->>User: log warning about fallback
    end
    Onboard->>Gateway: start gateway with chosen port (--port <gatewayPort>)
    Gateway-->>Onboard: gateway started
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 I sniffed for ports both near and far,
8080 closed? I hopped to the next star.
I poked each socket until a gap I found,
Gave a tiny warning, then spun code around.
Gateway up — happy hops and a fresh start. ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 44.44% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title directly describes the main change: implementing auto-fallback for port 8080 conflicts, which is the primary objective of the changeset.
Linked Issues check ✅ Passed The PR successfully addresses issue #51 by implementing automatic port validation and fallback: detecting when port 8080 is occupied, auto-selecting the next free port (with 100-attempt limit), and supporting NEMOCLAW_PORT environment variable override.
Out of Scope Changes check ✅ Passed All changes are directly related to the port conflict resolution objective: core logic in onboard.js, documentation updates, E2E test adjustments, and a lint report. The commit_output.txt appears to be an artifact but contains no functional code changes.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
docs/reference/commands.md (1)

55-55: Multiple sentences on one line.

The coding guidelines require one sentence per line in source files to improve diff readability. This line contains two sentences.

Proposed fix
-It also checks if the default gateway port (8080) is available, and automatically increments to the next free port if it is in use. You can explicitly override the target port by setting the `NEMOCLAW_PORT` environment variable.
+It also checks if the default gateway port (8080) is available, and automatically increments to the next free port if it is in use.
+You can explicitly override the target port by setting the `NEMOCLAW_PORT` environment variable.

As per coding guidelines: "One sentence per line in source (makes diffs readable). Flag paragraphs where multiple sentences appear on the same line."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/reference/commands.md` at line 55, This line contains two sentences;
split it into two separate lines so each sentence is on its own line: one line
stating "It also checks if the default gateway port (8080) is available, and
automatically increments to the next free port if it is in use." and the
following line stating "You can explicitly override the target port by setting
the `NEMOCLAW_PORT` environment variable." Update the
`docs/reference/commands.md` entry that currently contains both sentences
together (the sentence mentioning default gateway port 8080 and the sentence
mentioning NEMOCLAW_PORT) to follow the one-sentence-per-line guideline.
bin/lib/onboard.js (1)

15-27: Potential unhandled promise rejection and misleading error message.

Two issues with findFreePort:

  1. Error message is misleading: When maxAttempts <= 0, the error message says "after 100 attempts starting from ${startPort - 100}" but this assumes the function always starts with 100 attempts. If called with a different maxAttempts value, this message would be incorrect.

  2. Recursive promise not chained properly: On line 25, the server.on('error') handler resolves with findFreePort(...) which returns a Promise. This works but if the recursive call rejects, that rejection could go unhandled depending on how the outer promise is consumed.

Proposed fix
 function findFreePort(startPort, maxAttempts = 100) {
+  const originalStart = arguments.length > 2 ? arguments[2] : startPort;
   return new Promise((resolve, reject) => {
     if (maxAttempts <= 0) {
-      return reject(new Error(`Could not find a free port after 100 attempts starting from ${startPort - 100}`));
+      return reject(new Error(`Could not find a free port after 100 attempts starting from ${originalStart}`));
     }
     const server = net.createServer();
     server.listen(startPort, () => {
       const port = server.address().port;
       server.close(() => resolve(port));
     });
-    server.on('error', () => resolve(findFreePort(startPort + 1, maxAttempts - 1)));
+    server.on('error', () => {
+      findFreePort(startPort + 1, maxAttempts - 1, originalStart).then(resolve, reject);
+    });
   });
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 15 - 27, The findFreePort function has a
misleading error message and an unchained recursive Promise on
server.on('error'); update the exhaustion branch to reject with a message that
uses the actual maxAttempts and startPort values (e.g., `new Error(\`Could not
find a free port after ${maxAttempts} attempts starting from ${startPort}\`)`)
and change the server.on('error') handler to properly chain the recursive call
by invoking findFreePort(startPort + 1, maxAttempts -
1).then(resolve).catch(reject) (or await/return the Promise) instead of
resolve(findFreePort(...)), ensuring rejections bubble to the outer Promise;
reference findFreePort and the server.on('error') handler when locating the
change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bin/lib/onboard.js`:
- Around line 387-388: The code uses parseInt on process.env.NEMOCLAW_PORT into
targetGatewayPort and assigns gatewayPort without validating the result; change
the logic around parseInt/process.env.NEMOCLAW_PORT to validate that the parsed
value is a finite integer in the valid port range (1–65535), and if not, fall
back to the default 8080 (or handle as desired) and log or surface a clear
error; update the handling where targetGatewayPort and gatewayPort are used so
gatewayPort is always a valid number and does not propagate NaN.

---

Nitpick comments:
In `@bin/lib/onboard.js`:
- Around line 15-27: The findFreePort function has a misleading error message
and an unchained recursive Promise on server.on('error'); update the exhaustion
branch to reject with a message that uses the actual maxAttempts and startPort
values (e.g., `new Error(\`Could not find a free port after ${maxAttempts}
attempts starting from ${startPort}\`)`) and change the server.on('error')
handler to properly chain the recursive call by invoking findFreePort(startPort
+ 1, maxAttempts - 1).then(resolve).catch(reject) (or await/return the Promise)
instead of resolve(findFreePort(...)), ensuring rejections bubble to the outer
Promise; reference findFreePort and the server.on('error') handler when locating
the change.

In `@docs/reference/commands.md`:
- Line 55: This line contains two sentences; split it into two separate lines so
each sentence is on its own line: one line stating "It also checks if the
default gateway port (8080) is available, and automatically increments to the
next free port if it is in use." and the following line stating "You can
explicitly override the target port by setting the `NEMOCLAW_PORT` environment
variable." Update the `docs/reference/commands.md` entry that currently contains
both sentences together (the sentence mentioning default gateway port 8080 and
the sentence mentioning NEMOCLAW_PORT) to follow the one-sentence-per-line
guideline.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f098e31e-4d3d-4657-abff-b7f8274bfa9b

📥 Commits

Reviewing files that changed from the base of the PR and between c55a309 and 40c739a.

📒 Files selected for processing (3)
  • bin/lib/onboard.js
  • docs/reference/commands.md
  • test/e2e/test-double-onboard.sh

Comment thread bin/lib/onboard.js Outdated
@wscurran wscurran added enhancement New capability or improvement request NemoClaw CLI labels Mar 23, 2026
@wscurran

Copy link
Copy Markdown
Contributor

Thanks for submitting this, an automatic port fallback mechanism could make it easier for users to install and run NemoClaw.

@hmyousuf2010 hmyousuf2010 force-pushed the fix-port-8080-conflict branch from 8c94321 to 5ff120a Compare March 25, 2026 21:57

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bin/lib/onboard.js`:
- Around line 2306-2309: preflight() currently selects gatewayPort too early
(before long-running setupNim()), which allows another process to steal the
port; fix by re-running the port probe immediately before starting the gateway
or by making startGateway() resilient: after await setupNim(...), call the
port-selection routine again to get a fresh gatewayPort (or add a retry loop
inside startGateway() that catches bind errors, re-probes a new port, and
retries binding a few times), updating the gatewayPort passed into
startGateway() so the binding happens on a freshly-checked port.
- Around line 1277-1295: Reject NEMOCLAW_PORT when it equals the dashboard port
and ensure the gateway fallback never picks dashboardPort: after parsing
targetGatewayPort (from NEMOCLAW_PORT) add a check against dashboardPort (18789)
and exit with an error if they match; when calling findFreePort(gatewayPort + 1)
pass or use a variant that skips dashboardPort (or post-filter returned ports)
so the auto-increment fallback for gatewayPort will never return 18789; update
references in this block (dashboardPort, targetGatewayPort, gatewayPort,
checkPortAvailable, findFreePort) accordingly.
- Around line 22-33: The findFreePort function must guard against startPort >
65535 and only retry on EADDRINUSE: update the initial bounds check in
findFreePort(startPort, maxAttempts) to reject if maxAttempts <= 0 OR startPort
> 65535 (with a clear error message), and change the server 'error' handler to
examine the error (e.g., err.code) and only recurse/resolve with
findFreePort(startPort + 1, maxAttempts - 1) when err.code === 'EADDRINUSE'; for
any other error, reject the promise with that error so permission/other failures
are not swallowed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2840e9f7-c107-4105-92be-3ea465de3264

📥 Commits

Reviewing files that changed from the base of the PR and between 5ff120a and 8c94321.

📒 Files selected for processing (3)
  • bin/lib/onboard.js
  • docs/reference/commands.md
  • test/e2e/test-double-onboard.sh
✅ Files skipped from review due to trivial changes (1)
  • docs/reference/commands.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/e2e/test-double-onboard.sh

Comment thread bin/lib/onboard.js Outdated
Comment thread bin/lib/onboard.js
Comment thread bin/lib/onboard.js Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
bin/lib/onboard.js (3)

2307-2310: ⚠️ Potential issue | 🔴 Critical

Re-probe the gateway port immediately before gateway start.

Line 2307 chooses the port before Line 2308, which can spend minutes pulling models or starting local runtimes. Another process can claim the port in that gap, so onboarding can still fail on bind even when preflight succeeded.

Move the gateway-port selection into a helper you call after setupNim(), or make startGateway() retry on bind conflicts with a freshly probed port.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 2307 - 2310, preflight() currently probes
gatewayPort too early (before long-running setupNim()), allowing races where
another process steals the port; move the port probe so you choose the port
immediately before binding: either call the port-probing helper after setupNim()
and pass that fresh gatewayPort into startGateway(), or change startGateway() to
re-probe and retry binding on EADDRINUSE by selecting a new port and retrying a
few times; update call-sites to use the new helper or the retrying
startGateway() and reference preflight(), setupNim(), gatewayPort and
startGateway() when making the change.

1278-1325: ⚠️ Potential issue | 🟠 Major

Keep the dashboard port out of the gateway search window.

Lines 1280-1286 still accept NEMOCLAW_PORT=18789, and Line 1292 can auto-select 18789 for a user-supplied base in that search window below the dashboard port. Line 1299 won't catch it because the gateway has not bound yet, so the failure is deferred until the dashboard forward later tries to claim the same port.

Reject NEMOCLAW_PORT === 18789, and make the fallback probe explicitly skip the reserved dashboard port before accepting a new gateway port.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 1278 - 1325, The gateway port selection
currently allows NEMOCLAW_PORT to equal the hardcoded dashboardPort (18789) and
findFreePort may return that reserved port; reject NEMOCLAW_PORT ===
dashboardPort up-front and ensure the auto-fallback logic for gatewayPort skips
dashboardPort when probing. Specifically, add a check after parsing
targetGatewayPort to error out if targetGatewayPort === dashboardPort (referring
to NEMOCLAW_PORT and dashboardPort), and change the fallback path that uses
findFreePort(gatewayPort + 1) (and/or the logic around gatewayPort/gatewayCheck
and findFreePort) to loop or call a variant that ignores dashboardPort so the
returned newPort will never equal dashboardPort before assigning gatewayPort.

22-33: ⚠️ Potential issue | 🟠 Major

Only retry findFreePort() on actual port conflicts.

Line 24 lets probes walk past 65535, and Line 33 treats every listen() failure as "port in use". That means a high user-supplied base can abort the fallback path, while EACCES / EADDRNOTAVAIL get silently converted into a different port choice.

Suggested fix
 function findFreePort(startPort, maxAttempts = 100) {
   return new Promise((resolve, reject) => {
-    if (maxAttempts <= 0) {
+    if (maxAttempts <= 0 || startPort > 65535) {
       return reject(new Error(`Could not find a free port after 100 attempts starting from ${startPort - 100}`));
     }
     const server = net.createServer();
+    server.once("error", (error) => {
+      if (error && error.code === "EADDRINUSE") {
+        return resolve(findFreePort(startPort + 1, maxAttempts - 1));
+      }
+      return reject(error);
+    });
     server.listen(startPort, () => {
       const addr = server.address();
       const port = (addr && typeof addr !== "string") ? addr.port : startPort;
       server.close(() => resolve(port));
     });
-    server.on('error', () => resolve(findFreePort(startPort + 1, maxAttempts - 1)));
   });
 }
According to the official Node.js documentation for `net.Server.listen()`, what happens when the `port` argument is outside `0..65535`, and which failures are emitted through the server's `"error"` event versus thrown synchronously?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 22 - 33, findFreePort currently lets probes
go beyond valid port range and treats every server.listen() failure as "port in
use"; before creating the server validate startPort is within 0..65535 and
immediately reject if out of range, and wrap server.listen in a try/catch to
handle synchronous RangeError/throw from Node; in the server.on('error') handler
only retry (call findFreePort with startPort+1) when err.code === 'EADDRINUSE'
(port conflict), and for other errors such as 'EACCES' or 'EADDRNOTAVAIL' reject
the promise with the original error so the caller sees the real failure;
reference function findFreePort and its server.listen()/server.on('error')
handling when making these changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@commit_output.txt`:
- Around line 1-41: commit_output.txt is a transient local lint artifact that
shouldn't be in the PR; remove it from the branch and prevent future commits by
deleting the file from the repo (git rm commit_output.txt and commit) or
untracking it (git rm --cached commit_output.txt && commit) and add
commit_output.txt to .gitignore, then push the branch so the PR no longer
contains the file or the leaked absolute path.

---

Duplicate comments:
In `@bin/lib/onboard.js`:
- Around line 2307-2310: preflight() currently probes gatewayPort too early
(before long-running setupNim()), allowing races where another process steals
the port; move the port probe so you choose the port immediately before binding:
either call the port-probing helper after setupNim() and pass that fresh
gatewayPort into startGateway(), or change startGateway() to re-probe and retry
binding on EADDRINUSE by selecting a new port and retrying a few times; update
call-sites to use the new helper or the retrying startGateway() and reference
preflight(), setupNim(), gatewayPort and startGateway() when making the change.
- Around line 1278-1325: The gateway port selection currently allows
NEMOCLAW_PORT to equal the hardcoded dashboardPort (18789) and findFreePort may
return that reserved port; reject NEMOCLAW_PORT === dashboardPort up-front and
ensure the auto-fallback logic for gatewayPort skips dashboardPort when probing.
Specifically, add a check after parsing targetGatewayPort to error out if
targetGatewayPort === dashboardPort (referring to NEMOCLAW_PORT and
dashboardPort), and change the fallback path that uses findFreePort(gatewayPort
+ 1) (and/or the logic around gatewayPort/gatewayCheck and findFreePort) to loop
or call a variant that ignores dashboardPort so the returned newPort will never
equal dashboardPort before assigning gatewayPort.
- Around line 22-33: findFreePort currently lets probes go beyond valid port
range and treats every server.listen() failure as "port in use"; before creating
the server validate startPort is within 0..65535 and immediately reject if out
of range, and wrap server.listen in a try/catch to handle synchronous
RangeError/throw from Node; in the server.on('error') handler only retry (call
findFreePort with startPort+1) when err.code === 'EADDRINUSE' (port conflict),
and for other errors such as 'EACCES' or 'EADDRNOTAVAIL' reject the promise with
the original error so the caller sees the real failure; reference function
findFreePort and its server.listen()/server.on('error') handling when making
these changes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: bbd10a51-af4c-4960-994e-181dddc89d83

📥 Commits

Reviewing files that changed from the base of the PR and between 8c94321 and ba413c1.

📒 Files selected for processing (4)
  • bin/lib/onboard.js
  • commit_output.txt
  • docs/reference/commands.md
  • test/e2e/test-double-onboard.sh
✅ Files skipped from review due to trivial changes (1)
  • docs/reference/commands.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/e2e/test-double-onboard.sh

Comment thread commit_output.txt Outdated
@cv

cv commented Apr 9, 2026

Copy link
Copy Markdown
Collaborator

I attempted to port this branch across the JS→TS migration and merge the latest main, but it still needs manual follow-up.

Please start with:

git fetch origin
git merge origin/main
npx tsx scripts/ts-migration-assist.ts --base origin/main --write
npm run build:cli
npm run typecheck:cli
npm run lint
npm test

cv added a commit that referenced this pull request Apr 13, 2026
…1645)

## Summary

- Add `bin/lib/ports.js` as central port configuration module
- All hardcoded service ports can now be overridden via environment
variables
- Defaults are unchanged — zero regression for existing setups

| Env var | Default | Service |
|---------|---------|---------|
| `NEMOCLAW_GATEWAY_PORT` | 8080 | OpenShell gateway |
| `NEMOCLAW_DASHBOARD_PORT` | 18789 | Dashboard UI |
| `NEMOCLAW_VLLM_PORT` | 8000 | vLLM/NIM inference |
| `NEMOCLAW_OLLAMA_PORT` | 11434 | Ollama inference |

Ports are validated to be non-privileged (1024–65535).

### Usage

```bash
export NEMOCLAW_DASHBOARD_PORT=19000
nemoclaw onboard
```

## Related Issue

Closes #684
Based on the approach from PR #683 by @jnun — thank you for the original
design.

Supersedes #683 (rebased on latest main with correct defaults).
Consider closing #703, #357 (subsumed by this).

## Test plan

- [ ] `npm test` passes (135 tests in cli + onboard)
- [ ] `nemoclaw onboard` works with default ports (no regression)
- [ ] Set `NEMOCLAW_DASHBOARD_PORT=19000`, verify dashboard uses port
19000
- [ ] Set invalid port (e.g., `NEMOCLAW_GATEWAY_PORT=abc`), verify
startup fails with clear error

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Centralized, configurable ports for dashboard, gateway, vLLM and
Ollama via environment variables with validated defaults.

* **Improvements**
* Startup, health checks, diagnostics, uninstall and helper flows now
honor configured ports and enforce numeric/range validation with
fail-fast errors.

* **Tests**
* Added unit and end-to-end tests covering port parsing, boundaries and
override scenarios; CI job added to run E2E port-override tests.

* **Chore**
* Added compatibility shim to surface built artifacts and documented the
default dashboard port in the container build.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: jnun <imjasonn@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
Co-authored-by: Aaron Erickson <aerickson@nvidia.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
@prekshivyas

Copy link
Copy Markdown
Contributor

Closing — this is now fully covered by #1645 which landed configurable port overrides via NEMOCLAW_GATEWAY_PORT, NEMOCLAW_DASHBOARD_PORT, NEMOCLAW_VLLM_PORT, and NEMOCLAW_OLLAMA_PORT environment variables, along with preflight port-conflict checks. Thank you for the contribution!

ericksoa added a commit to cheese-head/NemoClaw that referenced this pull request Apr 14, 2026
…VIDIA#1645)

## Summary

- Add `bin/lib/ports.js` as central port configuration module
- All hardcoded service ports can now be overridden via environment
variables
- Defaults are unchanged — zero regression for existing setups

| Env var | Default | Service |
|---------|---------|---------|
| `NEMOCLAW_GATEWAY_PORT` | 8080 | OpenShell gateway |
| `NEMOCLAW_DASHBOARD_PORT` | 18789 | Dashboard UI |
| `NEMOCLAW_VLLM_PORT` | 8000 | vLLM/NIM inference |
| `NEMOCLAW_OLLAMA_PORT` | 11434 | Ollama inference |

Ports are validated to be non-privileged (1024–65535).

### Usage

```bash
export NEMOCLAW_DASHBOARD_PORT=19000
nemoclaw onboard
```

## Related Issue

Closes NVIDIA#684
Based on the approach from PR NVIDIA#683 by @jnun — thank you for the original
design.

Supersedes NVIDIA#683 (rebased on latest main with correct defaults).
Consider closing NVIDIA#703, NVIDIA#357 (subsumed by this).

## Test plan

- [ ] `npm test` passes (135 tests in cli + onboard)
- [ ] `nemoclaw onboard` works with default ports (no regression)
- [ ] Set `NEMOCLAW_DASHBOARD_PORT=19000`, verify dashboard uses port
19000
- [ ] Set invalid port (e.g., `NEMOCLAW_GATEWAY_PORT=abc`), verify
startup fails with clear error

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Centralized, configurable ports for dashboard, gateway, vLLM and
Ollama via environment variables with validated defaults.

* **Improvements**
* Startup, health checks, diagnostics, uninstall and helper flows now
honor configured ports and enforce numeric/range validation with
fail-fast errors.

* **Tests**
* Added unit and end-to-end tests covering port parsing, boundaries and
override scenarios; CI job added to run E2E port-override tests.

* **Chore**
* Added compatibility shim to surface built artifacts and documented the
default dashboard port in the container build.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: jnun <imjasonn@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
Co-authored-by: Aaron Erickson <aerickson@nvidia.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
ColinM-sys pushed a commit to ColinM-sys/NemoClaw that referenced this pull request Apr 14, 2026
…VIDIA#1645)

## Summary

- Add `bin/lib/ports.js` as central port configuration module
- All hardcoded service ports can now be overridden via environment
variables
- Defaults are unchanged — zero regression for existing setups

| Env var | Default | Service |
|---------|---------|---------|
| `NEMOCLAW_GATEWAY_PORT` | 8080 | OpenShell gateway |
| `NEMOCLAW_DASHBOARD_PORT` | 18789 | Dashboard UI |
| `NEMOCLAW_VLLM_PORT` | 8000 | vLLM/NIM inference |
| `NEMOCLAW_OLLAMA_PORT` | 11434 | Ollama inference |

Ports are validated to be non-privileged (1024–65535).

### Usage

```bash
export NEMOCLAW_DASHBOARD_PORT=19000
nemoclaw onboard
```

## Related Issue

Closes NVIDIA#684
Based on the approach from PR NVIDIA#683 by @jnun — thank you for the original
design.

Supersedes NVIDIA#683 (rebased on latest main with correct defaults).
Consider closing NVIDIA#703, NVIDIA#357 (subsumed by this).

## Test plan

- [ ] `npm test` passes (135 tests in cli + onboard)
- [ ] `nemoclaw onboard` works with default ports (no regression)
- [ ] Set `NEMOCLAW_DASHBOARD_PORT=19000`, verify dashboard uses port
19000
- [ ] Set invalid port (e.g., `NEMOCLAW_GATEWAY_PORT=abc`), verify
startup fails with clear error

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Centralized, configurable ports for dashboard, gateway, vLLM and
Ollama via environment variables with validated defaults.

* **Improvements**
* Startup, health checks, diagnostics, uninstall and helper flows now
honor configured ports and enforce numeric/range validation with
fail-fast errors.

* **Tests**
* Added unit and end-to-end tests covering port parsing, boundaries and
override scenarios; CI job added to run E2E port-override tests.

* **Chore**
* Added compatibility shim to surface built artifacts and documented the
default dashboard port in the container build.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: jnun <imjasonn@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
Co-authored-by: Aaron Erickson <aerickson@nvidia.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: ColinM-sys <cmcdonough@50words.com>
@wscurran wscurran added area: cli Command line interface, flags, terminal UX, or output feature PR adds or expands user-visible functionality and removed NemoClaw CLI enhancement New capability or improvement request labels Jun 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: cli Command line interface, flags, terminal UX, or output feature PR adds or expands user-visible functionality

Projects

None yet

Development

Successfully merging this pull request may close these issues.

./install.sh assumes port 8080 isn't already in use

4 participants