Common Failures

These are the most common operator-facing failures when bringing up VibeRecall locally or in a hosted environment.

`404 Session not found`

Cause:

the client is sending a stale or expired mcp-session-id

Fix:

reconnect or restart the MCP client so it performs initialize again

Why this happens:

the MCP transport is stateful
backend restarts invalidate server-side session state
some clients keep retrying with dead session identifiers unless you reconnect cleanly

Fast recovery path:

remove or disconnect the dead MCP server entry if the client exposes that action
reconnect it
rerun viberecall_get_status
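
For clients you script yourself, the recovery path above can be sketched as a thin wrapper that discards a dead session and initializes again. This is a minimal sketch, not VibeRecall's client: `SessionNotFound`, `ReconnectingClient`, and the injected `initialize` and `call` callables are hypothetical stand-ins for whatever your MCP client library provides.

```python
from typing import Callable, Optional


class SessionNotFound(Exception):
    """Hypothetical error raised by the transport on `404 Session not found`."""


class ReconnectingClient:
    """Drops a dead session id and re-initializes once before retrying."""

    def __init__(self, initialize: Callable[[], str],
                 call: Callable[[str, str], dict]) -> None:
        self._initialize = initialize  # assumed: returns a fresh session id
        self._call = call              # assumed: (session_id, tool_name) -> result
        self._session_id: Optional[str] = None

    def call_tool(self, tool_name: str) -> dict:
        if self._session_id is None:
            self._session_id = self._initialize()
        try:
            return self._call(self._session_id, tool_name)
        except SessionNotFound:
            # The server no longer knows this session, for example after a
            # backend restart. Initialize again instead of retrying with the
            # dead identifier, then retry the call exactly once.
            self._session_id = self._initialize()
            return self._call(self._session_id, tool_name)
```

The wrapper retries only once on purpose: if a freshly initialized session is also rejected, the problem is unlikely to be a stale identifier and the error should surface.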

Graph-backed tools fail locally

If graph-backed calls like viberecall_save, viberecall_search, viberecall_get_facts, or viberecall_timeline fail with FalkorDB connection errors:

start the local runtime dependencies with docker compose -f ops/docker-compose.runtime.yml up -d
verify that http://localhost:8010/healthz responds successfully
confirm your local env points at the expected FalkorDB and Redis hosts

Do not debug the client before you confirm the local backing services are actually running.
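
A quick preflight along these lines can confirm the backing services before you touch the client. It is a sketch using only the Python standard library; the function names are hypothetical, and the FalkorDB and Redis hosts and ports are passed in because they depend on your local env rather than on anything stated here.

```python
import http.client
import socket
import urllib.request


def tcp_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET on url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # URLError and HTTPError are subclasses of OSError.
        return False


def preflight(healthz_url: str, services: dict) -> dict:
    """Check the health endpoint plus each named (host, port) service."""
    results = {"healthz": http_ok(healthz_url)}
    for name, (host, port) in services.items():
        results[name] = tcp_reachable(host, port)
    return results
```

Call `preflight` with the `/healthz` URL above and the host and port pairs from your env; any `False` entry points at a service to fix before debugging the client. A TCP connect only proves that something is listening, not that it is the expected service.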

Control-plane requests fail between web and API

If project pages cannot load control-plane data:

make sure CONTROL_PLANE_INTERNAL_SECRET matches across web and API
make sure PUBLIC_WEB_URL and ALLOWED_ORIGINS are configured consistently
verify the API base URL exposed to the web app matches the public deployment

If browser pages fail but direct MCP calls still work, the problem is often in control-plane wiring rather than the MCP runtime itself.
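
The first two checks can be automated by comparing the env of both services. The sketch below assumes that `ALLOWED_ORIGINS` is a comma-separated list and that `PUBLIC_WEB_URL` and `ALLOWED_ORIGINS` are both read on the API side; neither assumption is confirmed here, so adjust the parsing to your deployment. The function names are hypothetical.

```python
from urllib.parse import urlsplit


def origin_of(url: str) -> str:
    """Reduce a URL to scheme://host[:port], lowercased."""
    parts = urlsplit(url.strip())
    return f"{parts.scheme}://{parts.netloc}".lower()


def control_plane_problems(web_env: dict, api_env: dict) -> list:
    """Return human-readable wiring problems; an empty list means consistent."""
    problems = []

    web_secret = web_env.get("CONTROL_PLANE_INTERNAL_SECRET")
    api_secret = api_env.get("CONTROL_PLANE_INTERNAL_SECRET")
    if not web_secret or not api_secret:
        problems.append("CONTROL_PLANE_INTERNAL_SECRET is missing on one side")
    elif web_secret != api_secret:
        problems.append("CONTROL_PLANE_INTERNAL_SECRET differs between web and API")

    public_web_url = api_env.get("PUBLIC_WEB_URL", "")
    # Assumption: ALLOWED_ORIGINS is a comma-separated list of origins.
    allowed = {
        origin_of(entry)
        for entry in api_env.get("ALLOWED_ORIGINS", "").split(",")
        if entry.strip()
    }
    if not public_web_url:
        problems.append("PUBLIC_WEB_URL is not set")
    elif origin_of(public_web_url) not in allowed:
        problems.append("PUBLIC_WEB_URL origin is not listed in ALLOWED_ORIGINS")

    return problems
```

Comparing origins rather than raw strings avoids false alarms from trailing slashes or letter case.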

Docs links point at the wrong host

If the control plane opens stale or incorrect docs URLs:

verify NEXT_PUBLIC_DOCS_URL before building apps/web
rebuild the control plane after changing the value
confirm the docs Vercel project is serving the expected docs.<your-domain> domain

Remember that NEXT_PUBLIC_* values are baked in at build time. Changing the env without rebuilding the web app will not fix already-built links.

Token works in one project and fails in another

Cause:

the token belongs to a different project
the endpoint path points at the wrong project_id

Fix:

confirm the token was minted for the intended project
confirm the MCP endpoint path includes the same project_id
reconnect the client after correcting either side

Project scoping is explicit by design. A valid token is still the wrong credential if it is paired with the wrong project path.
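
A small guard can catch the mismatch before the client connects. This sketch assumes the project_id appears as its own segment in the MCP endpoint path; the exact path layout is not specified here, and the function name and example URL are hypothetical.

```python
from urllib.parse import urlsplit


def endpoint_targets_project(endpoint_url: str, project_id: str) -> bool:
    """Return True if project_id is one of the endpoint's path segments."""
    # Assumption: the project_id is a whole path segment, not a substring.
    segments = [segment for segment in urlsplit(endpoint_url).path.split("/") if segment]
    return project_id in segments


# Hypothetical endpoint and project ids, for illustration only.
endpoint = "https://mcp.example.com/p/proj_123/mcp"
print(endpoint_targets_project(endpoint, "proj_123"))
print(endpoint_targets_project(endpoint, "proj_12"))
```

Matching whole segments avoids accepting a project_id that is only a prefix of the one in the path. The check covers the endpoint side only; which project a token was minted for still has to be confirmed where the token was issued.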

Hosted server cannot see local uncommitted code

Cause:

the hosted MCP runtime cannot read your laptop filesystem
the agent assumed remote indexing could inspect a dirty local worktree directly

Fix:

keep the hosted core server for memory
use a local packaging or bridge flow for local workspace material
submit a bundle or other explicit repo source instead of a raw local path

See Local Workspace Bridge for the safe model.

Index run stays `QUEUED`

Cause:

the index request was accepted but no worker has started processing it yet
the queue may be paused, unhealthy, or pointed at the wrong backend

Fix:

confirm the project really received an index_run_id
wait briefly and recheck viberecall_get_index_status
if the run stays queued past the expected worker pickup window, inspect worker health and queue delivery

Interpretation:

this is usually a runtime readiness problem
it is not evidence that the client-side bundle upload or repo-source payload was malformed
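
The "wait briefly and recheck" step can be sketched as a bounded poll. Here `get_status` is a hypothetical callable that wraps viberecall_get_index_status and returns the status string, and `pickup_window_s` is whatever worker pickup window you expect; neither value is defined by this page.

```python
import time
from typing import Callable


def wait_for_pickup(get_status: Callable[[], str],
                    pickup_window_s: float,
                    interval_s: float = 2.0,
                    sleep: Callable[[float], None] = time.sleep,
                    clock: Callable[[], float] = time.monotonic) -> str:
    """Poll until the run leaves QUEUED or the pickup window elapses.

    Returns the last observed status. A return value of "QUEUED" means the
    window expired and worker health and queue delivery should be inspected.
    """
    deadline = clock() + pickup_window_s
    while True:
        status = get_status()
        if status != "QUEUED":
            return status
        if clock() >= deadline:
            return "QUEUED"
        sleep(interval_s)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting; in normal use the defaults apply.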

Index run stays `RUNNING` too long

Cause:

a worker picked up the run but indexing is not completing
the worker may be stuck on repository fetch, parsing, or persistence

Fix:

recheck viberecall_get_index_status to confirm the run is still non-terminal
inspect worker logs and API logs before retrying
only rerun indexing after you know whether the previous run is genuinely stuck or simply slow

Interpretation:

long-running indexing should be treated as a worker/runtime diagnosis problem first
avoid repeated retries until you understand why the current run is not finishing

Index run returns `FAILED`

Cause:

the runtime reached a terminal indexing failure
the failure may come from repo access, bundle validation, parsing, or persistence

Fix:

capture the error.code and error.message from viberecall_get_index_status
inspect API logs and worker logs for the same index_run_id
retry only after the failure class is understood

Interpretation:

failed indexing is actionable because the run reached a terminal state
do not collapse this into a generic “try again later” diagnosis
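
To make the failure class easy to match against logs, it helps to record the run id together with the error fields in one line. This sketch assumes the status payload is a dict with a nested `error` object carrying `code` and `message`, mirroring the error.code and error.message fields named above; the real payload shape may differ, and the function name and sample values are hypothetical.

```python
def summarize_failure(index_run_id: str, status_payload: dict) -> str:
    """Build a single log-friendly line describing a FAILED index run."""
    # Assumption: the payload nests the fields as {"error": {"code", "message"}}.
    error = status_payload.get("error") or {}
    code = error.get("code", "<no error.code>")
    message = error.get("message", "<no error.message>")
    return f"index_run_id={index_run_id} code={code} message={message}"


# Hypothetical payload, for illustration only.
payload = {"status": "FAILED",
           "error": {"code": "EXAMPLE_CODE", "message": "example message"}}
print(summarize_failure("run_1", payload))
```

Searching API and worker logs for the same index_run_id then ties the client-visible error to the server-side cause.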

Deployed smoke command behaves differently than the docs

Cause:

the wrapper command and the direct Python entrypoint were historically inconsistent

Fix:

prefer the documented form: pnpm smoke:mcp:deployed -- --base-url <...> --project-id <...> --token <...>
if you are debugging the script itself, call the Python entrypoint directly
compare the exact profile flags and token flags before assuming the runtime is at fault

Interpretation:

the pnpm ... -- ... wrapper is the supported path
the direct Python call is a debugging fallback, not the primary operator workflow
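
When scripting the supported wrapper, building the argument list explicitly makes it easy to compare flags between runs. The sketch below uses only the command form documented above; the helper names are hypothetical, and the redaction step is an added precaution rather than part of the smoke tooling.

```python
import shlex


def smoke_command(base_url: str, project_id: str, token: str) -> list:
    """Build the argv for the documented pnpm wrapper invocation."""
    return [
        "pnpm", "smoke:mcp:deployed", "--",
        "--base-url", base_url,
        "--project-id", project_id,
        "--token", token,
    ]


def redacted(argv: list) -> str:
    """Render argv as a shell-style string with the token value hidden."""
    shown = []
    hide_next = False
    for arg in argv:
        shown.append("REDACTED" if hide_next else arg)
        hide_next = arg == "--token"
    return shlex.join(shown)
```

Pass the list from `smoke_command` to `subprocess.run(argv, check=True)` to execute it, and log only the `redacted` form so the token does not end up in CI output.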

Optional prompts or resources do not appear

Cause:

the client does not surface them
the current integration path exposes tools reliably but not prompts or resources
the deployment intentionally treats those features as optional

Fix:

continue with the tool-first workflow
verify viberecall_get_status, search, context, and save flows first
treat prompts and resources as optional UX, not as the core compatibility contract

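Missing prompts or resources are expected more often than many teams assume, because tools are the only surface that every integration path handles reliably.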

Tool spam produces noisy or unhelpful memory

Cause:

the agent saves every intermediate thought
the tool set is too broad for the current workflow
instructions do not distinguish meaningful observations from scratch work

Fix:

narrow the installed tool subset
adopt a rules template from Playbooks & Rules
save only stable findings, decisions, and evidence worth reuse

Memory quality problems are usually workflow problems before they are storage problems.
