[embeddings] Failed HTTP endpoint latched for entire process lifetime with no auto-retry #766

Closed

opened 2026-06-03 00:22:44 +02:00 by sleepy · 1 comment

sleepy commented

2026-06-03 00:22:44 +02:00

Owner

**File**: `src/embeddings.py` line 203

```python
_http_embed_down = False  # process-level latch
```

Once the HTTP embedding endpoint fails, `_http_embed_down = True` is set for the **entire process lifetime**. The only way to reset it is calling `reset_http_embed_state()`, which is only triggered by manual admin panel saves.

This means:

1. If the embedding endpoint is briefly down during startup, the process runs on FastEmbed forever
2. If the endpoint recovers, no automatic retry occurs
3. In a long-running server, this can cause degraded quality (FastEmbed may use a different/smaller model than the configured endpoint)

`rag_singleton.py` has a better pattern: it retries every 30 seconds. `embeddings.py` should adopt a similar approach.

**Action**: Replace the boolean latch with a time-based retry (e.g., re-probe every N seconds after failure), similar to `rag_singleton.py`'s `_RETRY_INTERVAL`.

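A minimal sketch of what the time-based retry could look like, for illustration only. It is not the code from `src/embeddings.py` or the eventual fix. The names `_http_embed_failed_at`, `embed`, `http_embed`, `local_embed` and the `now` parameter are hypothetical. The `_RETRY_INTERVAL` name and the 30-second value are borrowed from the description of `rag_singleton.py` above, and the body of `reset_http_embed_state()` is an assumption about how a manual reset would fit in.

```python
import time
from typing import Callable, List, Optional, Sequence

Vector = List[float]
EmbedFn = Callable[[Sequence[str]], List[Vector]]

# Assumed to mirror rag_singleton.py's retry interval (seconds).
_RETRY_INTERVAL = 30.0

# Hypothetical replacement for the boolean latch: the monotonic time of the
# last HTTP failure, or None while the endpoint is believed to be healthy.
_http_embed_failed_at: Optional[float] = None


def _http_embed_available(now: float) -> bool:
    """Return True if the HTTP endpoint should be tried at time `now`."""
    if _http_embed_failed_at is None:
        return True
    # After a failure, allow a new probe once the retry interval has elapsed.
    return now - _http_embed_failed_at >= _RETRY_INTERVAL


def embed(
    texts: Sequence[str],
    http_embed: EmbedFn,
    local_embed: EmbedFn,
    now: Optional[float] = None,
) -> List[Vector]:
    """Try the HTTP endpoint when allowed, otherwise use the local fallback.

    `http_embed` and `local_embed` are hypothetical callables standing in for
    the configured endpoint and the FastEmbed fallback. `now` exists only so
    the timing can be controlled in tests.
    """
    global _http_embed_failed_at
    current = time.monotonic() if now is None else now
    if _http_embed_available(current):
        try:
            vectors = http_embed(texts)
            _http_embed_failed_at = None  # endpoint works: clear the failure mark
            return vectors
        except Exception:
            # Record when the failure happened instead of latching forever.
            _http_embed_failed_at = current
    return local_embed(texts)


def reset_http_embed_state() -> None:
    """Manual reset (e.g. after an admin panel save) forces an immediate retry."""
    global _http_embed_failed_at
    _http_embed_failed_at = None
```

With this shape, a failure during startup only costs one retry interval of fallback embeddings, and the existing manual reset path keeps working. `time.monotonic()` is used rather than wall-clock time so that system clock adjustments cannot shorten or extend the interval.
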
sleepy referenced this issue from a commit

2026-06-03 15:54:03 +02:00

Replace boolean latch with time-based retry for HTTP embedding endpoint

sleepy referenced this issue from a pull request that will close it,

2026-06-03 15:54:31 +02:00

Replace boolean latch with time-based retry for HTTP embedding endpoint (#766) #802

sleepy commented

2026-06-03 15:54:44 +02:00

Author

Owner

Fixed in PR #802 — replaced boolean latch with time-based retry (30s re-probe interval).

sleepy closed this issue