docs: add English README + bilingual usage manual

youzini
2026-05-31 18:20:31 +00:00
parent c3023aff78
commit f67358e024
11 changed files with 633 additions and 5 deletions

README.en.md (new file, 178 lines added)

@@ -0,0 +1,178 @@
# ST-BME — SillyTavern Bionic Memory Ecology
> Let the AI truly remember your story.
[中文](README.md) · **English**
ST-BME (Bionic Memory Ecology) is a **SillyTavern third-party frontend extension**. It distills the characters, events, locations, rules, plot threads, reflections, and summaries that appear over a long chat into a visual memory graph, then automatically recalls the most relevant memories and injects them into the prompt before each generation.
---
## Documentation
This README is a **slim entry point**. The details live in [`docs/`](docs/README.md):
| What you want | Where to look |
| --- | --- |
| Usage: configuration, panel, troubleshooting, storage | [`docs/usage/`](docs/usage/) |
| Architecture, control plane, data formats | [`docs/architecture/`](docs/architecture/) |
| Algorithm internals (retrieval / extraction / vectors) | [`docs/algorithms/`](docs/algorithms/) |
| How each feature works and its boundaries | [`docs/features/`](docs/features/) |
| Development, testing, contribution conventions | [`docs/contributing/`](docs/contributing/) |
Quick links: [Configuration](docs/usage/configuration.md) · [Panel guide](docs/usage/panel.md) · [Troubleshooting](docs/usage/troubleshooting.md) · [Memory model](docs/features/memory-model.md) · [History safety](docs/features/history-safety.md)
> Developer docs (architecture / algorithms / features / contributing) are currently Chinese-only. The English docs cover the README and the `docs/usage/` user manual.
---
## Core capabilities
- **Automatic memory extraction** — After each AI reply, ST-BME extracts structured nodes and relations from the conversation (characters, events, locations, rules, plot threads, reflections, subjective memories), excluding reasoning tags like `think`/`analysis`/`reasoning` by default.
- **Multi-layer hybrid recall** — Before generation, relevant memories are recalled through vector prefilter, graph diffusion, lexical boosting, multi-intent splitting, DPP diversity sampling, and optional LLM reranking; per-message persistent recall cards are supported.
- **Cognitive architecture** — Character POV / user POV / objective world memory, spatial region weighting, and a story timeline.
- **Summarization & maintenance** — Small summaries, summary rollup, reflection, consolidation, automatic compression, active forgetting — all logged and reversible.
- **Graph visualization** — A built-in canvas force-directed graph with realtime / cognitive / summary views and a mobile view.
- **Task preset system** — Extraction, recall, compression, summary, reflection, consolidation, and planning all run through a unified task profile, with regex, world info, and EJS rendering.
- **ENA Planner integration** — Pre-send story planning, integrated into the config page and the `planner` task preset.
- **Persistence & sync** — Local-first (IndexedDB), with cloud mirroring, backup/restore, rebuild, and repair.
- **History safety** — Detects message deletion / edits / swipes, automatically rolls back affected batches and recovers from the change point; protects against truncated "render only the last N" views.
- **Long-chat optimization** — Hide old turns to control tokens, limit rendered turns to reduce lag, and accelerate key computations with a Native/WASM rollout.
---
## How it works
ST-BME can be understood as three pipelines: **write** (conversation → memory), **read** (memory → injection), and **safety** (history change → recovery).
```mermaid
flowchart LR
subgraph Write["Write: conversation → memory"]
A["AI reply"] --> B["Structured message preprocessing"]
B --> C["LLM extracts nodes/edges"]
C --> D["Nearest-neighbor reconciliation + cognitive scoping"]
D --> E["Write graph + vector sync + timeline"]
E --> F["Consolidate / compress / summarize / reflect"]
end
subgraph Read["Read: memory → injection"]
G["User about to generate"] --> H["Multi-intent + context-blended query"]
H --> I["Vector prefilter + graph diffusion + lexical boost"]
I --> J["Cognitive boundary filter + hybrid scoring"]
J --> K["Optional LLM rerank + bucketed injection"]
end
subgraph Safe["Safety: history change → recovery"]
L["Delete / edit / swipe"] --> M["Message hash detection"]
M --> N["Locate affected turns"]
N --> O["Roll back batches and vectors"]
O --> P["Re-extract from the change point"]
end
F -.-> G
P -.-> E
```
- **Write**: the conversation is normalized into structured messages (reasoning tags excluded by default) → the LLM emits structured graph operations → nodes are written, vectors synced, timeline updated → post-processing (consolidation, compression, summary, reflection, forgetting).
- **Read**: resolve the recall target → vector prefilter + graph diffusion + lexical boost → rank and filter by fusing multiple signals → bucketed injection into the prompt, optionally writing a persistent recall card.
- **Safety**: a hash is recorded for each processed message; when a history change is detected, ST-BME prefers rollback-and-replay from the maintenance log, falling back to a full rebuild only when a safe rollback is not possible.
> Algorithm details (formulas, parameters, thresholds) are in [`docs/algorithms/`](docs/algorithms/); architecture and data paths are in [`docs/architecture/overview.md`](docs/architecture/overview.md).
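
The safety pipeline's "message hash detection" step can be illustrated with a minimal sketch. This is not ST-BME's actual code: the function names, the choice of SHA-256, and the idea of hashing only the message text are all assumptions made for illustration.

```python
import hashlib
from typing import List, Optional


def message_hash(text: str) -> str:
    """Stable fingerprint of one message's text (SHA-256 is an assumed choice)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def first_changed_turn(recorded: List[str], chat: List[str]) -> Optional[int]:
    """Return the first processed turn that no longer matches its recorded hash.

    `recorded` holds the hashes saved when each turn was processed.
    Returns None when every processed turn is still intact.
    """
    for index, saved in enumerate(recorded):
        # A processed turn that disappeared (deletion) also counts as a change.
        if index >= len(chat) or message_hash(chat[index]) != saved:
            return index
    return None


history = ["Hello", "Hi, traveler.", "Where is the inn?"]
recorded = [message_hash(m) for m in history]
edited = ["Hello", "Hi, stranger.", "Where is the inn?"]

print(first_changed_turn(recorded, history))  # None
print(first_changed_turn(recorded, edited))   # 1
```

The returned index plays the role of the "change point": everything extracted from that turn onward would be rolled back and re-extracted.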
---
## Installation
### Option 1: install via SillyTavern extensions
Open SillyTavern → Extensions → Install third-party extension, and enter the repository URL:
```text
https://github.com/Youzini-afk/ST-Bionic-Memory-Ecology
```
Refresh the page after installation.
> Paste the repository root URL, not a GitHub sub-page URL.
### Option 2: manual installation
```bash
cd SillyTavern/data/default-user/extensions/third-party
git clone https://github.com/Youzini-afk/ST-Bionic-Memory-Ecology.git st-bme
```
Then restart or refresh SillyTavern.
---
## Quick start
1. **Open the panel** — Click "Memory Graph" in the top-left menu.
2. **Enable the plugin** — Config → Feature toggles, confirm the main switch is on.
3. **Configure the model** — Leave the memory LLM blank to reuse the current chat model; or fill in an independent OpenAI-compatible URL/key/model under "API config".
4. **Configure embedding** — Backend mode is recommended (reuses SillyTavern's configured vector provider); direct mode also works but you must handle CORS yourself.
5. **Start chatting** — Just chat normally. Extraction runs after each AI reply, and recall runs before the next generation.
6. **Check results** — "Overview" for status, "Tasks → Memory browser" for nodes, the graph area for the relation network; a recall card may appear under user messages.
> Minimum viable setup: enable the plugin and ensure the current chat model works. Recall quality drops noticeably when embedding is unavailable, so configure it early.
>
> See [Configuration](docs/usage/configuration.md) for full settings and [Panel guide](docs/usage/panel.md) for what each panel area does.
---
## Common actions
| Action | Location | Description |
| --- | --- | --- |
| Re-extract | Actions → Memory ops | Extract unprocessed turns or rerun a range |
| Manual compress | Actions → Memory ops | Merge redundant high-level nodes |
| Generate small summary | Actions → Memory ops | Produce a staged summary for the recent text window |
| Run summary rollup | Actions → Memory ops | Fold multiple active summaries into a higher-level one |
| Rebuild summary state | Actions → Memory ops | Rebuild summaryState from extraction batches |
| Force evolution | Actions → Memory ops | Let new memories actively affect old ones |
| Run forgetting | Actions → Memory ops | Archive or down-weight low-value nodes |
| Undo recent maintenance | Actions → Memory ops | Roll back the most recent reversible maintenance |
| Rebuild vectors | Actions → Vector ops | Rebuild all node embeddings |
| Range rebuild | Actions → Vector ops | Rebuild only nodes related to a turn range |
| Direct re-embed | Actions → Vector ops | Re-embed using the direct embedding config |
| Export / import / rebuild graph | Actions → Graph management | Graph management and destructive ops |
| Backup / restore cloud | Config → Cloud storage mode | Manually upload/restore in manual mode |
| Unhide all | Config → Hide old turns | Restore turns hidden by ST-BME |
> After switching embedding mode or model, run "Rebuild vectors". Per-action details and danger notes are in [Configuration](docs/usage/configuration.md) and [Panel guide](docs/usage/panel.md).
---
## Data storage & history safety (highlights)
- **Local-first**: primary storage uses IndexedDB, isolated per chat (`STBME_{chatId}`), with incremental commits on the hot path.
- **Cloud mirroring**: reuses SillyTavern's file API, supports auto/manual modes, requires no custom backend.
- **History safety**: detects delete/edit/swipe, prefers rollback-and-replay, falls back to full rebuild when needed; protects against render-truncated views to avoid wrongly clearing the graph.
- **Forward compatibility**: durable snapshots have a frozen top-level shape, tolerant parsing, and upgrade-on-read — extending the data structure means "add a field", not a big migration.
> See [Storage & sync](docs/usage/storage-and-sync.md), [History safety](docs/features/history-safety.md), and [Data formats & forward compatibility](docs/architecture/storage-and-formats.md).
---
## Having trouble?
Step-by-step troubleshooting for common situations (panel won't open, no auto-extraction, poor recall, nodes appear cleared, recall cards missing, direct embedding fails, etc.) is in [Troubleshooting](docs/usage/troubleshooting.md).
---
## Known limitations
- **Memory quality depends on the LLM** — if the extraction model misunderstands, the memory will be wrong too.
- **Embedding sets the recall floor** — without high-quality vectors, recall leans more on lexical and graph structure.
- **Direct mode may be affected by CORS** — browser security policy may block requests.
- **Very long chats still have a cost** — hiding/render limits/summary rollup reduce pressure but can't eliminate all overhead.
- **History recovery prioritizes correctness** — it falls back to a full rebuild when the log is insufficient, which can be slow.
- **Third-party themes may affect recall card mounting** — cards may skip mounting if a theme removes the standard message DOM or turn-index attributes.
- **Native acceleration is a rollout capability** — it fails open to JS by default and can be force-disabled in the panel.
---
## License
AGPLv3 — see [LICENSE](./LICENSE).


@@ -2,6 +2,8 @@
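> Let the AI truly remember your story.
**中文** · [English](README.en.md)
ST-BME (Bionic Memory Ecology) is a **SillyTavern third-party frontend extension**. It distills the characters, events, locations, rules, plot threads, reflections, and summaries that appear over a long chat into a visual memory graph, then automatically recalls the most relevant memories and injects them into the prompt before the next generation.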
---


@@ -8,12 +8,12 @@
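## Documentation map
### usage/ — User manual
-For users: "how to configure, how to use, and how to troubleshoot." Detailed content moved down from the slimmed-down README.
+For users: "how to configure, how to use, and how to troubleshoot." Detailed content moved down from the slimmed-down README. Bilingual (`.md` Chinese / `.en.md` English).
-- [`configuration.md`](usage/configuration.md) — Full configuration reference: memory LLM, Embedding, extraction/recall/cognitive/maintenance settings, task presets, ENA, hiding/rendering, Native
-- [`panel.md`](usage/panel.md) — Panel guide: overview, tasks, actions, config, graph area
-- [`troubleshooting.md`](usage/troubleshooting.md) — Troubleshooting guide
-- [`storage-and-sync.md`](usage/storage-and-sync.md) — Data storage, cloud mirroring, compatibility fallback, persistent recall cards
+- [`configuration.md`](usage/configuration.md) · [EN](usage/configuration.en.md) — Full configuration reference: memory LLM, Embedding, extraction/recall/cognitive/maintenance settings, task presets, ENA, hiding/rendering, Native
+- [`panel.md`](usage/panel.md) · [EN](usage/panel.en.md) — Panel guide: overview, tasks, actions, config, graph area
+- [`troubleshooting.md`](usage/troubleshooting.md) · [EN](usage/troubleshooting.en.md) — Troubleshooting guide
+- [`storage-and-sync.md`](usage/storage-and-sync.md) · [EN](usage/storage-and-sync.en.md) — Data storage, cloud mirroring, compatibility fallback, persistent recall cards
### architecture/ — Architecture and control plane
Cross-file structure, data paths, and invariants. This content changes slowly and is the entry point for understanding "why it is organized this way."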
@@ -56,3 +56,9 @@
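1. **The closer to the code, the slower the rot.** API details of individual functions stay in module header comments (changing the code naturally changes them) and are not copied here. This directory only covers cross-file algorithm principles, invariants, and feature behavior.
2. **Don't write content that goes stale on the next change.** Avoid descriptions like "what line N of some function does"; when algorithm docs reference file locations, they describe "which algorithm lives in which file", not line-by-line detail.
3. **Update the matching doc when behavior changes.** When algorithm parameters, invariants, or feature boundaries change, update them here; pure refactors (no behavior change) usually need no doc changes.
## Bilingual convention
- **Chinese is the source; English follows.** `.md` is the authoritative Chinese version and `.en.md` is the English translation. When changing docs, edit the Chinese `.md` first, then sync the corresponding `.en.md`.
- Current English coverage: the root `README` and the `docs/usage/` user manual. `architecture/` / `algorithms/` / `features/` / `contributing/` are Chinese-only for now; English will be added as needed.
- In English files, link to the English sibling when the target doc has an `.en.md`; when pointing to developer docs that have no English version yet, link to the Chinese version.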


@@ -0,0 +1,204 @@
# Configuration
[中文](configuration.md) · **English**
This page is split out from the [README](../../README.en.md) as the main ST-BME user configuration reference, preserving setting names, defaults, and tables for quick lookup by feature.
### Memory LLM
The memory LLM is used for:
- Memory extraction.
- Recall reranking.
- Consolidation.
- Compression.
- Small summaries.
- Summary rollup.
- Reflection.
- ENA Planner planning.
Configuration options:
- **Leave blank**
- Reuse the current SillyTavern chat model.
- **Fill in OpenAI-compatible config**
- Use an independent model for memory tasks.
- Useful when you want to separate the main chat model from the background maintenance model.
Security recommendations:
- Do not publicly share exported `extension_settings` or browser storage that contains API keys.
- Debug logs are off by default; enable them temporarily only when troubleshooting.
### Embedding
Embedding is the core of smart recall.
#### Backend mode
Backend mode is the recommended default:
- Reuses the embedding provider configured in the SillyTavern backend.
- Usually avoids storing the embedding API key directly in the browser.
- Can use sources already supported by SillyTavern, such as OpenAI, Cohere, Mistral, Ollama, LlamaCpp, and vLLM.
#### Direct mode
In direct mode, the browser requests the embedding service directly:
- Requires filling in the API URL, key, and model.
- May hit CORS restrictions.
- Suitable for a self-hosted gateway or independent embedding service.
> After switching embedding mode or model, run "rebuild vectors".
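
To make the three direct-mode fields (API URL, key, model) concrete, here is a rough sketch of the request an OpenAI-compatible embeddings endpoint generally expects. It is written in Python for illustration only; the helper names are hypothetical, and whether ST-BME appends `/embeddings` to the configured URL in exactly this way is an assumption. The request is built but never sent.

```python
import json
import urllib.request
from typing import List


def build_embedding_request(base_url: str, api_key: str, model: str,
                            texts: List[str]) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style embeddings request."""
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/embeddings",
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def parse_embeddings(response_body: str) -> List[List[float]]:
    """Extract vectors from an OpenAI-style response, restoring input order."""
    items = sorted(json.loads(response_body)["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]
```

In the browser the same request is subject to CORS, which is why a self-hosted gateway that sends the appropriate CORS headers tends to work better in direct mode than a third-party endpoint.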
### Extraction settings
| Setting | Default | Description |
| --- | --- | --- |
| 每 N 条回复提取 | `1` | Trigger extraction every N assistant replies |
| 提取上下文轮数 | `2` | Number of conversation rounds to look back during extraction |
| 自动延后最新助手 | `false` | Allows the latest reply to stabilize before extraction |
| Assistant 排除标签 | `think,analysis,reasoning` | Excludes reasoning tags by default |
| 提取消息上限 | `0` | `0` means unlimited |
| 提取 Prompt 结构模式 | `both` | Provides both transcript and structured messages |
| 提取世界书模式 | `active` | Reuses the currently active world info context |
| 包含故事时间 | `true` | Provides the story timeline during extraction |
| 包含总结快照 | `true` | Provides active summaries during extraction |
| 手动提取模式 | `pending` | Default extraction mode in the panel |
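
As a rough illustration of how the first few settings in the table might interact, the sketch below models "extract every N replies", the look-back window, and the message cap. The function names are hypothetical, and treating one round as one user message plus one assistant reply is an assumption rather than documented behavior.

```python
from typing import Dict, List


def should_extract(assistant_reply_count: int, every_n: int = 1) -> bool:
    """True when the latest assistant reply is the Nth since the last trigger."""
    return every_n > 0 and assistant_reply_count > 0 and assistant_reply_count % every_n == 0


def extraction_window(messages: List[Dict], context_rounds: int = 2,
                      message_limit: int = 0) -> List[Dict]:
    """Select the latest reply plus `context_rounds` earlier rounds.

    Assumption: one round = one user message + one assistant reply.
    `message_limit == 0` means unlimited, matching the table default.
    """
    window = messages[-(context_rounds * 2 + 1):]
    if message_limit > 0:
        window = window[-message_limit:]
    return window
```

With the defaults (`every_n=1`, `context_rounds=2`, `message_limit=0`), every assistant reply triggers extraction over the five most recent messages.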
### Recall settings
| Setting | Default | Description |
| --- | --- | --- |
| 启用召回 | `true` | Automatically retrieve memories before generation |
| 向量预筛 | `true` | Use embedding to find candidates first |
| 图扩散 | `true` | Diffuse along graph relations to related nodes |
| LLM 精排 | `true` | Let the LLM select final results from candidates |
| 召回 Top-K | `20` | Vector prefilter count |
| 最终节点上限 | `12` | Maximum number of nodes kept before injection |
| 图扩散 Top-K | `100` | Graph diffusion candidate count |
| LLM 候选池 | `30` | Candidate pool size for reranking |
| 多意图拆分 | `true` | Split one input into multiple retrieval intents |
| 上下文混合查询 | `true` | Blend the current input, previous assistant reply, and previous user message |
| 词法增强 | `true` | Weight exact keyword matches |
| 时序链接 | `true` | Mutually boost temporally nearby nodes |
| 多样性采样 | `true` | Avoid overly homogeneous recall results |
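
The way several of these stages can combine is easier to see in a toy version. The sketch below chains a vector prefilter, one-hop graph diffusion, a lexical boost, and the final node limit. It is a simplified illustration, not the plugin's algorithm: the `decay` and `lexical_bonus` values, the node and edge shapes, and the single-hop diffusion are assumptions, and multi-intent splitting, diversity sampling, and LLM reranking are left out.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(query_vec, query_terms, nodes, edges,
           top_k=20, final_limit=12, decay=0.5, lexical_bonus=0.2):
    # 1. Vector prefilter: keep the top_k nodes closest to the query.
    similarity = {nid: cosine(query_vec, node["vector"]) for nid, node in nodes.items()}
    seeds = sorted(similarity, key=similarity.get, reverse=True)[:top_k]
    fused = {nid: similarity[nid] for nid in seeds}

    # 2. One-hop graph diffusion: neighbors of a seed inherit a decayed score.
    for left, right in edges:
        for source, target in ((left, right), (right, left)):
            if source in seeds and target in nodes:
                fused[target] = max(fused.get(target, 0.0), similarity[source] * decay)

    # 3. Lexical boost: reward exact keyword hits in the node text.
    for nid in fused:
        text = nodes[nid]["text"].lower()
        if any(term.lower() in text for term in query_terms):
            fused[nid] += lexical_bonus

    # 4. Final node limit: keep only the best-scoring nodes for injection.
    return sorted(fused, key=fused.get, reverse=True)[:final_limit]


nodes = {
    "inn": {"vector": [1.0, 0.0], "text": "The Red Lantern inn"},
    "mara": {"vector": [0.0, 1.0], "text": "Mara, the innkeeper"},
    "storm": {"vector": [0.6, 0.8], "text": "A storm hit the harbor"},
}
edges = [("inn", "mara")]
print(recall([1.0, 0.0], ["mara"], nodes, edges, top_k=1, final_limit=2))
# ['inn', 'mara']
```

In the example, `mara` has zero vector similarity to the query but still reaches the result through its edge to `inn` and the keyword hit, while `storm` is dropped despite a higher raw similarity. This is the kind of effect graph diffusion and lexical boosting are meant to provide on top of plain vector search.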
### Cognitive and spatial settings
| Setting | Default | Description |
| --- | --- | --- |
| Scoped Memory | `true` | Enable scoped memory |
| POV Memory | `true` | Enable character/user POV memory |
| 区域目标 | `true` | Distinguish current region, adjacent regions, and global |
| 认知记忆 | `true` | Enable subjective/objective cognitive attribution |
| 空间邻接 | `true` | Allow adjacency relations between regions |
| 故事时间线 | `true` | Enable story timeline tags |
| 注入故事时间标签 | `true` | Hint the current story time in injection |
| 软时间引导 | `true` | Guide by prompting, without forcing rewrites |
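
The "current region, adjacent regions, and global" distinction can be pictured as a per-node weight. The sketch below is purely illustrative: the weight values, the function name, and the decision to give distant regions the same weight as global memories are assumptions, not documented ST-BME behavior.

```python
from typing import Dict, Optional, Set

# Assumed weights for illustration only.
REGION_WEIGHTS = {"current": 1.0, "adjacent": 0.6, "global": 0.3}


def region_weight(node_region: Optional[str], current_region: str,
                  adjacency: Dict[str, Set[str]]) -> float:
    """Weight a memory by where it happened relative to the current scene."""
    if node_region == current_region:
        return REGION_WEIGHTS["current"]
    if node_region in adjacency.get(current_region, set()):
        return REGION_WEIGHTS["adjacent"]
    # Region-less memories and distant regions share the global weight here.
    return REGION_WEIGHTS["global"]
```

A weight like this would typically be multiplied into the recall score, so memories from the current scene outrank otherwise similar memories from elsewhere.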
### Maintenance settings
| Setting | Default | Description |
| --- | --- | --- |
| 启用整合 | `true` | Similar/conflicting memory analysis and merge |
| 整合阈值 | `0.85` | Similarity trigger threshold |
| 启用小总结 | `true` | Compatible with the old `synopsis` name |
| 启用层级总结 | `true` | Use a small summary + rollup summary system |
| 小总结频率 | `3` | Generate a small summary every N extractions |
| 总结折叠扇入 | `3` | Roll up summaries when this many exist at the same layer |
| 启用智能触发 | `false` | Enhance extraction only in high-information scenes |
| 启用主动遗忘 | `false` | Periodically lower the priority of low-value nodes |
| 启用概率召回 | `false` | Allow a small number of weakly related memories to enter by probability |
| 启用反思 | `true` | Periodically summarize long-term trends |
| 启用自动压缩 | `true` | Compress similar memories by extraction cycle |
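
The interplay between the small-summary frequency (`3`) and the rollup fan-in (`3`) can be sketched as a layered structure. This is a toy model under stated assumptions: the function name, the list-of-layers representation, and removing folded summaries from their layer are all hypothetical choices.

```python
from typing import List


def add_small_summary(layers: List[List[str]], summary: str,
                      fan_in: int = 3) -> List[List[str]]:
    """Append a small summary to layer 0, then roll up any layer that is full."""
    layers = [list(layer) for layer in layers] or [[]]
    layers[0].append(summary)
    level = 0
    while len(layers[level]) >= fan_in:
        folded = " + ".join(layers[level])
        layers[level] = []  # Assumption: folded summaries leave the active layer.
        if level + 1 == len(layers):
            layers.append([])
        layers[level + 1].append(f"rollup({folded})")
        level += 1
    return layers


layers: List[List[str]] = []
for extraction in range(1, 10):
    if extraction % 3 == 0:  # small summary every 3 extractions
        layers = add_small_summary(layers, f"s{extraction // 3}")
print(layers)  # [[], ['rollup(s1 + s2 + s3)']]
```

With both defaults at `3`, nine extractions produce three small summaries, which fold into a single higher-level summary.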
### Task presets and regex cleanup
Task preset types:
- **`extract`**
- Memory extraction.
- **`recall`**
- Recall reranking.
- **`compress`**
- Memory compression.
- **`synopsis`**
- Small summary generation.
- **`summary_rollup`**
- Summary rollup.
- **`reflection`**
- Long-term reflection.
- **`consolidation`**
- Memory consolidation.
- **`planner`**
- ENA Planner planning.
Regex cleanup helps keep polluting tags out of extraction, recall, and injection:
- `thinking` / `think` / `analysis` / `reasoning`
- `choice`
- `UpdateVariable`
- `status_current_variable`
- `StatusPlaceHolderImpl`
Users can adjust global regex rules and task-local rules in "Task presets". When an empty rule set is explicitly saved, the plugin will not automatically add the default rules back.
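
As an illustration of what such a cleanup rule does, the sketch below strips paired XML-style tags and their contents. It is not the plugin's rule set: the pattern, the function name, and the assumption that tags appear as `<tag>...</tag>` pairs are illustrative.

```python
import re

DEFAULT_EXCLUDED_TAGS = ("think", "analysis", "reasoning")


def strip_tags(text: str, tags=DEFAULT_EXCLUDED_TAGS) -> str:
    """Remove each listed tag together with everything it wraps."""
    for tag in tags:
        name = re.escape(tag)
        pattern = re.compile(rf"<{name}\b[^>]*>.*?</{name}>", re.DOTALL | re.IGNORECASE)
        text = pattern.sub("", text)
    return text.strip()


print(strip_tags("<think>plan the scene</think>The door creaks open."))
# The door creaks open.
```

Passing an empty tuple leaves the text untouched, which mirrors the behavior described above where an explicitly saved empty rule set is respected.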
### ENA Planner
ENA Planner is now integrated through the `planner` task preset. For deeper implementation and flow details, see the [ENA Planner feature doc](../features/ena-planner.md). It can use:
- Character card blocks.
- World info blocks.
- Recent chat blocks.
- BME recalled memory blocks.
- Historical `<plot>` blocks.
- Current player input blocks.
Recommendations:
- Configure the base API and enabled state in "Config → ENA Planner".
- Adjust the planning prompt structure and generation parameters in "Config → Task presets → planner".
### Hide old turns and render limit
These are two separate features; for deeper implementation and boundary details, see the [Hide old turns and render limit feature doc](../features/hide-and-render.md):
- **Hide old turns**
- Controls context tokens.
- Does not delete chat content.
- Uses SillyTavern's hide mechanism so earlier turns no longer participate in the main reply or ST-BME reads.
- **Limit rendered chat turns**
- Reduces lag in very long chat UIs.
- Syncs to SillyTavern's `chat_truncation`.
- Only controls how many recent turns the frontend loads at most.
- It is not context hiding and is not message deletion.
Important notes:
- If you need to run "rerun extraction range" or full history recovery on very old turns, temporarily disable the render limit or increase the count and refresh.
- When ST-BME detects that the current `context.chat` is likely only a recent N-turn render slice, it pauses destructive history recovery to avoid wrongly clearing the runtime graph.
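
The protection described in the last note can be thought of as a guard in front of destructive recovery. The sketch below shows one possible heuristic. It is an assumption for illustration, not the plugin's actual detection logic, and all names are hypothetical.

```python
def looks_like_render_slice(loaded_turns: int, last_processed_turn: int,
                            render_limit: int) -> bool:
    """Heuristic: a render limit is active and the graph has already processed
    turns beyond what the frontend currently holds."""
    return (render_limit > 0
            and loaded_turns <= render_limit
            and last_processed_turn >= loaded_turns)


def plan_recovery(history_changed: bool, loaded_turns: int,
                  last_processed_turn: int, render_limit: int) -> str:
    """Decide what to do when the visible chat differs from recorded state."""
    if not history_changed:
        return "none"
    if looks_like_render_slice(loaded_turns, last_processed_turn, render_limit):
        # Missing turns are probably just not loaded, so do not roll back.
        return "paused"
    return "rollback"
```

The design point is that a short visible chat is ambiguous: it may mean the user deleted messages, or only that the frontend loaded a slice. Pausing is the conservative choice because it never discards graph data.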
### Native acceleration
Native acceleration is currently a gradual rollout capability. For deeper implementation and fallback strategy details, see the [Native acceleration feature doc](../features/native-acceleration.md). It covers:
- Graph layout.
- Persist Delta.
- Snapshot Hydrate.
Default strategy:
- Automatically activates based on thresholds for node count, edge count, record count, structural changes, and serialized size.
- `Fail-open` is enabled by default; when Native is unavailable or fails, ST-BME falls back to JS.
- You can use "globally force-disable Native" to fall back to JS everywhere.
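
The fail-open strategy follows a common pattern: try the accelerated path, and fall back to the portable one on any failure. The sketch below shows that pattern in Python with hypothetical names; the real plugin does this between Native/WASM and JS, and its threshold checks are more detailed than the single boolean used here.

```python
from typing import Callable, Optional, Tuple


def run_with_fail_open(native_impl: Optional[Callable], js_impl: Callable, payload,
                       *, force_disable_native: bool = False,
                       above_threshold: bool = True) -> Tuple[object, str]:
    """Return (result, backend_used), preferring native when it is allowed."""
    if force_disable_native or native_impl is None or not above_threshold:
        return js_impl(payload), "js"
    try:
        return native_impl(payload), "native"
    except Exception:
        # Fail-open: a native failure must never block the feature itself.
        return js_impl(payload), "js"
```

Returning which backend was used makes the fallback observable, which is useful when diagnosing why acceleration did not kick in.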


@@ -1,5 +1,7 @@
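# Main configuration
**中文** · [English](configuration.en.md)
This page is split out from the [README](../../README.md) as the main ST-BME user configuration reference, preserving setting names, defaults, and tables for quick lookup by feature.
### Memory LLM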

docs/usage/panel.en.md (new file, 113 lines added)

@@ -0,0 +1,113 @@
# Panel guide
[中文](panel.md) · **English**
This page is split out from the [README](../../README.en.md) as a user guide to the ST-BME panel areas, preserving the original item structure for daily lookup.
### Overview
- **Active nodes, edge connections, archived, fragmentation ratio**
- **Current chat ID**
- **History status**
- **Vector status**
- **Recent recovery**
- **Recent extraction**
- **Recent persistence**
- **Recent vector**
- **Recent recall**
- **Cognitive / spatial status**
### Tasks
The tasks page is used to observe ST-BME's background task flow in realtime.
- **Pipeline overview**
- Stage status for extraction, recall, persistence, vectors, and more.
- **Task timeline**
- Timeline and stage results for recent tasks.
- **Memory browser**
- Browse, filter, and inspect node details.
- **Injection preview**
- View the currently constructed injection text and token estimate.
- **Message tracing**
- Trace turns, extraction ranges, recall sources, and persistent records.
- **Persistence**
- View diagnostics for IndexedDB, sync, recovery, sidecar, native hydrate, and more.
### Actions
- **Re-extract**
- `提取未处理`: only process assistant turns that have not been extracted yet.
- `重新提取范围`: rerun a specified range by start/end turn.
- **Manual compression**
- Compress redundant or similar memories.
- **Generate small summary**
- Generate a staged summary based on a recent source text window.
- **Run summary rollup**
- Fold multiple active summaries into a higher-level summary.
- **Rebuild summary state**
- Rebuild summary state from extraction batches.
- **Force evolution**
- Let new memories actively affect old memories.
- **Run forgetting**
- Lower the priority of long-unused nodes or archive them.
- **Undo recent maintenance**
- Roll back the most recent reversible maintenance action.
- **Rebuild vectors / Range rebuild / Direct re-embed**
- Rebuild node vectors to fix recall quality or inconsistencies after switching vector models.
- **Export / import / rebuild graph**
- Graph management and dangerous operations.
- **Persistence repair**
- Retry persistence, re-detect the graph, rebuild the local cache, and repair/compact the main sidecar.
### Config
The config page contains these workspaces:
- **API config**
- Memory LLM.
- Embedding backend mode/direct mode.
- **Feature toggles**
- Main capabilities such as extraction, recall, consolidation, summary, reflection, compression, forgetting, and probabilistic recall.
- Cloud storage mode.
- World info filtering.
- Hide old turns and limit rendered chat turns.
- **Detailed parameters**
- Extraction frequency, context window, recall Top-K, graph diffusion, cognitive weights, maintenance thresholds, and more.
- **Task presets**
- Prompt blocks, generation parameters, regex, world info, and EJS templates for each task type.
- **ENA Planner**
- API, model, planning config, and task preset entry point for ENA Planner.
- **Panel appearance**
- Theme, notification style, debug logs, and Native acceleration.
- **Data cleanup**
- Cleanup entry points for local cache, legacy data, debug state, and more.
### Graph area
Desktop shows a realtime graph area. Mobile provides subview switching:
- **Realtime graph**
- **Cognitive view**
- **Summary view**


@@ -1,5 +1,7 @@
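# Panel guide
**中文** · [English](panel.en.md)
This page is split out from the [README](../../README.md) as a user guide to the ST-BME panel areas, preserving the original item structure for daily lookup.
### Overview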


@@ -0,0 +1,49 @@
# Storage & sync
[中文](storage-and-sync.md) · **English**
This page is split out from the [README](../../README.en.md) and covers ST-BME data storage, cloud mirroring, and persistent recall cards. The durable snapshot contract and forward-compatibility details are in the [storage and formats architecture doc](../architecture/storage-and-formats.md).
### Local primary storage
- Primary storage uses IndexedDB.
- Databases are isolated per chat and named like `STBME_{chatId}`.
- The hot path uses incremental commits to avoid replacing the whole graph.
- On load, the graph is restored from the local database first.
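
Two of the points above, per-chat database naming and incremental commits, can be sketched briefly. Only the `STBME_{chatId}` naming pattern comes from this page; the delta shape and function names below are assumptions for illustration.

```python
from typing import Dict


def database_name(chat_id: str) -> str:
    """One database per chat, following the STBME_{chatId} pattern."""
    return f"STBME_{chat_id}"


def graph_delta(previous: Dict[str, dict], current: Dict[str, dict]) -> dict:
    """Compute what changed, so only those records need to be written."""
    upserts = {key: value for key, value in current.items()
               if key not in previous or previous[key] != value}
    deletes = [key for key in previous if key not in current]
    return {"upserts": upserts, "deletes": deletes}
```

Writing only `upserts` and `deletes` keeps the cost of each commit proportional to what changed in a turn rather than to the size of the whole graph.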
### Cloud mirroring
Cloud sync uses SillyTavern's existing file API and requires no custom backend route.
- Automatic mode:
- After local writes, sync according to the current mirroring logic.
- Manual mode:
- Local writes still work normally.
- Does not write to the cloud automatically.
- Requires clicking "backup to cloud" or "fetch backup from cloud".
### Compatibility and fallback
- Old `chat_metadata.st_bme_graph` is only used as a migration and fallback source.
- The shadow snapshot and metadata-full are recoverable anchors, not the preferred primary storage.
- The tombstone is used to sync deletion state and prevent old data from coming back.
- Plugin settings are stored in SillyTavern's `extension_settings.st_bme`.
- Message-level recall is stored in the corresponding user message's `message.extra.bme_recall`.
### Persistent recall cards
User messages with valid `message.extra.bme_recall` display recall cards:
- Expand to view the recall text.
- View the recall subgraph.
- Click nodes to view details.
- Edit the injection text.
- Delete persistent recall.
- Re-run recall and overwrite the record.
Priority:
1. When a new recall succeeds in this round, use the new recall and write it back to the target user turn.
2. When there is no new recall in this round, read persistent recall from the user turn corresponding to the current generation as fallback.
3. When neither exists, clear the injection.
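
The three-step priority above maps directly onto a small decision function. This is a sketch with a hypothetical function name; only the `message.extra.bme_recall` location and its `injectionText` field are taken from the docs, and the rest of the record shape is assumed.

```python
from typing import Optional


def resolve_injection(new_recall: Optional[str], user_message: dict) -> Optional[str]:
    """Pick the injection text for this generation; None means clear the injection."""
    extra = user_message.setdefault("extra", {})
    if new_recall:
        # 1. A fresh recall wins and is written back to the target user turn.
        extra["bme_recall"] = {"injectionText": new_recall}
        return new_recall
    # 2. Otherwise fall back to the recall persisted on that turn.
    persisted = (extra.get("bme_recall") or {}).get("injectionText")
    # 3. Nothing available: signal that the injection should be cleared.
    return persisted or None
```

Because step 1 writes back to the message, regenerating the same turn later can reuse the stored recall through step 2 even if a new recall does not run.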


@@ -1,5 +1,7 @@
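# Storage & sync
**中文** · [English](storage-and-sync.en.md)
This page is split out from the [README](../../README.md) and covers ST-BME data storage, cloud mirroring, and persistent recall cards; the durable snapshot contract and forward-compat details are in the [storage and formats architecture doc](../architecture/storage-and-formats.md).
### Local primary storage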


@@ -0,0 +1,68 @@
# Troubleshooting
[中文](troubleshooting.md) · **English**
This page is split out from the [README](../../README.en.md) and covers common ST-BME user issues and fixes, so you can locate problems by symptom.
### Panel won't open
- Refresh the SillyTavern page.
- Confirm the extension directory contains `manifest.json`, `index.js`, and `style.css`.
- Open the browser console and search for `[ST-BME]`.
- Check whether another extension has overridden the top-left menu structure.
### No automatic extraction
- Confirm the plugin is enabled.
- Confirm the current chat already has assistant replies.
- Check "Overview → Recent extraction" and "Tasks → Pipeline overview".
- Check whether the memory LLM is available.
- If smart triggering is enabled, confirm the current content meets the trigger conditions.
- If a restore lock or persistence loading is active, wait for the state to recover.
### Poor recall quality
- Configure or repair Embedding.
- Run "rebuild vectors".
- Check the recall Top-K and final node limit values, and whether LLM reranking is enabled.
- Check whether there are too many nodes or they are too scattered; if so, run consolidation or compression.
- Check the per-message recall card to confirm the actual injection content.
### The model still sees too much content after old turns are hidden
- "Limit rendered chat turns" only reduces frontend loading; it does not save tokens.
- To actually control context, enable "hide old turns".
- After changing the setting, click "re-apply current hiding".
### Manual extraction says history recovery is paused
This is usually because "limit rendered chat turns" is enabled, so the frontend currently loads only the latest N turns.
How to handle it:
1. Temporarily disable "limit rendered chat turns", or increase N enough to cover the range you need to process.
2. Refresh the current chat.
3. Then run "extract unprocessed" or "rerun extraction range".
This is a protection mechanism; it does not mean the graph was lost.
### Nodes suddenly look cleared
- Refresh the page first.
- If it recovers after refresh, it is usually a temporary runtime state inconsistency; the persisted graph was not lost.
- Check "Overview → Recent recovery" and "Tasks → Persistence".
- Do not immediately run "rebuild graph" unless you confirm you want to regenerate all memories from the chat history.
### Recall cards are not displayed
- Confirm the target turn is a user message.
- Confirm `message.extra.bme_recall.injectionText` is not empty.
- Third-party themes must keep `#chat .mes` message nodes and stable turn-index attributes, such as `mesid`, `data-mesid`, or `data-message-id`.
- After enabling debug logs, search for `[ST-BME] Recall Card UI`.
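
The first three checks can be expressed as a small diagnostic, shown here as a sketch. The function name and the `dom_attributes` parameter are hypothetical, and `is_user` is assumed to be the flag that marks a user message in SillyTavern's chat objects.

```python
from typing import List, Set

TURN_INDEX_ATTRIBUTES = ("mesid", "data-mesid", "data-message-id")


def recall_card_problems(message: dict, dom_attributes: Set[str]) -> List[str]:
    """List the reasons a recall card would not be shown for this message."""
    problems = []
    if not message.get("is_user"):
        problems.append("target turn is not a user message")
    recall = (message.get("extra") or {}).get("bme_recall") or {}
    if not recall.get("injectionText"):
        problems.append("message.extra.bme_recall.injectionText is empty")
    if not any(attr in dom_attributes for attr in TURN_INDEX_ATTRIBUTES):
        problems.append("message node has no stable turn-index attribute")
    return problems
```

An empty list means all three preconditions hold; if the card is still missing at that point, the debug log search in the last bullet is the next place to look.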
### Direct Embedding fails
- Check the API URL and model name.
- Check the key.
- Check browser CORS.
- Prefer backend mode first.


@@ -1,5 +1,7 @@
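# Troubleshooting
**中文** · [English](troubleshooting.en.md)
This page is split out from the [README](../../README.md) and covers common ST-BME user issues and fixes, so you can quickly locate problems by symptom.
### Panel won't open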