docs: add English README + bilingual usage manual

2026-06-14 02:40:45 +08:00 · 2026-05-31 18:20:31 +00:00
parent c3023aff78
commit f67358e024
11 changed files with 633 additions and 5 deletions
--- a/docs/usage/configuration.en.md
+++ b/docs/usage/configuration.en.md
@@ -0,0 +1,204 @@
+# Configuration
+
+[中文](configuration.md) · **English**
+
+This page is split out from the [README](../../README.en.md) as the main ST-BME user configuration reference, preserving setting names, defaults, and tables for quick lookup by feature.
+
+### Memory LLM
+
+The memory LLM is used for:
+
+- Memory extraction.
+- Recall reranking.
+- Consolidation.
+- Compression.
+- Small summaries.
+- Summary rollup.
+- Reflection.
+- ENA Planner planning.
+
+Configuration options:
+
+- **Leave blank**
+  - Reuse the current SillyTavern chat model.
+
+- **Fill in OpenAI-compatible config**
+  - Use an independent model for memory tasks.
+  - Useful when you want to separate the main chat model from the background maintenance model.
+
+Security recommendations:
+
+- Do not publicly share exported `extension_settings` or browser storage that contains API keys.
+- Debug logs are off by default; enable them temporarily only when troubleshooting.
+
+### Embedding
+
+Embedding is the core of smart recall.
+
+#### Backend mode
+
+Try backend mode first:
+
+- Reuses the SillyTavern backend's embedding provider.
+- Usually avoids storing the embedding API key directly in the browser.
+- Can use sources already supported by SillyTavern, such as OpenAI, Cohere, Mistral, Ollama, LlamaCpp, and vLLM.
+
+#### Direct mode
+
+In direct mode, the browser requests the embedding service directly:
+
+- Requires filling in the API URL, key, and model.
+- May hit CORS restrictions.
+- Suitable for a self-hosted gateway or independent embedding service.
+
+> After switching embedding mode or model, run "rebuild vectors".
+
+### Extraction settings
+
+| Setting | Default | Description |
+| --- | --- | --- |
+| 每 N 条回复提取 (Extract every N replies) | `1` | Trigger extraction every N assistant replies |
+| 提取上下文轮数 (Extraction context rounds) | `2` | Number of conversation rounds to look back during extraction |
+| 自动延后最新助手 (Auto-defer latest assistant reply) | `false` | Allows the latest reply to stabilize before extraction |
+| Assistant 排除标签 (Assistant excluded tags) | `think,analysis,reasoning` | Excludes reasoning tags by default |
+| 提取消息上限 (Extraction message limit) | `0` | `0` means unlimited |
+| 提取 Prompt 结构模式 (Extraction prompt structure mode) | `both` | Provides both transcript and structured messages |
+| 提取世界书模式 (Extraction world info mode) | `active` | Reuses the currently active world info context |
+| 包含故事时间 (Include story time) | `true` | Provides the story timeline during extraction |
+| 包含总结快照 (Include summary snapshot) | `true` | Provides active summaries during extraction |
+| 手动提取模式 (Manual extraction mode) | `pending` | Default extraction mode in the panel |
+
+### Recall settings
+
+| Setting | Default | Description |
+| --- | --- | --- |
+| 启用召回 (Enable recall) | `true` | Automatically retrieve memories before generation |
+| 向量预筛 (Vector prefilter) | `true` | Use embedding to find candidates first |
+| 图扩散 (Graph diffusion) | `true` | Diffuse along graph relations to related nodes |
+| LLM 精排 (LLM reranking) | `true` | Let the LLM select final results from candidates |
+| 召回 Top-K (Recall Top-K) | `20` | Number of candidates kept by the vector prefilter |
+| 最终节点上限 (Final node limit) | `12` | Maximum number of nodes kept before injection |
+| 图扩散 Top-K (Graph diffusion Top-K) | `100` | Number of graph diffusion candidates |
+| LLM 候选池 (LLM candidate pool) | `30` | Candidate pool size for reranking |
+| 多意图拆分 (Multi-intent splitting) | `true` | Split one input into multiple retrieval intents |
+| 上下文混合查询 (Context-blended query) | `true` | Blend the current input, previous assistant reply, and previous user message |
+| 词法增强 (Lexical boost) | `true` | Weight exact keyword matches |
+| 时序链接 (Temporal linking) | `true` | Mutually boost temporally nearby nodes |
+| 多样性采样 (Diversity sampling) | `true` | Avoid overly homogeneous recall results |
+
+### Cognitive and spatial settings
+
+| Setting | Default | Description |
+| --- | --- | --- |
+| Scoped Memory | `true` | Enable scoped memory |
+| POV Memory | `true` | Enable character/user POV memory |
+| 区域目标 (Region targeting) | `true` | Distinguish current region, adjacent regions, and global |
+| 认知记忆 (Cognitive memory) | `true` | Enable subjective/objective cognitive attribution |
+| 空间邻接 (Spatial adjacency) | `true` | Allow adjacency relations between regions |
+| 故事时间线 (Story timeline) | `true` | Enable story timeline tags |
+| 注入故事时间标签 (Inject story time tags) | `true` | Hint the current story time in injection |
+| 软时间引导 (Soft time guidance) | `true` | Guide by prompting, without forcing rewrites |
+
+### Maintenance settings
+
+| Setting | Default | Description |
+| --- | --- | --- |
+| 启用整合 (Enable consolidation) | `true` | Analyze and merge similar or conflicting memories |
+| 整合阈值 (Consolidation threshold) | `0.85` | Similarity trigger threshold |
+| 启用小总结 (Enable small summary) | `true` | Generate small summaries; compatible with the old `synopsis` name |
+| 启用层级总结 (Enable hierarchical summary) | `true` | Use a small summary + rollup summary system |
+| 小总结频率 (Small summary frequency) | `3` | Generate a small summary every N extractions |
+| 总结折叠扇入 (Summary rollup fan-in) | `3` | Roll up summaries when this many exist at the same layer |
+| 启用智能触发 (Enable smart triggering) | `false` | Enhance extraction only in high-information scenes |
+| 启用主动遗忘 (Enable active forgetting) | `false` | Periodically lower the priority of low-value nodes |
+| 启用概率召回 (Enable probabilistic recall) | `false` | Allow a small number of weakly related memories to enter by probability |
+| 启用反思 (Enable reflection) | `true` | Periodically summarize long-term trends |
+| 启用自动压缩 (Enable auto compression) | `true` | Compress similar memories by extraction cycle |
+
+### Task presets and regex cleanup
+
+Task preset types:
+
+- **`extract`**
+  - Memory extraction.
+
+- **`recall`**
+  - Recall reranking.
+
+- **`compress`**
+  - Memory compression.
+
+- **`synopsis`**
+  - Small summary generation.
+
+- **`summary_rollup`**
+  - Summary rollup.
+
+- **`reflection`**
+  - Long-term reflection.
+
+- **`consolidation`**
+  - Memory consolidation.
+
+- **`planner`**
+  - ENA Planner planning.
+
+Regex cleanup reduces the number of polluting tags that enter extraction, recall, and injection:
+
+- `thinking` / `think` / `analysis` / `reasoning`
+- `choice`
+- `UpdateVariable`
+- `status_current_variable`
+- `StatusPlaceHolderImpl`
+
+Users can adjust global regex rules and task-local rules in "Task presets". When an empty rule set is explicitly saved, the plugin will not automatically add the default rules back.
+
+### ENA Planner
+
+ENA Planner is now integrated through the `planner` task preset. For deeper implementation and flow details, see the [ENA Planner feature doc](../features/ena-planner.md). It can use:
+
+- Character card blocks.
+- World info blocks.
+- Recent chat blocks.
+- BME recalled memory blocks.
+- Historical `<plot>` blocks.
+- Current player input blocks.
+
+Recommendations:
+
+- Configure the base API and enabled state in "Config → ENA Planner".
+- Adjust the planning prompt structure and generation parameters in "Config → Task presets → planner".
+
+### Hide old turns and render limit
+
+These are two separate features. For deeper implementation and edge-case details, see the [Hide old turns and render limit feature doc](../features/hide-and-render.md):
+
+- **Hide old turns**
+  - Controls context tokens.
+  - Does not delete chat content.
+  - Uses SillyTavern's hide mechanism, so earlier turns are no longer included in the main reply context or read by ST-BME.
+
+- **Limit rendered chat turns**
+  - Reduces lag in very long chat UIs.
+  - Syncs to SillyTavern's `chat_truncation`.
+  - Only controls how many recent turns the frontend loads at most.
+  - It is not context hiding and is not message deletion.
+
+Important notes:
+
+- If you need to run "rerun extraction range" or full history recovery on very old turns, temporarily disable the render limit or increase the count and refresh.
+- When ST-BME detects that the current `context.chat` is likely only a recent N-turn render slice, it pauses destructive history recovery to avoid wrongly clearing the runtime graph.
+
+### Native acceleration
+
+Native acceleration is currently being rolled out gradually. For deeper implementation and fallback strategy details, see the [Native acceleration feature doc](../features/native-acceleration.md). It covers:
+
+- Graph layout.
+- Persist Delta.
+- Snapshot Hydrate.
+
+Default strategy:
+
+- Automatically activates based on thresholds for node count, edge count, record count, structural changes, and serialized size.
+- `Fail-open` is enabled by default; when Native is unavailable or fails, ST-BME falls back to JS.
+- You can use "globally force-disable Native" to fall back to JS everywhere.
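
The fail-open default described above can be sketched roughly as follows. This is a minimal illustration in Python, not ST-BME's actual code: `run_with_fallback`, `native_layout`, `js_layout`, and the `force_disable_native` and `fail_open` parameters are hypothetical names introduced here, and the threshold-based activation is left out.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def run_with_fallback(
    native_impl: Callable[[], T],
    js_impl: Callable[[], T],
    *,
    force_disable_native: bool = False,
    fail_open: bool = True,
) -> Tuple[T, str]:
    """Run the native path when allowed, otherwise the JS path.

    Returns the result plus the name of the path that produced it.
    All names are hypothetical; this is not the plugin's API.
    """
    if force_disable_native:
        # Global kill switch: never attempt the native path.
        return js_impl(), "js"
    try:
        return native_impl(), "native"
    except Exception:
        if not fail_open:
            # Fail-closed: surface the native error to the caller.
            raise
        # Fail-open: a native failure degrades to JS instead of aborting.
        return js_impl(), "js"


def native_layout() -> list:
    # Stand-in for a native module that is unavailable.
    raise RuntimeError("native module unavailable")


def js_layout() -> list:
    # Stand-in for the pure-JS implementation.
    return [1, 2, 3]


result, path = run_with_fallback(native_layout, js_layout)
print(path, result)  # prints: js [1, 2, 3]
```

The point of the sketch is that a native failure changes only which path produced the result, not whether a result is returned.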
--- a/docs/usage/configuration.md
+++ b/docs/usage/configuration.md
@@ -1,5 +1,7 @@
 # 主要配置

+**中文** · [English](configuration.en.md)
+
 本文从 [README](../../README.md) 拆出 ST-BME 的主要用户配置说明，保留设置名称、默认值和表格，便于按功能查阅。

 ### 记忆 LLM
--- a/docs/usage/panel.en.md
+++ b/docs/usage/panel.en.md
@@ -0,0 +1,113 @@
+# Panel guide
+
+[中文](panel.md) · **English**
+
+This page is split out from the [README](../../README.en.md) as a user guide to the ST-BME panel areas, preserving the original item structure for daily lookup.
+
+### Overview
+
+- **Active nodes, edge connections, archived, fragmentation ratio**
+- **Current chat ID**
+- **History status**
+- **Vector status**
+- **Recent recovery**
+- **Recent extraction**
+- **Recent persistence**
+- **Recent vector**
+- **Recent recall**
+- **Cognitive / spatial status**
+
+### Tasks
+
+The tasks page lets you observe ST-BME's background task flow in real time.
+
+- **Pipeline overview**
+  - Stage status for extraction, recall, persistence, vectors, and more.
+
+- **Task timeline**
+  - Timeline and stage results for recent tasks.
+
+- **Memory browser**
+  - Browse, filter, and inspect node details.
+
+- **Injection preview**
+  - View the currently constructed injection text and token estimate.
+
+- **Message tracing**
+  - Trace turns, extraction ranges, recall sources, and persistent records.
+
+- **Persistence**
+  - View diagnostics for IndexedDB, sync, recovery, sidecar, native hydrate, and more.
+
+### Actions
+
+- **Re-extract**
+  - `提取未处理` (extract unprocessed): only process assistant turns that have not been extracted yet.
+  - `重新提取范围` (rerun extraction range): rerun a specified range by start/end turn.
+
+- **Manual compression**
+  - Compress redundant or similar memories.
+
+- **Generate small summary**
+  - Generate a staged summary based on a recent source text window.
+
+- **Run summary rollup**
+  - Fold multiple active summaries into a higher-level summary.
+
+- **Rebuild summary state**
+  - Rebuild summary state from extraction batches.
+
+- **Force evolution**
+  - Let new memories actively affect old memories.
+
+- **Run forgetting**
+  - Lower the priority of long-unused nodes or archive them.
+
+- **Undo recent maintenance**
+  - Roll back the most recent reversible maintenance action.
+
+- **Rebuild vectors / Range rebuild / Direct re-embed**
+  - Rebuild node vectors to fix poor recall quality, or to fix inconsistencies after switching vector models.
+
+- **Export / import / rebuild graph**
+  - Graph management and dangerous operations.
+
+- **Persistence repair**
+  - Retry persistence, re-detect the graph, rebuild the local cache, and repair/compact the main sidecar.
+
+### Config
+
+The config page contains these workspaces:
+
+- **API config**
+  - Memory LLM.
+  - Embedding backend mode/direct mode.
+
+- **Feature toggles**
+  - Main capabilities such as extraction, recall, consolidation, summary, reflection, compression, forgetting, and probabilistic recall.
+  - Cloud storage mode.
+  - World info filtering.
+  - Hide old turns and limit rendered chat turns.
+
+- **Detailed parameters**
+  - Extraction frequency, context window, recall Top-K, graph diffusion, cognitive weights, maintenance thresholds, and more.
+
+- **Task presets**
+  - Prompt blocks, generation parameters, regex, world info, and EJS templates for each task type.
+
+- **ENA Planner**
+  - API, model, planning config, and task preset entry point for ENA Planner.
+
+- **Panel appearance**
+  - Theme, notification style, debug logs, and Native acceleration.
+
+- **Data cleanup**
+  - Cleanup entry points for local cache, legacy data, debug state, and more.
+
+### Graph area
+
+The desktop layout shows a real-time graph area. The mobile layout lets you switch between subviews:
+
+- **Realtime graph**
+- **Cognitive view**
+- **Summary view**
--- a/docs/usage/panel.md
+++ b/docs/usage/panel.md
@@ -1,5 +1,7 @@
 # 面板导览

+**中文** · [English](panel.en.md)
+
 本文从 [README](../../README.md) 拆出 ST-BME 面板各区域的用户导览，保留原有条目结构，方便日常查找。

 ### 总览
--- a/docs/usage/storage-and-sync.en.md
+++ b/docs/usage/storage-and-sync.en.md
@@ -0,0 +1,49 @@
+# Storage & sync
+
+[中文](storage-and-sync.md) · **English**
+
+This page is split out from the [README](../../README.en.md) and covers ST-BME data storage, cloud mirroring, and persistent recall cards. For the durable snapshot contract and forward-compatibility details, see the [storage and formats architecture doc](../architecture/storage-and-formats.md).
+
+### Local primary storage
+
+- Primary storage uses IndexedDB.
+- Databases are isolated per chat and named like `STBME_{chatId}`.
+- The hot path uses incremental commits to avoid replacing the whole graph.
+- On load, the graph is restored from the local database first.
+
+### Cloud mirroring
+
+Cloud sync uses SillyTavern's existing file API and requires no custom backend route.
+
+- Automatic mode:
+  - Syncs after local writes, following the current mirroring logic.
+
+- Manual mode:
+  - Local writes still work normally.
+  - Does not write to the cloud automatically.
+  - Requires clicking "backup to cloud" or "fetch backup from cloud".
+
+### Compatibility and fallback
+
+- The legacy `chat_metadata.st_bme_graph` is used only as a migration and fallback source.
+- The shadow snapshot and metadata-full are recovery anchors, not the preferred primary storage.
+- Tombstones sync deletion state and prevent deleted data from coming back.
+- Plugin settings are stored in SillyTavern's `extension_settings.st_bme`.
+- Message-level recall is stored in the corresponding user message's `message.extra.bme_recall`.
+
+### Persistent recall cards
+
+User messages with valid `message.extra.bme_recall` display recall cards:
+
+- Expand to view the recall text.
+- View the recall subgraph.
+- Click nodes to view details.
+- Edit the injection text.
+- Delete persistent recall.
+- Re-run recall and overwrite the record.
+
+Priority:
+
+1. When a new recall succeeds in this round, use the new recall and write it back to the target user turn.
+2. When there is no new recall in this round, read persistent recall from the user turn corresponding to the current generation as fallback.
+3. When neither exists, clear the injection.
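
The priority order above can be sketched as a small function. This is a hedged illustration in Python rather than the plugin's real code: `resolve_injection` and the dict-shaped `user_turn` are hypothetical, and only the `extra.bme_recall.injectionText` path follows field names that appear in these docs.

```python
from typing import Optional


def resolve_injection(new_recall: Optional[str], user_turn: dict) -> str:
    """Pick the injection text for the current generation round.

    `user_turn` mimics a chat message carrying an `extra` dict.
    The function name and shape are hypothetical simplifications.
    """
    extra = user_turn.setdefault("extra", {})
    if new_recall:
        # 1. A fresh recall wins and is written back to the user turn.
        extra["bme_recall"] = {"injectionText": new_recall}
        return new_recall
    persisted = (extra.get("bme_recall") or {}).get("injectionText")
    if persisted:
        # 2. No fresh recall: fall back to the persisted record.
        return persisted
    # 3. Neither exists: clear the injection.
    return ""


turn = {"extra": {}}
print(repr(resolve_injection(None, turn)))                  # nothing to inject yet
print(resolve_injection("Alice owes Bob a favor.", turn))   # fresh recall, persisted
print(resolve_injection(None, turn))                        # persisted fallback
```

Writing the fresh recall back in step 1 is what makes step 2 possible on a later regeneration of the same turn.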
--- a/docs/usage/storage-and-sync.md
+++ b/docs/usage/storage-and-sync.md
@@ -1,5 +1,7 @@
 # 数据存储与同步

+**中文** · [English](storage-and-sync.en.md)
+
 本文从 [README](../../README.md) 拆出 ST-BME 的数据存储、云端镜像与持久召回卡片说明；durable snapshot contract 和 forward-compat 细节见 [存储与格式架构文档](../architecture/storage-and-formats.md)。

 ### 本地主存储
--- a/docs/usage/troubleshooting.en.md
+++ b/docs/usage/troubleshooting.en.md
@@ -0,0 +1,68 @@
+# Troubleshooting
+
+[中文](troubleshooting.md) · **English**
+
+This page is split out from the [README](../../README.en.md) and covers common ST-BME user issues and fixes, so you can locate problems by symptom.
+
+### Panel won't open
+
+- Refresh the SillyTavern page.
+- Confirm the extension directory contains `manifest.json`, `index.js`, and `style.css`.
+- Open the browser console and search for `[ST-BME]`.
+- Check whether another extension has overridden the top-left menu structure.
+
+### No automatic extraction
+
+- Confirm the plugin is enabled.
+- Confirm the current chat already has assistant replies.
+- Check "Overview → Recent extraction" and "Tasks → Pipeline overview".
+- Check whether the memory LLM is available.
+- If smart triggering is enabled, confirm the current content meets the trigger conditions.
+- If a restore lock or persistence loading is active, wait for the state to recover.
+
+### Poor recall quality
+
+- Configure or repair Embedding.
+- Run "rebuild vectors".
+- Check the recall Top-K and final node limit, and whether LLM reranking is enabled.
+- Check whether there are too many nodes or they are too scattered; if so, run consolidation or compression.
+- Check the per-message recall card to confirm the actual injection content.
+
+### The model still sees too much content after old turns are hidden
+
+- "Limit rendered chat turns" only reduces frontend loading; it does not save tokens.
+- To actually control context, enable "hide old turns".
+- After changing the setting, click "re-apply current hiding".
+
+### Manual extraction says history recovery is paused
+
+This is usually because "limit rendered chat turns" is enabled, so the frontend currently loads only the latest N turns.
+
+How to handle it:
+
+1. Temporarily disable "limit rendered chat turns", or increase N enough to cover the range you need to process.
+2. Refresh the current chat.
+3. Then run "extract unprocessed" or "rerun extraction range".
+
+This is a protection mechanism; it does not mean the graph was lost.
+
+### Nodes suddenly look cleared
+
+- Refresh the page first.
+- If it recovers after refresh, it is usually a temporary runtime state inconsistency; the persisted graph was not lost.
+- Check "Overview → Recent recovery" and "Tasks → Persistence".
+- Do not immediately run "rebuild graph" unless you confirm you want to regenerate all memories from the chat history.
+
+### Recall cards are not displayed
+
+- Confirm the target turn is a user message.
+- Confirm `message.extra.bme_recall.injectionText` is not empty.
+- Third-party themes must keep `#chat .mes` message nodes and stable turn-index attributes, such as `mesid`, `data-mesid`, or `data-message-id`.
+- After enabling debug logs, search for `[ST-BME] Recall Card UI`.
+
+### Direct Embedding fails
+
+- Check the API URL and model name.
+- Check the key.
+- Check browser CORS.
+- Prefer backend mode where possible.
--- a/docs/usage/troubleshooting.md
+++ b/docs/usage/troubleshooting.md
@@ -1,5 +1,7 @@
 # 排障指南

+**中文** · [English](troubleshooting.en.md)
+
 本文从 [README](../../README.md) 拆出 ST-BME 常见用户问题与处理方式，便于按症状快速定位。

 ### 面板打不开