docs: add English README + bilingual usage manual

This commit is contained in:
youzini
2026-05-31 18:20:31 +00:00
parent c3023aff78
commit f67358e024
11 changed files with 633 additions and 5 deletions

View File

@@ -8,12 +8,12 @@
## 文档地图
### usage/ — 使用手册
面向用户:"怎么配、怎么用、出问题怎么查"。从精简后的 README 下沉的详细内容。
面向用户:"怎么配、怎么用、出问题怎么查"。从精简后的 README 下沉的详细内容。中英双语(`.md` 中文 / `.en.md` English
- [`configuration.md`](usage/configuration.md) — 完整配置参考:记忆 LLM、Embedding、提取/召回/认知/维护设置、任务预设、ENA、隐藏/渲染、Native
- [`panel.md`](usage/panel.md) — 面板导览:总览、任务、操作、配置、图谱区域
- [`troubleshooting.md`](usage/troubleshooting.md) — 排障指南
- [`storage-and-sync.md`](usage/storage-and-sync.md) — 数据存储、云端镜像、兼容兜底、持久召回卡片
- [`configuration.md`](usage/configuration.md) · [EN](usage/configuration.en.md) — 完整配置参考:记忆 LLM、Embedding、提取/召回/认知/维护设置、任务预设、ENA、隐藏/渲染、Native
- [`panel.md`](usage/panel.md) · [EN](usage/panel.en.md) — 面板导览:总览、任务、操作、配置、图谱区域
- [`troubleshooting.md`](usage/troubleshooting.md) · [EN](usage/troubleshooting.en.md) — 排障指南
- [`storage-and-sync.md`](usage/storage-and-sync.md) · [EN](usage/storage-and-sync.en.md) — 数据存储、云端镜像、兼容兜底、持久召回卡片
### architecture/ — 架构与控制平面
跨文件的结构、数据路径、不变量。这些内容变化慢,是理解"为什么这样组织"的入口。
@@ -56,3 +56,9 @@
1. **离代码越近,腐烂越慢。** 单个函数的 API 细节留在模块头注释里(改代码自然会改它),不抄进这里。本目录只写"跨文件的算法原理、不变量、功能行为"。
2. **不写一改就过期的内容。** 避免"某函数第几行做什么"这种描述;算法文档引用文件位置时,描述的是"哪个算法在哪个文件",而非逐行。
3. **改了行为就更新对应文档。** 算法参数、不变量、功能边界发生变化时,更新这里;纯重构(不改行为)通常不需要动文档。
## 双语约定
- **中文为源,英文跟随。** `.md` 是中文权威版,`.en.md` 是英文翻译。改文档先改中文 `.md`,再同步对应 `.en.md`
- 目前英文覆盖范围:根 `README``docs/usage/` 用户手册。`architecture/` / `algorithms/` / `features/` / `contributing/` 暂为中文,按需再加英文。
- 英文文件里指向其它有 `.en.md` 的文档时,链到英文兄弟文件;指向暂无英文版的开发者文档时,链到中文版即可。

View File

@@ -0,0 +1,204 @@
# Configuration
[中文](configuration.md) · **English**
This page is split out from the [README](../../README.en.md) as the main ST-BME user configuration reference, preserving setting names, defaults, and tables for quick lookup by feature.
### Memory LLM
The memory LLM is used for:
- Memory extraction.
- Recall reranking.
- Consolidation.
- Compression.
- Small summaries.
- Summary rollup.
- Reflection.
- ENA Planner planning.
Configuration options:
- **Leave blank**
- Reuse the current SillyTavern chat model.
- **Fill in OpenAI-compatible config**
- Use an independent model for memory tasks.
- Useful when you want to separate the main chat model from the background maintenance model.
Security recommendations:
- Do not publicly share exported `extension_settings` or browser storage that contains API keys.
- Debug logs are off by default; enable them temporarily only when troubleshooting.
### Embedding
Embedding is the core of smart recall.
#### Backend mode
Backend mode is recommended first:
- Reuse SillyTavern backend's embedding provider.
- Usually avoids storing the embedding API key directly in the browser.
- Can use sources already supported by SillyTavern, such as OpenAI, Cohere, Mistral, Ollama, LlamaCpp, and vLLM.
#### Direct mode
In direct mode, the browser requests the embedding service directly:
- Requires filling in the API URL, key, and model.
- May hit CORS restrictions.
- Suitable for a self-hosted gateway or independent embedding service.
> After switching embedding mode or model, run "rebuild vectors".
### Extraction settings
| Setting | Default | Description |
| --- | --- | --- |
| 每 N 条回复提取 | `1` | Trigger extraction every N assistant replies |
| 提取上下文轮数 | `2` | Number of conversation rounds to look back during extraction |
| 自动延后最新助手 | `false` | Allows the latest reply to stabilize before extraction |
| Assistant 排除标签 | `think,analysis,reasoning` | Excludes reasoning tags by default |
| 提取消息上限 | `0` | `0` means unlimited |
| 提取 Prompt 结构模式 | `both` | Provides both transcript and structured messages |
| 提取世界书模式 | `active` | Reuses the currently active world info context |
| 包含故事时间 | `true` | Provides the story timeline during extraction |
| 包含总结快照 | `true` | Provides active summaries during extraction |
| 手动提取模式 | `pending` | Default extraction mode in the panel |
### Recall settings
| Setting | Default | Description |
| --- | --- | --- |
| 启用召回 | `true` | Automatically retrieve memories before generation |
| 向量预筛 | `true` | Use embedding to find candidates first |
| 图扩散 | `true` | Diffuse along graph relations to related nodes |
| LLM 精排 | `true` | Let the LLM select final results from candidates |
| 召回 Top-K | `20` | Vector prefilter count |
| 最终节点上限 | `12` | Maximum number of nodes kept before injection |
| 图扩散 Top-K | `100` | Graph diffusion candidate count |
| LLM 候选池 | `30` | Candidate pool size for reranking |
| 多意图拆分 | `true` | Split one input into multiple retrieval intents |
| 上下文混合查询 | `true` | Blend the current input, previous assistant reply, and previous user message |
| 词法增强 | `true` | Weight exact keyword matches |
| 时序链接 | `true` | Mutually boost temporally nearby nodes |
| 多样性采样 | `true` | Avoid overly homogeneous recall results |
### Cognitive and spatial settings
| Setting | Default | Description |
| --- | --- | --- |
| Scoped Memory | `true` | Enable scoped memory |
| POV Memory | `true` | Enable character/user POV memory |
| 区域目标 | `true` | Distinguish current region, adjacent regions, and global |
| 认知记忆 | `true` | Enable subjective/objective cognitive attribution |
| 空间邻接 | `true` | Allow adjacency relations between regions |
| 故事时间线 | `true` | Enable story timeline tags |
| 注入故事时间标签 | `true` | Hint the current story time in injection |
| 软时间引导 | `true` | Guide by prompting, without forcing rewrites |
### Maintenance settings
| Setting | Default | Description |
| --- | --- | --- |
| 启用整合 | `true` | Similar/conflicting memory analysis and merge |
| 整合阈值 | `0.85` | Similarity trigger threshold |
| 启用小总结 | `true` | Compatible with the old `synopsis` name |
| 启用层级总结 | `true` | Use a small summary + rollup summary system |
| 小总结频率 | `3` | Generate a small summary every N extractions |
| 总结折叠扇入 | `3` | Roll up summaries when this many exist at the same layer |
| 启用智能触发 | `false` | Enhance extraction only in high-information scenes |
| 启用主动遗忘 | `false` | Periodically lower the priority of low-value nodes |
| 启用概率召回 | `false` | Allow a small number of weakly related memories to enter by probability |
| 启用反思 | `true` | Periodically summarize long-term trends |
| 启用自动压缩 | `true` | Compress similar memories by extraction cycle |
### Task presets and regex cleanup
Task preset types:
- **`extract`**
- Memory extraction.
- **`recall`**
- Recall reranking.
- **`compress`**
- Memory compression.
- **`synopsis`**
- Small summary generation.
- **`summary_rollup`**
- Summary rollup.
- **`reflection`**
- Long-term reflection.
- **`consolidation`**
- Memory consolidation.
- **`planner`**
- ENA Planner planning.
Regex cleanup reduces polluted tags from entering extraction, recall, and injection:
- `thinking` / `think` / `analysis` / `reasoning`
- `choice`
- `UpdateVariable`
- `status_current_variable`
- `StatusPlaceHolderImpl`
Users can adjust global regex rules and task-local rules in "Task presets". When an empty rule set is explicitly saved, the plugin will not automatically add the default rules back.
### ENA Planner
ENA Planner is now integrated through the `planner` task preset. For deeper implementation and flow details, see the [ENA Planner feature doc](../features/ena-planner.md). It can use:
- Character card blocks.
- World info blocks.
- Recent chat blocks.
- BME recalled memory blocks.
- Historical `<plot>` blocks.
- Current player input blocks.
Recommendations:
- Configure the base API and enabled state in "Config → ENA Planner".
- Adjust the planning prompt structure and generation parameters in "Config → Task presets → planner".
### Hide old turns and render limit
These are two separate features; for deeper implementation and boundary details, see the [Hide old turns and render limit feature doc](../features/hide-and-render.md):
- **Hide old turns**
- Controls context tokens.
- Does not delete chat content.
- Uses SillyTavern's hide mechanism so earlier turns no longer participate in the main reply or ST-BME reads.
- **Limit rendered chat turns**
- Reduces lag in very long chat UIs.
- Syncs to SillyTavern's `chat_truncation`.
- Only controls how many recent turns the frontend loads at most.
- It is not context hiding and is not message deletion.
Important notes:
- If you need to run "rerun extraction range" or full history recovery on very old turns, temporarily disable the render limit or increase the count and refresh.
- When ST-BME detects that the current `context.chat` is likely only a recent N-turn render slice, it pauses destructive history recovery to avoid wrongly clearing the runtime graph.
### Native acceleration
Native acceleration is currently a gradual rollout capability. For deeper implementation and fallback strategy details, see the [Native acceleration feature doc](../features/native-acceleration.md). It covers:
- Graph layout.
- Persist Delta.
- Snapshot Hydrate.
Default strategy:
- Automatically activates based on thresholds for node count, edge count, record count, structural changes, and serialized size.
- `Fail-open` is enabled by default; when Native is unavailable or fails, ST-BME falls back to JS.
- You can use "globally force-disable Native" to fall back to JS everywhere.

View File

@@ -1,5 +1,7 @@
# 主要配置
**中文** · [English](configuration.en.md)
本文从 [README](../../README.md) 拆出 ST-BME 的主要用户配置说明,保留设置名称、默认值和表格,便于按功能查阅。
### 记忆 LLM

113
docs/usage/panel.en.md Normal file
View File

@@ -0,0 +1,113 @@
# Panel guide
[中文](panel.md) · **English**
This page is split out from the [README](../../README.en.md) as a user guide to the ST-BME panel areas, preserving the original item structure for daily lookup.
### Overview
- **Active nodes, edge connections, archived, fragmentation ratio**
- **Current chat ID**
- **History status**
- **Vector status**
- **Recent recovery**
- **Recent extraction**
- **Recent persistence**
- **Recent vector**
- **Recent recall**
- **Cognitive / spatial status**
### Tasks
The tasks page is used to observe ST-BME's background task flow in realtime.
- **Pipeline overview**
- Stage status for extraction, recall, persistence, vectors, and more.
- **Task timeline**
- Timeline and stage results for recent tasks.
- **Memory browser**
- Browse, filter, and inspect node details.
- **Injection preview**
- View the currently constructed injection text and token estimate.
- **Message tracing**
- Trace turns, extraction ranges, recall sources, and persistent records.
- **Persistence**
- View diagnostics for IndexedDB, sync, recovery, sidecar, native hydrate, and more.
### Actions
- **Re-extract**
- `提取未处理`: only process assistant turns that have not been extracted yet.
- `重新提取范围`: rerun a specified range by start/end turn.
- **Manual compression**
- Compress redundant or similar memories.
- **Generate small summary**
- Generate a staged summary based on a recent source text window.
- **Run summary rollup**
- Fold multiple active summaries into a higher-level summary.
- **Rebuild summary state**
- Rebuild summary state from extraction batches.
- **Force evolution**
- Let new memories actively affect old memories.
- **Run forgetting**
- Lower the priority of long-unused nodes or archive them.
- **Undo recent maintenance**
- Roll back the most recent reversible maintenance action.
- **Rebuild vectors / Range rebuild / Direct re-embed**
- Rebuild node vectors to fix recall quality or inconsistencies after switching vector models.
- **Export / import / rebuild graph**
- Graph management and dangerous operations.
- **Persistence repair**
- Retry persistence, re-detect the graph, rebuild the local cache, and repair/compact the main sidecar.
### Config
The config page contains these workspaces:
- **API config**
- Memory LLM.
- Embedding backend mode/direct mode.
- **Feature toggles**
- Main capabilities such as extraction, recall, consolidation, summary, reflection, compression, forgetting, and probabilistic recall.
- Cloud storage mode.
- World info filtering.
- Hide old turns and limit rendered chat turns.
- **Detailed parameters**
- Extraction frequency, context window, recall Top-K, graph diffusion, cognitive weights, maintenance thresholds, and more.
- **Task presets**
- Prompt blocks, generation parameters, regex, world info, and EJS templates for each task type.
- **ENA Planner**
- API, model, planning config, and task preset entry point for ENA Planner.
- **Panel appearance**
- Theme, notification style, debug logs, and Native acceleration.
- **Data cleanup**
- Cleanup entry points for local cache, legacy data, debug state, and more.
### Graph area
Desktop shows a realtime graph area. Mobile provides subview switching:
- **Realtime graph**
- **Cognitive view**
- **Summary view**

View File

@@ -1,5 +1,7 @@
# 面板导览
**中文** · [English](panel.en.md)
本文从 [README](../../README.md) 拆出 ST-BME 面板各区域的用户导览,保留原有条目结构,方便日常查找。
### 总览

View File

@@ -0,0 +1,49 @@
# Storage & sync
[中文](storage-and-sync.md) · **English**
This page is split out from the [README](../../README.en.md) with ST-BME data storage, cloud mirroring, and persistent recall card notes; durable snapshot contract and forward-compat details are in the [storage and formats architecture doc](../architecture/storage-and-formats.md).
### Local primary storage
- Primary storage uses IndexedDB.
- Databases are isolated per chat and named like `STBME_{chatId}`.
- The hot path uses incremental commits to avoid replacing the whole graph.
- On load, the graph is restored from the local database first.
### Cloud mirroring
Cloud sync uses SillyTavern's existing file API and requires no custom backend route.
- Automatic mode:
- After local writes, sync according to the current mirroring logic.
- Manual mode:
- Local writes still work normally.
- Does not write to the cloud automatically.
- Requires clicking "backup to cloud" or "fetch backup from cloud".
### Compatibility and fallback
- Old `chat_metadata.st_bme_graph` is only used as a migration and fallback source.
- shadow snapshot and metadata-full are recoverable anchors, not the preferred primary storage.
- tombstone is used to sync deletion state and prevent old data from coming back.
- Plugin settings are stored in SillyTavern's `extension_settings.st_bme`.
- Message-level recall is stored in the corresponding user message's `message.extra.bme_recall`.
### Persistent recall cards
User messages with valid `message.extra.bme_recall` display recall cards:
- Expand to view the recall text.
- View the recall subgraph.
- Click nodes to view details.
- Edit the injection text.
- Delete persistent recall.
- Re-run recall and overwrite the record.
Priority:
1. When a new recall succeeds in this round, use the new recall and write it back to the target user turn.
2. When there is no new recall in this round, read persistent recall from the user turn corresponding to the current generation as fallback.
3. When neither exists, clear the injection.

View File

@@ -1,5 +1,7 @@
# 数据存储与同步
**中文** · [English](storage-and-sync.en.md)
本文从 [README](../../README.md) 拆出 ST-BME 的数据存储、云端镜像与持久召回卡片说明durable snapshot contract 和 forward-compat 细节见 [存储与格式架构文档](../architecture/storage-and-formats.md)。
### 本地主存储

View File

@@ -0,0 +1,68 @@
# Troubleshooting
[中文](troubleshooting.md) · **English**
This page is split out from the [README](../../README.en.md) with common ST-BME user issues and fixes, so you can locate problems by symptom.
### Panel won't open
- Refresh the SillyTavern page.
- Confirm the extension directory contains `manifest.json`, `index.js`, and `style.css`.
- Open the browser console and search for `[ST-BME]`.
- Check whether another extension has overridden the top-left menu structure.
### No automatic extraction
- Confirm the plugin is enabled.
- Confirm the current chat already has assistant replies.
- Check "Overview → Recent extraction" and "Tasks → Pipeline overview".
- Check whether the memory LLM is available.
- If smart triggering is enabled, confirm the current content meets the trigger conditions.
- If a restore lock or persistence loading is active, wait for the state to recover.
### Poor recall quality
- Configure or repair Embedding.
- Run "rebuild vectors".
- Check whether recall Top-K, final node limit, and LLM reranking are enabled.
- Check whether nodes are too many or too scattered; you can run consolidation or compression.
- Check the per-message recall card to confirm the actual injection content.
### The model still sees too much content after old turns are hidden
- "Limit rendered chat turns" only reduces frontend loading; it does not save tokens.
- To actually control context, enable "hide old turns".
- After changing the setting, click "re-apply current hiding".
### Manual extraction says history recovery is paused
This is usually because "limit rendered chat turns" is enabled, so the frontend currently loads only the latest N turns.
How to handle it:
1. Temporarily disable "limit rendered chat turns", or increase N enough to cover the range you need to process.
2. Refresh the current chat.
3. Then run "extract unprocessed" or "rerun extraction range".
This is a protection mechanism; it does not mean the graph was lost.
### Nodes suddenly look cleared
- Refresh the page first.
- If it recovers after refresh, it is usually a temporary runtime state inconsistency; the persisted graph was not lost.
- Check "Overview → Recent recovery" and "Tasks → Persistence".
- Do not immediately run "rebuild graph" unless you confirm you want to regenerate all memories from the chat history.
### Recall cards are not displayed
- Confirm the target turn is a user message.
- Confirm `message.extra.bme_recall.injectionText` is not empty.
- Third-party themes must keep `#chat .mes` message nodes and stable turn-index attributes, such as `mesid`, `data-mesid`, or `data-message-id`.
- After enabling debug logs, search for `[ST-BME] Recall Card UI`.
### Direct Embedding fails
- Check the API URL and model name.
- Check the key.
- Check browser CORS.
- Prefer backend mode first.

View File

@@ -1,5 +1,7 @@
# 排障指南
**中文** · [English](troubleshooting.en.md)
本文从 [README](../../README.md) 拆出 ST-BME 常见用户问题与处理方式,便于按症状快速定位。
### 面板打不开