ST-BME (Bionic Memory Ecology) is a SillyTavern third-party frontend extension. It distills the characters, events, locations, rules, plot threads, reflections, and summaries that appear over a long chat into a visual memory graph, then automatically recalls the most relevant memories and injects them into the prompt before each generation.

Documentation

This README is a slim entry point. The details live in docs/:

What you want	Where to look
Usage: configuration, panel, troubleshooting, storage	`docs/usage/`
Architecture, control plane, data formats	`docs/architecture/`
Algorithm internals (retrieval / extraction / vectors)	`docs/algorithms/`
How each feature works and its boundaries	`docs/features/`
Development, testing, contribution conventions	`docs/contributing/`

Quick links: Configuration · Panel guide · Troubleshooting · Memory model · History safety

Developer docs (architecture / algorithms / features / contributing) are currently Chinese-only. The English docs cover the README and the docs/usage/ user manual.

Core capabilities

Automatic memory extraction — After each AI reply, ST-BME extracts structured nodes and relations from the conversation (characters, events, locations, rules, plot threads, reflections, subjective memories), using a default two-stage objective + subjective/POV commit pipeline and excluding reasoning tags like think/analysis/reasoning.
Multi-layer hybrid recall — Before generation, relevant memories are recalled through vector prefilter, graph diffusion, lexical boosting, multi-intent splitting, DPP diversity sampling, and optional LLM reranking; per-message persistent recall cards are supported.
Cognitive architecture — Character POV / user POV / objective world memory, spatial region weighting, and a story timeline.
Summarization & maintenance — Small summaries, summary rollup, reflection, consolidation, automatic compression, active forgetting — all logged and reversible.
Graph visualization — A built-in canvas force-directed graph with realtime / cognitive / summary views and a mobile view.
Task preset system — Extraction, recall, compression, summary, reflection, consolidation, and planning all run through a unified task profile, with regex, world info, and EJS rendering.
ENA Planner integration — Pre-send story planning, integrated into the config page and the planner task preset.
Persistence & sync — Local-first (IndexedDB), with cloud mirroring, backup/restore, rebuild, and repair.
History safety — Detects message deletion / edits / swipes, automatically rolls back affected batches and recovers from the change point; protects against truncated "render only the last N" views.
Long-chat optimization — Hide old turns to control tokens, limit rendered turns to reduce lag, and accelerate key computations with a Native/WASM rollout.

How it works

ST-BME can be understood as three pipelines: write (conversation → memory), read (memory → injection), and safety (history change → recovery).

flowchart LR
    subgraph Write["Write: conversation → memory"]
        A["AI reply"] --> B["Structured message preprocessing"]
        B --> C["LLM objective extraction + subjective/POV extraction"]
        C --> D["Nearest-neighbor reconciliation + cognitive scoping"]
        D --> E["Write graph + vector sync + timeline"]
        E --> F["Consolidate / compress / summarize / reflect"]
    end

    subgraph Read["Read: memory → injection"]
        G["User about to generate"] --> H["Multi-intent + context-blended query"]
        H --> I["Vector prefilter + graph diffusion + lexical boost"]
        I --> J["Cognitive boundary filter + hybrid scoring"]
        J --> K["Optional LLM rerank + bucketed injection"]
    end

    subgraph Safe["Safety: history change → recovery"]
        L["Delete / edit / swipe"] --> M["Message hash detection"]
        M --> N["Locate affected turns"]
        N --> O["Roll back batches and vectors"]
        O --> P["Re-extract from the change point"]
    end

    F -.-> G
    P -.-> E

Write: the conversation is normalized into structured messages (reasoning tags excluded by default) → the LLM emits structured graph operations → nodes are written, vectors synced, timeline updated → post-processing (consolidation, compression, summary, reflection, forgetting).
Read: resolve the recall target → vector prefilter + graph diffusion + lexical boost → rank and filter by fusing multiple signals → bucketed injection into the prompt, optionally writing a persistent recall card.
Safety: a hash is recorded for each processed message; when a history change is detected, ST-BME prefers rollback-and-replay from the maintenance log, falling back to a full rebuild only when a safe rollback is not possible.

Algorithm details (formulas, parameters, thresholds) are in docs/algorithms/; architecture and data paths are in docs/architecture/overview.md.

Installation

Option 1: install via SillyTavern extensions

Open SillyTavern → Extensions → Install third-party extension, and enter the repository URL:

https://github.com/Youzini-afk/ST-Bionic-Memory-Ecology

Refresh the page after installation.

Paste the repository root URL, not a GitHub sub-page URL.

Option 2: manual installation

cd SillyTavern/data/default-user/extensions/third-party
git clone https://github.com/Youzini-afk/ST-Bionic-Memory-Ecology.git st-bme

Then restart or refresh SillyTavern.

Quick start

Open the panel — Click "Memory Graph" in the top-left menu.
Enable the plugin — Config → Feature toggles, confirm the main switch is on.
Configure the model — Leave the memory LLM blank to reuse the current chat model; or fill in an independent OpenAI-compatible URL/key/model under "API config".
Configure embedding — Backend mode is recommended (reuses SillyTavern's configured vector provider); direct mode also works but you must handle CORS yourself.
Start chatting — Just chat normally. Extraction runs after each AI reply, and recall runs before the next generation.
Check results — "Overview" for status, "Tasks → Memory browser" for nodes, the graph area for the relation network; a recall card may appear under user messages.

Minimum viable setup: enable the plugin and ensure the current chat model works. Recall quality drops noticeably when embedding is unavailable, so configure it early.

See Configuration for full settings and Panel guide for what each panel area does.

Common actions

Action	Location	Description
Re-extract	Actions → Memory ops	Extract unprocessed turns or rerun a range
Manual compress	Actions → Memory ops	Merge redundant high-level nodes
Generate small summary	Actions → Memory ops	Produce a staged summary for the recent text window
Run summary rollup	Actions → Memory ops	Fold multiple active summaries into a higher-level one
Rebuild summary state	Actions → Memory ops	Rebuild summaryState from extraction batches
Force evolution	Actions → Memory ops	Let new memories actively affect old ones
Run forgetting	Actions → Memory ops	Archive or down-weight low-value nodes
Undo recent maintenance	Actions → Memory ops	Roll back the most recent reversible maintenance
Rebuild vectors	Actions → Vector ops	Rebuild all node embeddings
Range rebuild	Actions → Vector ops	Rebuild only nodes related to a turn range
Direct re-embed	Actions → Vector ops	Re-embed using the direct embedding config
Export / import / rebuild graph	Actions → Graph management	Graph management and destructive ops
Backup / restore cloud	Config → Cloud storage mode	Manually upload/restore in manual mode
Unhide all	Config → Hide old turns	Restore turns hidden by ST-BME

After switching embedding mode or model, run "Rebuild vectors". Per-action details and danger notes are in Configuration and Panel guide.

Data storage & history safety (highlights)

Local-first: primary storage uses IndexedDB, isolated per chat (STBME_{chatId}), with incremental commits on the hot path.
Cloud mirroring: reuses SillyTavern's file API, supports auto/manual modes, requires no custom backend.
History safety: detects delete/edit/swipe, prefers rollback-and-replay, falls back to full rebuild when needed; protects against render-truncated views to avoid wrongly clearing the graph.
Forward compatibility: durable snapshots have a frozen top-level shape, tolerant parsing, and upgrade-on-read — extending the data structure means "add a field", not a big migration.

See Storage & sync, History safety, and Data formats & forward compatibility.

Having trouble?

Step-by-step troubleshooting for common situations (panel won't open, no auto-extraction, poor recall, nodes appear cleared, recall cards missing, direct embedding fails, etc.) is in Troubleshooting.

Known limitations

Memory quality depends on the LLM — if the extraction model misunderstands, the memory will be wrong too.
Embedding sets the recall floor — without high-quality vectors, recall leans more on lexical and graph structure.
Direct mode may be affected by CORS — browser security policy may block requests.
Very long chats still have a cost — hiding/render limits/summary rollup reduce pressure but can't eliminate all overhead.
History recovery prioritizes correctness — it falls back to a full rebuild when the log is insufficient, which can be slow.
Third-party themes may affect recall card mounting — cards may skip mounting if a theme removes the standard message DOM or turn-index attributes.
Native acceleration is a rollout capability — it fails open to JS by default and can be force-disabled in the panel.

License

AGPLv3 — see LICENSE.

Description

ST-BME（Bionic Memory Ecology）是一个 SillyTavern 第三方前端扩展。它会把长期聊天中出现的角色、事件、地点、规则、主线、反思和总结抽取成一张可视化记忆图谱，并在下一轮生成前自动召回最相关的记忆注入 prompt。

Readme AGPL-3.0 12 MiB