API reference

Flat catalogue of every public class, method, and property in the Sauti.* namespaces. Each member links to the line in the source where it is defined.

Scope

This reference covers public members only. Private fields, internal helpers, and the test-only FakeRagBackend from SautiRagTests.cs are omitted. For private internals, read the source files directly — they are heavily commented.

Namespace map

Namespace	Assembly	What lives here
`Sauti.Memory`	`Sauti.Runtime`	Layer 2 and Layer 3 of the three-layer memory architecture.
`Sauti.Tts`	`Sauti.Runtime`	Kokoro ONNX runner + English G2P fallback.
`Sauti.Editor.Rag`	`Sauti.Editor`	Knowledge-base chunker, MiniLM embedder, WordPiece tokeniser, RAG database builder (Editor-only menu).
`Sauti.Experiments.*`	per-experiment	Reference MonoBehaviours under `experiments/`. Not part of the runtime API surface.

`Sauti.Memory` namespace

`TemporaryMemory`

Static class. Pure C# — no UnityEngine dependency. Unit-testable headlessly.

Source: Assets/Sauti/Runtime/Scripts/TemporaryMemory.cs:18

Member	Signature	Description
`Set`	`static void Set(string key, string value)`	Insert or overwrite a session-scoped key/value fact.
`Clear`	`static void Clear()`	Wipe every fact. Call on scene unload / app exit.
`BuildPromptBlock`	`static string BuildPromptBlock()`	Render facts as `"Known facts about this session: k1=v1, k2=v2.\n"`. Returns an empty string when no facts are set. The output can be appended as-is to the prompt being assembled.

Spec: see Memory layers — Layer 2 and voice_ai_architecture.md § 4.2.
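
The documented block format can be illustrated with a small stand-in. This is a sketch only, not the Sauti source: the class name `SessionFactsSketch` is hypothetical, and rendering facts in ordinal key order is an assumption (the real `TemporaryMemory` may order them differently).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for TemporaryMemory; not the Sauti implementation.
public static class SessionFactsSketch
{
    // Assumption: facts are rendered in ordinal key order.
    private static readonly SortedDictionary<string, string> Facts =
        new SortedDictionary<string, string>(StringComparer.Ordinal);

    // Insert or overwrite a fact.
    public static void Set(string key, string value) => Facts[key] = value;

    // Wipe every fact.
    public static void Clear() => Facts.Clear();

    // Render the documented "Known facts about this session: ..." line,
    // or an empty string when nothing is stored.
    public static string BuildPromptBlock()
    {
        if (Facts.Count == 0) return string.Empty;
        string body = string.Join(", ", Facts.Select(kv => kv.Key + "=" + kv.Value));
        return "Known facts about this session: " + body + ".\n";
    }
}
```

Because the empty case returns an empty string, a caller can append the result unconditionally without checking whether any facts exist.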

`ISautiRagBackend`

Interface. The injection seam that lets SautiRag swap LLMUnity's DBSearch for any other vector backend (fake, custom ONNX cosine search, on-disk flat index, etc.).

Source: Assets/Sauti/Runtime/Scripts/ISautiRagBackend.cs:15

Member	Signature	Description
`IsLoaded`	`bool { get; }`	True once a database has been loaded into memory.
`LoadAsync`	`Task LoadAsync(string path)`	Load a pre-built vector database from disk.
`SearchAsync`	`Task<(string[] chunks, float[] scores)> SearchAsync(string query, int numResults)`	Return the top `numResults` chunks most similar to `query`, paired with cosine-similarity scores. Returns empty arrays if no database is loaded.

See Extending Sauti — ISautiRagBackend for a custom-implementation walkthrough.
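
As a rough illustration of the injection seam, the sketch below implements the three members from the table with an in-memory backend. The interface is re-declared from the table so the block compiles on its own. `InMemoryRagBackend` is a hypothetical name, and its word-overlap score is a deliberate stand-in for cosine similarity, not what any real backend computes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Re-declared from the member table so the sketch is self-contained.
public interface ISautiRagBackend
{
    bool IsLoaded { get; }
    Task LoadAsync(string path);
    Task<(string[] chunks, float[] scores)> SearchAsync(string query, int numResults);
}

// Hypothetical backend: ranks fixed in-memory chunks by word overlap (Jaccard).
public sealed class InMemoryRagBackend : ISautiRagBackend
{
    private readonly string[] _chunks;

    public InMemoryRagBackend(params string[] chunks) { _chunks = chunks; }

    public bool IsLoaded { get; private set; }

    // Assumption: nothing is read from disk; `path` is ignored in this sketch.
    public Task LoadAsync(string path)
    {
        IsLoaded = true;
        return Task.CompletedTask;
    }

    public Task<(string[] chunks, float[] scores)> SearchAsync(string query, int numResults)
    {
        // Documented contract: empty arrays when nothing is loaded.
        if (!IsLoaded)
            return Task.FromResult<(string[] chunks, float[] scores)>(
                (Array.Empty<string>(), Array.Empty<float>()));

        HashSet<string> queryWords = Words(query);
        var ranked = _chunks
            .Select(c => (chunk: c, score: Overlap(queryWords, Words(c))))
            .OrderByDescending(x => x.score)
            .Take(Math.Max(0, numResults))
            .ToArray();

        return Task.FromResult<(string[] chunks, float[] scores)>(
            (ranked.Select(x => x.chunk).ToArray(), ranked.Select(x => x.score).ToArray()));
    }

    private static HashSet<string> Words(string text) =>
        new HashSet<string>(text.ToLowerInvariant()
            .Split(new[] { ' ', ',', '.', '?', '!' }, StringSplitOptions.RemoveEmptyEntries));

    // Jaccard overlap: shared words divided by the size of the union.
    private static float Overlap(HashSet<string> a, HashSet<string> b)
    {
        if (a.Count == 0 || b.Count == 0) return 0f;
        int shared = a.Count(w => b.Contains(w));
        return (float)shared / (a.Count + b.Count - shared);
    }
}
```

A backend like this is mainly useful in headless tests, where loading a real vector database would be slow or unavailable.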

`LlmUnityRagBackend`

Default ISautiRagBackend implementation. Delegates to LLMUnity's RAG MonoBehaviour façade (v3.0.3).

Source: Assets/Sauti/Runtime/Scripts/LlmUnityRagBackend.cs:32

Member	Signature	Description
ctor	`LlmUnityRagBackend(LLMUnity.RAG rag)`	Take a pre-initialised `RAG` component. Caller must run `AddComponent<RAG>()` + `rag.Init(SearchMethods.DBSearch, ChunkingMethods.NoChunking, llm)` before passing it here.
`IsLoaded`	`bool { get; }`	True after a successful `LoadAsync` call.
`LoadAsync`	`Task LoadAsync(string path)`	Awaits `RAG.Load(string)` and throws `InvalidOperationException` if it returns `false`.
`SearchAsync`	`Task<(string[] chunks, float[] scores)> SearchAsync(string query, int numResults)`	Awaits `RAG.Search(query, numResults)` and returns the result tuple.

Compile-time gate

LlmUnityRagBackend is wrapped in #if SAUTI_LLMUNITY_AVAILABLE. When the symbol is undefined, a fallback stub is compiled instead, so the assembly still builds before LLMUnity is imported; the stub throws InvalidOperationException when used. Define the symbol in Project Settings -> Player -> Scripting Define Symbols once you have wired LLMUnity into Sauti.Runtime.asmdef's references.
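
The gating pattern can be sketched in isolation as follows. `GatedBackendSketch` and its `Describe` method are hypothetical; which members of the real stub throw, and with what message, is an assumption here.

```csharp
using System;

// Hypothetical illustration of a compile-time gate; not the Sauti source.
public sealed class GatedBackendSketch
{
#if SAUTI_LLMUNITY_AVAILABLE
    // Compiled only when the scripting define symbol is set.
    public string Describe() => "LLMUnity-backed implementation";
#else
    // Fallback stub: lets the assembly compile, fails loudly at runtime.
    public string Describe() =>
        throw new InvalidOperationException(
            "LLMUnity is not available. Define SAUTI_LLMUNITY_AVAILABLE after importing it.");
#endif
}
```

The public surface is identical in both branches, so calling code compiles either way and only fails at the point of use.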

`SautiRag`

Sealed class. Public façade for Layer 3 (RAG). Wraps an ISautiRagBackend so consumers get a stable surface even if the underlying engine swaps.

Source: Assets/Sauti/Runtime/Scripts/SautiRag.cs:21

Member	Signature	Description
`MinNumResults`	`const int = 1`	Lower clamp on the top-K parameter.
`MaxNumResults`	`const int = 50`	Upper clamp on the top-K parameter. Larger requests would return more chunks than the LLM context budget can hold.
`DefaultNumResults`	`const int = 3`	Default top-K. Matches `voice_ai_architecture.md § 4.3`.
ctor	`SautiRag(ISautiRagBackend backend)`	Inject a backend. Throws `ArgumentNullException` on a null backend.
`IsLoaded`	`bool { get; }`	Delegates to the backend.
`LoadAsync`	`Task LoadAsync(string path)`	Checks that `path` exists and throws `FileNotFoundException` if it does not; otherwise forwards to the backend.
`SearchAsync`	`Task<(string[] chunks, float[] scores)> SearchAsync(string query, int numResults = DefaultNumResults)`	Returns empty arrays for blank query or unloaded backend. Clamps `numResults` to `[MinNumResults, MaxNumResults]`.

Usage:

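// ragComponent is an LLMUnity.RAG that has already been initialised (see LlmUnityRagBackend).
var rag = new SautiRag(new LlmUnityRagBackend(ragComponent));
await rag.LoadAsync(Path.Combine(Application.streamingAssetsPath, "VoiceAI/rag/knowledge.db"));
// chunks[i] pairs with scores[i]; numResults is clamped to [MinNumResults, MaxNumResults].
(string[] chunks, float[] scores) = await rag.SearchAsync(userQuery, numResults: 3);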
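
The clamping and early-return rules in the `SearchAsync` row can be sketched as two pure helpers. The constants mirror the table; the class name `TopKClampSketch` and the helper names are hypothetical, and the real method may order its checks differently.

```csharp
using System;

// Hypothetical helpers mirroring the documented SearchAsync guards.
public static class TopKClampSketch
{
    public const int MinNumResults = 1;
    public const int MaxNumResults = 50;
    public const int DefaultNumResults = 3;

    // Clamp a requested top-K into [MinNumResults, MaxNumResults].
    public static int Clamp(int numResults) =>
        Math.Max(MinNumResults, Math.Min(MaxNumResults, numResults));

    // A search only runs for a non-blank query against a loaded backend.
    public static bool ShouldSearch(string query, bool backendLoaded) =>
        backendLoaded && !string.IsNullOrWhiteSpace(query);
}
```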

`Sauti.Tts` namespace

`KokoroTtsRunner`

Sealed class, IDisposable. Hand-authored Kokoro-82M ONNX TTS runner. Modelled on the raw Microsoft.ML.OnnxRuntime.InferenceSession pattern from SupertonicTTS Helper.cs (the only verified raw-ORT TTS sample in asus4/onnxruntime-unity-examples).

Source: Assets/Sauti/Runtime/Scripts/Tts/KokoroTtsRunner.cs:66

Member	Signature	Description
`DefaultSampleRate`	`const int = 24000`	Sample rate of generated PCM. Per the upstream Kokoro-82M model card.
`StyleVectorDim`	`const int = 256`	Style vector dim. Each voice `.bin` is shape `(-1, 1, 256)`.
`MaxTokenSequence`	`const int = 512`	Maximum token sequence the model accepts including pad wrappers. Voice files ship 512 rows so the longest unwrapped sequence is 511 tokens.
ctor	`KokoroTtsRunner(string modelPath, string tokenizerPath, string voicesDirectoryPath)`	Argument validation only — the ONNX session and voice scan happen lazily on first synth call. `tokenizerPath` may be null/missing (the runner falls back to an embedded vocab).
`SampleRate`	`int { get; }`	Returns `DefaultSampleRate`. Surfaced as a property so callers wiring an `AudioClip` don't hard-code 24 kHz.
`AvailableVoiceIds`	`IReadOnlyList<string> { get; }`	Voice ids discovered from the voices directory (filename without `.bin`). Populated on first use.
`SynthesizeAsync`	`Task<float[]> SynthesizeAsync(string text, string voiceId, CancellationToken ct = default)`	Phonemise `text` via `EnglishG2P`, then call `SynthesizeFromPhonemesAsync`. Returns mono float PCM in `[-1, 1]` at `SampleRate`.
`SynthesizeFromPhonemesAsync`	`Task<float[]> SynthesizeFromPhonemesAsync(string phonemes, string voiceId, float speed = 1.0f, CancellationToken ct = default)`	Synthesise from a pre-phonemised IPA string. Characters outside the Kokoro 177-entry vocab are dropped silently.
`Dispose`	`void Dispose()`	Disposes the underlying `InferenceSession` and `SessionOptions`; clears the voice cache.
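
Since the synth methods return mono float PCM in `[-1, 1]`, callers often need a duration or a 16-bit conversion (for example before writing a WAV file). The helpers below are a minimal sketch of both; `PcmSketch` is a hypothetical name and nothing here is part of the runner itself.

```csharp
using System;

// Hypothetical helpers for working with float PCM returned by a synth call.
public static class PcmSketch
{
    // Clip duration in seconds for a mono buffer.
    public static double DurationSeconds(int sampleCount, int sampleRate) =>
        (double)sampleCount / sampleRate;

    // Convert float samples to signed 16-bit, clamping anything outside [-1, 1].
    public static short[] ToPcm16(float[] samples)
    {
        var pcm = new short[samples.Length];
        for (int i = 0; i < samples.Length; i++)
        {
            float clamped = Math.Max(-1f, Math.Min(1f, samples[i]));
            pcm[i] = (short)Math.Round(clamped * short.MaxValue);
        }
        return pcm;
    }
}
```

Passing the runner's `SampleRate` property rather than a literal keeps the duration correct if the rate ever changes.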

Thread safety

SynthesizeAsync is not safe to call concurrently: the runner's underlying InferenceSession serves one call at a time. If several callers need synthesis, serialise their calls through your own queue.
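
One way to serialise calls is a single-slot `SemaphoreSlim` gate. The wrapper below is a sketch under that assumption; `SerializedSynthQueue` is a hypothetical helper, not part of Sauti, and it takes the synthesis call as a delegate so it does not depend on the runner type.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: lets only one synthesis job run at a time.
public sealed class SerializedSynthQueue : IDisposable
{
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);

    // Waits for the previous job to finish, then runs `synthesize`.
    public async Task<float[]> RunAsync(
        Func<Task<float[]>> synthesize, CancellationToken ct = default)
    {
        await _gate.WaitAsync(ct);
        try
        {
            return await synthesize();
        }
        finally
        {
            _gate.Release(); // always free the slot, even if synthesis throws
        }
    }

    public void Dispose() => _gate.Dispose();
}
```

A call site would then look like `await queue.RunAsync(() => runner.SynthesizeAsync(text, voiceId))`, with one shared queue per runner instance.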

The runner discovers the ONNX input names (input_ids, style, speed) dynamically, using a rank-based pattern rather than fixed strings, so a re-export with capitalisation drift doesn't silently mis-fire. See the source for the full discovery logic.

`EnglishG2P`

Static class. Pure-C# best-effort English grapheme-to-phoneme converter for Kokoro. Marked [UNVERIFIED] in source — not a faithful reproduction of misaki or espeak-ng.

Source: Assets/Sauti/Runtime/Scripts/Tts/EnglishG2P.cs:37

Member	Signature	Description
`GraphemesToPhonemes`	`static string[] GraphemesToPhonemes(string englishText)`	Convert to an array of ARPABet-flavoured phoneme tokens (CMU-style).
`GraphemesToPhonemeString`	`static string GraphemesToPhonemeString(string englishText)`	Convenience: returns a single IPA string ready for the Kokoro tokeniser. Each ARPABet token is converted to its primary IPA equivalent.
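
The ARPABet-to-IPA step described for `GraphemesToPhonemeString` amounts to a table lookup. The sketch below shows the idea with a small subset of standard ARPABet/IPA pairs. The table in `EnglishG2P.cs` may use different symbols (Kokoro's vocabulary is not reproduced here), and `ArpabetToIpaSketch` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical lookup-table converter; not the EnglishG2P implementation.
public static class ArpabetToIpaSketch
{
    // Partial table of standard ARPABet -> IPA pairs (assumption: the real table differs).
    private static readonly Dictionary<string, string> Ipa = new Dictionary<string, string>
    {
        { "AA", "ɑ" }, { "AE", "æ" }, { "AH", "ʌ" }, { "EH", "ɛ" }, { "IY", "i" },
        { "OW", "oʊ" }, { "UW", "u" }, { "ER", "ɝ" }, { "HH", "h" }, { "K", "k" },
        { "L", "l" }, { "T", "t" }, { "D", "d" }, { "W", "w" },
    };

    public static string ToIpa(IEnumerable<string> arpabetTokens)
    {
        var ipa = new StringBuilder();
        foreach (string token in arpabetTokens)
        {
            string bare = token.TrimEnd('0', '1', '2'); // drop CMU stress digits
            if (Ipa.TryGetValue(bare, out string symbol)) ipa.Append(symbol);
            // Tokens missing from the table are skipped in this sketch.
        }
        return ipa.ToString();
    }
}
```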

Quality

Out-of-distribution words will sound robotic or wrong. The fallback ships a ~120-word common dictionary plus a per-letter spell-out for unknowns. For production-quality input, phonemise externally (misaki / espeak-ng) and call SynthesizeFromPhonemesAsync directly.

`Sauti.Editor.Rag` namespace

All members below live under the Sauti.Editor assembly and are not available to runtime code. They are the offline build pipeline that produces knowledge.db.

`KnowledgeChunk`

Plain data record. One source-derived chunk ready for embedding.

Source: Assets/Sauti/Editor/KnowledgeBaseChunker.cs:19

Field	Type	Description
`DocId`	`string`	Filename-stem-derived id, lowercased + sanitised.
`Title`	`string`	First non-blank line of the source body, with leading `#` chars stripped.
`Text`	`string`	The chunk body itself.
`SourceRelativePath`	`string`	Path relative to the `knowledge-base/` root.
`ChunkIndexWithinDoc`	`int`	0-indexed position of this chunk within its source document.

`KnowledgeBaseChunker`

Static class. Walks knowledge-base/, opens each .md / .txt body, splits into ~750-char chunks at paragraph boundaries. Pure C# — no Unity dependency.

Source: Assets/Sauti/Editor/KnowledgeBaseChunker.cs:28

Member	Signature	Description
`TargetChunkChars`	`const int = 750`	Target chunk length in characters. ~200 English tokens at ~3.7 chars/token.
`MaxChunkChars`	`const int = 1500`	Upper bound on chunk length. The only exception is a single sentence that is itself longer than this, which is kept whole.
`EnumerateSourceFiles`	`static IReadOnlyList<string> EnumerateSourceFiles(string rootDir)`	Recursively list every `.md`/`.txt` file under `rootDir`, excluding `README.md` (case-sensitive). Stable lexical order.
`ChunkBody`	`static IReadOnlyList<string> ChunkBody(string body)`	Split into paragraph-boundary chunks. Never splits mid-paragraph unless a single paragraph exceeds `MaxChunkChars`.
`ExtractTitle`	`static string ExtractTitle(string body, string fallback)`	First non-blank line, stripped of leading `#`. Returns `fallback` if no usable line found.
`DeriveDocId`	`static string DeriveDocId(string filePath)`	Lowercase filename stem; non-`[a-z0-9_-]` chars collapse to `-`.
`ChunkFile`	`static IReadOnlyList<KnowledgeChunk> ChunkFile(string filePath, string rootDir)`	Read a single file and emit its chunks. High-level orchestration entry point.
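
The two core helpers can be approximated as below. This is a sketch under stated assumptions, not the chunker's source: `ChunkerSketch` is a hypothetical name, paragraphs are taken to be separated by blank lines, each run of disallowed characters in a doc id becomes a single `-`, and the `MaxChunkChars` handling for oversized paragraphs is left out.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical approximation of DeriveDocId and ChunkBody.
public static class ChunkerSketch
{
    public const int TargetChunkChars = 750;

    // Lowercase filename stem; runs of characters outside [a-z0-9_-] become one '-'.
    public static string DeriveDocId(string filePath)
    {
        string stem = Path.GetFileNameWithoutExtension(filePath).ToLowerInvariant();
        return Regex.Replace(stem, "[^a-z0-9_-]+", "-");
    }

    // Greedy packing: keep adding whole paragraphs until the next one
    // would push the chunk past the target length.
    public static List<string> ChunkBody(string body, int targetChars = TargetChunkChars)
    {
        var chunks = new List<string>();
        var current = new StringBuilder();
        string[] paragraphs = Regex.Split(body.Replace("\r\n", "\n"), "\n\\s*\n");

        foreach (string raw in paragraphs)
        {
            string paragraph = raw.Trim();
            if (paragraph.Length == 0) continue;

            // +2 accounts for the blank-line separator re-inserted below.
            if (current.Length > 0 && current.Length + 2 + paragraph.Length > targetChars)
            {
                chunks.Add(current.ToString());
                current.Clear();
            }
            if (current.Length > 0) current.Append("\n\n");
            current.Append(paragraph);
        }

        if (current.Length > 0) chunks.Add(current.ToString());
        return chunks;
    }
}
```

Packing whole paragraphs keeps each chunk readable on its own, at the cost of chunk lengths that vary around the target.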

`IRagEmbedder`

Interface. Encoder for both knowledge-base chunks (offline build) and runtime queries. Both sides must use the same encoder: vectors from different encoders are not comparable, so mixing them breaks semantic similarity.

Source: Assets/Sauti/Editor/IRagEmbedder.cs:13

Member	Signature	Description
`Dimensions`	`int { get; }`	Output dimensionality. `all-MiniLM-L6-v2` = 384.
`EmbedAsync`	`Task<float[]> EmbedAsync(string text)`	Encode a single string.
`EmbedBatchAsync`	`Task<float[][]> EmbedBatchAsync(string[] texts)`	Encode an array of strings.

See Extending Sauti — IRagEmbedder for a custom-implementation walkthrough.

`MiniLmRagEmbedder`

Sealed class, IDisposable. Default IRagEmbedder. Hand-authored against raw Microsoft.ML.OnnxRuntime.InferenceSession.

Source: Assets/Sauti/Editor/MiniLmRagEmbedder.cs:44

Member	Signature	Description
`OutputDimensions`	`const int = 384`	MiniLM hidden width.
`DefaultMaxSequenceLength`	`const int = 128`	Token cap per encode call.
ctor (3-arg)	`MiniLmRagEmbedder(string modelPath, string vocabPath, int maxSequenceLength = DefaultMaxSequenceLength)`	Explicit model + vocab paths. Throws if either file is missing.
ctor (1-arg)	`MiniLmRagEmbedder(string modelPath)`	Derives `vocab.txt` path by sibling-folder lookup against `modelPath`.
`Dimensions`	`int { get; }`	Returns `OutputDimensions`.
`EmbedAsync`	`Task<float[]> EmbedAsync(string text)`	Tokenise -> ONNX run -> attention-mask-weighted mean-pool -> L2 normalise. Returns a unit-length 384-dim vector.
`EmbedBatchAsync`	`Task<float[][]> EmbedBatchAsync(string[] texts)`	Per-text loop over `EmbedAsync`.
`Dispose`	`void Dispose()`	Disposes the underlying `InferenceSession` and `SessionOptions`.

Pipeline:

1. WordPieceTokenizer.Tokenize(text) -> (input_ids[seq], attention_mask[seq]).
2. Build DenseTensor<long> for input_ids, attention_mask, token_type_ids (all zeros for single-sentence encoding).
3. session.Run(...); the rank-3 output is located by its metadata, not by name.
4. Attention-mask-weighted mean-pool across the seq dim -> float[384].
5. L2-normalise -> unit-length sentence vector.

Reference: Reimers & Gurevych 2019, "Sentence-BERT". Matches HuggingFace's canonical sentence-transformer post-process (mean_pooling + F.normalize).
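
The last two pipeline steps (masked mean-pool, then L2 normalisation) can be sketched without ONNX Runtime. The method below takes the per-token hidden states as a plain jagged array; `PoolingSketch` and its parameter layout are assumptions for illustration, not the embedder's internal API.

```csharp
using System;

// Hypothetical stand-alone version of the pooling post-process.
public static class PoolingSketch
{
    // tokenEmbeddings: [seq][hidden] rows from the model output.
    // attentionMask:   1 = real token, 0 = padding (excluded from the mean).
    public static float[] MeanPoolAndNormalize(float[][] tokenEmbeddings, int[] attentionMask)
    {
        int hidden = tokenEmbeddings[0].Length;
        var pooled = new float[hidden];
        int kept = 0;

        // Sum only the rows whose mask is set.
        for (int t = 0; t < tokenEmbeddings.Length; t++)
        {
            if (attentionMask[t] == 0) continue;
            kept++;
            for (int h = 0; h < hidden; h++) pooled[h] += tokenEmbeddings[t][h];
        }

        // Divide by the number of real tokens and accumulate the squared norm.
        double sumSquares = 0;
        for (int h = 0; h < hidden; h++)
        {
            pooled[h] /= Math.Max(kept, 1);
            sumSquares += pooled[h] * pooled[h];
        }

        // L2-normalise; the floor avoids dividing by zero for an all-zero vector.
        float norm = (float)Math.Max(Math.Sqrt(sumSquares), 1e-12);
        for (int h = 0; h < hidden; h++) pooled[h] /= norm;
        return pooled;
    }
}
```

With unit-length outputs, the cosine similarity between two embeddings reduces to their dot product.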

`WordPieceTokenizer`

Sealed class. Standard BERT WordPiece tokeniser for all-MiniLM-L6-v2 (a bert-base-uncased-style sentence-transformer). Pure C#.

Source: Assets/Sauti/Editor/WordPieceTokenizer.cs:38

Member	Signature	Description
`DefaultMaxLength`	`const int = 128`	Default `maxLength` argument to `Tokenize`.
`PadToken` / `UnkToken` / `ClsToken` / `SepToken`	`const string`	Standard `[PAD]` / `[UNK]` / `[CLS]` / `[SEP]` strings.
ctor	`WordPieceTokenizer(string vocabPath)`	Loads `vocab.txt` (one token per line; line N -> id N).
`Vocab`	`IReadOnlyDictionary<string, int> { get; }`	The loaded vocabulary.
`VocabSize`	`int { get; }`	`Vocab.Count`.
`Tokenize`	`(int[] inputIds, int[] attentionMask) Tokenize(string text, int maxLength = DefaultMaxLength)`	BasicTokeniser (lowercase, whitespace + punctuation split) -> WordPiece (greedy longest-match-first) -> `[CLS] ... [SEP]` wrap -> pad to `maxLength`.
`FindSpecialTokenId`	`static int FindSpecialTokenId(IReadOnlyDictionary<string,int> vocab, string token)`	Lookup helper; throws `InvalidDataException` on missing special token.

Algorithm reference: Wu et al. 2016, "Google's Neural Machine Translation System". Matches HuggingFace's BertTokenizer behaviour with do_lower_case=True, tokenize_chinese_chars=False, no accent stripping.
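
The greedy longest-match-first step can be sketched for a single word as follows. It assumes the word has already been lowercased and split from punctuation, and that continuation pieces use the standard BERT `##` prefix. `WordPieceSketch` is a hypothetical name and the real tokeniser may add limits (such as a maximum word length) that are omitted here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical single-word WordPiece splitter.
public static class WordPieceSketch
{
    public static List<string> SplitWord(string word, ISet<string> vocab, string unkToken = "[UNK]")
    {
        var pieces = new List<string>();
        int start = 0;

        while (start < word.Length)
        {
            int end = word.Length;
            string match = null;

            // Try the longest remaining substring first, shrinking from the right.
            while (end > start)
            {
                string candidate = word.Substring(start, end - start);
                if (start > 0) candidate = "##" + candidate; // continuation piece
                if (vocab.Contains(candidate)) { match = candidate; break; }
                end--;
            }

            // No piece fits at this position: the whole word maps to [UNK].
            if (match == null) return new List<string> { unkToken };

            pieces.Add(match);
            start = end;
        }
        return pieces;
    }
}
```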

`RagDatabaseBuilder`

Static class. The Editor MenuItem entry point. Walks knowledge-base/, chunks each file, embeds via IRagEmbedder, writes the resulting database to both ai-models/rag/knowledge.db and Assets/StreamingAssets/VoiceAI/rag/knowledge.db.

Source: Assets/Sauti/Editor/RagDatabaseBuilder.cs:28

Member	Signature	Description
`FileMagic`	`const uint = 0x01474152u`	Binary file magic: `"RAG\x01"` little-endian. The value changes whenever the file format changes.
`OutputFileName`	`const string = "knowledge.db"`	The standard output filename.
`BuildFromMenu`	`[MenuItem("Sauti/Build Knowledge Base")] static void BuildFromMenu()`	Editor menu entry. Locates `ai-models/embeddings/model_int8.onnx`, runs the build, shows a confirmation dialog.
`BuildAsync`	`static Task BuildAsync(string knowledgeBaseDir, string[] outputPaths, IRagEmbedder embedder)`	Test-friendly async entry. Walks the directory, chunks, embeds, writes to every path in `outputPaths`.
`WriteDatabase`	`static void WriteDatabase(string outputPath, IReadOnlyList<KnowledgeChunk> chunks, float[][] embeddings, int dimensions)`	Pure-C# writer for the binary `knowledge.db` format.

Binary format (little-endian):

[uint32 magic = 0x01474152]
[uint32 dimensions]
[uint32 chunkCount]
for each chunk:
  [uint16 docIdLen] [bytes docId]
  [uint16 titleLen] [bytes title]
  [uint32 textLen]  [bytes text]
  [float32 x dimensions  embedding]
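
A reader for this layout might look like the sketch below. Two details are assumptions, since the layout does not state them: strings are decoded as UTF-8, and the length prefixes count bytes rather than characters. `KnowledgeDbReaderSketch` and `Entry` are hypothetical names.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical reader for the documented knowledge.db layout.
public static class KnowledgeDbReaderSketch
{
    public const uint FileMagic = 0x01474152u; // bytes 'R','A','G',0x01 on disk

    public sealed class Entry
    {
        public string DocId;
        public string Title;
        public string Text;
        public float[] Embedding;
    }

    public static Entry[] Read(Stream stream)
    {
        // BinaryReader reads little-endian values, matching the documented format.
        using (var reader = new BinaryReader(stream, Encoding.UTF8, leaveOpen: true))
        {
            if (reader.ReadUInt32() != FileMagic)
                throw new InvalidDataException("Not a knowledge.db file (bad magic).");

            int dimensions = checked((int)reader.ReadUInt32());
            int chunkCount = checked((int)reader.ReadUInt32());
            var entries = new Entry[chunkCount];

            for (int i = 0; i < chunkCount; i++)
            {
                var entry = new Entry();
                entry.DocId = ReadString(reader, reader.ReadUInt16());
                entry.Title = ReadString(reader, reader.ReadUInt16());
                entry.Text = ReadString(reader, checked((int)reader.ReadUInt32()));

                entry.Embedding = new float[dimensions];
                for (int d = 0; d < dimensions; d++) entry.Embedding[d] = reader.ReadSingle();
                entries[i] = entry;
            }
            return entries;
        }
    }

    // Assumption: length prefixes count UTF-8 bytes, not characters.
    private static string ReadString(BinaryReader reader, int byteCount) =>
        Encoding.UTF8.GetString(reader.ReadBytes(byteCount));
}
```

Checking the magic first lets a loader reject a stale or foreign file before allocating anything based on its header fields.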

Reference experiments

The MonoBehaviours under experiments/*/ are not part of the runtime API surface, but they are the canonical reference implementations of the patterns documented above. When in doubt about how to wire something, read these:

Experiment	Public method to read	Pattern
01 — TTS Hello	`KokoroHello.SpeakAsync`	Direct `KokoroTtsRunner` instantiation + `AudioSource` playback.
02 — STT Loopback	`WhisperLoopback.StartListening`	`WhisperManager.GetTextAsync` over a mic-captured `AudioClip`.
03 — LLM Chat	`LlmChat.Ask`, `LlmChat.AssembleSystemPrompt`	LLMUnity streaming chat + sentence-boundary buffer + `/no_think`.
04 — RAG Grounding	`RagGroundedAsk.Ask`	`SautiRag.SearchAsync` + the § 4.5 prompt assembly + A/B toggle.
05 — Full Voice Loop	`FullVoiceLoop.BuildPrompt`, `FullVoiceLoop.RunOneTurn`	All four stages composed. The reference orchestrator.
06 — VR Quest NPC	`QuestVrCompanion.StartListening`	Quest controller trigger driving the same orchestration as EXP-005, with Kokoro on a spatial `AudioSource`.

Source paths: experiments/01-tts-hello/KokoroHello.cs through experiments/06-vr-quest-npc/QuestVrCompanion.cs.

Upstream APIs Sauti relies on

These are not Sauti types but Sauti calls into them. Listed here so the catalogue is self-contained.

`LLMUnity.LLM` (MonoBehaviour)

void SetModel(string path)
Task WaitUntilReady()
void SetReasoning(bool reasoning)
bool reasoning field — toggles chain-of-thought handling. /no_think is a Qwen3 prompt-level convention, not this flag.

`LLMUnity.LLMAgent` (MonoBehaviour, extends `LLMClient`)

string systemPrompt field
List<ChatMessage> chat field
ContextOverflowStrategy overflowStrategy field
float overflowTargetRatio field
Task<string> Chat(string query, Action<string> callback = null, Action completionCallback = null, bool addToHistory = true) — first callback receives the cumulative response text, not per-token deltas.
Task ClearHistory()
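
Because the `Chat` callback receives cumulative text, code that wants only the newly generated portion (for example a sentence-boundary buffer feeding TTS) has to track how much it has already seen. The adapter below is a sketch of that bookkeeping. `CumulativeToDelta` is a hypothetical helper, and it assumes each update only extends the previous text.

```csharp
using System;

// Hypothetical adapter: converts cumulative-text callbacks into deltas.
public sealed class CumulativeToDelta
{
    private readonly Action<string> _onDelta;
    private int _seen; // number of characters already forwarded

    public CumulativeToDelta(Action<string> onDelta) { _onDelta = onDelta; }

    // Pass this method as the streaming callback.
    public void OnCumulative(string cumulative)
    {
        if (cumulative.Length <= _seen) return; // nothing new since last update
        _onDelta(cumulative.Substring(_seen));
        _seen = cumulative.Length;
    }

    // Call before starting a new response.
    public void Reset() => _seen = 0;
}
```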

`LLMUnity.RAG` (MonoBehaviour)

void Init(SearchMethods searchMethod, ChunkingMethods chunkingMethod, LLM llm)
Task<int> Add(string inputString, string group = "")
Task<(string[], float[])> Search(string queryString, int k, string group = "")
Task<bool> Load(string filePath)
void Save(string filePath)

`Whisper.WhisperManager` (MonoBehaviour)

string ModelPath { get; set; }
bool IsModelPathInStreamingAssets { get; set; }
string language field ("en", "ja", "auto")
Task InitModel()
Task<WhisperResult> GetTextAsync(AudioClip clip)
Task<WhisperResult> GetTextAsync(float[] samples, int frequency, int channels)

`Microsoft.ML.OnnxRuntime.InferenceSession`

InferenceSession(string modelPath, SessionOptions opts)
InputMetadata / OutputMetadata — used by Sauti for dynamic input/output discovery.
IDisposableReadOnlyCollection<DisposableNamedOnnxValue> Run(IEnumerable<NamedOnnxValue> inputs) — the canonical raw-ORT pattern.

Verified upstream-API details: memory/api_surfaces.md.
