Vaults & knowledge sources
A vault is a content source the runtime can read and search. Hive distinguishes two flavors:
- Local vaults — a folder on this machine, possibly an Obsidian library. Each peer configures their own; not shared automatically.
- Workspace-shared vaults — a pointer to an upstream source (GitHub today; more later). Every peer in the workspace sees the same source list and fetches independently with their own credentials.
Local sources
[[vaults]]
id = "alice-notes"
name = "My research notes"
kind = "folder" # or "obsidian"
location = "/Users/alice/Notes"
index = true
Or via Settings → Vaults → Add → "Folder (local)" / "Obsidian (local)". The path is local to this peer; Bob doesn't see Alice's notes by joining the workspace.
Workspace-shared sources
Configure once; every peer in the workspace gets the source automatically via the event log.
In Settings → Vaults → Add → "GitHub (shared)":
- Repo: owner/name slug.
- Ref: branch, tag, or commit SHA. Defaults to main.
- Paths: comma-separated globs to narrow the fetch (empty = whole repo).
- PAT env var: optional. Without it, public repos work; private repos require it.
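To make the Paths field concrete, here is a minimal Python sketch of how a comma-separated glob list might narrow a repo's file list, with an empty field meaning the whole repo. The function names are hypothetical, and the matching uses Python's fnmatch, where * also matches across "/" separators; Hive's actual glob dialect is not specified here and may differ.

```python
from fnmatch import fnmatchcase


def parse_globs(paths_field: str) -> list[str]:
    """Split the comma-separated Paths field; an empty field yields no globs."""
    return [g.strip() for g in paths_field.split(",") if g.strip()]


def select_paths(repo_paths: list[str], paths_field: str) -> list[str]:
    """Return the repo paths to fetch; no globs means the whole repo."""
    globs = parse_globs(paths_field)
    if not globs:
        return list(repo_paths)
    # Keep a path if it matches at least one glob (fnmatch semantics, an assumption).
    return [p for p in repo_paths if any(fnmatchcase(p, g) for g in globs)]


tree = ["README.md", "docs/intro.md", "docs/api/auth.md", "src/app.txt"]
docs_only = select_paths(tree, "docs/*, README.md")
everything = select_paths(tree, "")
```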
Each peer's Hive fetches independently using its own env-var
PAT. Alice sets HIVE_GH_TOKEN to her PAT scoped to private
repos; Bob has his own PAT scoped to whatever he can access.
Neither token leaves the peer that owns it.
The fetched content lands at
~/Library/Caches/Hive/vaults/<vaultID>/ on each peer.
- Folder (local): local-only. Each peer configures their own path; not shared.
- Obsidian (local): same as Folder, scoped to an Obsidian library directory.
How shared sources sync
sequenceDiagram
autonumber
participant A as Alice (owner)
participant LogA as Workspace event log
participant B as Bob
participant GH as GitHub
A->>LogA: vaultSourceAdded { id, source: github(...) }
LogA->>B: envelope arrives via P2P
Note over B: Bob's session.vaults now contains the new entry
B->>GH: fetch with Bob's PAT
GH-->>B: tree + files
Note over B: cache at ~/Library/Caches/Hive/vaults/<id>/
Only owners and admins emit vaultSourceAdded /
vaultSourceRemoved events — same authz threshold as MCP server
catalog changes. Contributors and viewers see the resulting source
list but can't mutate it.
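The receiving side of this flow can be sketched in a few lines of Python. The event names come from the diagram above; the event dictionary layout, the Session class, and the role strings are assumptions for illustration, not Hive's real types.

```python
from dataclasses import dataclass, field

MUTATING_ROLES = {"owner", "admin"}  # roles allowed to emit source events


@dataclass
class Session:
    # Maps vault id to its source description (assumed shape).
    vaults: dict[str, dict] = field(default_factory=dict)


def apply_event(session: Session, event: dict, sender_role: str) -> bool:
    """Apply a vault source event; return False if it is rejected or unrelated."""
    if sender_role not in MUTATING_ROLES:
        return False  # contributors and viewers cannot mutate the list
    if event["type"] == "vaultSourceAdded":
        session.vaults[event["id"]] = event["source"]
    elif event["type"] == "vaultSourceRemoved":
        session.vaults.pop(event["id"], None)
    else:
        return False  # not a vault source event
    return True


bob = Session()
added = {"type": "vaultSourceAdded", "id": "team-docs",
         "source": {"tag": "github", "repo": "owner/name", "ref": "main"}}
accepted = apply_event(bob, added, sender_role="owner")
rejected = apply_event(bob, {"type": "vaultSourceRemoved", "id": "team-docs"},
                       sender_role="viewer")
```

Checking the sender's role on the receiving peer, rather than trusting the sender, is one plausible way to enforce the threshold in a P2P setting; where Hive performs this check is not stated on this page.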
Authentication
Auth is per-peer. Hive never transmits tokens between peers. The workspace event records only the env-var name (not the value); each peer's runtime reads its own env at fetch time.
If a peer's env doesn't have the named token set:
- Public content fetches succeed anonymously.
- Private content fetches return 403; the file is missing from that peer's cache.
- The Files pane surfaces "auth needed for these N files" rather than showing a silently empty list.
Caching & refresh
- Cache location: ~/Library/Caches/Hive/vaults/<vaultID>/.
- Refresh: on-demand. The cache warms automatically the first time a workspace is opened in a session, and you can re-fetch any vault from its card in Settings → Vaults. Periodic polling and webhook-driven refresh are on the roadmap.
- Eviction: never automatic. To force a re-fetch, clear the cache manually by deleting the vault's directory under the cache location.
Indexing
Each peer indexes their own cached content for retrieval (semantic
search, keyword match). The index lives at
~/Library/Caches/Hive/vault-indices/<vaultID>/ and is rebuilt
when the cache changes.
Why per-peer indexing instead of shared:
- Embeddings are tied to a specific model; if Alice uses text-embedding-3-large and Bob uses bge-large, their indices aren't interoperable.
- The index is typically larger than the content it covers, so syncing it would move more bytes than it saves.
- Computing the index uses your own runtime quota.
The trade-off: indexing duplicates compute. We think that's the right call given the heterogeneity of LLM stacks.
Future source kinds
The VaultSource enum is designed to grow. Likely additions:
- GitLab — same shape as GitHub, different API.
- Notion — pages + databases.
- Google Drive — folders with OAuth.
- HTTPS — a single URL with optional bearer token.
- S3 / R2 — bucket + prefix.
If you need a source that isn't supported yet:
- Write a small fetcher conforming to VaultFetcher in hive-runtime.
- Add the case to VaultSource with its own associated config struct.
- The Settings UI gains a new source-tag option automatically when you extend VaultSourceTag.
Or: route the source through an MCP server that exposes vault-style
tools (read_vault_file, search_vault). MCP works today without
adding a new source kind.
On-disk format
The persisted vault entry uses a discriminated source field:
{
"id": "alice-notes",
"name": "My research notes",
"source": { "tag": "folder", "path": "/Users/alice/Notes" },
"indexed": true
}
A hand-written kind + path shape from an older config also
decodes transparently — useful when migrating an existing
hive.config.toml by hand. The encoder always writes the
discriminated form, so any save normalizes the entry.
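The decode-both, encode-one behavior can be sketched in Python as follows. The field names follow the JSON example and the "kind + path" description above; the function names and the default for a missing indexed flag are assumptions, not Hive's actual decoder.

```python
import json


def decode_vault(entry: dict) -> dict:
    """Accept either the discriminated form or the legacy kind + path shape."""
    if "source" in entry:
        source = dict(entry["source"])
    else:
        # Legacy shape: lift kind and path into the discriminated source field.
        source = {"tag": entry["kind"], "path": entry["path"]}
    return {
        "id": entry["id"],
        "name": entry["name"],
        "source": source,
        "indexed": bool(entry.get("indexed", False)),  # default is an assumption
    }


def encode_vault(vault: dict) -> str:
    """Always write the discriminated form."""
    return json.dumps(vault, indent=2)


legacy = {"id": "alice-notes", "name": "My research notes",
          "kind": "folder", "path": "/Users/alice/Notes", "indexed": True}
normalized = json.loads(encode_vault(decode_vault(legacy)))
```

Round-tripping a legacy entry through decode and encode yields the discriminated form, which is the normalization-on-save behavior described above.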