Design notes¶
The long-form companion to the README, covering the algebraic structure of the library, the functional-programming commitments, and the JAX idioms it relies on. Read this if you want to extend the library or understand the trade-offs behind the API.
1. The algebra¶
Why an algebra at all? Fodor & Pylyshyn (1988) argued that any architecture aspiring to model cognition must support combinatorial syntax-and-semantics and structure-sensitive processes — the two commitments that distinguish “Classical” symbol systems from spreading-activation networks. VSAs answer that critique with a connectionist substrate that is nonetheless a closed algebra over a fixed-dimensional vector space.
A Vector Symbolic Architecture (Gayler 2003) is a compact algebraic object on \(\mathbb{R}^d\): a commutative binding \(\star\), an associative bundling \(\oplus\), a cyclic group action \(T_k\), and a similarity measure (cosine). The choice of binding selects the VSA family — MAP uses element-wise product, HRR uses circular convolution, BSC uses XOR — but the interface is uniform. All three are fixed-dimensional compressions of Smolensky’s (1990) tensor-product binding, which produces a \(d^2\)-dimensional output and which the modern VSA family deliberately avoids.
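The uniform interface is easiest to see in code. A minimal sketch of the three bindings plus bundling and cosine similarity, assuming plain jnp arrays; the function names here are illustrative, not the library's API:

```python
# Illustrative only: minimal versions of the three bindings, bundling, and
# cosine similarity; these names are not the library's API.
import jax.numpy as jnp

def bind_map(x, y):
    # MAP: element-wise (Hadamard) product on real hypervectors.
    return x * y

def bind_hrr(x, y):
    # HRR: circular convolution, computed via the FFT.
    return jnp.real(jnp.fft.ifft(jnp.fft.fft(x) * jnp.fft.fft(y)))

def bind_bsc(x, y):
    # BSC: XOR on binary (0/1 integer) hypervectors.
    return jnp.bitwise_xor(x, y)

def bundle(x, y):
    # Bundling: element-wise sum; normalise afterwards if staying on the sphere.
    return x + y

def cosine(x, y):
    return jnp.dot(x, y) / (jnp.linalg.norm(x) * jnp.linalg.norm(y))
```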
What the laws say¶
For any hypervectors \(x, y, z \in \mathbb{R}^d\):
Commutative bind: \(x \star y = y \star x\).
Associative bundle: \((x \oplus y) \oplus z = x \oplus (y \oplus z)\).
Distributivity: \(x \star (y \oplus z) \approx (x \star y) \oplus (x \star z)\) (exact in HRR, approximate after normalisation in MAP).
Self-inverse bind (MAP/BSC): \(x \star x \approx \mathbf{1}\) up to the codebook.
Quasi-orthogonality (Kanerva 2009): for random \(x, y\), \(\mathbb{E}[\cos(x, y)] \approx 0\) with variance \(1/d\).
tests/test_functional.py checks these at realistic dimensions. The ones
that hold exactly are checked with jnp.allclose; the ones that hold up
to dimension-dependent concentration are checked with tolerances chosen to
flag violations without being flaky.
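As an illustration of the split between exact laws and concentration laws, a hedged sketch of what such a check can look like (not the contents of tests/test_functional.py; the dimension and tolerance are assumptions):

```python
# Hedged sketch of a property check, not the actual test file; the dimension
# and tolerance below are assumptions.
import jax
import jax.numpy as jnp

d = 10_000
kx, ky, kz = jax.random.split(jax.random.PRNGKey(0), 3)
x = jax.random.normal(kx, (d,))
y = jax.random.normal(ky, (d,))
z = jax.random.normal(kz, (d,))

# Exact law: bundling (element-wise sum) is associative up to float rounding.
assert jnp.allclose((x + y) + z, x + (y + z), atol=1e-5)

# Concentration law: random hypervectors are quasi-orthogonal; |cos| has
# standard deviation about 1/sqrt(d), so a few multiples of that is a safe bound.
cos = jnp.dot(x, y) / (jnp.linalg.norm(x) * jnp.linalg.norm(y))
assert jnp.abs(cos) < 5.0 / jnp.sqrt(d)
```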
The group action¶
For fixed \(d\), cyclic shift by \(k\) defines an action of \(\mathbb{Z}/d\) on \(\mathbb{R}^d\):
\[
(T_k x)_i = x_{(i - k) \bmod d}.
\]
The action is faithful, additive, and isometric:
\(T_k(x) = x \ \forall x \iff k \equiv 0 \pmod{d}\) (faithful).
\(T_j \circ T_k = T_{j+k}\) (additive).
\(\|T_k(x)\| = \|x\|\) and \(\langle T_k(x), T_k(y)\rangle = \langle x, y\rangle\) (isometric).
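A minimal sketch of the action and two of its properties, assuming the shift is implemented as jnp.roll (an assumption about the implementation, not a quotation of it):

```python
# Sketch of the cyclic-shift action, assuming it is implemented with jnp.roll.
import jax
import jax.numpy as jnp

def T(k, x):
    # Shift coordinates cyclically by k positions.
    return jnp.roll(x, k)

d = 512
x = jax.random.normal(jax.random.PRNGKey(1), (d,))
y = jax.random.normal(jax.random.PRNGKey(2), (d,))

# Additive: T_j then T_k equals T_{j+k}.
assert jnp.allclose(T(3, T(4, x)), T(7, x))

# Isometric: norms and inner products are preserved.
assert jnp.allclose(jnp.linalg.norm(T(5, x)), jnp.linalg.norm(x))
assert jnp.allclose(jnp.dot(T(5, x), T(5, y)), jnp.dot(x, y), atol=1e-3)
```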
The HDC primitives have two flavours of equivariance with respect to this action, and conflating them is a common mistake:
Diagonal equivariance — shifting every argument shifts the output. Element-wise binding and bundling satisfy this.
Single-argument equivariance — shifting one argument shifts the output; under the diagonal action the output picks up a double shift. Circular-convolution binding satisfies this.
Both are correct, and both are useful. The module bayes_hdc.equivariance
documents both, names them separately, and provides verifiers for each.
Test file: tests/test_equivariance.py.
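A sketch that makes the distinction concrete, using roll-based shifts and the same illustrative bindings as the earlier sketch (names are hypothetical, not bayes_hdc.equivariance's API):

```python
# Sketch of the two equivariance flavours, with roll-based shifts and
# illustrative bindings (not the library's verifiers).
import jax
import jax.numpy as jnp

def T(k, x):
    return jnp.roll(x, k)

def bind_map(x, y):
    return x * y

def bind_hrr(x, y):
    return jnp.real(jnp.fft.ifft(jnp.fft.fft(x) * jnp.fft.fft(y)))

d, k = 256, 7
kx, ky = jax.random.split(jax.random.PRNGKey(3))
x = jax.random.normal(kx, (d,))
y = jax.random.normal(ky, (d,))

# Diagonal equivariance: shift both inputs, element-wise binding shifts once.
assert jnp.allclose(bind_map(T(k, x), T(k, y)), T(k, bind_map(x, y)), atol=1e-5)

# Single-argument equivariance: shift one input, circular convolution shifts once...
assert jnp.allclose(bind_hrr(T(k, x), y), T(k, bind_hrr(x, y)), atol=1e-3)

# ...but shifting both inputs (the diagonal action) produces a double shift.
assert jnp.allclose(bind_hrr(T(k, x), T(k, y)), T(2 * k, bind_hrr(x, y)), atol=1e-3)
```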
2. PVSA: lifting to measures¶
The probabilistic layer replaces each hypervector \(x \in \mathbb{R}^d\) with a posterior distribution \(X\). A GaussianHV carries a mean \(\mu \in \mathbb{R}^d\) and a per-dimension variance \(\sigma^2 \in \mathbb{R}_{\ge 0}^d\).
Closed-form moments¶
For independent \(X \sim \mathcal{N}(\mu_x, \mathrm{diag}(\sigma_x^2))\) and \(Y \sim \mathcal{N}(\mu_y, \mathrm{diag}(\sigma_y^2))\), the first and second moments of the element-wise product \(Z = X \cdot Y\) are exact:
\[
\mathbb{E}[Z] = \mu_x \odot \mu_y,
\qquad
\mathrm{Var}[Z] = \sigma_x^2 \odot \sigma_y^2 + \sigma_x^2 \odot \mu_y^2 + \sigma_y^2 \odot \mu_x^2,
\]
where \(\odot\) and the squares are element-wise.
bind_gaussian returns a GaussianHV with exactly these moments. It is not a Monte Carlo estimate. It is not a delta-method approximation. It is the analytic answer.
The sum (bundle) is trivial under independence:
\[
\mathbb{E}[X \oplus Y] = \mu_x + \mu_y,
\qquad
\mathrm{Var}[X \oplus Y] = \sigma_x^2 + \sigma_y^2.
\]
Normalisation onto the unit sphere uses the delta method for the variance term, which is the dominant source of approximation in bundle_gaussian — the cost of insisting the library always returns objects on the manifold classical HDC uses.
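A minimal sketch of both moment propagations under the independence assumption; Gaussian here is a stand-in NamedTuple, not the library's GaussianHV, and the sphere projection where the delta method enters is omitted:

```python
# Stand-in sketch: exact product moments and exact sum moments for independent
# diagonal Gaussians. Gaussian is a plain NamedTuple, not GaussianHV, and the
# projection onto the unit sphere (where the delta method enters) is omitted.
from typing import NamedTuple
import jax.numpy as jnp

class Gaussian(NamedTuple):
    mean: jnp.ndarray
    var: jnp.ndarray

def bind_moments(x: Gaussian, y: Gaussian) -> Gaussian:
    # E[XY] and Var[XY] for independent X, Y (exact, element-wise).
    mean = x.mean * y.mean
    var = x.var * y.var + x.var * y.mean**2 + y.var * x.mean**2
    return Gaussian(mean, var)

def bundle_moments(x: Gaussian, y: Gaussian) -> Gaussian:
    # Means and variances add exactly for the unnormalised sum.
    return Gaussian(x.mean + y.mean, x.var + y.var)
```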
KL divergences¶
Gaussian–Gaussian and Dirichlet–Dirichlet KL divergences have closed forms; for diagonal Gaussians,
\[
\mathrm{KL}\!\left(\mathcal{N}(\mu_1, \mathrm{diag}(\sigma_1^2)) \,\middle\|\, \mathcal{N}(\mu_2, \mathrm{diag}(\sigma_2^2))\right)
= \frac{1}{2} \sum_{i=1}^{d} \left[ \frac{\sigma_{1,i}^2 + (\mu_{1,i} - \mu_{2,i})^2}{\sigma_{2,i}^2} - 1 + \log \frac{\sigma_{2,i}^2}{\sigma_{1,i}^2} \right],
\]
and the Dirichlet–Dirichlet case is the standard expression in log-gamma and digamma terms.
kl_gaussian and kl_dirichlet return this analytically. They are differentiable end-to-end under jax.grad, which is what makes them useful in a variational objective.
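A sketch of the diagonal-Gaussian case and a gradient through it, using the generic formula above rather than the library's kl_gaussian signature:

```python
# Sketch of the diagonal-Gaussian KL and a gradient through it; a generic
# formula, not the library's kl_gaussian signature.
import jax
import jax.numpy as jnp

def kl_diag_gauss(mu1, var1, mu2, var2):
    # Closed form: 1/2 * sum( (var1 + (mu1 - mu2)^2)/var2 - 1 + log(var2/var1) ).
    return 0.5 * jnp.sum((var1 + (mu1 - mu2) ** 2) / var2 - 1.0 + jnp.log(var2 / var1))

d = 8
mu1, var1 = jnp.zeros(d), jnp.ones(d)
mu2, var2 = jnp.ones(d), 2.0 * jnp.ones(d)

# Differentiable end-to-end, so it can sit directly inside a variational objective.
grad_wrt_mu1 = jax.grad(kl_diag_gauss)(mu1, var1, mu2, var2)
```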
Reparameterisation¶
Every GaussianHV has a .sample(key) method that uses the standard
reparameterisation trick. jax.grad composes through every distributional
op by construction; there is no hidden non-differentiable step.
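A sketch of the trick itself, with a free function standing in for the .sample(key) method:

```python
# Sketch of the reparameterisation trick with a free function standing in for
# the .sample(key) method.
import jax
import jax.numpy as jnp

def sample(mu, var, key):
    # Draw eps ~ N(0, I) and shift/scale it, so gradients flow through mu and var.
    eps = jax.random.normal(key, mu.shape)
    return mu + jnp.sqrt(var) * eps

key = jax.random.PRNGKey(0)
grad_mu = jax.grad(lambda m: jnp.sum(sample(m, jnp.ones(4), key)))(jnp.zeros(4))
```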
3. Functional programming commitments¶
The library is deliberately FP-shaped.
Immutable values. Every type is a frozen dataclass. There is no __setattr__, no hidden mutable state. Updates return new objects. The exception — AdaptiveHDC — uses JAX's lax.scan under the hood, so the appearance of mutation is an illusion over pure folds.
Pure functions. Core ops — bind_gaussian, bundle_gaussian, permute_gaussian, cleanup_gaussian, inverse_gaussian, kl_gaussian — have no side effects, no global state, no dependency on time.
Small, typed API. Every public function has explicit argument types and a return type. Any is avoided. Shape contracts are stated in the docstring and checked by tests.
Pytree-native. Every dataclass is registered via jax.tree_util.register_dataclass. jit, vmap, grad, pmap, shard_map compose unconditionally without user-side flatten_util boilerplate. (A minimal sketch of this pattern follows the list.)
No custom VJPs where the default works. The library prefers to lean on JAX's autodiff. Custom jvp/vjp rules appear only when there is a measurable speedup, and they never change the numerical output.
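The sketch referenced in the pytree-native item; the dataclass and its field names are hypothetical stand-ins for the library's types:

```python
# Hypothetical stand-in for the library's frozen, pytree-registered dataclasses.
from dataclasses import dataclass
import jax
import jax.numpy as jnp

@dataclass(frozen=True)
class ToyGaussianHV:
    mean: jnp.ndarray
    var: jnp.ndarray

# Registering the frozen dataclass makes it a pytree node.
jax.tree_util.register_dataclass(ToyGaussianHV, data_fields=["mean", "var"], meta_fields=[])

# jit / vmap / grad / tree_map now traverse it with no flatten_util boilerplate.
hv = ToyGaussianHV(jnp.ones(4), jnp.full((4,), 0.1))
doubled = jax.tree_util.tree_map(lambda a: 2.0 * a, hv)
```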
4. JAX idioms¶
jit at the boundary, not the interior. Individual primitives are cheap; composite operations (e.g. BayesianCentroidClassifier.fit) jit their inner workers but leave the constructor and the dispatch un-jitted so traceable error messages survive.
vmap for batching, not manual loops. Every op has a natural batched form via vmap. encode_batch, bind_batch, etc. are thin wrappers. (See the sketch after this list.)
shard_map at scale. pmap_bind_gaussian and shard_map_bind_gaussian fall back to single-device when only one device is visible, so the same code runs on a laptop and on a pod.
jax.random.PRNGKey splitting is explicit. Keys are never reused. Every function that consumes randomness takes a key as an argument; there is no hidden global key.
Float32 default, float64 available. The library is careful about dtypes; jnp.astype is used where promotion matters; no accidental upcasts.
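The sketch referenced in the vmap item: a toy per-item encoder batched with vmap, jit applied at the boundary, and one explicit key per item (encode here is a hypothetical function, not a library op):

```python
# Toy encoder (hypothetical, not a library op) batched with vmap, jit outside,
# one explicit key per item.
import jax
import jax.numpy as jnp

def encode(key, item):
    # Consumes its key explicitly; no hidden global randomness.
    return jax.random.normal(key, (256,)) * item

keys = jax.random.split(jax.random.PRNGKey(0), 32)    # one key per item, never reused
items = jnp.arange(32, dtype=jnp.float32)

encode_batch = jax.jit(jax.vmap(encode))              # batching via vmap, jit at the boundary
batch = encode_batch(keys, items)                     # shape (32, 256)
```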
5. Versioning and stability¶
The library is at 0.4.0a0. The public API (everything in
bayes_hdc/__init__.py) follows semver loosely — breaking changes are
called out in CHANGELOG.md. Before 1.0, names may be renamed or
reorganised; behaviours will not silently change.
The internal layout is pytree-first and module-local: new distributional
types belong in bayes_hdc/distributions.py, new inference primitives in
bayes_hdc/inference.py, new group-theoretic helpers in
bayes_hdc/equivariance.py. Public names are re-exported from
bayes_hdc/__init__.py with an explicit __all__.
6. When to reach for this library¶
Reach for it when you need one or more of:
a well-typed hypervector algebra with pytree-native composition;
closed-form moment propagation for a probabilistic HDC pipeline;
coverage-guaranteed or calibrated predictions on top of an HDC classifier;
a distribution-valued representation of classifier weights;
structured representations for task-conditioned agents;
bounded-memory streaming inference under distribution shift.
StreamingBayesianHDC maintains exponential-moving-average posteriors per class in O(K·d) memory regardless of stream length — useful for non-stationary biomedical-signal streams (drifting EMG calibration, EEG montage shifts, wearable-sensor recalibration) where a Kalman-style fixed-shrinkage filter (BayesianAdaptiveHDC) is too rigid and a full posterior fit is too expensive.
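A hedged sketch of what an exponential-moving-average posterior update can look like for a single class; this is a generic recipe, not StreamingBayesianHDC's actual update rule:

```python
# Generic EMA posterior update for one class; not StreamingBayesianHDC's rule.
import jax.numpy as jnp

def ema_update(mean, var, x, rate=0.05):
    # One (mean, var) pair per class keeps memory at O(K*d) regardless of stream length.
    new_mean = (1.0 - rate) * mean + rate * x
    new_var = (1.0 - rate) * var + rate * (x - new_mean) ** 2
    return new_mean, new_var
```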
Reach for something else when the task calls for deep end-to-end learning on natural images at ImageNet scale, or when the uncertainty in your model is irreducibly aleatoric and no amount of Bayesian machinery will surface it.