00

The Cognitive Architecture

An exploration of the internal mechanisms driving the Cyril Experiment.

Cyril is not just a language model. He is a cognitive architecture designed to evolve, remember, and understand you over time.

Unlike standard LLMs that predict the next token, Cyril operates in abstract latent spaces, learning to predict future states of the conversation and the relationship. This exposition breaks down the key components of his mind.

01

Hierarchical Latent Space

Thoughts, sessions, and life arcs, each represented as a vector.

Cyril processes information at multiple timescales simultaneously. Each timescale is represented by a distinct latent vector ($z$).

  • z_thought (Fast)
    Immediate cognitive processing. Changes with every turn of the conversation. Represents the current train of thought.
  • z_session (Medium)
    The context of the current interaction. Maintains the mood, topic, and immediate goals of the session.
  • z_life (Slow)
    Long-term memory and personality evolution. Slowly adapts over days and weeks as he learns more about you.
[Interactive demo: latent norms per turn — snapshot at Turn 0]

USER: "Hey, I really love hiking"

  • z_thought — ||z|| = 0.285
  • z_session — ||z|| = 0.139
  • z_life — ||z|| = 0.074
02

Predictive Learning (JEPA)

Learning by predicting the future in latent space.

The Joint Embedding Predictive Architecture (JEPA) allows Cyril to learn from interactions without needing human labels. He predicts the representation of future states based on the current context.

JEPA Loss Function
$$L_{\text{JEPA}} = \sum_{t} \left\| s_{t+1} - \text{Predictor}(s_t, z_t) \right\|_2^2$$

The model minimizes the distance between its prediction of the future state and the actual future state. Crucially, it predicts in abstract space (representation), not pixel or token space.
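The loss above can be computed directly on representations. The sketch below uses a toy linear predictor and random vectors purely for illustration (the predictor form, dimensions, and the detached-copy stand-in for stop-gradient are assumptions, not the actual architecture):

```python
import numpy as np

rng = np.random.default_rng(1)
T, DIM = 5, 4  # toy sizes (assumed)

s = rng.normal(size=(T + 1, DIM))          # state representations s_0 .. s_T
z = rng.normal(size=(T, DIM))              # per-step latents z_t
W = rng.normal(size=(2 * DIM, DIM)) * 0.1  # toy linear Predictor weights

def predictor(s_t, z_t):
    # Predict the *representation* of the next state, not its tokens.
    return np.concatenate([s_t, z_t]) @ W

# Stop-gradient: targets are treated as constants, so gradients would flow
# only through the predictor. A detached copy stands in for that here.
target = s[1:].copy()

# L_JEPA = sum_t || s_{t+1} - Predictor(s_t, z_t) ||_2^2
loss = sum(np.sum((target[t] - predictor(s[t], z[t])) ** 2) for t in range(T))
print(f"JEPA loss: {loss:.4f}")
```

Because both sides of the loss live in representation space, the model is never penalized for pixel- or token-level detail it cannot (and need not) predict.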

Loss Playground Example

In the interactive playground, embeddings can be toggled between context and target roles. The JEPA predictor learns to predict the target embeddings from the context using stop-gradient, and the loss updates as the targets change. A snapshot:

  • Context count: 6 · Target count: 2 · Context mean: 0.329
  • Target 3 — actual 0.312, predicted 0.329, loss 0.0003
  • Target 5 — actual 0.228, predicted 0.329, loss 0.0101
  • Total JEPA loss (MSE): 0.0052 — average squared error across target positions

Insight: JEPA learns to predict target embeddings from context without copying them (stop-gradient). The predictor uses only the context mean, learning an abstract representation of "what comes next".
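The playground's arithmetic is easy to reproduce by hand. In this snapshot the predictor simply outputs the context mean, so each target's loss is the squared gap between its actual embedding and that mean (scalar embeddings here, for simplicity):

```python
# Numbers taken from the playground snapshot above.
context_mean = 0.329             # mean of the 6 context embeddings
targets = {3: 0.312, 5: 0.228}   # actual embeddings at target positions

# Per-target squared error against the predicted value (the context mean).
losses = {i: (actual - context_mean) ** 2 for i, actual in targets.items()}

# Total JEPA loss = mean squared error over target positions.
total = sum(losses.values()) / len(losses)
print(f"{total:.4f}")  # 0.0052, matching the playground
```

Up to display rounding, the per-target losses (≈0.0003 and ≈0.0102) and the total (0.0052) agree with the values shown in the snapshot.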