Full Model Flow Visualization

Model: knock_6_1_36_words
Probe Sentence: "knock knock whos there cat"

Input Embeddings

For each token of the probe sentence ('knock' at pos 0, 'knock' at pos 1, 'whos' at pos 2, 'there' at pos 3, 'cat' at pos 4), this table showed three columns: the word, its token embedding, and the combined embedding (token embedding plus positional embedding) that is the input to Block 0. The embedding heatmaps themselves are not reproduced here.
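
As a reference point, here is a minimal NumPy sketch of how the combined embedding is typically formed: the token embedding plus a learned positional embedding, one vector per position. The sizes, token ids, and random parameters below are placeholders, not values read from knock_6_1_36_words.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes: the model name knock_6_1_36_words suggests a 36-word
# vocabulary; max_len and d_model here are illustrative placeholders.
vocab_size, max_len, d_model = 36, 16, 32

token_embedding = rng.normal(size=(vocab_size, d_model))     # learned token table
position_embedding = rng.normal(size=(max_len, d_model))     # learned position table

# Hypothetical token ids for the probe "knock knock whos there cat".
token_ids = np.array([5, 5, 7, 9, 3])
positions = np.arange(len(token_ids))

# Combined Embedding (Input to Block 0): token vector + positional vector.
x = token_embedding[token_ids] + position_embedding[positions]
print(x.shape)  # (5, 32): one combined vector per token of the probe sentence
```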

Transformer Block 0

Each block's table showed per-token heatmaps for Input (x), After LN1, Query (q), Key (k), Value (v), Attn Out (z), Attention, Attn Proj, After Resid1, After LN2, MLP Out, and Block Output. The heatmaps are not reproduced here; the attention weights are, with each row showing how strongly that query position attends to each key position (rows sum to 1).

Query            knock(0)  knock(1)   whos(2)  there(3)    cat(4)
'knock' (pos 0)     0.096     0.064     0.244     0.309     0.288
'knock' (pos 1)     0.104     0.060     0.199     0.335     0.302
'whos'  (pos 2)     0.090     0.060     0.196     0.358     0.296
'there' (pos 3)     0.066     0.042     0.259     0.198     0.435
'cat'   (pos 4)     0.119     0.086     0.295     0.299     0.201
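
For orientation, the sketch below walks through one block in the order of the column headings above: LN1, then the q/k/v projections, the attention weights, the z output, the attention projection plus residual, LN2, the MLP, and the final residual that becomes the Block Output. It assumes a pre-LN, single-head block (the model name suggests one head) with a ReLU MLP and random placeholder parameters; no causal mask is applied, matching the tables above, where each row spreads its weight over all five positions.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Per-position layer normalisation (learned scale/shift omitted for brevity).
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def block_forward(x, p):
    """One single-head block; intermediate names follow the column headings above."""
    h1 = layer_norm(x)                                        # After LN1
    q, k, v = h1 @ p["Wq"], h1 @ p["Wk"], h1 @ p["Wv"]        # Query (q), Key (k), Value (v)
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)   # Attention: one row per query position
    z = attn @ v                                              # Attn Out (z)
    x = x + z @ p["Wo"]                                       # Attn Proj, then After Resid1
    h2 = layer_norm(x)                                        # After LN2
    mlp = np.maximum(h2 @ p["W1"], 0.0) @ p["W2"]             # MLP Out (ReLU assumed)
    return x + mlp, attn                                      # Block Output and attention weights

# Toy usage with random parameters (d_model and d_ff are placeholders).
rng = np.random.default_rng(0)
d_model, d_ff = 32, 128
shapes = {"Wq": (d_model, d_model), "Wk": (d_model, d_model), "Wv": (d_model, d_model),
          "Wo": (d_model, d_model), "W1": (d_model, d_ff), "W2": (d_ff, d_model)}
p = {name: rng.normal(size=shape) * 0.1 for name, shape in shapes.items()}

x = rng.normal(size=(5, d_model))      # 5 tokens, as in the probe sentence
out, attn = block_forward(x, p)
print(attn.shape, attn.sum(axis=-1))   # (5, 5); each row sums to 1, like the tables above
```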

Transformer Block 1

Attention weights (per-token heatmap columns not reproduced):

Query            knock(0)  knock(1)   whos(2)  there(3)    cat(4)
'knock' (pos 0)     0.195     0.119     0.105     0.186     0.395
'knock' (pos 1)     0.222     0.111     0.086     0.206     0.374
'whos'  (pos 2)     0.153     0.088     0.085     0.153     0.520
'there' (pos 3)     0.171     0.147     0.137     0.124     0.420
'cat'   (pos 4)     0.197     0.123     0.162     0.204     0.314

Transformer Block 2

Attention weights (per-token heatmap columns not reproduced):

Query            knock(0)  knock(1)   whos(2)  there(3)    cat(4)
'knock' (pos 0)     0.195     0.142     0.036     0.124     0.503
'knock' (pos 1)     0.215     0.149     0.030     0.149     0.457
'whos'  (pos 2)     0.201     0.160     0.037     0.082     0.520
'there' (pos 3)     0.093     0.059     0.006     0.006     0.835
'cat'   (pos 4)     0.102     0.069     0.050     0.050     0.729

Transformer Block 3

Attention weights (per-token heatmap columns not reproduced):

Query            knock(0)  knock(1)   whos(2)  there(3)    cat(4)
'knock' (pos 0)     0.159     0.120     0.039     0.093     0.588
'knock' (pos 1)     0.164     0.122     0.031     0.100     0.584
'whos'  (pos 2)     0.196     0.149     0.020     0.068     0.566
'there' (pos 3)     0.072     0.043     0.003     0.009     0.872
'cat'   (pos 4)     0.073     0.047     0.024     0.056     0.800

Transformer Block 4

Attention weights (per-token heatmap columns not reproduced):

Query            knock(0)  knock(1)   whos(2)  there(3)    cat(4)
'knock' (pos 0)     0.063     0.088     0.033     0.758     0.057
'knock' (pos 1)     0.033     0.050     0.014     0.885     0.018
'whos'  (pos 2)     0.157     0.207     0.189     0.265     0.181
'there' (pos 3)     0.049     0.022     0.031     0.006     0.892
'cat'   (pos 4)     0.127     0.089     0.066     0.109     0.610

Transformer Block 5

Attention weights (per-token heatmap columns not reproduced):

Query            knock(0)  knock(1)   whos(2)  there(3)    cat(4)
'knock' (pos 0)     0.159     0.091     0.192     0.102     0.456
'knock' (pos 1)     0.170     0.095     0.221     0.054     0.461
'whos'  (pos 2)     0.102     0.062     0.062     0.186     0.589
'there' (pos 3)     0.090     0.085     0.010     0.097     0.718
'cat'   (pos 4)     0.033     0.011     0.066     0.277     0.612
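
Schematically, the six blocks chain together: each Block Output becomes the next block's Input (x), and the last Block Output goes on to the final projection. The sketch below is only a wiring diagram; block_forward and final_projection stand for the per-block and final-projection sketches shown elsewhere on this page and are passed in as callables.

```python
def model_forward(x, block_params, block_forward, final_projection):
    """Hypothetical end-to-end pass over Blocks 0-5 followed by the final projection."""
    attention_maps = []
    for p in block_params:                  # parameters for Blocks 0 .. 5
        x, attn = block_forward(x, p)       # Block Output feeds the next block's Input (x)
        attention_maps.append(attn)         # one 5x5 attention table per block, as shown above
    logits = final_projection(x)            # see the Final Projection section below
    return logits, attention_maps
```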

Final Projection

For each token position ('knock' at pos 0 through 'cat' at pos 4), this table showed its state after the final layer normalisation, the output of the final linear layer (one logit per vocabulary word), and a dot-product breakdown of how that position's normalised state aligns with the embedding of 'cat', the top prediction. These heatmaps are not reproduced here.
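
A minimal sketch of this step, assuming the final linear layer is tied to the token embedding matrix, which is consistent with the 'Dot Product Breakdown' comparison described in the next section (the logit for a token is then the dot product of the normalised final state with that token's embedding). The layer-norm scale and shift are again omitted.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(x.var(axis=-1, keepdims=True) + eps)

def final_projection(block5_output, token_embedding):
    """Final layer norm, then a dot product against every token embedding.

    With tied weights (an assumption here), the logit for 'cat' at a position is
    the dot product of that position's normalised final state with cat's embedding
    vector, which is what the 'Dot Product Breakdown for cat' column visualised.
    """
    h = layer_norm(block5_output)        # After Final Layer Normalisation
    logits = h @ token_embedding.T       # Final Linear Layer: one score per vocabulary word
    return logits
```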

Next Token Prediction

Below are the top 10 candidate next tokens and their probabilities (the raw embedding heatmap shown for each token is not reproduced here). The top prediction's embedding can be compared with the 'Dot Product Breakdown' visualization in the 'Final Projection' section above, which shows how the model's final internal state aligns with a token's embedding to produce a high logit score.

Token    Probability
cat           12.78%
dog            2.79%
catch          2.50%
can            2.43%
like           1.44%
dead           1.40%
hot            1.35%
day            1.26%
new            1.18%
as             1.08%
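
The probabilities above would typically come from a softmax over the last position's logits; a small sketch follows, with a placeholder vocabulary and no real logits supplied.

```python
import numpy as np

def top_k_predictions(last_position_logits, vocab, k=10):
    """Softmax over the last position's logits, then the k most probable tokens."""
    z = last_position_logits - last_position_logits.max()   # numerically stable softmax
    probs = np.exp(z) / np.exp(z).sum()
    order = np.argsort(probs)[::-1][:k]
    return [(vocab[i], float(probs[i])) for i in order]

# With the real model's logits and vocabulary, the first entry would be
# ('cat', 0.1278), matching the 12.78% shown in the table above.
```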