Embeddings and Positional Encoding

Token embeddings

Lookup table W of shape (V, D): row W[t] is the embedding of token id t (V = vocabulary size, D = embedding dimension)
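A minimal sketch of the lookup, using plain Python lists in place of a tensor library. The sizes, the placeholder random values, and the helper name embed_tokens are assumptions made for illustration; they do not come from the slides.

```python
import random

V, D = 10, 4  # hypothetical vocabulary size and embedding dimension

rng = random.Random(0)
# W holds one row of D floats per token id, i.e. shape (V, D).
# The values are placeholders; initialization is a separate concern.
W = [[rng.uniform(-0.5, 0.5) for _ in range(D)] for _ in range(V)]


def embed_tokens(token_ids, table):
    """Return one row of `table` per token id (a pure lookup, no arithmetic)."""
    return [table[t] for t in token_ids]


token_ids = [3, 1, 3]  # hypothetical input sequence
token_emb = embed_tokens(token_ids, W)  # shape (len(token_ids), D)
```

Because the operation is only indexing, a repeated token id (3 above) yields the same row both times.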

Positional embeddings

Table P of shape (Tmax, D): row P[i] is the embedding of position i (Tmax = maximum sequence length)
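A sketch of how such a table might be read, assuming position i simply selects row i and that a sequence longer than Tmax is an error. The sizes, placeholder values, and the helper name embed_positions are hypothetical.

```python
import random

T_MAX, D = 8, 4  # hypothetical maximum sequence length and embedding dimension

rng = random.Random(0)
# P holds one row of D floats per position, i.e. shape (T_MAX, D).
P = [[rng.uniform(-0.5, 0.5) for _ in range(D)] for _ in range(T_MAX)]


def embed_positions(seq_len, table):
    """Return rows 0..seq_len-1 of `table`, one per position."""
    if seq_len > len(table):
        raise ValueError(
            f"sequence length {seq_len} exceeds table size {len(table)}"
        )
    return table[:seq_len]


pos_emb = embed_positions(5, P)  # shape (5, D)
```

The explicit length check makes the consequence of a fixed-size table visible: positions at or beyond T_MAX have no row to look up.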

Combine

input[i] = token_emb[i] + pos_emb[i]
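The sum can be sketched as an elementwise addition at each position. The helper name combine and the small hand-written vectors are illustrative assumptions, chosen so the result is easy to check by eye.

```python
def combine(token_emb, pos_emb):
    """Add token and positional vectors elementwise, position by position."""
    if len(token_emb) != len(pos_emb):
        raise ValueError("token_emb and pos_emb must cover the same positions")
    return [
        [t + p for t, p in zip(tok, pos)]
        for tok, pos in zip(token_emb, pos_emb)
    ]


# Hypothetical vectors: T = 2 positions, D = 3 dimensions.
token_emb = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
pos_emb = [[0.5, 0.5, 0.5], [1.0, 1.0, 1.0]]

inputs = combine(token_emb, pos_emb)
print(inputs)  # [[1.5, 2.5, 3.5], [5.0, 6.0, 7.0]]
```

Both operands must share the same shape (T, D), which is why the token and positional tables use the same D.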

Init

k = 1/sqrt(D); sample each entry from U(-k, k)
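A sketch of this initialization with the standard library. The slide does not say which table it applies to, so the helper init_table (a hypothetical name) takes an arbitrary row count; the seed and sizes are also assumptions.

```python
import math
import random


def init_table(rows, dim, seed=0):
    """Build a (rows, dim) table with entries drawn from U(-k, k), k = 1/sqrt(dim)."""
    k = 1.0 / math.sqrt(dim)
    rng = random.Random(seed)
    return [[rng.uniform(-k, k) for _ in range(dim)] for _ in range(rows)]


D = 16                 # hypothetical embedding dimension, so k = 1/sqrt(16) = 0.25
W = init_table(10, D)  # hypothetical table with 10 rows
```

With this bound, larger D gives a narrower sampling range, so individual entries shrink as the vectors get longer.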
