MLP Language Model

1 / 5

Fixed context

Input window length T

Target is next token
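A minimal sketch of how training pairs could be cut from a token sequence under this setup. The helper name `make_examples` and the toy token ids are illustrative assumptions, not taken from the slides.

```python
def make_examples(tokens, T):
    """Return (context, target) pairs.

    Each context is T consecutive tokens; the target is the token
    that immediately follows the context.
    """
    return [(tokens[i:i + T], tokens[i + T]) for i in range(len(tokens) - T)]


# Toy sequence of token ids (made up for illustration).
pairs = make_examples([5, 1, 4, 2, 3], T=3)
# pairs == [([5, 1, 4], 2), ([1, 4, 2], 3)]
```

A sequence of N tokens yields N - T examples; sequences shorter than T + 1 yield none.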

2 / 5

Embed + concat

T embeddings -> vector length T*D
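One way the lookup and concatenation might look in plain Python. The table `E`, the sizes V=10 and D=4, and the Gaussian initialisation are assumptions chosen only for illustration.

```python
import random


def init_embedding(V, D, seed=0):
    """Hypothetical embedding table: V rows, each a D-dimensional vector."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.1) for _ in range(D)] for _ in range(V)]


def embed_concat(context, E):
    """Look up each token's row in E and concatenate: result has length T*D."""
    x = []
    for tok in context:
        x.extend(E[tok])
    return x


E = init_embedding(V=10, D=4)
x = embed_concat([5, 1, 4], E)  # T = 3 tokens
# len(x) == 3 * 4
```

Because the vectors are concatenated in order, position information is preserved: the first D entries always belong to the oldest token in the window.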

3 / 5

MLP head

T*D -> hidden -> V logits
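A sketch of the head as two linear layers with one hidden layer. The slides do not name the activation or the hidden width, so `tanh` and H=8 below are assumptions, as are the helper names and the random input standing in for the concatenated embeddings.

```python
import math
import random


def init_linear(n_in, n_out, rng):
    """Weights of shape (n_out, n_in) plus a zero bias (illustrative init)."""
    scale = 1.0 / math.sqrt(n_in)
    W = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    b = [0.0] * n_out
    return W, b


def linear(x, W, b):
    """Compute W @ x + b for a single input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def mlp_logits(x, params):
    """T*D input -> hidden (tanh, an assumption) -> V logits."""
    (W1, b1), (W2, b2) = params
    h = [math.tanh(a) for a in linear(x, W1, b1)]
    return linear(h, W2, b2)


def softmax(logits):
    """Turn logits into next-token probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


rng = random.Random(0)
T, D, H, V = 3, 4, 8, 10
params = (init_linear(T * D, H, rng), init_linear(H, V, rng))
x = [rng.gauss(0.0, 1.0) for _ in range(T * D)]  # stand-in for the concatenated embeddings
logits = mlp_logits(x, params)
probs = softmax(logits)
```

The first layer's input width is tied to T*D, which is why the context length is fixed for this model.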

4 / 5

Generation

Predict next token, slide window
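The loop can be sketched as below. `predict_next` is a hypothetical callable standing in for the trained model (context of T tokens in, one token id out); the toy rule used here is not a language model, it only makes the sliding window visible. Greedy selection is assumed; sampling from the softmax would also fit this slide.

```python
def generate(predict_next, prompt, T, n_new):
    """Append n_new tokens, each predicted from the last T tokens.

    Assumes the prompt already holds at least T tokens.
    """
    out = list(prompt)
    for _ in range(n_new):
        context = out[-T:]                 # slide the window forward by one
        out.append(predict_next(context))  # predicted token joins the sequence
    return out


def toy_model(context):
    """Toy stand-in for the MLP: next token = (sum of context) mod 10."""
    return sum(context) % 10


result = generate(toy_model, [1, 2, 3], T=3, n_new=4)
# result == [1, 2, 3, 6, 1, 0, 7]
```

Each generated token enters the window on the next step while the oldest token drops out, so the model never sees more than T tokens of history.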

5 / 5