Word Similarity with Embeddings
Work with pre-computed word embeddings to find similar words and solve word analogies.
Background
Word embeddings represent words as dense vectors where similar words have similar vectors. You'll implement functions to work with these embeddings.
Functions to implement
1. cosine_similarity(v1, v2)
Compute the cosine similarity between two vectors: the dot product of v1 and v2 divided by the product of their magnitudes.
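A minimal sketch of one way to compute this with only the standard library; it assumes both vectors are non-zero and of equal length:

```python
import math

def cosine_similarity(v1, v2):
    # Dot product of the two vectors.
    dot = sum(a * b for a, b in zip(v1, v2))
    # Euclidean norms; assumes neither vector is all zeros.
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    return dot / (norm1 * norm2)
```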
2. find_most_similar(word, embeddings, top_k)
Find the top_k most similar words to a given word.
- embeddings is a dict mapping words to their vectors
- Return a list of (word, similarity) tuples, sorted by similarity descending
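One possible sketch, reusing cosine_similarity from above. Excluding the query word from its own results is an assumption here; the spec above doesn't state it explicitly:

```python
def find_most_similar(word, embeddings, top_k):
    query = embeddings[word]
    # Score every other word against the query vector.
    scores = [
        (other, cosine_similarity(query, vec))
        for other, vec in embeddings.items()
        if other != word  # assumption: the query word is excluded
    ]
    # Most similar first, then keep the top_k entries.
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]
```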
3. word_analogy(a, b, c, embeddings)
Solve "a is to b as c is to ?" using vector arithmetic.
- Compute: result = embeddings[b] - embeddings[a] + embeddings[c]
- Return the word most similar to the result vector (excluding a, b, c)
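A sketch along the same lines, again reusing cosine_similarity; it treats the vectors as plain Python lists, matching the example below:

```python
def word_analogy(a, b, c, embeddings):
    # result = b - a + c, computed element-wise.
    result = [
        vb - va + vc
        for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])
    ]
    # Return the most similar word, skipping the three input words.
    best_word, best_score = None, float("-inf")
    for word, vec in embeddings.items():
        if word in (a, b, c):
            continue
        score = cosine_similarity(result, vec)
        if score > best_score:
            best_word, best_score = word, score
    return best_word
```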
Examples
```python
embeddings = {
    "king":  [0.5, 0.7, 0.1],
    "queen": [0.6, 0.8, 0.1],
    "man":   [0.4, 0.2, 0.1],
    "woman": [0.5, 0.3, 0.1],
}

# king - man + woman ≈ queen
word_analogy("man", "king", "woman", embeddings)  # "queen"
```
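In this toy data the analogy is exact: king - man + woman = [0.6, 0.8, 0.1], which is precisely queen's vector. With the find_most_similar sketch above, the same embeddings would give, with similarities rounded:

```python
find_most_similar("king", embeddings, top_k=2)
# -> [("queen", 0.9996), ("woman", 0.9173)]
```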