Compute the scaled dot-product score between a query and key vector — the fundamental operation before softmax in attention.
score(Q, K) = (Q · K) / √d_k
Where d_k = length of Q (and K).
Without scaling, dot products grow in magnitude as d_k increases → softmax saturates toward a near one-hot output → gradients vanish.
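To make the scaling argument concrete, here is a small illustrative sketch. It assumes the components of Q and K are independent standard-normal draws, which the problem statement does not specify; the helper name `dot_variances` and the sample sizes are hypothetical. Under that assumption the raw dot product has variance of roughly d_k, while the scaled score has variance of roughly 1.

```python
import math
import random
import statistics


def dot_variances(d_k, samples=2000, seed=0):
    """Estimate the variance of raw and scaled dot products.

    Assumption (not from the problem statement): each component of
    Q and K is an independent standard-normal draw.
    """
    rng = random.Random(seed)
    raw = []
    for _ in range(samples):
        q = [rng.gauss(0.0, 1.0) for _ in range(d_k)]
        k = [rng.gauss(0.0, 1.0) for _ in range(d_k)]
        raw.append(sum(qi * ki for qi, ki in zip(q, k)))
    # Dividing every score by sqrt(d_k) divides the variance by d_k.
    scaled = [s / math.sqrt(d_k) for s in raw]
    return statistics.pvariance(raw), statistics.pvariance(scaled)


raw_var, scaled_var = dot_variances(d_k=64)
print(f"raw variance ~ {raw_var:.1f}, scaled variance ~ {scaled_var:.2f}")
```

Keeping the scores at roughly unit variance is what keeps softmax away from its flat, saturated regions.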
attention_score([1,0,1,0],[0,1,0,1]) → 0.0
attention_score([1,1],[1,1]) → 1.41421
Round to **5 decimal places**.
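One possible approach, sketched in plain Python, is shown below. The function name `attention_score` and the list arguments come from the examples; the equal-length check and the `ValueError` are assumptions added for safety and are not required by the problem statement.

```python
import math


def attention_score(q, k):
    """Scaled dot-product score between one query and one key vector."""
    # Assumption: mismatched or empty inputs are treated as errors.
    if len(q) != len(k) or not q:
        raise ValueError("q and k must be non-empty and the same length")
    d_k = len(q)
    dot = sum(qi * ki for qi, ki in zip(q, k))
    # Scale by sqrt(d_k), then round to 5 decimal places as required.
    return round(dot / math.sqrt(d_k), 5)


print(attention_score([1, 0, 1, 0], [0, 1, 0, 1]))  # 0.0
print(attention_score([1, 1], [1, 1]))              # 1.41421
```

For the second call, the dot product is 2 and d_k is 2, so the score is 2 / √2 ≈ 1.41421.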
Test Cases (2 visible · 2 hidden)
Case 1: Orthogonal vectors
Input: attention_score([1,0,1,0],[0,1,0,1])
Expected: 0.0
Case 2: 2D parallel
Input: attention_score([1,1],[1,1])
Expected: 1.41421