While debugging the code, I ran into a question about the shape of the K tensor.
Between RoPE on K and SDP, the K tensor goes through the two steps below:
Step 1:
In the llm_build_kv_store() function, the k_cur tensor has shape [batch, seq_len, k_head, head_dim], and k_cur is copied into the KV cache.
Step 2:
In the llm_build_kqv() function, however, when k_cur is read back from the KV cache, its shape is [batch, k_head, seq_len, head_dim].
Q: I can't find where the k_cur tensor is permuted from [batch, seq_len, k_head, head_dim] to [batch, k_head, seq_len, head_dim], because the only operations I see are "copy k_cur to kv-cache" and "view k_cur from kv-cache". I did not find any code that permutes the K tensor.
Could you please help me with this question?
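To make the question concrete, here is a minimal sketch of one way a "copy then view" pair could change the apparent dimension order without any explicit permute: the view reads the same flat buffer with swapped strides. The sizes and the `StridedView` class are hypothetical and exist only for this illustration. Whether llm_build_kqv() does exactly this is an assumption that would need to be checked against the stride arguments of its view call.

```python
# Hypothetical sizes, chosen only for illustration (batch dimension omitted).
SEQ_LEN, N_HEAD, HEAD_DIM = 3, 2, 4

# Step 1: "store" K into a flat cache in [seq_len, k_head, head_dim] order.
# Each element records the (token, head, dim) position it came from.
cache = []
for t in range(SEQ_LEN):
    for h in range(N_HEAD):
        for d in range(HEAD_DIM):
            cache.append((t, h, d))


class StridedView:
    """Hypothetical read-only 3-D view over a flat buffer, addressed by strides."""

    def __init__(self, buf, shape, strides):
        self.buf = buf
        self.shape = shape
        self.strides = strides

    def at(self, i, j, k):
        s0, s1, s2 = self.strides
        return self.buf[i * s0 + j * s1 + k * s2]


# Step 2: view the same buffer as [k_head, seq_len, head_dim].
# No data moves; only the strides for the first two dimensions are swapped.
k_view = StridedView(
    cache,
    shape=(N_HEAD, SEQ_LEN, HEAD_DIM),
    strides=(HEAD_DIM, N_HEAD * HEAD_DIM, 1),
)

# Indexing the view as (head, token, dim) reaches the element that was
# stored at (token, head, dim).
print(k_view.at(1, 2, 3))
```

If the view in llm_build_kqv() is built this way, the permutation would live in the view's strides rather than in a separate permute op, which would explain why no such op shows up in the graph.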