fix embed_tokens for last layer in qwen models #458

single-m4-pro (llama-3.2-1b)  /  run-distributed-job (M4PRO_GPU16_24GB)

succeeded Jan 28, 2025 in 54s