{"type":"link","version":"1.0","title":"Study finds that in Llama, Deep… | The Curse of Depth… · HotDaily Daily Evening News","description":"A study finds that ineffective deep layers are a widespread phenomenon in mainstream models such as Llama, DeepSeek, and Qwen. The analysis indicates that the Pre-LN mechanism causes output variance to grow exponentially with depth…","url":"https://hotdaily.top/item/9bbbbba3-3196-46bf-83b0-94fc16273c37","provider_name":"HotDaily","provider_url":"https://hotdaily.top","thumbnail_url":"https://hotdaily.top/og.png","thumbnail_width":1200,"thumbnail_height":630}