Front Page Detectives on MSN, 2 days ago
Researchers Use Cosmic Rays to Discover Mysterious 30-Metre-Long Space Inside the Giza Pyramid, Baffled by Its Purpose In ...
High-resolution muon imaging and AI are being used to unlock the potential of one of the largest undeveloped primary zinc ...
Neutrinos have always been difficult to study because their small mass and neutral charge make them especially elusive.
Muon tomography, or muography, is the practice of using muons generated by cosmic rays interacting with Earth’s atmosphere to ...
South Korean cable firm LS Eco Energy seeks to secure a stable rare earth supply from Vietnam, after its partners in a deal signed in January 2024 were arrested in November 2024.
IEEE Spectrum on MSN, 17 days ago
This $100 Muon Detector Lets You Harness the Cosmos
These muon particles are heavyweight cousins of electrons that travel close to the speed of light. They can penetrate through many meters of solid rock, including the limestone and granite blocks used ...
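Jiqizhixin report. Editors: Chen Chen, Jiaqi. Half the compute for twice the result: Moonshot AI has open-sourced the Muon optimizer, which leads across the board under the same budget. Moonshot AI and DeepSeek have "collided" once again. Last time it was papers: the two companies released improved attention mechanisms almost back to back; see "Colliding with DeepSeek NSA: MoBA, a new attention architecture co-authored by Kimi's Yang Zhilin, is released, with code also public" and "Just now! DeepSeek ...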
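Company news. Economic Observer Online, February 24: Moonshot AI's Kimi released a new technical report, "Muon is Scalable for LLM Training", and announced "Moonlight": a 3-billion/16-billion-parameter Mixture-of-Experts (MoE) model trained with Muon. Trained on 5.7 trillion tokens, it achieves better performance with fewer floating-point operations (FLOPs), thereby advancing the Pareto efficiency frontier. (Editor ...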
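The release of the Moonlight model is undoubtedly a shot in the arm for the AI field. The model was trained on as many as 5.7 trillion tokens while achieving a significant performance gain with fewer floating-point operations (FLOPs). This breakthrough not only advances the Pareto efficiency frontier but also offers new ideas for training future large language models. The Moonshot AI team said that the Muon optimizer becomes more efficient in large-scale training through two techniques: introducing weight decay and carefully adjusting the update magnitude of each parameter.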
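The two techniques named in this snippet, weight decay and a per-parameter adjustment of the update magnitude, can be illustrated with a minimal sketch. It is not Moonshot AI's implementation. The function name, the default hyperparameters, and the scale factor 0.2 * sqrt(max(rows, cols)) are assumptions for illustration; the snippet states none of them. The update direction is taken as a given input, standing in for whatever matrix the optimizer has already computed.

```python
import math


def scaled_update_with_weight_decay(weight, direction, lr=0.02,
                                    weight_decay=0.1, rms_factor=0.2):
    """Return weight - lr * (scale * direction + weight_decay * weight).

    weight, direction: equal-shaped matrices given as lists of rows.
    scale = rms_factor * sqrt(max(rows, cols)) is an ASSUMED per-matrix
    factor that ties the update magnitude to the matrix shape.
    """
    rows, cols = len(weight), len(weight[0])
    scale = rms_factor * math.sqrt(max(rows, cols))
    return [
        # Decoupled weight decay: the decay term uses the weight itself,
        # not the gradient, and is added next to the rescaled direction.
        [w - lr * (scale * d + weight_decay * w) for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, direction)
    ]


# Hypothetical 2x2 example; `direction` stands in for a precomputed update.
weight = [[1.0, 0.0], [0.0, 1.0]]
direction = [[1.0, 0.0], [0.0, -1.0]]
new_weight = scaled_update_with_weight_decay(
    weight, direction, lr=0.1, weight_decay=0.5
)
```

With weight_decay set to 0 the step reduces to a plain rescaled update, and with a zero direction it reduces to pure shrinkage of the weights toward zero, which makes the two effects easy to inspect separately.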
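IT Home, February 24: Moonshot AI's Kimi yesterday released a new technical report, "Muon is Scalable for LLM Training", and announced "Moonlight": a 3-billion/16-billion-parameter Mixture-of-Experts (MoE) model trained with Muon. Trained on 5.7 trillion tokens, it achieves better performance with fewer floating-point operations (FLOPs), thereby advancing the Pareto efficiency frontier.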