CUDA Programming - Search News

Hosted on MSN, 16d

DeepSeek's AI breakthrough bypasses Nvidia's industry-standard CUDA, uses assembly-like PTX programming instead

DeepSeek made quite a splash in the AI industry by training its Mixture-of-Experts (MoE) language model with 671 billion parameters using a cluster featuring 2,048 Nvidia H800 GP ...

mccormick.northwestern.edu, 9y

COMP_ENG 368, 468: Programming Massively Parallel Processors with CUDA

A hands-on introduction to parallel programming and optimizations for 1000+ core GPU processors, their architecture, the CUDA programming model, and performance analysis. Students implement various ...

Hackaday, 4mon

Learn GPU Programming With Simple Puzzles

Have you wanted to get into GPU programming with CUDA but found the usual textbooks and guides a bit too intense? Well, help is at hand in the form of a series of increasingly difficult ...

15d on MSN

Chinese algorithm boosts Nvidia GPU performance 800-fold in science computing

A breakthrough by Chinese researchers could help solve complex problems in industries ranging from aerospace to bridge design ...

Tencent (腾讯网), 16d

"DeepSeek even bypassed CUDA": an engineer's probing question: does Nvidia's moat still exist?

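An analysis from Mirae Asset Securities Research (South Korea) says the reason V3's hardware efficiency can be 10 times higher than that of Meta and others can be summed up as "they rebuilt everything from scratch." When training DeepSeek-V3 on Nvidia H800 GPUs, they tailored the hardware to their own needs, reassigning 20 of the 132 streaming multiprocessors (SMs) to handle server-to-server communication instead of computation.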

Sina (新浪网), 14d

"DeepSeek even bypassed Nvidia's CUDA": paper details spark renewed debate

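Hardware outlet Tom's Hardware brings the first hot topic of the new year: DeepSeek even bypassed CUDA, optimizing in a lower-level programming language. This time, more details from the DeepSeek-V3 paper have been dug up ...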

16d

DeepSeek sidesteps the CUDA monopoly as more V3 paper details are unearthed! Is Nvidia's moat gone?

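[新智元 digest] Did DeepSeek's model development really bypass CUDA? According to the latest reports, the DeepSeek team took an unusual path: optimizing against PTX, the low-level assembly language for Nvidia GPUs, to achieve maximum performance. Industry observers are asking: is the CUDA moat gone?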

mccormick.northwestern.edu, 2mon

COMP_SCI 368, 468: Programming Massively Parallel Processors with CUDA

The initial part of the course will discuss a popular programming interface for graphics processors, the CUDA programming tools for NVIDIA processors. The course will continue with a closer view of ...

16d

"DeepSeek even bypassed Nvidia's CUDA": paper details spark renewed debate

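Hardware outlet Tom's Hardware brings the first hot topic of the new year: DeepSeek even bypassed CUDA, optimizing in a lower-level programming language. An analysis from Mirae Asset Securities Research (South Korea) says the reason V3's hardware efficiency can be 10 times higher than that of Meta and others can be summed up as "they rebuilt everything from scratch."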
