Tencent’s tech team has optimized DeepSeek’s open-source DeepEP communication framework, boosting its performance across different network environments, according to the Chinese AI startup. Testing showed a 100% improvement on RoCE networks and a 30% gain on InfiniBand (IB), offering a more efficient option for AI model training. On GitHub, DeepSeek acknowledged that the Chinese tech giant’s contribution had led to a “huge speedup.” DeepEP is a communication library tailored for mixture-of-experts (MoE) models and expert parallelism (EP), supporting high-throughput, low-latency GPU kernels and low-precision computing, including FP8. Tencent’s Starlink Networking team identified two main bottlenecks: underutilized dual-port NIC bandwidth and CPU control latency. Targeted optimizations for both produced the gains on RoCE and IB. The enhanced framework is now fully open-source and has been deployed in training Tencent’s Hunyuan large model, where it has shown strong versatility in environments built on Tencent’s Starlink and H20 servers, Chinese tech media outlet iThome reported. [iThome, in Chinese]