Tencent’s tech team has optimized DeepSeek’s open-source DeepEP communication framework, boosting its performance across different network environments, according to the Chinese AI startup. Testing showed a 100% improvement on RoCE networks and a 30% gain on InfiniBand (IB), offering more efficient communication for AI model training. On GitHub, DeepSeek acknowledged that the Chinese tech giant’s contribution had led to a “huge speedup.” DeepEP is a communication library tailored for mixture-of-experts (MoE) models and expert parallelism (EP), providing high-throughput, low-latency GPU kernels and supporting low-precision operations, including FP8. Tencent’s Starlink Networking team identified two main bottlenecks: underutilized dual-port NIC bandwidth and CPU control latency. Targeted optimizations for both produced the gains. The enhanced framework is now fully open-source and has been deployed in training Tencent’s Hunyuan large model, demonstrating strong versatility in environments built on Tencent’s Starlink and H20 servers, Chinese tech media outlet iThome reported. [iThome, in Chinese]