Networks are the performance bottleneck of server cluster platforms
Data transferred from one server node to any other node in a cluster must travel through the cluster network. Because network cables carry only a few copper wires or optical fibers, network controllers must serialize the data to be transferred from parallel form - as it is handled by the CPUs and memory sub-systems within each server node - into serial form - i.e. multiple packets sent one by one, one bit at a time, from the source node to the destination node. Because of this serialization, data travels from point to point far more slowly than it moves inside each node. Therefore, no matter how much computing power and processing speed at the server level go up (faster CPUs, faster clock rates, faster memory, faster storage, etc.), node-to-node data transfers must always cross the "slow" network, and the network itself is effectively the most important factor in cluster performance (or lack thereof).
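As a rough, back-of-the-envelope illustration of what serialization costs, the sketch below estimates the time needed just to clock a single packet onto the wire at two common link rates; the packet size and link speeds are illustrative assumptions, not measured figures.

```python
# Rough illustration of serialization delay: the time needed just to clock
# one packet onto the wire, before any propagation or switching delay.
# The packet size and link rates below are illustrative assumptions.

PACKET_BYTES = 1500  # a typical Ethernet frame size

for name, gbit_per_s in [("1 Gb/s Ethernet", 1), ("10 Gb/s Ethernet", 10)]:
    bits = PACKET_BYTES * 8
    serialization_delay_us = bits / (gbit_per_s * 1e9) * 1e6
    print(f"{name}: ~{serialization_delay_us:.2f} us per {PACKET_BYTES}-byte packet")

# Inside a node, the same 1500 bytes cross a wide parallel memory interface in
# tens of nanoseconds, which is why the serialized network hop dominates
# node-to-node transfer time.
```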
Bandwidth vs. Latency
Beware of assuming that increasing network bandwidth by a given percentage increases the operational speed of software applications running on the cluster by the same percentage, even though most vendors of status quo networking gear make every effort to convince industry and technology adopters that it does.
In reality, not even close. The diagram excerpt below, taken from our "Latency Performance" overview, illustrates the point.
In reality, 10GbE's 10x multiplier over GbE relates more closely to the Total Cost of Ownership (TCO) penalty than to its performance gain. We expect this to be the main reason why 1 Gb/s Ethernet has continued to thrive and 10GbE has been so slow to be adopted: likely users appear to have been waiting for a meaningful cut in 10GbE hardware costs to justify the move.
Effectively, increases in network bandwidth (the network's "capacity" factor) do not yield proportional reductions in latency (the time the network takes to complete a data transfer, and therefore the network's actual "speed"). Since the network is the data-transfer bottleneck of the cluster, network latency is the most important factor in cluster performance. Not yet convinced? Then we recommend that you review the non-technical, easy-to-absorb evidence in the accompanying two-page document.
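A simple way to see why bandwidth and application speed are not proportional is the classic first-order transfer-time model, transfer time ≈ latency + message size / bandwidth. The sketch below applies it with illustrative latency and message-size values (assumptions, not measured data) to show how little a 10x bandwidth jump buys when messages are small and latency stays the same.

```python
# First-order model of a node-to-node transfer:
#   transfer_time ~= end_to_end_latency + message_size / bandwidth
# The latency value and message sizes below are illustrative assumptions only.

LATENCY_S = 10e-6  # assume ~10 microseconds end-to-end network latency
MESSAGE_SIZES = [512, 64 * 1024, 16 * 1024 * 1024]  # bytes: small, medium, large

def transfer_time(size_bytes, bandwidth_bit_s, latency_s=LATENCY_S):
    return latency_s + (size_bytes * 8) / bandwidth_bit_s

for size in MESSAGE_SIZES:
    t_1g = transfer_time(size, 1e9)    # 1 Gb/s link
    t_10g = transfer_time(size, 10e9)  # 10 Gb/s link
    print(f"{size:>10} bytes: 1GbE {t_1g * 1e6:9.1f} us, "
          f"10GbE {t_10g * 1e6:9.1f} us, speedup {t_1g / t_10g:4.1f}x")

# With these assumed numbers, the 512-byte message speeds up by less than 1.4x,
# because the fixed latency term dominates; only very large transfers approach
# the full 10x bandwidth advantage.
```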
How does HyperShare compare in hardware latency with status quo Ethernet networks in a sample cluster of more than 576 server nodes? Keeping in mind that for latency, lower values mean better performance, here is another excerpt from our much more detailed "Latency Performance" overview, which you can view and download below.
Ethernet latency figures assume no oversubscription, no blocking (data can reach any node from any other node), and no packet loss.
Native HyperShare is
7.5x better than Standard 10 Gb/s Ethernet
and 12x better than Standard 1 Gb/s Ethernet
in Latency Performance
The overview below provides more details about 1 Gb/s Ethernet, 10 Gb/s Ethernet and Native HyperShare latency performance (also viewable and downloadable in PDF form).
Native HyperShare's latency performance, coupled with cluster-wide CapEx and OpEx savings opportunities, makes HyperShare the indisputable best choice for data center owners and CIOs searching for next-generation cluster network technology.