HyperTransport interconnect technology provides the lowest possible latency for processor-to-processor and processor-to-peripheral links.
HyperTransport, in contrast to other emerging I/O technologies, was intentionally designed as a unified interconnect channel that exhibits the lowest possible latency and introduces the lowest possible overhead in supporting packet-based data streams.
One aspect of HyperTransport's low latency capability is the parallel nature of its link structure. A single forwarded clock is used per set of 8 data path bits, enabling very low latency point-to-point data transfer. In contrast, serial links such as Serial RapidIO and PCI Express eliminate the dedicated clock signal by adding extensive clock encoding/decoding requirements at both ends of the link. This introduces significant clocking overhead, forces the addition of serializing/deserializing logic, and increases latency well beyond ideal levels for high performance chip-to-chip communication.
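The encoding tax on a serial link can be made concrete with a little arithmetic. The sketch below (a simplification with an assumed raw line rate; the 2.5 GT/s figure is illustrative, not from the text) contrasts an 8b/10b-encoded serial lane, which spends 2 of every 10 transmitted bits on clock recovery, with a source-synchronous parallel link whose forwarded clock travels on its own signal:

```python
# Illustrative: effective data rate of an 8b/10b serial lane vs. a
# parallel link with a forwarded clock. Rates are assumed figures.

def effective_rate_8b10b(raw_gbps: float) -> float:
    """Usable data rate of an 8b/10b-encoded serial link:
    8 data bits are carried in every 10 line bits (20% overhead)."""
    return raw_gbps * 8 / 10

def effective_rate_forwarded_clock(raw_gbps: float) -> float:
    """Usable data rate of a source-synchronous parallel link:
    the clock travels on its own pin, so every data bit is payload."""
    return raw_gbps

serial = effective_rate_8b10b(2.5)            # e.g. a 2.5 GT/s lane
parallel = effective_rate_forwarded_clock(2.5)
print(f"serial: {serial} Gb/s, parallel: {parallel} Gb/s")
```

The 20% figure here is the same clock encoding/decoding overhead the comparison later in this section refers to.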
A second aspect of HyperTransport's low latency capabilities is the low data packet overhead. As compared to other packet-based approaches, HyperTransport provides the lowest packet header overhead: 8 bytes for a write operation and 12 bytes for a read operation (for a write request, there is an 8-byte Write Request control packet followed by the data packet; for a read request, there is an 8-byte Read Request control packet, followed by a 4-byte Read Response control packet, followed by the data packet). This compares very favorably to PCI Express, for example, as shown below:
Packet overhead comparison between HyperTransport and PCI Express
HyperTransport needs only an 8-byte header (control packet) per packet data payload, while PCI Express uses multiple layers of encoding with 20 to 24 bytes of overhead to move even a small command or data payload. This multi-layer overhead is on top of the 20 percent clock encoding/decoding overhead of the link serializing/deserializing circuitry.
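Using the overhead figures quoted above, a rough payload-efficiency comparison can be sketched as follows. The payload sizes and the worst-case 24-byte PCIe figure are illustrative assumptions; the 10/8 factor models the 8b/10b line-coding expansion on the serial side:

```python
# Rough payload efficiency: bytes of payload delivered per byte on the
# wire, using the overhead figures quoted in the text. Payload sizes
# are illustrative assumptions.

def ht_efficiency(payload: int, header: int = 8) -> float:
    """HyperTransport: an 8-byte control packet per data payload."""
    return payload / (payload + header)

def pcie_efficiency(payload: int, overhead: int = 24) -> float:
    """PCI Express: 20-24 bytes of layered overhead (24 assumed here),
    then every byte expanded 10/8 by 8b/10b line coding."""
    wire_bytes = (payload + overhead) * 10 / 8
    return payload / wire_bytes

for payload in (16, 64, 256):
    print(payload, round(ht_efficiency(payload), 3),
          round(pcie_efficiency(payload), 3))
```

As the payload shrinks toward the size of a single command, the fixed per-packet overhead dominates, which is why header size matters most for the short control transactions common in chip-to-chip traffic.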
HyperTransport Priority Request Interleaving
Another aspect of HyperTransport's low latency capability is the provision of a native mechanism, Priority Request Interleaving™, or PRI, that enables a high priority request command (only 8 bytes long) to be inserted within a potentially long, lower priority data transfer. A typical use is shown in the figure below. While data transfer 1 is underway between peripheral B and the host, the need arises for peripheral A to start a data transfer from the host. Without PRI, transfer 2 would have to wait until transfer 1 completes; should transfer 1 be the answer to a cache miss, for instance, the latency for transfer 2 would become prohibitive. With PRI, a control packet is promptly inserted within transfer 1's data stream, instructing the link to initiate data transfer 2 on the other link channel concurrently with the completion of data transfer 1. This mechanism, unique to HyperTransport technology, greatly reduces the latency of HyperTransport-based systems.
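The latency benefit of PRI can be illustrated with a toy queuing model. All the numbers below (link throughput, transfer size, progress point) are hypothetical; only the 8-byte request packet size comes from the text. Without PRI, transfer 2's request command queues behind the remainder of transfer 1; with PRI, it is interleaved into the stream almost immediately:

```python
# Toy latency model of Priority Request Interleaving (PRI).
# LINK_BYTES_PER_NS and the transfer sizes are hypothetical figures.

LINK_BYTES_PER_NS = 1.6   # assumed link throughput
REQUEST_BYTES = 8         # PRI request control packet size (per the text)

def request_latency_ns(transfer1_bytes: int, already_sent: int,
                       pri: bool) -> float:
    """Time until transfer 2's request command finishes crossing the link."""
    if pri:
        # The 8-byte request is interleaved into transfer 1's stream
        # right away, so only its own bytes stand in the way.
        queued = REQUEST_BYTES
    else:
        # The request must wait for the remainder of transfer 1.
        queued = (transfer1_bytes - already_sent) + REQUEST_BYTES
    return queued / LINK_BYTES_PER_NS

print(request_latency_ns(4096, 256, pri=False))  # queued behind ~3.8 KB
print(request_latency_ns(4096, 256, pri=True))   # only the 8-byte packet
```

The gap widens with the size of the in-flight transfer, which is the scenario the figure depicts: the longer transfer 1 runs, the more PRI saves.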
White Paper: Latency Comparison between HyperTransport and PCIe in Communications Systems
This article discusses why latency is an important parameter of chip-to-chip interconnects, then contrasts and compares the latency experienced by transactions carried over two of the leading chip-to-chip interconnect standards, HyperTransport and PCI Express (PCIe). It presents a couple of usage scenarios involving read accesses and derives the latency performance of each.