Element Interconnect Bus (EIB)
The EIB is the internal communication bus of the Cell processor. It interconnects the on-chip system elements: the PPE processor, the memory controller (MIC), the eight SPE coprocessors, and two off-chip I/O interfaces, for a total of 12 participants. The EIB also includes an arbitration unit that functions as a set of traffic lights. In some IBM documents, EIB participants are called 'units'.
Currently, the EIB is implemented as a circular ring of four unidirectional channels, each 16 bytes wide, which counter-rotate in pairs (two clockwise, two counterclockwise). When traffic patterns allow, each channel can carry up to three transactions concurrently. Because the EIB runs at half the system clock rate, the effective rate of a channel is 16 bytes every two system clock cycles. At maximum concurrency, with three active transactions on each of the four rings, the peak instantaneous EIB bandwidth is 96 bytes per system clock cycle (12 simultaneous transactions × 16 bytes / 2 clock cycles). Although IBM often quotes this figure, it is unrealistic to simply scale it by the processor clock speed. The arbitration unit imposes additional constraints, which are discussed below in the section on bandwidth.
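The peak figure can be reproduced with a few lines of arithmetic. The following is a minimal sketch that uses only the numbers quoted in the paragraph above; the constant and function names are illustrative and do not come from any IBM documentation.

```python
# Figures taken from the paragraph above; all names are illustrative.
CHANNELS = 4                      # unidirectional rings
CHANNEL_WIDTH_BYTES = 16          # width of each ring
MAX_TRANSACTIONS_PER_CHANNEL = 3  # concurrent transactions per ring
SYSTEM_CLOCKS_PER_EIB_CLOCK = 2   # the EIB runs at half the system clock


def peak_bytes_per_system_clock():
    """Peak instantaneous EIB bandwidth in bytes per system clock cycle."""
    concurrent = CHANNELS * MAX_TRANSACTIONS_PER_CHANNEL  # 12 transactions
    return concurrent * CHANNEL_WIDTH_BYTES / SYSTEM_CLOCKS_PER_EIB_CLOCK


print(peak_bytes_per_system_clock())  # 96.0
```

This is an upper bound only: it assumes every ring carries three non-overlapping transactions at once, which real traffic patterns and the arbitration constraints will rarely permit.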
David Krolak, IBM engineer and lead designer of the EIB, explained the concurrency model:
A ring can start a new operation every three cycles. Each transfer always takes eight beats. That was one of the simplifications we made; it is optimized for streaming a lot of data. If you do small operations, it does not work quite as well. Think of eight-car trains running around a track: as long as the trains do not run into each other, they can coexist on the track.
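To see why small transactions use the ring poorly, consider a minimal sketch under two assumptions that are not stated outright in the quote: each of the eight ticks of a transfer carries the full 16-byte channel width, and a transfer with a smaller payload still occupies the ring for all eight ticks. The names are hypothetical.

```python
# Assumptions (labeled, not confirmed by the source):
#  - every tick of a transfer moves one full 16-byte channel width
#  - a short transfer still holds its ring slot for all eight ticks
TICKS_PER_TRANSFER = 8
BYTES_PER_TICK = 16


def transfer_capacity_bytes():
    """Bytes one full transfer slot can carry under the assumptions above."""
    return TICKS_PER_TRANSFER * BYTES_PER_TICK


def slot_utilization(payload_bytes):
    """Fraction of a transfer slot that carries useful payload."""
    capacity = transfer_capacity_bytes()
    if not 0 < payload_bytes <= capacity:
        raise ValueError("payload must be between 1 and %d bytes" % capacity)
    return payload_bytes / capacity


for size in (16, 64, 128):
    print(size, slot_utilization(size))
```

Under these assumptions a 16-byte payload fills only one eighth of its slot, while a 128-byte payload fills it completely, which matches the remark that the design favors large transfers.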
Each participant on the EIB has one 16-byte read port and one 16-byte write port. A single participant can read and write at no more than 16 bytes per EIB clock cycle (for simplicity often stated as 8 bytes per system clock cycle). Note that each SPU contains a dedicated DMA management queue that can schedule long sequences of transactions to various destinations without interfering with the computation the SPU is carrying out. These DMA queues can be managed both locally and remotely, providing additional flexibility in the control model.
Data flows along an EIB channel step by step around the ring. Since there are twelve participants, the total number of steps around the channel back to the point of origin is twelve. Six steps is the maximum distance between any pair of participants. An EIB channel is not allowed to carry data that would require more than six steps; such data must take the shorter route in the other direction. The number of steps involved in sending a packet has little impact on transfer latency, because the clock that drives each step is very fast relative to every other consideration. However, longer communication distances are detrimental to the overall performance of the EIB, because they reduce the available concurrency.
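The six-step rule can be illustrated with a short sketch. It assumes the twelve participants are numbered 0 to 11 in ring order and that increasing numbers run clockwise; both the numbering and the function name are hypothetical and do not reflect the physical placement of units on the chip.

```python
PARTICIPANTS = 12               # from the text: twelve units on the ring
MAX_STEPS = PARTICIPANTS // 2   # no transfer may travel more than six steps


def shortest_route(src, dst):
    """Return (direction, steps) for the shorter way around the ring.

    Positions are hypothetical indices 0..11; increasing index is
    assumed to be the clockwise direction.
    """
    for pos in (src, dst):
        if not 0 <= pos < PARTICIPANTS:
            raise ValueError("ring positions must be in the range 0..11")
    clockwise = (dst - src) % PARTICIPANTS
    counterclockwise = (src - dst) % PARTICIPANTS
    if clockwise <= counterclockwise:
        return ("clockwise", clockwise)
    return ("counterclockwise", counterclockwise)


print(shortest_route(0, 3))  # ('clockwise', 3)
print(shortest_route(0, 9))  # ('counterclockwise', 3)
```

Because the two distances always add up to twelve (for distinct positions), the shorter one never exceeds six, so every pair of participants can be served by a ring running in one direction or the other.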
Despite IBM's original desire to implement the EIB as a more powerful crossbar switch, the circular configuration adopted to save resources is rarely a limiting factor in the performance of the Cell chip as a whole. In the worst case, the programmer has to take extra care to schedule communication patterns that let the EIB operate at high levels of concurrency.
David Krolak explained:
Well, in the beginning, early in the development process, several people were pushing for a crossbar switch. The way the bus is designed, you could actually pull out the EIB and put in a crossbar switch if you were willing to devote more silicon space on the chip to wiring. We had to find a balance between connectivity and area, and there simply was not enough room to put in a full crossbar switch. So we came up with this ring structure, which we think is very interesting. It fits within the area constraints and still provides very impressive bandwidth.