Register Renaming
Register renaming is a technique that eliminates WAW (write-after-write) and WAR (write-after-read) hazards dynamically in hardware during execution. WAW hazards arise frequently in an out-of-order architecture when instructions that write to the same register are reordered. If the results are written in a different order than the program intended, the architectural state ends up wrong. WAR hazards occur when an instruction overwrites a value in the register file that is still needed by another instruction. A WAR hazard is an anti-dependence and a WAW hazard is an output dependence. Both are name dependences: the instructions only share a register name, and no data flows between them.
In contrast, RAW hazards must be preserved. They represent a real data dependence, whereby an instruction must wait for the necessary data to be written by another instruction before it can proceed.
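The three hazard types can be told apart by comparing the register fields of two instructions in program order. The following is a minimal sketch for illustration only; the `Instr` tuple and the `hazards` function are hypothetical names and not part of Leszy. It treats all registers alike and ignores that x0 is hardwired to zero in RISC-V.

```python
from typing import NamedTuple, Optional, Set


class Instr(NamedTuple):
    """Register fields of one instruction (hypothetical helper type)."""
    rd: Optional[int]   # destination register, None if nothing is written
    rs1: Optional[int]  # first source register, None if unused
    rs2: Optional[int]  # second source register, None if unused


def hazards(older: Instr, younger: Instr) -> Set[str]:
    """Return the hazards between two instructions in program order."""
    found = set()
    older_sources = {older.rs1, older.rs2} - {None}
    younger_sources = {younger.rs1, younger.rs2} - {None}
    # RAW: the younger instruction reads what the older one writes (true dependence).
    if older.rd is not None and older.rd in younger_sources:
        found.add("RAW")
    # WAR: the younger instruction overwrites a register the older one still reads.
    if younger.rd is not None and younger.rd in older_sources:
        found.add("WAR")
    # WAW: both instructions write the same register.
    if older.rd is not None and older.rd == younger.rd:
        found.add("WAW")
    return found


add = Instr(rd=5, rs1=6, rs2=7)   # add x5, x6, x7
sub = Instr(rd=6, rs1=5, rs2=8)   # sub x6, x5, x8
print(sorted(hazards(add, sub)))  # ['RAW', 'WAR']
```

Only the RAW entry describes data flowing from one instruction to the other. The WAR and WAW entries disappear as soon as the two instructions write to different physical registers.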
The limited size of the register file is the root cause of WAW and WAR hazards. An infinitely large register file would prevent all WAW and WAR hazards, because no register would ever have to be reused, but this is obviously not feasible in practice. A second, more practical reason for the limitation is binary compatibility: a register file with a fixed number of registers dictated by the ISA allows programs to be executed on all processors implementing the same ISA. In this particular case, the RISC-V ISA1 defines 32 integer registers.
Dynamic register renaming makes it possible to use more physical registers than the architectural registers defined by the ISA. Originally developed as Tomasulo's algorithm, it has since been extended into different forms. One of the main distinguishing features is the type of renaming: implicit or explicit. Implicit renaming keeps the size of the register file as defined by the ISA and uses other components, such as the reservation station entries, as additional registers. Explicit renaming uses a physically larger register file and manages a table that holds the current mapping of architectural registers to physical registers. As Leszy uses explicit register renaming, this documentation focuses on this type and explains the implementation details of explicit renaming in Leszy.
Concept
(TODO: Rephrase and find other good examples. BOOM only lists a few old cores; ARM seems to use it, but this is not explicitly documented.)
Leszy uses explicit register renaming, like BOOM2, ... . The register renaming in Leszy is heavily inspired by the BOOM2 core: it uses a physical register file that is larger than the architectural register file defined by the ISA, and the number of physical registers is configurable. As hinted in the previous section, a larger physical register file can eliminate more WAW and WAR hazards. In practice, a balance has to be found between resource usage and the occasional stalls that occur when all physical registers are in use. The ideal size depends heavily on the number of instructions concurrently in execution, which in turn depends on the number of execution units and the size of the reservation stations.
The implementation of register renaming primarily consists of the Rename-Table, which holds the current mapping from architectural registers to physical registers, and the Free-List, which holds the currently unused physical registers available for renaming.
The concept of explicit register renaming
In a single-issue setting the renaming is straightforward. As shown in the figure, all architectural register addresses are translated by the Rename-Table, with the architectural register addresses serving as indices into the table. For the source registers (rs1, rs2, marked in green), the resulting physical register addresses are used to read data from the physical register file. For the destination register (rd, marked in red), the physical destination register is read from the Free-List. This physical register will hold the result of this particular instruction. No other instruction will write to this physical register until it returns to the Free-List, which happens when the data it contains is no longer needed. The physical destination register obtained from the Free-List also replaces the current Rename-Table entry of rd. Following instructions that require the result of the current instruction therefore see the updated mapping and read the right data.
In order to keep track of which physical registers can be returned to the Free-List, the architectural destination register is also looked up in the Rename-Table before its entry is replaced. The resulting physical register, called the stale register, can return to the Free-List when the corresponding instruction commits. In architectural terms, this means that the data in an architectural register has been overwritten by this instruction. The data is lost and cannot be restored. For the physical register, this means that no other instruction will ever read from it again. It has to return to the Free-List so that it can be reused for renaming.
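The single-issue flow described above can be modelled in a few lines of software. This is a behavioural sketch under stated assumptions, not the Leszy hardware: the class name `Renamer`, its method names, the identity mapping at reset, and the FIFO order of the Free-List are all chosen for illustration. It also renames x0 like any other register, which a real RISC-V core would typically not do.

```python
from collections import deque


class Renamer:
    """Behavioural model of explicit renaming for one instruction per cycle."""

    def __init__(self, num_arch=32, num_phys=64):
        assert num_phys > num_arch
        # Assumed reset state: architectural register i maps to physical register i.
        self.rename_table = list(range(num_arch))
        # All remaining physical registers are free.
        self.free_list = deque(range(num_arch, num_phys))

    def rename(self, rd, rs1, rs2):
        """Return (prd, prs1, prs2, stale), or None if no register is free (stall)."""
        if not self.free_list:
            return None
        prs1 = self.rename_table[rs1]    # sources use the current mapping
        prs2 = self.rename_table[rs2]
        stale = self.rename_table[rd]    # old mapping of rd, freed at commit
        prd = self.free_list.popleft()   # new destination from the Free-List
        self.rename_table[rd] = prd      # later readers of rd see the new mapping
        return prd, prs1, prs2, stale

    def commit(self, stale):
        """Return the stale register to the Free-List when the instruction commits."""
        self.free_list.append(stale)


r = Renamer(num_arch=32, num_phys=34)    # only two spare physical registers
print(r.rename(rd=5, rs1=6, rs2=7))      # (32, 6, 7, 5)
print(r.rename(rd=5, rs1=5, rs2=5))      # (33, 32, 32, 32)
print(r.rename(rd=1, rs1=2, rs2=3))      # None: Free-List is empty, stall
r.commit(5)                              # first instruction commits, p5 is free again
print(r.rename(rd=1, rs1=2, rs2=3))      # (5, 2, 3, 1)
```

The second instruction writes x5 again but receives a different physical register, so the WAW hazard with the first instruction is gone. Its sources read physical register 32, which preserves the RAW dependence.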
Implementation
Schematic for the Register Renaming (TODO: finish drawing)
In a single-issue core, register renaming does not require any components beyond those described so far. This is different in a superscalar core. While the renaming of the first instruction in an issue-window is the same as in a single-issue core, subsequent instructions require additional bypasses to resolve the dependencies within the same issue-window. A dependency within an issue-window occurs when an instruction uses, for any of its registers (rs1, rs2 or rd), an architectural register that is the destination register of a previous instruction in the same issue-window. In this case the corresponding physical register address must be forwarded from the Free-List instead of being read from the Rename-Table. The figure shows the case of a two-issue-window, which results in three additional multiplexers and comparators. For larger issue-windows, additional multiplexers must be cascaded, as there may be dependencies on any of the previous instructions. This results in long combinatorial paths that cannot be broken up by registers, which is one aspect that can limit the feasible maximum size of the issue-window.
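The bypass logic for an issue-window can be sketched in software as follows. This is an illustrative model only: the function names `bypass_lookup` and `rename_group` are hypothetical, and the model assumes that all Rename-Table reads of one window see the table state from before the window, which is why a snapshot is taken. Each comparison in `bypass_lookup` stands for one comparator, and each early return for the multiplexer that selects the forwarded Free-List entry.

```python
from collections import deque


def bypass_lookup(arch, index, group, new_prds, table_snapshot):
    """Translate one architectural register for the instruction at `index`."""
    # Check older instructions of the same window, youngest first.
    for j in range(index - 1, -1, -1):
        if group[j][0] == arch:          # comparator: same register as an older rd
            return new_prds[j]           # forward the Free-List entry
    return table_snapshot[arch]          # no match: use the Rename-Table


def rename_group(rename_table, free_list, group):
    """Rename a window of (rd, rs1, rs2) tuples given in program order.

    Returns a list of (prd, prs1, prs2, stale), or None if the Free-List
    holds too few registers and the window would have to stall.
    """
    if len(free_list) < len(group):
        return None
    table_snapshot = list(rename_table)  # state seen by all parallel reads
    new_prds = [free_list.popleft() for _ in group]
    renamed = []
    for i, (rd, rs1, rs2) in enumerate(group):
        prs1 = bypass_lookup(rs1, i, group, new_prds, table_snapshot)
        prs2 = bypass_lookup(rs2, i, group, new_prds, table_snapshot)
        stale = bypass_lookup(rd, i, group, new_prds, table_snapshot)
        renamed.append((new_prds[i], prs1, prs2, stale))
    # Update the table in program order so the youngest writer wins.
    for (rd, _, _), prd in zip(group, new_prds):
        rename_table[rd] = prd
    return renamed


table = list(range(32))                  # assumed reset mapping
free = deque([32, 33])
window = [(5, 6, 7), (5, 5, 8)]          # both instructions write x5
print(rename_group(table, free, window)) # [(32, 6, 7, 5), (33, 32, 8, 32)]
```

For a window of two instructions, the second instruction performs three bypass lookups (rs1, rs2 and rd), which corresponds to the three comparators and multiplexers mentioned above. The loop in `bypass_lookup` grows with the position in the window, which mirrors the cascaded multiplexers of larger windows.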
Speculative Execution
TODO: Write