Superscalar processor

Superscalarity is a means of increasing the instructions per cycle (IPC) of a processor pipeline and, consequently, its throughput. Unlike a scalar pipeline, a superscalar pipeline has a wider data path, which lets multiple instructions advance through the pipeline in parallel. It thereby exploits instruction-level parallelism (ILP), whereby multiple instructions of the same program or thread are executed at the same time. ILP must be distinguished from simultaneous multithreading (SMT), in which instructions from different threads are executed in parallel on the same core; ILP parallelizes the instructions of one thread only.
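To make the idea of ILP concrete, the following minimal sketch checks whether two instructions of the same thread could be executed at the same time. It is a toy model, not a description of any real processor: the `Instr` type, the `independent` function, and the register names are assumptions made for illustration, and only read-after-write and write-after-write conflicts are considered.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    """Hypothetical three-operand instruction: dest = f(srcs)."""
    dest: str              # register written by the instruction
    srcs: Tuple[str, ...]  # registers read by the instruction


def independent(first: Instr, second: Instr) -> bool:
    """Return True if `second` neither reads nor overwrites the result of `first`."""
    read_after_write = first.dest in second.srcs
    write_after_write = first.dest == second.dest
    return not (read_after_write or write_after_write)


# r1 = r2 + r3 and r4 = r5 + r6 share no registers.
add1 = Instr("r1", ("r2", "r3"))
add2 = Instr("r4", ("r5", "r6"))
# r7 = r1 + r4 needs the results of both earlier instructions.
add3 = Instr("r7", ("r1", "r4"))

print(independent(add1, add2))  # True: the pair could execute in parallel
print(independent(add1, add3))  # False: add3 must wait for add1
```

In this model, `add1` and `add2` are candidates for parallel execution, while `add3` depends on both and has to follow them.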

Figure: Comparison of a scalar pipeline and a superscalar pipeline

As shown in the diagram, a scalar pipeline holds at most one instruction in each stage at any given time. In a superscalar pipeline, however, multiple instructions can occupy the same stage. How many depends on the width of the data path. In this example, the width is two instructions per stage, but wider paths are possible. Without interruptions such as branches, jumps, or cache misses, this example would double the IPC, and thus the throughput, of the pipeline. Another metric for ILP is cycles per instruction (CPI), the reciprocal of IPC. The theoretical minimum for a scalar pipeline is $1$, while an ideal superscalar pipeline of width $w$ can reach a CPI as low as $1/w$. Increasing the width of the data path requires more hardware resources. However, the increase grows faster than linearly, because additional hardware must be built into the pipeline to resolve dependencies between the instructions that advance together in one instruction window.
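The effect of the width on IPC and CPI can be checked with a short calculation. The sketch below assumes an idealized pipeline with no stalls at all (no branches, cache misses, or dependencies), in which a full group of instructions enters the first stage every cycle. The function names, the five-stage depth, and the instruction count are assumptions chosen for illustration.

```python
import math


def ideal_cycles(n_instr: int, stages: int, width: int = 1) -> int:
    """Cycles needed to run n_instr through an ideal pipeline without stalls."""
    groups = math.ceil(n_instr / width)  # instruction groups entering the pipeline
    return stages + groups - 1           # fill the pipeline once, then one group per cycle


def ipc(n_instr: int, stages: int, width: int = 1) -> float:
    """Instructions per cycle under the ideal model."""
    return n_instr / ideal_cycles(n_instr, stages, width)


def cpi(n_instr: int, stages: int, width: int = 1) -> float:
    """Cycles per instruction, the reciprocal of IPC."""
    return ideal_cycles(n_instr, stages, width) / n_instr


# 1000 instructions in a hypothetical five-stage pipeline.
print(round(ipc(1000, 5, width=1), 3))  # 0.996 (scalar)
print(round(ipc(1000, 5, width=2), 3))  # 1.984 (two-wide superscalar)
print(round(cpi(1000, 5, width=2), 3))  # 0.504
```

For long instruction streams the pipeline fill time becomes negligible, so the IPC approaches the width and the CPI approaches its reciprocal. Real pipelines stay below these values because of the interruptions and dependencies described above.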

The width of the data path can be broken down further into metrics such as issue width and dispatch width. These are specific to the concrete implementation.