Out-of-the-Box Computing
The Mill general-purpose CPU architecture takes new approaches in most major areas of processor architecture. Over the coming months, we will add sections below for the Belt, Operands and Data, the Memory Hierarchy, Protection, Software Pipelining, Branch Prediction, and other areas.
A major portion of the area and power budget of modern high-end CPU cores is devoted to fetching and decoding instructions, both to feed the functional units and to figure out what to do next. The instruction encoding techniques of the Mill CPU architecture allow high-end Mill family members to fetch, decode and issue up to 30 operations per cycle, sustained, within a three-cycle decode pipeline.
The Belt is the data interchange mechanism for the Mill general-purpose CPU architecture, replacing the general registers of other architectures. The Mill's belt is unique both in its programming model and in its implementation at the micro-architecture level. Destination addressing is implicit, yielding more compact instruction encoding. The Belt is integrated with the function call mechanism; it eliminates caller/callee save conventions and callee pre-/postlude instructions, and it supports multi-result calls naturally. The Belt is single-assignment, so rename registers and pipeline phases are unnecessary.
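The programming model described above can be illustrated with a toy software model. This is only a sketch under stated assumptions: the class `Belt`, its methods, the belt length, and the `quot_rem` function are hypothetical names chosen for illustration, and nothing here reflects the actual hardware implementation. It shows three points from the text: operations name their sources by belt position but never name a destination, values are never overwritten (old ones simply fall off the end), and a call can return several results without any save/restore convention.

```python
from collections import deque


class Belt:
    """Toy model of a belt: newest result at position 0, oldest falls off the end."""

    def __init__(self, length=8):  # the length is an illustrative assumption
        self._slots = deque(maxlen=length)

    def drop(self, value):
        """Place a new result at the front; no destination is ever named."""
        self._slots.appendleft(value)

    def get(self, position):
        """Read an operand by position relative to the newest result (0 = newest)."""
        return self._slots[position]

    def add(self, a, b):
        """Hypothetical add: sources are belt positions, the destination is implicit."""
        self.drop(self.get(a) + self.get(b))

    def call(self, function, *arg_positions):
        """Hypothetical call: the callee gets a fresh belt holding only its arguments.

        The caller's belt is untouched during the call, so nothing needs saving;
        every returned value is dropped onto the caller's belt afterwards.
        """
        callee = Belt(self._slots.maxlen)
        for position in reversed(arg_positions):
            callee.drop(self.get(position))  # first argument ends up at position 0
        for result in function(callee):
            self.drop(result)


def quot_rem(belt):
    """Hypothetical two-result function: returns quotient and remainder."""
    return divmod(belt.get(0), belt.get(1))


belt = Belt(length=4)
belt.drop(3)
belt.drop(17)                # belt, newest first: 17, 3
belt.call(quot_rem, 0, 1)    # belt: 2, 5, 17, 3  (remainder, quotient, ...)
belt.add(0, 1)               # belt: 7, 2, 5, 17  (the 3 fell off the end)
```

Because every result lands in a new position instead of overwriting an old one, the model never has two writers to the same location, which is the property the text calls single-assignment.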
The Mill uses a novel load instruction that tolerates load misses as well as hardware out-of-order approaches do, while avoiding the need for expensive load buffers and completely avoiding false aliasing. In addition, store misses are impossible on a Mill, and a large fraction of the memory traffic of a conventional processor can be omitted entirely.
The Mill uses a novel prediction mechanism; it predicts transfers rather than branches. It can do so for all code, including code that has never been executed, running well ahead of execution so as to mask all cache latency and most memory latency. It needs no area- and power-hungry instruction window, using instead a very short decode pipeline and direct in-order issue and execution.
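One way to picture predicting transfers rather than branches is a table that records, for each block of code entered, where control is expected to go next. The sketch below assumes exactly that and nothing more: the table layout, the function name `run_ahead`, and the addresses are hypothetical and are not taken from the Mill's actual predictor. It shows how such a table lets a predictor chain from one prediction to the next and produce a list of upcoming code to prefetch without looking at the code itself.

```python
def run_ahead(exit_table, entry, depth):
    """Follow predicted transfers starting at `entry`, up to `depth` steps.

    `exit_table` maps a block's entry address to the predicted entry address
    of the block that control will transfer to next (a hypothetical layout).
    Returns the predicted addresses in order, suitable for prefetching.
    """
    path = []
    current = entry
    for _ in range(depth):
        predicted = exit_table.get(current)
        if predicted is None:
            break  # no prediction available for this block; stop chaining
        path.append(predicted)
        current = predicted
    return path


# Hypothetical table: three blocks forming a loop.
exit_table = {0x100: 0x240, 0x240: 0x180, 0x180: 0x100}
prefetch = run_ahead(exit_table, 0x100, 4)
# prefetch == [0x240, 0x180, 0x100, 0x240]
```

Each lookup depends only on the previous prediction, not on decoded instructions, which is why a scheme of this shape can run arbitrarily far ahead of execution.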
The Mill conveys some of the semantics of execution in the form of metadata attached to the arguments of operations, in addition to that expressed by the operation encodings in the executed code stream. Metadata propagates through execution, following rules specified by the architecture, although it may be altered explicitly by code when needed.
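The idea of metadata travelling with operands can be sketched as follows. The two metadata fields used here (an operand width and an "invalid" marker), the propagation rules, and the names `Operand`, `add`, and `widen` are all assumptions made for illustration; the text above does not specify which metadata the Mill carries or what its rules are. The point of the sketch is that a single `add` needs no width in its encoding, because the width arrives with the arguments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operand:
    value: int
    width: int             # hypothetical metadata: operand width in bits
    invalid: bool = False  # hypothetical metadata: result must not be relied on


def add(a, b):
    """One add for every width: the arguments' metadata selects the behavior.

    Assumed rules: the result takes the wider of the two widths, wraps
    modulo 2**width, and is invalid if either argument is invalid.
    """
    width = max(a.width, b.width)
    value = (a.value + b.value) % (1 << width)
    return Operand(value, width, a.invalid or b.invalid)


def widen(a, width):
    """Explicit alteration of metadata by code: give an operand a larger width."""
    return Operand(a.value, max(a.width, width), a.invalid)


narrow = add(Operand(250, 8), Operand(10, 8))           # wraps at 8 bits
wide = add(widen(Operand(250, 8), 16), Operand(10, 8))  # does not wrap at 16 bits
tainted = add(Operand(1, 8, invalid=True), Operand(2, 8))
```

Here `narrow.value` is 4 because 260 wraps at 8 bits, `wide.value` is 260, and `tainted.invalid` is `True` because the marker propagated from the first argument.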