
The Mill general-purpose CPU architecture takes new approaches in most major areas of processor architecture. Over the coming months, we will add sections below for the Belt, Operands and Data, the Memory Hierarchy, Protection, Software Pipelining, Branch Prediction, and other areas.


Topics

Instruction Encoding

A major portion of the area and power budget of modern high-end CPU cores is devoted to fetching and decoding instructions, both to feed the functional units and to determine what to execute next. The instruction encoding techniques of the Mill CPU architecture allow high-end Mill family members to fetch, decode and issue up to 30 opcodes per cycle, sustained, within a three-cycle decode pipeline.

white paper, talk  more…


The Belt

The Belt is the data interchange mechanism for the Mill general-purpose CPU architecture, replacing the general registers of other architectures. The Mill’s belt is unique both in its programming model and in its implementation at the micro-architecture level. Destination addressing is implicit, yielding more compact instruction encoding. The Belt is integrated with the function call mechanism; it eliminates caller/callee save conventions and callee pre-/postlude instructions, and it supports multi-result calls naturally. The Belt is single-assignment, so rename registers and pipeline phases are unnecessary.
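The implicit destination addressing described above can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the Mill's actual implementation: the class name `Belt`, the belt length of 8, and the `drop`/`get`/`add`/`mul` helpers are all hypothetical names chosen for illustration. The idea shown is that each operation names only its source positions, while its result always lands at the front of the belt and is never overwritten in place.

```python
from collections import deque


class Belt:
    """Toy model of a fixed-length belt (length 8 is an arbitrary assumption)."""

    def __init__(self, length=8):
        # A bounded deque: pushing at the front drops the oldest value at the back.
        self._slots = deque(maxlen=length)

    def drop(self, value):
        # Every result lands at position 0; no destination is ever named.
        self._slots.appendleft(value)

    def get(self, pos):
        # Operands are addressed by their current position on the belt.
        return self._slots[pos]

    def add(self, a, b):
        self.drop(self.get(a) + self.get(b))

    def mul(self, a, b):
        self.drop(self.get(a) * self.get(b))

    def contents(self):
        return list(self._slots)


belt = Belt()
belt.drop(3)      # belt: [3]
belt.drop(4)      # belt: [4, 3]
belt.add(0, 1)    # 4 + 3, belt: [7, 4, 3]
belt.mul(0, 1)    # 7 * 4, belt: [28, 7, 4, 3]
result = belt.get(0)
```

Because values are only ever appended and then age off the far end, no slot is written twice, which is the single-assignment property in miniature.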

talk  more…


Memory

The Mill uses a novel load instruction that tolerates load misses as well as hardware out-of-order approaches do, while avoiding the need for expensive load buffers and completely avoiding false aliasing. In addition, store misses are impossible on a Mill, and a large fraction of the memory traffic of a conventional processor can be omitted entirely.

talk  more…


Prediction

The Mill uses a novel prediction mechanism; it predicts transfers rather than branches. It can do so for all code, including code that has never yet been executed, running well ahead of execution so as to mask all cache latency and most memory latency. It needs no area- and power-hungry instruction window, using instead a very short decode pipeline and direct in-order issue and execution.

talk more…


Metadata

The Mill conveys some of the semantics of execution in the form of metadata attached to the arguments of operations, in addition to that expressed by the operation encodings in the executed code stream. Metadata propagates through execution, following rules specified by the architecture, although it may be altered explicitly by code when needed.
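As a rough illustration of metadata travelling with operands, the sketch below attaches a single flag to each value and propagates it through an operation. Everything here is an assumption made for the example: the `Operand` type, the `invalid` bit, the propagation rule in `add`, and the `clear_invalid` helper are hypothetical and are not taken from the Mill specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operand:
    value: int
    invalid: bool = False  # hypothetical metadata bit, not a documented Mill field


def add(a: Operand, b: Operand) -> Operand:
    # Illustrative propagation rule: the result is flagged if either input is.
    return Operand(a.value + b.value, a.invalid or b.invalid)


def clear_invalid(a: Operand) -> Operand:
    # Code may alter metadata explicitly; here it simply resets the flag.
    return Operand(a.value, False)


clean = add(Operand(1), Operand(2))
tainted = add(Operand(1, invalid=True), Operand(2))
repaired = clear_invalid(tainted)
```

The operation itself carries no flag-handling variant; the behaviour follows from the metadata on its arguments, which is the general point of the paragraph above.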

talk more…


Execution

A perennial objection to wide-issue CPU architectures such as VLIWs and the Mill is that there is insufficient instruction level parallelism (ILP) in programs to make effective use of the available functional width. This talk addresses the ILP issue, describing how the Mill is able to achieve much higher IPC even when the nominal ILP is relatively low.

talk more…


Security

Software bugs have always been a problem, but in recent years bugs have become an even more serious concern as they are exploited to breach system security for privacy violation, theft, and even terrorism or acts of war. The Mill CPU architecture addresses software robustness in three basic ways. This talk describes some of the Mill CPU features that defend against well-known error and exploit patterns.

talk more…


Specification

The Mill CPU architecture defines a generic Mill processor, from which a family of specific processors can be configured. A particular configuration for a Mill CPU family member is defined by a specification, which is processed by Mill configuration software to build a member-specific assembler, simulator, compiler back-ends, Verilog for the hardware implementation, documentation, and other tools and components.

talk more…


Pipelining

On a conventional machine, pipelining requires lengthy prelude and postlude instruction sequences to get the pipeline started and wound down, frequently destroying the benefit of pipelining the main body. Mill pipelines have neither prelude nor postlude, and early conditional exit has no added cost.

talk more…