Author: | Shing Kong, March 2001 |
---|---|
Original URL: | http://www.cs.wisc.edu/~markhill/kong/ |
Copyright: | Copyright (c) 2001 by Shing Ip Kong. All Rights Reserved. |
Revisions by: | Milo Martin, January 2006 (with permission) |
This document was originally written by Shing Kong and published in March 2001. In January 2006, Milo Martin converted the document to HTML (via rST) and removed references to files not part of the distribution.
The goal of this document is to summarize some ideas I find useful in logic design and Verilog coding (Note 1). Logic design is not the same as Verilog coding. One common mistake of some inexperienced logic designers is to treat logic design as a Verilog programming task. This often results in Verilog code that is hard to understand, hard to implement, and hard to debug.
Logic design is a process:
Verilog coding, on the other hand, is a modeling task. More specifically, after one has done some preliminary designs on the datapaths and controllers, Verilog code is then used to:
Note
Verilog is used as an example in this document. The ideas discussed in this document, however, should also be applicable to other Hardware Description Languages (such as VHDL) with minor adjustments.
The rest of this document is organized as follows:
Section 2: discusses the most important rule of logic design: keep it easy to understand. This section also introduces some basic Verilog coding guidelines.
Section 3: discusses the art of dividing a design into high-level modules and then how these modules can be divided into datapaths and controllers.
Section 4: discusses the logic design and Verilog coding guidelines for the datapath.
Section 5: discusses the logic design and Verilog coding guidelines for the controller.
Section 6: discusses some miscellaneous Verilog coding guidelines.
Section 7: is a summary of all the logic design and Verilog coding guidelines introduced in this document. This summary serves as a quick reference for readers who either: (a) may not have the time to read this entire document, or (b) have already read this document once but want a quick reminder later on.
The example Verilog files that model a module, a datapath, and a controller are included in Appendix A, Appendix B, and Appendix C for those readers who are interested in looking at the structure of a complete Verilog file.
The most important logic design rule is more a philosophy than a rule :-)
Tip
Logic Design Guideline 2-1 (MOST IMPORTANT): The design MUST be as simple as possible and easy to understand!
If a design is hard to understand, then nobody will be able to help the original designer with his or her work. Also, as time passes, the hard-to-understand design will become impossible to maintain and debug even for the original designer. Therefore, a logic designer must keep his or her design simple and easy to understand, even if that means the design is slightly bigger or slightly slower, as long as the design is still small enough and fast enough to meet the specification.
One important step in keeping a design simple and the Verilog code that models the design easy to understand is to use standard logic elements such as registers, multiplexers, and decoders. Consequently, the first step in any Verilog coding project is:
Tip
Verilog Coding Guideline 2-1: Model all the standard logic elements in a library file to be SHARED by ALL engineers in the design team.
Below are some examples of the basic logic elements defined in such a library:
    /***************************************************************
     * Simple N-bit register with a 1 time-unit clock-to-q time
     ***************************************************************/
    module v_reg( q, c, d );
        parameter n = 1;

        output [n-1:0] q;
        input  [n-1:0] d;
        input          c;

        reg    [n-1:0] state;

        assign #(1) q = state;

        always @(posedge c) begin
            state = d;
        end
    endmodule // v_reg

    /***************************************************************
     * Simple N-bit latch with a 1 time-unit clock-to-q time
     ***************************************************************/
    module v_latch ( q, c, d );
        parameter n = 1;

        output [n-1:0] q;
        input  [n-1:0] d;
        input          c;

        reg    [n-1:0] state;

        assign #(1) q = state;

        always @(c or d) begin
            if (c) begin
                state = d;
            end
        end
    endmodule // v_latch

    /***************************************************************
     * Simple N-bit 2-to-1 Multiplexer
     ***************************************************************/
    module v_mux2e( z, s, a, b );
        parameter n = 1;

        output [n-1:0] z;
        input  [n-1:0] a, b;
        input          s;

        assign z = s ? b : a ;  // s=1, z<-b; s=0, z<-a
    endmodule
One key observation from the logic elements defined in this library:
Tip
Verilog Coding Guideline 2-2: Only the storage elements (examples: register and latch) have non-zero clock-to-q time. All combinational logic (example: mux) has zero delay.
The non-zero clock-to-q time of the storage elements will prevent hold time problems at all registers' inputs. In general, a logic designer must NOT rely on a combinational logic block to have a certain minimum delay. The zero delay in the Verilog model of the combinational logic elements ensures that the logic designer does not rely on any minimum delay during simulation.
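The interplay between the non-zero clock-to-q time and the zero-delay combinational logic can be sketched with a small simulation. The testbench below is a hypothetical illustration and not part of the original library: the names tb_clk_to_q, stage1, stage2, and mid are assumptions, and v_reg is repeated only so that the example is self-contained. Two registers share one clock with a zero-delay inverter between them; because q1 changes one time unit after the clock edge, stage2 captures the value computed from the old q1.

```verilog
`timescale 1ns/1ns

// Copy of the library's v_reg (1 time-unit clock-to-q), repeated here
// only to keep this sketch self-contained.
module v_reg (q, c, d);
  parameter n = 1;
  output [n-1:0] q;
  input  [n-1:0] d;
  input          c;
  reg    [n-1:0] state;
  assign #(1) q = state;
  always @(posedge c) state = d;
endmodule

// Hypothetical testbench: two registers back to back on the same clock,
// separated by zero-delay combinational logic (an inverter).
module tb_clk_to_q;
  reg        clk;
  reg  [3:0] din;
  wire [3:0] q1, q2;
  wire [3:0] mid = ~q1;               // zero-delay combinational logic

  v_reg #(4) stage1 (q1, clk, din);
  v_reg #(4) stage2 (q2, clk, mid);

  always #5 clk = ~clk;

  initial begin
    clk = 0;
    din = 4'h3;
    @(posedge clk);                   // first edge: stage1 captures 3
    #2 din = 4'h5;
    @(posedge clk);                   // second edge: stage1 captures 5,
    #2;                               // stage2 captures ~3 (old q1)
    if (q1 === 4'h5 && q2 === 4'hc)
      $display("PASS: stage2 captured the value derived from the old q1");
    else
      $display("FAIL: q1=%h q2=%h", q1, q2);
    $finish;
  end
endmodule
```

If the #(1) delay were removed from v_reg, the value captured by stage2 could depend on the order in which the simulator evaluates the two always blocks, which is the kind of hold-time-like race the guideline is meant to avoid.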
Once the basic logic elements have been modeled in the library file:
Tip
Verilog Coding Guideline 2-3: Use explicit registers and latches (example: v_reg and v_latch as defined in the examples above) in your Verilog coding. Do not rely on logic synthesis tools to generate latches or registers for you.
By making the logic designer explicitly place the registers and/or latches, the logic designer is forced to consider the timing implications of the logic early in the design cycle. In other words, the designer is forced to ask himself or herself questions such as: do I have too much logic between registers to meet the cycle time? Also, with explicit registers and latches in the Verilog code, it will be much easier for those who read the code to draw a simple block diagram showing all the registers in the design. Such a block diagram (see Section 4 and Section 5) is very useful for understanding the design (remember the MOST important Logic Design Guideline above: the design must be easy to understand) as well as for making timing tradeoffs when such tradeoffs are necessary.
At first glance, it seems ironic that the logic designer needs to always keep in mind how much combinational logic exists between any two storage elements (registers or latches) while in Verilog coding (see Verilog Coding Guideline 2-2) we want to treat all combinational logic as having zero delay. The reason for this apparent contradiction is that in logic design, the delay of the combinational logic between storage elements determines the cycle time. Consequently, it is important for the logic designer to be aware of the complexity of the logic between two storage components at all times. On the other hand, in order to reduce potential hold time problems, we also do not want the correct operation of the logic to depend on the logic having a certain minimum delay. The best way to make sure the logic can operate correctly without relying on the combinational logic blocks to have a certain minimum delay is to run the Verilog simulation with all combinational logic blocks having zero delay and rely on the storage elements' (registers and/or latches) non-zero clock-to-q time to satisfy the hold time requirement of the next register.
Another important step in keeping a design simple and the Verilog code that models the design easy to understand is to adopt a hierarchical approach to the design process and then make the Verilog code follow the same hierarchy.
Hierarchical design, however, should not be carried to an extreme. For example, as pointed out by one of my colleagues Kyutaeg Oh [1], too deep a hierarchy can cause too many module instantiations, which will cause synthesis to run too slowly. Below is a hierarchical design strategy I find useful.
Tip
Logic Design Guideline 3-1: Use a hierarchical strategy that breaks the design into modules that consist of datapaths and controllers. More specifically:
One example of such a hierarchical approach can be found in the Serial ATA to Parallel ATA Converter for the Disk (Device Dongle). As shown in Figure 3-1, the Device Dongle is divided into three modules:
              +-----------+   +-------------+   +--------+
              |           |   |             |   |        |
     /-------\| Parallel  |--->  Transport  |--->  Link  +--> To Serializer
    < ATA Bus >    ATA    |   |    Layer    |   | Layer  |
     \-------/| Interface |<--+             |<--+        |
              |  (ATAIF)  |   | (Transport) |   | (Link) <--- From Deserializer
              |           |   |             |   |        |
              +-----------+   +-------------+   +--------+
        Figure 3-1: The Three Modules that Form the Device Dongle
The Parallel ATA Interface (ATAIF), the Transport Layer (Transport) and the Link Layer (Link) shown in Figure 3-1 are further divided into datapath and controller modules as described below and shown in Figure 3-2:
                           +----------------------+   +----------------------+
                           |   Transport Layer    |   |      Link Layer      |
                           |        dtrans        |   |         link         |
                           | +------------------+ |   | +------------------+ |
                           | | Transmit Engine  | |   | | Transmit Engine  | |
                           | |    dtrans_tx     | |   | |     link_tx      | |
                           | | +--------------+ | |   | | +--------------+ | |
                           | | |   Datapath   | | |   | | |   Datapath   | | |
                           | | | dtrans_txdp  | | |   | | |  link_txdp   | | |
    +------------------+   | | +--------------+ | |   | | +--------------+ | |
    |   Parallel ATA   |   | |                  | |   | |                  | |
    |    Interface     |   | | +--------------+ | |   | | +--------------+ | |
    |      dataif      |   | | |  Controller  | | |   | | |  Controller  | | |
    | +-----------+    |   | | | dtrans_txctl | | |   | | |  link_txctl  | | |
    | | Datapath  |    |   | | +--------------+ | |   | | +--------------+ | |
    | |           |    |   | |                  | |   | |                  | |
    | | dataif_dp |    |   | | +--------------+ | |   | | +--------------+ | |
    | +-----------+    |   | | | Synchronizer | | |   | | | Synchronizer | | |
    |                  |   | | | dtrans_txsyn | | |   | | |  link_txsyn  | | |
    |                  |   | | +------------^-+ | |   | | +------------^-+ | |
    | +------------+   |   | +---+----------|---+ |   | +---+----------|---+ |
    | | Controller |   |   |     |(3)       |(3)  |   |     |(1)       |(2)  |
    | |            |   |   | +---|----------+---+ |   | +---|----------+---+ |
    | | dataif_ctl |   |   | | +-v------------+ | |   | | +-v------------+ | |
    | +------------+   |(4)| | | Synchronizer | | |   | | | Synchronizer | | |
    |                  +------->              | | |   | | |              | | |
    |                  |   | | | dtrans_rxsyn | | |   | | |  link_rxsyn  | | |
    | +--------------+ |   | | +--------------+ | |   | | +--------------+ | |
    | | Synchronizer | |(5)| | +--------------+ | |   | | +--------------+ | |
    | |              <-----+ | |  Controller  | | |   | | |  Controller  | | |
    | |  dataif_syn  | |   | | | dtrans_rxctl | | |   | | |  link_rxctl  | | |
    | +--------------+ |   | | +--------------+ | |   | | +--------------+ | |
    +------------------+   | |                  | |   | |                  | |
                           | | +--------------+ | |   | | +--------------+ | |
                           | | |   Datapath   | | |   | | |   Datapath   | | |
                           | | | dtrans_rxdp  | | |   | | |  link_rxdp   | | |
                           | | +--------------+ | |   | | +--------------+ | |
                           | |  Receive Engine  | |   | |  Receive Engine  | |
                           | |    dtrans_rx     | |   | |     link_rx      | |
                           | +------------------+ |   | +------------------+ |
                           +----------------------+   +----------------------+
              Figure 3-2: Further Divisions of the Device Dongle Modules
The Parallel ATA Interface, modeled by the "module dataif" in the Verilog file dataif.v (see Reference [2]), is further divided into the following (see Reference [5]):
The Transport Layer, modeled by the "module dtrans" in the Verilog file dtrans.v (see Reference [3]), is further divided into the following (see Reference [6]):
Transmit Engine: module dtrans_tx in the Verilog file dtrans_tx.v
This Transport Layer Transmit Engine is further divided into:
Receive Engine: module dtrans_rx in the Verilog file dtrans_rx.v
This Transport Layer Receive Engine is further divided into:
Similarly, the Link Layer, modeled by the "module link" in the Verilog file link.v (see Reference [4]), is further divided into the following (see Reference [7]):
Transmit Engine: module link_tx in the Verilog file link_tx.v
This Link Layer Transmit Engine is further divided into:
Receive Engine: module link_rx in the Verilog file link_rx.v
This Link Layer Receive Engine is further divided into:
The detailed contents of these Verilog files (References [2 to 7]) are not needed to illustrate the following Verilog Coding Guideline:
Tip
Verilog Coding Guideline 3-1: A separate Verilog file is assigned to the Verilog code for:
A corollary of the above Verilog Coding Guideline is as follows:
Tip
Verilog Coding Guideline 3-2: In order to keep the number of Verilog files under control, one should try not to assign a separate Verilog file to any low level module that is at a hierarchy level lower than the datapath and the controller.
For example, as I will show you in Section 4, the datapath will contain many datapath elements. Instead of assigning a separate Verilog file to each of these datapath elements, the datapath elements are all grouped into a single "library" file (link_library.v). Similarly, as I will show you in Section 5, the controller will contain a "Next State Logic" block and an "Output Logic" block. Instead of assigning a separate Verilog file to each logic block, the logic blocks will be included in the Verilog file assigned to the controller.
Enclosed in Appendix A are the Verilog files dtrans.v, dtrans_tx.v, and dtrans_rx.v. Here is something worth noticing:
Tip
Verilog Coding Guideline 3-3: The Verilog code for a high-level module, that is, a module at a hierarchy level higher than the datapath and the controller (examples: module dtrans_tx, module dtrans_rx, and module dtrans), should not contain any logic. It should only show how the lower-level modules are connected.
For example, if you look at the dtrans.v file in Appendix A, the "module dtrans" only shows how its transmit engine (dtrans_tx) and its receive engine (dtrans_rx) are connected. Similarly, if you look at the dtrans_tx.v file in Appendix A, the "module dtrans_tx" contains only the information on how its datapath (dtrans_txdp), its controller (dtrans_txctl), and its synchronizer (dtrans_txsyn) are connected together. In any case, neither the "module dtrans," the "module dtrans_tx," nor the "module dtrans_rx" contain any Verilog code that models raw logic.
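The structure described above can be sketched with a small, entirely hypothetical design. The module names ex_engine, ex_dp, and ex_ctl and all of their ports are assumptions made for illustration; they are not taken from dtrans.v, and v_reg is repeated only so that the sketch is self-contained. The point to notice is that ex_engine contains nothing but instantiations and wires, while all logic lives in the datapath and the controller.

```verilog
// Copy of the library's v_reg, repeated to keep the sketch self-contained.
module v_reg (q, c, d);
  parameter n = 1;
  output [n-1:0] q;
  input  [n-1:0] d;
  input          c;
  reg    [n-1:0] state;
  assign #(1) q = state;
  always @(posedge c) state = d;
endmodule

// Hypothetical datapath (would live in its own file, e.g. ex_dp.v):
// an 8-bit counter built from combinational logic plus an explicit register.
module ex_dp (at_max, clr, inc, clk);
  output      at_max;            // control output to the controller
  input       clr, inc, clk;     // control inputs from the controller
  wire [7:0]  count, next;
  assign next   = clr ? 8'h00 : (inc ? count + 8'h01 : count);
  assign at_max = (count == 8'hff);
  v_reg #(8) count_reg (count, clk, next);
endmodule

// Hypothetical controller (e.g. ex_ctl.v): a single "running" state bit.
module ex_ctl (busy, clr, inc, start, at_max, clk, reset);
  output busy, clr, inc;
  input  start, at_max, clk, reset;
  wire   running, next_running;
  assign next_running = ~reset & (start | (running & ~at_max));
  assign clr  = reset | start;
  assign inc  = running;
  assign busy = running;
  v_reg #(1) run_reg (running, clk, next_running);
endmodule

// Hypothetical high-level module (e.g. ex_engine.v): connections only.
module ex_engine (busy, start, clk, reset);
  output busy;
  input  start, clk, reset;
  wire   clr, inc, at_max;       // datapath <-> controller interconnect

  ex_dp  datapath   (.at_max (at_max), .clr (clr), .inc (inc), .clk (clk));
  ex_ctl controller (.busy (busy), .clr (clr), .inc (inc), .start (start),
                     .at_max (at_max), .clk (clk), .reset (reset));
endmodule
```

Keeping the top level purely structural means a reader can recover the block diagram of ex_engine directly from its port connections.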
Notice from Figure 3-2 that the ATA Interface module is divided into the datapath and the controller. On the other hand, the Transport Layer and the Link Layer are first partitioned into the Transmit Engine and the Receive Engine before being further divided into controller and datapath. The reason for this extra level of hierarchy for the Transport Layer and the Link Layer is that their Transmit Engines and their Receive Engines work in different clock domains. More specifically, the ATA Interface, the Transmit Engine of the Link Layer, and the Transmit Engine of the Transport Layer all operate under the same clock, the transmit clock, while the Receive Engines of the Link and Transport Layers both operate on a different clock, the receive clock. This leads to the following design guideline:
Tip
Logic Design Guideline 3-2: Keep different clock domains separate and have an explicit synchronization module for signals that cross the clock domain.
For example, please refer to the places in Figure 3-2 labeled with numbers in parentheses as you read the numbered paragraph below:
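The contents of the synchronizer modules in Figure 3-2 are not shown in this document, so the following is only a sketch of one common technique, not the author's implementation: a two-stage synchronizer for a single-bit level signal, built from the explicit v_reg element. The names ex_sync2, d_async, dst_clk, and meta are assumptions, and v_reg is repeated only to keep the example self-contained.

```verilog
// Copy of the library's v_reg, repeated to keep the sketch self-contained.
module v_reg (q, c, d);
  parameter n = 1;
  output [n-1:0] q;
  input  [n-1:0] d;
  input          c;
  reg    [n-1:0] state;
  assign #(1) q = state;
  always @(posedge c) state = d;
endmodule

// Hypothetical two-stage synchronizer for ONE single-bit level signal
// that crosses into the dst_clk domain.
module ex_sync2 (q, d_async, dst_clk);
  output q;          // synchronized version of d_async, in the dst_clk domain
  input  d_async;    // signal generated in another clock domain
  input  dst_clk;    // clock of the receiving domain
  wire   meta;       // first-stage output; may go metastable in hardware

  v_reg #(1) stage1 (meta, dst_clk, d_async);
  v_reg #(1) stage2 (q,    dst_clk, meta);
endmodule
```

Placing the two registers in a module of their own keeps every clock-domain crossing visible in the hierarchy. A sketch like this is only suitable for single-bit level signals; multi-bit values that must arrive together need a different scheme, such as a handshake or an asynchronous FIFO.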
Figure 4-1 is an example of a generalized datapath and the next paragraph describes some important observations from this figure:
|<--- Control Inputs from the Controller (2) -->| | | | | | | | |...| (3a) | | | ... | | +-v---v--+ |Select | +-v--------v-+ | Input N | See | N | | | Simple (4) | | A ---/--> Figure +-/-+ + | (5) | |Random Logic| | (1) | 4-2 | | |\v (3d) | +-+--------+-+ | +-+---+--+ | | \ +---+ |...| |...| (3b) |...| +-->0 + | | +-v---v--+ +-v---v-+-+ v v | | N | R | N | See | N | See | | N Output | +-/-> E +-/-> Figure +-/->Figure | +--/--> Q |...| (3a) | | ^ | G | ^ | 4-2 | | 4-3 | | (1) +-v---v--+ +-->1 + | | | | +-+---+--+ +-+---+-+^+ Input N | See | N | | / | +-^-+ | |...| |...| | (5) B ---/--> Figure +-/-+ |/ | | | v v v v CLK (1) | 4-2 | + | CLK | | +-+---+--+ (3c) | | (1) | |...| K (1) Y Internal Signals | v v | |<--- Control Outputs to the Controller (2) --->| Figure 4-1: Block Diagram of the General Datapath
When you read the numbered paragraphs below, please refer to the places in Figure 4-1 labeled with the same numbers in parentheses:
                         Control Inputs
                        | |...| |
                        | |   | |
                    +---v-v---v-v---+
                    |               |
             N      | Combinational |      N
        -----/------>   Datapath    +------/----->
        N-bit       |   Elements    |      N-bit
        Data Input  |               |      Data Output
                    +---+-+---+-+---+
                        | |...| |
                        | |   | |
                        v v   v v
                         Control Outputs
          Figure 4-2: A Combinational Datapath Element

                       Control Inputs
                      | |...| |
                      | |   | |
                  +---------------------+
                  |   | |   | |         |
                  | +-v-v---v-v---+---+ |
                  | |             |   | |
                  +->             |   +-+
             N      | Sequential  | R |      N
        -----/------>  Datapath   | E +------/----->
        N-bit       |  Elements   | G |      N-bit
        Data Input  |     +-----+ |   |      Data Output
                    |     | REG < |   |
                    +-+-+-+-+-+-+-+-^-+
                      | |...| |     |
                      | |   | |    CLK
                      v v   v v
                       Control Outputs
        Figure 4-3: A Sequential Datapath Element with Register at its Outputs
The main function of the explicit pipeline register shown in Figure 4-1's Item 3d and Item 5 is to limit the datapath's critical path delay to a value less than the desired cycle time of the system. The effect of such a pipeline register can be best understood with a timing diagram.
Tip
Logic Design Guideline 4-1: The best way to study the effect of the datapath's pipeline registers is to draw a timing diagram showing each register's effect on its outputs with respect to rising or falling edge of the register's input clock.
Figure 4-4 below is an example of such a timing diagram for the generalized datapath example shown in Figure 4-1. In this timing diagram example (when you read the numbered paragraphs below, please refer to the places in Figure 4-4 labeled with the same numbers in parentheses):
The N-bit Input A and Input B settle to their known values "A" and "B" sometime after the rising edge of Cycle 2.
1. For the sake of simplicity, let's assume all the Control Inputs (likely generated by a controller similar to the one described in Section 5) of this datapath are stable prior to the rising edge of Cycle 2 so that they are not factors in the critical delay path considerations. In an actual design, such assumptions will be verified by static timing analysis.
2. Due to the assumption about the Control Inputs listed in Item 1, we only need to make sure Input A and Input B settle early enough to allow the two Combinational Datapath Elements (Item 3a in Figure 4-1) and the multiplexer (Item 3c in Figure 4-1) to produce the Internal Signal K at least one set-up time prior to the rising edge of Cycle 3.
3. If the condition listed in Item 2 is met, the pipeline register can then capture the value of Internal Signal K and set the Internal Signal Y to the value "Y" one clock-to-q time after the rising edge of Cycle 3.
4. Once again due to the assumption about the Control Inputs listed in Item 1, as long as the Combinational Datapath Element after the pipeline register (Item 3d in Figure 4-1) together with the combinational logic within the Sequential Datapath Element (Item 3b in Figure 4-1) can produce the result for the Sequential Datapath Element's "implicit" register at least one set-up time prior to the rising edge of Cycle 4, the Output of this datapath will be set to the stable value "Q" one clock-to-q time after the rising edge of Cycle 4.
              |    1    |    2    |    3    |    4    |    5    |    6    |
              |         |         |         |         |         |         |
              +----+    +----+    +----+    +----+    +----+    +----+    +-
    Clock     |    |    |    |    |    |    |    |    |    |    |    |    |
    ----------+    +----+    +----+    +----+    +----+    +----+    +----+
              |         |         |         |         |         |         |
    ---------------------+ +-------+ +--------------------------------------
    Input A ///////////// X    A    X //////////////////////////////////////
    ---------------------+ +-------+ +--------------------------------------
              |         |   (1)   |         |         |         |         |
    ---------------------+ +-------+ +--------------------------------------
    Input B ///////////// X    B    X //////////////////////////////////////
    ---------------------+ +-------+ +--------------------------------------
              |         |   (2)   |         |         |         |         |
    ---------------------------+ +-------+ +--------------------------------
    Internal Signal K ///////// X    Y    X ////////////////////////////////
    ---------------------------+ +-------+ +--------------------------------
              |         |         |   (3)   |         |         |         |
    -------------------------------+ +-------+ +----------------------------
    Internal Signal Y ///////////// X    Y    X ////////////////////////////
    -------------------------------+ +-------+ +----------------------------
              |         |         |         |   (4)   |         |         |
    -----------------------------------------+ +-------+ +------------------
    Output Q //////////////////////////////// X    Q    X //////////////////
    -----------------------------------------+ +-------+ +------------------
        Figure 4-4: A Timing Diagram of the Datapath's Pipeline Register
Item 4 above brings up an interesting observation about the Sequential Datapath Element shown in Figure 4-3, where the implicit register of this datapath element is shown on the output side of the element. The placement of the register on the output side (versus the input side) in the drawing is intentional. It reflects the actual placement of the register in hardware. I like to place such a register at the output (versus the input) so that all N bits of the output become stable at the same time, one clock-to-q time after each rising edge of the clock. Also shown in Figure 4-3 is that some Control Outputs of the Sequential Datapath Element can also be registered. This, however, is not as common as keeping the Control Outputs strictly combinational, which allows the user of these signals (likely the controller, see Section 5) the flexibility of using these values one cycle earlier if the critical timing is not violated.
The above discussion of the timing diagram in Figure 4-4 illustrates that the logic designer cannot draw an accurate timing diagram unless he or she knows the exact location of the registers relative to the combinational logic. This brings us to a corollary of Logic Design Guideline 4-1:
Tip
Logic Design Guideline 4-2: The block diagram of the datapath should show ALL registers, including the implicit register of the Sequential Datapath Element.
Enclosed in Appendix B is the example Verilog file link_txdp.v which models the datapath for the Link Layer Transmit Engine (see Reference [8]). Let's take a look at some interesting observations from link_txdp.v:
Tip
Verilog Coding Guideline 4-1: Keep the Verilog coding of the datapath simple and straightforward. Leave the fancy coding (IF any) to the datapath elements and place such elements in a separate (library) file.
For example, the Verilog coding of link_txdp.v is simplified by using the following two Sequential Datapath Elements:
    /*
     * Scrambler
     */
    l_scramble scrambler (
        .scr_out  (scr_out),
        .scr_in   (32'hc2d2768d),
        .scr_init (txscr_init),
        .scr_run  (txscr_run),
        .clk      (txclk4x),
        .reset    (lktx_reset));

    /*
     * CRC Calculator
     */
    l_crccal crc_calculator (
        .crc_out  (crc_out),
        .crc_in   (32'h52325032),
        .datain   (tp_txdata),
        .crc_init (txcrc_init),
        .crc_cal  (txcrc_cal),
        .clk      (txclk4x),
        .reset    (lktx_reset));
As well as a Combinational Datapath Element:
    /*
     * Generate the primitive (prim_out) based on the selection (sel_prim)
     */
    l_primgen primgen (.prim_out (prim_out), .sel_prim (sel_prim));
More specifically, the Verilog code in link_txdp.v only shows what the logic designer cares about the most at the datapath level: how the datapath elements (registers, multiplexers, counters, etc.) are connected together. The detailed modeling of these datapath elements is done in link_library.v, which contains all library elements for the Link Layer. For your reference, link_library.v is also attached in Appendix B (see Reference [9]). Below are a few lines from link_library.v that define the Scrambler:
    /********************************************************************
     * l_scramble: 32-bit scrambler that can be:
     *    a. Reset to all zeros asynchronously
     *    b. Load a fix pattern synchronously.
     *    c. Keep its old value if scramble is not enable.
     *    d. Update its output synchronously based on a LFSR algorithm.
     ********************************************************************/
    module l_scramble (scr_out, scr_in, scr_init, scr_run, clk, reset);
        output [31:0]   scr_out;        // Scrambler's output
        input  [31:0]   scr_in;         // Initial pattern to be loaded
        input           scr_init;       // Load the initial pattern
        input           scr_run;        // Update scr_out based on a LFSR
        input           clk;
        input           reset;

        reg    [31:0]   scram;          // Scramble data pattern
        reg             a15, a14, a13,  // Intermediate scramble bits
                        a12, a11, a10, a9, a8, a7,
                        a6, a5, a4, a3, a2, a1, a0;
        wire   [31:0]   runmuxout;      // Output of the scr_run MUX
        wire   [31:0]   lastmux;        // Output of the final MUX

        /*
         * Combinational logic to produce the scramble pattern,
         * which should be updated whenever scr_out changes.
         * This logic was copied from Frank Lee's scramble.v
         */
        always @(scr_out) begin
            a15 = scr_out[31] ^ scr_out[29] ^ scr_out[20] ^ scr_out[16];
            a14 = scr_out[30] ^ scr_out[21] ^ scr_out[17];
            a13 = scr_out[31] ^ scr_out[22] ^ scr_out[18];
            :
            :
            scram[2] = a15^a14^a13;
            scram[1] = a15^a14;
            scram[0] = a15;
        end // Scrambling logic

        /*                                Priority:
         *        scram  scr_out          -------------------------------
         *             |   |              reset (asynchronous):   highest
         *         +---v---v---+          scr_init (synchronous): middle
         * scr_run-->\S 1   0  /          scr_run (synchronous):  lowest
         *           +---+---+   scr_in
         *               |       |
         *           +---v-------v---+
         *            \  0       1 S/<--scr_init (higher priority than scr_run)
         *             +-----+-----+
         *                   |
         *                   v
         *                lastmux
         */
        v_mux2e #(32) run_mux  (runmuxout, scr_run, scr_out, scram);
        v_mux2e #(32) init_mux (lastmux, scr_init, runmuxout, scr_in);
        v_regre #(32) scr_ff   (scr_out, clk, lastmux, (scr_run | scr_init), reset);
    endmodule // l_scramble
The definition of the Scrambler l_scramble (the l_ prefix indicates that it is defined in link_library.v) illustrates another Logic Design Guideline:
Tip
Logic Design Guideline 4-3: While designing the Sequential Datapath Elements, separate each element into two parts: (1) the combinational logic, and (2) the register.
For example, in l_scramble, the combinational logic of the Scrambler is modeled by the "always" statement:
    always @(scr_out) begin
        a15 = scr_out[31] ^ scr_out[29] ^ scr_out[20] ^ scr_out[16];
        :
        :
        scram[2] = a15^a14^a13;
    end // Scrambling logic
while the register is modeled by the 32-bit wide v_regre defined in the library shared by the entire design team (see Verilog Coding Guideline 4-2 below):
v_regre #(32) scr_ff (scr_out, clk, lastmux, (scr_run | scr_init), reset);
The use of v_regre (the prefix v_ indicates that this element is defined in the common library) illustrates the following Verilog Coding Guideline:
Tip
Verilog Coding Guideline 4-2: The Verilog coding of the datapath elements should make use of the standard logic elements (registers, multiplexers, ... etc.) already defined in the library discussed in Verilog Coding Guideline 2-1.
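The split between combinational logic and register described in Logic Design Guideline 4-3, combined with the reuse of library elements, can be sketched on a much smaller element than the Scrambler. The element ex_count4, its ports, and the testbench tb_count4 below are hypothetical names chosen for illustration; they do not appear in link_library.v, and v_reg is repeated only to keep the sketch self-contained.

```verilog
`timescale 1ns/1ns

// Copy of the library's v_reg, repeated to keep the sketch self-contained.
module v_reg (q, c, d);
  parameter n = 1;
  output [n-1:0] q;
  input  [n-1:0] d;
  input          c;
  reg    [n-1:0] state;
  assign #(1) q = state;
  always @(posedge c) state = d;
endmodule

// Hypothetical Sequential Datapath Element: a 4-bit counter that can be
// loaded (higher priority) or incremented (lower priority).
module ex_count4 (cnt_out, at_max, cnt_in, load, run, clk);
  output [3:0] cnt_out;   // registered data output
  output       at_max;    // combinational control output
  input  [3:0] cnt_in;
  input        load, run, clk;
  wire   [3:0] next;

  // Part 1: the combinational logic
  assign next   = load ? cnt_in : (run ? cnt_out + 4'h1 : cnt_out);
  assign at_max = (cnt_out == 4'hf);

  // Part 2: the register, placed at the output of the element
  v_reg #(4) cnt_ff (cnt_out, clk, next);
endmodule

// Hypothetical testbench: load 4'hd, then count up twice.
module tb_count4;
  reg        clk, load, run;
  reg  [3:0] cnt_in;
  wire [3:0] cnt_out;
  wire       at_max;

  ex_count4 dut (cnt_out, at_max, cnt_in, load, run, clk);

  always #5 clk = ~clk;

  initial begin
    clk = 0; load = 1; run = 0; cnt_in = 4'hd;
    @(posedge clk); #2;          // counter now holds d
    load = 0; run = 1;
    @(posedge clk); #2;          // d -> e
    @(posedge clk); #2;          // e -> f
    if (cnt_out === 4'hf && at_max === 1'b1)
      $display("PASS");
    else
      $display("FAIL: cnt_out=%h at_max=%b", cnt_out, at_max);
    $finish;
  end
endmodule
```

Because the only storage in ex_count4 is the explicit v_reg instance, the register is easy to find when drawing the block diagram or the timing diagram of the datapath that uses this element.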
The last file included in Appendix B is "link_defs.v" (see Reference [10]), which defines all the "symbolic values" (i.e. assigns a symbolic name to a given constant value) to be used by all the Verilog files for the Link Layer. For example, the following line:
`include "link_defs.v"
is used in both the datapath file (link_txdp.v) and the Link Layer library file (link_library.v) so that all the symbolic values defined in link_defs.v can be used by these two files. Below are some examples of these symbolic values that are specific to the datapath:
    /*
     * Number of primitives and the bit position of the 1-hot encoded vector
     */
    `define num_prim    18

    // Basic Primitives
    `define B_ALIGN     0
    `define B_SYNC      1
    `define B_CONT      2
        :         :
    `define B_X_RDY     9
        :         :
    `define B_PMACK     16
    `define B_PMNAK     17
These symbolic values are then used by the datapath file (link_txdp.v) in the following way:
    /*
     * Interconnections within this portion of the datapath
     */
    wire [`num_prim:0]      // Number of primitives + D10.2
        sel_prim;           // Select the proper primitives

    // Primitive send by the Transmit Controller
    assign sel_prim[`B_ALIGN] = txsn_align;
    assign sel_prim[`B_X_RDY] = txsn_xrdy;
It should be obvious that the Verilog code above is much easier to maintain and much easier to understand than the equivalent Verilog code:
    /*
     * Interconnections within this portion of the datapath
     */
    wire [18:0] sel_prim;

    // Primitive send by the Transmit Controller
    assign sel_prim[0] = txsn_align;
    assign sel_prim[9] = txsn_xrdy;
This example of how Verilog code uses symbolic values to improve its ease of maintenance leads us to the following Verilog Coding Guideline:
Tip
Verilog Coding Guideline 4-3: Define symbolic values (see also Verilog Coding Guideline 5-2) in a header file (example: link_defs.v) and include this header file in all files that can make use of these symbolic values to make the Verilog code easier to maintain and easier to understand.
Other symbolic values defined in link_defs.v such as:
    // Number of TX states and bit position of the 1-hot state encoding
    `define num_lktxstate   15
    `define B_NOCOMM        0
    `define B_SENDALIGN     1
    `define B_NOCOMMERR     2
        :         :         :
    `define B_BUSYRCV       13
    `define B_POWERDOWN     14

    // State Values
    `define RESET           15'h0000    // All bits are zeros
    `define NOCOMM          15'h0001    // Bit 0 is set
    `define SENDALIGN       15'h0002    // Bit 1 is set
        :         :         :
    `define POWERSAVE       15'h4000    // Link layer is power down
are used for the Verilog code that models the controller for the Link Layer. How these symbolic values can be used to simplify the Verilog code of the controller will be explained in Section 5. More specifically, please refer to Verilog Coding Guideline 5-1 in Section 5.
Almost without exception, at the core of every controller is one or more finite state machines. This is shown in Figure 5-1, where only one finite state machine is shown for simplicity. Readers with enough imagination should be able to visualize how this picture can be generalized to multiple finite state machines.
+------------------------------------------------+ | A General Controller | | +--------------+---+---+ | (2a) Inputs | (1) | Finite State |S | | | Outputs -----------+---------> Machine |T R| +-+--------------------> | | +---+ | (4) |A E| | | +---+ | Type 1 | | | R | | |T G| | | | R | | | +-> E +-+->See Figure 5-2|E | | +-> E +-+------------> | | | G | | | or Figure 5-3| | | | | G | | | Outputs | | +-^-+ | +--------------+-^-+---+ | +-^-+ | | Type 2 | | | | | | | | | (2b) | | clk | clk | clk | | | | | | | | | | | +-v-------v-+ | | | | | | | | | +------------------------> Simple | | Outputs | | | Random +----------> | +--------------------------------> Logic (3) | | Type 3 | | | | (2c) | +-----------+ | +------------------------------------------------+ Figure 5-1: Block Diagram of the General Controller
Here are some important observations from Figure 5-1. When you read the numbered paragraphs below, please refer to the places in Figure 5-1 labeled with the same numbers in parentheses:
             +---------------------------+
             |     +-------+       +---+ |         +--------+
             |  N  | Next  | Next  |S  | | Current |        |
             +--/-->       | State |t R| | State   | Output |
                   | State +--/---->a e+-+---/----->        +--/--> Outputs
    Inputs -----/-->       |  N    |t g|     N     | Logic  |  P
                M  | Logic |       |e  |           |        |
                   +-------+       +-^-+           +--------+
                                     |
                                   Clock
                Figure 5-2: The Moore State Machine

             +---------------------------+
             |     +-------+       +---+ |         +--------+
             |  N  | Next  | Next  |S  | | Current |        |
             +--/-->       | State |t R| | State   | Output |
                   | State +--/---->a e+-+---/----->        +--/--> Outputs
    Inputs --+--/-->       |  N    |t g|     N     | Logic  |  P
             |  M  | Logic |       |e  |        +-->        |
             |     +-------+       +-^-+        |  |        |
             |                       |          |  +--------+
             |  Q (Q <= M)         Clock        |
             +--/-------------------------------+
                Figure 5-3: The Mealy State Machine
One question raised by Figure 5-1's Item 1 and Item 2 (see Paragraph 1 and 2 above) is when and where should we use pipeline registers to stage the inputs or outputs? This leads us to the following logic design guideline:
Tip
Logic Design Guideline 5-1: The best way to decide when and where to use pipeline register or registers to stage the controller inputs and outputs is to draw a timing diagram showing each register's effect on its outputs with respect to rising or falling edge of the register's input clock.
[Figure 5-4: A Timing Diagram Showing Relative Timing. Clock cycles 1 through 10 are marked along the top, above the Clock waveform. Three signals are drawn below it: Inputs, which carries the valid value A at (1); Next State, which carries the valid value B at (2), starting slightly after Inputs becomes valid; and Current State, which carries the same value B at (3), one clock edge later. Outside these windows the signals are drawn as don't-care.]
One simple example of such a timing diagram is shown in Figure 5-4, which shows the effect of the State Register in Figure 5-2, the Moore State Machine. When you read the numbered paragraphs below, please refer to the places in Figure 5-4 labeled with the same numbers in parentheses:
In this simple example, only one register and three signals are shown. A real timing diagram will have multiple registers and many more signals. The basic idea, however, remains the same: show only the "relative timing," that is, show how the registers affect the timing of the signals with respect to the clock edge(s), not the absolute delays.
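The "relative timing" effect of one staging register can also be observed in simulation. The following is a minimal sketch, not taken from the appendices: v_reg is copied from the Section 2 listing, while the test bench stage_demo and the signals req_in and req_staged are hypothetical names chosen for illustration.

```verilog
// Register as listed in Section 2 (1 time-unit clock-to-q).
module v_reg (q, c, d);
    parameter n = 1;
    input  [n-1:0] d;
    input          c;
    output [n-1:0] q;
    reg    [n-1:0] state;
    assign #(1) q = state;
    always @(posedge c) begin
        state = d;
    end
endmodule // v_reg

// Hypothetical test bench: one controller input staged by one register.
module stage_demo;
    reg  clk;
    reg  req_in;
    wire req_staged;

    v_reg #(1) in_stage (req_staged, clk, req_in);

    initial clk = 0;
    always #5 clk = ~clk;            // rising edges at t = 5, 15, 25, ...

    initial begin
        req_in = 0;
        #12 req_in = 1;              // asserted between two rising edges
        #10 req_in = 0;              // deasserted one clock period later
        #30 $finish;
    end

    // Print both signals as seen AT each rising edge. req_staged is
    // stable here because v_reg updates q one time unit after the edge.
    always @(posedge clk)
        $display("t=%0t req_in=%b req_staged=%b", $time, req_in, req_staged);
endmodule // stage_demo
```

At the rising edge where req_in is first seen high, req_staged is still low; it goes high for the following edge. That one-cycle shift is exactly what a relative timing diagram records for each register.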
A corollary of Logic Design Guideline 5-1 is:
Tip
Logic Design Guideline 5-2: The block diagram of the controller should show ALL registers explicitly while the random logic can be represented by a simple black box.
By drawing all the registers EXPLICITLY in the block diagram, the designer is less likely to make a mistake when he or she attempts to draw a "relative timing" diagram similar to the one shown in Figure 5-4 (see note below) while thinking about the sequence of events that needs to be controlled. Notice that in Figure 5-1, we try to meet Logic Design Guideline 5-2 by showing the State Register in the black box representing the Finite State Machine.
Note
Even if the designer does not draw such a timing diagram explicitly on paper, he or she may still have to "draw" it implicitly in his or her head.
Notice that both Figure 5-2 and Figure 5-3 show finite state machines with an M-bit input, an N-bit state register, and a P-bit output. The only difference is that in Figure 5-2, the Moore machine, the P-bit output is a function of the N-bit current state only, while in Figure 5-3, the Mealy machine, the P-bit output depends on both the N-bit current state and a subset of Q bits (Q is an integer smaller than or equal to M) of the M-bit inputs. Depending on the state encoding, the N-bit state register can represent a maximum of 2**N states, or a minimum of N states if one-hot encoding is used.
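The difference between the two output styles can be sketched in a few lines of Verilog. This example is hypothetical and not part of the appendices: it is a binary-encoded detector of two consecutive 1s on din, with M = 1, N = 2, P = 1, and Q = 1. To keep the sketch self-contained, the State Register is written as a plain always block rather than the library v_reg.

```verilog
// Hypothetical example: Moore and Mealy outputs on the same state machine.
module moore_vs_mealy (out_moore, out_mealy, din, clk, reset);
    output out_moore;
    output out_mealy;
    input  din;
    input  clk;
    input  reset;

    reg [1:0] cur_state;
    reg [1:0] next_state;

    // Next State Logic: a function of the current state and the inputs.
    always @(cur_state or din or reset) begin
        if (reset) begin
            next_state = 2'd0;
        end
        else begin
            case (cur_state)
              2'd0:    next_state = din ? 2'd1 : 2'd0;  // no 1 seen yet
              2'd1:    next_state = din ? 2'd2 : 2'd0;  // one 1 seen
              2'd2:    next_state = din ? 2'd2 : 2'd0;  // two or more 1s seen
              default: next_state = 2'd0;
            endcase
        end
    end

    // State Register.
    always @(posedge clk) begin
        cur_state <= next_state;
    end

    // Moore output (Figure 5-2): depends on the current state only.
    assign out_moore = (cur_state == 2'd2);

    // Mealy output (Figure 5-3): depends on the current state AND din.
    assign out_mealy = ((cur_state == 2'd1) | (cur_state == 2'd2)) & din;
endmodule // moore_vs_mealy
```

With reset low, out_mealy is asserted in the same cycle in which the second 1 arrives, while out_moore is asserted one cycle later, after the State Register has captured the new state.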
Tip
Logic Design Guideline 5-3: If possible, use one-hot encoding for the finite state machine's state encoding to simplify the Output Logic as well as the Next State Logic.
One-hot encoding refers to the encoding style where each bit of the State Register represents one state, and the corresponding bit is asserted only when the finite state machine is in the state represented by that bit. Consequently, only ONE bit of the N-bit state register is asserted at any given time. My experience is that one-hot encoding can greatly simplify the logic equations for the Output Logic block (in most cases, they reduce to simple inverters, AND gates, and OR gates) as well as for the Next State Logic block. Philosophically, the reason why one-hot encoding can simplify the output logic is simple: when the designer creates a state, he or she does so for one purpose: the state indicates the need to set the outputs to values different from those of any other state (if not, there is no need to have a separate state!). Therefore, if the state information is not one-hot encoded, the Output Logic must first decode the N-bit state register before it can generate the outputs. When one-hot encoding is used, the need for an N-to-2**N decode is eliminated. Similarly, with one-hot encoding the Next State Logic does not need to perform the equivalent of the N-to-2**N decode before deciding what the next state is, and once the next state is decided, it does not need to perform the equivalent of a 2**N-to-N encode of the next state.
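As a small illustration of the decode that one-hot encoding removes, here is a sketch of the same output written for both encodings. The four states (IDLE, LOAD, SEND, DONE), the output busy, and both module names are hypothetical.

```verilog
// Hypothetical bit positions for a 4-state one-hot machine.
`define B_IDLE 0
`define B_LOAD 1
`define B_SEND 2
`define B_DONE 3

// One-hot encoding: the output is a plain OR of two state bits.
module out_onehot (busy, cur_state);
    output       busy;
    input  [3:0] cur_state;      // exactly one bit is set at a time

    assign busy = cur_state[`B_LOAD] | cur_state[`B_SEND];
endmodule // out_onehot

// Binary encoding: the same output first has to decode the state value.
module out_binary (busy, cur_state);
    output       busy;
    input  [1:0] cur_state;      // 0 = IDLE, 1 = LOAD, 2 = SEND, 3 = DONE

    assign busy = (cur_state == 2'd1) | (cur_state == 2'd2);
endmodule // out_binary
```

With only four states the two comparators are cheap, but the decode grows with the number of states, while the one-hot equation stays an OR of the state bits that need the output asserted.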
One drawback of one-hot encoding is that for a finite state machine with a large number of states (i.e. N is a big number in Figure 5-2 and Figure 5-3), the State Register can be very wide. A wide register, however, is usually not that bad a problem. In any case, in order to keep the design easy to understand and debug, one may want to avoid using "one BIG and complex" finite state machine anyway:
Tip
Logic Design Guideline 5-4: Instead of designing a controller with a giant and complex finite state machine at its core, it may be easier to break the controller into multiple smaller controllers, each with a smaller and simpler finite state machine at its core.
In both Figure 5-2 and Figure 5-3, it is possible to integrate the Output Logic block and the Next State Logic block into one single random logic block. However, in order to keep the logic design easy to understand:
Tip
Logic Design Guideline 5-5: For finite state machine design, keep the Next State Logic block separate from the Output Logic block.
As I will show you later in Verilog Coding Guideline 5-3 and 5-4, the Verilog code that models the finite state machine is also easier to read and understand if the Next State Logic block is kept separate from the Output Logic block.
One final word on the Mealy Machine shown in Figure 5-3. The Output Logic's inputs are shown to come from both the Current State and the Inputs. In order to simplify the Output Logic block, it is "logically" equivalent to use some of the Next State bits (i.e. the output of the Next State Logic prior to the State Register) as inputs to the Output Logic block. This is shown in Figure 5-5. This, however, should be done with extreme care.
Tip
Logic Design Guideline 5-6: In a Mealy Machine design, it is possible to use the Next State Logic block's output as inputs to the Output Logic block. This must be done with caution since the total delay of the two logic blocks may become the critical path of the controller.
[Figure 5-5: An Alternate Form of the Mealy State Machine. Same structure as Figure 5-3, with one addition: R bits of the Next State Logic's output, tapped before the State Register (labeled R <= M in the figure), are routed to the Output Logic in addition to the N-bit Current State and the Q input bits (Q <= M).]
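The equivalence behind Figure 5-5, and the reason for the caution in Logic Design Guideline 5-6, can be sketched as follows. This is a hypothetical example (a detector of two consecutive 1s on din); the module and signal names are not from the appendices.

```verilog
// Hypothetical example: the same Mealy output written in the Figure 5-3
// form (current state and input) and in the Figure 5-5 form (next state).
module mealy_alt (out_a, out_b, next_state, cur_state, din);
    output       out_a;
    output       out_b;
    output [1:0] next_state;
    input  [1:0] cur_state;
    input        din;

    reg    [1:0] next_state;

    // Next State Logic.
    always @(cur_state or din) begin
        case (cur_state)
          2'd0:    next_state = din ? 2'd1 : 2'd0;
          2'd1:    next_state = din ? 2'd2 : 2'd0;
          2'd2:    next_state = din ? 2'd2 : 2'd0;
          default: next_state = 2'd0;
        endcase
    end

    // Figure 5-3 form: Output Logic sees the current state and the input.
    assign out_a = ((cur_state == 2'd1) | (cur_state == 2'd2)) & din;

    // Figure 5-5 form: Output Logic reuses the Next State bits. The
    // equation is shorter, but its delay is now the Next State Logic
    // delay PLUS the Output Logic delay.
    assign out_b = (next_state == 2'd2);
endmodule // mealy_alt
```

Both outputs compute the same function of cur_state and din. The second form trades a simpler equation for a longer combinational path, which is why the guideline asks for care.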
Enclosed in Appendix C are two Verilog files illustrating the various controller design guidelines:
trans_defs.v: defines all the symbolic values that apply to all Transport Layer files, see Reference [11].
dtrans_txctl.v: models the controller for the Device Dongle's Transport Layer Transmit Engine, see Reference [12].
First, let's take a look at some interesting observations from trans_defs.v.
Tip
Verilog Coding Guideline 5-1: If one-hot encoding is used for the finite state machine (see Logic Design Guideline 5-3), define a symbolic value for each bit position as well as a symbolic value for the binary value when that bit position is set. This makes the Verilog code much easier to read and understand.
For example, here are some lines from the trans_defs.v file attached in Appendix C (the reader can find the entire definition in Appendix C):
    /*
     * Define the state values and bit position for Device's Transmit Finite
     * State machine (FSM in dtran_txctl). This FSM implements the "transmit"
     * states describes in Section 8.7 (PP. 197-205) of SATA Spec, 1.0.
     */
    `define num_dttxfsm     15
    `define B_DTTXIDLE      0
    `define B_DTCHKTYP      1
    `define B_DTREGFIS      2       // Spec's DT_RegHDFIS
    `define B_DTPIOSTUP     3       // Spec's DT_PIOSTUPFIS
        :
        :
        :
    `define B_DTBISSTA      14

    // Device Dongle's TX FSM State Values
    `define DTTXIDLE        15'h0001
    `define DTCHKTYP        15'h0002
    `define DTREGFIS        15'h0004    // Spec's DT_RegHDFIS
    `define DTPIOSTUP       15'h0008    // Spec's DT_PIOSTUPFIS
        :
        :
        :
    `define DTBISSTA        15'h4000
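To show where each of the two kinds of symbolic values is typically used, here is a minimal sketch with a hypothetical three-state machine (IDLE, RUN, DONE). The full state values appear in the Case statement, while the bit positions are used to test a single state bit. The two uses share one module here only to keep the sketch short.

```verilog
// Hypothetical definitions, following the style of trans_defs.v.
`define num_fsm 3
`define B_IDLE  0
`define B_RUN   1
`define B_DONE  2
`define IDLE    3'b001
`define RUN     3'b010
`define DONE    3'b100

module tiny_fsm_logic (next_state, running, cur_state, start, finish);
    output [`num_fsm-1:0] next_state;
    output                running;
    input  [`num_fsm-1:0] cur_state;
    input                 start;
    input                 finish;

    reg    [`num_fsm-1:0] next_state;

    // The state VALUES name whole states in the Case statement.
    always @(cur_state or start or finish) begin
        case (cur_state)
          `IDLE:   next_state = start  ? `RUN  : `IDLE;
          `RUN:    next_state = finish ? `DONE : `RUN;
          `DONE:   next_state = `IDLE;
          default: next_state = `IDLE;
        endcase
    end

    // The bit POSITIONS pick out one state bit of the one-hot vector.
    assign running = cur_state[`B_RUN];
endmodule // tiny_fsm_logic
```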
Notice that I have used "define" to create these symbolic values:
Tip
Verilog Coding Guideline 5-2: One common convention used by many Verilog code writers is to use "define" for constant values such as:
`define DTTXIDLE 15'h0001
while "parameter" is used ONLY for things that can change, such as the width of registers, muxes ... etc. (see also Section 2):
    /***************************************************************
     * Simple N-bit register with a 1 time-unit clock-to-q time
     ***************************************************************/
    module v_reg( q, c, d );
        parameter n = 1;

        input  [n-1:0] d;
        input          c;
        output [n-1:0] q;

        reg    [n-1:0] state;

        assign #(1) q = state;

        always @(posedge c) begin
            state = d;
        end
    endmodule // v_reg
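As a short sketch of how such a parameter is changed at instantiation time, the following hypothetical wrapper (reg_width_demo is not part of the appendices) instantiates v_reg once with its default width and once with the width overridden to 8. v_reg is repeated so that the example is self-contained.

```verilog
// Register as listed above (1 time-unit clock-to-q).
module v_reg (q, c, d);
    parameter n = 1;
    input  [n-1:0] d;
    input          c;
    output [n-1:0] q;
    reg    [n-1:0] state;
    assign #(1) q = state;
    always @(posedge c) begin
        state = d;
    end
endmodule // v_reg

// Hypothetical wrapper showing the parameter override.
module reg_width_demo (q1, q8, clk, d1, d8);
    output       q1;
    output [7:0] q8;
    input        clk;
    input        d1;
    input  [7:0] d8;

    v_reg      flag_ff (q1, clk, d1);   // default: n = 1
    v_reg #(8) byte_ff (q8, clk, d8);   // n overridden to 8
endmodule // reg_width_demo
```

The same `#(...)` form is what the controller uses for its State Register, where the width comes from the `num_dttxfsm definition.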
Next, let's look at the file dtrans_txctl.v. The main module of this file consists of the following sections, clearly labeled by comments:
    module dtrans_txctl (
        // Outputs
        tp_acksendreg,
            :
        senddata,
        // Inputs
        at_sendreg,
            :
        tptx_reset);

        /*
         * Next State Logic and the State Register for the finite state machine
         */
        // Next State Logic
        dtrans_txfsm dtrans_txfsm ( ...

        // State Register
        v_reg #(`num_dttxfsm) state_ff (cur_state, txclk4x, next_state);

        /*
         * Counter and its MUX tree to select the count limit
         * for the generation of the expire signal
         */

        /*
         * Output Logic for generating output signals
         */
    endmodule // dtrans_txctl
This leads to the following Logic Design and Verilog Coding guidelines.
Tip
Verilog Coding Guideline 5-3: Use an explicit State Register and separate the Next State Logic from this explicit register.
For example in dtrans_txctl.v, we have:
    /*
     * Next State Logic and the State Register for the finite state machine
     */
    // Next State Logic
    dtrans_txfsm dtrans_txfsm (
        // Outputs
        .next_state     (next_state),
        // Inputs
        .cur_state      (cur_state),
        .at_sendreg     (at_sendreg),
        .at_senddmaa    (at_senddmaa),
            :
        .txtimeout      (txtimeout),
        .expire         (expire),
        .tptx_reset     (tptx_reset));

    // State Register
    v_reg #(`num_dttxfsm) state_ff (cur_state, txclk4x, next_state);
The Next State Logic here is implemented in a separate module, "dtrans_txfsm", in dtrans_txctl.v. The module "dtrans_txfsm" has only one output, the "next_state" vector, and contains only one thing: a "Case Statement" enclosed in an "always" block:
Tip
Verilog Coding Guideline 5-4: The Next State Logic, with only ONE output (the "next_state" vector), can be implemented easily with a Verilog Case statement.
    /*************************************************************************
     * Module dtrans_txfsm: Random logic for the transmit finite state machine
     ************************************************************************/
    module dtrans_txfsm (
        // Outputs
        next_state,
        // Inputs
        cur_state,
        at_sendreg,
        at_senddmaa,
            :
        expire,
        tptx_reset);
            :
            :
        always @(cur_state or at_sendreg or at_senddmaa ...
                 /*** List ALL Inputs of this module ***/
                 txtimeout or expire or tptx_reset) begin
            if (tptx_reset) begin
                next_state = `DTTXIDLE;
            end
            else begin
                case (cur_state)
                  `DTTXIDLE:
                    if (~r2t_rxempty) begin
                        /*
                         * Give the receive engine higher priority
                         */
                        next_state = `DTCHKTYP;
                    end
                        :
                    end
                        :
                  `DTBISSTA:
                    if (~lk_txfsmidle & ~txtimeout) begin
                        next_state = `DTBISSTA;
                    end
                        :
                  default: begin
                    // We should never be here
                    next_state = `DTWAITTXID;
                    $display (
                      "*** Warning: Undefined HTP RX State, cur_state = %b ***",
                      cur_state);
                  end
                endcase
            end // End else (tptx_reset == 0)
        end // End always
    endmodule // dtrans_txfsm
Notice that the module "dtrans_txfsm" has ONLY one output, "next_state." This is a very desirable feature when we use the Verilog "Case Statement," because one thing we have to be careful about is that every output MUST have a defined value in each branch of the Case statement. Otherwise, the synthesis tool will generate a latch to keep the old value, which in most cases is NOT what the logic designer intends. Having only one output (the "next_state") for the Next State Logic is one reason why Logic Design Guideline 5-5 encourages you to separate the Next State Logic block from the Output Logic block.
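The latch problem described above is easy to reproduce. The following hypothetical module (latch_demo and its signals are invented for illustration) contains one incomplete Case statement and one complete one.

```verilog
// Hypothetical example: incomplete versus complete Case statement.
module latch_demo (grant_bad, grant_good, sel, req);
    output      grant_bad;
    output      grant_good;
    input [1:0] sel;
    input       req;

    reg grant_bad;
    reg grant_good;

    // Incomplete: nothing is assigned when sel is 2'b10 or 2'b11, so the
    // old value must be held and a synthesis tool infers a latch.
    always @(sel or req) begin
        case (sel)
          2'b00: grant_bad = req;
          2'b01: grant_bad = 1'b0;
        endcase
    end

    // Complete: the default branch gives the output a value on every
    // path, so the result is pure combinational logic.
    always @(sel or req) begin
        case (sel)
          2'b00:   grant_good = req;
          2'b01:   grant_good = 1'b0;
          default: grant_good = 1'b0;
        endcase
    end
endmodule // latch_demo
```

With a single output such as next_state, checking that every branch assigns it is straightforward. With many outputs in the same Case statement, every output has to be checked in every branch.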
In many finite state machine designs, the number of states can be reduced, and the Next State Logic can therefore be simplified, if one takes advantage of the fact that the state machine wants to stay in a certain state for "N cycles" (where N is a fixed integer >= 1), then go to the next state and stay there for another "M cycles" (M is another integer >= 1 but != N) before moving on to another state. One example of this behavior is a DRAM controller, which enters the "Row Address Active" state for a few cycles, then goes to the "Column Address Active" state for a few cycles, before moving on to the "Precharge" state ... etc.
Tip
Logic Design Guideline 5-7: A finite state machine containing states whose transitions to their next states are governed only by the number of cycles they have to wait can be simplified by building a multiplexer tree to select the number of cycles a counter must count before generating an "expire" signal to trigger the state transition.
Logic Design Guideline 5-7 is illustrated by the following Verilog code in dtrans_txctl.v. In a nutshell:
We start the counter (count_enable = 1) when the current state is one of DTREGFIS, DTPIOSTUP, DTXMITBIS, or DTDMASTUP. Since we are using one-hot encoding, we are in one of these states when the corresponding bit in the cur_state register, cur_state[`B_DTREGFIS], cur_state[`B_DTPIOSTUP], cur_state[`B_DTXMITBIS], or cur_state[`B_DTDMASTUP], is set:
    assign count_enable = cur_state[`B_DTREGFIS] | cur_state[`B_DTPIOSTUP] |
                          cur_state[`B_DTXMITBIS] | cur_state[`B_DTDMASTUP];

    v_countN #(`log_maxfis) expire_count (
        .count_out      (wcount),
        .count_enable   (count_enable),
        .clk            (txclk4x),
        .reset          (tptx_reset | expire));
Based on the current state, the multiplexer tree is used to select the number of cycles the counter must count (count_limit) before the state is triggered to transition to the next state:
    /*
     * Counter and its MUX tree to select the count limit
     * for the generation of the expire signal
     */
    v_mux2e #(`log_maxfis) regpio_mux (num_regpio,
        cur_state[`B_DTPIOSTUP], `NDFISREGm1, `NDFISPIOSm1);

    v_mux2e #(`log_maxfis) dmabis_mux (num_dmabis,
        cur_state[`B_DTXMITBIS], `NBFISDMASm1, `NBFISBISTAm1);

    v_mux2e #(`log_maxfis) cntlmt_mux (count_limit,
        (cur_state[`B_DTXMITBIS] | cur_state[`B_DTDMASTUP]),
        num_regpio, num_dmabis);
The number of cycles the counter needs to count for each state is defined in trans_defs.v:
    `define NDFISREGm1      3'd4    // Device-to-Host (D) Register (REG)
    `define NDFISPIOSm1     3'd4    // Device-to-Host (D) PIO Setup (PIOS)
    `define NBFISDMASm1     3'd6    // Bidirectional (B) DMA Setup (DMAS)
    `define NBFISBISTAm1    3'd2    // Bidirectional (B) BIST Activate (BISTA)
Finally, the 3-bit comparator is used to generate the "expire" signal, which is used as an input to the Next State Logic, to trigger the state transition when the counter reaches the "count_limit" selected by the MUX tree in Step 2:
    v_comparator #(`log_maxfis) expire_cmp (count_full, wcount, count_limit);

    assign expire = count_full & count_enable;
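The three steps above can be condensed into one small behavioral sketch. It does not use the library elements v_countN, v_mux2e, and v_comparator, whose definitions are not shown here; the counter behavior, the two timed states SHORT and LONG, and the limits 2 and 6 are all assumptions made for illustration.

```verilog
// Hypothetical one-hot bit positions: bit 0 is an untimed idle state.
`define B_SHORT 1
`define B_LONG  2

module expire_timer (expire, cur_state, clk, reset);
    output       expire;
    input  [2:0] cur_state;
    input        clk;
    input        reset;

    reg  [2:0] count;
    wire       count_enable;
    wire [2:0] count_limit;

    // Step 1: count only while the machine is in a timed state.
    assign count_enable = cur_state[`B_SHORT] | cur_state[`B_LONG];

    // Step 2: MUX selects (number of cycles - 1) for the active state.
    assign count_limit = cur_state[`B_LONG] ? 3'd6 : 3'd2;

    // Counter: cleared by reset or by expire, advances while enabled.
    always @(posedge clk) begin
        if (reset | expire)
            count <= 3'd0;
        else if (count_enable)
            count <= count + 3'd1;
    end

    // Step 3: comparator raises expire when the limit is reached.
    assign expire = count_enable & (count == count_limit);
endmodule // expire_timer
```

The Next State Logic then only has to look at expire to leave a timed state, instead of needing one state per cycle of waiting.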
The last part of the Verilog code in dtrans_txctl.v:
    /*
     * Random logic for generating output signals
     */
    assign tp_acksendreg = cur_state[`B_DTREGFIS];
        :
    assign tp_acksenddata = cur_state[`B_DTDATAFIS];

    assign tp_sendndfis = cur_state[`B_DTREGFIS] | cur_state[`B_DTPIOSTUP] |
                          cur_state[`B_DTDMASTUP] | cur_state[`B_DTDMAACT] |
                          cur_state[`B_DTXMITBIS];
shows how the output logic and the "glue logic" (see Item 3 of Figure 5-1) can be implemented with simple "assign" statements.
Tip
Verilog Coding Guideline 5-5: With the more complex Next State Logic already taken care of by the "Case Statement" (see Verilog Coding Guideline 5-4) and with the help of one-hot encoding for the state machine, the Output Logic can usually be implemented easily with simple assign statements.
If you look at the Verilog files in Appendix A, Appendix B, and Appendix C, you will notice that all the Verilog files have a very similar format.
Tip
Verilog Coding Guideline 6-1: In order to keep the Verilog files easy to read and easy to understand for every member of the design team, adopt a standard format and use the same format for all Verilog files.
For example, the link_txdp.v file in Appendix B follows this format:
    module module_name (
        // Bi-directional ports (if any)
        bi_port1,           //*** First list the inout ports (if any)
        bi_port2,           //*** List one port per line
        // Output ports
        o_port3,            //*** Then list the output ports
        o_port4,
        // Input ports
        i_port5);           //*** Finally, list the input ports

        /*
         * Declare all bi-directional ports
         */
        inout   bi_port1;   //*** Declare one port per line
        inout   bi_port2;

        /*
         * Declare all output ports
         */
        output  o_port3;
        output  o_port4;

        /*
         * Declare all input ports
         */
        input   i_port5;

        /*
         * After all ports are declared, declare all the wires
         */
        wire    wire1;      //** Declare one wire per line
        wire    wire2;

        /*
         * Declare all registers (if any)
         */
        reg     reg1;       //** Declare one register per line
        reg     reg2;

        /*
         * Core of the Verilog code
         */
    endmodule
Notice that in link_txdp.v file in Appendix B, when the module "l_scramble" is instantiated, explicit connection (example: .reset (lktx_reset)) is used:
    l_scramble scrambler (
        .scr_out    (scr_out),
        .scr_in     (32'hc2d2768d),
        .scr_init   (txscr_init),
        .scr_run    (txscr_run),
        .clk        (txclk4x),
        .reset      (lktx_reset));
Tip
Verilog Coding Guideline 6-2: In order to avoid confusion on which wire is connected to which port, use explicit connection (example: .port_name (wire)) when a module is instantiated.
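For comparison, here is the same instantiation written both ways. This is a minimal sketch: v_mux2 is a hypothetical two-input multiplexer defined here only for the example, and connect_demo is an invented wrapper.

```verilog
// Hypothetical n-bit 2-to-1 multiplexer (zero delay).
module v_mux2 (out, sel, in0, in1);
    parameter n = 1;
    output [n-1:0] out;
    input          sel;
    input  [n-1:0] in0;
    input  [n-1:0] in1;

    assign out = sel ? in1 : in0;
endmodule // v_mux2

module connect_demo (y_pos, y_named, sel, a, b);
    output [3:0] y_pos;
    output [3:0] y_named;
    input        sel;
    input  [3:0] a;
    input  [3:0] b;

    // Positional connection: correct only if the reader (and the
    // writer) remembers the exact port order of v_mux2.
    v_mux2 #(4) mux_pos (y_pos, sel, a, b);

    // Explicit connection: each wire is tied to a named port, so the
    // order does not matter and a swap of a and b is easy to spot.
    v_mux2 #(4) mux_named (
        .out    (y_named),
        .sel    (sel),
        .in0    (a),
        .in1    (b));
endmodule // connect_demo
```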
The l_scramble module is defined in the file link_library.v, which is also included in Appendix B. Notice the detailed comment in this module:
    /*                                   Priority:
     *           scram scr_out           -------------------------------
     *               |   |               reset (asynchronous): highest
     *           +---v---v---+           scr_init (synchronous): middle
     *  scr_run-->\S 1   0  /            scr_run (synchronous): lowest
     *             +---+---+ scr_in
     *                 |       |
     *             +---v-------v---+
     *              \  0       1 S/<--scr_init (higher priority than scr_run)
     *               +-----+-----+
     *                     |
     *                     v
     *                  lastmux
     */
Tip
Verilog Coding Guideline 6-3: In order to keep the Verilog code easy to understand for everyone (including yourself :-), use detailed comments. More importantly, put in the comments as you do the coding because if you do not put in the comments now, it is unlikely you will put them in later.
Finally, one may notice the absence of "timescale" statements in all of the files that model the high level modules (Appendix A), the datapath (Appendix B), and the controller (Appendix C). The reason is that there is no need to have any timescale statements in the Verilog code if Verilog Coding Guideline 2-2 is followed:
Tip
Verilog Coding Guideline 2-2: Only the storage elements (examples: register and latch) have non-zero clock-to-q time. All combinational logic (example: mux) has zero delay.
More specifically, as shown in Section 2, the v_reg and v_latch each have a "1 time unit" clock-to-q delay. This clock-to-q delay is the ONLY delay we have in our Verilog code. Consequently, our Verilog code will work no matter what time scale this time unit is set to (i.e. it can be set to 1ps, 1ns, 1ms, ... etc.). The only time we need to have a timescale statement is when we want to run simulation on our Verilog model.
Tip
Verilog Coding Guideline 6-4: Ideally, there should not be any "timescale" directive in any of the Verilog files that model the hardware (because it is not needed if we follow Verilog Coding Guideline 2-2). Consequently, there should be ONE and only ONE timescale directive in any Verilog simulation run, and that timescale directive should be placed at the beginning of the test bench file (see Reference [13]).
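A minimal sketch of this arrangement is shown below. The test bench name tb_top, the 1ns / 1ps setting, and the stimulus are hypothetical. The two parts would normally be two separate files; they are shown together only so that the example is self-contained.

```verilog
// ---- Test bench file (hypothetical name: tb_top.v) ----
`timescale 1ns / 1ps        // the one and only timescale directive

module tb_top;
    reg        clk;
    reg  [3:0] d;
    wire [3:0] q;

    v_reg #(4) dut (q, clk, d);

    initial clk = 0;
    always #5 clk = ~clk;   // 10 time units per period, i.e. 10 ns here

    initial begin
        d = 4'h3;
        #20 d = 4'hA;
        #20 $finish;
    end

    // Report every change on the register output.
    always @(q)
        $display("t=%0t q=%h", $time, q);
endmodule // tb_top

// ---- Library file: models hardware, carries NO timescale directive ----
module v_reg (q, c, d);
    parameter n = 1;
    input  [n-1:0] d;
    input          c;
    output [n-1:0] q;
    reg    [n-1:0] state;
    assign #(1) q = state;  // "1 time unit", whatever the test bench picks
    always @(posedge c) begin
        state = d;
    end
endmodule // v_reg
```

Changing the directive in the test bench rescales the whole simulation, and the hardware model does not need to be touched.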
Below is a summary of all the logic design guidelines:
Tip
Logic Design Guideline 2-1 (MOST IMPORTANT): The design MUST be as simple as possible and easy to understand!
Tip
Logic Design Guideline 3-1: Use a hierarchical strategy that breaks the design into modules that consist of datapaths and controllers. More specifically:
Divide the problem into multiple modules with clean and well defined interfaces.
Tip
Logic Design Guideline 3-2: Keep different clock domains separate and have an explicit synchronization module for signals that cross the clock domain.
Tip
Logic Design Guideline 4-1: The best way to study the effect of the datapath's pipeline registers is to draw a timing diagram showing each register's effect on its outputs with respect to rising or falling edge of the register's input clock.
Tip
Logic Design Guideline 4-2: The block diagram of the datapath should show ALL registers, including the implicit register of the Sequential Datapath Element.
Tip
Logic Design Guideline 4-3: While designing the Sequential Datapath Elements, separate each element into two parts: (1) the combinational logic, and (2) the register.
Tip
Logic Design Guideline 5-1: The best way to decide when and where to use pipeline register or registers to stage the controller inputs and outputs is to draw a timing diagram showing each register's effect on its outputs with respect to rising or falling edge of the register's input clock.
Tip
Logic Design Guideline 5-2: The block diagram of the controller should show ALL registers explicitly while the random logic can be represented by a simple black box.
Tip
Logic Design Guideline 5-3: If possible, use one-hot encoding for the finite state machine's state encoding to simplify the Output Logic as well as the Next State Logic.
Tip
Logic Design Guideline 5-4: Instead of designing a controller with a giant and complex finite state machine at its core, it may be easier to break the controller into multiple smaller controllers, each with a smaller and simpler finite state machine at its core.
Tip
Logic Design Guideline 5-5: For finite state machine design, keep the Next State Logic block separate from the Output Logic block.
Tip
Logic Design Guideline 5-6: In a Mealy Machine design, it is possible to use the Next State Logic block's output as inputs to the Output Logic block. This must be done with caution since the total delay of the two logic blocks may become the critical path of the controller.
Tip
Logic Design Guideline 5-7: A finite state machine containing states whose transitions to their next states are governed only by the number of cycles they have to wait can be simplified by building a MUX tree to select the number of cycles a counter must count before generating an "expire" signal to trigger the state transition.
Below is a summary of all the Verilog coding guidelines:
Tip
Verilog Coding Guideline 2-1: Model all the standard logic elements in a library file to be SHARED by ALL engineers in the design team.
Tip
Verilog Coding Guideline 2-2: Only the storage elements (examples: register and latch) have non-zero clock-to-q time. All combinational logic (example: mux) has zero delay.
Tip
Verilog Coding Guideline 2-3: Use explicit registers and latches (example: v_reg and v_latch as shown in Section 2) in your Verilog coding. Do not rely on logic synthesis tools to generate latches or registers for you.
Tip
Verilog Coding Guideline 3-1: A separate Verilog file is assigned to the Verilog code for:
Tip
Verilog Coding Guideline 3-2: In order to keep the number of Verilog files under control, one should try not to assign a separate Verilog file to any low level module that is at a hierarchy level lower than the datapath and the controller.
Tip
Verilog Coding Guideline 3-3: The Verilog code for a high level module, that is, a module at a hierarchy level higher than the datapath and the controller (examples: module dtrans_tx, module dtrans_rx, and module dtrans), should not contain any logic. It should only show how the lower level modules are connected.
Tip
Verilog Coding Guideline 4-1: Keep the Verilog coding of the datapath simple and straightforward. Leave the fancy coding (IF any) to the datapath elements and place such elements in a separate (library) file.
Tip
Verilog Coding Guideline 4-2: The Verilog coding of the datapath elements should make use of the standard logic elements (registers, multiplexers, ... etc.) already defined in the library discussed in Verilog Coding Guideline 2-1.
Tip
Verilog Coding Guideline 4-3: Define symbolic values (see also Verilog Coding Guideline 5-2) in a header file (example: link_defs.v) and include this header file in all files that can make use of these symbolic values to make the Verilog code easier to maintain and easier to understand.
Tip
Verilog Coding Guideline 5-1: If one-hot encoding is used for the finite state machine (see Logic Design Guideline 5-3), define a symbolic value for each bit position as well as a symbolic value for the binary value when that bit position is set. This makes the Verilog code much easier to read and understand.
Tip
Verilog Coding Guideline 5-2: One common convention used by many Verilog code writers is to use "define" for constant values such as:
`define DTTXIDLE 15'h0001
while "parameter" is used ONLY for things that can change, such as the width of registers, muxes ... etc. (see also Section 2).
Tip
Verilog Coding Guideline 5-3: Use an explicit State Register and separate the Next State Logic from this explicit register.
Tip
Verilog Coding Guideline 5-4: The Next State Logic, with only ONE output (the "next_state" vector), can be implemented easily with a Verilog Case statement.
Tip
Verilog Coding Guideline 5-5: With the more complex Next State Logic already taken care of by the "Case Statement" (see Verilog Coding Guideline 5-4) and with the help of one-hot encoding for the state machine, the Output Logic can usually be implemented easily with simple assign statements.
Tip
Verilog Coding Guideline 6-1: In order to keep the Verilog files easy to read and easy to understand for every member of the design team, adopt a standard format and use the same format for all Verilog files.
Tip
Verilog Coding Guideline 6-2: In order to avoid confusion on which wire is connected to which port, use explicit connection (example: .port_name (wire)) when a module is instantiated.
Tip
Verilog Coding Guideline 6-3: In order to keep the Verilog code easy to understand for everyone (including yourself :-), use detailed comments. More importantly, put in the comments as you do the coding because if you do not put in the comments now, it is unlikely you will put them in later.
Tip
Verilog Coding Guideline 6-4: Ideally, there should not be any "timescale" directive in any of the Verilog files that model the hardware (because it is not needed if we follow Verilog Coding Guideline 2-2). Consequently, there should be ONE and only ONE timescale directive in any Verilog simulation run, and that timescale directive should be placed at the beginning of the test bench file (see Reference [13]).
With all these logic design and Verilog coding guidelines, does this mean there is no room for the logic designer to be creative? Not at all. Artists such as movie directors and music composers need to follow many guidelines, and yet nobody can say they are not doing creative work. They just spend their creativity on tasks that require creativity and follow the standard guidelines (such as a movie should be approximately 2 hours long) when creativity is not needed. Logic design is the same: be creative on tasks that truly deserve innovation (such as how to build a datapath that can process data at half the power) but not on tasks such as how to write a complex Verilog statement that saves a few lines of Verilog code but that nobody else can understand.
The ultimate goal for any logic designer is to keep his or her design, and the Verilog code that models the design, AS EASY TO UNDERSTAND AS POSSIBLE. Remember this: the easier it is for other people to understand your design and your Verilog code, the more people can help you in your work, and the less likely it is that your vacation will be interrupted by late night phone calls from the coworker covering for you :-) So make your design easy to understand :-)
[1] Private communications, October 2001.
/home/kong/P2001/Verilog/DeviceDongle/Transport/dtrans.v /home/kong/P2001/Verilog/DeviceDongle/Transport/dtrans_tx.v /home/kong/P2001/Verilog/DeviceDongle/Transport/dtrans_txdp.v /home/kong/P2001/Verilog/DeviceDongle/Transport/dtrans_txctl.v
/home/kong/P2001/Verilog/DeviceDongle/Transport/dtrans_rx.v /home/kong/P2001/Verilog/DeviceDongle/Transport/dtrans_rxdp.v /home/kong/P2001/Verilog/DeviceDongle/Transport/dtrans_rxctl.v
/home/kong/P2001/Verilog/DeviceDongle/Link/link.v /home/kong/P2001/Verilog/DeviceDongle/Link/link_tx.v /home/kong/P2001/Verilog/DeviceDongle/Link/link_txdp.v /home/kong/P2001/Verilog/DeviceDongle/Link/link_txctl.v
/home/kong/P2001/Verilog/DeviceDongle/Link/link_rx.v /home/kong/P2001/Verilog/DeviceDongle/Link/link_rxdp.v /home/kong/P2001/Verilog/DeviceDongle/Link/link_rxctl.v
/home/kong/P2001/Verilog/CommonFiles/link_defs.v
/home/kong/P2001/Verilog/CommonFiles/trans_defs.v
Note: The file trans_defs.v is placed in the "CommonFiles" directory because it is used by all Transport Layer files.
------------------------ That's all for now folks :-) ------------------------
/**************************************************************************** * * File Name: dtrans.v * * Comment: Device Dongle Transport Layer * * Author: Shing Kong * Creation Date: 5/9/2001 * * $Source: /proj/gemini/cvs_root/P2002/Notes/Style/appendixA,v $ * $Date: 2001/12/06 21:49:07 $ * $Revision: 1.1 $ * *=========================================================================== * Copyright (c) 2001 by Shing Ip Kong. All Rights Reserved. ****************************************************************************/ /* * $Id: appendixA,v 1.1 2001/12/06 21:49:07 kong Exp $ */ module dtrans ( // Transmit Engine's Outputs tp_acksendreg, tp_acksenddmaa, tp_acksendpios, tp_acksenddmas, tp_acksendbist, tp_acksenddata, tp_txdata, tp_txgoempty, tp_txempty, tp_sendndfis, tp_senddafis, tp_partial, tp_slumber, tp_spdsel, // Transmit Engine's Inputs at_data, at_error, at_seccnt, at_secnum, at_cyllow, at_cylhi, at_devhd, at_status, at_interrupt, at_sendreg, at_senddmaa, at_sendpios, at_senddmas, at_sendbista, at_senddata, lk_txfsmidle, lk_rdtxfifo, lk_txerror, txclk, txclk4x, // Receive Engine's Output tp_datain, tp_featurein, tp_seccntin, tp_secnumin, tp_cyllowin, tp_cylhiin, tp_devhdin, tp_cmdin, tp_devctlin, tp_cbit, tp_wrdata, tp_wrATAreg, tp_rxnearfull, tp_rxfull, tp_rxdabort, tp_fisgood, tp_fisundef, // Receive Engine's Input at_rxdabort, lk_rxdata, lk_wrrxfifo, lk_eofis, rxclk, rxclk4x, // Input for both the Transmit and Receive Engines at_reset); /* * Transmit Engine's Outputs to the Parallel ATA Interface (dataif.v) */ output tp_acksendreg; output tp_acksenddmaa; output tp_acksendpios; output tp_acksenddmas; output tp_acksendbist; output tp_acksenddata; /* * Transmit Engine's Outputs to the Link Layer (link.v) */ output [31:0] tp_txdata; output tp_txgoempty; output tp_txempty; output tp_sendndfis; // Sending a non-data FIS output tp_senddafis; // Sending a data FIS output tp_partial; output tp_slumber; output tp_spdsel; /* * Transmit Engine's Inputs from the 
Parallel ATA Interface (dataif.v) */ input [15:0] at_data; input [7:0] at_error; input [7:0] at_seccnt; input [7:0] at_secnum; input [7:0] at_cyllow; input [7:0] at_cylhi; input [7:0] at_devhd; input [7:0] at_status; input at_interrupt; input at_sendreg; input at_senddmaa; input at_sendpios; input at_senddmas; input at_sendbista; input at_senddata; /* * Transmit Engine's Inputs from the Link Layer (link.v) */ input lk_txfsmidle; // TX FSM has returned to IDLE input lk_rdtxfifo; input lk_txerror; /* * Transmit Engine's Clocks */ input txclk; input txclk4x; /* * Receive Engine's Outputs to the Parallel ATA Interface (dataif.v) */ // These signals will not be synchronized. Instead we will // synchronize the "write strobe" at "dataif"--see below output [15:0] tp_datain; output [7:0] tp_featurein; output [7:0] tp_seccntin; output [7:0] tp_secnumin; output [7:0] tp_cyllowin; output [7:0] tp_cylhiin; output [7:0] tp_devhdin; output [7:0] tp_cmdin; output [2:1] tp_devctlin; output tp_cbit; // These signals will be synchronized at "dataif" output tp_wrdata; output tp_wrATAreg; // Write ATA registers (except Data) /* * Receive Engine's Outputs to the Link Layer (link.v) */ output tp_rxnearfull; output tp_rxfull; output tp_rxdabort; // Data RX has been aborted output tp_fisgood; // Receive a valid FIS output tp_fisundef; // FIS not recognized by Transport /* * Receive Engine's Inputs from the Parallel ATA Interface (dataif.v) * These signals need to be synchronized w.r.t. 
 * the rxclk4x clock
 */
input at_rxdabort;          // ATA interface aborts data receiving

/*
 * Receive Engine's Inputs from the Link Layer (link.v)
 */
input [31:0] lk_rxdata;
input lk_wrrxfifo;
input lk_eofis;             // Link layer finished writing the FIS

/*
 * Receive Engine's Clocks
 */
input rxclk;
input rxclk4x;

/*
 * Inputs to both the Transmit and Receive Engines
 */
input at_reset;

/*
 * Interconnections within this module
 */
// Outputs of the Transmit Engine (dtrans_tx.v)
wire txfsmidle;
wire txokrxgo;              // TX FSM gives RX FSM the OK

// Outputs of the Receive Engine (dtrans_rx.v)
wire waittxid;
wire rxempty;

/*
 * Device Dongle Transport Layer Transmit Engine (dtrans_tx.v)
 */
dtrans_tx dtrans_tx (
    // Outputs
    .tp_acksendreg (tp_acksendreg),
    .tp_acksenddmaa (tp_acksenddmaa),
    .tp_acksendpios (tp_acksendpios),
    .tp_acksenddmas (tp_acksenddmas),
    .tp_acksendbist (tp_acksendbist),
    .tp_acksenddata (tp_acksenddata),
    .tp_txdata (tp_txdata),
    .tp_txgoempty (tp_txgoempty),
    .tp_txempty (tp_txempty),
    .tp_sendndfis (tp_sendndfis),
    .tp_senddafis (tp_senddafis),
    .tp_partial (tp_partial),
    .tp_slumber (tp_slumber),
    .tp_spdsel (tp_spdsel),
    .txfsmidle (txfsmidle),
    .txokrxgo (txokrxgo),
    // Inputs
    .at_data (at_data),
    .at_error (at_error),
    .at_seccnt (at_seccnt),
    .at_secnum (at_secnum),
    .at_cyllow (at_cyllow),
    .at_cylhi (at_cylhi),
    .at_devhd (at_devhd),
    .at_status (at_status),
    .at_interrupt (at_interrupt),
    .at_sendreg (at_sendreg),
    .at_senddmaa (at_senddmaa),
    .at_sendpios (at_sendpios),
    .at_senddmas (at_senddmas),
    .at_sendbista (at_sendbista),
    .at_senddata (at_senddata),
    .lk_txfsmidle (lk_txfsmidle),
    .lk_rdtxfifo (lk_rdtxfifo),
    .lk_txerror (lk_txerror),
    .waittxid (waittxid),
    .rxempty (rxempty),
    .txclk (txclk),
    .txclk4x (txclk4x),
    .at_reset (at_reset));

/*
 * Device Dongle Transport Layer Receive Engine (dtrans_rx.v)
 */
dtrans_rx dtrans_rx (
    // Outputs
    .tp_datain (tp_datain),
    .tp_featurein (tp_featurein),
    .tp_seccntin (tp_seccntin),
    .tp_secnumin (tp_secnumin),
    .tp_cyllowin (tp_cyllowin),
    .tp_cylhiin (tp_cylhiin),
    .tp_devhdin (tp_devhdin),
    .tp_cmdin (tp_cmdin),
    .tp_devctlin (tp_devctlin),
    .tp_cbit (tp_cbit),
    .tp_wrdata (tp_wrdata),
    .tp_wrATAreg (tp_wrATAreg),
    .tp_rxnearfull (tp_rxnearfull),
    .tp_rxfull (tp_rxfull),
    .tp_rxdabort (tp_rxdabort),
    .tp_fisgood (tp_fisgood),
    .tp_fisundef (tp_fisundef),
    .waittxid (waittxid),
    .rxempty (rxempty),
    // Inputs
    .at_rxdabort (at_rxdabort),
    .lk_rxdata (lk_rxdata),
    .lk_wrrxfifo (lk_wrrxfifo),
    .lk_eofis (lk_eofis),
    .txfsmidle (txfsmidle),
    .txokrxgo (txokrxgo),
    .rxclk (rxclk),
    .rxclk4x (rxclk4x),
    .at_reset (at_reset));

endmodule // dtrans
/**************************************************************************** * * File Name: dtrans_tx.v * * Comment: Device Dongle Transport Layer Transmission Engine * * Author: Shing Kong * Creation Date: 3/25/2001 * * $Source: /proj/gemini/cvs_root/P2002/Notes/Style/appendixA,v $ * $Date: 2001/12/06 21:49:07 $ * $Revision: 1.1 $ * *=========================================================================== * Copyright (c) 2001 by Shing Ip Kong. All Rights Reserved. ****************************************************************************/ /* * $Id: appendixA,v 1.1 2001/12/06 21:49:07 kong Exp $ */ `include "trans_defs.v" // See ../../CommonFiles module dtrans_tx ( // Outputs tp_acksendreg, tp_acksenddmaa, tp_acksendpios, tp_acksenddmas, tp_acksendbist, tp_acksenddata, tp_txdata, tp_txgoempty, tp_txempty, tp_sendndfis, tp_senddafis, tp_partial, tp_slumber, tp_spdsel, txfsmidle, txokrxgo, // Inputs at_data, at_error, at_seccnt, at_secnum, at_cyllow, at_cylhi, at_devhd, at_status, at_interrupt, at_sendreg, at_senddmaa, at_sendpios, at_senddmas, at_sendbista, at_senddata, lk_txfsmidle, lk_rdtxfifo, lk_txerror, waittxid, rxempty, txclk, txclk4x, at_reset); /* * Outputs to the Parallel ATA Interface Layer (dataif.v) */ output tp_acksendreg; output tp_acksenddmaa; output tp_acksendpios; output tp_acksenddmas; output tp_acksendbist; output tp_acksenddata; /* * Outputs to the Link Layer (link.v) */ output [31:0] tp_txdata; output tp_txgoempty; output tp_txempty; output tp_sendndfis; // Sending a non-data FIS output tp_senddafis; // Sending a data FIS output tp_partial; output tp_slumber; output tp_spdsel; /* * Outputs to the Transport Layer Receive Engine (dtrans_rx.v) */ output txfsmidle; output txokrxgo; // TX FSM gives RX FSM the OK /* * Inputs from the Parallel ATA Interface (dataif.v) */ input [15:0] at_data; input [7:0] at_error; input [7:0] at_seccnt; input [7:0] at_secnum; input [7:0] at_cyllow; input [7:0] at_cylhi; input [7:0] at_devhd; input [7:0] 
at_status; input at_interrupt; input at_sendreg; input at_senddmaa; input at_sendpios; input at_senddmas; input at_sendbista; input at_senddata; /* * Inputs from the Link Layer (link.v) */ input lk_txfsmidle; // TX FSM has returned to IDLE input lk_rdtxfifo; input lk_txerror; /* * Inputs from the Transport Layer Receive Engine (dtrans_rx.v) */ input waittxid; input rxempty; /* * Reset signal and clocks */ input txclk; input txclk4x; input at_reset; /* * Interconnections within this module */ // Outputs of the Synchronizer wire tptx_reset; wire r2t_waittxid; wire r2t_rxempty; // Outputs of the Transport Layer Transmit Controller (dtrans_txctl.v) wire [`log_maxfis-1:0] wcount; wire wrtxfifo; wire sendreg; wire senddmaa; wire sendpios; wire senddmas; wire sendbista; wire senddata; // Outputs of the Transport Layer Transmit Datapath (dtrans_txdp.v) wire txfull; wire txtimeout; /* * Synchronizer that synchronizes the signals to the txclk4x domain */ dtrans_txsyn dtrans_txsyn ( .tptx_reset (tptx_reset), .r2t_waittxid (r2t_waittxid), .r2t_rxempty (r2t_rxempty), .waittxid (waittxid), .rxempty (rxempty), .txclk (txclk), .txclk4x (txclk4x), .at_reset (at_reset)); /* * Device Transport Layer Transmit Controller (dtrans_txctl.v) */ dtrans_txctl dtrans_txctl ( // Outputs .tp_acksendreg (tp_acksendreg), .tp_acksenddmaa (tp_acksenddmaa), .tp_acksendpios (tp_acksendpios), .tp_acksenddmas (tp_acksenddmas), .tp_acksendbist (tp_acksendbist), .tp_acksenddata (tp_acksenddata), .tp_sendndfis (tp_sendndfis), .tp_senddafis (tp_senddafis), .tp_partial (tp_partial), .tp_slumber (tp_slumber), .tp_spdsel (tp_spdsel), .txfsmidle (txfsmidle), .txokrxgo (txokrxgo), .wcount (wcount), .wrtxfifo (wrtxfifo), .sendreg (sendreg), .senddmaa (senddmaa), .sendpios (sendpios), .senddmas (senddmas), .sendbista (sendbista), .senddata (senddata), // Inputs .at_sendreg (at_sendreg), .at_senddmaa (at_senddmaa), .at_sendpios (at_sendpios), .at_senddmas (at_senddmas), .at_sendbista (at_sendbista), .at_senddata 
(at_senddata), .lk_txfsmidle (lk_txfsmidle), .lk_txerror (lk_txerror), .r2t_waittxid (r2t_waittxid), .r2t_rxempty (r2t_rxempty), .txfull (txfull), .txtimeout (txtimeout), .txclk4x (txclk4x), .tptx_reset (tptx_reset)); /* * Device Transport Layer Transmit Datapath (dtrans_txdp.v) */ dtrans_txdp dtrans_txdp ( // Outputs .tp_txdata (tp_txdata), .tp_txgoempty (tp_txgoempty), .tp_txempty (tp_txempty), .txfull (txfull), .txtimeout (txtimeout), // Inputs .at_data (at_data), .at_error (at_error), .at_seccnt (at_seccnt), .at_secnum (at_secnum), .at_cyllow (at_cyllow), .at_cylhi (at_cylhi), .at_devhd (at_devhd), .at_status (at_status), .at_interrupt (at_interrupt), .lk_rdtxfifo (lk_rdtxfifo), .wcount (wcount), .wrtxfifo (wrtxfifo), .sendreg (sendreg), .senddmaa (senddmaa), .sendpios (sendpios), .senddmas (senddmas), .sendbista (sendbista), .senddata (senddata), .txclk4x (txclk4x), .tptx_reset (tptx_reset)); endmodule // dtrans_tx /**************************************************************************** * Module dtrans_txsyn: Synchronize signals for the txclk4x clock domain ****************************************************************************/ module dtrans_txsyn ( // Outputs tptx_reset, r2t_waittxid, r2t_rxempty, // Inputs waittxid, rxempty, txclk, txclk4x, at_reset); output tptx_reset; output r2t_waittxid; // RX FSM waiting TX FSM to be idle output r2t_rxempty; // RX FIFO is empty input waittxid; input rxempty; input txclk; input txclk4x; input at_reset; /* * Connection within this module */ wire txclk_waittxid; wire txclk_rxempty; /* * Register the reset signal before using it locally */ v_reg #(1) reset_ff (tptx_reset, txclk4x, at_reset); /* * For signals coming from the rxclk or rxclk4x domain, * Step 1: Sample it with the txclk clock. * Step 2: Sample it again with the txclk4x clock. * * Note: we assume the rising edges of txclk and txclk4x are aligned. 
 * Consequently, sampling a signal with txclk followed by sampling it
 * with txclk4x has the same effect as sampling a signal twice with
 * txclk as far as the prevention of meta-stability is concerned.
 */
v_reg #(1) txclk_ff0 (txclk_waittxid, txclk, waittxid);
v_reg #(1) txclk_ff1 (txclk_rxempty, txclk, rxempty);

v_reg #(1) syn_ff0 (r2t_waittxid, txclk4x, txclk_waittxid);
v_reg #(1) syn_ff1 (r2t_rxempty, txclk4x, txclk_rxempty);

endmodule // dtrans_txsyn
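The two sampling steps described in the dtrans_txsyn comment can also be written with plain always blocks. The following is only a minimal sketch. It assumes that `v_reg #(W) name (q, clk, d)` is a simple rising-edge D register, which is inferred from how it is instantiated above (its definition is not part of this listing). The module name `sync2_sketch` and its port names are hypothetical and do not belong to the original design.

```verilog
// Hypothetical stand-alone sketch; not one of the original design files.
// Assumption: v_reg (q, clk, d) behaves as a plain rising-edge D register.
module sync2_sketch (q, clk1x, clk4x, d);
    output q;       // d after two sampling stages
    input  clk1x;   // first sampling clock (txclk in dtrans_txsyn)
    input  clk4x;   // second sampling clock (txclk4x in dtrans_txsyn)
    input  d;       // signal arriving from another clock domain

    reg stage1;     // Step 1: sample with the 1x clock
    reg stage2;     // Step 2: sample again with the 4x clock

    always @(posedge clk1x)
        stage1 <= d;

    always @(posedge clk4x)
        stage2 <= stage1;

    assign q = stage2;
endmodule
```

As in the original, a change on `d` reaches `q` only after one rising edge of the 1x clock followed by one rising edge of the 4x clock.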
/**************************************************************************** * * File Name: dtrans_rx.v * * Comment: Device Dongle Transport Layer Receive Engine * * Author: Shing Kong * Creation Date: 5/3/2001 * * $Source: /proj/gemini/cvs_root/P2002/Notes/Style/appendixA,v $ * $Date: 2001/12/06 21:49:07 $ * $Revision: 1.1 $ * *=========================================================================== * Copyright (c) 2001 by Shing Ip Kong. All Rights Reserved. ****************************************************************************/ /* * $Id: appendixA,v 1.1 2001/12/06 21:49:07 kong Exp $ */ module dtrans_rx ( // Outputs tp_datain, tp_featurein, tp_seccntin, tp_secnumin, tp_cyllowin, tp_cylhiin, tp_devhdin, tp_cmdin, tp_devctlin, tp_cbit, tp_wrdata, tp_wrATAreg, tp_rxnearfull, tp_rxfull, tp_rxdabort, tp_fisgood, tp_fisundef, waittxid, rxempty, // Inputs at_rxdabort, lk_rxdata, lk_wrrxfifo, lk_eofis, txfsmidle, txokrxgo, rxclk, rxclk4x, at_reset); /* * Outputs to the Parallel ATA Interface Layer (dataif.v) */ // These signals will not be synchronized. Instead we will // synchronize the "write strobe" at "dataif"--see below output [15:0] tp_datain; output [7:0] tp_featurein; output [7:0] tp_seccntin; output [7:0] tp_secnumin; output [7:0] tp_cyllowin; output [7:0] tp_cylhiin; output [7:0] tp_devhdin; output [7:0] tp_cmdin; output [2:1] tp_devctlin; output tp_cbit; // These signals will be synchronized at "dataif" output tp_wrdata; output tp_wrATAreg; // Write ATA registers (except Data) /* * Outputs to the Link Layer (link.v) */ output tp_rxnearfull; output tp_rxfull; output tp_rxdabort; // Data RX has been aborted output tp_fisgood; // Receive a valid FIS output tp_fisundef; // FIS not recognized by Transport /* * Output to the Transport Layer Transmit Engine (dtrans_tx.v) */ output waittxid; // RX FSM waiting TX FSM to be idle output rxempty; /* * Inputs from the Parallel ATA Interface Layer (dataif.v) * These signals need to be synchronized w.r.t. 
 * the rxclk4x clock
 */
input at_rxdabort;          // ATA interface aborts data receiving

/*
 * Inputs from the Link Layer (link.v)
 */
input [31:0] lk_rxdata;
input lk_wrrxfifo;
input lk_eofis;             // Link layer finished writing the FIS

/*
 * Inputs from the Transport Layer Transmit Engine (dtrans_tx.v)
 */
input txfsmidle;
input txokrxgo;             // TX FSM gives RX FSM the OK

/*
 * Miscellaneous Inputs
 */
input rxclk;
input rxclk4x;
input at_reset;

/*
 * Connections within this module
 */
// Outputs of the Synchronizer
wire tprx_reset;
wire t2r_rxdabort;          // Abort receiving data
wire t2r_txfsmidle;
wire t2r_txokrxgo;          // TX FSM gives RX FSM the OK

// Outputs of the Transport Layer Receive Controller (dtrans_rxctl.v)
wire incomefis;             // Receiving FIS
wire rdrxfifo;
wire hld_feature;
wire hld_seccnt;
wire hld_secnum;
wire hld_cyllow;
wire hld_cylhi;
wire hld_devhd;
wire hld_cmd;
wire hld_devctl;
wire hld_cbit;
wire upperdata;             // Use bits[31:16] of the Data FIS

// Outputs of the Transport Layer Receive Datapath (dtrans_rxdp.v)
wire rxtimeout;
wire hregfis;
wire dmasfis;
wire bisafis;
wire datafis;

/*
 * Synchronizer that synchronizes the signals to the rxclk4x domain
 */
dtrans_rxsyn dtrans_rxsyn (
    // Outputs
    .tprx_reset (tprx_reset),
    .t2r_rxdabort (t2r_rxdabort),
    .t2r_txfsmidle (t2r_txfsmidle),
    .t2r_txokrxgo (t2r_txokrxgo),
    // Inputs
    .at_reset (at_reset),
    .at_rxdabort (at_rxdabort),
    .txfsmidle (txfsmidle),
    .txokrxgo (txokrxgo),
    .rxclk (rxclk),
    .rxclk4x (rxclk4x));

/*
 * Device Transport Layer Receive Controller (dtrans_rxctl.v)
 */
dtrans_rxctl dtrans_rxctl (
    // Outputs
    .tp_wrdata (tp_wrdata),
    .tp_wrATAreg (tp_wrATAreg),
    .tp_rxdabort (tp_rxdabort),
    .tp_fisgood (tp_fisgood),
    .tp_fisundef (tp_fisundef),
    .waittxid (waittxid),
    .incomefis (incomefis),
    .rdrxfifo (rdrxfifo),
    .hld_feature (hld_feature),
    .hld_seccnt (hld_seccnt),
    .hld_secnum (hld_secnum),
    .hld_cyllow (hld_cyllow),
    .hld_cylhi (hld_cylhi),
    .hld_devhd (hld_devhd),
    .hld_cmd (hld_cmd),
    .hld_devctl (hld_devctl),
    .hld_cbit (hld_cbit),
    .upperdata (upperdata),
    // Inputs
    .t2r_rxdabort (t2r_rxdabort),
    .lk_eofis (lk_eofis),
    .t2r_txfsmidle (t2r_txfsmidle),
    .t2r_txokrxgo (t2r_txokrxgo),
    .rxempty (rxempty),
    .rxtimeout (rxtimeout),
    .hregfis (hregfis),
    .dmasfis (dmasfis),
    .bisafis (bisafis),
    .datafis (datafis),
    .rxclk4x (rxclk4x),
    .tprx_reset (tprx_reset));

/*
 * Device Transport Layer Receive Datapath (dtrans_rxdp.v)
 */
dtrans_rxdp dtrans_rxdp (
    // Outputs
    .tp_datain (tp_datain),
    .tp_featurein (tp_featurein),
    .tp_seccntin (tp_seccntin),
    .tp_secnumin (tp_secnumin),
    .tp_cyllowin (tp_cyllowin),
    .tp_cylhiin (tp_cylhiin),
    .tp_devhdin (tp_devhdin),
    .tp_cmdin (tp_cmdin),
    .tp_devctlin (tp_devctlin),
    .tp_cbit (tp_cbit),
    .tp_rxnearfull (tp_rxnearfull),
    .tp_rxfull (tp_rxfull),
    .rxempty (rxempty),
    .rxtimeout (rxtimeout),
    .hregfis (hregfis),
    .dmasfis (dmasfis),
    .bisafis (bisafis),
    .datafis (datafis),
    // Inputs
    .lk_rxdata (lk_rxdata),
    .lk_wrrxfifo (lk_wrrxfifo),
    .incomefis (incomefis),
    .rdrxfifo (rdrxfifo),
    .hld_feature (hld_feature),
    .hld_seccnt (hld_seccnt),
    .hld_secnum (hld_secnum),
    .hld_cyllow (hld_cyllow),
    .hld_cylhi (hld_cylhi),
    .hld_devhd (hld_devhd),
    .hld_cmd (hld_cmd),
    .hld_devctl (hld_devctl),
    .hld_cbit (hld_cbit),
    .upperdata (upperdata),
    .rxclk4x (rxclk4x),
    .tprx_reset (tprx_reset));

endmodule // dtrans_rx

/****************************************************************************
 * Module dtrans_rxsyn: Synchronize signals for the rxclk4x clock domain
 ****************************************************************************/
module dtrans_rxsyn (
    // Outputs
    tprx_reset, t2r_rxdabort, t2r_txfsmidle, t2r_txokrxgo,
    // Inputs
    at_rxdabort, txfsmidle, txokrxgo, rxclk, rxclk4x, at_reset);

output tprx_reset;
output t2r_rxdabort;        // Abort receiving data
output t2r_txfsmidle;
output t2r_txokrxgo;        // TX FSM gives RX FSM the OK

input at_rxdabort;          // ATA interface aborts data receiving
input txfsmidle;
input txokrxgo;             // TX FSM gives RX FSM the OK
input rxclk;
input rxclk4x;
input at_reset;

/*
 * Interconnections within this module
 */
wire rxclk_rxdabort;        // Abort receiving data
wire rxclk_txfsmidle;
wire rxclk_txokrxgo;        // TX FSM gives RX FSM the OK

v_reg #(1) reset_ff (tprx_reset, rxclk4x, at_reset);

/*
 * For signals coming from the txclk or txclk4x domain,
 * Step 1: Sample it with the rxclk clock.
 * Step 2: Sample it again with the rxclk4x clock.
 *
 * Note: we assume the rising edges of rxclk and rxclk4x are aligned.
 * Consequently, sampling a signal with rxclk followed by sampling it
 * with rxclk4x has the same effect as sampling a signal twice with
 * rxclk as far as the prevention of meta-stability is concerned.
 */
v_reg #(1) rxclk_ff0 (rxclk_rxdabort, rxclk, at_rxdabort);
v_reg #(1) rxclk_ff1 (rxclk_txfsmidle, rxclk, txfsmidle);
v_reg #(1) rxclk_ff2 (rxclk_txokrxgo, rxclk, txokrxgo);

v_reg #(1) syn_ff0 (t2r_rxdabort, rxclk4x, rxclk_rxdabort);
v_reg #(1) syn_ff1 (t2r_txfsmidle, rxclk4x, rxclk_txfsmidle);
v_reg #(1) syn_ff2 (t2r_txokrxgo, rxclk4x, rxclk_txokrxgo);

endmodule // dtrans_rxsyn
/**************************************************************************** * * File Name: link_txdp.v * * Comment: Link Layer datapath for data transmission * * The datapath here excludes the part that is interfaced to the * SERDES so everything here should run on the txclk4x. * * Author: Shing Kong * Creation Date: 3/26/2001 * * $Source: /proj/gemini/cvs_root/P2002/Notes/Style/appendixB,v $ * $Date: 2001/12/06 21:49:07 $ * $Revision: 1.1 $ * *=========================================================================== * Copyright (c) 2001 by Shing Ip Kong. All Rights Reserved. ****************************************************************************/ /* * $Id: appendixB,v 1.1 2001/12/06 21:49:07 kong Exp $ */ `include "link_defs.v" // See ../../CommonFiles module link_txdp ( // Outputs txencdata, next_txnrd, // Inputs tp_txdata, r2t_rxsnprim, r2t_rxsnrrdy, r2t_rxsnrip, r2t_rxsnrok, r2t_rxsnrerr, r2t_rxsndmat, r2t_rxsnsync, r2t_rxsnhold, r2t_rxsnholda, txscram_on, txscr_init, txscr_run, txcrc_init, txcrc_cal, txsn_CRC, txsn_data, txsn_sync, txsn_align, txsn_xrdy, txsn_sof, txsn_holda, txsn_hold, txsn_eof, txsn_wtrm, cur_txnrd, txclk4x, lktx_reset); /* * Output to SERDES Interface */ output [0:39] txencdata; // Encoded transmit data /* * Output to the Link Layer Transmit Controller */ output next_txnrd; // New negative running disparity /* * Inputs from the Transport Layer */ input [31:0] tp_txdata; /* * Inputs from the Link Layer Receiver (link_rx.v) * These signals have been synchronized with respect to txclk4x */ input r2t_rxsnprim; // RX controls the primitive sending input r2t_rxsnrrdy; // RX sends the R_RDY primitive input r2t_rxsnrip; // RX sends the R_RIP primitive input r2t_rxsnrok; // RX sends the R_OK primitive input r2t_rxsnrerr; // RX sends the R_ERR primitive input r2t_rxsndmat; // RX sends the DMAT primitive // The following primitives can be sent by either link_txctl or link_rxctl input r2t_rxsnsync; // Sends the SYNC primitive input 
        r2t_rxsnhold;       // Sends the HOLD primitive
input   r2t_rxsnholda;      // Sends the HOLDA primitive

/*
 * Inputs from the Link Layer Transmit Controller (link_txctl.v)
 */
input txscram_on;           // Turn on data scrambling
input txscr_init;           // Initialize the scrambler
input txscr_run;            // Increment the scrambler polynomial
input txcrc_init;           // Initialize the CRC calculator
input txcrc_cal;            // Update the CRC output
input txsn_CRC;             // Send out the CRC pattern
input txsn_data;            // Send data or CRC (not a primitive)
input txsn_sync;            // TX sends the SYNC primitive
input txsn_align;           // TX sends the ALIGN primitive
input txsn_xrdy;            // TX sends the X_RDY primitive
input txsn_sof;             // TX sends the SOF primitive
input txsn_holda;           // TX sends the HOLDA primitive
input txsn_hold;            // TX sends the HOLD primitive
input txsn_eof;             // TX sends the EOF primitive
input txsn_wtrm;            // TX sends the WTRM primitive
input cur_txnrd;            // Current negative running disparity

/*
 * Clocks and reset signals
 */
input txclk4x;              // Transmit clock
input lktx_reset;

/*
 * Interconnections within this portion of the datapath
 */
wire [`num_prim:0]          // Number of primitives + D10.2
        sel_prim;           // Select the proper primitives
wire [31:0] scr_out;        // Output of the scrambler
wire [31:0] crc_out;        // Output of the CRC calculator
wire [31:0] prim_out;       // Output of the Primitive generator
wire [31:0] crc_data;       // Output of the CRC mux
wire [31:0] scram_data;     // Scrambled data
wire [31:0] scrcrcdata;     // Data or CRC after scrambling
wire [31:0] rawtxdata;      // TX Data before 8b/10b coding
wire [31:0] clk_rawtxdata;  // rawtxdata after being registered
wire [0:39] enctxdata;      // TX Data after 8b/10b coding

// Running negative disparity after encoding Byte 0, Byte 1, and Byte 2
wire [2:0] txnrd_byte;

wire clk_sendprim;          // ~(txsn_data | txsn_CRC) registered

/*
 * Simple logic to form the primitive select vector
 *
 * First, deal with the 3 primitives that can be sent by either
 * the Transmit (link_txctl) or the Receive (link_rxctl) controller
 *
 * Then we just need to make the proper connections with assign statements
 */
v_mux2e #(3) txrx_selmux (
    {sel_prim[`B_HOLDA], sel_prim[`B_HOLD], sel_prim[`B_SYNC]},
    r2t_rxsnprim,
    {txsn_holda, txsn_hold, txsn_sync},
    {r2t_rxsnholda, r2t_rxsnhold, r2t_rxsnsync});

// Primitives sent by the Transmit Controller
assign sel_prim[`B_ALIGN] = txsn_align;
assign sel_prim[`B_X_RDY] = txsn_xrdy;
assign sel_prim[`B_SOF] = txsn_sof;
assign sel_prim[`B_EOF] = txsn_eof;
assign sel_prim[`B_WTRM] = txsn_wtrm;

// Primitives sent by the Receive Controller
assign sel_prim[`B_R_RDY] = r2t_rxsnrrdy;
assign sel_prim[`B_R_IP] = r2t_rxsnrip;
assign sel_prim[`B_R_OK] = r2t_rxsnrok;
assign sel_prim[`B_R_ERR] = r2t_rxsnrerr;
assign sel_prim[`B_DMAT] = r2t_rxsndmat;

/*** Debug 4/13/2001: For now set these to zeros ***/
assign sel_prim[`B_CONT] = 1'b0;
assign sel_prim[`B_PMREQ_P] = 1'b0;
assign sel_prim[`B_PMREQ_S] = 1'b0;
assign sel_prim[`B_PMACK] = 1'b0;
assign sel_prim[`B_PMNAK] = 1'b0;
assign sel_prim[`B_D10_2] = 1'b0;

/*
 * Scrambler
 */
l_scramble scrambler (
    .scr_out (scr_out),
    .scr_in (32'hc2d2768d),
    .scr_init (txscr_init),
    .scr_run (txscr_run),
    .clk (txclk4x),
    .reset (lktx_reset));

/*
 * CRC Calculator
 */
l_crccal crc_calculator (
    .crc_out (crc_out),
    .crc_in (32'h52325032),
    .datain (tp_txdata),
    .crc_init (txcrc_init),
    .crc_cal (txcrc_cal),
    .clk (txclk4x),
    .reset (lktx_reset));

// MUX to select between sending data or sending the CRC pattern
v_mux2e #(32) crc_mux (crc_data, txsn_CRC, tp_txdata, crc_out);

// Scramble the data by performing a bit-wise exclusive OR
assign scram_data = crc_data ^ scr_out;

// MUX to decide if we want to send the scrambled data
v_mux2e #(32) scramble_mux (scrcrcdata, txscram_on, crc_data, scram_data);

/*
 * Generate the primitive (prim_out) based on the selection (sel_prim)
 */
l_primgen primgen (.prim_out (prim_out), .sel_prim (sel_prim));

// Decide if we should send the primitive out
v_mux2e #(32) prim_mux (rawtxdata, (txsn_data | txsn_CRC),
                        prim_out, scrcrcdata);

// Register the data before sending it to the 8b/10b encoders
v_reg #(32) rawdata_ff (clk_rawtxdata, txclk4x, rawtxdata);
v_reg #(1) sendprim_ff (clk_sendprim, txclk4x, ~(txsn_data | txsn_CRC));

/*
 * Perform the 8b/10b encoding. The encoder for Byte 0 (enc8b10bk)
 * can encode the 8-bit D28_3 and D28_5 data words into the 10-bit
 * K28_3 and K28_5 code words if the "K" control input is set to 1.
 */
enc8b10bk enc_byte0 (       // Byte 0 Encoder
    .do (enctxdata[0:9]),
    .nrdo (txnrd_byte[0]),
    .di (clk_rawtxdata[7:0]),
    .nrdi (cur_txnrd),
    .k (clk_sendprim));

enc8b10b enc_byte1 (        // Byte 1 Encoder
    .do (enctxdata[10:19]),
    .nrdo (txnrd_byte[1]),
    .di (clk_rawtxdata[15:8]),
    .nrdi (txnrd_byte[0]));

enc8b10b enc_byte2 (        // Byte 2 Encoder
    .do (enctxdata[20:29]),
    .nrdo (txnrd_byte[2]),
    .di (clk_rawtxdata[23:16]),
    .nrdi (txnrd_byte[1]));

enc8b10b enc_byte3 (        // Byte 3 Encoder
    .do (enctxdata[30:39]),
    .nrdo (next_txnrd),
    .di (clk_rawtxdata[31:24]),
    .nrdi (txnrd_byte[2]));

v_reg #(40) encdata_ff (txencdata, txclk4x, enctxdata);

endmodule // link_txdp
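The chain of three muxes and one XOR in link_txdp (crc_mux, the scramble XOR, scramble_mux, and prim_mux) may be easier to follow when collapsed into conditional expressions. The following is a minimal sketch. It assumes that `v_mux2e (out, sel, in0, in1)` drives `in1` when `sel` is 1, which is inferred from the comments above and not confirmed by a library definition in this listing. The module name `txsel_sketch` is hypothetical.

```verilog
// Hypothetical sketch of the link_txdp data-selection path.
// Assumption: v_mux2e (out, sel, in0, in1) drives in1 when sel is 1.
module txsel_sketch (rawtxdata, tp_txdata, crc_out, scr_out, prim_out,
                     txsn_CRC, txsn_data, txscram_on);
    output [31:0] rawtxdata;    // Word handed to the 8b/10b stage
    input  [31:0] tp_txdata;    // Data from the Transport Layer
    input  [31:0] crc_out;      // Output of the CRC calculator
    input  [31:0] scr_out;      // Output of the scrambler
    input  [31:0] prim_out;     // Output of the primitive generator
    input         txsn_CRC;     // Send the CRC instead of data
    input         txsn_data;    // Send data (not a primitive)
    input         txscram_on;   // Scrambling enabled

    // Data or CRC
    wire [31:0] crc_data   = txsn_CRC ? crc_out : tp_txdata;
    // Bit-wise XOR with the scrambler pattern
    wire [31:0] scram_data = crc_data ^ scr_out;
    // Scrambled or unscrambled
    wire [31:0] scrcrcdata = txscram_on ? scram_data : crc_data;

    // A primitive goes out whenever neither data nor CRC is being sent
    assign rawtxdata = (txsn_data | txsn_CRC) ? scrcrcdata : prim_out;
endmodule
```

The original keeps explicit library muxes instead, so that each selection point stays visible as a named datapath component.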
/****************************************************************************
 *
 * File Name:      link_library.v
 *
 * Comment:        Components for the Link layer datapath.
 *
 * Author:         Shing Kong
 * Creation Date:  1/25/2001
 *
 * $Source: /proj/gemini/cvs_root/P2002/Notes/Style/appendixB,v $
 * $Date: 2001/12/06 21:49:07 $
 * $Revision: 1.1 $
 *
 *===========================================================================
 * Copyright (c) 2001 by Shing Ip Kong. All Rights Reserved.
 ****************************************************************************/

/*
 * $Id: appendixB,v 1.1 2001/12/06 21:49:07 kong Exp $
 */

/****************************************************************************
 * All the library elements in this file have names that start with: "l_"
 ***************************************************************************/
`include "link_defs.v"

/****************************************************************************
 * l_scramble: 32-bit scrambler that can:
 *   a. Be reset to all zeros asynchronously.
 *   b. Load a fixed pattern synchronously.
 *   c. Keep its old value if scrambling is not enabled.
 *   d. Update its output synchronously based on an LFSR algorithm.
 ***************************************************************************/
module l_scramble (scr_out, scr_in, scr_init, scr_run, clk, reset);

output [31:0] scr_out;      // Scrambler's output
input [31:0] scr_in;        // Initial pattern to be loaded
input scr_init;             // Load the initial pattern
input scr_run;              // Update scr_out based on an LFSR
input clk;
input reset;

reg [31:0] scram;           // Scramble data pattern
reg a15, a14, a13,          // Intermediate scramble bits
    a12, a11, a10,
    a9, a8, a7, a6,
    a5, a4, a3, a2,
    a1, a0;

wire [31:0] runmuxout;      // Output of the scr_run MUX
wire [31:0] lastmux;        // Output of the final MUX

/*
 * Combinational logic to produce the scramble pattern,
 * which should be updated whenever scr_out changes.
 * This logic was copied from Frank Lee's scramble.v
 */
always @(scr_out) begin
    a15 = scr_out[31] ^ scr_out[29] ^ scr_out[20] ^ scr_out[16];
    a14 = scr_out[30] ^ scr_out[21] ^ scr_out[17];
    a13 = scr_out[31] ^ scr_out[22] ^ scr_out[18];
    a12 = scr_out[23] ^ scr_out[19];
    a11 = scr_out[24] ^ scr_out[20];
    a10 = scr_out[25] ^ scr_out[21];
    a9 = scr_out[26] ^ scr_out[22];
    a8 = scr_out[27] ^ scr_out[23];
    a7 = scr_out[28] ^ scr_out[24];
    a6 = scr_out[29] ^ scr_out[25];
    a5 = scr_out[30] ^ scr_out[26];
    a4 = scr_out[31] ^ scr_out[27];
    a3 = scr_out[28];
    a2 = scr_out[29];
    a1 = scr_out[30];
    a0 = scr_out[31];

    scram[31] = a15^a14^ a12^a11^a10^ a7^a6^a5^ a0;
    scram[30] = a15^ a13^a12^a11^ a8^a7^a6^ a1^a0;
    scram[29] = a14^a13^a12^ a9^a8^a7^ a2^a1;
    scram[28] = a15^a14^a13^ a10^a9^a8^ a3^a2;
    scram[27] = a15^a14^ a11^a10^a9^ a4^a3^ a0;
    scram[26] = a15^ a12^a11^a10^ a5^a4^ a1^a0;
    scram[25] = a13^a12^a11^ a6^a5^ a2^a1;
    scram[24] = a14^a13^a12^ a7^a6^ a3^a2^ a0;
    scram[23] = a15^a14^a13^ a8^a7^ a4^a3^ a1^a0;
    scram[22] = a15^a14^ a9^a8^ a5^a4^ a2^a1^a0;
    scram[21] = a15^ a10^a9^ a6^a5^ a3^a2^a1;
    scram[20] = a11^a10^ a7^a6^ a4^a3^a2;
    scram[19] = a12^a11^ a8^a7^ a5^a4^a3^ a0;
    scram[18] = a13^a12^ a9^a8^ a6^a5^a4^ a1;
    scram[17] = a14^a13^ a10^a9^ a7^a6^a5^ a2^ a0;
    scram[16] = a15^a14^ a11^a10^ a8^a7^a6^ a3^ a1^a0;
    scram[15] = a15^ a12^a11^ a9^a8^a7^ a4^ a2^a1^a0;
    scram[14] = a13^a12^ a10^a9^a8^ a5^ a3^a2^a1;
    scram[13] = a14^a13^ a11^a10^a9^ a6^ a4^a3^a2;
    scram[12] = a15^a14^ a12^a11^a10^ a7^ a5^a4^a3;
    scram[11] = a15^ a13^a12^a11^ a8^ a6^a5^a4;
    scram[10] = a14^a13^a12^ a9^ a7^a6^a5;
    scram[9] = a15^a14^a13^ a10^ a8^a7^a6;
    scram[8] = a15^a14^ a11^ a9^a8^a7;
    scram[7] = a15^ a12^ a10^a9^a8;
    scram[6] = a13^ a11^a10^a9;
    scram[5] = a14^ a12^a11^a10;
    scram[4] = a15^ a13^a12^a11;
    scram[3] = a14^a13^a12;
    scram[2] = a15^a14^a13;
    scram[1] = a15^a14;
    scram[0] = a15;
end // Scrambling logic

/*                                      Priority:
 *             scram scr_out            -------------------------------
 *                |   |                 reset (asynchronous): highest
 *            +---v---v---+             scr_init (synchronous): middle
 *   scr_run-->\S 1   0  /              scr_run (synchronous): lowest
 *              +---+---+ scr_in
 *                  |       |
 *              +---v-------v---+
 *               \  0       1 S/<--scr_init (higher priority than scr_run)
 *                +-----+-----+
 *                      |
 *                      v
 *                   lastmux
 */
v_mux2e #(32) run_mux (runmuxout, scr_run, scr_out, scram);
v_mux2e #(32) init_mux (lastmux, scr_init, runmuxout, scr_in);
v_regre #(32) scr_ff (scr_out, clk, lastmux, (scr_run | scr_init), reset);

endmodule // l_scramble

/****************************************************************************
 * l_crccal: 32-bit CRC calculator that can:
 *   a. Be reset to all zeros asynchronously.
 *   b. Load a fixed pattern synchronously.
 *   c. Keep its old value if CRC calculation is not enabled.
 *   d. Update its output synchronously based on its input.
 ***************************************************************************/
module l_crccal (crc_out, crc_in, datain, crc_init, crc_cal, clk, reset);

output [31:0] crc_out;      // CRC calculator's output
input [31:0] crc_in;        // Initial pattern to be loaded
input [31:0] datain;        // Data contributing to the CRC calculation
input crc_init;             // Load the initial pattern
input crc_cal;              // Calculate crc_out based on an LFSR
input clk;
input reset;

reg [31:0] x;               // Temporary variable for CRC
reg [31:0] crc;             // Output of the CRC random logic

wire [31:0] runmuxout;      // Output of the crc_cal MUX
wire [31:0] lastmux;        // Output of the final MUX

/*
 * Combinational logic that calculates the CRC output.
 * Its output should be updated whenever crc_out or datain changes.
 * This logic was copied from Frank Lee's satacrc.v
 */
always @(crc_out or datain) begin
    x = datain ^ crc_out;
    crc = (x[0]? 32'b00000100110000010001110110110111: 32'b0) ^
          (x[1]? 32'b00001001100000100011101101101110: 32'b0) ^
          (x[2]? 32'b00010011000001000111011011011100: 32'b0) ^
          (x[3]? 32'b00100110000010001110110110111000: 32'b0) ^
          (x[4]? 32'b01001100000100011101101101110000: 32'b0) ^
          (x[5]? 32'b10011000001000111011011011100000: 32'b0) ^
          (x[6]? 32'b00110100100001100111000001110111: 32'b0) ^
          (x[7]?
32'b01101001000011001110000011101110: 32'b0) ^ (x[8]? 32'b11010010000110011100000111011100: 32'b0) ^ (x[9]? 32'b10100000111100101001111000001111: 32'b0) ^ (x[10]? 32'b01000101001001000010000110101001: 32'b0) ^ (x[11]? 32'b10001010010010000100001101010010: 32'b0) ^ (x[12]? 32'b00010000010100011001101100010011: 32'b0) ^ (x[13]? 32'b00100000101000110011011000100110: 32'b0) ^ (x[14]? 32'b01000001010001100110110001001100: 32'b0) ^ (x[15]? 32'b10000010100011001101100010011000: 32'b0) ^ (x[16]? 32'b00000001110110001010110010000111: 32'b0) ^ (x[17]? 32'b00000011101100010101100100001110: 32'b0) ^ (x[18]? 32'b00000111011000101011001000011100: 32'b0) ^ (x[19]? 32'b00001110110001010110010000111000: 32'b0) ^ (x[20]? 32'b00011101100010101100100001110000: 32'b0) ^ (x[21]? 32'b00111011000101011001000011100000: 32'b0) ^ (x[22]? 32'b01110110001010110010000111000000: 32'b0) ^ (x[23]? 32'b11101100010101100100001110000000: 32'b0) ^ (x[24]? 32'b11011100011011011001101010110111: 32'b0) ^ (x[25]? 32'b10111100000110100010100011011001: 32'b0) ^ (x[26]? 32'b01111100111101010100110000000101: 32'b0) ^ (x[27]? 32'b11111001111010101001100000001010: 32'b0) ^ (x[28]? 32'b11110111000101000010110110100011: 32'b0) ^ (x[29]? 32'b11101010111010010100011011110001: 32'b0) ^ (x[30]? 32'b11010001000100111001000001010101: 32'b0) ^ (x[31]? 
          32'b10100110111001100011110100011101: 32'b0);
end // End of the combinational logic that calculates the CRC

/*                                      Priority:
 *               crc crc_out            -------------------------------
 *                |   |                 reset (asynchronous): highest
 *            +---v---v---+             crc_init (synchronous): middle
 *   crc_cal-->\S 1   0  /              crc_cal (synchronous): lowest
 *              +---+---+ crc_in
 *                  |       |
 *              +---v-------v---+
 *               \  0       1 S/<--crc_init (higher priority than crc_cal)
 *                +-----+-----+
 *                      |
 *                      v
 *                   lastmux
 */
v_mux2e #(32) run_mux (runmuxout, crc_cal, crc_out, crc);
v_mux2e #(32) init_mux (lastmux, crc_init, runmuxout, crc_in);
v_regre #(32) crc_ff (crc_out, clk, lastmux, (crc_cal | crc_init), reset);

endmodule // l_crccal

/****************************************************************************
 * l_primgen: 32-bit primitive generator
 * This 32-bit primitive must be encoded by the 8-bit/10-bit encoder
 * to become the actual 40-bit SATA primitive.
 *
 * One interesting feature here is that Byte 0 of the 32-bit primitive
 * is either D28.3 or D28.5, which is encoded to K28.3 or K28.5
 * by an 8-bit/10-bit encoder with the control bit (K) set to 1.
* ***************************************************************************/ module l_primgen (prim_out, sel_prim); output [31:0] prim_out; // 32-bit primitive input [`num_prim:0] sel_prim; // Primitive + D10.2 select reg [31:0] prim_out; always @(sel_prim) begin if (sel_prim[`B_ALIGN]) begin //******** {Byte 3, Byte 2, Byte 1, Byte 0} prim_out = {`D27_3, `D10_2, `D10_2, `D28_5}; end else if (sel_prim[`B_SYNC]) begin prim_out = {`D21_5, `D21_5, `D21_4, `D28_3}; end else if (sel_prim[`B_CONT]) begin prim_out = {`D25_4, `D25_4, `D10_5, `D28_3}; end else if (sel_prim[`B_HOLD]) begin prim_out = {`D21_6, `D21_6, `D10_5, `D28_3}; end else if (sel_prim[`B_HOLDA]) begin prim_out = {`D21_4, `D21_4, `D10_5, `D28_3}; end else if (sel_prim[`B_R_RDY]) begin prim_out = {`D10_2, `D10_2, `D21_4, `D28_3}; end else if (sel_prim[`B_R_IP]) begin prim_out = {`D21_2, `D21_2, `D21_5, `D28_3}; end else if (sel_prim[`B_R_OK]) begin prim_out = {`D21_1, `D21_1, `D21_5, `D28_3}; end else if (sel_prim[`B_R_ERR]) begin prim_out = {`D22_2, `D22_2, `D21_5, `D28_3}; end else if (sel_prim[`B_X_RDY]) begin prim_out = {`D23_2, `D23_2, `D21_5, `D28_3}; end else if (sel_prim[`B_SOF]) begin prim_out = {`D23_1, `D23_1, `D21_5, `D28_3}; end else if (sel_prim[`B_EOF]) begin prim_out = {`D21_6, `D21_6, `D21_5, `D28_3}; end else if (sel_prim[`B_DMAT]) begin prim_out = {`D22_1, `D22_1, `D21_5, `D28_3}; end else if (sel_prim[`B_WTRM]) begin prim_out = {`D24_2, `D24_2, `D21_5, `D28_3}; end else if (sel_prim[`B_PMREQ_P]) begin prim_out = {`D23_0, `D23_0, `D21_5, `D28_3}; end else if (sel_prim[`B_PMREQ_S]) begin prim_out = {`D21_3, `D21_3, `D21_4, `D28_3}; end else if (sel_prim[`B_PMACK]) begin prim_out = {`D21_4, `D21_4, `D21_4, `D28_3}; end else if (sel_prim[`B_PMNAK]) begin prim_out = {`D21_7, `D21_7, `D21_4, `D28_3}; end else if (sel_prim[`B_D10_2]) begin // Special four D10_2 words in a row prim_out = {`D10_2, `D10_2, `D10_2, `D10_2}; end else begin // None are selected: send ALIGN prim_out = {`D27_3, 
                        `D10_2, `D10_2, `D28_5};
        end
    end
endmodule // l_primgen

/****************************************************************************
 * l_primdec: primitive decoder
 ***************************************************************************/
module l_primdec (prim_vector, prim_in, k28_3, k28_5);
    output [`num_prim-1:0] prim_vector; // Primitive vector
    input  [31:8]          prim_in;     // Bytes 3, 2, and 1 of the primitive
    input                  k28_3;       // Byte 0 is a K28_3 character
    input                  k28_5;       // Byte 0 is a K28_5 character

    wire   [`num_prim-1:0] maybe;       // May be vector

    // Basic primitives
    v_comparator #(24) align_cmp (
        .out (maybe[`B_ALIGN]), .in1 (prim_in),
        .in2 ({`D27_3, `D10_2, `D10_2}));
    assign prim_vector[`B_ALIGN] = maybe[`B_ALIGN] & k28_5;

    v_comparator #(24) sync_cmp (
        .out (maybe[`B_SYNC]), .in1 (prim_in),
        .in2 ({`D21_5, `D21_5, `D21_4}));
    assign prim_vector[`B_SYNC] = maybe[`B_SYNC] & k28_3;

    v_comparator #(24) cont_cmp (
        .out (maybe[`B_CONT]), .in1 (prim_in),
        .in2 ({`D25_4, `D25_4, `D10_5}));
    assign prim_vector[`B_CONT] = maybe[`B_CONT] & k28_3;

    // Flow control primitives
    v_comparator #(24) hold_cmp (
        .out (maybe[`B_HOLD]), .in1 (prim_in),
        .in2 ({`D21_6, `D21_6, `D10_5}));
    assign prim_vector[`B_HOLD] = maybe[`B_HOLD] & k28_3;

    v_comparator #(24) holda_cmp (
        .out (maybe[`B_HOLDA]), .in1 (prim_in),
        .in2 ({`D21_4, `D21_4, `D10_5}));
    assign prim_vector[`B_HOLDA] = maybe[`B_HOLDA] & k28_3;

    v_comparator #(24) r_rdy_cmp (
        .out (maybe[`B_R_RDY]), .in1 (prim_in),
        .in2 ({`D10_2, `D10_2, `D21_4}));
    assign prim_vector[`B_R_RDY] = maybe[`B_R_RDY] & k28_3;

    v_comparator #(24) r_ip_cmp (
        .out (maybe[`B_R_IP]), .in1 (prim_in),
        .in2 ({`D21_2, `D21_2, `D21_5}));
    assign prim_vector[`B_R_IP] = maybe[`B_R_IP] & k28_3;

    v_comparator #(24) r_ok_cmp (
        .out (maybe[`B_R_OK]), .in1 (prim_in),
        .in2 ({`D21_1, `D21_1, `D21_5}));
    assign prim_vector[`B_R_OK] = maybe[`B_R_OK] & k28_3;

    v_comparator #(24) r_err_cmp (
        .out (maybe[`B_R_ERR]), .in1 (prim_in),
        .in2 ({`D22_2, `D22_2, `D21_5}));
    assign prim_vector[`B_R_ERR] =
        maybe[`B_R_ERR] & k28_3;

    v_comparator #(24) x_rdy_cmp (
        .out (maybe[`B_X_RDY]), .in1 (prim_in),
        .in2 ({`D23_2, `D23_2, `D21_5}));
    assign prim_vector[`B_X_RDY] = maybe[`B_X_RDY] & k28_3;

    v_comparator #(24) sof_cmp (
        .out (maybe[`B_SOF]), .in1 (prim_in),
        .in2 ({`D23_1, `D23_1, `D21_5}));
    assign prim_vector[`B_SOF] = maybe[`B_SOF] & k28_3;

    v_comparator #(24) eof_cmp (
        .out (maybe[`B_EOF]), .in1 (prim_in),
        .in2 ({`D21_6, `D21_6, `D21_5}));
    assign prim_vector[`B_EOF] = maybe[`B_EOF] & k28_3;

    v_comparator #(24) dmat_cmp (
        .out (maybe[`B_DMAT]), .in1 (prim_in),
        .in2 ({`D22_1, `D22_1, `D21_5}));
    assign prim_vector[`B_DMAT] = maybe[`B_DMAT] & k28_3;

    v_comparator #(24) wtrm_cmp (
        .out (maybe[`B_WTRM]), .in1 (prim_in),
        .in2 ({`D24_2, `D24_2, `D21_5}));
    assign prim_vector[`B_WTRM] = maybe[`B_WTRM] & k28_3;

    // Power management primitives
    v_comparator #(24) pmreq_p_cmp (
        .out (maybe[`B_PMREQ_P]), .in1 (prim_in),
        .in2 ({`D23_0, `D23_0, `D21_5}));
    assign prim_vector[`B_PMREQ_P] = maybe[`B_PMREQ_P] & k28_3;

    v_comparator #(24) pmreq_s_cmp (
        .out (maybe[`B_PMREQ_S]), .in1 (prim_in),
        .in2 ({`D21_3, `D21_3, `D21_4}));
    assign prim_vector[`B_PMREQ_S] = maybe[`B_PMREQ_S] & k28_3;

    v_comparator #(24) pmack_cmp (
        .out (maybe[`B_PMACK]), .in1 (prim_in),
        .in2 ({`D21_4, `D21_4, `D21_4}));
    assign prim_vector[`B_PMACK] = maybe[`B_PMACK] & k28_3;

    v_comparator #(24) pmnak_cmp (
        .out (maybe[`B_PMNAK]), .in1 (prim_in),
        .in2 ({`D21_7, `D21_7, `D21_4}));
    assign prim_vector[`B_PMNAK] = maybe[`B_PMNAK] & k28_3;
endmodule // l_primdec
/****************************************************************************
 *
 * File Name:       link_defs.v
 *
 * Comment:         Definitions for the Link Layer
 *
 * Author:          Shing Kong
 * Creation Date:   1/29/2001
 *
 * $Source: /proj/gemini/cvs_root/P2002/Notes/Style/appendixB,v $
 * $Date: 2001/12/06 21:49:07 $
 * $Revision: 1.1 $
 *
 *===========================================================================
 * Copyright (c) 2001 by Shing Ip Kong. All Rights Reserved.
 ****************************************************************************/
/*
 * $Id: appendixB,v 1.1 2001/12/06 21:49:07 kong Exp $
 */

/*
 * Number of primitives and the bit position of the 1-hot encoded vector
 */
`define num_prim    18

// Basic Primitives
`define B_ALIGN     0
`define B_SYNC      1
`define B_CONT      2

// Flow Control Primitives
`define B_HOLD      3
`define B_HOLDA     4
`define B_R_RDY     5
`define B_R_IP      6
`define B_R_OK      7
`define B_R_ERR     8
`define B_X_RDY     9
`define B_SOF       10
`define B_EOF       11
`define B_DMAT      12
`define B_WTRM      13

// Power Management Primitives
`define B_PMREQ_P   14
`define B_PMREQ_S   15
`define B_PMACK     16
`define B_PMNAK     17

// Special patterns (not a primitive) to be generated
// by the l_primgen: four consecutive D10_2's
`define B_D10_2     18

/*
 * Define the 8-bit pattern for the primitives
 *
 * Note: The 8-bit D28_3 and D28_5 will be encoded into the 10-bit
 *       K28_3 and K28_5 patterns by the 8B/10B encoder with its
 *       control bit (K) set to 1.
 */
`define D10_2   8'b010_01010    // 0x4A
`define D10_4   8'b100_01010    // 0x8A
`define D10_5   8'b101_01010    // 0xAA
`define D21_1   8'b001_10101    // 0x35
`define D21_2   8'b010_10101    // 0x55
`define D21_3   8'b011_10101    // 0x75
`define D21_4   8'b100_10101    // 0x95
`define D21_5   8'b101_10101    // 0xB5
`define D21_6   8'b110_10101    // 0xD5
`define D21_7   8'b111_10101    // 0xF5
`define D22_1   8'b001_10110    // 0x36
`define D22_2   8'b010_10110    // 0x56
`define D23_0   8'b000_10111    // 0x17
`define D23_1   8'b001_10111    // 0x37
`define D23_2   8'b010_10111    // 0x57
`define D24_2   8'b010_11000    // 0x58
`define D25_4   8'b100_11001    // 0x99
`define D27_3   8'b011_11011    // 0x7B
`define D27_4   8'b100_11011    // 0x9B
`define D28_3   8'b011_11100    // 0x7C
`define D28_5   8'b101_11100    // 0xBC

/*
 * Define the bit position and state values for the transmit finite state
 * machine (FSM in the link_txctl). This FSM implements the "Link Idle
 * State Diagram" (P. 145 of the SATA V1 Spec.) and the "Link Transmit
 * State Diagram" (P. 148 of the SATA V1 Spec).
 *
 * The two states below are not shown explicitly in the two state diagrams
 * described above:
 *      BUSYRCV:    we have kicked off the receive finite state machine
 *                  (see below) and therefore cannot transmit any of our
 *                  own data.
 *
 *      POWERDOWN:  we have entered the power saving states, which will
 *                  be handled by the Power Management State machine.
 */
// Number of TX states and bit position of the 1-hot state encoding
`define num_lktxstate   15
`define B_NOCOMM        0
`define B_SENDALIGN     1
`define B_NOCOMMERR     2
`define B_TXIDLE        3
`define B_HSENDCHKRDY   4
`define B_DSENDCHKRDY   5
`define B_SENDSOF       6
`define B_SENDDATA      7
`define B_RCVRHOLD      8
`define B_SENDHOLD      9
`define B_SENDCRC       10
`define B_SENDEOF       11
`define B_WAIT          12
`define B_BUSYRCV       13
`define B_POWERDOWN     14

// State Values
`define RESET           15'h0000    // All bits are zeros
`define NOCOMM          15'h0001    // Bit 0 is set
`define SENDALIGN       15'h0002    // Bit 1 is set
`define NOCOMMERR       15'h0004    // Bit 2 is set
`define TXIDLE          15'h0008
`define HSENDCHKRDY     15'h0010
`define DSENDCHKRDY     15'h0020
`define SENDSOF         15'h0040
`define SENDDATA        15'h0080
`define RCVRHOLD        15'h0100
`define SENDHOLD        15'h0200
`define SENDCRC         15'h0400
`define SENDEOF         15'h0800
`define WAIT            15'h1000
`define BUSYRCV         15'h2000    // Link layer is busy receiving data
`define POWERSAVE       15'h4000    // Link layer is powered down (bit `B_POWERDOWN)

/*
 * Define the state values and bit position for the receive finite state
 * machine (FSM in the link_rxctl). This FSM implements the "Link
 * Receive State Diagram" (P. 154 of the SATA V1 Spec).
 */
// Number of RX states and bit position of the 1-hot state encoding
`define num_lkrxstate   11
`define B_RXIDLE        0
`define B_RCVCHKRDY     1
`define B_RCVWAITFIFO   2
`define B_RCVDATA       3
`define B_RXSNHOLD      4
`define B_RXSNHOLDA     5
`define B_RCVEOF        6
`define B_GOODCRC       7
`define B_GOODEND       8
`define B_BADEND        9
`define B_WAITTXID      10  // Wait for TX FSM to return to idle state

// State Values
`define RXIDLE          11'h001
`define RCVCHKRDY       11'h002
`define RCVWAITFIFO     11'h004
`define RCVDATA         11'h008
`define RXSNHOLD        11'h010
`define RXSNHOLDA       11'h020
`define RCVEOF          11'h040
`define GOODCRC         11'h080
`define GOODEND         11'h100
`define BADEND          11'h200
`define WAITTXID        11'h400
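
/*
 * Illustrative sketch only; it is NOT part of the original link_defs.v.
 * The 1-hot definitions above come in pairs: a bit position (`B_xxx) and a
 * state value. This simulation-only module checks that a state value equals
 * a 1 shifted left by its bit position. The file name (link_defs_check.v),
 * the module name, and the task name are hypothetical, and the sketch
 * assumes it lives in its own file with link_defs.v on the include path.
 */
`include "link_defs.v"

module link_defs_check;
    // Compare one state value against its bit position; report mismatches.
    task check_onehot;
        input [14:0] value;     // State value, e.g. `TXIDLE
        input [4:0]  bitpos;    // Bit position, e.g. `B_TXIDLE
        begin
            if (value !== (15'h0001 << bitpos))
                $display("*** Mismatch: value = %h, bit position = %0d ***",
                         value, bitpos);
        end
    endtask

    initial begin
        // A few TX FSM pairs
        check_onehot(`NOCOMM,    `B_NOCOMM);
        check_onehot(`TXIDLE,    `B_TXIDLE);
        check_onehot(`SENDDATA,  `B_SENDDATA);
        check_onehot(`BUSYRCV,   `B_BUSYRCV);
        // The value for bit `B_POWERDOWN is named `POWERSAVE above
        check_onehot(`POWERSAVE, `B_POWERDOWN);
        // The 11-bit RX FSM values are zero-extended to 15 bits here
        check_onehot(`RXIDLE,    `B_RXIDLE);
        check_onehot(`WAITTXID,  `B_WAITTXID);
        $display("link_defs_check: done");
    end
endmodule // link_defs_check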
/****************************************************************************
 *
 * File Name:       trans_defs.v
 *
 * Comment:         Definitions for the Transport Layer
 *
 * Author:          Shing Kong
 * Creation Date:   3/21/2001
 *
 * $Source: /proj/gemini/cvs_root/P2002/Notes/Style/appendixC,v $
 * $Date: 2001/12/06 21:49:07 $
 * $Revision: 1.1 $
 *
 *===========================================================================
 * Copyright (c) 2001 by Shing Ip Kong. All Rights Reserved.
 ****************************************************************************/
/*
 * $Id: appendixC,v 1.1 2001/12/06 21:49:07 kong Exp $
 */

/*
 * When the (Number of Unused Entries - 1) in the FIFO is less than
 * this value, declare the receive FIFO as "nearly full."
 */
`define NEARFULL    5'd20   // Declare "nearfull" when we have 20 entries

/*
 * For the Receive engine, assert timeout if the Transport layer is
 * expecting an incoming FIS and the Link Layer has not written anything
 * into the rxfifo for `RXTIMEOUT cycles.
 *
 * For the Transmit engine, assert timeout if the Transport layer has
 * finished writing the FIS into the txfifo but the Link Layer has not
 * done any more reading for `TXTIMEOUT cycles.
 */
`define RXTIMEOUT   10'd999
`define TXTIMEOUT   10'd999

/*
 * FIS Type: the 8-bit hex value as it appears in Byte 0 Word 0 of the FIS
 */
`define HFISREG     8'h27   // Host-to-Device (H) Register (REG)
`define DFISREG     8'h34   // Device-to-Host (D) Register (REG)
`define DFISDMAA    8'h39   // Device-to-Host (D) DMA Activate (DMAA)
`define DFISPIOS    8'h5F   // Device-to-Host (D) PIO Setup (PIOS)
`define BFISDMAS    8'h41   // Bidirectional (B) DMA Setup (DMAS)
`define BFISBISTA   8'h58   // Bidirectional (B) BIST Activate (BISTA)
// `define BFISDATA 8'h73   // Bidirectional (B) Data (DATA) FIS
`define BFISDATA    8'h46   // Changed per errata (see ktoh's 5/1's email)

/*
 * Number of 32-bit words in various FIS minus one
 * For example, the Register (REG) Host-to-Device (H) FIS (see P.167 of
 * the SATA Spec, Version 1.0) has 5 words => NHFISREGm1 = 5 - 1 = 4.
 */
`define NHFISREGm1      3'd4    // Host-to-Device (H) Register (REG)
`define NDFISREGm1      3'd4    // Device-to-Host (D) Register (REG)
`define NDFISPIOSm1     3'd4    // Device-to-Host (D) PIO Setup (PIOS)
`define NBFISDMASm1     3'd6    // Bidirectional (B) DMA Setup (DMAS)
`define NBFISBISTAm1    3'd2    // Bidirectional (B) BIST Activate (BISTA)

/*
 * Number of bits needed to hold the biggest number that appears above. The
 * biggest number at this point is NBFISDMASm1 = 6, which needs 3 bits
 * (log2 (6) rounded up). Notice that if this number changes, we need to
 * change the "3'd" definitions above as well as changing the v_count3 to a
 * different width counter in dtrans_ctl.v.
 */
`define log_maxfis  3

/*
 * Define the state values and bit position for the Host's Transmit Finite
 * State machine (FSM in htran_txctl). This FSM implements the "transmit"
 * states described in Section 8.6 (PP. 181-197) of SATA Spec, 1.0.
 */
`define num_httxfsm     16
`define B_HTTXIDLE      0
`define B_HTCHKTYP      1
`define B_HTCMDFIS      2
`define B_HTCTLFIS      3
`define B_HTDMASTUP     4   // Spec's HT_DMASTUPFIS
`define B_HTXMITBIS     5
`define B_HTPIOTX2      6   // Spec's HT_PIOOTrans2

// These states are entered via HT_CHKTYP after the Transport Receiver
// decodes FISs that require the Host Transport Layer to transmit.
`define B_HTPIOTX1      7   // Spec's HT_PIOOTrans1
`define B_HTDMATX1      8   // Spec's HT_DMAOTrans1
`define B_HTDMATX2      9   // Spec's HT_DMAOTrans2

// The following are collectively referred to as: HT_TransStatus in the spec
`define B_HTCMDSTA      10
`define B_HTCTLSTA      11
`define B_HTDMASSTA     12
`define B_HTBISSTA      13

`define B_HTPIOTXEND    14  // Spec's HT_PIOEND
`define B_HTDMATXEND    15  // Spec's HT_DMAEND

// Host Dongle's TX FSM State Values
`define HTTXIDLE        16'h0001
`define HTCHKTYP        16'h0002
`define HTCMDFIS        16'h0004
`define HTCTLFIS        16'h0008
`define HTDMASTUP       16'h0010
`define HTXMITBIS       16'h0020
`define HTPIOTX2        16'h0040
`define HTPIOTX1        16'h0080
`define HTDMATX1        16'h0100
`define HTDMATX2        16'h0200
`define HTCMDSTA        16'h0400
`define HTCTLSTA        16'h0800
`define HTDMASSTA       16'h1000
`define HTBISSTA        16'h2000
`define HTPIOTXEND      16'h4000
`define HTDMATXEND      16'h8000

/*
 * Define the state values and bit position for the Host's Receive Finite
 * State machine (FSM in htran_rxctl). This FSM implements the "Decompose"
 * states described in Section 8.6 (PP. 181-197) of SATA Spec, 1.0.
 *
 * The following states are not defined in the spec:
 *      HTWAITTXID: wait for TX to return to Idle state. This is added
 *                  so that TX and RX FSM can be partitioned cleanly.
 *      HTRXCLEAN:  clean up the mess if we receive an unknown FIS
 */
`define num_htrxfsm     15
`define B_HTRXIDLE      0
`define B_HTRCVREG      1   // Spec's HT_RegFIS: receive Register FIS
`define B_HTRCVDS       2   // Spec's HT_DS_FIS: receive DMA Setup
`define B_HTRCVPS       3   // Spec's HT_PS_FIS: receive PIO Setup
`define B_HTRCVBIST     4
`define B_HTRCVDAC      5   // Spec's HT_DMA_FIS: receive DMA Activate
`define B_HTPIORX1      6   // Spec's HT_PITITrans1
`define B_HTPIORX2      7   // Spec's HT_PITITrans2
`define B_HTDMARX       8   // Spec's HT_DMAITrans
`define B_HTBISTTRAN1   9
`define B_HTPIORXEND    10  // Spec's HT_PIOEND
`define B_HTDMARXEND    11  // Spec's HT_DMAEND
`define B_HTRXCLEAN     12  // Received an unknown FIS, needs to clean up
`define B_HTTXBUSY      13
`define B_HTWAITTXID    14

// Host Dongle's RX FSM State Values
`define HTRXIDLE        15'h0001
`define HTRCVREG        15'h0002
`define HTRCVDS         15'h0004
`define HTRCVPS         15'h0008
`define HTRCVBIST       15'h0010
`define HTRCVDAC        15'h0020
`define HTPIORX1        15'h0040    // *** Debug 4/27: combine with RX2?
`define HTPIORX2        15'h0080    // *** Debug 4/27: combine with RX1?
`define HTDMARX         15'h0100
`define HTBISTTRAN1     15'h0200
`define HTPIORXEND      15'h0400
`define HTDMARXEND      15'h0800
`define HTRXCLEAN       15'h1000
`define HTTXBUSY        15'h2000
`define HTWAITTXID      15'h4000

/*
 * Define the state values and bit position for the Device's Transmit Finite
 * State machine (FSM in dtran_txctl). This FSM implements the "transmit"
 * states described in Section 8.7 (PP. 197-205) of SATA Spec, 1.0.
 */
`define num_dttxfsm     15
`define B_DTTXIDLE      0
`define B_DTCHKTYP      1
`define B_DTREGFIS      2   // Spec's DT_RegHDFIS
`define B_DTPIOSTUP     3   // Spec's DT_PIOSTUPFIS
`define B_DTDMASTUP     4   // Spec's DT_DMASTUPFIS
`define B_DTDMAACT      5   // Spec's DT_DMAACTFIS
`define B_DTXMITBIS     6
`define B_DTDATAFIS     7   // Spec's DT_DATAIFIS
`define B_DTDATATX      8   // Spec's DT_DATAITrans
`define B_DTDTXEND      9   // Spec's DT_DATAIEnd

// The following are collectively referred to as: DT_TransStatus in the spec
`define B_DTREGSTA      10
`define B_DTPIOSSTA     11
`define B_DTDMASSTA     12
`define B_DTDMAASTA     13
`define B_DTBISSTA      14

// Device Dongle's TX FSM State Values
`define DTTXIDLE        15'h0001
`define DTCHKTYP        15'h0002
`define DTREGFIS        15'h0004    // Spec's DT_RegHDFIS
`define DTPIOSTUP       15'h0008    // Spec's DT_PIOSTUPFIS
`define DTDMASTUP       15'h0010    // Spec's DT_DMASTUPFIS
`define DTDMAACT        15'h0020    // Spec's DT_DMAACTFIS
`define DTXMITBIS       15'h0040
`define DTDATAFIS       15'h0080    // Spec's DT_DATAIFIS
`define DTDATATX        15'h0100    // Spec's DT_DATAITrans
`define DTDTXEND        15'h0200    // Spec's DT_DATAIEnd

// The following are collectively referred to as: DT_TransStatus in the spec
`define DTREGSTA        15'h0400
`define DTPIOSSTA       15'h0800
`define DTDMASSTA       15'h1000
`define DTDMAASTA       15'h2000
`define DTBISSTA        15'h4000

/*
 * Define the state values and bit position for the Device's Receive Finite
 * State machine (FSM in dtran_rxctl). This FSM implements the "Decompose"
 * states described in Section 8.7 (PP. 197-210) of SATA Spec, 1.0.
 *
 * The following states are not defined in the spec:
 *      DTWAITTXID: wait for TX to return to Idle state. This is added
 *                  so that TX and RX FSM can be partitioned cleanly.
 *      DTRXCLEAN:  clean up the mess if we receive an unknown FIS
 */
`define num_dtrxfsm     10
`define B_DTRXIDLE      0
`define B_DTRCVREG      1   // Spec's DT_RegHDFIS: receive Register FIS
`define B_DTRCVDMAS     2   // Spec's DT_DMASTUPFIS
`define B_DTRCVBIST     3
`define B_DTRCVDFIS     4   // Spec's DT_DATAOFIS
`define B_DTRCVDATA     5   // Spec's DT_DATAOREC
`define B_DTDEVABORT    6   // Spec's DT_DeviceAbort
`define B_DTBISTTRAN1   7
`define B_DTRXCLEAN     8   // Received an unknown FIS, needs to clean up
`define B_DTWAITTXID    9

// Device Dongle's RX FSM State Values
`define DTRXIDLE        10'h001
`define DTRCVREG        10'h002     // Spec's DT_RegHDFIS: receive Register FIS
`define DTRCVDMAS       10'h004     // Spec's DT_DMASTUPFIS
`define DTRCVBIST       10'h008
`define DTRCVDFIS       10'h010     // Spec's DT_DATAOFIS
`define DTRCVDATA       10'h020     // Spec's DT_DATAOREC
`define DTDEVABORT      10'h040     // Spec's DT_DeviceAbort
`define DTBISTTRAN1     10'h080
`define DTRXCLEAN       10'h100     // Received an unknown FIS, needs to clean up
`define DTWAITTXID      10'h200
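
/*
 * Illustrative sketch only; it is NOT part of the original trans_defs.v.
 * The comment above `log_maxfis asks the reader to keep the counter width
 * in step with the largest "number of words minus one" value by hand. This
 * simulation-only module shows one way to check that relationship. The
 * file name (trans_defs_check.v), the module name, and the function name
 * are hypothetical, and the sketch assumes it lives in its own file with
 * trans_defs.v on the include path.
 */
`include "trans_defs.v"

module trans_defs_check;
    // Number of bits needed to represent the unsigned values 0..max_value
    function integer bits_needed;
        input [31:0] max_value;
        reg   [31:0] v;
        begin
            bits_needed = 0;
            for (v = max_value; v > 0; v = v >> 1)
                bits_needed = bits_needed + 1;
        end
    endfunction

    initial begin
        // `NBFISDMASm1 (6) is the largest count defined above
        if (bits_needed(`NBFISDMASm1) != `log_maxfis)
            $display("*** log_maxfis = %0d, but %0d bits are needed ***",
                     `log_maxfis, bits_needed(`NBFISDMASm1));
        else
            $display("trans_defs_check: log_maxfis matches NBFISDMASm1");
    end
endmodule // trans_defs_check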
/****************************************************************************
 *
 * File Name:       dtrans_txctl.v
 *
 * Comment:         Device Dongle Transport Layer controller for data
 *                  transmission
 *
 * Author:          Shing Kong
 * Creation Date:   3/25/2001
 *
 * $Source: /proj/gemini/cvs_root/P2002/Notes/Style/appendixC,v $
 * $Date: 2001/12/06 21:49:07 $
 * $Revision: 1.1 $
 *
 *===========================================================================
 * Copyright (c) 2001 by Shing Ip Kong. All Rights Reserved.
 ****************************************************************************/
/*
 * $Id: appendixC,v 1.1 2001/12/06 21:49:07 kong Exp $
 */
`include "trans_defs.v"     // See ../../CommonFiles

module dtrans_txctl (
    // Outputs
    tp_acksendreg, tp_acksenddmaa, tp_acksendpios,
    tp_acksenddmas, tp_acksendbist, tp_acksenddata,
    tp_sendndfis, tp_senddafis, tp_partial, tp_slumber, tp_spdsel,
    txfsmidle, txokrxgo, wcount, wrtxfifo,
    sendreg, senddmaa, sendpios, senddmas, sendbista, senddata,
    // Inputs
    at_sendreg, at_senddmaa, at_sendpios,
    at_senddmas, at_sendbista, at_senddata,
    lk_txfsmidle, lk_txerror, r2t_waittxid, r2t_rxempty,
    txfull, txtimeout, txclk4x, tptx_reset);

    /*
     * Outputs to the Parallel ATA Interface Layer (dataif.v)
     */
    output  tp_acksendreg;
    output  tp_acksenddmaa;
    output  tp_acksendpios;
    output  tp_acksenddmas;
    output  tp_acksendbist;
    output  tp_acksenddata;

    /*
     * Outputs to the Link Layer (link.v)
     */
    output  tp_sendndfis;   // Sending a non-data FIS
    output  tp_senddafis;   // Sending a data FIS
    output  tp_partial;
    output  tp_slumber;
    output  tp_spdsel;

    /*
     * Outputs to the Transport Layer Receive Engine (dtrans_rx.v)
     */
    output  txfsmidle;
    output  txokrxgo;       // TX FSM gives RX FSM the OK

    /*
     * Outputs to the Transport Layer Datapath (dtrans_dp.v)
     */
    output [`log_maxfis-1:0]    wcount;
    output  wrtxfifo;
    output  sendreg;
    output  senddmaa;
    output  sendpios;
    output  senddmas;
    output  sendbista;
    output  senddata;

    /*
     * Inputs from the Parallel ATA Interface Layer (dataif.v)
     * These signals must remain asserted until the FSM
     * has changed state
     */
    input   at_sendreg;
    input   at_senddmaa;
    input   at_sendpios;
    input   at_senddmas;
    input   at_sendbista;
    input   at_senddata;

    /*
     * Inputs from the Link Layer (link.v)
     */
    input   lk_txfsmidle;   // TX FSM has returned to IDLE
    input   lk_txerror;     // Error in transmitting a FIS

    /*
     * Inputs from the Device Transport Layer Receive Engine (dtrans_rx.v)
     */
    input   r2t_waittxid;   // RX FSM waiting TX FSM to be idle
    input   r2t_rxempty;    // RX FIFO is empty

    /*
     * Inputs from the Transport Layer Transmit Datapath (dtrans_txdp.v)
     */
    input   txfull;
    input   txtimeout;

    /*
     * Reset signal and clocks
     */
    input   txclk4x;        // 150 MHz Transmit Clock
    input   tptx_reset;

    /*
     * Interconnections within this controller
     */
    wire [`num_dttxfsm-1:0] next_state;
    wire [`num_dttxfsm-1:0] cur_state;

    // Output of the MUXes for selecting the count limit for various states
    wire [`log_maxfis-1:0]  num_regpio;
    wire [`log_maxfis-1:0]  num_dmabis;
    wire [`log_maxfis-1:0]  count_limit;
    wire                    count_enable;
    wire                    count_full;
    wire                    expire;     // Counter has reached count_limit

    /*
     * Next State Logic and the State Register for the finite state machine
     */
    // Next State Logic
    dtrans_txfsm dtrans_txfsm (
        // Outputs
        .next_state     (next_state),
        // Inputs
        .cur_state      (cur_state),
        .at_sendreg     (at_sendreg),
        .at_senddmaa    (at_senddmaa),
        .at_sendpios    (at_sendpios),
        .at_senddmas    (at_senddmas),
        .at_sendbista   (at_sendbista),
        .at_senddata    (at_senddata),
        .lk_txfsmidle   (lk_txfsmidle),
        .lk_txerror     (lk_txerror),
        .r2t_waittxid   (r2t_waittxid),
        .r2t_rxempty    (r2t_rxempty),
        .txtimeout      (txtimeout),
        .expire         (expire),
        .tptx_reset     (tptx_reset));

    // State Register
    v_reg #(`num_dttxfsm) state_ff (cur_state, txclk4x, next_state);

    /*
     * Counter and its MUX tree to select the count limit
     * for the generation of the expire signal
     */
    v_mux2e #(`log_maxfis) regpio_mux (num_regpio,
        cur_state[`B_DTPIOSTUP], `NDFISREGm1, `NDFISPIOSm1);
    v_mux2e #(`log_maxfis) dmabis_mux (num_dmabis,
        cur_state[`B_DTXMITBIS], `NBFISDMASm1, `NBFISBISTAm1);
    v_mux2e #(`log_maxfis) cntlmt_mux (count_limit,
        (cur_state[`B_DTXMITBIS] |
         cur_state[`B_DTDMASTUP]), num_regpio, num_dmabis);

    assign count_enable = cur_state[`B_DTREGFIS]  | cur_state[`B_DTPIOSTUP] |
                          cur_state[`B_DTXMITBIS] | cur_state[`B_DTDMASTUP];

    v_countN #(`log_maxfis) expire_count (
        .count_out      (wcount),
        .count_enable   (count_enable),
        .clk            (txclk4x),
        .reset          (tptx_reset | expire));

    v_comparator #(`log_maxfis) expire_cmp (count_full, wcount, count_limit);
    assign expire = count_full & count_enable;

    /*
     * Output Logic for generating output signals
     */
    assign tp_acksendreg  = cur_state[`B_DTREGFIS];
    assign tp_acksenddmaa = cur_state[`B_DTDMAACT];
    assign tp_acksendpios = cur_state[`B_DTPIOSTUP];
    assign tp_acksenddmas = cur_state[`B_DTDMASTUP];
    assign tp_acksendbist = cur_state[`B_DTXMITBIS];
    assign tp_acksenddata = cur_state[`B_DTDATAFIS];

    assign tp_sendndfis = cur_state[`B_DTREGFIS]  | cur_state[`B_DTPIOSTUP] |
                          cur_state[`B_DTDMASTUP] | cur_state[`B_DTDMAACT]  |
                          cur_state[`B_DTXMITBIS];

    /*** Debug 5/7/2001: may need to fix the logic later for this ***/
    assign tp_senddafis = cur_state[`B_DTDATAFIS] | cur_state[`B_DTDATATX];

    /*** Debug 5/18/2001: need to fix the logic later ***/
    assign tp_partial = 1'b0;
    assign tp_slumber = 1'b0;
    assign tp_spdsel  = 1'b0;

    assign txfsmidle = cur_state[`B_DTTXIDLE];
    assign txokrxgo  = cur_state[`B_DTCHKTYP];

    assign wrtxfifo  = ~txfull & (tp_sendndfis | tp_senddafis);
    assign sendreg   = cur_state[`B_DTREGFIS];
    assign senddmaa  = cur_state[`B_DTDMAACT];
    assign sendpios  = cur_state[`B_DTPIOSTUP];
    assign senddmas  = cur_state[`B_DTDMASTUP];
    assign sendbista = cur_state[`B_DTXMITBIS];
    assign senddata  = cur_state[`B_DTDATAFIS];
endmodule // dtrans_txctl

/****************************************************************************
 * Module dtrans_txfsm: Random logic for the transmit finite state machine
 ****************************************************************************/
module dtrans_txfsm (
    // Outputs
    next_state,
    // Inputs
    cur_state,
    at_sendreg, at_senddmaa, at_sendpios,
    at_senddmas, at_sendbista, at_senddata,
    lk_txfsmidle, lk_txerror, r2t_waittxid, r2t_rxempty,
    txtimeout, expire, tptx_reset);

    output [`num_dttxfsm-1:0]   next_state;
    input  [`num_dttxfsm-1:0]   cur_state;

    /*
     * Inputs from the Parallel ATA Interface (dataif.v)
     */
    input   at_sendreg;
    input   at_senddmaa;
    input   at_sendpios;
    input   at_senddmas;
    input   at_sendbista;
    input   at_senddata;

    /*
     * Inputs from the Link Layer (link.v)
     */
    input   lk_txfsmidle;   // TX FSM has returned to IDLE
    input   lk_txerror;     // Error in transmitting a FIS

    /*
     * Inputs from the Device Transport Layer Receive Engine (dtrans_rx.v)
     */
    input   r2t_waittxid;   // RX FSM waiting TX FSM to be idle
    input   r2t_rxempty;    // RX FIFO is empty

    /*
     * Inputs from the Device Transport Layer Transmit Datapath
     * (dtrans_txdp.v)
     */
    input   txtimeout;      // Link layer did not empty the FIFO
    input   expire;
    input   tptx_reset;

    reg [`num_dttxfsm-1:0]  next_state;

    always @(cur_state or
             at_sendreg or at_senddmaa or at_sendpios or
             at_senddmas or at_sendbista or at_senddata or
             lk_txfsmidle or lk_txerror or r2t_waittxid or r2t_rxempty or
             txtimeout or expire or tptx_reset) begin
        if (tptx_reset) begin
            next_state = `DTTXIDLE;
        end else begin
            case (cur_state)
                `DTTXIDLE:
                    if (~r2t_rxempty) begin
                        /*
                         * Give the receive engine higher priority
                         */
                        next_state = `DTCHKTYP;
                    end else if (~lk_txfsmidle) begin
                        /*
                         * Do not send a new FIS until the Link Layer has
                         * finished reading the last one.
                         */
                        next_state = `DTTXIDLE;
                    end else if (at_sendreg) begin
                        next_state = `DTREGFIS;
                    end else if (at_sendpios) begin
                        next_state = `DTPIOSTUP;
                    end else if (at_senddmas) begin
                        next_state = `DTDMASTUP;
                    end else if (at_senddmaa) begin
                        next_state = `DTDMAACT;
                    end else if (at_sendbista) begin
                        next_state = `DTXMITBIS;
                    end else if (at_senddata) begin
                        next_state = `DTDATAFIS;
                    end else begin
                        /*
                         * Nothing to send: stay idle. Without this branch
                         * next_state is not assigned on every path and a
                         * latch is inferred.
                         */
                        next_state = `DTTXIDLE;
                    end

                `DTCHKTYP:  // Start the Receive Engine for receiving FIS
                    if (r2t_waittxid) begin
                        /*
                         * RX FSM is done receiving
                         */
                        next_state = `DTTXIDLE;
                    end else begin
                        /*
                         * RX FSM is still busy receiving
                         */
                        next_state = `DTCHKTYP;
                    end

                `DTREGFIS:  // Send out a Device-to-Host Register FIS
                    if (~expire) begin
                        next_state = `DTREGFIS;
                    end else begin
                        next_state = `DTREGSTA;
                    end

                `DTPIOSTUP: // Send out a PIO Setup FIS
                    if (~expire) begin
                        next_state = `DTPIOSTUP;
                    end else begin
                        next_state = `DTPIOSSTA;
                    end

                `DTDMASTUP: // Send out a DMA Setup FIS
                    if (~expire) begin
                        next_state = `DTDMASTUP;
                    end else begin
                        next_state = `DTDMASSTA;
                    end

                `DTDMAACT:  // Send out a DMA Activate FIS (only one Dword)
                    next_state = `DTDMAASTA;

                `DTXMITBIS: // Send out a BIST Activate FIS
                    if (~expire) begin
                        next_state = `DTXMITBIS;
                    end else begin
                        next_state = `DTBISSTA;
                    end

                `DTDATAFIS: // Send out a DATA FIS
                    next_state = `DTDATATX;

                `DTDATATX:
                    //*** Debug 5/7/2001: may be combined with DTDATAFIS?
                    //*** Debug ***: may need signals other than expire?
                    if (~expire) begin
                        next_state = `DTDATATX;
                    end else begin
                        next_state = `DTDTXEND;
                    end

                `DTDTXEND:
                    //*** Debug 5/7/2001: may need a better state
                    next_state = `DTTXIDLE;

                `DTREGSTA:  // Check status after sending out a Register FIS
                    if (~lk_txfsmidle & ~txtimeout) begin
                        next_state = `DTREGSTA;
                    end else if (lk_txfsmidle & lk_txerror) begin
                        next_state = `DTREGFIS;     // Retry sending the FIS
                    end else begin
                        next_state = `DTTXIDLE;
                    end

                `DTPIOSSTA: // Check status after sending out a PIO Setup FIS
                    if (~lk_txfsmidle & ~txtimeout) begin
                        next_state = `DTPIOSSTA;
                    end else if (lk_txfsmidle & lk_txerror) begin
                        next_state = `DTPIOSTUP;    // Retry sending the FIS
                    end else begin
                        next_state = `DTTXIDLE;
                    end

                `DTDMASSTA: // Check status after sending the DMA Setup FIS
                    if (~lk_txfsmidle & ~txtimeout) begin
                        next_state = `DTDMASSTA;
                    end else if (lk_txfsmidle & lk_txerror) begin
                        next_state = `DTDMASTUP;    // Retry sending the FIS
                    end else begin
                        next_state = `DTTXIDLE;
                    end

                `DTDMAASTA: // Check status after sending the DMA Activate FIS
                    if (~lk_txfsmidle & ~txtimeout) begin
                        next_state = `DTDMAASTA;
                    end else if (lk_txfsmidle & lk_txerror) begin
                        next_state = `DTDMAACT;     // Retry sending the FIS
                    end else begin
                        next_state = `DTTXIDLE;
                    end

                `DTBISSTA:  // Check status after sending the BIST Activate FIS
                    if (~lk_txfsmidle & ~txtimeout) begin
                        next_state = `DTBISSTA;
                    end else if (lk_txfsmidle & lk_txerror) begin
                        next_state = `DTXMITBIS;    // Retry sending the FIS
                    end else begin
                        next_state = `DTTXIDLE;
                    end

                default: begin
                    // We should never be here; recover by returning to idle
                    next_state = `DTTXIDLE;
                    $display (
                        "*** Warning: Undefined DTP TX State, cur_state = %b ***",
                        cur_state);
                end
            endcase
        end // End else (tptx_reset == 0)
    end // End always
endmodule // dtrans_txfsm
</pre> <hr> <hr> </body> </html>