

#### **VisualSim Training**

Training: Planning, Modeling, Simulation, Advanced Features MIRABILIS

#### Planning



## Agenda - Part 2: Modeling

- Basic Components for hardware modeling
- Performance, Power and Functionality
- Performance and power Metrics
- Library components and important parameters which affect performance

## Performance, Power and Functionality

- VisualSim provides an integrated solution for performance, power and functional modeling
- Power modeling is covered in detail in a separate presentation
- Functional modeling is covered in the section on modeling task graphs
- The user can also perform functional verification by observing the input and output values and checking them for correctness
- Custom algorithms or functional flows can also be defined using the Script Block

### **Basic Components for Hardware Modeling**



## Architecture Library Overview

- Generate architecture with parameterized blocks
- Define hardware and software components
- Create proposed or derivative architectures in a few minutes
- Rapidly define application flow diagram and behavior
- Optimize architecture and functionality mapping combination

### Hardware Modeling Library




## Basic Components to define

1. Hardware modeling requires an Architecture\_Setup block, which holds the routing information for all hardware blocks in the model.

2. If a Traffic or other custom block connects to a Bus, use the Device\_Interface block in front of the Bus port.

   - Also true for AXI, PCIe and NoC
   - Not required for standard blocks like the Processor

3. The Processor block requires an Instruction\_Set block.



The Architecture\_Setup block reports statistics for all the hardware blocks in the model



#### Use of Device Interface

Processor Blocks with InstructionSet



Step2

# Architecture Setup

#### All Bus and Hardware blocks must be associated with an Architecture Setup block

This block handles

- Routing
- Plotting
- Statistics
- Debugging for all the hardware components

1. There can be multiple Architecture Setup blocks
2. Each block must have a unique name

#### Note: this block can be used as is after updating the Architecture\_Name field



### Custom Routing Table Construction - in Architecture Setup

#### Format Sample:

| Source_Node | Destination_Node | Hop        | Source_Port |
|-------------|------------------|------------|-------------|
| Source      | Destination      | Next Block | Out Port    |
| Processor   | DRAM             | Port_1     | bus_out     |

#### Add entries using RegEx

addToRoutingTable(Architecture\_Name, Source\_Name, Destination\_Name, Hop\_Name, Source\_Port\_Name)

#### Delete entries using RegEx

removeFromRoutingTable(Architecture\_Name, Source\_Name, Destination\_Name, Hop\_Name, Source\_Port\_Name)
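
The routing table behavior can be illustrated outside the tool with a minimal Python sketch. This is not VisualSim RegEx code: the function names `add_to_routing_table`, `remove_from_routing_table` and `next_hop`, and the dictionary layout, are hypothetical stand-ins that only mirror the five arguments listed above.

```python
from typing import Dict, Tuple

# Hypothetical in-memory stand-in for the routing table of an Architecture Setup block.
# Key: (source, destination); value: (hop, source_port).
RoutingTable = Dict[Tuple[str, str], Tuple[str, str]]
tables: Dict[str, RoutingTable] = {}


def add_to_routing_table(arch: str, source: str, destination: str,
                         hop: str, source_port: str) -> None:
    """Add (or overwrite) the route from source to destination."""
    tables.setdefault(arch, {})[(source, destination)] = (hop, source_port)


def remove_from_routing_table(arch: str, source: str, destination: str,
                              hop: str, source_port: str) -> bool:
    """Remove the route only if all five fields match; report whether it was removed."""
    table = tables.get(arch, {})
    if table.get((source, destination)) == (hop, source_port):
        del table[(source, destination)]
        return True
    return False


def next_hop(arch: str, source: str, destination: str) -> Tuple[str, str]:
    """Look up the next block and output port for a source/destination pair."""
    return tables[arch][(source, destination)]


# The sample row from the format table above.
add_to_routing_table("Architecture_1", "Processor", "DRAM", "Port_1", "bus_out")
print(next_hop("Architecture_1", "Processor", "DRAM"))  # ('Port_1', 'bus_out')
```

Keying the table on the (source, destination) pair is an assumption made for the sketch; it simply reflects that each row answers "which hop and port should this source use to reach this destination".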



## DeviceInterface Block

- Can be used to define the source name, data size, command type and destination for a Master
- Can be used to define the device name on the Slave side
- Adds the Master or Slave block to the Linear Bus, Bridge and AHB buses automatically
  - ✓ Generates Hello messages
  - ✓ Eliminates the need to use RegEx functions or to generate Hello messages manually
- Maps fields of other data structure formats to the corresponding fields of the Processor\_DS
- Used to connect only when Linear Bus, AHB or Bridge blocks are present
- Not useful with a single AXI or a single PCIe

| Parameter            | Value                           |
|----------------------|---------------------------------|
| Block_Documentation: | Enter User Documentation Here   |
| Architecture_Name:   | "Architecture_1"                |
| A_Source:            | "CPU2"                          |
| A_Destination:       | "Cache"                         |
| A_Command:           | "A_Command"                     |
| A_Instruction:       | "Fld_Name_or_String_or_None"    |
| A_Bytes:             | "A_Bytes"                       |
| A_Priority:          | "Fld_Name_or_Integer"           |
| A_Address:           | "Fld_Name_or_Integer"           |

Set field values here or keep default to use existing field values
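
The "field name or literal" idea behind these parameters can be sketched in Python. This is only an illustration under an assumed resolution rule: if a setting names a field that exists in the incoming data structure, its value is copied, otherwise the setting is used as a literal. The function `apply_device_interface` and the incoming field `Size` are hypothetical.

```python
from typing import Any, Dict


def apply_device_interface(ds: Dict[str, Any], params: Dict[str, str]) -> Dict[str, Any]:
    """Return a copy of the data structure with the A_* fields filled in.

    Assumed rule (not taken from the tool): a setting that names an existing
    field copies that field's value; anything else is treated as a literal.
    """
    out = dict(ds)
    for target_field, setting in params.items():
        out[target_field] = ds[setting] if setting in ds else setting
    return out


# Incoming transaction from a custom traffic source (hypothetical field names).
incoming = {"A_Command": "Read", "Size": 64}

settings = {
    "A_Source": "CPU2",         # literal string
    "A_Destination": "Cache",   # literal string
    "A_Command": "A_Command",   # default: keep the existing field value
    "A_Bytes": "Size",          # map another field onto A_Bytes
}

print(apply_device_interface(incoming, settings))
```

Leaving a parameter at its default name, as with `A_Command` here, keeps whatever value the incoming transaction already carries.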

### Stochastic Components in the Library



## Concept of System Resource

#### Concept

- Split operation into two parts
  - Behavior or mapper
  - Resource (similar to Server)

#### Blocks

- Behavior: Mapper, SoftwareMapper, DynamicMapper
- Architecture: SystemResource\_Extend, SystemResource
- Notify: SystemResource\_Done

#### Multiple concurrent requests

- Send from Mapper (Behavior) to the SystemResource with the delay information
- Can be static or dynamic reference
- Scheduler: First Come-First Serve, Round-Robin, Preemption, Non-Locking

#### SystemResource\_Done block

Release appropriate SystemResource\_Extend block by signaling the completion of an external task


#### Architecture

#### **Behavior**



## Mapper to System Resource

- Mapper blocks define the connectivity between the behavior flow and the architecture flow, and within the architecture flow using a named connection
- The block takes the incoming Data Structure and sends it to the Scheduler virtually
- This block can send a request to either the SystemResource or SystemResource\_Extend.
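
The split between behavior (Mapper) and architecture (SystemResource) can be pictured with a small Python sketch. It is a simplified model, not the tool's implementation: it assumes a First Come-First Serve scheduler, one task in service per resource, and no context-switch time. The `Task` class and `run_fcfs` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Task:
    name: str
    arrival: float          # time the Mapper forwards the request
    task_time: float        # delay carried with the request
    target_resource: str    # named connection to the resource


def run_fcfs(tasks: List[Task]) -> Dict[str, float]:
    """Return the finish time of each task under FCFS on its target resource."""
    free_at: Dict[str, float] = {}   # when each resource next becomes idle
    finish: Dict[str, float] = {}
    for task in sorted(tasks, key=lambda t: t.arrival):
        start = max(task.arrival, free_at.get(task.target_resource, 0.0))
        end = start + task.task_time
        free_at[task.target_resource] = end
        finish[task.name] = end
    return finish


# Three requests compete for the same resource named "CPU".
requests = [
    Task("T1", arrival=0.0, task_time=2.0, target_resource="CPU"),
    Task("T2", arrival=1.0, task_time=3.0, target_resource="CPU"),
    Task("T3", arrival=1.5, task_time=1.0, target_resource="CPU"),
]
print(run_fcfs(requests))  # {'T1': 2.0, 'T2': 5.0, 'T3': 6.0}
```

The mapper side contributes no delay of its own in this sketch; all of the waiting comes from contention at the resource.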








## System Resource (Cont.)

Edit parameters for SystemResource

| Parameter                  | Value                         | Notes |
|----------------------------|-------------------------------|-------|
| Block_Documentation:       | Enter User Documentation Here |       |
| Resource_Name:             | "CPU"                         |       |
| Next_Resource:             | "Fld_Name_or_String_or_None"  |       |
| Task_Context_Switch_Time:  | 0.0                           |       |
| Round_Robin_Time_Slice:    | 1.0E-3                        | Time the scheduler will devote to each task for Round Robin |
| Clock_Rate_Mhz:            | 500.0                         |       |
| Max_Scheduler_Length:      | 30                            |       |
| Time_Type:                 | Relative Time                 |       |
| Scheduler_Type:            | Scheduler_FCFS                | Set the scheduler type from the range of schedulers: Scheduler_FCFS, FCFS + Preempt, Scheduler_RR, Scheduler_User_1, Scheduler_User_2 |
| Add_Scheduler_Times_to_DS: |                               |       |

Application comparisons

| Features              | SystemResource | SystemResource_Extend |
|-----------------------|----------------|-----------------------|
| Preemption            | Yes            | No                    |
| Hierarc...            | Yes            | No                    |
| ...ed Task Process... | No             | Yes                   |
| Non-Blocking          | No             | Yes                   |
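
The effect of a round-robin time slice can be shown with a short Python sketch. It is a simplified model rather than the tool's scheduler: all tasks are assumed to arrive at time 0, context-switch time is taken as zero, and time is in arbitrary units. The function `round_robin` is a hypothetical name.

```python
from collections import deque
from typing import Dict


def round_robin(task_times: Dict[str, float], time_slice: float) -> Dict[str, float]:
    """Return the finish time of each task when served in fixed time slices."""
    queue = deque(task_times.items())
    clock = 0.0
    finish: Dict[str, float] = {}
    while queue:
        name, remaining = queue.popleft()
        run = min(remaining, time_slice)   # run for one slice or until done
        clock += run
        remaining -= run
        if remaining > 1e-12:
            queue.append((name, remaining))  # not finished: back of the queue
        else:
            finish[name] = clock
    return finish


print(round_robin({"A": 3.0, "B": 1.0, "C": 2.0}, time_slice=1.0))
# {'B': 2.0, 'C': 5.0, 'A': 6.0}
```

With the same three tasks under First Come-First Serve, A would finish at 3.0, B at 4.0 and C at 6.0, so the short task B benefits most from time slicing.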



- Double click to configure



System Resource Extended

## What is Mapper?

#### Connect behavior flow with architecture resources

- Takes incoming Data Structure and sends to
  - SystemResource
  - SystemResource\_Extend blocks

Mapper

- Placed in the behavior flow where timed resources are required
- Consumes zero time, no queue, no arbitration

Edit parameters for Mapper3

| Parameter            | Value                          |
|----------------------|--------------------------------|
| Block_Documentation: | Enter User Documentation Here  |
| Target_Resource:     | CPU                            |
| Task_Number:         | 1                              |
| Task_Priority:       | Task_Priority_Fld_Int_Dbl_Expr |
| Task_Time:           | input.time                     |
| Task_Plot_ID:        | 1                              |

Edit parameters for SystemResource

| Parameter                  | Value                         |
|----------------------------|-------------------------------|
| Block_Documentation:       | Enter User Documentation Here |
| Resource_Name:             | "CPU"                         |
| Next_Resource:             | "Fld_Name_or_String_or_None"  |
| Task_Context_Switch_Time:  | 0.0                           |
| Round_Robin_Time_Slice:    | 1.0E-3                        |
| Clock_Rate_Mhz:            | 500.0                         |
| Max_Scheduler_Length:      | 30                            |
| Time_Type:                 | Relative Time                 |
| Scheduler_Type:            | Scheduler_FCFS                |
| Add_Scheduler_Times_to_DS: |                               |


System Resource

### Software Mapper: Hardware or Software Task Issuer



- Sends tasks to SystemResource or SystemResource\_Extend based on Target\_Resource
- The block can either queue the incoming Data Structure or send it to the SystemResource immediately

Edit parameters for SoftwareMapper

| Parameter              | Value                         |
|------------------------|-------------------------------|
| Block_Documentation:   | Enter User Documentation Here |
| Target_Resource:       |                               |
| Task_Number:           |                               |
| Task_Priority:         |                               |
| Task_Mean_Time:        | Scan_Proc_Time                |
| Task_Spread_Time:      | Task_Spread_Flo               |
| Random_Seed:           | 123457L                       |
| Task_Distribution:     | Fixed (Mean)                  |
| Task_Type:             | Queue Task Nov                |
| Task_Mutual_Exclusion: |                               |

- Target\_Resource, Task\_Number and Task\_Priority: attributes to issue the task to the System Resource
- Task\_Mutual\_Exclusion: ... all other tasks preempting this Task at SystemResource

Example model (Test Model - Scheduler Digital): 3 tasks competing for the same resource. The PowerTable is updated with the power consumed by the SystemResource blocks. Note that the RegEx expression regarding power\_manager is used in the ExpressionList block.

\$VS/doc/Training Material/Tutorial/General/System Resources/Scheduler Sw 2.xml


## Plots and Stats for Example Model 1



## Different Types of Scheduling Algorithms in SystemResource



Scheduler\_RR

#### Scheduler\_FCFS+Preempt

## Example Model – System Resource Extend





## Queue

Edit parameters for Queues

| Parameter               | Value                         | Notes |
|-------------------------|-------------------------------|-------|
| Block_Documentation:    | Enter User Documentation Here |       |
| Block_Name:             | "Queue"                       |       |
| Queue_Number_Field:     | input.queue                   |       |
| Priority_Field:         | input.priority                | Provides the priority number for reordering the queue |
| Max_Queue_Length:       | 30                            |       |
| Number_of_Queues:       | 1                             |       |
| Initial_Queue_State:    | First_Token_Flow_Through      |       |
| Queue_Reject_Mechanism: | Incoming_Token_Rejected       |       |
| Queue_Type:             | FIFO                          | Set how the packets should flow |

Ports: input, output, reject\_output

- Transactions are sent to the respective queue according to Queue\_No
- Transactions are arranged as per priority
- Transactions are pushed out according to the arbitration scheme

Figure: Queue1 holds T1 and T2, Queue2 holds T3 and T4, and so on up to Queue-N (Queue N with Priority).

Mirabilis Design Inc.


## Queue – First Token Flow Through





## Queue – First Token Enqueue



pop\_input is required to send all the packets including the first one



| Parameter               | Value                   |
|-------------------------|-------------------------|
| Number_of_Queues:       | Number_Of_Queues        |
| Initial_Queue_State:    | First_Token_Enqueue     |
| Queue_Reject_Mechanism: | Incoming_Token_Rejected |
| Queue_Type:             | FIFO                    |

## Multiple number of Queues



Edit parameters for Queues

| Parameter               | Value                         |
|-------------------------|-------------------------------|
| Block_Documentation:    | Enter User Documentation Here |
| Block_Name:             | "Ingress_Queue"               |
| Queue_Number_Field:     | input.Queue_Num               |
| Priority_Field:         | Int_Dbl_Expr_Mem_Fld          |
| Max_Queue_Length:       | 5                             |
| Number_of_Queues:       | 3                             |
| Initial_Queue_State:    | First_Token_Enqueue           |
| Queue_Reject_Mechanism: | Incoming_Token_Rejected       |
| Queue_Type:             | FIFO                          |

Internally, the "Queue" block looks like this: three queues (Q Num=1, Q Num=2, Q Num=3), each with positions Pos=1 to Pos=5. Pos=1 holds P1, P3 and P5, Pos=2 holds P2, Pos=3 holds P4, and Pos=4 and Pos=5 are empty.

pop\_input value can be 1, 2 or 3

### **Queue Operation - Summary**



- Data Structures are queued based on priority, from high to low number
- Data Structures in the queue are arranged based on the FIFO or LIFO setting
- Number\_of\_Queues defines the number of parallel queues contained in a single Queue block
- Queue\_Number\_Field selects the queue in which the Data Structure is placed
- To pop a packet
  - From the head of a queue, the queue number must be sent to the pop\_input port.
- When Max\_Queue\_Length is reached, packets are rejected based on Queue\_Reject\_Mechanism and sent to reject\_output
- Based on the Initial\_Queue\_State parameter,
  - First\_Token\_Enqueue: the first transaction is enqueued and waits for the pop
  - First\_Token\_Flow\_Through: the first transaction is sent without a pop. After the first packet, the head of the queue is sent if the prior one was acknowledged with a pop
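
The summary above can be sketched as a small Python class. It is a simplified model, not the Queue block itself, and makes several assumptions: a larger priority number is served first, ties keep FIFO order, the incoming item is the one rejected when a queue is full, and every item needs an explicit pop (the First Token Enqueue behavior). The class `QueueBlock` and the field names `queue` and `priority` are hypothetical.

```python
from typing import Any, Dict, List, Optional, Tuple


class QueueBlock:
    """Several parallel priority-ordered queues with a shared length limit."""

    def __init__(self, number_of_queues: int, max_queue_length: int) -> None:
        self.queues: Dict[int, List[Tuple[int, int, Dict[str, Any]]]] = {
            n: [] for n in range(1, number_of_queues + 1)
        }
        self.max_queue_length = max_queue_length
        self.rejected: List[Dict[str, Any]] = []   # stands in for reject_output
        self._seq = 0                              # arrival counter for FIFO ties

    def push(self, ds: Dict[str, Any]) -> bool:
        """Place a data structure in the queue named by its 'queue' field."""
        queue = self.queues[ds["queue"]]
        if len(queue) >= self.max_queue_length:
            self.rejected.append(ds)               # incoming item is rejected
            return False
        self._seq += 1
        queue.append((-ds.get("priority", 0), self._seq, ds))
        queue.sort(key=lambda entry: entry[:2])    # priority first, then arrival
        return True

    def pop(self, queue_number: int) -> Optional[Dict[str, Any]]:
        """Release the head of the selected queue, as a pop_input value would."""
        queue = self.queues[queue_number]
        return queue.pop(0)[2] if queue else None


block = QueueBlock(number_of_queues=3, max_queue_length=2)
block.push({"name": "P1", "queue": 1, "priority": 0})
block.push({"name": "P2", "queue": 1, "priority": 5})
block.push({"name": "P3", "queue": 1, "priority": 9})   # queue 1 is full: rejected
print(block.pop(1)["name"])                              # P2
print([ds["name"] for ds in block.rejected])             # ['P3']
```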



## Server

- Define multiple {queues + time delay}
  - Active Resource
  - DataStructures queued in FIFO or LIFO order
- Processing time is known in advance
  - Provided along with the transaction to this block.
- SLOT
  - Special operation mechanism
  - Models any slot-based architecture such as multiple virtual RTOS, TDMA etc.





Basic Timed Queue



## Server

Edit parameters for Server

| Parameter            | Value                         |
|----------------------|-------------------------------|
| Block_Documentation: | Enter User Documentation Here |
| Block_Name:          | "server"                      |
| Queue_Number_Field:  | input.queue                   |
| Priority_Field:      | input.priority                |
| Time_Field:          | input.time                    |
| Max_Queue_Length:    | 30                            |
| Number_of_Queues:    | 1                             |
| Queue_Type:          | FIFO                          |

The Server has a special parameter called Time\_Field that delays the head of the queue before sending it out.

Number\_of\_Queues can be multiple in the Server as well




## Server - Summary

- *Queue\_Number\_Field* selects the queue
- Queue is reordered based on *Priority\_Field*
- Queue data in FIFO or LIFO order based on *Queue\_Type*
- Delayed by the *Time\_Field* value at the head of the queue and then sent out
- Packet is sent to *reject\_output* when Max\_Queue\_Length is reached
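
A timed queue of this kind can be sketched in a few lines of Python. This is a simplified single-queue FIFO model, not the Server block: it assumes the item in service counts toward the queue length and that an arriving item is rejected when the limit is reached. The function `serve` is a hypothetical name.

```python
from typing import List, Tuple


def serve(arrivals: List[Tuple[str, float, float]],
          max_queue_length: int) -> Tuple[List[Tuple[str, float]], List[str]]:
    """Process (name, arrival_time, delay) items in FIFO order.

    Returns the (name, departure_time) pairs and the names of rejected items.
    """
    departures: List[Tuple[str, float]] = []
    rejected: List[str] = []
    for name, arrival, delay in sorted(arrivals, key=lambda a: a[1]):
        # Items still in the block (waiting or in service) when this one arrives.
        occupancy = sum(1 for _, departure in departures if departure > arrival)
        if occupancy >= max_queue_length:
            rejected.append(name)
            continue
        previous = departures[-1][1] if departures else 0.0
        start = max(arrival, previous)   # wait for the head of the queue to leave
        departures.append((name, start + delay))
    return departures, rejected


done, dropped = serve([("P1", 0.0, 2.0), ("P2", 0.5, 1.0), ("P3", 1.0, 1.0)],
                      max_queue_length=2)
print(done)     # [('P1', 2.0), ('P2', 3.0)]
print(dropped)  # ['P3']
```

Because the processing time travels with each transaction, the departure time is fully determined by the arrival time, the delay value and the items ahead in the queue.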

Edit parameters for Server

| Parameter            | Value                         |
|----------------------|-------------------------------|
| Block_Documentation: | Enter User Documentation Here |
| Block_Name:          | "Server"                      |
| Queue_Number_Field:  | input.Queue_Num               |
| Priority_Field:      | 1                             |
| Time_Field:          | input.Delay                   |
| Max_Queue_Length:    | 30                            |
| Number_of_Queues:    | 5                             |
| Queue_Type:          | FIFO                          |

## **Performance and Power Metrics**



#### Common Statistics

- ✓ End-to-end latency and Task Delay
- ✓ Throughput (MIPS or MB/s), Utilization (%)
- ✓ Minimum, maximum, mean and standard deviation statistics
- ✓ Power: Instantaneous, Average and Cumulative
- ✓ Battery Current Capacity (Watt Hr), Battery Life Remaining (%)
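
As a rough illustration of how such statistics relate to raw simulation data, the following Python sketch computes them from a list of latencies, a busy time and a byte count. The function names are hypothetical, and the use of the population standard deviation is an assumption; the tool's exact definitions are not stated here.

```python
import statistics
from typing import Dict, List


def summarize(latencies: List[float]) -> Dict[str, float]:
    """Minimum, maximum, mean and standard deviation of a set of samples."""
    return {
        "min": min(latencies),
        "max": max(latencies),
        "mean": statistics.mean(latencies),
        "stdev": statistics.pstdev(latencies),  # population form (assumed)
    }


def utilization_pct(busy_time: float, sim_time: float) -> float:
    """Share of the simulated time during which the resource was busy."""
    return 100.0 * busy_time / sim_time


def throughput_mbs(bytes_moved: float, sim_time: float) -> float:
    """Data moved per second, in MB/s (1 MB taken as 1e6 bytes)."""
    return bytes_moved / sim_time / 1e6


print(summarize([1.0, 2.0, 3.0]))
print(utilization_pct(busy_time=0.25, sim_time=1.0))    # 25.0
print(throughput_mbs(bytes_moved=176e6, sim_time=1.0))  # 176.0
```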

#### Processor

- ✓ Individual statistics for Caches, Execution Units and Pipeline Stages
- ✓ Flush Time, Stall (%), Thread swaps and Context switching
- ✓ Listener for Real-Time pipeline activity

#### Cache

- ✓ Hit-miss Ratio
- ✓ Requests, utilization, evictions, prefetches, latency and throughput



### Hardware Statistics

| Name                                          | Value       | Name                                          | Value        | Name                                        | Value       |
|-----------------------------------------------|-------------|-----------------------------------------------|--------------|---------------------------------------------|-------------|
| Bus_1_Utilization_Pct_Max                     | 10.1        | Bus_1_Delay_Max                               | 3.600004E-8  | Processor_1_D_1_Hit_Ratio_Max               | 100.0       |
| Cache_1_Utilization_Pct_StDev                 |             | Bus_1_IOs_per_sec_StDev                       | 634428.24721 | Processor_1_D_1_KB_per_Thread_StDev         | 0.0         |
|                                               |             |                                               |              | Processor_1_I_1_Hit_Ratio_Max               | 100.0       |
| Processor_1_D_1_Utilization_Pct_Max           | 1.45        | Bus_1_Input_Buffer_Occupancy_in_Words_Max     | 32.0         | Processor_1_I_1_KB_per_Thread_StDev         | 0.0         |
| Processor_1_INT_1_Utilization_Pct_StDev       | 0.10969973  | Bus_1_Preempt_Buffer_Occupancy_in_Words_StDev | 0.0          | Processor_1_L_2_Hit_Ratio_Max               | 100.0       |
| Processor_1_INT_2_Utilization_Pct_Max         | 24.55       | Bus_1_Throughput_MBs_Max                      | 176.0        | Processor_1_L_2_KB_per_Thread_StDev         | 0.0         |
| Processor_1_I_1_Utilization_Pct_StDev         | 0.01500007  | Cache_1_Delay_Time_StDev                      | 1.499871E-8  | Processor_1_Stall_Time_Pct_Max              | 1.35        |
| Processor_1_L_2_Utilization_Pct_Max           | 3.1         | Cache_1_Hit_Ratio_Max                         | 100.0        | Processor_1_Task_Delay_StDev                | 2.9502E-7   |
| Processor_1_PROC_Utilization_Pct_StDev        | 0.03489914  | Cache_1_Memory_Used_By_Processor_1_MB_StDev   | 8.455181E-5  | SDRAM_1_Delay_Time_Max                      | 1.5E-7      |
| Processor_1_Pipeline_Utilization_Pct_Max      | 50.2        | Cache_1_Memory_Used_By_SDRAM_1_MB_Max         | 2.56E-4      | SDRAM_1_Memory_Used_By_Processor_1_MB_StDev | 0.0         |
| Processor_1_Register_Rd_Utilization_Pct_StDev | 0.19330594  | Cache_1_Memory_Used_By_Total_MB_StDev         | 8.455151E-5  | SDRAM_1_Memory_Used_By_Total_MB_Max         | 2.56E-4     |
| Processor_1_Register_Wr_Utilization_Pct_Max   |             | Cache_1_Throughput_MBs_Max                    | 116.0        | SDRAM_1_Throughput_MBs_StDev                | 0.0         |
|                                               | 1.91E-07    | Processor_1_Context_Switch_Time_Pct_StDev     | 0.0          | DMA_IO_per_sec_Max                          | 7.45E6      |
|                                               |             |                                               |              | DMA_Throughput_MBs_StDev                    | 3.463999999 |

### Analyzing Performance in VisualSim – Important Parameters in Library Components

### Hybrid Processor





Outstanding\_Req\_Count is used for External Cache

### **Processor – Internal Cache Configuration**

Edit parameters for ARM9

Block\_Documentation: Mapping of ARM9 CPU architecture parameters to generic CPU. Please NOTE that this is not goldenized and is only for Demo purpose.

| Parameter              | Value          |
|------------------------|----------------|
| Architecture_Name:     | Board Name     |
| Processor_Name:        | Processor Name |
| Enable_Hello_Messages: | ✓              |
| Processor_Bits:        | 32             |

Processor\_Setup (first row contains column names):

| Parameter_Name                  | Parameter_Value | Comment |
|---------------------------------|-----------------|---------|
| Processor_Instruction_Set       | ARM INSTR       |         |
| Number_of_Registers             | 16              | active registers being processed. total reg = 31 |
| Processor_Speed_Mhz             | Processor_Speed |         |
| Context_Switch_Cycles           | 200             | switch between internal pipeline stages |
| Instruction_Queue_Length        | 100             | This can be assumed. Real ARM data ? |
| Number_of_Pipeline_Stages       | 5               | after 5'th stage is implemented |
| Number_of_INT_Execution_Units   | 2               | 1 execution unit 1 coproc unit |
| Number_of_FP_Execution_Units    | 0               | no particular FP units |
| Number_of_Cache_Execution_Units | 2               | I-cache and D-cache |
| I_1                             | {Cache_Speed_Mhz=Processor_Speed, Size_KBytes=I_Cache_Size, Words_per_Cache_Line=8, Hit_Ratio=0.4, Cache_Miss_Name=Cache} | |
| D_1                             | {Cache_Speed_Mhz=Processor_Speed, Size_KBytes=D_Cache_Size, Words_per_Cache_Line=8, Hit_Ratio=0.6, Cache_Miss_Name=Cache} | |

- Micro Architecture details: Number of Registers, ROB size, Number of Registers
- Hit\_Ratio by default will be 0.99
- Mention the next level Cache to be accessed in case of misses

Pipeline\_Stages (pipeline stages in ARM):

| Stage_Name   | Execution Location | Condition | Action | Comment                   |
|--------------|--------------------|-----------|--------|---------------------------|
| 1_FETCH      | I_1                | instr     | none   | Fetch                     |
| 2_DECODE     | I_1                | wait      | none   | Decode                    |
| 3_DECODE     | none               | none      | exec   | Write                     |
| 3_DECODE     | none               | exec      | none   | Read                      |
| 3_EXECUTE    | ARM                | exec      | none   | Execute ARM instr         |
| 3_EXECUTE    | Co_Processor_Name  | task      | VPU    | Execute MOVE instr        |
| 4_MEMORY     | ARM                | wait      | none   | Wait for ARM instr        |
| 4_MEMORY     | Co_Processor_Name  | wait      | none   | Wait for MOVE instr       |
| 5_WRITE_BACK | D_1                | write     | none   | Wait for Mem transaction  |

#### **Processor Instruction Set**

Processor name



| le Edit | Halp   |       |          |                   |              |                       |                                                       |                                                  |                              |
|---------|--------|-------|----------|-------------------|--------------|-----------------------|-------------------------------------------------------|--------------------------------------------------|------------------------------|
|         |        | ructi | lon Set  | or File Path. */  |              |                       |                                                       |                                                  |                              |

Instruction set file excerpt recovered from the slide. Annotations on the slide: "Execution and Load Store Units within RISCV and the corresponding instruction groups they execute", "Instruction Group", and "Queue Size for each Instruction Group".

Execution unit mapping (legible entries only):

- LDSTR INT\_4 ; /\* Load/Store instructions \*/
- Floating point/NEON/ASIMD instructions map to FP\_1
- The simple ALU mapping is not legible

execUnit\_config (queue size for each instruction group):

| Instruction group | Queue_Size |
|-------------------|------------|
| INT_1 | 16 |
| INT_2 | 16 |
| INT_3 | 16 |
| INT_4 | 16 |
| INT_5 | 16 |
| FP_1 | 16 |
| FP_2 | 16 |
| INT_6 | 12 |
| INT_7 | 12 |

size\_config entries:

- Read 5 128 INT\_4[1:2] ;
- Read 5 64 INT\_4[3:7] ;
- Read 5 32 INT\_4[8:16] ;
- Read 5 16 (last field not legible) ;
- Read 5 8 INT\_4[19:20] ;
- Write 5 128 INT\_4[21:22] ;
- Write 5 64 INT\_4[23:27] ;
- Write 5 32 INT\_4[28:35] ;
- Write 5 16 sh ;
- Write 5 8 sb ;

Instruction group INT\_1:

- ADD 2 ;
- SUB 2 ;
- (one entry with latency 2, mnemonic not legible)
- MUL 4 ;
- DIV 4 ;
- LDR 1 ;
- LDUR 1 ;
- STR 1 ;

Instruction group FP\_1:

- FADD 2 ;
- FSUB 2 ;
- FMUL 4 8 ;
- FDIV 4 12 ;

Instruction width in bits can vary for different load/store instructions

The entry FADD 2 ; means that the instruction "FADD" takes 2 cycles to complete execution, not including the I\_Cache access latency and the pipeline transfer latency.

The entry FMUL 4 8 ; means that the instruction "FMUL" takes a random delay of between 4 and 8 cycles to complete execution.
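The two entry forms above can be illustrated with a short Python sketch. This is not VisualSim code: the function names `parse_group_entry` and `sample_cycles` are hypothetical, and the sketch assumes the random delay is uniformly distributed with both bounds inclusive, which the slides do not state.

```python
import random

def parse_group_entry(line):
    """Parse one instruction-group entry such as 'FADD 2 ;' or 'FMUL 4 8 ;'.

    Returns (mnemonic, min_cycles, max_cycles). A single latency value
    gives min_cycles == max_cycles.
    """
    # Drop an optional trailing /* comment */ and the terminating ';'.
    text = line.split("/*")[0].replace(";", " ")
    fields = text.split()
    if len(fields) == 2:
        name, fixed = fields
        return name, int(fixed), int(fixed)
    if len(fields) == 3:
        name, low, high = fields
        return name, int(low), int(high)
    raise ValueError(f"unrecognized entry: {line!r}")

def sample_cycles(entry, rng=random):
    """Pick an execution latency for one instruction.

    Assumption: uniform distribution, bounds inclusive.
    """
    _, low, high = entry
    return rng.randint(low, high)

# Entries taken from the FP_1 group shown on the slide.
entries = [parse_group_entry(s) for s in
           ("FADD 2 ;", "FSUB 2 ;", "FMUL 4 8 ;", "FDIV 4 12 ;")]

rng = random.Random(0)  # seeded so repeated runs give the same latencies
for entry in entries:
    print(entry[0], sample_cycles(entry, rng))
```

As in the text above, these cycle counts cover execution only; I\_Cache access latency and pipeline transfer latency would be added separately.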



#### Example of Pipeline usage



## DMA BLOCK AND ITS USAGE



# PORTS AND HOW TO CONNECT?





1. **Req** port receives requests from the source (a Traffic generator in this case). It is connected to a DeviceInterface.
2. **Ack** port outputs the received transactions with additional fields such as Time\_Array and Trace\_Array, which are useful for debugging.
3. **Ack** and **Din** are connected to the output blocks, either via a DeviceInterface or directly.
4. **Dout** carries the response from output devices such as Memory. It is connected to a Bus, a DeviceInterface or a Memory.
5. **Dout** also carries Task\_Latency and Task\_Throughput as additional fields.



#### DMA – DEFINING ACTIONS USING DATA STRUCTURE



To define the DMA actions from data structure fields, the fields shown above are required.

#### USING THE DMA DATABASE



#### **DMA Controller Block**



## DMA – DEMO MODELS AND RESULTS



# MODEL SHOWCASING DS FIELDS AS WELL AS DMA DATABASE OPTION



#### **Model Location :**

\$VS/doc/Training\_Material/Tutorial/Architecture/DMA/DMA\_Demo\_Model\_Field\_n\_DB.xml



#### **RESULTS FOR THE PREVIOUS MODEL**





### CONNECTING PROCESSOR – DMA – DRAM



Three pieces are required: the DMA block, the DMADatabase block, and the processor entry Memory\_Database\_Reference: DMADatabase, where DMADatabase is the name of the Database block containing the DMA task activity. It is best to connect the DMA to one of the processor's ports, either directly or via a Bus.
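The dependency between these pieces can be sketched as a small consistency check in Python. This is illustrative only and not part of VisualSim: the dictionaries standing in for the model and the function `check_processor_dma_setup` are hypothetical. The point it shows is that the processor's Memory\_Database\_Reference must match the name of a Database block in the model.

```python
def check_processor_dma_setup(block_types, processor_params):
    """Return a list of problems found in a Processor-DMA setup.

    block_types: dict mapping block name -> block type (hypothetical
                 stand-in for the blocks placed in the model).
    processor_params: dict of processor parameter name -> value.
    """
    problems = []

    # Piece 1: a DMA block must be present.
    if "DMA" not in block_types.values():
        problems.append("no DMA block in the model")

    # Pieces 2 and 3: the processor entry must name an existing Database block.
    ref = processor_params.get("Memory_Database_Reference")
    if ref is None:
        problems.append("processor has no Memory_Database_Reference entry")
    elif block_types.get(ref) != "Database":
        problems.append(f"Memory_Database_Reference {ref!r} is not a Database block")

    return problems

# Example mirroring the text: the Database block is named DMADatabase.
model_blocks = {"Processor": "Processor", "DMA": "DMA", "DMADatabase": "Database"}
processor = {"Memory_Database_Reference": "DMADatabase"}
print(check_processor_dma_setup(model_blocks, processor))
```

An empty list means all three pieces are present and the reference matches the Database block name.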

The HW\_DRAM plot has no graphical output; it is not used in this model.

**Model Location :** \$VS/doc/Training\_Material/Tutorial/Architecture/DMA/Processor\_DMA.xml



#### **RESULTS FOR THE PREVIOUS MODEL**



## BUS ARBITER AND BUS INTERFACE BLOCKS AND THEIR USAGE



# PORTS AND HOW TO CONNECT





#### **KEY CONFIGURATION**

Edit parameters for BusArbiter2

| Parameter | Value | Notes |
|-----------|-------|-------|
| Block_Documentation | Enter User Documentation Here | |
| Architecture_Name | "Architecture_1" | |
| _explanation | HardwareDevices->BusArbiter | |
| Bus_Name | Cluster_Name+"_DSU_Cache_Bus" | Required. Unique name |
| Bus_Speed_Mhz | Core_Speed_Mhz | Required |
| Burst_Size_Bytes | 100 | |
| Round_Robin_Port_Array | {"Port_1", "Port_2"} | |
| Devices_Attached_to_Slave_by_Port | {"Device_2"}, {"Device_3"}, {"Device_4"}, {"Device_5"}, {"Device_6"}, {"Device_7"} (value truncated at both ends in the source) | List all devices connected via this port. Index is in order of name: Port_1, Port_2, etc. |
| Width_Bytes | (value not legible) | |
| Arbiter_Mode | FCFS | |
| Split_Retry_Flag | checked | |
| this port. Index is in order of Split_Retry_Flag:                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
                                                 |                          |  | ▲ Width_Bytes:                     | {16,64} <b>{Read, Write}</b>  |             |   |
| name Bort 1 Bort 2 otc                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                                                 |                          |  | Arbiter_Mode:                      | FCFS                          |             | × |
| name-Port 1. Port 2 etc.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                                 |                          |  | Split_Retry_Flag:                  | $\checkmark$                  |             |   |
| Enable_Plots:                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
                                                 | name- Port_1, Port_2 etc |  | Enable_Plots:                      |                               |             |   |








Alternate Read and Write operation: the bus performs only one operation (read or write) at a time.



# **RESULTS FOR THE PREVIOUS MODEL**





#### TILELINK AND ITS USAGE



# PORTS AND HOW TO CONNECT?

1. Sources (master devices that request data) such as Processor and DMA (standard blocks) must be connected to the master ports through a TileLink\_Client (Library --> Interfaces and Buses --> TileLink --> TileLink\_Client).





## CONTINUED

2. If the master device is built from simple library blocks (as logic), the user must include a Device\_Interface block to provide the interface for the master.





#### CONTINUED

3. If the TileLink\_Client connects to another bus such as PCIe or AMBA AXI, the user must use a Bridge (Library --> HardwareDevices --> Bridge).





### KEY CONFIGURATION

Edit parameters for TileLink

| Parameter                                   | Value                                                                                                 |
|---------------------------------------------|-------------------------------------------------------------------------------------------------------|
| TileLink_Name:                              | "TileLink"                                                                                            |
| TileLink_Speed_Mhz:                         | 1000.0                                                                                                |
| TileLink_Cycle_Time:                        | (1.0E-6/TileLink_Speed_Mhz)                                                                           |
| Bus_Width:                                  | 8                                                                                                     |
| Number_Masters:                             | 4                                                                                                     |
| Number_Slaves:                              | 1                                                                                                     |
| Slave_Speeds_Mhz:                           | {TileLink_Speed_Mhz, TileLink_Speed_Mhz, TileLink_Speed_Mhz, TileLink_Speed_Mhz, TileLin              |
| Extra_Cycles_for_RdReq_WrReq_RdData_WrData: | {0, 0, 0, 0, 0, 0, 0, 0}                                                                              |
| Devices_Attached_to_Slave_by_Port:          | {{"L2","SDRAM_1"},{"SDRAM_2"},{"SDRAM_3"},{"SDRAM_4"},{"SDRAM_5"},{"SDRAM_6"}                         |
| Master_First_Word_Flag:                     | true                                                                                                  |
| DEBUG:                                      | false                                                                                                 |
| Ports_to_Plot:                              | {1,1} /* master n, slave m, 0 disables */                                                             |
| Managers_Attached_to_Slave_by_Port:         | {{"Manager1"},{"Manager2"},{"Manager3"},{"Manager4"},{"Manager5"},{"Manager6"},{"Ma                   |
| Architecture_Name:                          | "Architecture_1"                                                                                      |

#### TileLink Block

From Architecture Setup
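
As a quick check of the `TileLink_Cycle_Time` expression in the TileLink dialog, the arithmetic can be worked through in Python. This is only a sketch and not VisualSim code: the variable names mirror the dialog parameters, and the peak-bandwidth line assumes that `Bus_Width` is given in bytes and that one bus-width transfer completes per cycle, neither of which the slide states.

```python
# Values copied from the TileLink dialog
tilelink_speed_mhz = 1000.0   # TileLink_Speed_Mhz
bus_width = 8                 # Bus_Width (assumed to be bytes)

# Same expression as the TileLink_Cycle_Time parameter: (1.0E-6 / TileLink_Speed_Mhz)
tilelink_cycle_time = 1.0e-6 / tilelink_speed_mhz   # seconds per bus cycle

# Assumption: one Bus_Width transfer per cycle gives the theoretical peak rate
peak_bytes_per_second = bus_width / tilelink_cycle_time

print(f"cycle time     = {tilelink_cycle_time:.3e} s")
print(f"peak bandwidth = {peak_bytes_per_second / 1e9:.1f} GB/s")
```

Under these assumptions a 1000 MHz bus has a 1 ns cycle; halving `tilelink_speed_mhz` doubles the cycle time and halves the peak rate.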

Edit parameters for TileLink\_Client2

| Parameter               | Value            |
|-------------------------|------------------|
| Client_Name:            | "Client2"        |
| Client_Speed:           | 1000.0           |
| Release_Threshold:      | 5                |
| Architecture_Name:      | "Architecture_1" |
| No_Of_Retry_FirstBurst: | 3                |

#### TileLink\_Client

No\_Of\_Retry\_FirstBurst refers to how many times a TileLink client attempts to send a burst of data to a target device before giving up and retrying the transaction.

Edit parameters for TileLink\_Manager

| Parameter          | Value            |
|--------------------|------------------|
| Manager_Name:      | "Manager1"       |
| Manager_Speed:     | 1000.0           |
| Architecture_Name: | "Architecture_1" |
| TileLink_Name:     | "TileLink"       |

All the fields are required.

#### TileLink\_Manager



## DEMO MODEL



**Model Location :** \$VS/demo/Bus\_Std/TileLink\_Models/TileLink\_RISCV\_SoC\_Approach.xml



# PLOTS FOR THE PREVIOUS MODEL



Timing Diagram for TileLink Bus showing the messages that were sent during the transactions



## STATS FOR THE PREVIOUS MODEL

VisualSim Architect - .TileLink\_RISCV\_SoC\_Approach.TextDisplay2

| Statistic                 | Value                        |
|---------------------------|------------------------------|
| CYCLES_PER_INSTRUCTION    | 6.7945205479452              |
| DELTA                     | 0.098333333338               |
| DS_NAME                   | "Processor DS"               |
| FP_1_Max_Buffer_Size      | 5                            |
| FP_1_Usage_Pct            | 6.029810298103               |
| ID                        | 15                           |
| INDEX                     | 0                            |
| INT_1_Max_Buffer_Size     | 2                            |
| INT_1_Usage_Pct           | 3.6585365853659              |
| INT_2_Max_Buffer_Size     | 22                           |
| INT_2_Usage_Pct           | 21.6124661246612             |
| INT_3_Max_Buffer_Size     | 1                            |
| INT_3_Usage_Pct           | 1.6260162601626              |
| INT_4_Max_Buffer_Size     | 78                           |
| INT_4_Usage_Pct           | 100.1355013550135            |
| INT_5_Max_Buffer_Size     | 0                            |
| INT_5_Usage_Pct           | 0.4065040650407              |
| MHZ_PROCESSOR             | 500.0                        |
| MIPS_IN_PROCESSOR         | 73.5850007560032             |
| Pipeline_Stall_Pct        | {5.4878048780488, 1.69376693 |
| Processor_Total_Stall_Pct | 0.8130081300813              |
| ROB_Stall_Pct             | 0.0                          |
| Stall_Fetch_Pct           | 0.0                          |
| Stall_Load_Pct            | 0.0                          |
| TIME                      | 0.098333333338               |
| TIME_IN_PROCESSOR         | 2.97615000003E-6             |

| Statistic         | TileLink_Channel_C                                        | TileLink_Channel_D                                        |
|-------------------|-----------------------------------------------------------|-----------------------------------------------------------|
| BLOCK             | "TileLink_RISCV_SoC_Approach.TileLink.TileLink_Channel_C" | "TileLink_RISCV_SoC_Approach.TileLink.TileLink_Channel_D" |
| DS_NAME           | "TileLink_Channel_C_4"                                    | "TileLink_Channel_D_1"                                    |
| Number_Entered    | 145                                                       | 124                                                       |
| Number_Exited     | 145                                                       | 124                                                       |
| Occupancy_Max     | 3.0                                                       | 1.0                                                       |
| Occupancy_Mean    | 0.5896551724138                                           | 0.5                                                       |
| Occupancy_Min     | 0.0                                                       | 0.0                                                       |
| Occupancy_StDev   | 0.5877161099728                                           | 0.5                                                       |
| Total_Delay_Max   | 3.0E-9                                                    | 1.000000012084E-9                                         |
| Total_Delay_Mean  | 1.0896551724786E-9                                        | 1.0000000004E-9                                           |
| Total_Delay_Min   | 9.999999947364E-10                                        | 9.999999947364E-10                                        |
| Total_Delay_StDev | 3.0888545764435E-10                                       | 2.4812421759788E-17                                       |

Channel-wise occupancy and delay for TileLink
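
The processor statistics can be sanity-checked against one another. The sketch below assumes the textbook relation MIPS ≈ clock (MHz) / CPI; the slides do not say how VisualSim derives `MIPS_IN_PROCESSOR`, so a small gap between the estimate and the reported figure is expected. The variable names are illustrative only.

```python
# Values copied from the processor statistics of the demo model
cycles_per_instruction = 6.7945205479452   # CYCLES_PER_INSTRUCTION
mhz_processor = 500.0                      # MHZ_PROCESSOR
mips_reported = 73.5850007560032           # MIPS_IN_PROCESSOR

# Textbook relation (an assumption, not a documented VisualSim formula):
# million instructions per second = clock in MHz / cycles per instruction
mips_estimate = mhz_processor / cycles_per_instruction

relative_gap = abs(mips_estimate - mips_reported) / mips_reported

print(f"estimated MIPS = {mips_estimate:.4f}")
print(f"reported MIPS  = {mips_reported:.4f}")
print(f"relative gap   = {relative_gap:.2e}")
```

The estimate agrees with the reported value to within about 0.01%, which suggests the three statistics are mutually consistent.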



#### AXI BLOCK AND ITS USAGE



# PORTS AND HOW TO CONNECT?



- Supports 16 masters and 8 slaves
- The number of master and slave ports can be increased if required, e.g. the AXI\_16\_x16 block
- Master and slave ports:



- Port data type: general
- Each port supports only two wire connections (input and output)

Figure: AMBA AXI block with master connections on inputs 1-16, slave connections on outputs 1-8, and a stats out port.

- **Input ports**: Accept read/write requests from master devices and forward read/write responses to master devices.
- **Output ports**: Forward read/write requests to slave devices and accept read/write responses from slave devices.
- **Stats\_out**: Provides debug messages during the simulation if the "DEBUG" parameter is set.
- **Plot\_out:** Not used in the current version.



# KEY CONFIGURATION

- **AXI\_Speed\_Mhz**: Bus speed in MHz.
- **Bus\_Width**: Width of the AXI data and response channels.
- **Read\_Threshold**: Total number of outstanding reads at each slave port.
- **Write\_Threshold**: Total number of outstanding writes at each slave port if A\_Task\_Flag is true.
- **Master\_Request\_Threshold**: Input queue size of the master ports.

| Parameter                                   | Value                                                                                |
|---------------------------------------------|--------------------------------------------------------------------------------------|
| Arbiter_FIX_1_RR_2_CUSTOM_3:                | 1                                                                                    |
|                                             | iz, AXI_Speed_Mhz,AXI_Speed_Mhz, AXI_Speed_Mhz, AXI_Speed_Mhz, AXI_Speed_Mhz}        |
| Extra_Cycles_for_RdReq_WrReq_RdData_WrData: | {0, 0, 0, 0, 0, 0, 0, 0}                                                             |
| Devices_Attached_to_Slave_by_Port:          | e_2"},{"Device_3"},{"Device_4"},{"Device_5"},{"Device_6"},{"Device_7"},{"Device_8"}} |

| Parameter                 | Value                                   |
|---------------------------|-----------------------------------------|
| Bus_Width:                | 8                                       |
| Read_Threshold:           | 2                                       |
| Write_Threshold:          | 2                                       |
| Master_Request_Threshold: | {2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2} |
- **Devices\_Attached\_to\_Slave\_by\_Port**: List of slave devices connected to each slave port.
- **Arbiter\_FIX\_1\_RR\_2\_CUSTOM\_3**: Arbitration algorithm for cases where multiple masters target a single slave. Fixed priority uses the Fixed\_Priority\_Array parameter.
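
As the parameter name suggests, `Arbiter_FIX_1_RR_2_CUSTOM_3` chooses between fixed-priority (1), round-robin (2) and custom (3) arbitration. The sketch below shows generic textbook versions of the first two for several masters competing for one slave. It is not VisualSim's implementation; the function names, the 1-based master numbering and the `priority_order` list (standing in for one row of `Fixed_Priority_Array`) are assumptions made for illustration.

```python
def fixed_priority(requests, priority_order):
    """Grant the requesting master that appears first in priority_order."""
    for master in priority_order:
        if master in requests:
            return master
    return None  # nobody is requesting


def round_robin(requests, last_granted, num_masters):
    """Grant the next requesting master after last_granted, wrapping around.

    Masters are numbered 1..num_masters; use last_granted=0 before any grant.
    """
    for offset in range(1, num_masters + 1):
        master = (last_granted - 1 + offset) % num_masters + 1
        if master in requests:
            return master
    return None  # nobody is requesting


# Masters 1 and 3 keep requesting the same slave on a 4-master bus.
requests = {1, 3}

fixed_grants = [fixed_priority(requests, [1, 2, 3, 4]) for _ in range(4)]

rr_grants = []
last = 0
for _ in range(4):
    last = round_robin(requests, last, num_masters=4)
    rr_grants.append(last)

print("fixed priority:", fixed_grants)
print("round robin:   ", rr_grants)
```

With fixed priority the higher-priority master wins every cycle while it keeps requesting, whereas round robin alternates between the two requesters; this difference is what the arbitration setting trades off.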



# DEMO MODEL





# **RESULTS FOR PREVIOUS MODEL**





#### CHI BLOCK AND ITS USAGE



# PORTS AND HOW TO CONNECT





- Supports the existing AMBA AXI protocol.
- Coherency can be enabled for specific masters.
- If coherency is not enabled for any master, the bus acts like an AXI bus and performs non-coherent data transfers.
  - Coherency channels are not used.



### **KEY CONFIGURATION**

| Parameter              | Value                |
|------------------------|----------------------|
| Fixed_Priority_Array:  | ,15,16},{1,2,3,4,5,6 |
| Slave_First_Word_Flag: | true /* Not Active   |
| Custom_Slave_File:     | "none"               |
| Ports_to_Plot:         | {0,0} /* master n,   |
| Coherent_Masters:      | {1,2}                |
| AXI_Sync_Speed_Mhz:    | 1000.0               |

**Coherent\_Masters**: List of coherent masters in an array. An empty array is acceptable and implies that none of the masters are coherent.







Model Location : \$VS/demo/Coherency/Snooping\_Protocol.xml (2340 version)



#### **RESULTS FOR PREVIOUS MODEL**





# BRIDGE BLOCK AND ITS USAGE



## PORTS AND HOW TO CONNECT





- Blocking
- Delay is variable based on data size and speed



### **KEY CONFIGURATION**

Edit parameters for Bridge

| Parameter                 | Value               | Description                  |
|---------------------------|---------------------|------------------------------|
| Architecture_Name:        | "Architecture_1"    |                              |
| Bridge_Name:              | "Bridge"            | Unique name                  |
| Bridge_Speed_in_Mhz:      | 100.0               |                              |
| Bridge_Width_in_Bytes:    | 4                   | Width of the bridge          |
| Overhead_Cycles:          | 1                   |                              |
| Bridge_Sync_Speed_in_Mhz: | Bridge_Speed_in_Mhz | Bridge synchronization clock |
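
The slides state only that the bridge delay varies with data size and speed. One plausible way to picture this with the dialog values is to count the bus-width beats a transfer needs, add the overhead cycles, and divide by the clock. The formula and the function name `bridge_delay_seconds` below are assumptions for illustration, not VisualSim's documented timing model.

```python
import math


def bridge_delay_seconds(data_bytes,
                         speed_mhz=100.0,     # Bridge_Speed_in_Mhz
                         width_bytes=4,       # Bridge_Width_in_Bytes
                         overhead_cycles=1):  # Overhead_Cycles
    """Estimate the bridge transfer delay under the assumed model."""
    beats = math.ceil(data_bytes / width_bytes)   # bus-width transfers needed
    cycles = beats + overhead_cycles              # plus fixed overhead
    return cycles / (speed_mhz * 1e6)             # cycles -> seconds


for size in (4, 64, 256):
    print(f"{size:4d} bytes -> {bridge_delay_seconds(size):.2e} s")
```

Under this model the delay grows linearly with the data size and shrinks as `Bridge_Speed_in_Mhz` or `Bridge_Width_in_Bytes` increases.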







Model Location : \$VS/doc/Training\_Material/Architecture/bus/AHB\_AXI/AHB\_AXI\_Bus.xml



# **RESULTS FOR THE PREVIOUS MODEL**



### SWITCH BLOCK AND ITS USAGE



# PORTS AND HOW TO CONNECT



#### Blocking

- Single interconnect
- 4 connected devices; more can be added
- Single-cycle delay for Read and multi-cycle delay for Write

#### Non blocking

- Point-to-point mesh with multiple channels per wire
- Basic block contains 4 device connections; can be expanded
- Single-cycle delay for Read and multi-cycle delay for Write


### **KEY CONFIGURATION**

Edit parameters for Switch

| Parameter                | Value                                | Description                   |
|--------------------------|--------------------------------------|-------------------------------|
| Architecture_Name:       | "Architecture_1"                     |                               |
| Switch_Name:             | "Switch"                             |                               |
| Speed_Mhz:               | 100.0                                |                               |
| Width_Bytes:             | 4                                    | Width of the switch           |
| Blocking_Mode:           | false                                | Blocking or non blocking mode |
| _explanation:            | Hardware_Modeling->Bus_Switch_Ctrl   |                               |
| Overhead_Cycles:         | 1                                    |                               |
| Address_Bits:            | 32                                   |                               |
| Switch_Levels:           | 3                                    |                               |
| Switch_Prefix_Name:      | "Switch"                             | Unique name                   |
| Switch_Devices_to_Ports: | {{"Device1"},{"Device2"},{"Device3"} |                               |









This model describes the implementation of a Cortex-M0 processor board using the hybrid processor.



- Sim Time: 100.0e-3
- Bus\_Speed\_Mhz: 48.0
- Processor: "ARM\_CortexM0"
- Processor\_Speed: 48.0
- View\_Stats: true
- Memory\_Speed\_MHz: 48.0

**Model Location :** \$VS/doc/Training\_Material/Architecture/Processor/Cortex\_M0/ARM\_CortexM0\_Demo.xml



# **RESULTS FOR THE PREVIOUS MODEL**





## CROSSBAR BLOCK AND ITS USAGE



# PORTS AND HOW TO CONNECT



- Master devices can be connected to the top multi ports of the block
- Slave blocks can be connected to the bottom multi ports of the block
- A packet can reach a slave device attached to a different crossbar by hopping through intermediate crossbars, according to the configuration in the Crossbar\_Setup block.



### **KEY CONFIGURATION-CROSSBAR**

| Parameter                      | Value                    |
|--------------------------------|--------------------------|
| Crossbar_Name:                 | "CX2"                    |
| Architecture_Setup_Name:       | "Architecture_1"         |
| Speed_Mhz:                     | Crossbar_Clock_Speed_MHz |
| Width_Bytes:                   | 4                        |
| Crossbar_QoS:                  | Rate_Regulation          |
| Buffer_Size:                   | 100                      |
| Enable_Flow_Control:           |                          |
| Enable_Hello_Message_Forwardin |                          |
| Arbitration_Algorithm:         | FCFS                     |
| Routing_Option:                | Interleave               |

- Width\_Bytes: Width of the crossbar
- Crossbar\_QoS: Rate Regulation or Bandwidth limitation
- Buffer\_Size: Max number of frames that can be stored at the port interface
- Routing\_Option: Address based or interleave (detailed configuration in Crossbar\_Setup)



#### KEY CONFIGURATION-CROSSBAR SETUP

Crossbar\_Routing\_Table:

| Crossbar_Name | Port_Name | Connected_Crossbar_Name | Connected_Crossbar_Port | Port_Bandwidth_Mbps |
|---------------|-----------|-------------------------|-------------------------|---------------------|
| CX1           | Slave_1   | CX2                     | Master_1                | 200.0               |
| CX1           | Slave_2   | CX2                     | Master_2                | 200.0               |
| CX1           | Slave_3   | CX3                     | Master_1                | 200.0               |
| CX1           | Slave_4   | CX3                     | Master_2                | 200.0               |
| CX2           | Slave_1   | CX4                     | Master_1                | 200.0               |
| CX2           | Slave_2   | CX4                     | Master_2                | 200.0               |
| CX3           | Slave_3   | CX4                     | Master_3                | 200.0               |
| CX3           | Slave_4   | CX4                     | Master_4                | 200.0               |
| CX4           | Slave_1   | CX5                     | Master_1                | 200.0               |
| CX4           | Slave_2   | CX5                     | Master_2                | 200.0               |
| CX4           | Slave_3   | CX5                     | Master_3                | 200.0               |
| CX4           | Slave_4   | CX5                     | Master_4                | 200.0               |
| CX5           | Slave_1   | Wired                   | DRAM_0                  | 200.0               |
| CX5           | Slave_2   | Wired                   | DRAM_1                  | 200.0               |
| CX5           | Slave_3   | Wired                   | DRAM_2                  | 200.0               |
| CX5           | Slave_4   | Wired                   | DRAM_3                  | 200.0               |

Specifies the Crossbar routing and bandwidth configuration
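
To see how a packet can hop from one crossbar to a device behind another, the routing table can be treated as a graph and searched. The sketch below copies the rows of the routing table and runs a breadth-first search from CX1. It shows reachability only: the function name `find_route` is hypothetical, reading "Wired" as a direct connection to the named device is an assumption, and the port actually chosen in a simulation depends on the Routing\_Option (address based or interleave).

```python
from collections import deque

# (Crossbar_Name, Port_Name, Connected_Crossbar_Name, Connected_Crossbar_Port)
ROUTING_TABLE = [
    ("CX1", "Slave_1", "CX2", "Master_1"),
    ("CX1", "Slave_2", "CX2", "Master_2"),
    ("CX1", "Slave_3", "CX3", "Master_1"),
    ("CX1", "Slave_4", "CX3", "Master_2"),
    ("CX2", "Slave_1", "CX4", "Master_1"),
    ("CX2", "Slave_2", "CX4", "Master_2"),
    ("CX3", "Slave_3", "CX4", "Master_3"),
    ("CX3", "Slave_4", "CX4", "Master_4"),
    ("CX4", "Slave_1", "CX5", "Master_1"),
    ("CX4", "Slave_2", "CX5", "Master_2"),
    ("CX4", "Slave_3", "CX5", "Master_3"),
    ("CX4", "Slave_4", "CX5", "Master_4"),
    ("CX5", "Slave_1", "Wired", "DRAM_0"),
    ("CX5", "Slave_2", "Wired", "DRAM_1"),
    ("CX5", "Slave_3", "Wired", "DRAM_2"),
    ("CX5", "Slave_4", "Wired", "DRAM_3"),
]


def find_route(start_crossbar, device):
    """Return the (crossbar, slave port) hops from start_crossbar to device."""
    queue = deque([(start_crossbar, [])])
    visited = {start_crossbar}
    while queue:
        crossbar, hops = queue.popleft()
        for name, port, nxt, nxt_port in ROUTING_TABLE:
            if name != crossbar:
                continue
            if nxt == "Wired":
                # Assumed meaning: this port is wired directly to a device.
                if nxt_port == device:
                    return hops + [(name, port)]
            elif nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, hops + [(name, port)]))
    return None  # device not reachable from start_crossbar


route = find_route("CX1", "DRAM_2")
print(route)
```

In this table every DRAM sits behind CX5, so any request entering at CX1 crosses four crossbars before it reaches memory.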

Crossbar\_Settings\_Table:

| Crossbar_Name                   | Port_Address                             |
|---------------------------------|------------------------------------------|
| CX1<br>CX2<br>CX3<br>CX4<br>CX5 | {{},{},{},{},{},{},{},{},{},{},{},{},{}, |

Specifies the addressing parameters and interleave settings







Model Location : \$VS/demo/NoC/Crossbar/20220527\_5M\_4S\_CX\_Demo\_Model\_17.xml



# **RESULTS FOR THE PREVIOUS MODEL**





# GENERIC NOC BLOCKS AND THEIR USAGE



### PORTS AND HOW TO CONNECT










# KEY CONFIGURATION

#### Router



#### **RN Array Path, RNI, HNI and SNF**







Debug\_Fields: {}



Model Location : \$VS/demo/networking/Noc/NoC\_Demo\_Updated.xml



# **RESULTS FOR PREVIOUS MODEL**





# ARTERIS NOC BLOCKS AND THEIR USAGE



#### HOW TO CONNECT?





# KEY CONFIGURATION

#### Switch (Router)

Edit parameters for Switch4

| Parameter                | Value                                       | Description                                      |
|--------------------------|---------------------------------------------|--------------------------------------------------|
| Router_Speed_Mhz:        | NoC_Speed_Mhz                               |                                                  |
| Node_Name:               | "R_"+x_val+"_"+y_val /* Do not Modify */    |                                                  |
| Power_Manager_Name:      | "none"                                      |                                                  |
| x_val:                   | 1                                           | Router Coordinates                               |
| y_val:                   | 1                                           | Router Coordinates                               |
| Topology:                | Mesh                                        |                                                  |
| No_of_Routers:           | 4 /* configure only in Loop Topology */     |                                                  |
| Priority_Enable:         | true                                        |                                                  |
| Stats_Enable:            |                                             |                                                  |
| Interconnect_QoS:        | Bandwidth_Limiter                           |                                                  |
| No_Of_VC_Per_Port:       | 1                                           |                                                  |
| Bandwidth_per_Port_Mbps: | {200.0,200.0,200.0,200.0,200.0,200.0} /*    |                                                  |
| Interface_Buffer_Size:   | 500                                         | Device Interface and Virtual Channel Buffer Size |

#### NIU (Network Interface Unit)



Edit parameters for Master NIU4

# DEMO MODEL



Model Location : \$VS/demo/NoC/Arteris/Noc\_Arteris\_Demo.xml



# **RESULTS FOR PREVIOUS MODEL**





# **CMN600 BLOCKS AND THEIR USAGE**



### PORTS AND HOW TO CONNECT







# KEY CONFIGURATION

#### **RNF, RNI, HNI and SNF**

- VLAN\_Q: 4 /\* Total number of VLAN Q available \*/ (No of VC in the router)
- Device\_Threshold: Device\_Threshold (Device buffer size)
- TrafficRate: 10.0 \* 1.0/Frequency
- Address Low: 0
- Address High: 1023
- Random Address: false /\* true for generating random addr... \*/
- Request\_Priority: "All" /\* ALL, HH, H, M, L are the available ... \*/ (Packet Priority)

#### **XP (Router)**

- QoS\_Regulator\_Mode: None
- Ingress\_Buffer\_Size: 1000
- Flit\_Size: Flit Size (Flit size in bytes)
- VC\_Buffer\_Size: 10 (Size of Virtual Channel)
- Router\_Frequency: Router Frequency
- Node\_Name: "R\_"+x\_val+"\_"+y\_val
- Power\_Manager\_Name: "Manager\_1"
- Single\_bit\_error\_ratio: 0.4, Double\_bit\_error\_ratio: 0.2 (Bit error ratio)
- VLAN\_Q: 4 (Number of VC)
- x\_val and y\_val: Router Coordinates (y\_val: 1 in this dialog)
- Stats Enable: (checkbox)

#### **HNF**

- Flit\_Size: Flit\_Size
- Num Resources: 50 /\* Total number of resources available for POCQ \*/ (HNF input queue size)
- Source Address: "HNF 2"
- Device\_Threshold: Device Threshold (HNF device interface queue)
- Power\_Manager\_Name: "Manager\_1"
- SLC SIZE KB: 256 (HNF cache size)
- View\_Plots: (checkbox)
- VLAN\_Q: 4 /\* Total number of VLAN Q available \*/

#### **Wire**

Edit parameters for Wire2

- Start\_Device: "R\_1\_2", End\_Device: "R\_1\_1" (Router connections)
- Delay\_Name: Start\_Device + "\_to\_" + End\_Device
- Wire Length: 1e-8
- \_flipPortsVertical: true, \_flipPortsHorizontal: false, \_rotatePorts: 180
- Clock\_Speed: 200e6
- Delay Cycles: 1 (Delay cycle for flits)
- Wire ID: 2
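The Wire parameters above suggest how a flit's traversal time could be derived. The following is a minimal sketch, assuming the wire delay is simply Delay Cycles divided by Clock\_Speed; the function names are hypothetical and are not part of the VisualSim library.

```python
def wire_delay_seconds(delay_cycles: int, clock_speed_hz: float) -> float:
    """Time a flit spends on the wire, ASSUMING delay = cycles / clock."""
    if clock_speed_hz <= 0:
        raise ValueError("clock_speed_hz must be positive")
    return delay_cycles / clock_speed_hz


def delay_name(start_device: str, end_device: str) -> str:
    """Mirror the Delay_Name expression: Start_Device + "_to_" + End_Device."""
    return start_device + "_to_" + end_device


# Values taken from the Wire2 dialog: Clock_Speed 200e6, Delay Cycles 1.
name = delay_name("R_1_2", "R_1_1")
delay = wire_delay_seconds(delay_cycles=1, clock_speed_hz=200e6)
print(name, delay)
```

Under this assumption, one delay cycle at 200 MHz corresponds to 5 ns per hop between the two routers.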

# DEMO MODEL




Model Location : \$VS/demo/NoC/Corelink/CMN600\_with\_A77\_DDR5.xml



## **RESULTS FOR PREVIOUS MODEL**





# PCIE BLOCK AND ITS USAGE





**Stats\_out**: not used in the current version



### **KEY CONFIGURATION**

Edit parameters for PCIe\_Bus

| Parameter | Value | Note |
|---|---|---|
| Architecture_Name | "Architecture_1" | |
| Bus_Name | "PCIe_1" | |
| Number_of_Lanes | 16 /* Can be an array */ | Can be common or an array for each port, in order starting from top-left and continuing through top-right |
| Slave_Buffer | 512 /* Max Bytes @ Slave */ | |
| Master_Buffer | 512 /* Max Bytes @ Master */ | |
| _explanation | Interfaces and Buses->PCI->PCIe_Bus | |
| Header_Bytes | 16 /* 32 Bit Mode, includes CRC Bytes */ | |
| Number_of_Ports | {12, 12} /* Master, Endpoint Ports */ | |
| BER | 1E-11 | |
| Max_Payload_Size | 64 /* Write, Read Data */ | |
| Max_Payload_Req_Size | 128 /* Read Requests */ | |
| Read_to_Write_Ratio | 0.5 /* 0.0 to 1.0 */ | |
| Devices_Attached_to_Slaves | …RAM_3"},{"Dev_4"},{"Dev_5"},{"Dev_6"},{"Dev_7"},{"Dev_8"},{"Dev_9"},{"Dev_10"},{"De… | Required: list of all the devices connected to each End-Point |
| Root_Complex_Flow_Control | {false,false,false,false,false,false,false,false,false,false,false,false,false,false,false} | |
| Endpoint_Flow_Control | {false,false,false,false,false,false,false,false,false,false,false,false,false,false,false} | |
| Enable_Plots | (checkbox) | |
| Bit_64_Mode | (checkbox) | |
| NumOfRetry | 4 | |
| Timeout | 6E-6 | |
| PCIe_MBps | PCIe_Gen_1 | Required: determines speed |
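To get a feel for how Number\_of\_Lanes, Header\_Bytes and Max\_Payload\_Size interact with the selected generation, the sketch below estimates link bandwidth from the published PCIe Gen 1 figures (2.5 GT/s per lane with 8b/10b encoding). The function names are hypothetical, and the payload/(payload + header) efficiency is a rough assumption that ignores flow control, acknowledgements and retries caused by BER; it is not how the PCIe\_Bus block is documented to compute PCIe\_MBps.

```python
def raw_lane_bandwidth_mbytes(gigatransfers_per_s: float,
                              encoding_efficiency: float) -> float:
    """Bandwidth of one lane in one direction, in MB/s (1 MB = 1e6 bytes)."""
    bits_per_s = gigatransfers_per_s * 1e9 * encoding_efficiency
    return bits_per_s / 8 / 1e6


def payload_efficiency(max_payload_bytes: int, header_bytes: int) -> float:
    """Fraction of each packet that is payload (ASSUMED overhead model)."""
    return max_payload_bytes / (max_payload_bytes + header_bytes)


GEN1_GT_PER_S = 2.5      # PCIe Gen 1 signalling rate per lane
GEN1_ENCODING = 8 / 10   # 8b/10b line encoding

lanes = 16               # Number_of_Lanes from the dialog
raw = raw_lane_bandwidth_mbytes(GEN1_GT_PER_S, GEN1_ENCODING) * lanes
effective = raw * payload_efficiency(max_payload_bytes=64, header_bytes=16)
print(f"raw: {raw:.1f} MB/s, payload only: {effective:.1f} MB/s")
```

With the dialog values this gives roughly 4000 MB/s of raw bandwidth per direction and about 3200 MB/s of payload, which shows why a small Max\_Payload\_Size relative to Header\_Bytes noticeably reduces achievable throughput.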







This model describes the usage of DMA with the PCIe Bus.



Model Location : \$VS/doc/Training\_Material/Architecture/bus/PCIe/PCIe\_DMA.xml



# **RESULTS FOR THE PREVIOUS MODEL**





# PCIE6.0 BLOCK AND ITS USAGE



# PORTS AND HOW TO CONNECT







Master and slave connections use multiports for input and output.

Debug\_port: provides debug messages for the PCIe6 bus



# **KEY CONFIGURATION**

Edit parameters for PCIe6\_Bus

| Parameter | Value | Note |
|---|---|---|
| Architecture_Setup_Name | "Architecture_1" | |
| PCIe_Switch_Name | "PCIe_Sw… | |
| Max_Read_Req_Size | 4096 | Maximum request size of a packet |
| Flit_Size_Bytes | 256 | |
| Number_of_Lanes | 16 | Number of lanes |
| Buffer_Size_Bytes | {4096,4096} //Rx,Tx | Maximum bytes that can be stored in the Rx and Tx buffers |
| Overhead_Cycles | 0 | |
| Devices_Attached_to_Ports | …{"Dev_4"},{"Dev_5"},{"Dev_6"},{"Dev_7","DRAM","Reg"},{"Dev_8"},{"Dev_9"},{"Dev_10"},{"Dev_11"},{"Dev_12"}} | List of devices connected to each port |
| Enable_Debug | true | |
| BER | 1.0e-11 | |
| NumOfRetry | 4 | |
| Timeout | 6E-6 | |
| Enable_Master_Flow_Control | (checkbox) | |
| Enable_Slave_Flow_Control | (checkbox) | |
| Replay_Buffer_Size_Bytes | 1024 | |
| Number_Of_Successive_Acks | 3 | |
| Enable_Selective_Ack | true | |
| Enable_Hello_Msg_Forwarding | true | |







**Model Location :** \$VS/doc/Training\_Material/Architecture/bus/PCIe6/PCIe6\_Device\_Interface2x2\_base\_model.xml



# **RESULTS FOR THE PREVIOUS MODEL**

VisualSim Architect: PCIe6\_Device\_Interface2x2\_base\_model.TextDisplay (values cut off in the screenshot are marked with …)

| Statistic | Value |
|---|---|
| PCIe_Switch_PCIe_Switch_Port_1_Drop_Count | … |
| PCIe_Switch_PCIe_Switch_Port_1_Rx_MBps | 88152… |
| PCIe_Switch_PCIe_Switch_Port_1_Total_MBps | …217824 |
| PCIe_Switch_PCIe_Switch_Port_1_Tx_MBps | 90100… |
| PCIe_Switch_PCIe_Switch_Port_1_to_Port_8_Max_Latency | …986E-7 |
| PCIe_Switch_PCIe_Switch_Port_1_to_Port_8_Mean_Latency | 1.1238603392… |
| PCIe_Switch_PCIe_Switch_Port_1_to_Port_8_Min_Latency | 4.538E-9 |
| PCIe_Switch_PCIe_Switch_Port_2_Drop_Count | 9961 |
| PCIe_Switch_PCIe_Switch_Port_2_Rx_MBps | 88684.08868408868 |
| PCIe_Switch_PCIe_Switch_Port_2_Total_MBps | 178844.17884417885 |
| PCIe_Switch_PCIe_Switch_Port_2_Tx_MBps | 90160.09016009016 |
| PCIe_Switch_PCIe_Switch_Port_2_to_Port_7_Max_Latency | 1.24181E-7 |
| PCIe_Switch_PCIe_Switch_Port_2_to_Port_7_Mean_Latency | 1.0468030285… |
| PCIe_Switch_PCIe_Switch_Port_2_to_Port_7_Min_Latency | 4.738E-9 |
| PCIe_Switch_PCIe_Switch_Port_3_Drop_Count | 0 |
| PCIe_Switch_PCIe_Switch_Port_3_Rx_MBps | 0.0 |
| PCIe_Switch_PCIe_Switch_Port_3_Total_MBps | 0.0 |
| PCIe_Switch_PCIe_Switch_Port_3_Tx_MBps | 0.0 |
| PCIe_Switch_PCIe_Switch_Port_4_Drop_Count | 0 |
| PCIe_Switch_PCIe_Switch_Port_4_Rx_MBps | 0.0 |
| PCIe_Switch_PCIe_Switch_Port_4_Total_MBps | 0.0 |
| PCIe_Switch_PCIe_Switch_Port_4_Tx_MBps | 0.0 |
| PCIe_Switch_PCIe_Switch_Port_5_Drop_Count | 0 |
| PCIe_Switch_PCIe_Switch_Port_5_Rx_MBps | 0.0 |



# INTEGRATED CACHE AND ITS USAGE



## **Integrated Cache**

The Integrated Cache can be used as an L1 (instruction and/or data), L2 or L3 cache in both stochastic and cycle-accurate modes.

#### **Stochastic Mode:**

- Whether an input request hits or misses is determined by the instruction hit ratio or data hit ratio.
- On a hit, the request is processed and the response is returned to the source. On a miss, the request waits in the buffer while a request is sent to the next-level cache or memory to fetch the whole block of data.

#### Address\_Based Mode

- Whether the input request hits or misses is determined by the availability of the requested address in the cache.
- On a hit, the request is processed and the response for the requested address is returned. On a miss, the whole block containing the address is fetched from the next level of memory, and the request waiting in the buffer is then processed.
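The difference between the two modes can be illustrated with a small sketch. This is not the Integrated Cache implementation: the class and function names are hypothetical, and the address-based model below assumes an unbounded cache with no replacement policy, purely to show how a hit is decided in each mode.

```python
import random


def stochastic_lookup(hit_ratio: float, rng: random.Random) -> bool:
    """Stochastic mode: a hit is drawn at random against the hit ratio."""
    return rng.random() < hit_ratio


class AddressBasedCache:
    """Address-based mode: a hit depends on whether the block is resident.

    ASSUMPTION: unlimited capacity and no eviction, to keep the sketch short.
    """

    def __init__(self, block_size_bytes: int) -> None:
        self.block_size_bytes = block_size_bytes
        self.resident_blocks: set[int] = set()

    def access(self, address: int) -> bool:
        block = address // self.block_size_bytes
        if block in self.resident_blocks:
            return True                    # hit: respond for this address
        self.resident_blocks.add(block)    # miss: fetch the whole block
        return False


# Stochastic mode: the observed hit rate converges on the configured ratio.
rng = random.Random(1)
samples = 10_000
observed = sum(stochastic_lookup(0.8, rng) for _ in range(samples)) / samples

# Address-based mode: neighbours in the same 64-byte block hit after a miss.
cache = AddressBasedCache(block_size_bytes=64)
results = [cache.access(a) for a in (0, 8, 64, 72, 0)]
print(observed, results)
```

In the address-based run, the first access to each block misses and the later accesses to the same block hit, which is why this mode needs real addresses from the traffic source while stochastic mode needs only a ratio.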

#### Flow control:

- The model can be run either with or without flow control. By default, the block runs without flow control.
- Input flow control is achieved by including a field named "Event\_Name" in the input data structure and using a TIMEQ to trigger the next request. The next request is triggered only after the cache has processed the current data.
- Output flow control is achieved by setting the "Output\_Flow\_Control" parameter to true.



## **Required Parameters**

- Outstanding\_Requests
- Number of outstanding misses and prefetches
- Number of hits before prefetch

Edit parameters for L2\_Cache

| Parameter | Value |
|---|---|
| Cache_Name | Cluster_Name+"_L2_Cache" |
| Cache_Speed_Mhz | L2_Cache_Speed_Mhz |
| Cache_Width_Bytes | L2_Cache_Width_Bytes |
| Cache_Size_Bytes | L2_Cache_Size_Bytes |
| Block_Size_Bytes | Cache_Block_Size_Bytes |
| Cache_Type | D_Cache |
| Stochastic_or_Address_Based | Address_Based |
| Hit_Ratio | 0.8 |
| Loop_Ratio | 0.2 |
| Overhead_Cycles | 0 /* For read and write set it as array {Read,Write}, eg: {1,3} */ |
| Input_Flow_Control | (checkbox) |
| Req_Buffer_Size | 16 |
| Output_Flow_Control | (checkbox) |
| Total_Outstanding | 10 |
| N_Way_Associativity | 8 |
| Cache_Replacement_Policy | Pseudo-LRU |
| Cache_Write_Policy | Write_Back |
| Inclusion_Policy | NINE /* Mandatory for L1 cache */ |
| Miss_Memory_Name | Next_Level_memory_Name |
| Power_Manager_Name | "None" /* To analyse power, link the manager name */ |
| No_of_Statistics | cast(1,Number_of_samples) |
| Word_Access | First_Word |
| Total_Outstanding_Miss | 10 |
| N_Hits_Before_Prefetch | 1 |
| Main_Memory_Size_KB | 4*1024*1024 |
| Architecture_Name | Architecture_Setup_Name |
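Cache\_Size\_Bytes, Block\_Size\_Bytes and N\_Way\_Associativity together fix the geometry of a set-associative cache. The sketch below uses the textbook relation sets = size / (block size × ways) and the usual tag/index/offset split of an address; the example sizes (256 KiB, 64-byte blocks) are hypothetical, since the dialog holds symbolic values, and the block's internal address mapping is not stated in this material.

```python
def number_of_sets(cache_size_bytes: int, block_size_bytes: int,
                   n_ways: int) -> int:
    """Sets in a set-associative cache: size / (block size * ways)."""
    bytes_per_set = block_size_bytes * n_ways
    if cache_size_bytes % bytes_per_set != 0:
        raise ValueError("cache size must be a multiple of block size * ways")
    return cache_size_bytes // bytes_per_set


def split_address(address: int, block_size_bytes: int,
                  num_sets: int) -> tuple[int, int, int]:
    """Return (tag, set index, byte offset) for an address."""
    offset = address % block_size_bytes
    block_number = address // block_size_bytes
    return block_number // num_sets, block_number % num_sets, offset


# Hypothetical values: 256 KiB cache, 64-byte blocks, 8 ways (as in the dialog).
sets = number_of_sets(cache_size_bytes=256 * 1024, block_size_bytes=64, n_ways=8)
tag, index, offset = split_address(0x12345, block_size_bytes=64, num_sets=sets)
print(sets, tag, index, offset)
```

Doubling N\_Way\_Associativity at a fixed size halves the number of sets, which is one reason associativity and block size affect the miss behavior observed in address-based mode.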



### **Integrated Cache Statistics**

The number of statistics samples generated during the simulation is set with the parameter "No\_of\_Statistics".

VisualSim Architect: Cache\_and\_mem.L2\_Cache\_Statistics (display at time 500.2795000 us)

| Statistic | Value |
|---|---|
| A_Hit_Ratio | 93.2203389830508 |
| A_Miss_Ratio | 6.7796610169492 |
| A_Prefetch_Ratio | 0.0 |
| A_Read_MBs | 0.181248 |
| A_Read_MBs_per_Second | 362.4960072499201 |
| A_Total_MBs | 0.193536 |
| A_Total_MBs_per_Second | 387.0720077414401 |
| A_Write_MBs | 0.012288 |
| A_Write_MBs_per_Second | 24.57600049152 |
| BLOCK | "L2_Cache" |
| Buffer_Occupancy | 0 |
| DS_NAME | "Header_Only" |
| ID | 1 |
| Number_Entered | 177 |
| Number_Returned | 177 |
| TIME | 5.002795E-4 |
| Utilization | 12.1845002436899 |

# STOCHASTIC CACHE AND ITS USAGE





### Stochastic Cache

Emulates a cache in stochastic architecture mode, where addresses are not available. Used with a hit ratio.

#### Handles

- Requests are queued
- Measures actual cache hit-miss ratio
- Cache prefetch
- Read/Write
- Cache miss activity to the next level of memory



**MIRABILIS DESIGN INC.** 



### Configuration

Edit parameters for Cache

| Parameter | Value | Note |
|---|---|---|
| Cache_Speed_Mhz | L2_Cache_Clock | Cache_Speed is necessary |
| Cache_Size_KBytes | 64.0 | |
| Width_Bytes | 4 | |
| Words_per_Cache_Line | 16 | |
| FIFO_Buffers | 32 | Number of outstanding requests that need to be processed |
| Cache_Address | "/* Format: Min_Address. Example:100,200 */" | |
| Cache_… (name cut off) | "rand(0.0,1.0) <= 0.95" | If true, then the task had a cache hit |
| Enable_Hello_Messages | (checkbox) | |

The name parameters at the top of the dialog are only partly legible (architecture name, and memory names with the values "L2\_Cache…" and "DRAM"). The miss memory name routes requests to the next level of memory when a cache miss occurs or a prefetch is requested.


# STOCHASTIC MEMORY AND ITS USAGE



## **Stochastic Memory**





## RAM

1. Models the memory controller and memory array
2. Handles
   - Read
   - Write
   - Refresh
   - Erase
3. Applications
   - ROM, RAM, SRAM, DRAM or SDRAM
   - DDR, DDR2, DDR3
   - SDR, QDR
   - VRAM, Direct Rambus, PSRAM, SGRAM
   - NAND and NOR flash






### Important Concepts

1. Controller Time Scheduling + Activation
   - Cycle Time = 1/Memory\_Speed\_Mhz
2. Access Time
   - Access time for Read, Write, Prefetch and Erase is in nanoseconds. Example: Read 1000.0/Memory\_Speed\_Mhz
   - Default values
     - Read 5.0
     - Prefetch 6.0
     - Write 7.0
     - ReadWrite 8.0
     - Erase 9.0
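The two expressions above describe the same cycle time in different units: 1/Memory\_Speed\_Mhz is in microseconds, while 1000.0/Memory\_Speed\_Mhz is in nanoseconds, the unit the access-time parameters expect. The sketch below works this through; the function names and the 800 MHz example speed are hypothetical, and treating an access as a whole number of cycles is an assumption for illustration only.

```python
# Default access times from the slide, in nanoseconds.
DEFAULT_ACCESS_TIME_NS = {
    "Read": 5.0,
    "Prefetch": 6.0,
    "Write": 7.0,
    "ReadWrite": 8.0,
    "Erase": 9.0,
}


def cycle_time_ns(memory_speed_mhz: float) -> float:
    """One memory clock period in nanoseconds: 1000.0 / Memory_Speed_Mhz."""
    if memory_speed_mhz <= 0:
        raise ValueError("memory_speed_mhz must be positive")
    return 1000.0 / memory_speed_mhz


def access_time_ns(cycles: int, memory_speed_mhz: float) -> float:
    """ASSUMED model: an access that takes a whole number of clock cycles."""
    return cycles * cycle_time_ns(memory_speed_mhz)


speed_mhz = 800.0  # hypothetical memory speed
print(cycle_time_ns(speed_mhz))      # one-cycle Read, as in the slide example
print(access_time_ns(4, speed_mhz))  # a four-cycle access
print(DEFAULT_ACCESS_TIME_NS["Read"])
```

Expressing the access time as a function of Memory\_Speed\_Mhz, rather than a fixed number such as the 5.0 ns default, keeps the latency consistent when the memory speed is swept in a parameter study.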



### Read Operation

- Read: the request goes out as a Read, and the response returns as a Write
- Write: the request goes out as a Write, and the acknowledgement returns as a Read



# DNN (MASK-R-CNN) USAGE



# PORTS AND HOW TO CONNECT



- The on-chip network (pink surface) is designed as per the Mask R-CNN specification.
- The Task Graph at the bottom defines the behavioral flow for the software and triggers the hardware devices.





# KEY CONFIGURATION

| Edit parameters for S  | hape_Parameters2 —                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |            | $\times$    |
|------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|-------------|
| Block_Documentatio 📝   | <pre>*.xml, *.csv files abs or rel (./) path<br/> *.csv real columns set to number<br/>Input_Fields == Lookup_Fields (num, type)<br/>Output_Expr: match, match_last, match_all<br/> match_all.field not allowed</pre>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |            |             |
| Linking_Name:          | "Shape_Parameters_MRCNN"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |            | _           |
| fileOrURL:             |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | Browse     |             |
| Data_Structure_Text: 📝 | Layer H_W R_S E_F C M U m n e p q r t ifmap_gb<br>FMAPS 32 3 28 32 6 4 96 1 7 16 1 1 2 15872<br>RPN 28 2 14 32 6 1 64 1 27 16 2 1 1 3891<br>ROIALIGN 14 3 10 64 16 1 64 4 13 16 4 1 4 7168<br>FCLAYERS 14 2 5 64 16 1 64 4 13 16 3 2 2 10752<br>CONV1 32 3 28 32 6 4 96 1 7 16 1 1 2 15872<br>POOL1 28 2 14 32 6 1 64 1 27 16 2 1 1 3891<br>CONV2 14 3 10 64 16 1 64 4 13 16 3 2 2 10752<br>DECONV1 32 3 28 32 6 4 96 1 7 16 1 1 2 15872<br>POOL2 14 2 5 64 16 1 64 4 13 16 3 2 2 10752<br>DECONV1 32 3 28 32 6 4 96 1 7 16 1 1 2 15872<br>DECONV1 32 3 28 32 6 4 96 1 7 16 1 1 2 15872<br>DECONV1 32 3 28 32 6 4 96 1 7 16 1 1 2 15872<br>DECONV1 32 3 28 32 6 4 96 1 7 16 1 1 2 15872<br>DECONV1 32 3 28 32 6 4 96 1 7 16 1 1 2 15872<br>DECONV2 14 3 10 64 16 1 64 4 13 16 3 2 2 10752<br>DECONV2 14 3 10 64 16 1 64 4 13 16 3 2 2 10752<br>DECONV2 14 3 10 64 16 1 64 4 13 16 3 2 2 10752 | _allocatio | n 1<br>7:99 |
| Input_Fields:          | "Layer"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |            |             |
| Lookup_Fields:         | "Layer"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |            |             |
| Output_Expression:     | "output = match" /* FORMAT output = match.fieldb */                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |            |             |
| Mode:                  | Read | | |

- Each layer is defined as a separate task in the task graph, and the dependencies between the tasks are modeled.
- A database lists the layers/functions and the parameters associated with each of them.
  - These parameters are used to determine the number of Multiply-Accumulate (MAC) operations for each layer/function.
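
The lookup-then-compute step described above can be sketched outside VisualSim in a few lines of Python. This is only an illustration, not the VisualSim Database block or its API: the helper names (`parse_table`, `mac_count`) are hypothetical, and the MAC formula assumes the columns follow the common convention in which `E_F` is the output feature-map side, `R_S` the filter side, `C` the input channels and `M` the output channels. The source does not define these columns or the formula the model uses, so treat both as assumptions.

```python
# Hypothetical sketch: mimics a Database lookup keyed on "Layer".
# Column meanings and the MAC formula are assumptions, not taken from VisualSim.

# A few rows copied from the Data_Structure_Text shown in the dialog above.
TABLE_TEXT = """\
Layer H_W R_S E_F C M U m n e p q r t ifmap_gb
CONV1 32 3 28 32 6 4 96 1 7 16 1 1 2 15872
POOL1 28 2 14 32 6 1 64 1 27 16 2 1 1 3891
CONV2 14 3 10 64 16 1 64 4 13 16 3 2 2 10752
"""


def parse_table(text):
    """Parse whitespace-delimited text into {layer_name: {column: int}}."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    table = {}
    for row in rows:
        # The first field is the lookup key ("Layer"); the rest are integers.
        table[row[0]] = dict(zip(header[1:], (int(v) for v in row[1:])))
    return table


def mac_count(params):
    """Assumed MAC estimate: (E_F^2 outputs) * (R_S^2 filter taps) * C * M."""
    return params["E_F"] ** 2 * params["R_S"] ** 2 * params["C"] * params["M"]


LAYERS = parse_table(TABLE_TEXT)

if __name__ == "__main__":
    for name, params in LAYERS.items():
        print(name, mac_count(params))
```

In the actual model the equivalent calculation would live in the task-graph blocks that consume the looked-up record; the sketch applies one formula to every row purely to show the data flow from layer name to parameter record to operation count.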









Model Location: \$VS/demo/DNN/DNN\_Model\_Mask\_R\_CNN.xml



# **RESULTS FOR THE PREVIOUS MODEL**



