OpenVPX

OpenVPX replaces the VME and VXS standards in defense designs.

Quick Explanation

  • System-level VPX specification
  • Provides profiles for slots, backplanes and modules

Protocol

  • ANSI/VITA 65-2017 is the OpenVPX system standard, based on the ANSI/VITA 46.0 and 46.1 VPX standards

OpenVPX systems support complex avionics and spacecraft applications that require significant processing, storage and communication bandwidth. With the advent of the Integrated Modular Architecture (IMA), the sharing of resources across multiple concurrent tasks, the introduction of multi-core processors and OpenVPX-based systems interconnected by high-speed serial interfaces make the computation of latency a challenge.

There are multiple ways to evaluate the performance impact of a target hardware architecture: analytical methods, physical tests and discrete-event simulation. The analytical method can provide a view of the worst-case execution time (WCET). The problem is that this WCET is not bounded by any probability, and 90% of that range will never be reached in real life. With a physical test, the system can only be exercised for a small, finite set of use cases and workloads. Discrete-event simulation can create a realistic model of the end-to-end system including the workload, use cases, Real-Time Operating System (RTOS), processing sub-system, storage, communication, FPGA clusters and the interconnect network. The user can run hundreds of scenarios to quickly identify the bottlenecks and the best-performing areas.
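To make the discrete-event approach concrete, the following is a minimal Python sketch of periodic requesters sharing one storage device over a link, with end-to-end latency recorded per requester. The generator periods, link delay and service time are illustrative assumptions, not the VisualSim model used later in this paper.

    import heapq

    # Minimal discrete-event sketch: periodic requesters share one storage
    # device over a link. All names and numbers are illustrative assumptions.
    LINK_DELAY_NS   = 50        # assumed one-way network delay
    SERVICE_TIME_NS = 200       # assumed storage service time per request
    SIM_END_NS      = 1_000_000

    # (request period in ns, requester name)
    GENERATORS = [(249, "FPGA"), (500, "GPU"), (1000, "Comm")]

    def simulate():
        events = []                       # min-heap of (time, seq, kind, payload)
        seq = 0
        storage_free_at = 0               # when the storage device is next idle
        latencies = {name: [] for _, name in GENERATORS}

        # Seed the first request of every generator
        for period, name in GENERATORS:
            heapq.heappush(events, (period, seq, "request", (period, name)))
            seq += 1

        while events:
            t, _, kind, payload = heapq.heappop(events)
            if t > SIM_END_NS:
                break
            if kind == "request":
                period, name = payload
                # Request crosses the link, queues at storage, is serviced,
                # and the response crosses the link back.
                start = max(t + LINK_DELAY_NS, storage_free_at)
                storage_free_at = start + SERVICE_TIME_NS
                done = storage_free_at + LINK_DELAY_NS
                heapq.heappush(events, (done, seq, "response", (t, name)))
                seq += 1
                # Schedule this generator's next periodic request
                heapq.heappush(events, (t + period, seq, "request", (period, name)))
                seq += 1
            else:                         # response: record end-to-end latency
                issued, name = payload
                latencies[name].append(t - issued)

        for name, vals in latencies.items():
            print(f"{name}: mean latency {sum(vals) / len(vals):.1f} ns "
                  f"over {len(vals)} requests")

    simulate()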

For this paper, we have constructed a fairly common system design, shown in Figure 1. The system consists of an FPGA cluster, a GPU cluster, a solid-state drive for storage and a communication sub-system that provides the telemetry for uplink and downlink communication. These sub-systems have an OpenVPX interface to the network, and the OpenVPX systems are interconnected using RapidIO. The topology is constructed with double redundancy: each sub-system is connected to two switches in parallel. The RapidIO links run at 5 Gbps over 4 lanes, providing a 20 Gbps link for each of the systems. The FPGA cluster processing is excluded from the analysis to keep the focus on the OpenVPX + RapidIO network. The FPGA cluster generates read and write requests every 249 ns. The image frames are full HD at 20 frames per second. The GPU cluster processes ultra HD 4K MPEG4. The use case is for the FPGA, GPU and communication sub-systems to send read/write requests to the storage. Our goal is to get maximum efficiency out of the storage and to size the RapidIO network for this system design.
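As a quick sanity check on the link sizing described above, the following back-of-envelope calculation compares the 4-lane, 5 Gbps RapidIO link capacity against rough traffic estimates. The frame sizes assume uncompressed full-HD and 4K frames at 3 bytes per pixel, and the FPGA request payload is assumed to be 64 bytes; these are deliberately coarse assumptions for illustration, not the workload model used in the simulation.

    # Back-of-envelope check of offered load against one 4-lane RapidIO link.
    # Frame sizes assume uncompressed 1920x1080 and 3840x2160 frames at
    # 3 bytes per pixel; real MPEG4 traffic is far smaller, so this is a
    # deliberately pessimistic illustration, not the simulated workload.
    LANE_GBPS = 5
    LANES     = 4
    LINK_GBPS = LANE_GBPS * LANES                 # 20 Gbps per link

    HD_FRAME_BYTES  = 1920 * 1080 * 3             # one raw full-HD frame
    UHD_FRAME_BYTES = 3840 * 2160 * 3             # one raw 4K frame

    hd_gbps  = HD_FRAME_BYTES  * 8 * 20 / 1e9     # full HD at 20 fps
    uhd_gbps = UHD_FRAME_BYTES * 8 * 20 / 1e9     # 4K, assuming 20 fps here

    # FPGA cluster issues one request every 249 ns; 64-byte payload assumed
    fpga_gbps = 64 * 8 / 249e-9 / 1e9

    print(f"Link capacity       : {LINK_GBPS} Gbps")
    print(f"Raw full-HD stream  : {hd_gbps:.2f} Gbps")
    print(f"Raw 4K stream       : {uhd_gbps:.2f} Gbps")
    print(f"FPGA request stream : {fpga_gbps:.2f} Gbps")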

We have modelled this system in a commercial system-level simulation package called VisualSim Architect from Mirabilis Design Inc., using the standard VisualSim Architect libraries for the processors, DMA, memory, communication, OpenVPX and RapidIO. The system is set up with parameters for the common attributes, including the speeds, types of images, traffic rates, processing times, packet sizes and the RapidIO network attributes such as the switch speed and the number of lanes. We executed the simulation for a variety of rates and RapidIO speeds. Here is the table of the different runs, followed by a sketch of how the sweep can be expressed:

Run   RIO Speed   Lanes   HD Rate   Ultra HD Rate
1     5 Gbps      4       200 fps   300 fps
2     2 Gbps      4       100 fps   133 fps
3     1 Gbps      6       75 fps    75 fps
4     1 Gbps      4       30 fps    75 fps
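
The four runs in the table can be driven as a simple parameter sweep over the model. The sketch below shows one way to express that sweep; run_simulation() is a hypothetical stand-in for handing the parameters to the actual simulation and collecting statistics.

    from dataclasses import dataclass

    # The four runs from the table, expressed as a parameter sweep.
    # run_simulation() is a hypothetical placeholder for the real model.
    @dataclass
    class RunConfig:
        lane_gbps: int   # RapidIO speed per lane
        lanes: int       # lanes per link
        hd_fps: int      # full-HD frame rate
        uhd_fps: int     # ultra-HD frame rate

    RUNS = [
        RunConfig(5, 4, 200, 300),   # Run 1
        RunConfig(2, 4, 100, 133),   # Run 2
        RunConfig(1, 6,  75,  75),   # Run 3
        RunConfig(1, 4,  30,  75),   # Run 4
    ]

    def run_simulation(cfg: RunConfig) -> None:
        # Placeholder: a real sweep would pass these parameters to the
        # discrete-event model and collect latency/throughput statistics.
        link_gbps = cfg.lane_gbps * cfg.lanes
        print(f"link {link_gbps} Gbps, HD {cfg.hd_fps} fps, Ultra HD {cfg.uhd_fps} fps")

    for i, cfg in enumerate(RUNS, start=1):
        print(f"Run {i}: ", end="")
        run_simulation(cfg)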

The major finding was that the RapidIO network had more than sufficient capacity to handle the traffic generated by the devices. An interesting case was when the switch speed increased without an increase in the frames per second. Here we varied the packet sizes between 16 bytes and 64 bytes. You can see that the packet size does have an impact on the effective throughput; a short per-packet overhead illustration follows the run statistics below.

Run 1 Statistics (RIO_1_2)
  RIO_Input_MBps  = 95.16
  RIO_Output_MBps = 114.52

Run 2 Statistics (RIO_1_2)
  RIO_Input_MBps  = 93.04
  RIO_Output_MBps = 57.24

Run 3 Statistics (RIO_1_2)
  RIO_Input_MBps  = 86.96
  RIO_Output_MBps = 55.36

Run 4 Statistics (RIO_1_2)
  RIO_Input_MBps  = 93.36
  RIO_Output_MBps = 66.22
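
The sensitivity to packet size can be illustrated with a simple per-packet overhead calculation: each packet carries a fixed number of header and control bytes, so smaller payloads waste a larger fraction of the link. The 16-byte overhead used below is an assumed figure for illustration, not the exact RapidIO packet overhead.

    # Why smaller packets lower effective throughput: a fixed per-packet
    # overhead eats a larger fraction of small payloads. The 16-byte overhead
    # is an assumed value for illustration only.
    LINK_GBPS  = 20      # 4 lanes x 5 Gbps
    OVERHEAD_B = 16      # assumed header + control bytes per packet

    for payload in (16, 64, 256):
        efficiency = payload / (payload + OVERHEAD_B)
        print(f"{payload:>3}-byte payload: {efficiency:5.1%} efficiency, "
              f"{LINK_GBPS * efficiency:.1f} Gbps effective on {LINK_GBPS} Gbps")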

The latency plot has the simulation time on the X-axis and the end-to-end response time on the Y-axis. End-to-end response time, or latency, is measured from the completion of processing, through the request to the storage device, to the response: either the data for a read or an acknowledgement for a write. The FPGA latency covers the read and write requests sent to the storage system from the FPGA cluster. Similarly, the communication and GPU latencies are the response times from those clusters to the storage and back. The storage latency is the latency of the read request at the storage device.
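
A latency plot of this kind can be produced directly from the recorded (completion time, latency) samples. The sketch below shows only the plotting step; the sample values are placeholders, not simulation results.

    import matplotlib.pyplot as plt

    # Turn recorded (completion time, latency) samples into a latency-vs-time
    # plot. The sample values below are placeholders, not simulation results.
    samples = {
        "FPGA": [(1_000, 420), (2_000, 455), (3_000, 430)],   # (sim time ns, latency ns)
        "GPU":  [(1_500, 610), (2_500, 640)],
        "Comm": [(1_200, 380), (2_200, 395)],
    }

    for name, points in samples.items():
        times, lats = zip(*points)
        plt.plot(times, lats, marker="o", label=name)

    plt.xlabel("Simulation time (ns)")
    plt.ylabel("End-to-end response time (ns)")
    plt.legend()
    plt.show()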

The following plots are included:
1GB4LaneLatency - Latency for Run 4
1GB6LaneLatency - Latency for Run 3
1GB6Lane.png - Throughput plots for Run 3
2GB4Lane.png - Latency for Run 2
5GB4Lane.png - Latency for Run 1

Article: https://ieeexplore.ieee.org/document/7500650/

OpenVPX using RapidIO interconnect