Analyzing Resource Sharing in Large Complex Systems
Early performance analysis and virtual system prototyping provide a
methodology and platform for evaluating architectures, processing
requirements and module functionality within an Unmanned Aerial Vehicle (UAV).
This methodology lets the system engineer start with an abstract concept and
increase model fidelity and accuracy through successive levels of
decomposition. Establishing a simulation platform allows quick and accurate
trade studies and spiral engineering enhancements over the program’s
lifecycle. It also allows investigation of interoperability, which lowers the
total overall cost of the electronic systems. This is especially important for
electronics because they face obsolescence or replacement by enhanced parts
earlier than other system elements.
The example below uses VisualSim software to conduct trade studies on the
architecture of the processing Datalinks and to select the best bus backplane.
The system prototype combines existing components from the VisualSim model
library to assemble the sensors, on-board multi-blade processing units,
wireless channels and the operation of a ground vehicle. The processing
Datalinks and the wireless channels are connected over a 1553B bus. Each
Datalink processes messages from a number of sensors and transmits the results
across the 1553B backbone through a common set of transmitters to the ground
vehicles.
Based on the analysis conducted by constructing a model of the proposed
architecture and exercising the different use scenarios, the optimal
architecture was identified as a Datalink with 6 processor boards at 30 MHz, a
66 MHz shared cache and a 1 Mbps, 4-link downlink.
The simulation model evaluates the maximum handling capacity of the processing
units, the impact of channel errors and channel speed on latency, and the
performance of the 1553B bus. The following metrics are evaluated:
1. End-to-end latency from the sensors to the ground vehicles. This is the time taken to retrieve data from the sensors, process the data, transmit it across the 1553B bus, over the Wireless channel and terminate at the ground vehicle (DCGS).
2. 1553B Bus Latency Histogram shows the variations of the latency from Datalink Remote Terminals (RT2 and RT3) to the Wireless Transmitter Remote Terminal (RT).
3. Packet Histogram displays packet sizes transmitted across the 1553B bus.
4. 1553B Bus Throughput displays the Peak and Mean throughput on the 1553B bus.
5. Display Rejects plots the times when sensor messages were dropped at the Datalink because of buffer overflow or lack of processing power.
6. Datalink Statistics captures the statistics of the buffer occupancy, utilization, processing time at each processor, buffers, flash memory and disk.
7. 1553B Bus Statistics captures (1) the buffer occupancy and waiting time at the RT and (2) the utilization of the bus controller.
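As an illustration of how a peak and mean throughput metric of this kind could be derived from a log of bus transfers, here is a minimal Python sketch. It is not taken from the VisualSim model: the function name, the record layout and the fixed-window rule for the peak are all assumptions, and the sample log is made up.

```python
import math

def throughput_stats(transfers, window_s, duration_s):
    """Return (mean_bps, peak_bps) for a list of (timestamp_s, size_bits) records.

    Mean is total bits over the whole run; peak is the busiest fixed window.
    The windowing rule is an assumption made for this sketch.
    """
    n_windows = max(1, math.ceil(duration_s / window_s))
    bits_per_window = [0] * n_windows
    for timestamp_s, size_bits in transfers:
        # Clamp so a record stamped exactly at the end lands in the last window.
        index = min(int(timestamp_s / window_s), n_windows - 1)
        bits_per_window[index] += size_bits
    mean_bps = sum(bits_per_window) / duration_s
    peak_bps = max(bits_per_window) / window_s
    return mean_bps, peak_bps

# Made-up transfer log, not output from the VisualSim model.
log = [(0.1, 1000), (0.2, 1000), (2.5, 500)]
mean_bps, peak_bps = throughput_stats(log, window_s=1.0, duration_s=4.0)
```

A shorter window makes the peak more sensitive to bursts, so the window length should be stated alongside any peak figure.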
Architecture
The system model consists of sensor generators, Datalink, 1553B RT, 1553B bus
controller, Wireless channel, error checking and DCGS (Ground Vehicles). All
of these are connected on a single 1553B bus with a single controller.

Figure 1: Top-level VisualSim block
diagram of the Unmanned Aerial Vehicle
Sensor Generator: This emulates the sensor data, including acquisition rate,
size, header information and sensor-to-Datalink distance. A sensor generator
template was created, and each sensor was given different parameter values.
The parameters that are modified include size, inter-arrival time, processing
cycles, delayed start and distance to the Datalink.
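The sensor template can be pictured as a small parameter record plus an arrival-time generator. The following Python sketch is only an illustration: the field names, the units (seconds, bytes, meters), the uniform jitter distribution and every value in the example instance are assumptions, not VisualSim parameter names or settings from the model.

```python
import random
from dataclasses import dataclass

@dataclass
class SensorConfig:
    """Hypothetical parameter set; field names are illustrative only."""
    name: str
    message_size_bytes: int
    mean_interarrival_s: float     # unit assumed to be seconds
    jitter_fraction: float         # 0.30 means +/- 30% around the mean
    processing_cycles: int
    start_delay_s: float           # delayed start, assumed seconds
    distance_to_datalink_m: float

def arrival_times(cfg, count, seed=0):
    """Generate `count` message arrival times with uniformly jittered gaps."""
    rng = random.Random(seed)
    low = cfg.mean_interarrival_s * (1.0 - cfg.jitter_fraction)
    high = cfg.mean_interarrival_s * (1.0 + cfg.jitter_fraction)
    t = cfg.start_delay_s
    times = []
    for _ in range(count):
        t += rng.uniform(low, high)
        times.append(t)
    return times

# Placeholder values chosen for the example, not taken from the model.
sensor = SensorConfig("sensor_1", 512, 0.0008, 0.30, 2000, 0.0, 1.0)
first_arrivals = arrival_times(sensor, 5)
```

Creating several `SensorConfig` instances with different values mirrors the idea of one template reused with different parameters per sensor.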
Datalink: One Datalink handles many sensors. In this model, a network of 4
sensors feeds into a single Datalink. The Datalink contains an RTOS that
feeds, in parallel, both a processor array and a flash memory. The flash
memory is a temporary buffer that writes into an archiving system. The
processor array contains a number of processors running at a fixed speed. The
results from the processors are written into a shared cache. The resulting
messages are transmitted by the RTOS to the RT.
Wireless Channel: The wireless channel is modeled as a multi-link channel with
variable error probability. The channel also has an Ack-Nack mechanism to
support retransmission.
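To make the effect of error probability on an Ack-Nack channel concrete, the sketch below models a single link on which each attempt fails independently with a fixed probability and is retried after a Nack. This is a simplified assumption rather than the VisualSim channel block; the function names and the retry limit are hypothetical.

```python
import random

def transmit_with_retries(error_probability, rng, max_attempts=10):
    """Return the number of attempts used, or None if every attempt was Nacked."""
    for attempt in range(1, max_attempts + 1):
        # rng.random() is in [0, 1); the attempt succeeds when no error occurs.
        if rng.random() >= error_probability:
            return attempt
    return None

def expected_attempts(error_probability):
    """Mean attempts per message with unlimited retries (geometric distribution)."""
    return 1.0 / (1.0 - error_probability)

rng = random.Random(1)
attempts_used = transmit_with_retries(0.2, rng)
mean_attempts = expected_attempts(0.2)   # 1 / (1 - 0.2)
```

Under these assumptions each retransmission adds one more channel transit to the end-to-end latency, so the mean channel delay scales with `expected_attempts`.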
DCGS: The ground vehicle is a sink that receives messages and computes the
latency.
1553B Bus Remote Terminal (RT): This models the queuing, request for the bus
resource, cable propagation delay, response time and the ability to broadcast.
1553B Bus Controller: This performs a simple arbitration according to the
1553B bus standard. The latency across the controller is also specified here.

Figure 2: Network of Sensors and the
Processing Array on the UAV
Analysis
The initial analysis is performed to characterize a rough architecture that
will support the processing requirements for a fixed arrival rate of messages
from the sensors. The Datalink architecture is then validated and adjusted for
various arrival rates of the sensor traffic.

Figure 3: Analysis Plots captured from
the UAV Simulation
Performance Evaluation
The inter-arrival time of messages is set to an initial range of 0.0008 +/-
30% for each sensor. The rough architecture has 4 Sun SPARC processor boards
at 20 MHz with a 66 MHz cache and a 1 Mbps channel. The Datalink starts
dropping messages after 1.37 seconds of simulated operation and continues
rejecting messages until the end of the simulation. The end-to-end delay also
spans a wide range, from 1.25 seconds to 2.57 seconds. The cache statistics
indicate no buffer overflow and quite low utilization, so the cache is not a
bottleneck. The individual statistics for the 4 processor boards show buffer
overflow, indicating that the processing speed is too low. The sensor messages
are unevenly distributed across the processor boards, with utilization ranging
from 23% to 100%. In addition, the rejection of messages at the processor
boards leaves the 1553B bus underutilized.
To refine this architecture, a number of alternatives exist: increase the
number of processors, speed up the processors, modify the scheduling
algorithm, increase the cache speed, or pipeline the 4 processor boards
instead of executing them in parallel. In this model, we have tried the
following: more and faster processors, more processors at the same speed, a
faster cache and a faster channel.
Case 1: We first increase the processor speed from 20 MHz to 50 MHz. This is
done by changing a single parameter at the top level of the model, which is
linked to all 4 processor boards. The rest of the parameters are kept the
same. The new latency histogram shows a narrower range of latency values at
1.25, 1.42 and 1.67 seconds. All the sensor messages are processed and
transmitted across the 1553B bus without any message being rejected. The
processor and cache statistics indicate no buffer overflow. The processor
utilization is now uniform across all the processor boards at around 43%. The
1553B controller utilization has increased from 10% to 36%. There is a small
amount of buffering at Remote Terminal-2 (RT), indicating that data is
arriving at a faster rate than the 1553B controller can handle. The buffers at
the RT prevent any loss of data but add some latency. The mean 1553B bus
throughput has more than doubled for the same traffic, from 0.16 Mbps to 0.41
Mbps. The peak throughput on the 1553B bus with this architecture is 0.72
Mbps, out of an available 1 Mbps. The shortfall is due to the protocol
overhead and the controller latency.
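The Case 1 throughput figures can be put side by side with a few lines of arithmetic, together with a rough word-level view of 1553B protocol overhead. The sketch below uses the standard MIL-STD-1553B framing of 20 bit times per 16-bit word and at most 32 data words per message. It ignores response-time and inter-message gaps and the controller latency, and it is not a reconstruction of how the VisualSim model computes throughput; the function name is hypothetical.

```python
# Figures quoted in the text for Case 1.
baseline_mean_mbps = 0.16
case1_mean_mbps = 0.41
case1_peak_mbps = 0.72
bus_rate_mbps = 1.0

mean_gain = case1_mean_mbps / baseline_mean_mbps   # ratio of mean throughputs
peak_fraction = case1_peak_mbps / bus_rate_mbps    # share of raw bus rate at peak

WORD_BIT_TIMES = 20        # 3 sync + 16 data + 1 parity bit times per word
DATA_BITS_PER_WORD = 16

def payload_efficiency(data_words, overhead_words):
    """Fraction of bus bit times carrying payload, ignoring all gaps (assumption)."""
    total_bit_times = (data_words + overhead_words) * WORD_BIT_TIMES
    return data_words * DATA_BITS_PER_WORD / total_bit_times

# Full 32-word messages; overhead words are command and status words.
bc_to_rt = payload_efficiency(32, overhead_words=2)   # 1 command + 1 status
rt_to_rt = payload_efficiency(32, overhead_words=4)   # 2 commands + 2 status
```

With these inputs `mean_gain` is about 2.56, and the payload bound is about 0.75 for a controller-to-RT transfer and 0.71 for an RT-to-RT transfer, so a peak well below the raw 1 Mbps rate is expected even before gaps and controller latency are counted.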
Case 2: The next experiment increases the number of processors to 6 and
reduces the speed to 30 MHz. The results indicate no significant performance
improvement over the previous experiment. On the other hand, there is a small
increase in the buffering at the cache, because the shared cache is receiving
data at a higher rate. The average processor utilization is slightly higher
than in Case 1, but it peaks at 50%. The same volume of data is received at
the 1553B bus controller, and the mean utilization remains at 36%.
Case 3: Now let us reduce the number of processors to 4, with each processor
running at 30 MHz. There is still a buffer overflow, and the peak latency also
increases to 3.8 seconds. This is not a viable option.
Case 4: Additional experiments can be performed by simply changing the
parameters at the top level of this model. Experiments include (1) increasing
the channel speed, (2) increasing the number of channel links and (3) changing
the cache speed from 66 MHz to 133 MHz or 288 MHz. Simulations show that these
changes do not reduce the end-to-end latency. At this point the bottleneck is
the 1553B bus and not the processing architecture.
Cost Comparison
The cost of a processor board is a function of the processor speed. For the
sake of this analysis, we assume that the price is $1 per MHz. Table 1 shows
the comparison:
| Case # | # of Processors | Processor Speed | Cost | Viability |
|---|---|---|---|---|
| Rough initial Architecture | 4 | 20 MHz | $20 * 4 = $80 | No |
| 1 | 4 | 50 MHz | $50 * 4 = $200 | Yes |
| 2 | 6 | 30 MHz | $30 * 6 = $180 | Yes/cost-effective |
| 3 | 4 | 30 MHz | $30 * 4 = $120 | No |

Table 1: Cost Comparison of different architectures
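The cost comparison can be reproduced with a short Python sketch that applies the $1 per MHz assumption stated above and picks the cheapest viable configuration. The viability flags are copied from the simulation results reported in the text; the variable and function names are illustrative.

```python
COST_PER_MHZ = 1.0  # pricing assumption stated in the text: $1 per MHz per board

# (label, number of processor boards, speed in MHz, viable per the simulations)
cases = [
    ("Rough initial architecture", 4, 20, False),
    ("Case 1", 4, 50, True),
    ("Case 2", 6, 30, True),
    ("Case 3", 4, 30, False),
]

def total_cost(boards, speed_mhz):
    """Cost of all processor boards under the per-MHz pricing assumption."""
    return boards * speed_mhz * COST_PER_MHZ

costs = {label: total_cost(boards, mhz) for label, boards, mhz, _ in cases}
viable = [label for label, _, _, ok in cases if ok]
cheapest_viable = min(viable, key=lambda label: costs[label])
```

Here `cheapest_viable` is Case 2 at $180, which matches the cost-effective entry in the table.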
Functional Analysis
After running this complex system, a very interesting observation was made.
The bursty nature of the message handling from the sensors to the RTOS
increases the buffering on the processor boards without increasing processor
utilization. Even though the processors do not reach 100% utilization, a
number of messages are still rejected. Several factors affect the overall
performance, including the RTOS scheduling, the distribution between parallel
executions and bursty data arrival. Periods of inactivity are followed by a
burst of traffic that fills the buffer and then overflows it. This behavior
can be changed by altering the sensor acquisition mechanism, which was beyond
the scope of this evaluation but could easily be added to this experiment as a
future extension.
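The observation that messages can be rejected while utilization stays well below 100% can be reproduced with a toy queue. The Python sketch below compares evenly spaced and bursty arrivals at the same average rate on a single processor with a small finite buffer. It is a deliberately simplified stand-in for the Datalink, and the service time, buffer size and arrival patterns are made-up values.

```python
from collections import deque

def simulate(arrivals, service_time, buffer_size):
    """Single FIFO processor with a finite waiting buffer.

    Returns (dropped_count, utilization). A message is dropped when one
    message is in service and `buffer_size` messages are already waiting.
    """
    departures = deque()      # departure times of messages still in the system
    dropped = 0
    busy_time = 0.0
    last_departure = 0.0
    for t in arrivals:
        while departures and departures[0] <= t:
            departures.popleft()              # finished before this arrival
        if len(departures) >= buffer_size + 1:
            dropped += 1
            continue
        start = max(t, last_departure)
        last_departure = start + service_time
        departures.append(last_departure)
        busy_time += service_time
    horizon = max(arrivals[-1], last_departure)
    return dropped, busy_time / horizon

# Same average rate (100 messages over about 100 time units), different shape.
smooth = [float(i) for i in range(100)]                   # one message per unit
bursty = [10.0 * b for b in range(10) for _ in range(10)]  # 10 at once, every 10 units

smooth_dropped, smooth_util = simulate(smooth, service_time=0.5, buffer_size=4)
bursty_dropped, bursty_util = simulate(bursty, service_time=0.5, buffer_size=4)
```

In this toy setup the smooth stream loses nothing, while the bursty stream drops half of its messages at a lower utilization, which is the same pattern described above.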
Summary
Trade studies of this kind help analyze system performance under a variety of
operating conditions. An early understanding of optimal performance can save
significant prototype testing and deliver much more robust system operation.
The model for this presentation was built in a few hours using existing
VisualSim library models. A more advanced model could consider the effects of
redundant processing and the consolidation of functional modules on a single
processing board. Trade-offs in partitioning the functional nodes from the UML
diagram onto different boards, or onto separate 1553B or VME bus structures,
can also be evaluated. Finally, channel interference and jamming can easily be
included as refinements to further explore operational effects on performance.
The model has been constructed from highly modular components, creating a
design platform. The bus in this model can easily be replaced with a VME bus
architecture to evaluate performance on a different backplane. The channel
could be modified to try a more unreliable channel or a cellular standard.
Future spiral engineering possibilities can be tried and quickly determined to
be feasible or not.