Benchmark Results
| Benchmark | Current: a5ec189 | Previous: ab66435 | Performance Ratio |
|---|---|---|---|
| startup_benchmark Build Time | 84.01 s | 85.34 s | 0.98 ❗ |
| startup_benchmark File Size | 0.79 MB | 0.79 MB | 1.00 ❗ |
| Startup Time - 1 core | 0.93 s (±0.03 s) | 0.90 s (±0.03 s) | 1.03 |
| Startup Time - 2 cores | 0.91 s (±0.04 s) | 0.92 s (±0.03 s) | 1.00 |
| Startup Time - 4 cores | 0.94 s (±0.02 s) | 0.91 s (±0.03 s) | 1.03 |
| multithreaded_benchmark Build Time | 82.34 s | 87.89 s | 0.94 ❗ |
| multithreaded_benchmark File Size | 0.90 MB | 0.90 MB | 1.00 ❗ |
| Multithreaded Pi Efficiency - 2 Threads | 88.90 % (±8.46 %) | 88.54 % (±6.86 %) | 1.00 |
| Multithreaded Pi Efficiency - 4 Threads | 43.95 % (±2.87 %) | 42.55 % (±2.91 %) | 1.03 |
| Multithreaded Pi Efficiency - 8 Threads | 25.30 % (±1.77 %) | 24.85 % (±2.03 %) | 1.02 |
| micro_benchmarks Build Time | 95.39 s | 96.16 s | 0.99 ❗ |
| micro_benchmarks File Size | 0.90 MB | 0.90 MB | 1.00 ❗ |
| Scheduling time - 1 thread | 69.09 ticks (±4.74 ticks) | 67.10 ticks (±2.62 ticks) | 1.03 |
| Scheduling time - 2 threads | 37.45 ticks (±3.24 ticks) | 37.89 ticks (±4.27 ticks) | 0.99 |
| Micro - Time for syscall (getpid) | 2.85 ticks (±0.29 ticks) | 2.94 ticks (±0.26 ticks) | 0.97 |
| Memcpy speed - (built_in) block size 4096 | 65584.28 MByte/s (±46535.38 MByte/s) | 65755.51 MByte/s (±46918.05 MByte/s) | 1.00 |
| Memcpy speed - (built_in) block size 1048576 | 30311.81 MByte/s (±25095.01 MByte/s) | 29839.81 MByte/s (±24853.68 MByte/s) | 1.02 |
| Memcpy speed - (built_in) block size 16777216 | 25888.84 MByte/s (±21648.96 MByte/s) | 26219.29 MByte/s (±21914.53 MByte/s) | 0.99 |
| Memset speed - (built_in) block size 4096 | 65812.34 MByte/s (±46681.27 MByte/s) | 66150.74 MByte/s (±47185.97 MByte/s) | 0.99 |
| Memset speed - (built_in) block size 1048576 | 31119.14 MByte/s (±25553.86 MByte/s) | 30623.85 MByte/s (±25290.73 MByte/s) | 1.02 |
| Memset speed - (built_in) block size 16777216 | 26691.64 MByte/s (±22171.79 MByte/s) | 26973.68 MByte/s (±22375.69 MByte/s) | 0.99 |
| Memcpy speed - (rust) block size 4096 | 59285.02 MByte/s (±43401.31 MByte/s) | 59073.04 MByte/s (±43613.94 MByte/s) | 1.00 |
| Memcpy speed - (rust) block size 1048576 | 30504.67 MByte/s (±25401.80 MByte/s) | 29946.27 MByte/s (±24865.31 MByte/s) | 1.02 |
| Memcpy speed - (rust) block size 16777216 | 25508.88 MByte/s (±21207.79 MByte/s) | 25600.61 MByte/s (±21469.00 MByte/s) | 1.00 |
| Memset speed - (rust) block size 4096 | 60441.29 MByte/s (±44233.57 MByte/s) | 59547.79 MByte/s (±43893.71 MByte/s) | 1.02 |
| Memset speed - (rust) block size 1048576 | 31309.34 MByte/s (±25870.72 MByte/s) | 30706.88 MByte/s (±25286.03 MByte/s) | 1.02 |
| Memset speed - (rust) block size 16777216 | 26299.88 MByte/s (±21727.66 MByte/s) | 26370.64 MByte/s (±21957.32 MByte/s) | 1.00 |
| alloc_benchmarks Build Time | 92.07 s | 91.85 s | 1.00 ❗ |
| alloc_benchmarks File Size | 0.86 MB | 0.86 MB | 1.00 ❗ |
| Allocations - Allocation success | 100.00 % | 100.00 % | 1 |
| Allocations - Deallocation success | 100.00 % | 100.00 % | 1 |
| Allocations - Pre-fail Allocations | 100.00 % | 100.00 % | 1 |
| Allocations - Average Allocation time | 12482.56 Ticks (±207.41 Ticks) | 11871.89 Ticks (±350.26 Ticks) | 1.05 ❗ |
| Allocations - Average Allocation time (no fail) | 12482.56 Ticks (±207.41 Ticks) | 11871.89 Ticks (±350.26 Ticks) | 1.05 ❗ |
| Allocations - Average Deallocation time | 1109.94 Ticks (±596.35 Ticks) | 1544.97 Ticks (±974.86 Ticks) | 0.72 |
| mutex_benchmark Build Time | 93.31 s | 97.09 s | 0.96 ❗ |
| mutex_benchmark File Size | 0.90 MB | 0.90 MB | 1.00 ❗ |
| Mutex Stress Test Average Time per Iteration - 1 Threads | 12.86 ns (±0.57 ns) | 13.08 ns (±0.74 ns) | 0.98 |
| Mutex Stress Test Average Time per Iteration - 2 Threads | 17.80 ns (±3.88 ns) | 20.74 ns (±10.85 ns) | 0.86 |
This comment was automatically generated by workflow using github-action-benchmark.
| Benchmark | Current: a5ec189 | Previous: 9f7322b | Performance Ratio |
|---|---|---|---|
| startup_benchmark Build Time | 87.00 s | 85.88 s | 1.01 ❗ |
| startup_benchmark File Size | 0.75 MB | 0.75 MB | 1.00 ❗ |
| Startup Time - 1 core | 0.95 s (±0.03 s) | 0.94 s (±0.04 s) | 1.00 |
| Startup Time - 2 cores | 0.92 s (±0.03 s) | 0.95 s (±0.03 s) | 0.98 |
| Startup Time - 4 cores | 0.95 s (±0.03 s) | 0.94 s (±0.03 s) | 1.01 |
| multithreaded_benchmark Build Time | 89.48 s | 88.07 s | 1.02 ❗ |
| multithreaded_benchmark File Size | 0.86 MB | 0.86 MB | 1.00 ❗ |
| Multithreaded Pi Efficiency - 2 Threads | 90.69 % (±8.34 %) | 88.63 % (±8.04 %) | 1.02 |
| Multithreaded Pi Efficiency - 4 Threads | 44.93 % (±3.78 %) | 43.38 % (±3.45 %) | 1.04 |
| Multithreaded Pi Efficiency - 8 Threads | 26.22 % (±2.25 %) | 25.20 % (±1.80 %) | 1.04 |
| micro_benchmarks Build Time | 86.24 s | 94.04 s | 0.92 ❗ |
| micro_benchmarks File Size | 0.86 MB | 0.86 MB | 1.00 ❗ |
| Scheduling time - 1 thread | 64.46 ticks (±3.57 ticks) | 65.35 ticks (±4.26 ticks) | 0.99 |
| Scheduling time - 2 threads | 37.70 ticks (±5.44 ticks) | 35.89 ticks (±3.71 ticks) | 1.05 |
| Micro - Time for syscall (getpid) | 3.16 ticks (±0.39 ticks) | 3.00 ticks (±0.23 ticks) | 1.05 |
| Memcpy speed - (built_in) block size 4096 | 67494.18 MByte/s (±47919.65 MByte/s) | 65860.77 MByte/s (±46845.85 MByte/s) | 1.02 |
| Memcpy speed - (built_in) block size 1048576 | 29380.27 MByte/s (±24217.68 MByte/s) | 29576.49 MByte/s (±24431.93 MByte/s) | 0.99 |
| Memcpy speed - (built_in) block size 16777216 | 26415.45 MByte/s (±22148.86 MByte/s) | 28311.09 MByte/s (±23599.33 MByte/s) | 0.93 |
| Memset speed - (built_in) block size 4096 | 68452.55 MByte/s (±48544.66 MByte/s) | 66744.29 MByte/s (±47481.33 MByte/s) | 1.03 |
| Memset speed - (built_in) block size 1048576 | 30126.30 MByte/s (±24629.42 MByte/s) | 30393.81 MByte/s (±24910.59 MByte/s) | 0.99 |
| Memset speed - (built_in) block size 16777216 | 27027.36 MByte/s (±22459.51 MByte/s) | 29111.57 MByte/s (±24069.85 MByte/s) | 0.93 |
| Memcpy speed - (rust) block size 4096 | 59732.90 MByte/s (±43914.22 MByte/s) | 59011.76 MByte/s (±43397.34 MByte/s) | 1.01 |
| Memcpy speed - (rust) block size 1048576 | 29302.53 MByte/s (±24182.92 MByte/s) | 29563.17 MByte/s (±24538.23 MByte/s) | 0.99 |
| Memcpy speed - (rust) block size 16777216 | 26824.86 MByte/s (±22489.31 MByte/s) | 28365.88 MByte/s (±23635.44 MByte/s) | 0.95 |
| Memset speed - (rust) block size 4096 | 60819.38 MByte/s (±44637.20 MByte/s) | 59323.20 MByte/s (±43549.67 MByte/s) | 1.03 |
| Memset speed - (rust) block size 1048576 | 30060.59 MByte/s (±24614.67 MByte/s) | 30357.47 MByte/s (±24994.77 MByte/s) | 0.99 |
| Memset speed - (rust) block size 16777216 | 27098.01 MByte/s (±22575.15 MByte/s) | 29162.11 MByte/s (±24104.49 MByte/s) | 0.93 |
| alloc_benchmarks Build Time | 80.98 s | 91.81 s | 0.88 ❗ |
| alloc_benchmarks File Size | 0.82 MB | 0.82 MB | 1.00 ❗ |
| Allocations - Allocation success | 100.00 % | 100.00 % | 1 |
| Allocations - Deallocation success | 100.00 % | 100.00 % | 1 |
| Allocations - Pre-fail Allocations | 100.00 % | 100.00 % | 1 |
| Allocations - Average Allocation time | 4957.59 Ticks (±127.23 Ticks) | 15902.44 Ticks (±330.83 Ticks) | 0.31 ❗ |
| Allocations - Average Allocation time (no fail) | 4957.59 Ticks (±127.23 Ticks) | 15902.44 Ticks (±330.83 Ticks) | 0.31 ❗ |
| Allocations - Average Deallocation time | 817.48 Ticks (±75.86 Ticks) | 1433.42 Ticks (±676.51 Ticks) | 0.57 |
| mutex_benchmark Build Time | 81.97 s | 90.98 s | 0.90 ❗ |
| mutex_benchmark File Size | 0.86 MB | 0.86 MB | 1.00 ❗ |
| Mutex Stress Test Average Time per Iteration - 1 Threads | 12.74 ns (±0.82 ns) | 13.06 ns (±0.76 ns) | 0.98 |
| Mutex Stress Test Average Time per Iteration - 2 Threads | 89.62 ns (±4.64 ns) | 20.38 ns (±8.77 ns) | 4.40 ❗ |
pci_types is updated as of #2220.
    #[repr(C)]
    pub(crate) struct MsixEntry {
        addr_low: u32,
        addr_high: u32,
        data: u32,
        control: u32,
    }
Would it make sense to upstream this to pci_types?
That crate currently does not use any volatile operations at all, though. Is volatile necessary here? Should pci_types use volatile for some current operations? Or does this live in the PCI config region and should be accessed via pci_types::ConfigRegionAccess instead?
It is part of the PCIe specification, so it may make sense to.
Volatile is necessary, and ConfigRegionAccess does not apply because this is an object that is mapped into the BAR space. I am not aware of any pci_types operations that are missing volatile accesses.
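To illustrate why volatile matters for a table that lives in BAR-mapped memory, here is a minimal sketch, not the code from this PR. The helpers `program_entry` and `read_entry` and the constructor `masked` are hypothetical names; in the test the pointer refers to ordinary memory standing in for the device mapping.

```rust
use core::ptr::{addr_of, addr_of_mut, read_volatile, write_volatile};

/// One MSI-X table entry: four 32-bit words.
#[repr(C)]
pub struct MsixEntry {
    addr_low: u32,
    addr_high: u32,
    data: u32,
    control: u32,
}

/// Bit 0 of the vector control word masks the vector.
const CONTROL_MASK_BIT: u32 = 1;

impl MsixEntry {
    /// Hypothetical constructor: an entry in its reset state (vector masked).
    pub const fn masked() -> Self {
        Self { addr_low: 0, addr_high: 0, data: 0, control: CONTROL_MASK_BIT }
    }
}

/// Hypothetical helper: mask the vector, program address and data, then unmask.
/// Every access is volatile so the compiler cannot elide or merge the
/// individual device writes.
///
/// # Safety
/// `entry` must point to a valid, aligned `MsixEntry` (for a real device,
/// inside the mapped BAR region).
pub unsafe fn program_entry(entry: *mut MsixEntry, addr: u64, data: u32) {
    unsafe {
        let control = read_volatile(addr_of!((*entry).control));
        write_volatile(addr_of_mut!((*entry).control), control | CONTROL_MASK_BIT);
        write_volatile(addr_of_mut!((*entry).addr_low), addr as u32);
        write_volatile(addr_of_mut!((*entry).addr_high), (addr >> 32) as u32);
        write_volatile(addr_of_mut!((*entry).data), data);
        let control = read_volatile(addr_of!((*entry).control));
        write_volatile(addr_of_mut!((*entry).control), control & !CONTROL_MASK_BIT);
    }
}

/// Hypothetical helper: read back (address, data, masked) with volatile loads.
///
/// # Safety
/// Same requirements as `program_entry`.
pub unsafe fn read_entry(entry: *const MsixEntry) -> (u64, u32, bool) {
    unsafe {
        let low = read_volatile(addr_of!((*entry).addr_low)) as u64;
        let high = read_volatile(addr_of!((*entry).addr_high)) as u64;
        let data = read_volatile(addr_of!((*entry).data));
        let control = read_volatile(addr_of!((*entry).control));
        ((high << 32) | low, data, control & CONTROL_MASK_BIT != 0)
    }
}
```

With plain field assignments the compiler would be free to reorder or drop stores it considers dead, which is harmless for normal memory but not for a device table.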
src/drivers/net/virtio/pci.rs
    let irq = device.get_irq();
    if irq.is_none() {
        warn!("No interrupt lanes found for virtio-net.");
Should this not be an error instead?
According to section 7.5.1.1.13 of the PCIe specification, version 6.0, "A [Interrupt Pin Register] value of 00h indicates that the Function uses no legacy interrupt Message(s)," so a well-behaved device may have no interrupt pin and line, in which case we may still be able to use MSI-X. There are also reserved and unknown values, as well as values that indicate no connection to the interrupt controller, but get_irq does not differentiate between these and returns None for all of them, just as it does for the no-legacy-interrupt case.
We could change it to also check for the existence of the MSI-X capability and print an error if both are unavailable.
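The suggestion above could be sketched roughly as follows. This is an illustration under assumptions, not the driver code: `legacy_irq`, `missing_irq_severity`, and `Severity` are hypothetical names, and treating an Interrupt Line value of FFh as "no connection" is the x86 convention, assumed here.

```rust
/// How loudly to report the interrupt situation of a device.
#[derive(Debug, PartialEq, Eq)]
pub enum Severity {
    Ok,
    Warn,
    Error,
}

/// Hypothetical stand-in for `get_irq`: returns the interrupt line only when
/// the Interrupt Pin register names a legacy pin (01h-04h, INTA#-INTD#).
/// 00h means no legacy interrupt; other values are reserved.
pub fn legacy_irq(pin: u8, line: u8) -> Option<u8> {
    // Assumption: FFh in Interrupt Line means unknown / not connected (x86).
    if matches!(pin, 1..=4) && line != 0xFF {
        Some(line)
    } else {
        None
    }
}

/// A missing legacy interrupt is only an error when MSI-X is missing as well.
pub fn missing_irq_severity(irq: Option<u8>, has_msix: bool) -> Severity {
    match (irq, has_msix) {
        (Some(_), _) => Severity::Ok,
        (None, true) => Severity::Warn,
        (None, false) => Severity::Error,
    }
}
```

A caller would map `Severity::Warn` to `warn!` and `Severity::Error` to `error!`, so a device that relies purely on MSI-X is not reported as broken.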
With the new |
cloud-hypervisor only supports MSI-X interrupts for PCI devices, so support for MSI-X is needed to support running on it. Additionally, MSI-X can allow us to set separate interrupt handlers for the configuration change interrupts and queue updates (even per-queue handlers once we have multiple queues) in the future. MSI-X does not work correctly, or at all, on some platforms, so it is exposed as an optional feature.

Co-authored-by: Martin Kröning <mkroening@posteo.net>
cloud-hypervisor only supports MSI-X interrupts for PCI devices, so support for MSI-X is needed to support running on it. Additionally, MSI-X can allow us to set separate interrupt handlers for the configuration change interrupts and queue updates (even per-queue handlers once we have multiple queues) in the future.
The implementation does not work on QEMU with the TAP network device (tested with `cargo xtask ci rs --arch x86_64 --profile dev --package httpd --features ci,hermit/virtio-net,hermit/msix qemu --accel --sudo --devices virtio-net-pci --tap`) and possibly on other platforms as well. It works with the QEMU user network device as well as on cloud-hypervisor when the `hermit/msix` feature is added to the regular run command.