VLA Benchmark

Performance analysis of Vision-Language-Action models on heterogeneous hardware

Hardware Capability Memory (GB) Cost ($) Latency (ms) Energy (J) CE (10⁶) CT (10²) CET (10⁵)
* CE = Cost × Energy (scaled by 10⁶)
* CT = Cost × Time (scaled by 10²)
* CET = Cost × Energy × Time (scaled by 10⁵)
* Lower values across all metrics indicate better efficiency.
* The generic "CPU" designation refers to the 11th Gen Intel Core i7-11700 processor.
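
The composite columns defined in the notes above can be sketched in a few lines. This is a minimal illustration, not the benchmark's own code. It rests on several assumptions: CE is read as Cost × Energy, as the abbreviation suggests; "Time" is the Latency column converted to seconds; and "scaled" means the raw product is divided by the stated power of ten. The `Measurement` class, the `composite_metrics` function, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    """One table row (hypothetical container, not from the benchmark)."""
    cost_usd: float    # Cost ($)
    latency_ms: float  # Latency (ms)
    energy_j: float    # Energy (J)


def composite_metrics(m: Measurement) -> dict:
    """Compute CE, CT and CET for one row.

    Assumptions: Time is latency expressed in seconds, and each
    product is divided by its scale factor (10^6, 10^2, 10^5).
    """
    time_s = m.latency_ms / 1000.0
    return {
        "CE": m.cost_usd * m.energy_j / 1e6,
        "CT": m.cost_usd * time_s / 1e2,
        "CET": m.cost_usd * m.energy_j * time_s / 1e5,
    }


# Illustrative values only; these are not benchmark results.
example = Measurement(cost_usd=1000.0, latency_ms=100.0, energy_j=1000.0)
print(composite_metrics(example))
```

Because every composite is a product, a device that halves any one factor halves the composite, so the three columns rank hardware by joint cost, energy, and time rather than by any single measurement.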
Anonymous Submission