Get started with Inference Engine Arena in minutes
curl -LsSf https://astral.sh/uv/install.sh | sh
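After the installer finishes, it can help to confirm that uv is reachable from your shell before continuing. The snippet below is a minimal sketch; the helper name `check_uv` is hypothetical and only `uv --version` is an actual uv command.

```shell
# Report whether the uv binary is reachable from the current shell.
# check_uv is a hypothetical helper name used only for this illustration.
check_uv() {
  if command -v uv >/dev/null 2>&1; then
    uv --version
  else
    echo "uv not found on PATH; open a new shell or add its install directory to PATH"
  fi
}

check_uv
```

If the second message appears, opening a new terminal session is usually enough for the updated PATH to take effect.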
NousResearch/Meta-Llama-3.1-8B using vLLM.
Starting an Engine
NousResearch/Meta-Llama-3.1-8B. You can add any vLLM parameters that are compatible with vllm serve after vllm.
Optional: You can also set environment variables as needed.
Running Benchmarks on the Engine
./results by default. Refer to the Dashboard and Leaderboard for more details.
Stopping an Engine
/example_yaml/Meta-Llama-3.1-8B-varied-max-num-seq.yaml as an example, which runs the same benchmark type with different max-num-seqs configurations.
Tip: You may also refer to the other examples in the /example_yaml directory and to the runyaml section for more details.
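To make the idea of varying max-num-seqs concrete, here is a rough dry-run sketch that prints one plain `vllm serve` command per value. It does not reproduce the Arena YAML schema or the Arena CLI; the helper name `print_sweep` and the values 16, 64, and 256 are assumptions chosen for illustration, and only `--max-num-seqs` is an actual vLLM option.

```shell
# Model used throughout this guide.
MODEL="NousResearch/Meta-Llama-3.1-8B"

# Print one plain `vllm serve` command per max-num-seqs value (dry run only).
# print_sweep is a hypothetical helper; nothing is started by this function.
print_sweep() {
  for n in "$@"; do
    echo "vllm serve $MODEL --max-num-seqs $n"
  done
}

# Illustrative values; these are assumptions, not taken from the example YAML.
print_sweep 16 64 256
```

Each printed line corresponds to one engine configuration that would be benchmarked with the same benchmark type.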
Example YAML Configuration
Dashboard
Leaderboard
--no-login flag, you’ll need to log in to authorize the upload. We recommend uploading a single JSON file first to complete the login process, then using the command to upload all your data. Alternatively, you can first share your results in the dashboard using the “Share Subrun to Global Leaderboard” button. Don’t worry about duplicate submissions; our system automatically deduplicates repeated data.
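To pick a single JSON file for that first upload, a small helper like the one below may be handy. It is a sketch that assumes results are stored as .json files somewhere under ./results; the function name `first_result_json` is hypothetical, and the upload command itself is intentionally not shown.

```shell
# Print the first JSON file (in sorted order) found under a results directory.
# Defaults to ./results; first_result_json is a hypothetical helper name.
first_result_json() {
  dir="${1:-./results}"
  [ -d "$dir" ] || { echo "no results directory at $dir" >&2; return 1; }
  find "$dir" -type f -name '*.json' | sort | head -n 1
}

# Prints nothing if the directory exists but holds no JSON files.
first_result_json ./results || true
```

Pass the printed path to the upload step for the initial login, then upload the remaining data once you are authorized.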