LLM Model Comparison Dashboard

Interactive visualization of performance and quality metrics across LLMs and precision types

Overview

This dashboard compares Large Language Models (LLMs) across three precision types: INT4, INT8, and FP16. Each model card reports two performance metrics (latency in milliseconds and throughput in tokens per second) and five text-quality metrics (BLEU, ROUGE-L, CHRF, unique n-grams, and entropy).
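
The dashboard does not state how its two diversity metrics are defined. As a rough sketch only, assuming "Unique n-grams" is the ratio of distinct n-grams to total n-grams and "Entropy" is the Shannon entropy (in bits) of the token distribution, both over whitespace-split tokens, they could be computed as below. The function names and the default n = 2 are hypothetical, not taken from the dashboard.

```python
import math
from collections import Counter


def unique_ngram_ratio(text: str, n: int = 2) -> float:
    """Distinct n-grams divided by total n-grams (assumed definition)."""
    tokens = text.split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0  # text shorter than n tokens
    return len(set(ngrams)) / len(ngrams)


def token_entropy(text: str) -> float:
    """Shannon entropy in bits of the token distribution (assumed definition)."""
    tokens = text.split()
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Under these assumed definitions, a ratio near 1 means the output rarely repeats itself, and a low entropy means a few tokens dominate the output.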

[Charts: Performance Metrics, Quality Metrics, Precision Comparison]

Model Details

DeepSeek-R1-Distill-Llama-8B (INT8)

Latency (ms)
30186.90
Throughput (tokens/sec)
4.44
BLEU Score
0.0326
ROUGE-L
0.0571
CHRF Score
0.0545
Unique n-grams
0.9515
Entropy
5.9461

DeepSeek-R1-Distill-Qwen-1.5B (INT8)

Latency (ms)
7178.57
Throughput (tokens/sec)
18.67
BLEU Score
0.0444
ROUGE-L
0.1071
CHRF Score
0.1754
Unique n-grams
0.4605
Entropy
3.3649

DeepSeek-R1-Distill-Qwen-7B (INT8)

Latency (ms)
28169.97
Throughput (tokens/sec)
4.76
BLEU Score
0.0282
ROUGE-L
0.0533
CHRF Score
0.0521
Unique n-grams
0.8750
Entropy
5.4952

FP16DeepSeek-R1-Distill-Llama-8B (INT4)

Latency (ms)
67071.10
Throughput (tokens/sec)
2.00
BLEU Score
0.0317
ROUGE-L
0.0545
CHRF Score
0.0543
Unique n-grams
0.9340
Entropy
6.0050

FP16DeepSeek-R1-Distill-Qwen-1.5B (INT4)

Latency (ms)
24860.84
Throughput (tokens/sec)
5.39
BLEU Score
0.0222
ROUGE-L
0.0594
CHRF Score
0.0589
Unique n-grams
0.7228
Entropy
5.4924

FP16DeepSeek-R1-Distill-Qwen-7B (INT4)

Latency (ms)
216863.59
Throughput (tokens/sec)
0.62
BLEU Score
0.0256
ROUGE-L
0.0638
CHRF Score
0.0493
Unique n-grams
0.8750
Entropy
5.5701

FP16gemma-2-2b-it (INT4)

Latency (ms)
5640.46
Throughput (tokens/sec)
6.03
BLEU Score
0.1700
ROUGE-L
0.2609
CHRF Score
0.2500
Unique n-grams
0.9524
Entropy
4.2971

FP16gemma-2-9b-it (INT4)

Latency (ms)
67532.38
Throughput (tokens/sec)
0.52
BLEU Score
0.1700
ROUGE-L
0.2609
CHRF Score
0.2450
Unique n-grams
0.9524
Entropy
4.2971

FP16gemma-2b-it (INT4)

Latency (ms)
7267.17
Throughput (tokens/sec)
6.74
BLEU Score
0.0836
ROUGE-L
0.1379
CHRF Score
0.1652
Unique n-grams
0.7857
Entropy
3.8424

FP16llama-3.2-1b-instruct (INT4)

Latency (ms)
7877.08
Throughput (tokens/sec)
13.33
BLEU Score
0.0272
ROUGE-L
0.0741
CHRF Score
0.0478
Unique n-grams
0.8795
Entropy
5.1314

FP16llama-3.2-3b-instruct (INT4)

Latency (ms)
27402.13
Throughput (tokens/sec)
4.89
BLEU Score
0.0320
ROUGE-L
0.0561
CHRF Score
0.0566
Unique n-grams
0.8571
Entropy
5.7332

FP16minicpm3-4b (INT4)

Latency (ms)
9406.69
Throughput (tokens/sec)
2.13
BLEU Score
0.4463
ROUGE-L
0.6000
CHRF Score
0.5478
Unique n-grams
0.8889
Entropy
2.7255

FP16notus-7b-v1 (INT4)

Latency (ms)
64559.65
Throughput (tokens/sec)
2.09
BLEU Score
0.0237
ROUGE-L
0.0619
CHRF Score
0.0603
Unique n-grams
0.8421
Entropy
5.6973

FP16qwen2.5-0.5b-instruct (INT4)

Latency (ms)
9054.84
Throughput (tokens/sec)
14.69
BLEU Score
0.0411
ROUGE-L
0.0811
CHRF Score
0.0853
Unique n-grams
0.9390
Entropy
5.6794

FP16qwen2.5-1.5b-instruct (INT4)

Latency (ms)
21806.81
Throughput (tokens/sec)
4.26
BLEU Score
0.0643
ROUGE-L
0.1579
CHRF Score
0.1518
Unique n-grams
0.8679
Entropy
4.6972

FP16qwen2.5-3b-instruct (INT4)

Latency (ms)
6514.87
Throughput (tokens/sec)
4.91
BLEU Score
0.1645
ROUGE-L
0.2667
CHRF Score
0.2091
Unique n-grams
0.9333
Entropy
3.7736

FP16qwen2.5-7b-instruct (INT4)

Latency (ms)
60172.13
Throughput (tokens/sec)
2.21
BLEU Score
0.0401
ROUGE-L
0.0723
CHRF Score
0.0733
Unique n-grams
0.8929
Entropy
5.3836

FP16tiny-llama-1b-chat (INT4)

Latency (ms)
16058.43
Throughput (tokens/sec)
8.10
BLEU Score
0.0470
ROUGE-L
0.1224
CHRF Score
0.1500
Unique n-grams
0.9444
Entropy
4.4188

FP16zephyr-7b-beta (INT4)

Latency (ms)
60639.01
Throughput (tokens/sec)
2.23
BLEU Score
0.0237
ROUGE-L
0.0612
CHRF Score
0.0584
Unique n-grams
0.9158
Entropy
5.6141

INT4DeepSeek-R1-Distill-Llama-8B (FP16)

Latency (ms)
29102.83
Throughput (tokens/sec)
4.60
BLEU Score
0.0256
ROUGE-L
0.0682
CHRF Score
0.0657
Unique n-grams
0.8864
Entropy
5.4736

INT4DeepSeek-R1-Distill-Qwen-1.5B (FP16)

Latency (ms)
7707.98
Throughput (tokens/sec)
17.38
BLEU Score
0.0503
ROUGE-L
0.1429
CHRF Score
0.1407
Unique n-grams
0.9792
Entropy
4.6348

INT4DeepSeek-R1-Distill-Qwen-7B (FP16)

Latency (ms)
56090.15
Throughput (tokens/sec)
2.39
BLEU Score
0.0387
ROUGE-L
0.0682
CHRF Score
0.0683
Unique n-grams
0.8161
Entropy
5.3503

INT4gemma-2-2b-it (FP16)

Latency (ms)
4597.32
Throughput (tokens/sec)
7.40
BLEU Score
0.1700
ROUGE-L
0.2609
CHRF Score
0.2500
Unique n-grams
0.9524
Entropy
4.2971

INT4gemma-2-9b-it (FP16)

Latency (ms)
17309.56
Throughput (tokens/sec)
2.20
BLEU Score
0.1410
ROUGE-L
0.2222
CHRF Score
0.2132
Unique n-grams
0.9600
Entropy
4.5639

INT4gemma-2b-it (FP16)

Latency (ms)
4337.92
Throughput (tokens/sec)
7.84
BLEU Score
0.1432
ROUGE-L
0.2353
CHRF Score
0.2166
Unique n-grams
0.7647
Entropy
3.3372

INT4llama-3.2-1b-instruct (FP16)

Latency (ms)
5683.33
Throughput (tokens/sec)
23.58
BLEU Score
0.0401
ROUGE-L
0.1132
CHRF Score
0.1888
Unique n-grams
0.5833
Entropy
3.6101

INT4llama-3.2-3b-instruct (FP16)

Latency (ms)
13235.87
Throughput (tokens/sec)
10.12
BLEU Score
0.0292
ROUGE-L
0.0517
CHRF Score
0.0656
Unique n-grams
0.2609
Entropy
3.5028

INT4qwen2.5-0.5b-instruct (FP16)

Latency (ms)
3718.02
Throughput (tokens/sec)
35.77
BLEU Score
0.0339
ROUGE-L
0.0789
CHRF Score
0.0814
Unique n-grams
0.6970
Entropy
5.1826

INT4qwen2.5-1.5b-instruct (FP16)

Latency (ms)
3660.75
Throughput (tokens/sec)
9.83
BLEU Score
0.2141
ROUGE-L
0.3529
CHRF Score
0.2989
Unique n-grams
0.9412
Entropy
3.7345

INT4qwen2.5-3b-instruct (FP16)

Latency (ms)
6167.70
Throughput (tokens/sec)
3.24
BLEU Score
0.2985
ROUGE-L
0.5000
CHRF Score
0.3152
Unique n-grams
0.8889
Entropy
2.7255

INT4qwen2.5-7b-instruct (FP16)

Latency (ms)
28736.82
Throughput (tokens/sec)
4.63
BLEU Score
0.0278
ROUGE-L
0.0615
CHRF Score
0.0709
Unique n-grams
0.8889
Entropy
5.3339

INT4tiny-llama-1b-chat (FP16)

Latency (ms)
4734.72
Throughput (tokens/sec)
28.51
BLEU Score
0.0422
ROUGE-L
0.0741
CHRF Score
0.0786
Unique n-grams
0.5500
Entropy
4.5125

INT4zephyr-7b-beta (FP16)

Latency (ms)
30112.27
Throughput (tokens/sec)
4.48
BLEU Score
0.0242
ROUGE-L
0.0667
CHRF Score
0.0634
Unique n-grams
0.9355
Entropy
5.8065

gemma-2-2b-it (INT8)

Latency (ms)
2728.30
Throughput (tokens/sec)
12.46
BLEU Score
0.1700
ROUGE-L
0.2609
CHRF Score
0.2500
Unique n-grams
0.9524
Entropy
4.2971

gemma-2-9b-it (INT8)

Latency (ms)
16624.33
Throughput (tokens/sec)
2.89
BLEU Score
0.1019
ROUGE-L
0.1667
CHRF Score
0.1596
Unique n-grams
0.9706
Entropy
4.8522

gemma-2b-it (INT8)

Latency (ms)
1450.64
Throughput (tokens/sec)
13.10
BLEU Score
0.2985
ROUGE-L
0.4000
CHRF Score
0.3343
Unique n-grams
0.8889
Entropy
2.7255

llama-3.2-1b-instruct (INT8)

Latency (ms)
5606.72
Throughput (tokens/sec)
23.90
BLEU Score
0.0083
ROUGE-L
0.0364
CHRF Score
0.0342
Unique n-grams
0.9065
Entropy
5.9116

llama-3.2-3b-instruct (INT8)

Latency (ms)
7577.43
Throughput (tokens/sec)
10.03
BLEU Score
0.0712
ROUGE-L
0.1304
CHRF Score
0.1245
Unique n-grams
0.8750
Entropy
4.8006

qwen2.5-0.5b-instruct (INT8)

Latency (ms)
3331.82
Throughput (tokens/sec)
39.92
BLEU Score
0.0463
ROUGE-L
0.1200
CHRF Score
0.1431
Unique n-grams
0.6027
Entropy
4.2134

qwen2.5-1.5b-instruct (INT8)

Latency (ms)
7612.31
Throughput (tokens/sec)
17.47
BLEU Score
0.0336
ROUGE-L
0.0625
CHRF Score
0.0549
Unique n-grams
0.9900
Entropy
6.3202

qwen2.5-3b-instruct (INT8)

Latency (ms)
3270.38
Throughput (tokens/sec)
9.48
BLEU Score
0.1432
ROUGE-L
0.2500
CHRF Score
0.1967
Unique n-grams
0.8824
Entropy
3.7345

qwen2.5-7b-instruct (INT8)

Latency (ms)
6035.79
Throughput (tokens/sec)
3.98
BLEU Score
0.2346
ROUGE-L
0.4615
CHRF Score
0.3615
Unique n-grams
0.9091
Entropy
3.2776

tiny-llama-1b-chat (INT8)

Latency (ms)
4178.08
Throughput (tokens/sec)
27.29
BLEU Score
0.0547
ROUGE-L
0.1538
CHRF Score
0.2061
Unique n-grams
0.9839
Entropy
4.1264

zephyr-7b-beta (INT8)

Latency (ms)
28617.65
Throughput (tokens/sec)
4.72
BLEU Score
0.0237
ROUGE-L
0.0612
CHRF Score
0.0580
Unique n-grams
0.9158
Entropy
5.6352
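
If throughput is taken to be generated tokens divided by end-to-end latency (an assumption; the dashboard does not define either metric), the two performance figures on each card imply an approximate output length. The sketch below applies that reading to two of the cards above; `implied_tokens` is a hypothetical helper.

```python
def implied_tokens(latency_ms: float, throughput_tps: float) -> float:
    """Approximate generated-token count, assuming throughput = tokens / latency."""
    return throughput_tps * latency_ms / 1000.0


# (latency in ms, throughput in tokens/sec), copied from the cards above.
cards = {
    "DeepSeek-R1-Distill-Llama-8B (INT8)": (30186.90, 4.44),
    "gemma-2b-it (INT8)": (1450.64, 13.10),
}

for name, (latency_ms, tps) in cards.items():
    print(f"{name}: ~{implied_tokens(latency_ms, tps):.0f} tokens")
```

Under this assumption the first run produced about 134 tokens and the second about 19, so latency figures may reflect output length as much as model speed; throughput is likely the more comparable number.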

Contributors

Contributor 1
Godreign Elgin Y
Contributor 2
Aditya Krishna R S