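Using the kcores LLM-inference VRAM ladder (kcores大语言模型推理专用显存天梯) as a reference, this page computes how many tokens per second each GPU generates when running the llama-3.1-70b-instruct-4bit model, and ranks the results. The figures are theoretical performance with no losses accounted for, and are for reference only.
Theoretical tokens per second per GPU (or cluster), ranked: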
| GPU name | GPU count | Total tokens/s | Tokens/s per GPU | Rank |
| --- | ---: | ---: | ---: | ---: |
| NVIDIA GB200 NVL72 | 1 | 12000 | 12000 | 1 |
| NVIDIA GB200 Grace Blackwell Superchip | 1 | 333.33 | 333.33 | 2 |
| NVIDIA B200 SXM 192GB | 1 | 170.83 | 170.83 | 3 |
| NVIDIA H100 PCIe/SXM5 96GB | 1 | 70 | 70 | 4 |
| NVIDIA H100 SXM5 80GB | 1 | 70 | 70 | 5 |
| NVIDIA H800 SXM5 80GB | 1 | 70 | 70 | 6 |
| NVIDIA H100 PCIe/CNX 80GB | 1 | 42.5 | 42.5 | 7 |
| NVIDIA A100/A100X SXM4 80GB | 1 | 42.5 | 42.5 | 8 |
| NVIDIA A800 SXM4 80GB | 1 | 42.5 | 42.5 | 9 |
| NVIDIA H800 PCIe 80GB | 1 | 42.5 | 42.5 | 10 |
| NVIDIA H100 SXM5 64GB | 1 | 42.08 | 42.08 | 11 |
| NVIDIA A100 PCIe 80GB | 1 | 40.42 | 40.42 | 12 |
| NVIDIA A800 80GB Active Ampere | 1 | 40.42 | 40.42 | 13 |
| NVIDIA GRID/DRIVE A100A 32GB | 2 | 77.92 | 38.96 | 14 |
| NVIDIA GRID A100B 48GB | 1 | 38.96 | 38.96 | 15 |
| NVIDIA GeForce RTX 5090 32GB (Preliminary) | 2 | 74.67 | 37.335 | 16 |
| NVIDIA A100 PCIe/SXM4 40GB | 2 | 65 | 32.5 | 17 |
| NVIDIA A800 40GB Active Ampere | 2 | 65 | 32.5 | 18 |
| NVIDIA A30X 24GB | 2 | 50.83 | 25.415 | 19 |
| NVIDIA Tesla V100 SXM2 16GB (version 2019) | 4 | 94.17 | 23.5425 | 20 |
| NVIDIA Tesla V100S PCIe 32GB | 2 | 47.08 | 23.54 | 21 |
| NVIDIA GeForce RTX 5080 16GB (Preliminary) | 4 | 85.33 | 21.3325 | 22 |
| NVIDIA GeForce RTX 4090 24GB | 2 | 42.08 | 21.04 | 23 |
| NVIDIA GeForce RTX 3090 Ti 24GB | 2 | 42.08 | 21.04 | 24 |
| NVIDIA Tesla V100 SXM3 32GB | 2 | 40.88 | 20.44 | 25 |
| NVIDIA RTX 6000 Ada 48GB | 1 | 20 | 20 | 26 |
| NVIDIA GeForce RTX 3090 24GB | 2 | 39.01 | 19.505 | 27 |
| NVIDIA A30 PCIe 24GB | 2 | 38.88 | 19.44 | 28 |
| NVIDIA GeForce RTX 3080 12GB (Ti 12GB) | 4 | 76.03 | 19.0075 | 29 |
| NVIDIA Tesla V100 PCIe/SXM2/DGXS 32GB | 2 | 37.42 | 18.71 | 30 |
| NVIDIA Tesla V100 PCIe/SXM2 16GB | 4 | 74.75 | 18.6875 | 31 |
| NVIDIA GeForce RTX 5070 Ti 16GB (Preliminary) | 4 | 74.67 | 18.6675 | 32 |
| NVIDIA Quadro GV100 32GB | 2 | 36.18 | 18.09 | 33 |
| NVIDIA TITAN V CEO Edition 32GB | 2 | 36.18 | 18.09 | 34 |
| NVIDIA L40/L40G 24GB | 2 | 36 | 18 | 35 |
| NVIDIA L20 48GB | 1 | 18 | 18 | 36 |
| NVIDIA L40/L40S 48GB | 1 | 18 | 18 | 37 |
| NVIDIA RTX 5880 Ada 48GB | 1 | 18 | 18 | 38 |
| Apple MacStudio M1 Ultra 64GB | 1 | 17.07 | 17.07 | 39 |
| Apple MacStudio M1 Ultra 128GB | 1 | 17.07 | 17.07 | 40 |
| Apple MacStudio M2 Ultra 64GB | 1 | 17.07 | 17.07 | 41 |
| Apple MacStudio M2 Ultra 128GB | 1 | 17.07 | 17.07 | 42 |
| Apple MacStudio M2 Ultra 192GB | 1 | 17.07 | 17.07 | 43 |
| DDR6 12 Channel 8400 512GB | 1 | 16.8 | 16.8 | 44 |
| NVIDIA A16 PCIe 64GB | 1 | 16.68 | 16.68 | 45 |
| NVIDIA RTX A5500 Ampere 24GB | 2 | 32 | 16 | 46 |
| NVIDIA RTX A5000 Ampere 24GB | 2 | 32 | 16 | 47 |
| NVIDIA RTX A6000 Ampere 48GB | 1 | 16 | 16 | 48 |
| NVIDIA GeForce RTX 3080 10GB | 8 | 126.72 | 15.84 | 49 |
| NVIDIA GeForce RTX 3080 Ti 20GB | 4 | 63.36 | 15.84 | 50 |
| NVIDIA GeForce RTX 4080 SUPER 16GB | 4 | 61.36 | 15.34 | 51 |
| NVIDIA Quadro GP100 16GB | 4 | 61.02 | 15.255 | 52 |
| NVIDIA Tesla P100 SXM2/DGXS 16GB | 4 | 61.02 | 15.255 | 53 |
| NVIDIA GeForce RTX 4080 16GB | 4 | 59.73 | 14.9325 | 54 |
| NVIDIA A40 PCIe 48GB | 1 | 14.5 | 14.5 | 55 |
| NVIDIA Tesla P10 24GB | 2 | 28.93 | 14.465 | 56 |
| NVIDIA GeForce RTX 4070 Ti SUPER 16GB | 4 | 56.03 | 14.0075 | 57 |
| NVIDIA GeForce RTX 5070 12GB (Preliminary) | 4 | 56 | 14 | 58 |
| NVIDIA Quadro RTX 6000 24GB | 2 | 28 | 14 | 59 |
| NVIDIA TITAN RTX 24GB | 2 | 28 | 14 | 60 |
| NVIDIA Quadro RTX 8000 48GB | 1 | 14 | 14 | 61 |
| NVIDIA TITAN V 12GB | 4 | 54.28 | 13.57 | 62 |
| NVIDIA RTX A4500 Ampere 20GB | 4 | 53.33 | 13.3325 | 63 |
| NVIDIA GeForce RTX 2080 Ti 11GB | 8 | 102.67 | 12.83375 | 64 |
| NVIDIA GeForce RTX 3070 Ti 8GB | 8 | 101.38 | 12.6725 | 65 |
| NVIDIA GeForce RTX 3060 Ti GDDR6X 8GB | 8 | 101.38 | 12.6725 | 66 |
| NVIDIA GeForce RTX 3070 Ti 16GB | 4 | 50.69 | 12.6725 | 67 |
| NVIDIA A10M 24GB | 2 | 25.01 | 12.505 | 68 |
| NVIDIA A10/A10G PCIe 24GB | 2 | 25.01 | 12.505 | 69 |
| NVIDIA RTX 5000 Ada 32GB | 2 | 24 | 12 | 70 |
| Intel Arc A770 16GB | 4 | 46.67 | 11.6675 | 71 |
| NVIDIA Tesla P100 PCIe 12GB | 4 | 45.76 | 11.44 | 72 |
| NVIDIA TITAN Xp 12GB | 4 | 45.63 | 11.4075 | 73 |
| Apple MacBook Pro M4 Max 64GB | 1 | 11.38 | 11.38 | 74 |
| Apple MacBook Pro M4 Max 128GB | 1 | 11.38 | 11.38 | 75 |
| Apple MacBook Pro M4 Max 48GB | 2 | 22.75 | 11.375 | 76 |
| NVIDIA Project DIGITS 128GB | 1 | 10.67 | 10.67 | 77 |
| Intel Arc A750 8GB | 8 | 85.33 | 10.66625 | 78 |
| Intel Arc A580 8GB | 8 | 85.33 | 10.66625 | 79 |
| NVIDIA GeForce RTX 4080 12GB | 4 | 42.02 | 10.505 | 80 |
| NVIDIA GeForce RTX 4070 Ti 12GB | 4 | 42.02 | 10.505 | 81 |
| NVIDIA GeForce RTX 4070 SUPER 12GB | 4 | 42.02 | 10.505 | 82 |
| NVIDIA GeForce RTX 4070 12GB | 4 | 42.02 | 10.505 | 83 |
| NVIDIA Tesla K80 24GB | 2 | 20.05 | 10.025 | 84 |
| Intel Arc B580 12GB | 4 | 38 | 9.5 | 85 |
| DDR5 12 Channel 4800 512GB | 1 | 9.38 | 9.38 | 86 |
| NVIDIA GeForce RTX 3070 8GB | 8 | 74.67 | 9.33375 | 87 |
| NVIDIA GeForce RTX 3060 Ti 8GB | 8 | 74.67 | 9.33375 | 88 |
| NVIDIA GeForce RTX 2070 8GB | 8 | 74.67 | 9.33375 | 89 |
| NVIDIA GeForce RTX 2080 8GB | 8 | 74.67 | 9.33375 | 90 |
| NVIDIA GeForce RTX 5060 8GB (Preliminary) | 8 | 74.67 | 9.33375 | 91 |
| NVIDIA RTX A4000 Ampere 16GB | 4 | 37.33 | 9.3325 | 92 |
| NVIDIA Quadro RTX 5000 16GB | 4 | 37.33 | 9.3325 | 93 |
| NVIDIA Quadro P6000 24GB | 2 | 18.03 | 9.015 | 94 |
| NVIDIA GeForce RTX 4080 Mobile 12GB | 4 | 36 | 9 | 95 |
| NVIDIA RTX 4500 Ada 24GB | 2 | 18 | 9 | 96 |
| NVIDIA Tesla T10 16GB | 4 | 35.85 | 8.9625 | 97 |
| NVIDIA Quadro RTX 4000 8GB | 8 | 69.33 | 8.66625 | 98 |
| Apple MacStudio M1 Max 32GB | 2 | 17.07 | 8.535 | 99 |
| Apple MacStudio M2 Max 32GB | 2 | 17.07 | 8.535 | 100 |
| Apple MacBook Pro M3 Max 48GB | 2 | 17.07 | 8.535 | 101 |
| Apple MacStudio M1 Max 64GB | 1 | 8.53 | 8.53 | 102 |
| Apple MacStudio M2 Max 64GB | 1 | 8.53 | 8.53 | 103 |
| Apple MacStudio M2 Max 96GB | 1 | 8.53 | 8.53 | 104 |
| Apple MacBook Pro M3 Max 64GB | 1 | 8.53 | 8.53 | 105 |
| Apple MacBook Pro M3 Max 128GB | 1 | 8.53 | 8.53 | 106 |
| Intel Arc B570 10GB | 8 | 63.33 | 7.91625 | 107 |
| NVIDIA GeForce RTX 3060 12GB | 4 | 30 | 7.5 | 108 |
| NVIDIA RTX 4000 Ada 20GB | 4 | 30 | 7.5 | 109 |
| NVIDIA Tesla P40 24GB | 2 | 14.46 | 7.23 | 110 |
| NVIDIA Tesla M10 32GB | 2 | 13.87 | 6.935 | 111 |
| NVIDIA Tesla M60 16GB | 4 | 26.73 | 6.6825 | 112 |
| NVIDIA RTX 4000 SFF Ada 20GB | 4 | 26.67 | 6.6675 | 113 |
| NVIDIA Tesla T4/T4G 16GB | 4 | 26.67 | 6.6675 | 114 |
| NVIDIA L4 24GB | 2 | 12.5 | 6.25 | 115 |
| DDR5 8 Channel 4800 512GB | 1 | 6.25 | 6.25 | 116 |
| NVIDIA Tesla M40 24GB | 2 | 12.02 | 6.01 | 117 |
| NVIDIA Tesla M40 12GB | 4 | 24.03 | 6.0075 | 118 |
| NVIDIA GeForce RTX 4060 Ti 8GB | 8 | 48 | 6 | 119 |
| NVIDIA RTX A2000 12GB Ampere | 4 | 24 | 6 | 120 |
| NVIDIA GeForce RTX 4060 Ti 16GB | 4 | 24 | 6 | 121 |
| Apple MacMini M4 Pro 48GB | 2 | 11.38 | 5.69 | 122 |
| Apple MacMini M4 Pro 64GB | 1 | 5.69 | 5.69 | 123 |
| Apple MacBook Pro M4 Pro 64GB | 1 | 5.69 | 5.69 | 124 |
| Apple MacMini M4 Pro 24GB | 4 | 22.75 | 5.6875 | 125 |
| NVIDIA GeForce RTX 4060 8GB | 8 | 45.33 | 5.66625 | 126 |
| NVIDIA Quadro P4000 8GB | 8 | 40.55 | 5.06875 | 127 |
| NVIDIA GeForce RTX 3060 8GB | 8 | 40 | 5 | 128 |
| NVIDIA RTX 2000 Ada 16GB | 4 | 18.67 | 4.6675 | 129 |
| NVIDIA GeForce RTX 3050 8GB | 8 | 37.33 | 4.66625 | 130 |
| NVIDIA Jetson AGX Orin 64GB | 1 | 4.27 | 4.27 | 131 |
| NVIDIA Jetson AGX Orin 32GB | 2 | 8.53 | 4.265 | 132 |
| NVIDIA A2 16GB | 4 | 16.68 | 4.17 | 133 |
| DDR4 8 Channel 3200 512GB (EPYC SP3 LGA-4189) | 1 | 4.17 | 4.17 | 134 |
| Apple MacMini M2 Pro 16GB | 8 | 33.33 | 4.16625 | 135 |
| Apple MacMini M2 Pro 32GB | 2 | 8.33 | 4.165 | 136 |
| NVIDIA Tesla P4 8GB | 8 | 32.05 | 4.00625 | 137 |
| NVIDIA RTX A1000 Ampere 8GB | 8 | 32 | 4 | 138 |
| NVIDIA T1000 8GB Turing | 8 | 26.67 | 3.33375 | 139 |
| Apple MacBook Pro M3 Pro 36GB | 2 | 6.4 | 3.2 | 140 |
| DDR4 6 Channel 2933 384GB (LGA-3647) | 1 | 2.85 | 2.85 | 141 |
| NVIDIA Jetson AGX Xavier 16GB | 4 | 11.38 | 2.845 | 142 |
| NVIDIA Jetson AGX Xavier 32GB | 2 | 5.69 | 2.845 | 143 |
| Apple MacMini M4 16GB | 8 | 20 | 2.5 | 144 |
| Apple MacMini M4 24GB | 4 | 10 | 2.5 | 145 |
| Apple MacMini M4 32GB | 2 | 5 | 2.5 | 146 |
| Apple MacBook Pro M4 32GB | 2 | 5 | 2.5 | 147 |
| NVIDIA Jetson Orin NX 8GB | 8 | 17.07 | 2.13375 | 148 |
| NVIDIA Jetson Orin NX 16GB | 4 | 8.53 | 2.1325 | 149 |
| Apple MacBook Pro M3 24GB | 4 | 8.53 | 2.1325 | 150 |
| Jetson Orin Nano Super 8GB | 8 | 17 | 2.125 | 151 |
| Apple MacMini M2 16GB | 8 | 16.67 | 2.08375 | 152 |
| Apple MacMini M2 24GB | 4 | 8.33 | 2.0825 | 153 |
| DDR4 4 Channel 3200 256GB (LGA-2011-3) | 1 | 1.56 | 1.56 | 154 |
| NVIDIA Jetson Orin Nano 8GB | 8 | 11.38 | 1.4225 | 155 |
| Apple MacMini M1 16GB | 8 | 11.11 | 1.38875 | 156 |
| NVIDIA Jetson Xavier NX 16GB | 4 | 4.98 | 1.245 | 157 |
| NVIDIA Jetson Xavier NX 8GB | 8 | 9.95 | 1.24375 | 158 |
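
The per-GPU figure in the ranking appears to be the total tokens per second divided by the GPU count, with entries then sorted in descending order. The following is a minimal Python sketch of that calculation, assuming this reading of the columns is correct. The names `Entry` and `rank_by_per_gpu` are illustrative and do not come from the original page; the four sample rows are copied from the ranking above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One row of the ranking (hypothetical structure for illustration)."""
    name: str
    gpu_count: int
    total_tps: float  # theoretical total tokens per second for the whole setup

    @property
    def per_gpu_tps(self) -> float:
        # Assumption: the per-GPU average is simply total / count.
        return self.total_tps / self.gpu_count


def rank_by_per_gpu(entries):
    """Return (rank, entry) pairs sorted by per-GPU tokens/s, highest first.

    sorted() is stable, so entries with equal per-GPU values keep input order.
    """
    ordered = sorted(entries, key=lambda e: e.per_gpu_tps, reverse=True)
    return [(i + 1, e) for i, e in enumerate(ordered)]


# Sample rows taken from the table above (name, GPU count, total tokens/s).
SAMPLE = [
    Entry("NVIDIA GeForce RTX 4090 24GB", 2, 42.08),
    Entry("NVIDIA RTX 6000 Ada 48GB", 1, 20.0),
    Entry("NVIDIA GeForce RTX 3080 10GB", 8, 126.72),
    Entry("NVIDIA A100 PCIe 80GB", 1, 40.42),
]

ranking = rank_by_per_gpu(SAMPLE)
for position, entry in ranking:
    print(f"{position}. {entry.name}: {entry.per_gpu_tps:.2f} tokens/s per GPU")
```

On these rows the eight-card RTX 3080 10GB setup has the highest total throughput but the lowest per-GPU value, which is why it sits far below single-card entries in the ranking.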
The original ladder is reproduced below: