model.language_model.layers.59.mlp.experts.213.down_proj.qweight is duplicated in shards 39 and 40
#5 opened 12 days ago by pfn0
Intel/Qwen3.5-397B-A17B-int4-AutoRound.gguf wanted!
#4 opened 21 days ago by MartinPatterson
Any chance we could get a tuned (non-RTN) version, like with the 122B model?
#3 opened 22 days ago by arcticgus
sglang vllm?
#2 opened 22 days ago by willfalco
[ISSUE] W4A16 Kernel Selection Failure on Ampere A100 with TP > 1
#1 opened about 1 month ago by brownyeyez