- deepseek-ai/DeepSeek-V3-Base
  Updated • 17.3k • 1.69k
- TransMLA: Multi-head Latent Attention Is All You Need
  Paper • 2502.07864 • Published • 69
- Qwen2.5 Bakeneko 32b Instruct Awq
  Chat with an AI assistant in Japanese
- Deepseek R1 Distill Qwen2.5 Bakeneko 32b Awq
  Chat with an AI to generate detailed responses
Eduardo Espina (Edespina)
AI & ML interests: None yet
Organizations: None yet
Models: none public yet
Datasets: none public yet