video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model Paper • 2502.11775 • Published Feb 17, 2025 • 9
NousResearch/DeepHermes-3-Llama-3-8B-Preview Text Generation • 8B • Updated Apr 10, 2025 • 1.6k • 362
F5-TTS 🗣 F5-TTS & E2-TTS: Zero-Shot Voice Cloning (Unofficial Demo) • Running on Zero • Agents • Featured • 2.87k
Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens Paper • 2410.13863 • Published Oct 17, 2024 • 37
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 674
MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans? Paper • 2408.13257 • Published Aug 23, 2024 • 26
Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering Paper • 2408.09702 • Published Aug 19, 2024 • 11
Trelis/Meta-Llama-3-8B-Instruct-function-calling Text Generation • 8B • Updated Jul 23, 2024 • 30 • 45
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models Paper • 2405.15574 • Published May 24, 2024 • 55