🧑‍🚀 open source must win

Victor Mustar PRO

victor

725 1384 5390

victormustar

AI & ML interests

Building the UX of this website

Recent Activity

liked a model about 2 hours ago

Lightricks/LTX-2.3-22b-IC-LoRA-Clean-Plate

upvoted a collection about 10 hours ago

Xiaomi-Robotics-U0

liked a model about 10 hours ago

XiaomiRobotics/Xiaomi-Robotics-U0

Organizations

reacted to SeaWolf-AI's post with 🔥 14 days ago

Post

5172

🔓 We ran genuine quantum key-recovery on 'real IBM quantum hardware' — and pushed the frontier well past the largest hardware demos we're aware of (which sat at N=4).

Using Simon's algorithm on ibm_kingston, we recovered the secret key of two symmetric-cipher structures:
• Even–Mansour — N=5 → N=10
• 3-round Feistel (DES-family) — block 6 → 8

Each verified against an 'independent control key', using error mitigation only (no QEC).

🧭 Honest scope: this is not a quantum speedup (the effective difficulty tracks the classical birthday bound ~2^{n/2}), not a break of real AES/RSA, and not 16-round DES (ours is 3-round). The recovery method is reserved for a forthcoming paper; formal record status is pending peer review.
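For readers unfamiliar with why Simon's algorithm applies to Even-Mansour at all, here is a minimal classical sketch of the textbook observation. It is not the recovery method from the post, which is reserved for the forthcoming paper. It assumes the standard construction E(x) = P(x ⊕ k1) ⊕ k2 with a public permutation P; the function names, the toy permutation, and the key values are all hypothetical. The point is only that f(x) = E(x) ⊕ P(x) has k1 as a hidden XOR period, which is the structure a period-finding routine looks for.

```python
import random


def make_even_mansour(n, k1, k2, seed=0):
    """Toy n-bit Even-Mansour cipher E(x) = P(x ^ k1) ^ k2 over a random public permutation P."""
    rng = random.Random(seed)
    perm = list(range(1 << n))
    rng.shuffle(perm)

    def encrypt(x):
        return perm[x ^ k1] ^ k2

    def f(x):
        # f(x) = E(x) ^ P(x) satisfies f(x) == f(x ^ k1), so k1 is a hidden XOR period.
        return encrypt(x) ^ perm[x]

    return encrypt, f


def xor_periods(f, n):
    """Brute-force every nonzero s with f(x) == f(x ^ s) for all x (classical, not quantum)."""
    size = 1 << n
    table = [f(x) for x in range(size)]
    return [s for s in range(1, size)
            if all(table[x] == table[x ^ s] for x in range(size))]


# Hypothetical toy parameters: a 5-bit block and arbitrary keys.
N, K1, K2 = 5, 0b10110, 0b01101
_, f = make_even_mansour(N, K1, K2)
periods = xor_periods(f, N)  # always contains K1; a toy permutation may add spurious periods
```

The brute-force search here costs on the order of 4^n table lookups and is only meant to make the hidden period visible at toy sizes.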

📄 Write-up: https://huggingface.co/blog/FINAL-Bench/quantum
🕹️ Try it live in your browser: https://vidraft-quantumos.hf.space/crypto
🏆 Leaderboard: FINAL-Bench/quantum-bench-leaderboard

#quantum #cryptography #quantumcomputing

reacted to constannnt's post with ❤️ 25 days ago

Post

10385

We are excited to announce Sipp.sh: a high-performance library for running AI inference locally and in the cloud through a unified API.

We began to realize that an LLM isn't just a chat interface for information retrieval. It can be integrated directly into web apps, games, or productivity apps to handle continuous monitoring and decision-making. It can act as a sort of "second brain," the silent hand that guides and helps a user without them even realizing it. We see this as the next frontier of UX design, but this is only possible if developers have access to low-cost, zero-latency compute and absolute data privacy.

That's why we created Sipp. It’s an opinionated library that lets developers integrate local AI into any application, giving them the superpowers to completely rethink user experiences across the web, games, and desktop.

To achieve this, we built an entirely new stack in Rust and C++, working alongside the llama.cpp project. Through our work, we were able to contribute back to that community to help upgrade the GGML WebGPU backend. This deep optimization is what enables our fast, responsive decode speeds directly in the browser. Sipp ships as a zero-dependency library for desktop and web, achieving 3x to 5x speedup in token decode compared to popular alternatives.

We are already seeing some incredible use cases emerge from this, from continuous monitoring using local vision to the dynamic generation of game elements in a real-time wizard vs. wizard game.

The best part? It's fully open-source!

We see this as the start of a dialogue about what the future of user interaction is going to look like, and we built Sipp to lay the foundation for that exciting future. Check out the live demos on our site, run your own benchmarks, or come hang out with us in our Discord.

Website: https://www.sipp.sh/
GitHub: https://github.com/noumena-labs/Sipp

1 reply

reacted to projectlosangeles's post with 🤗 27 days ago

Post

9686

🔥Check out HeartMuLa!!! 🔥

The best open-sourced music generation model in terms of lyrics controllability and music quality!!!

HeartMuLa/HeartMuLa-oss-3B-happy-new-year

❤️Listen to amazing HeartMuLa output samples here:
https://soundcloud.com/aleksandr-sigalov-61/sets/heartmula ❤️

@victor

8 replies

reacted to Jaward's post with 🔥 about 1 month ago

Post

9178

Our preprint is out!
We attempt to model human teaching behaviors in agents, yielding a unified framework that enables adaptive, personalized learning experiences.
LectūraAgents addresses the prevailing limitations of current AI learning systems with three essential capabilities:
(1) a hierarchical multi-agent architecture modeled on academic standards. We observe that agents collaborating across hierarchies yield better personalized learning outcomes.
(2) an adaptive embodied teaching mechanism, in which the instructor agent executes visible, pedagogically motivated teaching actions (e.g. handwrite, highlight, circle, etc.) on content in a teaching environment while speaking.
(3) a novel teaching action-speech alignment algorithm (TASA) that makes this possible by dynamically aligning speech with visual teaching actions. Specifically, TASA temporally splits speech segments into word-level tokens, performs salience heuristic analysis on learning content (text, images, etc.), then identifies relevant regions in which to apply pedagogical teaching actions that guide attention and augment understanding.
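To make the action-speech alignment idea in (3) more concrete, here is a rough sketch. It is not the TASA algorithm from the paper: the `Region` type, the even spacing of word timestamps, and the keyword-overlap "salience" check are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """A hypothetical on-screen content region, e.g. a slide title or an equation label."""
    name: str
    text: str


def word_level_tokens(speech, start, end):
    """Split a speech segment into (word, timestamp) pairs, spacing words evenly in time."""
    words = speech.split()
    if not words:
        return []
    step = (end - start) / len(words)
    return [(w.strip(".,!?").lower(), start + i * step) for i, w in enumerate(words)]


def align_actions(speech, start, end, regions, action="highlight"):
    """Schedule one action per region, at the first spoken word that appears in its text."""
    schedule = []
    used = set()
    for word, t in word_level_tokens(speech, start, end):
        for region in regions:
            keywords = set(region.text.lower().split())  # stand-in salience heuristic
            if word in keywords and region.name not in used:
                schedule.append((round(t, 2), action, region.name))
                used.add(region.name)
    return schedule


regions = [Region("title", "gradient descent"), Region("eq1", "learning rate")]
schedule = align_actions(
    "We update weights using gradient steps scaled by the learning rate",
    0.0, 11.0, regions,
)
# schedule pairs each region with the time its first matching word is spoken
```

A real system would presumably take word timings from the speech synthesizer and use a richer salience model; the sketch only shows the shape of the mapping from spoken words to timed visual actions.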

We conducted several experiments to assess these capabilities: a pedagogical evaluation of the various components under frontier models, a comparative analysis with existing frameworks, and an efficacy study with real students.

Results show consistent gains in standard instructional metrics (curated by expert educators) spanning lecture content quality, embodied teaching quality, assessment, and personalization over baseline systems, positioning LectūraAgents as a pedagogically grounded framework for personalized learning at scale.

Paper: LectūraAgents: A Multi-Agent Framework for Adaptive Personalized AI-Assisted Learning and Embodied Teaching (2606.16428)
Data: Jaward/lectura-agents-data

replied to rockypod's post about 2 months ago

Wait, first time I'm hearing about it. Absolute fan of Svelte, I'll try it!

reacted to rockypod's post with 🔥 about 2 months ago

Post

207

Still working on Svelte Coder. I will experiment with some other models so that I can get Svelte 5 trained better.

1 reply

posted an update about 2 months ago

Post

2667

Sharing how I built the LongCat-Video-Avatar 1.5 Space (+500k views on X) in one agent session. Gave a coding agent its own AI lab on ZeroGPU, framed the goal, walked away. It designed, deployed, tested against the live API, fixed, shipped.

Full recipe with the copy-paste prompt: https://huggingface.co/blog/victor/building-zerogpu-spaces-autonomously

1 reply

reacted to qgallouedec's post with 🔥 2 months ago

Post

10546

Shipped hf-sandbox! 🥡

🧪 Running an eval that executes model-generated C on a few thousand prompts? You probably don't want any of that on your laptop.
Just shipped hf-sandbox, a Modal-style sandbox API on top of Hugging Face Jobs. Spin up an isolated, ephemeral container, run untrusted code, get the result back. No Docker on your laptop, no infra to manage.

Just pip install hf-sandbox.

Early days (v0.1); feedback and issues very welcome:
👉 https://github.com/huggingface/hf-sandbox

1 reply

reacted to mipo57's post with 👍 2 months ago

Post

1537

How do you train self-playing RL agents with JAX? A new Colab that will set you up with Jaxpot is here: https://colab.research.google.com/drive/1-rm_Bh8CNaM861We97ZoicfgKxz0xOSi?usp=sharing

1 reply

reacted to Tonic's post with 🤗 3 months ago

Post

4377

🙋🏻‍♂️ Hey there folks,

Since everyone liked my previous announcement post (https://huggingface.co/posts/Tonic/338509028435394) so much, I'm back with more high-quality procedural datasets in the geospatial domain for SFT training!

Check this one out:
NuTonic/sat-bbox-metadata-sft-v1

The goal is to be able to train vision models on multiple images for remote sensing analysis with one shot.

Hope you like it! 🚀

2 replies

reacted to Enderchef's post with ❤️ 3 months ago

Post

8496

Hi, everyone!
Please follow, like, and support the work of CompactAI-O!
Spread the word!

9 replies

replied to their post 3 months ago

https://huggingface.co/datasets/victor/car-game-pi-traces

reacted to asigalov61's post with 🔥 3 months ago

Post

4333

🔥🎵 ➕ 🖹 🔥Check out my new large-scale MIDI + Lyrics dataset!!!

asigalov61/Lyrics-MIDI-Dataset

~179k MIDIs with corresponding Lyrics to play with!!! 🤗

If you liked the dataset, please ❤️

Any feedback and/or suggestions are also appreciated 🤗

reacted to SeaWolf-AI's post with 🔥 3 months ago

Post

5972

🧬 Darwin-27B-Opus: 86.9% on GPQA Diamond — World #5, Zero Training
We are excited to share Darwin-27B-Opus, a 27B model that achieved 86.9% on GPQA Diamond — ranking #5 globally on the HuggingFace leaderboard — without a single gradient update.

How? Darwin breeds pretrained models through evolutionary FFN crossbreeding. The father (Qwen3.5-27B) provides the reasoning architecture; the mother (Claude 4.6 Opus Reasoning Distilled) contributes structured chain-of-thought knowledge. CMA-ES automatically discovers optimal per-layer blending ratios — no human tuning required.
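As a rough illustration of what per-layer blending means, here is a minimal sketch of interpolating two parents' weights with a separate ratio per layer and per parameter group. It is not Darwin's code: the function names, the `attn`/`ffn` grouping, and the toy weights are assumptions, and the CMA-ES search that would pick the ratios is left out entirely.

```python
def blend(w_father, w_mother, mother_share):
    """Linear interpolation of two flat weight lists; mother_share=0 keeps the father."""
    return [(1.0 - mother_share) * a + mother_share * b
            for a, b in zip(w_father, w_mother)]


def crossbreed(father, mother, ratios):
    """Blend two models layer by layer.

    father, mother: list of layers, each a dict of group name -> flat weight list.
    ratios: list of dicts giving the mother's share for each group in each layer.
    """
    child = []
    for layer_f, layer_m, layer_ratio in zip(father, mother, ratios):
        child.append({
            group: blend(layer_f[group], layer_m[group], layer_ratio[group])
            for group in layer_f
        })
    return child


# Hypothetical one-layer parents with tiny weight vectors.
father = [{"attn": [1.0, 1.0], "ffn": [0.0, 2.0]}]
mother = [{"attn": [3.0, 5.0], "ffn": [4.0, 6.0]}]
ratios = [{"attn": 0.0, "ffn": 0.25}]  # keep the father's attention, take 25% mother FFN
child = crossbreed(father, mother, ratios)
```

In this framing an optimizer only has to search over the small vector of ratios, scoring each candidate child on a benchmark, which is consistent with the post's claim that no gradient updates are involved.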

The result surpasses the original Qwen3.5-27B (85.5%), GLM-5.1 (744B, 86.2%), and Qwen3.5-122B (86.6%). A 27B model outperforming 744B — with zero training, zero data, one GPU, ~2 hours.

We also confirmed hybrid vigor on Korean benchmarks: Darwin-27B-KR (2nd generation offspring) surpassed both parents on CLIcK, winning 7 out of 11 categories. The evolutionary optimizer independently assigned 93% of FFN from the Korean-specialized mother while preserving 93% of attention from the reasoning-specialized father — autonomously validating our core principle: FFN carries knowledge, Attention carries reasoning.

📊 Public release: 10 days → 300+ community derivatives, 120K+ downloads.

🔗 Links:
Darwin-27B-Opus: FINAL-Bench/Darwin-27B-Opus
article: https://huggingface.co/blog/FINAL-Bench/darwin-gpqa
Darwin Family Collection: https://huggingface.co/collections/FINAL-Bench/darwin-family

If foundation models are raw ore, Darwin is the forge. We are just getting started. 🔥

reacted to prithivMLmods's post with ❤️ 3 months ago

Post

6257

A new comparator on Spaces showcases Standard FLUX.2 Decoder vs. FLUX.2 Small Decoder. The Small Decoder is ~1.4× faster, uses ~1.4× less VRAM, and maintains near-identical image quality. It has ~28M parameters with narrower channels [96, 192, 384, 384] vs. [128, 256, 512, 512], and the demo supports sequence generation by running both decoders simultaneously and comparing the results side by side.

🤗 Comparator: https://huggingface.co/spaces/prithivMLmods/Flux.2-4B-Decoder-Comparator
🔗 FLUX.2-small-decoder: black-forest-labs/FLUX.2-small-decoder
🔗 GitHub: https://github.com/PRITHIVSAKTHIUR/Flux.2-4B-Encoder-Comparator
🚁 Collection: https://huggingface.co/collections/prithivMLmods/image-generation-apps-collection

🤗 > App built on the Gradio SDK. To learn more, visit the app page or the respective model pages.

posted an update 3 months ago

Post

6211

Want to share my enthusiasm for zai-org/GLM-5.1 here too 🔥

I think we have it: our open source Claude Code = GLM-5.1 + Pi (https://pi.dev/) - Built a Three.js racing game to eval and it's extremely impressive. Thoughts:

- One-shot car physics with real drift mechanics (this is hard)

- My fav part: awesome at self-iterating (with no vision!). It created 20+ Bun.WebView debugging tools to drive the car programmatically and read game state, and proved a winding bug with vector math without ever seeing the screen

- 531-line racing AI in a single write: 4 personalities, curvature map, racing lines, tactical drifting. Built telemetry tools to compare player vs AI speed curves and data-tuned parameters

- All assets from scratch: 3D models, procedural textures, sky shader, engine sounds, spatial AI audio!

- Can do hard math: proved road normals pointed DOWN via vector cross products, computed track curvature normalized by arc length to tune AI cornering speed
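For anyone wondering what checking normals with a cross product looks like, here is a minimal sketch. It is not the model's actual debugging code: it assumes a y-up coordinate system (as in Three.js), and the vectors and helper names are made up for illustration. Swapping the order of the two edge vectors flips the normal, which is how a winding bug shows up.

```python
def cross(a, b):
    """Cross product of two 3D vectors given as (x, y, z) tuples."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)


def faces_up(normal):
    """In a y-up world, a road surface normal should have a positive y component."""
    return normal[1] > 0


# Hypothetical flat road patch: one edge along the track, one edge across it.
tangent = (1.0, 0.0, 0.0)
side = (0.0, 0.0, 1.0)

normal_bad = cross(tangent, side)   # wrong winding: y component is negative
normal_good = cross(side, tangent)  # swapped operands: y component is positive
```

Because the check is pure arithmetic on vertex data, it can be run from logs or telemetry without rendering a frame.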

You are going to hear about this model a lot in the coming months - open source let's go - and thanks z-ai 🚀🚀

5 replies

reacted to Juanxi's post with 🔥 3 months ago

Post

4449

📢 Awesome Multimodal Modeling

We introduce Awesome Multimodal Modeling, a curated repository tracing the architectural evolution of multimodal intelligence—from foundational fusion to native omni-models.

🔹 Taxonomy & Evolution:

Traditional Multimodal Learning – Foundational work on representation, fusion, and alignment.
Multimodal LLMs (MLLMs) – Architectures connecting vision encoders to LLMs for understanding.
Unified Multimodal Models (UMMs) – Models unifying Understanding + Generation via Diffusion, Autoregressive, or Hybrid paradigms.
Native Multimodal Models (NMMs) – Models trained from scratch on all modalities; contrasts early vs. late fusion under scaling laws.

💡 Key Distinction:
UMMs unify tasks via generation heads; NMMs enforce interleaving through joint pre-training.

🔗 Explore & Contribute: https://github.com/OpenEnvision/Awesome-Multimodal-Modeling

3 replies

reacted to qgallouedec's post with 🔥 4 months ago

Post

2487

TRL v1.0 is out!

Hugging Face's TRL library is downloaded 3 million times a month. Over 130k models trained with it are public on the Hub, and major projects like @unsloth and @axolotl-ai-co build directly on top of it. v1.0 is the moment we acknowledged that responsibility explicitly, with a real stability contract.

The field hasn't settled. Building stable software in a domain that keeps invalidating its own assumptions is the actual problem we're solving. The answer is a design that can absorb the next shift without breaking what people rely on.

What's in v1.0:
Deep Hugging Face integration, low infrastructure burden
What's next: asynchronous GRPO, better scaling support, and making training legible enough that agents can inspect and steer it.

pip install --upgrade trl
