Shubhashis Roy Dipta PRO
dipta007
AI & ML interests
Multimodal Understanding, Reasoning, Generation
Recent Activity
new activity 6 days ago: dipta007/DecomposeRL: Correct source corpora (14 per paper) and label-balance wording
new activity 6 days ago: dipta007/decomposeRL-7b: Fix Highlights accuracy to match results table (macro 86.3, micro 84.4)
VC-Inspector (ACL 2026 Main)
- Advancing Reference-free Evaluation of Video Captions with Factual Analysis
  Paper • 2509.16538 • Published • 2
- dipta007/VCInspector-7B
  Image-Text-to-Text • 8B • Updated • 3 • 1
- dipta007/VCInspector-3B
  Image-Text-to-Text • 4B • Updated • 10 • 1
- dipta007/ActivityNet-FG-It
  Viewer • Updated • 242k • 121
BIRD-Synthetic
Synthetically rephrased version of the BIRD dataset
DAGGER
A graph-based CoT for math reasoning (DAGGER tokens <<<< CoT tokens)
Barta
GanitLLM (ACL 2026 Findings)
- GanitLLM: Difficulty-Aware Bengali Mathematical Reasoning through Curriculum-GRPO
  Paper • 2601.06767 • Published • 1
- dipta007/Ganit
  Viewer • Updated • 32.3k • 105
- dipta007/GanitLLM-4B_SFT_CGRPO
  Text Generation • 196k • Updated • 65
- dipta007/GanitLLM-4B_SFT_GRPO
  Text Generation • 196k • Updated • 5 • 1
Q2E (AACL 2025 Main)
Datasets used in the paper: Q2E: Query-to-Event Decomposition for Zero-Shot Multilingual Text-to-Video Retrieval
- Q2E: Query-to-Event Decomposition for Zero-Shot Multilingual Text-to-Video Retrieval
  Paper • 2506.10202 • Published
- dipta007/Q2E_MultiVENT_LLAMA_3.3_70B_InternVL_38B_Funiform_16_noASR
  Viewer • Updated • 2.39k • 6
- dipta007/Q2E_MultiVENT_LLAMA_3.3_70B_InternVL_38B_Funiform_16_ASR
  Viewer • Updated • 2.39k • 8
- dipta007/Q2E_MSRVTT-1kA_LLAMA_3.3_70B_InternVL_38B_Funiform_16_ASR
  Viewer • Updated • 1k • 6
DecomposeRL