Instructions to use COinCO/Context_Classification_Models with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use COinCO/Context_Classification_Models with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="COinCO/Context_Classification_Models")

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("COinCO/Context_Classification_Models", dtype="auto")

Notebooks
Google Colab
Kaggle
Local Apps

vLLM

How to use COinCO/Context_Classification_Models with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "COinCO/Context_Classification_Models"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "COinCO/Context_Classification_Models",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker

docker model run hf.co/COinCO/Context_Classification_Models

SGLang

How to use COinCO/Context_Classification_Models with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "COinCO/Context_Classification_Models" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "COinCO/Context_Classification_Models",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "COinCO/Context_Classification_Models" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "COinCO/Context_Classification_Models",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use COinCO/Context_Classification_Models with Docker Model Runner:
```
docker model run hf.co/COinCO/Context_Classification_Models
```

Context_Classification_Models / README.md

ruitongs

Upload README.md with huggingface_hub

69c10fa verified 3 months ago

preview code

raw

history blame contribute delete

3.33 kB

	---
	base_model: Qwen/Qwen2.5-VL-3B-Instruct
	library_name: transformers
	pipeline_tag: image-text-to-text
	tags:
	- qwen2.5-vl
	- lora
	- sft
	- context-classification
	- out-of-context-detection
	- coinco
	license: cc-by-4.0
	---

	# COinCO Context Classification Models

	Authors: Tianze Yang\, Tyson Jordan\, Ruitong Sun\*, Ninghao Liu, Jin Sun
	\*Equal contribution
	Affiliation: University of Georgia

	## Overview

	Fine-grained context classification models for detecting out-of-context objects in images. Each model is a fully merged Qwen2.5-VL-3B-Instruct fine-tuned via LoRA on the [COinCO dataset](https://huggingface.co/datasets/COinCO/COinCO-dataset).

	The models classify whether an object (marked by a red bounding box) is in-context or out-of-context based on three criteria:

	\| Model \| Criterion \| Description \|
	\|-------\|-----------\|-------------\|
	\| `co_occurrence/` \| Co-occurrence \| Whether the object can reasonably appear together with other objects in the scene \|
	\| `location/` \| Location \| Whether the object is placed in a physically and contextually reasonable position \|
	\| `size/` \| Size \| Whether the object's size is proportional and realistic relative to other objects \|

	## How to Use

	```python
	from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor
	import torch

	# Choose a model: "co_occurrence", "location", or "size"
	model_id = "COinCO/Context_Classification_Models"
	subfolder = "co_occurrence" # or "location" or "size"

	model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
	model_id,
	subfolder=subfolder,
	torch_dtype=torch.float16,
	device_map="auto",
	)
	processor = AutoProcessor.from_pretrained(model_id, subfolder=subfolder)
	```

	## Training Details

	- Base Model: [Qwen2.5-VL-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct)
	- Method: LoRA fine-tuning (merged into base model)
	- Dataset: [COinCO](https://huggingface.co/datasets/COinCO/COinCO-dataset) inpainted images with multi-model consensus labels
	- Training Data: ~5,000 samples per criterion from the training split
	- Epochs: 3
	- Learning Rate: 2e-4
	- LoRA Rank: See adapter config for details

	## Evaluation Results

	### Inpainted Test Set (binary classification: In-context vs Out-of-context)

	\| Criterion \| Baseline (Qwen2.5-VL-3B) \| Fine-tuned \| Improvement \|
	\|-----------\|--------------------------\|------------\|-------------\|
	\| Co-occurrence \| 75.54% \| 80.82% \| +5.28% \|
	\| Location \| 74.43% \| 71.05% \| -3.38% \|
	\| Size \| 50.21% \| 66.01% \| +15.80% \|

	### Real COCO Images (shortcut learning detection, higher = less shortcut reliance)

	\| Criterion \| Baseline \| Fine-tuned \| Improvement \|
	\|-----------\|----------\|------------\|-------------\|
	\| Co-occurrence \| 88.95% \| 87.00% \| -1.95% \|
	\| Location \| 47.55% \| 91.35% \| +43.80% \|
	\| Size \| 52.55% \| 83.20% \| +30.65% \|

	## Related Resources

	- Paper: "Common Inpainted Objects In-N-Out of Context"
	- Dataset: [COinCO/COinCO-dataset](https://huggingface.co/datasets/COinCO/COinCO-dataset)
	- Code: [YangTianze009/COinCO](https://github.com/YangTianze009/COinCO)

	## Citation

	```bibtex
	@article{yang2025coinco,
	title={Common Inpainted Objects In-N-Out of Context},
	author={Tianze Yang and Tyson Jordan and Ruitong Sun and Ninghao Liu and Jin Sun},
	year={2025}
	}
	```