Building on HF
Michael Anthony
PRO
MikeDoes
124 followers · 55 following
http://www.ai4privacy.com
MikeDoesDo
MikeDoes
AI & ML interests
Privacy, Large Language Model, Explainable
Recent Activity
posted an update about 6 hours ago
PII leakage isn't just a model problem; it's a data problem.

A recent paper takes a hard look at how well current systems actually detect and redact personal data at scale. One of their key conclusions is something the privacy community keeps rediscovering: without large, structured, and diverse PII datasets, evaluation collapses into guesswork.

To ground their experiments, the authors benchmarked their approach using the 500K PII-Masking dataset from AI4Privacy, leveraging its scale and coverage to test real-world redaction behavior rather than toy examples.

What's interesting here isn't just the model performance but what the evaluation reveals. The paper shows that many systems appear robust under narrow tests but fail once PII appears in varied formats, contexts, and combinations. This gap between "works in theory" and "works in practice" is exactly where privacy risks emerge.

This is the value of open, research-grade datasets:
- They expose failure modes early
- They make comparisons reproducible
- They let the community measure progress honestly

When researchers build on shared data foundations, everyone benefits, from academic insight to safer downstream applications.

🔗 Read the full paper here: https://arxiv.org/abs/2407.08792
liked a dataset 2 days ago: ai4privacy/openpii-masking-mini-10k
updated a dataset 6 days ago: ai4privacy/pwi-masking-100k
Organizations
MikeDoes's models (21)
MikeDoes/mmbert-multilingual-20250916-212213 • 0.1B • Updated Sep 16, 2025 • 6
MikeDoes/mmbert-multilingual-20250916-202535 • Updated Sep 16, 2025
MikeDoes/mmbert-multilingual-20250916-170430 • 0.1B • Updated Sep 16, 2025 • 3
MikeDoes/mmbert-multilingual-20250916-173350 • 0.3B • Updated Sep 16, 2025 • 2
MikeDoes/mmbert-multilingual-20250916-170450 • Updated Sep 16, 2025
MikeDoes/mmbert-multilingual-20250916-155621 • 0.3B • Updated Sep 16, 2025 • 4
MikeDoes/mmbert-multilingual-20250916-155528 • Fill-Mask • 0.1B • Updated Sep 16, 2025 • 2
MikeDoes/mmbert-multilingual-20250916-145114 • 0.3B • Updated Sep 16, 2025 • 1
MikeDoes/mmbert-multilingual-20250916-143043 • Updated Sep 16, 2025
MikeDoes/mmbert-multilingual-20250916-133611 • 0.3B • Updated Sep 16, 2025 • 1
MikeDoes/mmbert-multilingual-20250916-130537 • Fill-Mask • 0.3B • Updated Sep 16, 2025 • 10
MikeDoes/mmbert-multilingual-20250916-120850 • Fill-Mask • 0.3B • Updated Sep 16, 2025 • 1
MikeDoes/mmbert-multilingual-20250916-114740 • Fill-Mask • 0.3B • Updated Sep 16, 2025 • 3
MikeDoes/mmbert-multilingual-20250916-103748 • Fill-Mask • 0.3B • Updated Sep 16, 2025 • 1
MikeDoes/modernbert-english-ner-20250808-034913 • Token Classification • 0.1B • Updated Aug 8, 2025
MikeDoes/modernbert-english-ner-20250806-110517 • 0.1B • Updated Aug 6, 2025 • 1
MikeDoes/quick-ner-model-20250726-011948 • Token Classification • 0.1B • Updated Jul 27, 2025
MikeDoes/eurobert-ner-model-20250726-134739 • Token Classification • 0.2B • Updated Jul 27, 2025 • 1
MikeDoes/eurobert-ner-model-20250726-082438 • Updated Jul 26, 2025
MikeDoes/quick-ner-model-20250726-004735 • Updated Jul 25, 2025
MikeDoes/test_night • Updated Jul 25, 2025