Computer Vision & Multimedia Datasets

May 11, 2025

These repositories specialize in images, video, annotations, and multimodal resources. They are well suited to training models for object detection, segmentation, captioning, and scene understanding.

Tools:

Visual Genome – Image dataset with region descriptions, object relationships, and attributes for scene understanding. Commonly used for visual question answering (VQA) and other multimodal AI tasks.
Million Song Dataset – A collection of audio features and metadata for a million contemporary popular music tracks, used for recommendation systems, beat analysis, and music intelligence research.
Tabby Vision Dataset – Offers labeled images for vision research, especially in medical or environmental contexts.
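
For anyone getting started with region-level annotations like the ones Visual Genome provides, here is a minimal sketch of flattening them into (image, phrase, box) records. The field names (`regions`, `image_id`, `phrase`, `x`, `y`, `width`, `height`) are an assumption modelled on the commonly distributed region descriptions file, and the sample records are made up for illustration, so check them against the file you actually download.

```python
import json
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Region:
    image_id: int
    phrase: str
    box: Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def parse_regions(raw_json: str) -> List[Region]:
    """Flatten a list of per-image records into one list of regions.

    Assumed layout (hypothetical, verify against your download):
    [{"id": <image id>, "regions": [{"image_id", "phrase", "x", "y", "width", "height"}, ...]}, ...]
    """
    regions = []
    for image in json.loads(raw_json):
        for r in image.get("regions", []):
            regions.append(
                Region(
                    image_id=r["image_id"],
                    phrase=r["phrase"].strip(),
                    box=(r["x"], r["y"], r["width"], r["height"]),
                )
            )
    return regions


def phrases_for_image(regions: List[Region], image_id: int) -> List[str]:
    """Return the region phrases for one image, in file order."""
    return [r.phrase for r in regions if r.image_id == image_id]


# Made-up sample data in the assumed layout, only for demonstration.
SAMPLE = json.dumps([
    {"id": 1, "regions": [
        {"image_id": 1, "phrase": "a dog on the grass ", "x": 10, "y": 20, "width": 100, "height": 80},
        {"image_id": 1, "phrase": "a red frisbee", "x": 60, "y": 15, "width": 30, "height": 30},
    ]},
    {"id": 2, "regions": [
        {"image_id": 2, "phrase": "a parked bicycle", "x": 5, "y": 5, "width": 200, "height": 150},
    ]},
])

if __name__ == "__main__":
    for region in parse_regions(SAMPLE):
        print(region.image_id, region.phrase, region.box)
```

The same flat records can then be paired with the image files for captioning or grounding experiments; to use a real file, read it with `open(...)` and pass its contents to `parse_regions`.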
