May 11, 2025

These portals provide datasets used in peer-reviewed research, academic collaborations, and institutional projects. They are often high-quality, well-documented, and domain-specific, which makes them ideal for AI model evaluation, reproducibility studies, or cross-discipline insights.

Tools:

- Papers with Code – Datasets: Links benchmark datasets with research papers and model performance. Great for comparing SOTA across tasks.
- AllenAI Datasets: Curated by the Allen Institute for AI, includes NLP, vision, and reasoning datasets like SciQ and Aristo. Ideal for language understanding and commonsense reasoning tasks.
- IEEE DataPort: A platform for scientific datasets, competitions, and academic benchmarks. Supports fields like IoT, robotics, and telecommunications.
- Mendeley Data: A repository for scientific data linked to publications. Encourages reproducibility and collaboration across research fields.
- B2SHARE (EUDAT): European data infrastructure hosting cross-discipline research datasets. Includes geoscience, engineering, and climate data.
- DL ACM Dataset (MovieLens): MovieLens-based benchmark for collaborative filtering and recommender systems.