  1. GitHub - huggingface/datasets: The largest hub of ready-to-use ...

    🤗 Datasets is a lightweight library providing two main features: one-line dataloaders for many public datasets: one-liners to download and pre-process any of the major public datasets (image datasets, …

  2. datasets · GitHub Topics · GitHub

    Dec 29, 2025 · GitHub is where people build software. More than 150 million people use GitHub to discover, fork, and contribute to over 420 million projects.

  3. GitHub - ncbi/datasets: NCBI Datasets is a new resource that lets you ...

    NCBI Datasets is a new resource that lets you easily gather data from across NCBI databases. - ncbi/datasets

  4. GitHub - datasets/commons: DataHub commons. Wiki catalog of …

    DataHub commons. Wiki catalog of interesting and important datasets - datasets/commons

  5. GitHub - luminati-io/Free-datasets: A collection of multiple free ...

    This repository contains a collection of free datasets with thousands of records for use in data analysis, machine learning, and research. The datasets span multiple domains, from business to social media …

  6. A collection of datasets originally distributed in R packages

    Rdatasets is a collection of 3499 datasets which were originally distributed alongside the statistical software environment R and some of its add-on packages. The goal is to make these data more …

  7. TensorFlow Datasets - GitHub

    TFDS is a collection of datasets ready to use with TensorFlow, Jax, ... - tensorflow/datasets

  8. Releases · huggingface/datasets - GitHub

    🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools - huggingface/datasets

  9. Google Research Datasets - GitHub

    Datasets released by Google Research. Google Research Datasets has 172 repositories available. Follow their code on GitHub.

  10. GitHub - allenai/olmocr: Toolkit for linearizing PDFs for LLM datasets ...

    Toolkit for linearizing PDFs for LLM datasets/training. Apache-2.0 license.