HuggingFace community-driven open-source library of datasets
🤗 Datasets is a lightweight library providing two main features:
- one-line dataloaders for many public datasets: one-liners to download and pre-process any of the major public datasets (image datasets, audio datasets, text datasets in 467 languages and dialects, etc.) provided on the HuggingFace Datasets Hub. A simple command like squad_dataset = load_dataset("rajpurkar/squad") gets any of these datasets ready to use in a dataloader for training or evaluating an ML model (Numpy/Pandas/PyTorch/TensorFlow/JAX);
- efficient data pre-processing: simple, fast and reproducible data pre-processing for the public datasets as well as your own local datasets in CSV, JSON, text, PNG, JPEG, WAV, MP3, Parquet, etc. Simple commands like processed_dataset = dataset.map(process_example) efficiently prepare the dataset for inspection and for ML model evaluation and training.
🤗 Datasets is designed to let the community easily
- RPM
- python3-datasets-3.6.0-1.lbn42.noarch.rpm
- Summary
- HuggingFace community-driven open-source library of datasets
- URL
- https://pypi.org/project/datasets
- Group
- Unspecified
- License
- ZPL
- Source
-
python-datasets-3.6.0-1.lbn42.src.rpm
- Checksum
- 8f1447615a7b4cbe33f32a8944d1d2415a403fb1255a1e468f17c9fc8fae5010
- Signing Signature
- RSA/SHA512, Sun 05 Apr 2026 07:37:36 PM AEST, Key ID d760880122ab8392
- Build Date
- 2025/09/13 15:20:34
- Requires
-
/usr/bin/python3
python3.13dist(fsspec) >= 2023.1
python3.13dist(fsspec[http]) >= 2023.1
python3.13dist(huggingface-hub) >= 0.24
python3.13dist(numpy) >= 1.17
python3.13dist(pyarrow) >= 15
python3.13dist(pyyaml) >= 5.1
python3.13dist(requests) >= 2.32.2
python3.13dist(tqdm) >= 4.66.3
- Provides
-
python-datasets = 3.6.0-1.lbn42
python3-datasets = 3.6.0-1.lbn42
python3.13-datasets = 3.6.0-1.lbn42
python3.13dist(datasets) = 3.6
python3dist(datasets) = 3.6
- Obsoletes