SlimPajama

Path:/datasets/ai/slim-pajama
URL:https://huggingface.co/datasets/cerebras/SlimPajama-627B
Downloaded:12-28-2024
Cite:Shen, Zhiqiang, et al. “SlimPajama-DC: Understanding data combinations for LLM training.” arXiv preprint arXiv:2309.10818 (2023).
Variant:
    Bibtex:
    @article{shen2023slimpajama,
      title={{SlimPajama-DC}: Understanding data combinations for {LLM} training},
      author={Shen, Zhiqiang and Tao, Tianhua and Ma, Liqun and Neiswanger, Willie and Liu, Zhengzhong and Wang, Hongyi and Tan, Bowen and Hestness, Joel and Vassilieva, Natalia and Soboleva, Daria and others},
      journal={arXiv preprint arXiv:2309.10818},
      year={2023}
    }