Aleph-Alpha

Path:/datasets/ai/aleph-alpha
URL:https://huggingface.co/Aleph-Alpha
Downloaded:10-06-2025
Cite:Burns, Thomas F., et al. “Aleph-Alpha-GermanWeb: Improving German-language LLM pre-training with model-based data curation and synthetic data generation.” arXiv preprint arXiv:2505.00022 (2025).
Variant:
    Bibtex:
    @article{burns2025aleph, title={Aleph-Alpha-GermanWeb: Improving German-language LLM pre-training with model-based data curation and synthetic data generation}, author={Burns, Thomas F and Parcalabescu, Letitia and W{\"a}ldchen, Stephan and Barlow, Michael and Ziegltrum, Gregor and Stampa, Volker and Harren, Bastian and Deiseroth, Bj{\"o}rn}, journal={arXiv preprint arXiv:2505.00022}, year={2025} }