A user on the online forum 4chan has leaked a massive 270GB of data purportedly belonging to The New York Times. This leak includes what is claimed to be the source code for the newspaper’s digital operations.

  • 🇦🇺𝕄𝕦𝕟𝕥𝕖𝕕𝕔𝕣𝕠𝕔𝕕𝕚𝕝𝕖@lemm.ee
    English · 6 upvotes · 16 downvotes · edited · 5 months ago

    That's a lot of data, but surely it's not all their articles, cos I'd very much like to train Mixtral 8x7B on it along with 4chan data and shit from the dark web. Surely there is a project where such a model is public and being trained on literally everything regardless of legality.

    EDIT: Why am I getting downvoted?