• hanni@lemmy.one
    English
    arrow-up
    23
    ·
    1 year ago

    No doubt they will train it on private messages, if they haven’t already.

  • karpintero@lemmy.world
    English
    arrow-up
    15
    ·
    edit-2
    1 year ago

    Not surprising. Everyone should be wary of what they share online and not expect that companies will respect their privacy. That said, I wish it weren’t so, and training AI with our data seems dystopian.

  • AutoTL;DR@lemmings.world
    English
    arrow-up
    2
    ·
    1 year ago

    This is the best summary I could come up with:


    MENLO PARK, California, Sept 28 (Reuters) - Meta Platforms (META.O) used public Facebook and Instagram posts to train parts of its new Meta AI virtual assistant, but excluded private posts shared only with family and friends in an effort to respect consumers’ privacy, the company’s top policy executive told Reuters in an interview.

    “We’ve tried to exclude datasets that have a heavy preponderance of personal information,” Clegg said, adding that the “vast majority” of the data used by Meta for training was publicly available.

    The product will be able to generate text, audio and imagery and will have access to real-time information via a partnership with Microsoft’s (MSFT.O) Bing search engine.

    Those posts were used to train Emu for the image generation elements of the product, while the chat functions were based on Llama 2 with some publicly available and annotated datasets added, a Meta spokesperson told Reuters.

    Some companies with image-generation tools facilitate the reproduction of iconic characters like Mickey Mouse, while others have paid for the materials or deliberately avoided including them in training data.

    OpenAI, for instance, signed a six-year deal with content provider Shutterstock this summer to use the company’s image, video and music libraries for training.


    The original article contains 603 words, the summary contains 201 words. Saved 67%. I’m a bot and I’m open source!