Source: https://front-end.social/@fox/110846484782705013

Text in the screenshot from Grammarly says:

We develop data sets to train our algorithms so that we can improve the services we provide to customers like you. We have devoted significant time and resources to developing methods to ensure that these data sets are anonymized and de-identified.

To develop these data sets, we sample snippets of text at random, disassociate them from a user’s account, and then use a variety of different methods to strip the text of identifying information (such as identifiers, contact details, addresses, etc.). Only then do we use the snippets to train our algorithms - and the original text is deleted. In other words, we don’t store any text in a manner that can be associated with your account or used to identify you or anyone else.

We currently offer a feature that permits customers to opt out of this use for Grammarly Business teams of 500 users or more. Please let me know if you might be interested in a license of this size, and I’ll forward your request to the corresponding team.

    • Frog-Brawler@kbin.social
      1 year ago

      I appreciate you spreading open source alternatives, but this is one of those things that needs an HR solution; not IT.

  • Michael@lemmy.perthchat.org
    1 year ago

    Yeah, Grammarly was selling all your data LONG before the AI showed up.

    Funny how some people are only nervous now that their data might be used to train a language model. I was always more worried about spooks! :)

    • Poggervania@kbin.social
      1 year ago

      Companies selling consumer data for profit and marketeering: i sleep

      Companies using consumer data to train AI models:
      R E A L S H I T

    • Jaded@lemmy.dbzer0.com
      1 year ago

      It’s because certain companies are stirring the pot and manipulating. They want people mad so they can put restrictions on training AI, to stifle the open source scene.

  • Crul@lemm.ee
    1 year ago

    I see you posted this article to 4 communities. According to the comments on this post, if you use the cross-post function (in the default web frontend), it will show only once in the feeds instead of 4 times (which can be a bit annoying).

    Thanks

    EDIT: post link and clarification regarding the UI

  • library_napper@monyet.cc
    1 year ago

    How much do you have to pay for them to not monitor your every keystroke, including all your IP and passwords?

    Oh, that’s their business model, right.

  • fiat_lux@kbin.social
    1 year ago

    Even as someone who declines all cookies where possible on every site, I have to ask: how do you think they are going to be able to improve their language-based services without using large language models or other algorithmic evaluation of user data?

    I get that the combo of AI and privacy has huge consequences, and that Grammarly’s opt-out limits are genuinely shit. But it seems like everyone is so scared of the concept of AI that we’re harming research on tools that can help us, while the tools that hurt us are developed with no consequence because they don’t bother with any transparency or announcement.

    Not that I’m any fan of Grammarly; I don’t use it. I think that might be self-evident though.

    • harmonea@kbin.social
      1 year ago

      Framing this solely as fear is extremely disingenuous. Speaking only for myself: I’m not against the development of AI or LLMs in general. I’m against the trained models being used for profit with no credit or cut given to the humans who trained them, willing or unwilling.

      It’s not even a matter of “if you aren’t the paying customer, you’re the product” - massive swaths of text used to train AIs were scraped without permission from sources whose platforms never sought to profit from users’ submissions, like AO3. Until this is righted (which is likely never, I admit, because the LLM owners have no incentive whatsoever to change this behavior), I refuse to work with any site that intends to use my work to train LLMs.

      • Jaded@lemmy.dbzer0.com
        1 year ago

        Models need vast amounts of data. Paying individual users isn’t feasible, and like you said, most of it can be scraped.

        The only way I see this working is if scraped content is a no-go and you then pay the website, publishing house, record company, etc., which kills any open source solution and doesn’t really help any of the users or creators that much. It also paves the way for certain companies to own a lot of our economy as we move towards an AI-driven society.

        It’s definitely a hot mess but the way I see it, the more restrictive we are with it, the more gross monopolies we create for no real gains.

        • Laticauda@lemmy.ca
          1 year ago

          I mean, they’re not even giving credit or asking permission, both of which cost nothing. Make a site where people can volunteer their own work. Program the AI to generate a list of citations of all the works it used data from when it provides output (I know that this might be lengthy, that’s fine). If you implement it into any sites or software, make it so that people can opt out of having their data used, etc. It’s not that hard.

          • Jaded@lemmy.dbzer0.com
            1 year ago

            Most of the data is scraped, so it’s not up to the website. You can’t give a list of citations because it isn’t a search engine: it doesn’t know where the information comes from, and it’s highly transformative, melding information from hundreds if not thousands of different sources.

            If it worked only with volunteered work, there simply would not be enough data.

            Any law restricting data use in AI is only going to benefit corporations; there isn’t a solution for individual content creators. You can’t pay them for the drop in the bucket they add; the logistics are insane. You can let them opt out, but then you need to do the same for whole websites, which leads to a corporate hellscape where three companies own our whole economy because they are the only ones who can train AIs.