With the recent announcement that Google is allegedly paying Reddit $60 million per year for access to user-generated content to train its AI, what is stopping companies from using the freely available information on the lemmyverse to do the same for free?

How likely does everyone think it is that this is already happening, and should something be done about it?

  • Oisteink@feddit.nl · 13 upvotes · 9 months ago

    They have to pay Reddit now that the API is gone. I’m quite certain that at least one of the companies scraping the web to train its LLM has been using it.

    And I’m quite certain that this happens to the fediverse as well. You don’t even need an API: just set up your own instance, make a few thousand accounts, and use them to subscribe to communities all over. Then you have all the data in a nice database.