• NevermindNoMind@kbin.social
    1 year ago

    The problem with this theory is that they could have done two-tiered pricing. Reddit could have charged third-party app (TPA) developers one price and the LLM trainers a much higher price for API access. In fact, I believe that is exactly what Reddit is doing; they just haven’t been public about what they are trying to charge the LLMs. When The Verge asked Spez whether the LLM folks are biting on this and what that price would be, he just responded that they are “in talks.”

    If Reddit didn’t want to kill TPAs, they also could have given them a year or so to figure out their business models, rather than the 30 days they were given. Hell, Reddit could have backed down at any point and extended the time period for implementing charges.

    If Spez thinks he’s going to make money off LLMs, I think he’s delusional. The OpenAIs, Googles, and Metas out there have already used the Reddit data to train their models. That ship has sailed. The focus in the LLM world now is making better models, more compact models, refining their answers and making them more accurate, etc. The days of throwing vast amounts of random data at these models are probably over. For GPT-5, OpenAI is probably not looking to spend 50 million on new Reddit comments. Instead they will spend that to hire experts to revise GPT-4’s outputs and use that as training data.

    • Icalasari@kbin.social
      1 year ago

      Plus, scraping exists. No need to pay for API access if they can scrape what is publicly available.