In its submission to the Australian government’s review of the regulatory framework around AI, Google said that copyright law should be altered to allow generative AI systems to scrape the internet.

  • FaceDeer@kbin.social
    1 year ago

    The question is for example if you train an AI on GPL code, does the output of the model constitute a derivative work?

    This question is completely independent of whether the code was generated by an AI or a human. You compare code A with code B, and if the judge and jury agree that code A is a derivative work of code B then you win the case. If the two bodies of work don’t have sufficient similarities then they aren’t derivative.

    If no, I can write some simple AI that is “trained” to regurgitate its output on a prompt

    You’ve reinvented copy-and-paste, not an “AI.” AIs are deliberately designed to not copy-and-paste. What would be the point of one that did? Nobody wants that.

    Filtering the code through something you call an AI isn’t going to have any impact on whether you get sued. If the resulting code looks like copyrighted code, then you’re in trouble. If it doesn’t look like copyrighted code then you’re fine.

    • maynarkh@feddit.nl
      1 year ago

      AIs are deliberately designed to not copy-and-paste.

      AI is a marketing term, not a technical one. You can call anything “AI”, but it’s usually predictive models that get called that.

      AIs are deliberately designed to not copy-and-paste. What would be the point of one that did? Nobody wants that.

      For example, if the powers that be decided that licenses don’t apply once you feed material through an “AI”, and failed to define AI, you could say you wrote this awesome OS using an AI that you trained exclusively on Microsoft proprietary code. Their licenses and copyright wouldn’t apply to AI training data, so you could sell the new code your AI just created.

      It doesn’t even have to be 100% identical to Windows source code. What if it’s just 80%? 50%? 20%? 5%? Where is the bar where the author can claim “that’s my code!”?
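
      To make those percentages concrete, here is a rough sketch of how you could put a number on "how much of A shows up in B", assuming a plain line-by-line text comparison with Python's standard difflib module. The function name similarity_percent and the sample snippets are made up for illustration, and a textual score like this is not a legal test for derivative work; it only shows that any such bar would be an arbitrary cutoff on a sliding scale.

```python
import difflib


def similarity_percent(code_a: str, code_b: str) -> float:
    """Return a rough 0-100 textual similarity score for two code snippets.

    Hypothetical helper for illustration only; not a legal measure.
    """
    # Compare line by line, ignoring indentation and blank lines.
    lines_a = [line.strip() for line in code_a.splitlines() if line.strip()]
    lines_b = [line.strip() for line in code_b.splitlines() if line.strip()]
    matcher = difflib.SequenceMatcher(None, lines_a, lines_b, autojunk=False)
    # ratio() is 2 * matching_lines / total_lines, a value in [0, 1].
    return 100.0 * matcher.ratio()


original = "int a = 1;\nint b = 2;\nreturn a + b;\nfoo();"
generated = "int a = 1;\nint b = 2;\nreturn a - b;\nbar();"

print(similarity_percent(original, original))
print(similarity_percent(original, generated))
```

      Renaming variables or reordering lines would change the score without changing what the code does, which is part of why a fixed percentage is hard to defend as the bar.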

      Just to compare, the people who set out to reimplement the Win32 APIs for use on Linux (the thing that has since made it onto macOS as well) deliberately would not accept help from anyone who had ever seen any Microsoft source code, for fear of being sued. The bar was that high when it was a small FOSS organization doing it. It was 0%, proven beyond a doubt.

      Now that Microsoft is the author, it’s not a problem when GitHub Copilot spits out GPL code word for word, ironically together with its license.