In its submission to the Australian government’s review of the regulatory framework around AI, Google said that copyright law should be altered to allow for generative AI systems to scrape the internet.

  • db0@lemmy.dbzer0.com
    arrow-up
    4
    ·
    1 year ago

    I agree with Google, only I go a step further and say any AI model trained on public data should likewise be public for all and have its data sources public as well. Can’t have it both ways, Google.

    • Domi@lemmy.secnd.me
      arrow-up
      1
      ·
      1 year ago

      To be fair, Google releases a lot of models as open source: https://huggingface.co/google

      Using public content to create public models is also fine in my book.

      But since it’s Google I’m also sure they are doing a lot of shady stuff behind closed doors.

  • andresil@lemm.ee
    arrow-up
    3
    ·
    edit-2
    1 year ago

    Copyright law is gaslighting at this point. Piracy being extremely illegal but then this kind of shit being allowed by default is insane.

    We really are living under the boot of the ruling classes.

  • nightmaaaare@lemmy.one
    arrow-up
    3
    ·
    1 year ago

    Personally I’d rather stop posting creative endeavours entirely than simply let them be stolen and regurgitated by every single company that’s built a thing on the internet.

    • Roundcat@kbin.social
      arrow-up
      1
      ·
      1 year ago

      I just take comfort in the fact that my art will never be good enough for a generative AI to steal.

      • Dizzy Devil Ducky@lemm.ee
        English
        arrow-up
        1
        ·
        1 year ago

        If it’s on any major platform, these companies will probably still use it. If they were allowed to scrape the whole internet, I doubt they’d have any human looking over the art used.

        It’ll just be thrown in with everything else similar to how I always seem to find paper towels in the dryer after doing laundry.

    • AbsolutelyNotABot@feddit.it
      arrow-up
      1
      ·
      1 year ago

      I think the topic is more complex than that.

      Otherwise you could say you’d rather stop posting creative endeavours entirely than simply let them be stolen and regurgitated by every single artist who uses the internet for references and inspiration.

      The argument “but companies do so for profit” doesn’t settle it either, because many artists do the same: they may be designers, illustrators or something else, and your work will give them ideas for their commissions.

        • AbsolutelyNotABot@feddit.it
          arrow-up
          2
          ·
          1 year ago

          let people reuse each other’s melodies

          I think this is an interesting example, because it’s already like this. Songs reusing other sampled songs are released all the time, and it’s all perfectly legal. Only making a copy is illegal. No one can sue you if you create a character that resembles Mickey Mouse, but you can’t use Mickey Mouse.

          And pharmaceutical patents serve the same purpose: they encourage companies to publicly release papers, data and synthesis methods so that other people can learn and research can move faster.

          And the whole point of this is exactly to regulate AI like people. No one will come after you because you’ve read something and now have an opinion about it, and nobody will get angry if you saw an Instagram post and now have some ideas for your art.

          Of course the distinction between likeness and copy is not that well defined, but that’s part of the whole debacle.

  • FaceDeer@kbin.social
    arrow-up
    3
    ·
    1 year ago

    Copyright law already allows generative AI systems to scrape the internet. You need to change the law to forbid something, it isn’t forbidden by default. Currently, if something is published publicly then it can be read and learned from by anyone (or anything) that can see it. Copyright law only prevents making copies of it, which a large language model does not do when trained on it.

    • maynarkh@feddit.nl
      arrow-up
      3
      ·
      1 year ago

      A lot of licensing prevents or constrains creating derivative works and monetizing them. The question is for example if you train an AI on GPL code, does the output of the model constitute a derivative work?

      If yes, GitHub Copilot is illegal, as it produces code that should comply with multiple conflicting license requirements. If no, I can write some simple AI that is “trained” to regurgitate its output on a prompt, and run a leaked copy of Windows through it, then go around selling Binbows and MSFT can’t do anything about it.

      The truth is mostly between the two. This is just piracy, which has always been a gray area because of the difficulty of prosecuting it: previously because the perpetrators were many and hard to find, now because the perpetrators are billion-dollar companies with expensive legal teams.

      • FaceDeer@kbin.social
        arrow-up
        3
        ·
        1 year ago

        The question is for example if you train an AI on GPL code, does the output of the model constitute a derivative work?

        This question is completely independent of whether the code was generated by an AI or a human. You compare code A with code B, and if the judge and jury agree that code A is a derivative work of code B then you win the case. If the two bodies of work don’t have sufficient similarities then they aren’t derivative.

        If no, I can write some simple AI that is “trained” to regurgitate its output on a prompt

        You’ve reinvented copy-and-paste, not an “AI.” AIs are deliberately designed to not copy-and-paste. What would be the point of one that did? Nobody wants that.

        Filtering the code through something you call an AI isn’t going to have any impact on whether you get sued. If the resulting code looks like copyrighted code, then you’re in trouble. If it doesn’t look like copyrighted code then you’re fine.
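        As a rough illustration of “compare code A with code B”, here is a minimal sketch using Python’s standard `difflib`. The function name `similarity` and the sample snippets are made up for this example, and a textual ratio like this is only a crude proxy: it is not the legal test a court would apply for a derivative work.

```python
from difflib import SequenceMatcher


def similarity(code_a: str, code_b: str) -> float:
    """Return a 0.0-1.0 textual similarity ratio between two code snippets.

    Hypothetical helper for illustration only; it measures matching lines,
    not legal "substantial similarity".
    """
    # Compare line by line, ignoring indentation and blank lines.
    lines_a = [ln.strip() for ln in code_a.splitlines() if ln.strip()]
    lines_b = [ln.strip() for ln in code_b.splitlines() if ln.strip()]
    return SequenceMatcher(None, lines_a, lines_b).ratio()


original = "def add(a, b):\n    return a + b\n"
reindented = "def add(a, b):\n        return a + b\n\n"
unrelated = "print('hello')\n"

print(similarity(original, reindented))  # same lines, different whitespace
print(similarity(original, unrelated))   # no lines in common
```

        Where to draw the line on a number like this is exactly the open question: the ratio says how much text overlaps, not whether the overlap matters.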

        • maynarkh@feddit.nl
          arrow-up
          1
          ·
          1 year ago

          AIs are deliberately designed to not copy-and-paste.

          AI is a marketing term, not a technical one. You can call anything “AI”, but it’s usually predictive models that get called that.

          AIs are deliberately designed to not copy-and-paste. What would be the point of one that did? Nobody wants that.

          For example, if the powers that be decided to say licenses don’t apply once you feed material through an “AI”, and failed to define AI, you could say you wrote this awesome OS using an AI that you trained exclusively using Microsoft proprietary code. Their licenses and copyright and stuff don’t apply to AI training data, so you could sell that new code your AI just created.

          It doesn’t even have to be 100% identical to Windows source code. What if it’s just 80%? 50%? 20%? 5%? Where is the bar where the author can claim “that’s my code!”?

          Just to compare, the guys who set out to reimplement Win32 APIs for use in Linux (the thing that made it into macOS as well now) deliberately would not accept help from anyone who ever saw any Microsoft source code for fear of being sued. The bar was that high when it was a small FOSS organization doing it. It was 0%, proven beyond a doubt.

          Now that Microsoft is the author, it’s not a problem when GitHub Copilot spits out GPL code word for word, ironically together with its license.

      • AbsolutelyNotABot@feddit.it
        arrow-up
        1
        ·
        1 year ago

        then go around selling Binbows and MSFT can’t do anything about it

        I think this already happens. A very practical example: the Windows GUI has been copied by many Linux distros. And with Windows 11 there’s clearly a reference to Apple’s macOS GUI with a sprinkling of Google’s Material Design.

        Should Apple and Google be able to sue Microsoft because it “copied” their work? Should Google be able to sue Apple because they “copied” the notification drop-down in iOS?

        As you say, it’s really a grey area, because the only reason we consider AI code to be “regurgitated” and human code to be “inspired” is that we give humans more recognition for their intellectual abilities.

          • nous@programming.dev
            English
            arrow-up
            1
            ·
            1 year ago

            Someone getting sued does not mean they are wrong or that they lost the case. Each case needs to look at the works in question and decide if that particular case violates copyright. Lots of things are taken into account here, and even if small elements might have been used or be similar, that does not automatically win the case.

            There is also a difference between some implementation and the overall feature in question. For instance, APIs are not copyrightable, nor are chords in music, nor what something does overall. Only specific implementations are copyrightable.

            The same can apply to AI: if it generates a work that would violate copyright had a human made it, then it violates copyright; if not, then it does not. But AI shows a different problem: that of scale. There is only a limited amount of work that a human can do. But an AI can produce vastly more content, enough that a case-by-case evaluation of infringement might not be viable. And if that becomes the case, then AI works might need to be treated differently from human-created works, or maybe the rules need to cover how the models are created and how they can use copyrighted works. The current laws were never designed with the speed at which AI can work in mind.

      • BlameThePeacock@lemmy.ca
        English
        arrow-up
        1
        ·
        1 year ago

        A human is a derivative work of its training data, thus a copyright violation if the training data is copyrighted.

        The difference between a human and AI is getting much smaller all the time. The training process is essentially the same at this point: show them a bunch of examples, then have them practice and provide feedback.

        If that human is trained to draw on Disney art, then goes on to create similar-style art for sale, that isn’t a copyright infringement. Nor should it be.

        • Phanatik@kbin.social
          arrow-up
          0
          ·
          1 year ago

          This is stupid and I’ll tell you why.
          As humans, we have a perception filter. This filter is unique to every individual because it’s fed by our experiences and emotions. Artists make great use of this by producing art which leverages their view of the world, it’s why Van Gogh or Picasso is interesting because they had a unique view of the world that is shown through their work.
          These bots do not have perception filters. They’re designed to break down whatever they’re trained on into numbers and decipher how the style is constructed so they can replicate it. They have no intention or purpose behind any of their decisions beyond straight replication.
          You would be correct if a human’s only goal was to replicate Van Gogh’s style but that’s not every artist. With these art bots, that’s the only goal that they will ever have.

          I have to repeat this every time there’s a discussion on LLMs or art bots:
          The imitation of intelligence does not equate to actual intelligence.

          • frog 🐸@beehaw.org
            arrow-up
            1
            ·
            1 year ago

            Absolutely agreed! I think if the proponents of AI artwork actually had any knowledge of art history, they’d understand that humans don’t just iterate the same ideas over and over again. Van Gogh, Picasso, and many others, did work that was genuinely unique and not just a derivative of what had come before, because they brought more to the process than just looking at other artworks.

        • 50gp@kbin.social
          arrow-up
          0
          ·
          1 year ago

          A human does not copy previous work exactly the way these algorithms do. What’s this shit take?

      • conciselyverbose@kbin.social
        arrow-up
        1
        ·
        1 year ago

        Derivative works are only copyright violations when they replicate substantial portions of the original without changes.

        The entirety of human civilization is derivative works. Derivative works aren’t infringement.

      • FaceDeer@kbin.social
        arrow-up
        1
        ·
        1 year ago

        It is not a derivative work; the model does not contain any recognizable part of the original material that it was trained on.

            • frog 🐸@beehaw.org
              arrow-up
              0
              ·
              1 year ago

              The point is that if the model doesn’t contain any recognisable parts of the original material it was trained on, how can it reproduce recognisable parts of the original material it was trained on?

              • ricecake@beehaw.org
                arrow-up
                0
                ·
                1 year ago

                That’s sorta the point of it.
                I can recreate the phrase “apple pie” in any number of styles and fonts using my hands and a writing tool. Would you say that I “contain” the phrase “apple pie”? Where is the letter ‘p’ in my brain?

                Specifically, the AI contains the relationship between sets of words, and sets of relationships between lines, contrasts and colors.
                From there, it knows how to take a set of words, and make an image that proportionally replicates those line pattern and color relationships.

                You can probably replicate the Getty Images watermark closely enough for it to be recognizable, but you don’t contain a copy of it in the sense that people typically mean.
                Likewise, because you can recognize the artist who produced a piece, you contain an awareness of that same relationship between color, contrast and line that the AI does. I could show you a Picasso you were unfamiliar with, and you’d likely know it was him based on the style.
                You’ve been “trained” on his works, so you have internalized many of the key markers of his style. That doesn’t mean you “contain” his works.

  • modulus@lemmy.ml
    arrow-up
    1
    ·
    1 year ago

    Worth considering that this is already the law in the EU. Specifically, the Directive (EU) 2019/790 of the European Parliament and of the Council of 17 April 2019 on copyright and related rights in the Digital Single Market has exceptions for text and data mining.

    Article 3 has a very broad exception for scientific research: “Member States shall provide for an exception to the rights provided for in Article 5(a) and Article 7(1) of Directive 96/9/EC, Article 2 of Directive 2001/29/EC, and Article 15(1) of this Directive for reproductions and extractions made by research organisations and cultural heritage institutions in order to carry out, for the purposes of scientific research, text and data mining of works or other subject matter to which they have lawful access.” There is no opt-out clause to this.

    Article 4 has a narrower exception for text and data mining in general: “Member States shall provide for an exception or limitation to the rights provided for in Article 5(a) and Article 7(1) of Directive 96/9/EC, Article 2 of Directive 2001/29/EC, Article 4(1)(a) and (b) of Directive 2009/24/EC and Article 15(1) of this Directive for reproductions and extractions of lawfully accessible works and other subject matter for the purposes of text and data mining.” This one’s narrower because it also provides that, “The exception or limitation provided for in paragraph 1 shall apply on condition that the use of works and other subject matter referred to in that paragraph has not been expressly reserved by their rightholders in an appropriate manner, such as machine-readable means in the case of content made publicly available online.”

    So, effectively, this means scientific research can data mine freely without rights holders being able to opt out, and other uses for data mining, such as commercial applications, can data mine provided there has not been an opt-out through machine-readable means.
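    To make “machine-readable means” concrete, here is a minimal sketch of how a crawler could check for an opt-out before mining a page. It assumes the reservation is expressed in `robots.txt`, which is only one possible mechanism: the Directive does not name a specific format. The crawler names `ExampleAIBot` and `ExampleSearchBot`, the URL, and the helper `may_mine` are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a rights holder might publish: one (made-up)
# AI crawler is refused everywhere, every other crawler is allowed.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_mine(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt does not reserve `url` against `user_agent`.

    Illustrative helper only; honouring robots.txt is an assumed convention
    here, not something the Directive itself prescribes.
    """
    parser = RobotFileParser()
    # Parse the text directly so the example needs no network access.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


page = "https://example.org/gallery/art.png"
print(may_mine(ROBOTS_TXT, "ExampleAIBot", page))
print(may_mine(ROBOTS_TXT, "ExampleSearchBot", page))
```

    Under this reading, the Article 4 exception would apply to the second crawler but not the first, while a research organisation relying on Article 3 would not be bound by the reservation at all.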

    • frog 🐸@beehaw.org
      arrow-up
      1
      ·
      1 year ago

      I think the key problem with a lot of the models right now is that they were developed for “research”, without the rights holders having the option to opt out when the models were switched to for-profit. The portfolio and gallery websites, from which the bulk of the artwork came, didn’t even have opt-out options until a couple of months ago. Artists were therefore considered to have opted in to their work being used commercially because they were never presented with the option to opt out.

      So at the bare minimum, a mechanism needs to be provided for retroactively removing works that would have been opted out of commercial usage if the option had been available and the rights holders had been informed about the commercial intentions of the project. I would favour a complete rebuild of the models that only draws from works that are either in the public domain or whose rights holders have explicitly opted in to their work being used for commercial models.

      Basically, you can’t deny rights holders the ability to opt out, and then say “hey, it’s not our fault that you didn’t opt out, now we can use your stuff to profit ourselves”.

  • YⓄ乙 @aussie.zone
    arrow-up
    1
    ·
    edit-2
    1 year ago

    Can we get some young politicians elected who have a degree in IT? Boomers don’t understand technology; that’s why these companies keep screwing the people.

  • Gutless2615@ttrpg.network
    English
    arrow-up
    1
    ·
    1 year ago

    It’s not turning copyright law on its head; in fact, asserting that copyright needs to be expanded to cover training on a data set IS turning it on its head. This is not a reproduction of the original work, it’s learning about that work and making a transformative use of it. An AI isn’t copying the original, it’s learning about the relationships that original has to the other pieces in the data set.

    • phillaholic@lemm.ee
      arrow-up
      0
      ·
      1 year ago

      The lines between learning and copying are being blurred with AI. Imagine if you could replay a movie any time you like in your head just from watching it once. Current copyright law wasn’t written with that in mind. It’s going to be interesting how this goes.

      • jarfil@beehaw.org
        arrow-up
        0
        ·
        edit-2
        1 year ago

        Imagine if you could replay a movie any time you like in your head just from watching it once.

        Two points:

        1. These AIs can’t do that; they need thousands or millions of repetitions to “learn” the movie, and every time they “replay” the movie it is different from the original.

        2. “learning by rote” is something fleshbags can do, and are actually required to by most education systems.

        So either humans have been breaking copyright all this time, or the machines aren’t breaking it either.

        • SokathHisEyesOpen@lemmy.ml
          arrow-up
          0
          ·
          1 year ago

          Well, fleshbags have to pay several years’ worth of salary to get their education, so by your comparison, Google’s AI should too.

          • MachineFab812@discuss.tchncs.de
            arrow-up
            0
            ·
            1 year ago

            Imagine thinking Public Education doesn’t count. Or that no one without a college degree ever invented anything useful. That’s before we get to your notion of “College SHOULD be expensive, for everyone, always”.

            The problem with education is NOT that some people pay less for theirs, or nothing at all, nor that some even have the audacity to learn quickly. AI could help everyone to have a chance to learn cheaply, even quickly.