Artificial Intelligence at the crossroads between law and technology: an interview with Enrico Grandi, AI developer


1. Protection of works generated by Artificial Intelligence

In copyright law, there is debate about whether works generated through generative AI can be protected, and about the ownership of such creations. A work of art is protectable only if it is “original,” i.e., if it expresses the author’s personality through free and creative choices. Taking photography as an example, the author must be able to make his or her own choices regarding composition, focus, lighting, post-production, and so on.

Today, we can generally conclude that while case law sometimes considers AI-assisted works original, provided there is evidence of the author’s intervention beyond the mere “prompts,” it denies protection to works autonomously generated by AI, where the author only provides a “prompt” to the machine. This position rests on the argument that “prompts” would not be specific and predictable instructions, nor creative enough to express the personality of the author.

In light of your experience as a programmer of machine learning and artificial intelligence software, what is your opinion on the alleged lack of creativity in “prompts,” the absence of expression of the user’s personality, and the unpredictability of the result? Is human intervention truly so limited and inconsequential?

My experience in developing software based on machine learning and AI technologies, particularly advanced language models, has led me to recognize the crucial importance and significant impact that crafting effective prompts has on the final output of AI models. This phenomenon, known as “prompt engineering,” is emerging as a specific field of study, aiming to optimize prompts to achieve better and more relevant results.

“Prompts” can vary widely in terms of detail and creativity. For example, a generic prompt like “create an image of a landscape” may produce a largely unpredictable and not particularly personalized result. In contrast, a detailed and well-conceived prompt, specifying the time of day, environmental features, and desired atmosphere, demonstrates how human intervention can be profoundly creative and reflect the user’s personality. This type of interaction suggests that the humanity and personality of the author can be conveyed through prompts, significantly influencing the generative output of AI.
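By way of illustration, here is a minimal Python sketch (using the OpenAI Python client; the model name, image size, and both prompts are assumptions made for this example, not something discussed in the interview) that contrasts the generic landscape prompt with a detailed one in which the time of day, environment, and atmosphere are the user’s own choices:

```python
# pip install openai
from openai import OpenAI

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

# A generic prompt: the model fills in almost every creative choice by itself.
generic = client.images.generate(
    model="dall-e-3",  # assumed model name for this example
    prompt="create an image of a landscape",
    size="1024x1024",
)

# A detailed prompt: time of day, environment, and atmosphere are the user's choices.
detailed = client.images.generate(
    model="dall-e-3",
    prompt=(
        "A mountain lake at dawn, thin mist over the water, "
        "pine forest on the left bank, warm low sunlight from the east, "
        "quiet, melancholic atmosphere, muted cold color palette"
    ),
    size="1024x1024",
)

print(generic.data[0].url)
print(detailed.data[0].url)
```

In the second call, virtually every visual decision originates with the user; the model executes them, much as a camera executes a photographer’s settings.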

The very nature of generative AI entails an element of unpredictability; providing the same prompt at different times or to different models can lead to variable results. This feature can be interpreted both as a limitation and as a source of creativity, as the variety of results can surprise and exceed initial expectations, leading to unique creations that reflect a dialogue between human intent and AI processing capabilities.
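To make the source of this variability concrete, the following self-contained sketch simulates the sampling step of a language model. The vocabulary and the raw scores (“logits”) are invented for illustration; the point is that even with identical scores for the same prompt, temperature-based random sampling yields different outputs on different runs:

```python
import numpy as np

rng = np.random.default_rng()  # no fixed seed: every run can differ, like repeated API calls

# Invented next-token scores ("logits") that a model might assign after one prompt.
vocab = ["mountain", "sea", "forest", "desert", "city"]
logits = np.array([2.1, 1.9, 1.7, 0.4, 0.2])

def sample_next_word(temperature: float) -> str:
    """Apply softmax with temperature, then draw one token at random.

    This stochastic draw is the technical reason the same prompt can
    produce different outputs at different times."""
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())  # numerically stable softmax
    probs /= probs.sum()
    return str(rng.choice(vocab, p=probs))

for temperature in (0.2, 1.0):
    draws = [sample_next_word(temperature) for _ in range(5)]
    print(f"temperature={temperature}: {draws}")
# Low temperature almost always picks "mountain"; higher temperature spreads
# the probability mass, so repeated runs make visibly different choices.
```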

The creativity manifested in works generated by AI can thus be seen as a reflection of human creativity, which is intrinsic to the training data and to the instructions provided through prompts. This raises fundamental questions about the nature of creativity and originality. AI models, while capable of producing new works, are based on existing data, much as human creative processes often reinterpret and rework already known ideas.

The quote attributed to Picasso, echoed by Steve Jobs, “Good artists copy, great artists steal,” effectively illustrates this concept. Picasso referred to the adoption and adaptation of existing ideas to create something new and revolutionary, a common practice in many forms of creative expression, including technology. This principle is also applicable in the context of AI: AI models “steal” ideas from the vast data they are trained on to create something that may seem new and original. The challenge and opportunity for humans lie in knowing how to guide this process, integrating and adapting AI outputs in ways that reflect creative and original intentions.

In conclusion, human intervention in the creative process through AI goes far beyond simply formulating prompts. The specificity and creativity of prompts, and the ability to interact with the generated results, can indeed reflect the creative and personal choices of the user. Therefore, the question of protecting works generated through AI should consider the depth of human interaction in the creative process. In light of this, it is essential that, in the future, case law adequately recognizes and evaluates the human contribution, including innovative prompt use and the critical selection of results, as distinctive elements of originality and creativity in AI-generated works.

2. “Certification” of the creative process of works generated through Artificial Intelligence

In light of these considerations, should the courts’ position change, the author will need to be able to track the “prompts” and the dialogue with the machine in order to demonstrate originality.

I ask you, then: how can one archive one’s “dialogue” with artificial intelligence with a certain date?

Archiving one’s “dialogue” with AI with a certain date is an important issue for ensuring the attribution and intellectual property of creations generated with artificial intelligence systems. Although the technology to certify the veracity and dating of such dialogues is available, its actual use depends on a series of factors, including regulation and the accessibility of such tools to the various users.

Large companies offering artificial intelligence services certainly have the technical capabilities to implement data certification systems. In this context, the introduction of regulations explicitly recognizing companies as certifying entities for data generated through their services could be useful. This approach could facilitate the protection of creations generated with AI assistance, providing a clear and recognized mechanism to attest to their origin and the time of their creation.

For smaller AI models that may not have the support of large companies, the situation is more complex. In these cases, the development and implementation of dedicated public services for certifying dialogues with AI could offer a solution. Such services could act as neutral intermediaries, providing certifications of a certain date accessible to all users, regardless of the scale of their AI usage. In this sense, decentralized technologies, such as blockchain, represent a promising opportunity to address this challenge. The use of blockchain to record dialogues with AI could offer a secure and transparent method for certifying their date without the need for a central certifying authority.
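As a concrete illustration (a minimal sketch using only the Python standard library; the dialogue content is invented), the snippet below reduces a saved conversation to a SHA-256 digest. It is this short fingerprint, rather than the text itself, that could be anchored on a blockchain or submitted to a timestamping service, so the dialogue stays private while its existence at a given date remains provable:

```python
import hashlib
import json
from datetime import datetime, timezone

# The dialogue to be certified: prompts and responses, in order.
dialogue = [
    {"role": "user", "content": "A mountain lake at dawn, thin mist over the water..."},
    {"role": "assistant", "content": "<generated text or image reference here>"},
]

# Canonical serialization: the same dialogue must always produce the same bytes.
payload = json.dumps(dialogue, sort_keys=True, ensure_ascii=False).encode("utf-8")

digest = hashlib.sha256(payload).hexdigest()
record = {
    "sha256": digest,
    "local_timestamp": datetime.now(timezone.utc).isoformat(),  # indicative only
}
print(record)
# Only `digest` needs to be anchored on-chain or certified by a third party.
# Anyone holding the original JSON can later recompute the hash and prove
# that the dialogue existed, unchanged, at the certified date.
```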

3. Training of Artificial Intelligence, Copyright Infringement, and the “Data Mining” Exception

For copyright infringement to occur, the work must be reproduced, i.e., fixed on a physical or digital medium. In the first overseas cases concerning copyright infringement during the training of generative AI, one of the defenses raised by big tech is that AI training is akin to human learning based on observation. Consequently, during training, no digital copies of works or databases would be made, as the software would be able to extract data from works or databases without making digital copies, not even partial ones.

How accurate are these claims, and what is your opinion on them? And how easy is it to verify the training sources in the absence of a “disclosure” obligation?

From a technical point of view, it is correct to say that the neural networks within AI models do not contain direct digital copies of works. Instead, they process and store abstract features and patterns detected in the training data, transforming them into a series of mathematical parameters and weights within the network. This process allows models to generate new and unique outputs that, while inspired by the original inputs, are not direct copies of them.
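A toy example, written for this answer rather than taken from any real system, makes the point tangible: the “model” below is trained on one hundred data points, yet all it retains afterwards are two numbers, a weight and a bias. Real neural networks behave the same way, only with billions of such parameters instead of two:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Training data": 100 noisy samples of the relationship y = 3x + 1.
x = rng.uniform(-1, 1, size=100)
y = 3 * x + 1 + rng.normal(scale=0.1, size=100)

# Gradient descent on a two-parameter model: prediction = w * x + b.
w, b = 0.0, 0.0
for _ in range(2000):
    pred = w * x + b
    grad_w = 2 * np.mean((pred - y) * x)  # gradient of mean squared error w.r.t. w
    grad_b = 2 * np.mean(pred - y)        # gradient of mean squared error w.r.t. b
    w -= 0.1 * grad_w
    b -= 0.1 * grad_b

print(f"learned parameters: w={w:.3f}, b={b:.3f}")
# The trained "model" is only (w, b): the 100 training points are not stored
# anywhere in it, even though every one of them shaped these two values.
```

This is the sense in which a model contains patterns rather than copies; but, as noted above, it does not change the fact that the original works had to be accessed and processed to produce those parameters.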

However, to “learn” in this way, it is undeniable that the original data must have been accessible to the AI system in some form. This means that, at some point, programmers and companies had to upload and use these works within their systems for training. This process raises significant legal and ethical questions regarding copyright and the use of protected works without the consent of rights holders.

Verifying the sources used for training AI models in the absence of a “disclosure” obligation is extremely difficult. Without mandatory transparency, there is no straightforward way to determine exactly which data were used to train a specific model. This opacity further complicates the issue of copyright responsibility in the era of AI.

In this regard, in Europe there is discussion about the possibility of applying the “data mining” exception provided by Directive (EU) 2019/790 to generative AI. However, this exception does not apply if the copyright holder has exercised their right to “opt out,” i.e., has expressed their dissent “appropriately, for example through tools that allow automated reading in the case of publicly available content.”

What is your advice for authors who wish to express, through automated tools, their dissent to the use of a specific work? Is it possible to imagine a sort of invisible “watermark,” consisting of metadata, to be integrated into every piece of content published online?

Under the AI Act, the publication of the data used for training is mandatory for certain AI models. At the same time, there is an emerging need to protect the rights of those who do not want their works used to train these technologies. Managing this dynamic could be relatively simple for large-scale models developed by major companies, which are subject to greater scrutiny by the authorities. However, the situation becomes more complicated for smaller models, developed by small companies or independent programmers, where ensuring respect for these rights could be more challenging. In this context, the responsibility and ethics of those who produce AI models become crucial. Education and awareness about respecting copyright should be promoted within the AI developer community.

Regarding the communication of dissent to the use of specific works, authors could consider integrating metadata, a kind of invisible “watermark,” into every piece of content published online. This metadata could contain clear information about the conditions of use of the work, including an explicit prohibition on training AI models without the author’s consent. Implementing such metadata would require standardization and support from online platforms and publishing tools, so that the metadata is recognized and respected by AI systems during the data collection phase. Furthermore, it would be desirable to develop technologies and protocols that allow automatic verification of compliance with the conditions imposed by the authors, facilitating the task of the authorities in monitoring data usage.
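To sketch what such machine-readable metadata could look like in practice (the key names “ai-training” and “rights”, like the file names, are invented for this example, since no universally adopted standard exists yet), the snippet below embeds a usage policy into a PNG file using the Pillow library:

```python
# pip install Pillow
from PIL import Image
from PIL.PngImagePlugin import PngInfo

img = Image.open("artwork.png")  # hypothetical input file

meta = PngInfo()
# Invented key/value convention: a crawler that honors it would skip this file.
meta.add_text("ai-training", "disallowed")
meta.add_text("rights", "All rights reserved. Text and data mining opted out "
                        "under Art. 4 of Directive (EU) 2019/790.")

img.save("artwork_tagged.png", pnginfo=meta)

# Verification: reopen the file and read back the embedded conditions of use.
tagged = Image.open("artwork_tagged.png")
print(tagged.text)  # PNG text chunks are exposed via the .text property
```

Of course, a crawler honors such a tag only if it chooses to read and respect the convention, which is precisely why the standardization and verification mechanisms mentioned above are essential.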

4. Civil Liability of Artificial Intelligence

Speaking of liability, can we speak of the “decision-making autonomy” of AI? How much does the programmer influence the output and the decisions made by AI, for example in generative AI or autonomous driving?

Reflecting on the possibility of attributing decision-making autonomy to artificial intelligence (AI), and on responsibility for the actions taken by these models, raises fundamental issues in the field of technological ethics and regulation.

Currently, AI systems, even the most advanced ones, do not possess autonomy in the human sense. These systems operate by following algorithms and processing data according to parameters and logic defined by their developers. What is sometimes perceived as “autonomy” is actually the result of complex mathematical processing, capable of simulating aspects of human logic or behavior but without consciousness, intentionality, or authentic emotions.

The question of responsibility in the use of AI is particularly complex. The difficulty of interpreting the decision-making processes of AI algorithms, especially in large neural networks, makes it hard to attribute responsibility clearly and unambiguously. This uncertainty extends to all of the actors involved: developers, technology providers, and end users. Developers and programmers, through their technical choices and the criteria used to train the models, significantly influence the behavior of the algorithms. However, the limited interpretability of these systems can make it complex, if not impossible, to predict or explain every possible response of the AI to situations not encountered during the training phase.

Furthermore, the issue of responsibility becomes even more complicated when AI is deployed in critical applications, such as autonomous driving systems, where incorrect decisions can have direct consequences for people’s safety. In these contexts, determining who is responsible – the algorithm’s creator, the vehicle manufacturer, the end user, or a combination of these – is a challenge that requires careful legal and ethical examination.

In this landscape, it is essential to avoid anthropomorphizing AI in order to maintain a clear understanding of its capabilities and limitations. Attributing human qualities such as awareness or intentionality to AI can lead to unrealistic expectations or erroneous evaluations of its functioning and reliability. The need for clear regulation and ethical guidelines is therefore evident, to ensure that the use of AI is safe, ethical, and legally responsible. Such regulation should take into account the technical complexity of AI systems and the entire chain of responsibility, from conception to final use. Addressing these issues is essential to harnessing the benefits of AI while minimizing the risks associated with its implementation in society.