Overview and perspective for artificial content detection solutions [1/2]

Written by Alexis Léautier


07 March 2025


Recent excitement surrounding generative artificial intelligence systems has been accompanied by the realisation that these tools could lead to new risks in terms of intellectual property, data protection, reputational damage and deception of various kinds. In this article, we analyse various solutions proposed for recognising artificial content and look ahead to the future of these techniques.

Contents:

- Overview and perspective for artificial content detection solutions [1/2]

- Digital watermarks: a salutary transparency measure? [2/2]

 

Digital watermarking

At first glance, adding a watermark is a well-known technique when it refers to overlaying a mark or text on a document, as in the image illustrating this article. In practice, however, the term covers different techniques depending on the purpose of the watermark and the object being watermarked. It can be a simple layer superimposed on an image, as in the illustration of this article, or an imperceptible mark resulting from more sophisticated processing. Documents, databases, AI models or even the outputs of these models can be watermarked, with the aim of identifying their origin, or of establishing that these objects have been used by, or come from, a generative AI. Initiatives from private players are also beginning to emerge, such as Google's SynthID or Microsoft's watermarking of content produced by Bing. All these techniques are covered in a separate article, which explains how they work and their advantages and disadvantages.
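As an illustration of the first, visible form of watermarking mentioned above, the minimal sketch below, written in Python with the Pillow library, superimposes a semi-transparent text layer on an image. The file names and the caption are purely illustrative, and imperceptible watermarks rely on far more sophisticated processing.

    # Minimal sketch: composite a semi-transparent caption over an image.
    # File names and caption text are illustrative.
    from PIL import Image, ImageDraw, ImageFont

    def add_visible_watermark(input_path, output_path, caption):
        base = Image.open(input_path).convert("RGBA")
        overlay = Image.new("RGBA", base.size, (0, 0, 0, 0))
        draw = ImageDraw.Draw(overlay)
        font = ImageFont.load_default()
        # Draw the caption in the lower-left corner, half transparent.
        draw.text((10, base.height - 30), caption, font=font, fill=(255, 255, 255, 128))
        Image.alpha_composite(base, overlay).convert("RGB").save(output_path)

    add_visible_watermark("generated.png", "generated_marked.png", "AI-generated content")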

In the case of artificial content detection, these methods are relatively effective for documents containing a large amount of information, such as videos, images or sound recordings. However, they are much less robust when it comes to detecting text, particularly if it is short.

 

Ex-post detection methods

Digital watermarking is one solution for detecting artificially generated text, but it has significant technical limitations and requires the cooperation of the designer of the generative AI. Alternatives do exist, however. With regard to the detection of artificial images and videos, certain techniques described in Verdoliva et al., 2020, in particular, seem promising. Among the techniques identified, some are based on the detection of residues left by the manipulation or generation of an image, others on the analysis of the entire image by a classifier, similar to the discriminator encountered in generative adversarial networks (GANs), and others on the analysis of facial features in the case of deepfakes. These methods rely as much on expert systems as on learning-based artificial intelligence techniques. As far as artificial text detection is concerned, however, the picture is more negative.
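By way of illustration, the first family of methods mentioned above, analysis of the residues left by generation or manipulation, can be sketched as follows in Python. The median filter and the downstream classifier are illustrative choices made here, not the exact method of the survey.

    # Minimal sketch of residue-based analysis: isolate the high-frequency
    # "noise residual" of an image, which a trained classifier would then
    # inspect for the statistical traces left by generation or manipulation.
    import numpy as np
    from scipy.ndimage import median_filter

    def noise_residual(image: np.ndarray) -> np.ndarray:
        """Difference between an image and a denoised version of itself."""
        pixels = image.astype(np.float64)
        return pixels - median_filter(pixels, size=3)

    # A real detector would train a classifier (CNN, SVM, ...) on such
    # residuals rather than on the raw pixels.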

Pegoraro et al. (2023) carried out a study of existing methods for detecting artificial text when no measures, such as watermarking, were taken at generation time. The study evaluates around twenty detection tools, including tools developed by OpenAI itself and by third parties, such as ZeroGPT. It reaches two main conclusions:

  • Existing methods are unable to detect artificial content robustly: the measured recall is at best close to 50%, which means that the tools miss artificial text roughly one time out of two (see the sketch after this list);
  • The methods evaluated cannot be generalised to all LLMs: since they rely on training against the output of one specific LLM, they fail to detect the output of other language models.
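To make the first figure concrete, here is a minimal Python sketch of the recall computation on made-up labels: with a recall of 50%, half of the genuinely artificial texts go undetected.

    # Illustrative recall computation on made-up labels: recall is the share
    # of artificial texts that the detector actually flags as artificial.
    def recall(true_labels, predicted_labels, positive="artificial"):
        true_positives = sum(1 for t, p in zip(true_labels, predicted_labels)
                             if t == positive and p == positive)
        actual_positives = sum(1 for t in true_labels if t == positive)
        return true_positives / actual_positives

    y_true = ["artificial", "artificial", "artificial", "artificial", "human", "human"]
    y_pred = ["artificial", "human", "artificial", "human", "human", "human"]
    print(recall(y_true, y_pred))  # 0.5: half of the artificial texts are missed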

The study also assumes that no changes have been made to the generated text, which makes the task easier but seems unrealistic in a context where someone wants to use artificial text maliciously. In addition, these tools have other shortcomings. In particular, a recently observed flaw is discrimination against people writing in a language other than their mother tongue: text they have written themselves is more frequently identified as artificial, according to an article in The Guardian.

 

Recommendations from international institutions for detecting artificial content

Although there is no legal vacuum surrounding generative AI, as explained in the article "What regulations are needed for the design of generative AI", France does not yet have any specific provisions for this category of system. Similarly, at the international level, while most of the principles and recommendations on artificial intelligence produced by international forums do mention the need to guarantee the transparency of AI systems, these texts remain technologically neutral and do not recommend any particular technology to meet this objective.

The OECD recommendation on AI thus asks AI players to commit to transparency and responsible disclosure of information related to AI systems, and in particular to informing stakeholders of their interactions with AI systems (which can be interpreted as covering chatbot-type uses, as also proposed by the CNPEN in its opinion on the subject), but it says nothing about how to implement this recommendation in practice.

The UNESCO recommendation on the ethics of AI also requires AI players to inform users when a product or service is provided directly or via an AI system.

These recommendations, adopted in 2019 and 2021 respectively, do not seem to measure up to the challenges posed by generative AI in terms of content manipulation, as these systems have only recently been made available to the public and integrated into numerous applications. The definition of AI proposed by the OECD does not include generative AI (the definition is currently being revised). The Council of Europe's Convention on Artificial Intelligence, which is currently being negotiated, takes generative AI into account, but its article on the transparency of AI systems does not recommend any particular technology.

The G7 digital ministers are currently attempting to draw up a code of conduct for developers of advanced AI systems, as part of the Hiroshima process on generative AI. One of the guiding principles expected to form part of this future code of conduct is the development and deployment of mechanisms, such as digital watermarking, and other techniques enabling end users to identify content generated by AI systems.

Last July, the White House obtained a series of voluntary commitments from some of the leading companies in the field of artificial intelligence - Amazon, Anthropic, Google, Inflection, Meta, Microsoft, and OpenAI - aimed at ensuring the development of safe and transparent artificial intelligence systems. Under these commitments, the companies are to develop technical mechanisms, such as digital watermarking, enabling end users to know when content has been produced by generative AI.

It is also interesting to note that the provisional measures for the management of generative artificial intelligence services recently adopted by China require the labelling of images, videos and other content generated by AI, in accordance with a regulation governing deepfakes/deep synthesis that came into force at the beginning of 2023 (administrative provisions on deep synthesis in Internet information services, approved on 25 November 2022). China is the first country to have legally regulated deep synthesis, defined as technologies that use generative algorithms, such as deep learning and virtual reality, to create text, images, audio, video, virtual scenes or other information. Deep synthesis service providers must therefore use effective technical marking measures that do not affect the use of the content generated by users of the service. The following services, as they are likely to cause confusion or misunderstanding on the part of the public, must in addition be visibly labelled by the provider:

  • Intelligent dialogue, intelligent writing and other services that simulate natural persons to generate or modify text;
  • Editing services that generate speech, such as synthesised or imitated human voices, or that significantly modify personal identity characteristics;
  • Editing services such as face generation, face replacement or manipulation, posture manipulation, and other services that generate images or videos of people or significantly modify their personal identity characteristics;
  • Services for generating or editing immersive simulation scenes;
  • Other services whose function is to generate or significantly modify information content.

For all other services, the provider must include a function that allows the generated content to be visibly labelled and must invite users of the service to use it. In addition, it is forbidden for any organisation or individual to use technical means to remove these labels.

 

What does the future hold for digital watermarking?

Given this panorama, digital watermarking appears to be the most robust solution for detecting artificial content. However, certain risks associated with its use remain; they can sometimes be addressed by specific measures for clearly identified uses, while for other uses digital watermarking may still lack maturity. Here are some thoughts on the future of this technique.

Protection of personal data when it may be used to train AI models

Such protection seems possible insofar as personal data could themselves be watermarked. This practice, which hosting providers (such as social networks or database providers) seem best placed to implement in practice, would make it possible to ensure the traceability of users' personal data and to prevent it from being collected by scraping and then reused to train models without any possibility of proving this use after the fact. It could complement legal measures (licences, terms of use) and other technical measures, such as configuration files telling AI crawlers whether or not they may use certain data, as illustrated below.
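As an example of the configuration files mentioned above, a website can signal to AI crawlers that its pages should not be collected. The robots.txt directives below use the crawler names published by OpenAI (GPTBot) and Google (Google-Extended); compliance with them remains voluntary on the crawlers' side.

    User-agent: GPTBot
    Disallow: /

    User-agent: Google-Extended
    Disallow: /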

Transparency of artificial content

By watermarking artificial content, providers of generative AI would ensure that this content cannot be used in a misleading context. Hosts and re-users of the models could in turn implement the necessary measures to ensure that the watermark is present in the generated content and brought to the attention of end users, in particular through processes that make it visible when it is not, and through measures informing the public about the techniques used. To date, however, watermarking remains an ineffective measure for artificial text.

Damage to a person's reputation caused by artificial content

The risk of generative AI producing inaccurate content about people is now well established, whether it is an error, commonly referred to as a "hallucination", or a deliberate inaccuracy, as in the case of deepfakes, for example with pornographic or political content. In this context, digital watermarking would make it possible to prove the artificial nature of the content, thereby offering people a means of recourse. Here again, however, the methods for watermarking artificial text may not yet be sufficiently mature.

Harmonising practices

There are currently few initiatives to standardise digital watermarking. In order to harmonise practices and make them more widespread, an inventory of best practices seems crucial. To this end, the development of standards for the techniques mentioned above would be particularly beneficial. The standardisation of the format of artificially generated content that such standards would allow could make digital watermarks easier to read and promote the implementation of harmonised interpretation tools, particularly by website publishers. Control over the watermark detection function could also be handed over to a trusted entity responsible for it, to prevent it from being disclosed and used to alter the watermarks. This is not yet a trend, but providers of generative AI, in conjunction with website publishers, could see it as a way of complying with the regulation of such content.



Article written by Alexis Léautier, Expert Engineer