Deepfake Engines: A Technical Analysis
Deepfakes are rapidly evolving, posing unprecedented challenges to social trust and information security. Because the ability to prevent their proliferation hinges on a comprehensive understanding of the technology behind them, this article examines the generative engines that power deepfakes and the strategies available to counter them.
At the heart of deepfakes lie generative models, a type of AI capable of learning from vast datasets and generating realistic images, videos, and audio. In recent years, the field has shifted from Generative Adversarial Networks (GANs) to even more potent diffusion models. A technical analysis of these generative engines is therefore necessary to create a robust prevention framework.
Adversarial Games: Generative Adversarial Networks (GANs)
A GAN consists of two neural networks: a generator and a discriminator. The generator’s task is to create synthetic data that mimics real-world data. It starts with random input, often called a latent vector, and tries to transform this into a coherent output. The discriminator, on the other hand, acts as a classifier, evaluating the data to determine whether it is real (from a genuine training dataset) or fake (created by the generator).
The training process involves a continuous feedback loop between the two networks, akin to a zero-sum game. The generator creates a fake image and passes it to the discriminator, which also receives real images from the training set. The discriminator then predicts the authenticity of each image. If the discriminator correctly identifies the generator’s output as fake, it provides feedback. The generator uses this feedback via backpropagation to adjust its internal parameters, so it can produce a more convincing image during the next iteration. At the same time, the discriminator adjusts its own parameters to better spot fakes. This adversarial competition continues until the system reaches an equilibrium point, sometimes called the Nash equilibrium, where the generator’s output is so realistic that the discriminator can no longer reliably distinguish it from real data and guesses at roughly 50% accuracy.
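To make the adversarial loop concrete, below is a minimal, illustrative PyTorch sketch of one GAN training step on toy 2-D data rather than images; the network sizes, learning rates, and toy data distribution are arbitrary assumptions for illustration, not any specific published architecture.

```python
# Minimal GAN training sketch (toy 2-D data, not images).
import torch
import torch.nn as nn

latent_dim, data_dim = 8, 2

generator = nn.Sequential(
    nn.Linear(latent_dim, 64), nn.ReLU(),
    nn.Linear(64, data_dim),                      # maps latent vectors to fake samples
)
discriminator = nn.Sequential(
    nn.Linear(data_dim, 64), nn.ReLU(),
    nn.Linear(64, 1),                             # outputs a real/fake logit
)

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def training_step(real_batch):
    batch = real_batch.size(0)
    ones, zeros = torch.ones(batch, 1), torch.zeros(batch, 1)

    # Discriminator step: learn to label real samples 1 and fakes 0.
    z = torch.randn(batch, latent_dim)
    fake = generator(z).detach()                  # detach: do not update G on this step
    loss_d = bce(discriminator(real_batch), ones) + bce(discriminator(fake), zeros)
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # Generator step: try to make the discriminator label fakes as real (1).
    z = torch.randn(batch, latent_dim)
    loss_g = bce(discriminator(generator(z)), ones)
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()
    return loss_d.item(), loss_g.item()

# Example: a Gaussian blob stands in for the real training distribution.
for step in range(1000):
    real = torch.randn(128, data_dim) * 0.5 + torch.tensor([2.0, -1.0])
    training_step(real)
```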
GANs have proven to be effective at generating synthetic media and have formed the basis for many impactful deepfake models. Architectures such as Deep Convolutional GANs (DCGANs) introduced key improvements, replacing pooling layers with strided convolutions and using batch normalization to improve stability. NVIDIA’s StyleGAN and its successors StyleGAN2 and StyleGAN3 have achieved unprecedented levels of photorealism in facial generation by fixing feature artifacts and advancing the model architecture. Other variants like CycleGAN have enabled style transfer tasks and have been widely utilized in applications like FaceApp to change a person’s perceived age.
Despite their power, GANs are notoriously difficult to train. The delicate balance between the generator and discriminator is easily disrupted, leading to training instability, slow convergence, or a critical failure mode known as “mode collapse.” Mode collapse occurs when the generator finds a weakness in the discriminator and exploits it by only generating a limited variety of outputs that it knows can fool the discriminator, failing to capture the true diversity of the training data. These inherent challenges, and the subtle artifacts they often produce, became a primary target for early deepfake detection systems.
Reversing Chaos: Diffusion Models
The most recent technological innovation in generative AI has decisively shifted towards a new class of models: diffusion models. Inspired by non-equilibrium thermodynamics, diffusion models operate on fundamentally different principles than the adversarial competition of GANs. They are probabilistic generative models that can generate exceptionally high-quality and diverse data by learning to reverse a gradual corruption process.
The mechanism of diffusion models is a two-phase process:
Forward Diffusion Process: This stage systematically and incrementally adds small amounts of Gaussian noise to an image over a series of time steps (e.g., T steps). This is a Markov chain process, where each step is conditioned on the previous one, gradually degrading the image until, at the final time step T, it becomes indistinguishable from pure, unstructured noise.
Reverse Denoising Process: The core of the model is a neural network (often based on a U-Net architecture) that is trained to reverse this process. It learns to predict the noise that was added at each time step during the forward pass and subtract it. Once trained, the model can generate new, high-quality images by starting from a sample of random noise and iteratively applying this learned “denoising” function to step backward through the time steps, transforming chaos into a coherent sample from the original data distribution.
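The sketch below illustrates both phases under the standard DDPM formulation; it assumes a hypothetical trained noise-prediction network `eps_model(x_t, t)` and uses a simple linear noise schedule, so it is a conceptual outline rather than a complete, trained model.

```python
# Illustrative DDPM-style forward noising and reverse denoising (conceptual sketch).
import torch

T = 1000                                            # number of diffusion time steps
betas = torch.linspace(1e-4, 0.02, T)               # linear noise schedule (assumption)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)           # cumulative product, i.e. alpha-bar_t

def forward_diffuse(x0, t):
    """Closed-form forward process: sample x_t directly from x_0."""
    noise = torch.randn_like(x0)
    a_bar = alpha_bars[t]
    x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * noise
    return x_t, noise

@torch.no_grad()
def reverse_sample(eps_model, shape):
    """Start from pure noise and iteratively denoise back to a sample."""
    x = torch.randn(shape)                          # x_T ~ N(0, I)
    for t in reversed(range(T)):
        eps = eps_model(x, t)                       # predicted noise at step t (hypothetical net)
        coef = betas[t] / (1 - alpha_bars[t]).sqrt()
        mean = (x - coef * eps) / alphas[t].sqrt()
        if t > 0:
            x = mean + betas[t].sqrt() * torch.randn_like(x)
        else:
            x = mean                                # no noise is added at the final step
    return x
```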
This iterative refinement process enables diffusion models to achieve levels of photorealism and diversity that surpass even the best GANs. Their training process is also significantly more stable than that of GANs, avoiding issues such as mode collapse and resulting in more reliable and diverse outputs. This technological advantage has made diffusion models the foundation for the most prominent and powerful generative AI tools today, including text-to-image models like OpenAI’s DALL-E 2, Google’s Imagen, and Stability AI’s Stable Diffusion, as well as text-to-video models like OpenAI’s Sora. The widespread availability and exceptional output quality of these models have dramatically escalated the deepfake threat.
Methods of Manipulation
Whether based on GANs or diffusion models, the underlying generative engines are applied via several specific techniques to create deepfake videos. These methods manipulate various aspects of the target video to achieve the desired deceptive effects.
Re-enactment: This technique transfers facial expressions, head movements, and speech-related motions from a source actor onto a target subject in a video. The process typically involves three primary steps: first, tracking facial features in both the source and target videos; second, aligning these features with a generic 3D face model using a consistency metric; and third, transferring the expressions from source to target, followed by subsequent refinements to enhance realism and coherence.
Lip-Syncing: Lip-syncing deepfakes are specifically devoted to manipulating speech, primarily using audio input to generate realistic mouth movements. The audio is converted into dynamic mouth shapes and textures, which are meticulously matched and blended with the target video to create the illusion that the target person is speaking the input audio.
Text-Based Synthesis: This highly sophisticated approach modifies video based on a textual script. It works by decomposing the text into its constituent phonemes (units of sound) and visemes (the visual representation of speech sounds). These are then matched with corresponding sequences in the source video, and the parameters of a 3D head model are used to generate and smooth lip movements to match the new text, allowing one to edit the words a person appears to be saying, verbatim.
The technological evolution from GANs to diffusion models is more than an incremental improvement; it is a paradigm shift that fundamentally alters the strategic landscape for deepfake prevention. While powerful, GANs have known architectural weaknesses, such as training instability and mode collapse, which often result in predictable and detectable artifacts in the image’s frequency domain. Accordingly, an entire generation of detection tools was built specifically to identify these GAN-specific fingerprints. Diffusion models, by contrast, are more stable to train and generate more diverse, realistic, and statistically authentic outputs, lacking many of their predecessors’ obvious flaws.
As a result, a significant portion of the existing deepfake detection infrastructure is rapidly becoming obsolete. Studies demonstrate a “drastic performance drop” when detectors trained on GAN-generated imagery are applied to content from diffusion models. Notably, detectors trained on diffusion model imagery can successfully identify GAN-generated content, but not vice versa, suggesting that diffusion models represent a more sophisticated and challenging class of forgeries. Indeed, this has effectively reset the technological arms race, demanding a re-engineering of defensive strategies to address the unique and subtler characteristics of diffusion-generated media.
Furthermore, the “black box” nature of these generative models increases the complexity of source prevention efforts. Both GANs and diffusion models operate in an unsupervised or semi-supervised fashion, learning to mimic the statistical distribution of a dataset without explicit semantic labels. Rather than learning “what a face is” in a way that a human could understand, they learn “what pixel patterns are possible in a face dataset.” This makes it exceptionally difficult to program constraints directly into the generation process (e.g., “do not generate harmful images”). The model is simply optimizing a mathematical function: either fooling the discriminator or reversing the noise process. This means prevention cannot rely on internally policing the core algorithms. The most viable interventions must occur either before generation (by controlling training data) or after generation (through detection, watermarking, and provenance), as the act of creation itself is inherently resistant to direct governance.
Comparative Analysis of Generative Engines
Understanding the strategic differences between GANs and diffusion models is crucial for any stakeholder, from policymakers to corporate security officers. The shift in technological dominance from the former to the latter has profound impacts on detection difficulty, the potential for deception, and the overall threat landscape.
| Feature | Generative Adversarial Networks (GANs) | Diffusion Models | Strategic Implications |
|---|---|---|---|
| Core Mechanism | A generator and a discriminator compete in a zero-sum game. | A neural network learns to reverse a gradual “noising” process. | Diffusion’s iterative refinement yields higher fidelity and fewer structural errors. |
| Training Process | Notoriously unstable; prone to “mode collapse” and slow convergence. | Stable and reliable, but computationally intensive. | The lower barrier to high-quality results with diffusion models is democratizing the threat. |
| Output Quality | Capable of high-quality images, but may contain subtle artifacts. | Currently the highest levels of photorealism and diversity; often indistinguishable from real photos. | Forgeries become more convincing, eroding the “seeing is believing” heuristic and challenging human detection. |
| Detectability | Older detection methods are often tuned to find GAN-specific artifacts (e.g., frequency imbalances). | Fewer obvious artifacts and a closer statistical match to real data, rendering many GAN-based detectors obsolete. | The deepfake “arms race” has been reset; detection R&D must pivot to diffusion-specific signatures. |
| Notable Models | StyleGAN, CycleGAN | DALL-E, Stable Diffusion, Imagen, Sora | The most powerful and widely used tools are now diffusion-based, accelerating the threat. |
Digital Immune System: A Comparative Analysis of Detection Methods
In response to the surge in synthetic media, a diverse field of detection methods has emerged, forming a nascent “digital immune system.” These techniques range from forensic analysis of digital artifacts to novel approaches that probe for underlying biological signals. However, the effectiveness of this immune system is continuously challenged by the rapid evolution of generative models and the use of adversarial attacks designed to evade detection. The ongoing struggle between creation and detection is a “Red Queen” paradox, where defenders must continuously innovate simply to maintain the status quo.
Forensic Analysis of Digital Artifacts
The most established category of deepfake detection involves the forensic analysis of digital artifacts: the subtle defects and inconsistencies left behind by the generation process. These flaws tend to be imperceptible to the naked eye but can be identified by specialized algorithms.
Visual and Anatomical Inconsistencies: Early generative models, and even some current ones, struggle to perfectly replicate the complexities of human anatomy and real-world physics. Detection methods exploit these shortcomings by analyzing specific anomalies in the media. These include unnatural blink patterns, i.e., excessive, insufficient, or entirely absent blinking (often due to a lack of closed-eye images in the training data), robotic or inconsistent eye movements, and constrained lip or mouth shapes where the bottom teeth are never visible. Other indicators are the lack of subtle nostril dilation during speech, lighting and shading inconsistencies that do not match the surrounding environment, and erroneous or missing reflections on eyeglasses, or other reflective surfaces.
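As a concrete example of exploiting one such anomaly, a widely used heuristic for flagging unnatural blink patterns is the eye aspect ratio (EAR). The sketch below assumes six eye-landmark coordinates per frame are already available from some face-landmark detector; the 0.2 threshold and the reference blink rate are illustrative assumptions.

```python
# Blink-rate analysis via the eye aspect ratio (EAR) heuristic.
import numpy as np

def eye_aspect_ratio(eye):
    """eye: array of shape (6, 2) with landmarks p1..p6 around one eye."""
    p1, p2, p3, p4, p5, p6 = eye
    vertical = np.linalg.norm(p2 - p6) + np.linalg.norm(p3 - p5)
    horizontal = 2.0 * np.linalg.norm(p1 - p4)
    return vertical / horizontal               # small value => the eye is closed

def blink_rate(eye_landmarks_per_frame, fps, ear_threshold=0.2):
    """Count blinks as open-to-closed transitions where EAR falls below the threshold."""
    ears = np.array([eye_aspect_ratio(e) for e in eye_landmarks_per_frame])
    closed = ears < ear_threshold
    blinks = np.count_nonzero(closed[1:] & ~closed[:-1])
    minutes = len(ears) / fps / 60.0
    return blinks / minutes                    # humans typically blink roughly 15-20 times/min

# A video showing near-zero blinks per minute (or implausibly many) is suspicious.
```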
Pixel and Compression Analysis: These techniques work at a lower level, examining the digital structure of the image or video. Error Level Analysis (ELA) is a method for identifying regions in an image that have been compressed at different levels. Because manipulated regions are often resaved or recompressed, they may display different error levels than the original parts of the image, highlighting forgeries. Closely related to this is edge and blending analysis, which scrutinizes the boundaries and contours between synthetic elements (e.g., a swapped face) and the real background. These regions may expose manipulation through signs like inconsistent pixelation, unnatural sharpness or blurring, and subtle differences in color and texture.
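A minimal Error Level Analysis pass can be sketched with Pillow and NumPy as below; the resave quality of 90 and the brightness scaling are conventional but arbitrary choices, and a real forensic tool layers far more analysis on top of this.

```python
# Minimal Error Level Analysis (ELA): resave the image and inspect the difference.
import io
import numpy as np
from PIL import Image, ImageChops

def error_level_analysis(path, quality=90):
    original = Image.open(path).convert("RGB")

    # Recompress the image at a known JPEG quality.
    buffer = io.BytesIO()
    original.save(buffer, "JPEG", quality=quality)
    buffer.seek(0)
    resaved = Image.open(buffer)

    # Pixel-wise difference: regions compressed differently from the rest stand out.
    diff = np.asarray(ImageChops.difference(original, resaved), dtype=np.float32)
    scale = 255.0 / max(diff.max(), 1.0)       # amplify the (usually faint) differences
    return Image.fromarray((diff * scale).astype(np.uint8))

# Usage: error_level_analysis("suspect.jpg").save("suspect_ela.png")
# Bright, localized regions in the ELA map suggest areas that were edited and resaved.
```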
Frequency Domain Analysis: Rather than analyzing pixels directly, these methods transform the image into its frequency components, looking for unnatural patterns. Because GANs have a generator with upsampling architecture, they often leave behind characteristic spectral artifacts that create periodic patterns not present in real images. While effective for most GANs, this method is less successful against diffusion models, which generate images with a more natural frequency profile. However, some research suggests that diffusion models may still exhibit detectable mismatches in high-frequency details compared to real images, offering a potential avenue for detection.
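The radially averaged power spectrum below is one common way to look for such periodic upsampling artifacts; this is an illustrative NumPy sketch, and the decision rules a real detector would apply on top of it are not shown.

```python
# Radially averaged power spectrum: a common view for spotting GAN upsampling artifacts.
import numpy as np

def radial_power_spectrum(gray_image):
    """gray_image: 2-D float array. Returns power as a function of spatial frequency."""
    f = np.fft.fftshift(np.fft.fft2(gray_image))
    power = np.abs(f) ** 2

    # Distance of each frequency bin from the center of the shifted spectrum.
    h, w = power.shape
    y, x = np.indices((h, w))
    r = np.sqrt((y - h / 2) ** 2 + (x - w / 2) ** 2).astype(int)

    # Average power over rings of equal radius (i.e., equal spatial frequency).
    radial_sum = np.bincount(r.ravel(), weights=power.ravel())
    counts = np.bincount(r.ravel())
    return radial_sum / np.maximum(counts, 1)

# Real photos tend to show a smooth power fall-off toward high frequencies;
# periodic spikes or an unnaturally flat high-frequency tail can indicate synthesis.
```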
Biological Signal Analysis: The Deepfake “Heartbeat”
A newer, and highly promising, area within deepfake detection involves analyzing the presence of authentic biological signals within the media. The core premise is that while generative models are becoming increasingly adept at replicating visual appearances, they cannot simulate the underlying physiological processes of a living person.
The leading technique in this area is remote photoplethysmography (rPPG). This technique uses standard cameras to detect subtle, periodic changes in skin color that occur as the heart pumps blood into the superficial blood vessels of the face. In a real video of a person, this produces a faint but consistent pulse signal. In a deepfake, this signal is often absent, distorted, or inconsistent.
Detection typically involves several steps (a minimal signal-processing sketch follows the list):
Signal Extraction: rPPG signals are extracted from multiple regions of interest (ROIs) on the face in the video.
Signal Processing: The raw signals are cleaned of noise and then processed (typically using Fast Fourier Transform (FFT)) to analyze their time and spectral domain characteristics. The FFT can reveal the dominant frequencies in the signal, which correspond to the heart rate.
Classification: A classifier (e.g., a CNN) is trained to distinguish between the coherent, rhythmic patterns of a real heartbeat and the noisy, inconsistent, or absent signals found in deepfakes.
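The sketch below illustrates the core of the first two steps: it assumes a per-frame mean green-channel trace has already been extracted from a facial ROI, and estimates the dominant pulse frequency within a physiologically plausible band (the band limits and the "dominance" score are conventional, illustrative assumptions).

```python
# Estimate a pulse frequency from an rPPG trace (mean green-channel value per frame).
import numpy as np

def estimate_pulse_bpm(green_trace, fps, band=(0.7, 4.0)):
    """green_trace: 1-D array, one mean ROI value per frame; band in Hz (~42-240 BPM)."""
    signal = np.asarray(green_trace, dtype=np.float64)
    signal = signal - signal.mean()                      # remove the DC component

    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)

    # Restrict attention to physiologically plausible heart-rate frequencies.
    mask = (freqs >= band[0]) & (freqs <= band[1])
    if not mask.any():
        return None, 0.0
    peak = np.argmax(spectrum[mask])
    peak_freq = freqs[mask][peak]

    # A crude coherence score: how dominant the peak is within the band.
    dominance = spectrum[mask][peak] / spectrum[mask].sum()
    return peak_freq * 60.0, dominance                   # estimated BPM and peak dominance

# Weak or incoherent peaks (low dominance) across multiple facial ROIs are a deepfake
# indicator; a downstream classifier would consume such features rather than raw pixels.
```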
This methodology has achieved very high detection accuracy in controlled experimental settings, with some studies reporting accuracies as high as 99.22%. However, this method has a critical vulnerability. More advanced deepfake techniques, particularly those involving re-enactment, can inherit the physiological signal from a source or “driver” video. This means that the deepfake may exhibit a perfectly normal and consistent rPPG signal. It simply happens to be the heartbeat of the source actor, not the person depicted in the final video. This discovery challenges the simple assumption that deepfakes lack physiological signals and raises the bar for detection. Future methods must move beyond merely checking for the presence of a pulse and instead validate the physiological consistency and identity-specific characteristics of that signal.
The Detection Arms Race: Challenges from Diffusion Models and Adversarial Attacks
The field of deepfake detection is defined by a relentless arms race. Detection methods are continuously developed, and generative models continuously evolve to overcome them. The recent rise of diffusion models and the use of adversarial attacks are two of the most significant challenges facing modern detectors.
Failure to Generalize: A major weakness of many detection models is their inability to generalize. A detector trained to identify forgeries from a specific generative model (e.g., StyleGAN2) or on a specific dataset often fails when confronted with new manipulation techniques or different data domains. Diffusion models make this problem particularly acute. Because their outputs contain fewer obvious artifacts, are more diverse, and more closely match the statistical properties of real images, they can effectively evade detectors designed for GANs. To address this, researchers are developing new and more challenging benchmark datasets that incorporate state-of-the-art diffusion deepfakes to drive the creation of more robust and generalizable detectors.
Adversarial Attacks: Even highly accurate detectors are vulnerable to direct subversion via adversarial attacks. In this scenario, an attacker introduces tiny, imperceptible perturbations to the pixels of a deepfake image. While these changes are invisible to the human eye, they are specifically designed to exploit weaknesses in the detector’s neural network, causing it to misclassify the fake image as real. This threat exists in both “white box” settings (where the attacker has complete knowledge of the detector’s architecture) and the more realistic “black box” setting (where the attacker can only query the detector and observe its output).
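The canonical white-box example of such an attack is the fast gradient sign method (FGSM). The sketch below assumes a hypothetical PyTorch detector that outputs two-class (fake/real) logits; the perturbation budget `eps` is an illustrative value, and real attacks are typically iterative and stronger.

```python
# FGSM-style white-box evasion of a (hypothetical) deepfake detector.
import torch
import torch.nn.functional as F

def fgsm_evasion(detector, fake_image, eps=2.0 / 255.0):
    """fake_image: tensor of shape (1, C, H, W) in [0, 1]; class 0 = fake, 1 = real."""
    x = fake_image.clone().detach().requires_grad_(True)

    logits = detector(x)                                   # detector's fake/real logits
    target_fake = torch.tensor([0])                        # the image's true class: fake
    loss = F.cross_entropy(logits, target_fake)
    loss.backward()

    # Step *up* the loss gradient so the detector becomes less confident the image is fake.
    x_adv = x + eps * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()                  # keep pixels in a valid range
```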
In response, the research community is focusing on developing next-generation detectors with enhanced resiliency. Key strategies include:
Training Data Diversity: Augmenting training datasets to include a wide variety of forgeries from both GANs and diffusion models, as well as diverse image domains, has proven to increase generalizability.
Advanced Training Strategies: Novel techniques such as “momentum contrastive learning” are being explored to help the model train more effectively on heterogeneous datasets by weighting samples based on how difficult they are to classify.
Robust Architectures: New architectures are being designed to be inherently more resistant to attack. One promising approach is the use of disjoint ensembles, where multiple models are trained on different, non-overlapping subsets of the image’s frequency spectrum. This forces the attacker to find perturbations that can fool multiple models simultaneously, a far more difficult task. Other hybrid approaches fuse features from both the spatial and frequency domains to build a more comprehensive model of the data.
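To make the disjoint-ensemble idea concrete, the sketch below splits an image’s spectrum into non-overlapping radial frequency bands and hands each band-limited view to its own classifier; the number of bands and the classifier interface are illustrative assumptions, not a specific published architecture.

```python
# Disjoint frequency-band views for an ensemble of detectors (illustrative sketch).
import numpy as np

def frequency_band_views(gray_image, n_bands=3):
    """Return n_bands images, each containing a disjoint slice of the spectrum."""
    f = np.fft.fftshift(np.fft.fft2(gray_image))
    h, w = f.shape
    y, x = np.indices((h, w))
    radius = np.sqrt((y - h / 2) ** 2 + (x - w / 2) ** 2)
    edges = np.linspace(0.0, radius.max() + 1e-9, n_bands + 1)

    views = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (radius >= lo) & (radius < hi)            # non-overlapping ring of frequencies
        band = np.fft.ifft2(np.fft.ifftshift(f * mask)).real
        views.append(band)
    return views

def ensemble_predict(models, gray_image):
    """Each model scores only its own band; fooling the ensemble means fooling them all."""
    views = frequency_band_views(gray_image, len(models))
    scores = [m(view) for m, view in zip(models, views)]  # models: callables returning P(fake)
    return float(np.mean(scores))
```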
The constant back-and-forth between generative and detection techniques indicates that any static defense is doomed to obsolescence. As generative models evolve to eliminate signs like blink anomalies or GAN artifacts, detectors must shift to more subtle signals, such as high-frequency mismatches or rPPG signatures. In turn, generative models can be trained to mimic those signals, as seen in the inheritance of rPPG from source videos. This perpetual loop indicates that a prevention strategy that relies solely on reactive detection is engaged in an expensive and likely unwinnable arms race.
The most enduring detection strategies will likely be those that exploit fundamental gaps between digital simulations and physical reality. While visual artifacts are flaws in the simulation that can be progressively patched with better algorithms and more compute power, it is far more difficult for AI to model the emergent properties of biology and physics from first principles. A generative model does not “understand” the human cardiovascular system. It merely learns to replicate pixel patterns associated with a face. While it may be trained to mimic the visual results of a heartbeat, generating a physiologically consistent and accurate signal for a new identity from scratch would require modeling an entire biological system, a challenge of a higher order. Accordingly, the most robust detection research will focus on these “physics gaps,” encompassing not only rPPG but potentially other indicators such as subtle breathing patterns, involuntary pupil dilation, and micro-expressions, all of which are controlled by complex biological processes that are difficult to simulate with high fidelity.
Establishing Digital Trust: Proactive Prevention Through Watermarking and Provenance
Given the inherent limitations of purely reactive detection strategies, a more resilient and sustainable approach to deepfake mitigation involves proactive measures. These techniques are designed to establish trust and accountability within the digital media ecosystem, from the moment of creation. Rather than focusing on identifying forgeries after they have been created and disseminated, this paradigm shifts the emphasis to validating the authenticity and origin of legitimate content. The two leading techniques in this space are forensic digital watermarking and blockchain-based content provenance.
Forensic Digital Watermarking: Invisible Signatures
Forensic digital watermarking is a proactive technique that embeds a unique and imperceptible identifier directly into digital content such as images, videos, or documents. Unlike visible watermarks (e.g., a logo overlaid on an image), forensic watermarks are hidden within the data of the file itself and are designed to be exceptionally robust. A well-designed forensic watermark can survive common file operations, including compression, cropping, resizing, color adjustments, and even screen-captures or screen-to-camera captures.
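As a toy illustration of the embed-and-detect principle (far simpler than production forensic watermarking, and not any vendor’s actual scheme), a key-seeded pseudorandom pattern can be added at low amplitude and later detected by correlation:

```python
# Toy spread-spectrum watermark: embed a key-seeded noise pattern, detect by correlation.
import numpy as np

def embed_watermark(image, key, strength=2.0):
    """image: 2-D float array (e.g., a luminance channel); key seeds the secret pattern."""
    rng = np.random.default_rng(key)
    pattern = rng.choice([-1.0, 1.0], size=image.shape)   # secret +/-1 pattern
    return np.clip(image + strength * pattern, 0, 255), pattern

def detect_watermark(image, key, threshold=0.5):
    """Correlate the image against the key's pattern; high correlation => watermark present."""
    rng = np.random.default_rng(key)
    pattern = rng.choice([-1.0, 1.0], size=image.shape)
    residual = image - image.mean()
    score = float((residual * pattern).mean())            # ~strength if embedded, ~0 otherwise
    return score > threshold, score
```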
In the context of deepfake prevention, forensic watermarking serves several key functions:
Source Tracking and Accountability: By embedding unique information that identifies the creator, user, or distribution channel, watermarks can be used to trace the source of a malicious deepfake if it is leaked or misused. For example, in a video-on-demand (VOD) or enterprise environment, the system can use A/B watermarking to provide each user with a slightly different, uniquely watermarked version of the video. If a copy appears online, the watermark can be extracted to identify the exact source of the leak, providing strong evidence for legal or administrative action.
Authenticity Verification: Watermarks can serve as a seal of authenticity for official content. Government agencies, corporations, or news organizations can embed unique watermarks into their legitimate media. This allows verification of authentic communications and helps to detect and block attempts at impersonation using deepfakes.
Lifecycle Tracking: Proponents suggest that watermarks can be integrated at various stages of the content lifecycle. Watermarks can be embedded upon upload to social media, messaging apps, or even by the deepfake creation application itself to create a traceable record of how manipulated content is generated and disseminated.
Advanced watermarking techniques are being developed specifically to counter deepfake manipulation. One novel method involves designing a neural network that embeds the watermark directly into the identity features of a face image. This makes the watermark highly sensitive to face-swapping manipulations, as that operation inherently alters the identity features and therefore corrupts the watermark, while remaining robust to traditional image modifications like compression or resizing.
Despite their promise, watermarks face significant challenges. First, watermarks are not invulnerable. Research has shown that it is possible to “dissolve” or reconstruct images using adversarial techniques (particularly those using diffusion models), effectively removing the embedded watermark. Second, and more importantly, the effectiveness of watermarking as a systemic solution depends on widespread adoption. Currently, there are no laws or regulations requiring deepfake applications or social platforms to implement watermarking, which has left its usage voluntary and fragmented.
Blockchain and Content Provenance: An Immutable Ledger
A complementary proactive strategy is to use blockchain technology to establish content provenance: a reliable, verifiable, and tamper-proof record of a media file’s origin and lifecycle history. This approach leverages the core properties of blockchain—its decentralization and immutability—to create a permanent, public record of authenticity.
Establishing blockchain-based provenance typically involves three steps (a minimal hashing sketch follows the list):
Content Fingerprinting: Upon initial creation or upload to a participating platform, a unique cryptographic hash is generated from the file’s data. This hash acts as a digital fingerprint; any change to the file, no matter how small, will result in a completely different hash.
Blockchain Recording: This unique hash, along with key metadata (e.g., the creator’s verified digital identity, timestamp, and other relevant details) is recorded as a transaction on the blockchain ledger. Because the ledger is distributed and cryptographically secured, this record is effectively permanent and cannot be altered or deleted.
Ongoing Verification: At any point in the future, anyone or any system can verify the authenticity of that media. They simply compute the current hash of the file in question and compare it to the original hash stored on the blockchain. If the hashes match, it proves that the file has not been altered since the moment it was registered. If the hashes do not match, the file has been tampered with.
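Steps 1 and 3 reduce to standard cryptographic hashing. The sketch below illustrates them with SHA-256, while the ledger itself is stood in for by a plain dictionary; a real deployment would write the record to a blockchain transaction and attach a verified creator identity instead.

```python
# Content fingerprinting and verification (the ledger is mocked as a dictionary here).
import hashlib
import time

ledger = {}                                   # stand-in for an immutable blockchain ledger

def fingerprint(path):
    """SHA-256 digest of the file's bytes: any change yields a completely different hash."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def register(path, creator_id):
    digest = fingerprint(path)
    ledger[digest] = {"creator": creator_id, "timestamp": time.time()}
    return digest

def verify(path):
    """Recompute the hash and look it up; a miss means the file was altered or never registered."""
    return ledger.get(fingerprint(path))
```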
This system creates a transparent and verifiable “chain of custody” for digital content. It allows creators to digitally sign their work with their private key, essentially staking their reputation on its authenticity. Platforms can integrate this system to automatically cross-reference content against the blockchain before allowing it to go live, and to flag or block media that lack a valid provenance record. Research into hybrid systems that combine blockchain-based provenance with digital watermarking has demonstrated that they can achieve extremely high detection accuracies, potentially reaching 95%.
However, like watermarking, blockchain-based provenance also has limitations. Its primary weakness is that it relies on a network effect. The system is only valuable if creators, technology platforms, and consumer devices adopt it as a common standard. Furthermore, it is important to note that this approach verifies the integrity of a digital file from the moment of registration, not the authenticity of the content itself. A creator could register a deepfake on the blockchain. The system would only prove that this particular fake file has not been altered since it was originally registered.
The use of these proactive techniques marks a critical shift in the deepfake mitigation strategy. Rather than engaging in a reactive arms race of “detecting the fake,” these methods aim to create a system of “verifying the real.” The arms race is characterized by continuously evolving threats and countermeasures, where new generative models can render sophisticated detectors obsolete overnight. In contrast, proactive measures are applied at or prior to the point of publication. The goal is no longer to prove that a piece of media is fake by discovering a flaw, but to prove that it is real by confirming the presence of a valid watermark or finding a matching entry on an immutable ledger.
This shift has profound implications for the entire information ecosystem. In a world increasingly saturated with synthetic media (estimates suggest that within several years, as much as 90% of online content may be synthetic), the default assumption for consumers and systems must shift from “true until proven fake” to “unverified until proven real.” Proactive techniques like watermarking and provenance provide the technological foundation for this new paradigm. They shift the burden of verification onto creators of legitimate content to validate their works, rather than placing on consumers the impossible burden of debunking a deluge of potential forgeries.
However, the biggest obstacle to this more resilient future is not technical, but a large-scale coordination problem. The technologies for watermarking and blockchain provenance already exist, but their effectiveness relies entirely on achieving network effects through widespread, standardized adoption. A watermark is useless without a standard way to read it, and a blockchain is of little value if major platforms do not query the ledger. For these systems to work at societal scale, they must be integrated at a foundational level: in cameras, in editing software, in social media upload protocols, and in the browsers and applications people use every day. This will require massive industry-wide collaboration, potentially driven by regulatory mandates and incentives. The success of industry consortia, such as the Coalition for Content Provenance and Authenticity (C2PA), which promotes open technical standards for content provenance, will be a crucial bellwether of this strategic shift.
Rule of Law in the Synthetic World: Global Regulatory and Legal Frameworks
As deepfake technologies have pervaded society, governments worldwide have struggled to regulate their use and mitigate their harms. Responses have varied widely, reflecting distinct legal traditions, political systems, and social priorities. A global consensus remains elusive, leading to a fragmented legal landscape across nations and regions. This divergence creates a complex compliance environment for global technology companies and highlights differing philosophical approaches to balancing innovation, free expression, and public safety.
United States: A Patchwork of Federal and State Action
The United States’ approach to deepfake regulation is characterized by a combination of targeted federal laws and a patchwork of broader state-level legislation, all constrained by the strong First Amendment protections on freedom of speech.
At the federal level, the most significant piece of legislation is the TAKE IT DOWN Act (Tools to Address Known Exploitation by Immobilizing Technological Deepfakes on Websites and Networks Act), enacted in May 2025. This law was passed with rare bipartisan support, driven largely by the escalation of AI-generated non-consensual intimate imagery (NCII), or “revenge porn.” The Act is the first federal statute to formally criminalize the distribution of such content, including AI-generated deepfakes. Its key provisions include:
Criminalization: Prohibits the knowing distribution of non-consensual intimate images, with penalties of up to two years’ imprisonment.
Notice and Takedown Mandates: Requires online platforms that host user-generated content to establish procedures for the removal of flagged NCII content within 48 hours and the deletion of duplicates.
Enforcement: Grants the Federal Trade Commission (FTC) the authority to enforce these regulations against non-compliant platforms.
Other existing federal laws may also be leveraged to address deepfake-related harms. The National Defense Authorization Act (NDAA) includes provisions to address the use of deepfakes in foreign disinformation campaigns. The FTC Act’s prohibition on “unfair or deceptive acts or practices” can be used to target fraud and scams perpetrated by deepfakes, while federal wire fraud statutes can be applied to schemes using fraudulent audio or video.
At the state level, all 50 states and the District of Columbia have enacted laws against NCII, and many have updated those laws to explicitly include deepfakes. States have also actively regulated deepfakes concerning election integrity. Various state laws now mandate clear disclaimers on AI-generated political advertisements or prohibit the distribution of “materially deceptive media” intended to influence elections, particularly within a defined period before voting begins.
The core challenge in US legal debates is how to strike a balance between regulating harmful content and protecting First Amendment rights. Critics of the TAKE IT DOWN Act, for example, caution that its provisions could be abused by malicious actors to demand the takedown of legitimate speech (e.g., parodies or political commentary), and that a 48-hour takedown requirement could place undue burdens on smaller platforms. This has driven legal scholars to explore the application of existing legal frameworks such as the Right of Publicity (ROP), which prohibits the unauthorized commercial use of an individual’s likeness, as a potential middle ground that could address harms without infringing on protected speech.
European Union: A Comprehensive, Risk-Based Approach
In contrast to the United States’ targeted, harm-specific approach, the European Union has adopted a broad, comprehensive, and risk-based framework for governing all forms of AI, including the technologies that enable deepfakes. This is primarily achieved through two landmark pieces of legislation: the AI Act and the Digital Services Act (DSA).
The EU AI Act, formally approved in March 2024, is the world’s first comprehensive law on artificial intelligence. It establishes a classification system that regulates AI systems based on the level of risk they pose. Rather than outright banning deepfakes, the Act imposes strict transparency obligations on AI systems that create them. The key provisions include:
Disclosure Requirements: Users must be informed when they interact with content that has been artificially generated or manipulated. All deepfake content (broadly defined as manipulated images, audio, or video of persons, objects, places, or events) must be clearly labeled as such.
Technical Marking: Providers of AI systems that generate synthetic content must ensure that their outputs are marked in a machine-readable format (e.g., via watermarks or metadata) so that they are technically detectable as AI-generated.
Exemptions: These transparency obligations do not apply to content that is manifestly parody or satire, or that is authorized for legitimate purposes such as law enforcement.
The AI Act is complemented by the Digital Services Act (DSA), which regulates the responsibilities of online platforms. Under the DSA, platforms that host user-generated content, including deepfakes, are required to have clear and transparent content moderation policies and offer easily accessible notice-and-takedown mechanisms for illegal content. The EU’s strengthened Code of Practice on Disinformation (which is now co-regulatory and underpinned by the DSA) can impose significant fines (up to 6% of global revenue) against Very Large Online Platforms that fail to adequately address systemic risks such as the spread of disinformation (including deepfakes).
Asia Pacific Approaches: A Spectrum of Control
The regulatory measures across the Asia Pacific region range from comprehensive state control in China to targeted criminal statutes in South Korea and Australia.
China: China has implemented one of the world’s most stringent and comprehensive regulatory frameworks for synthetic media through the “Provisions on the Administration of Deep Synthesis Internet Information Services,” which took effect in January 2023. Driven by social stability priorities, the law mandates that all users of deep synthesis services undergo real-name identity authentication; obtain explicit consent from any individuals being represented; and conspicuously label all AI-generated content. These regulations grant the state broad control over the entire lifecycle of deepfake creation and distribution.
South Korea: South Korea has adopted an aggressive legislative approach focused on specific, high-profile harms. Amendments to the Public Official Election Act prohibit the use of deepfakes for political purposes within 90 days of an election, with severe penalties including imprisonment and hefty fines. In addition, the country’s Act on Special Cases Concerning the Punishment of Sexual Crimes criminalizes the creation, distribution, and even the knowing possession or viewing of non-consensual sexual deepfakes.
Singapore: Singapore’s approach focuses on combating online falsehoods and safeguarding electoral integrity. The Protection from Online Falsehoods and Manipulation Act (POFMA) grants the government broad powers to issue correction or takedown directions against any online content, including deepfakes, deemed to be harmful to the public interest. More specifically, Singapore’s Elections (Online Advertising) Act prohibits the publication of deepfake content depicting political candidates during elections.
Australia: Australia has primarily addressed the deepfake threat through federal criminal statutes. The Criminal Code Amendment (Deepfake Sexual Material) Act 2024, which took effect in September 2024, creates a new, standalone federal offense for the non-consensual sharing of sexually explicit material via carriage services and explicitly includes material created or altered using artificial intelligence. This federal law complements existing state-level criminal offenses and a civil penalties regime administered by the eSafety Commissioner under the Online Safety Act 2021.
Through these various approaches, it is apparent that legal and regulatory divergence exists worldwide, reflecting fundamentally different societal priorities. For example, the US model prioritizes the protection of free speech and therefore only targets certain egregious harms (such as non-consensual intimate images), while avoiding broader content restrictions. In contrast, the EU framework focuses not on the content of deepfakes but on the risks posed by the underlying AI systems; its primary tool is mandated transparency, intended to empower individuals to make informed judgments rather than outright banning content. Chinese laws and regulations are the most restrictive, reflecting a governance model that prioritizes control of information and the maintenance of social stability above all else. This fragmentation poses significant compliance challenges for global technology platforms and makes it difficult to implement universal technological or policy solutions.
Furthermore, it is apparent that legislation, particularly in Western democracies, has largely been reactive. For example, the laws passed in the US and Australia were directly spurred by public outrage over the use of deepfakes for sexual exploitation. Likewise, the concern for electoral integrity in many jurisdictions is a direct response to fears of political manipulation. While such laws are crucial for closing loopholes, the need for proactive and adaptable frameworks that anticipate future technological developments and societal harms has become increasingly evident.
This review of global responses underscores the urgent need for more coherent international cooperation. The very nature of online technologies necessitates a coordinated, cross-border approach. International organizations, such as the United Nations, can play a critical role in harmonizing legal standards, promoting shared best practices, and facilitating the exchange of information to combat the global deepfake threat. Furthermore, multistakeholder dialogues involving governments, industry, civil society, and technical experts are essential for developing effective, adaptive, and ethically grounded solutions. As deepfake technologies continue to evolve at an accelerating pace, a collaborative and forward-looking legal and policy strategy will be critical to ensuring digital trust and safeguarding the integrity of the information ecosystem.