Runyi Li¹, Xuanyu Zhang¹, Zhipei Xu¹, Yongbing Zhang³, Jian Zhang¹,² ✉
¹ School of Electronic and Computer Engineering, Peking University
² Peking University Shenzhen Graduate School-Rabbitpre AIGC Joint Research Laboratory
³ School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen)
Abstract
With the advent of personalized generation models, users can more readily create images resembling existing content, heightening the risk of violating portrait rights and intellectual property (IP). Traditional post-hoc detection and source-tracing methods for AI-generated content (AIGC) employ proactive watermark approaches; however, these are less effective against personalized generation models. Moreover, attribution techniques for AIGC rely on passive detection but often struggle to differentiate AIGC from authentic images, presenting a substantial challenge. Integrating these two processes into a cohesive framework not only meets the practical demands for protection and forensics but also improves the effectiveness of attribution tasks. Inspired by this insight, we propose a unified approach for image copyright source-tracing and attribution, introducing an innovative watermarking-attribution method that blends proactive and passive strategies. We embed copyright watermarks into protected images and train a watermark decoder to retrieve copyright information from the outputs of personalized models, using this watermark as an initial step for confirming if an image is AIGC-generated. To pinpoint specific generation techniques, we utilize powerful visual backbone networks for classification. Additionally, we implement an incremental learning strategy to adeptly attribute new personalized models without losing prior knowledge, thereby enhancing the model’s adaptability to novel generation methods. We have conducted experiments using various celebrity portrait series sourced online, and the results affirm the efficacy of our method in source-tracing and attribution tasks, as well as its robustness against knowledge forgetting.
1 Introduction
The AI-generated content (AIGC) model, especially the personalized generation model[55], has adverse implications for the copyright and intellectual property (IP) of various visual content such as artworks by artists, portraits of individuals, and photographs by photographers. Images generated through personalized methods may propagate misinformation and infringe upon copyrights, thereby engendering negative societal repercussions. In response to these concerns, significant research efforts have been directed towards AIGC copyright watermarking for images[37, 27, 54]. Several methodologies, including those employing box-free watermarking techniques[50, 43, 51, 19, 35, 44, 28, 38], have been developed to trace back the fine-tuning of generative models like Generative Adversarial Networks (GANs)[12, 20, 13, 4] and Diffusion models[17, 33, 31, 7, 34], thereby serving a post-hoc protective function.
For AIGC models, particularly personalized generation models, both the source-tracing of copyright and the attribution of specific methods are important, and need to be accomplished simultaneously[41]. Given the inherent difficulty in preventing the personalized generation of portraits at the source, we opt for a post-hoc evidentiary approach. This approach not only supports copyright source-tracing of generated outputs but also enables the determination of whether an image is AIGC-generated and attribution of the specific personalized generation method employed by the stealer, thus facilitating attribution tasks.
However, existing research struggles to simultaneously accomplish the aforementioned tasks of source-tracing and attribution. Proactive watermark methods[51, 28] involve embedding invisible watermarks into generated outputs to identify copyright and generation method information. However, this approach is incongruent with our scenario, as stealers are unlikely to add their own watermarks to personalized generation models voluntarily. Passive detection methods[23, 45] align with our threat model; however, their performance suffers in distinguishing between AIGC-generated and real images and in attributing different personalized generation methods. Furthermore, considering the continual emergence of new methods, we also require a scalable approach capable of incrementally updating existing models to address newly proposed personalized generation methods.
To make our task and scenario precise, we define source-tracing and attribution as illustrated in Fig.1: (1) Source-Tracing: "Who does this suspicious image belong to?" We aim to accurately retrieve the copyright information of an image; (2) Attribution: "Is this generated by a personalized model? If so, by which model?" We aim to first determine from the decoded result whether the image is generated, and then identify the specific generation method.
Based on the aforementioned challenges, we propose a novel framework for source-tracing and attribution of personalized generated images, employing a combination of proactive watermark and passive detection mechanisms. We embed portrait images requiring protection with box-free watermarks. Specifically tailored for personalized generation methods, our copyright watermark can be decoded from the results of personalized generation, allowing for the determination of whether a suspicious image is AIGC-generated based on the presence or absence of the watermark. This constitutes the proactive source-tracing watermark, thereby achieving initial attribution for AIGC detection. For attribution of specific personalized generation methods, we employ a classification network. Furthermore, recognizing the continuous evolution of personalized generation methods, we utilize incremental learning techniques, enabling existing models to attribute new generation methods with minimal data and training costs. Fig.1 shows an application scenario of our protection framework. Our main contributions are summarized as follows:
❑(1) We propose the task of source-tracing and attribution of personalized generation, and design a unified framework that combines proactive watermarking and scalable passive detection.
❑(2) For the source-tracing task, we apply an encoder-decoder structure to embed invisible source-tracing watermarks into images, which can be decoded from the generated results. From the decoded result of a suspicious image, we can identify the copyright and determine whether the image was produced by personalized generation.
❑(3) For the attribution task, we introduce a hierarchical attribution approach. Leveraging the proactive watermark, we first determine the presence of a copyright watermark, and then employ a visual backbone to identify the specific generation method. Furthermore, we incorporate an incremental learning strategy into this network, enabling scalable attribution of new personalized methods.
❑(4) Experimental results demonstrate the effectiveness of our proposed methods in source-tracing and attribution tasks.
2 Related work
2.1 Model Watermarking Methods
Several varieties of watermarking techniques have been proposed to ascertain the ownership of models and embed owner information within them. These include white-box methods that add watermarks to model weights or model outputs[36, 5]. Assuming the model protector is familiar with the model architecture, white-box watermarks can be embedded into the model’s weights, and the decoded watermark can then represent the identity of the owner. Conversely, if the owner does not know the model structure, a black-box watermark[2] can be added to the model’s output, such as by constructing a trigger set, to indicate ownership. Furthermore, when there is no predefined model structure, or the scenario does not depend on a specific model structure, the technique is referred to as box-free watermarking[50, 43, 51, 19, 35, 44]. The method proposed by Zhang et al.[51] adds specific watermarks to image processing models. Specifically, a watermark is first added to the image. After the watermarked image is processed by the model, the watermark can be decoded from the output. For images that have not been processed by this model, a benign signal is decoded instead. Our proposed method applies this box-free design, as detailed in Sec.4.2.
2.2 Attribution of Generative Models
To construct a complete chain of evidence in digital forensics, it is essential to establish not only the copyright of an image but also the specific generative method used to produce a pirated image. Given a suspicious image, inferring whether the image is generated by AIGC and identifying the specific AIGC method that produced it is referred to as attribution. Current passive judgment of AIGC images primarily involves designing special network structures to perform a binary classification between real and fake images, or to predict a suspicious map indicating generation traces[8, 9, 6]. For the judgment of specific generative methods, there are mainly two approaches: classification[11, 47, 10] and finger-printing[48, 49, 46, 50, 29]. The classification method generally uses a universal visual network as its backbone, combined with special designs, to differentiate among various generative methods. Finger-printing, on the other hand, focuses on the different characteristics of the results generated by different methods, analyzing the results in terms of frequency domains, feature spaces, and other aspects. To the best of our knowledge, there have been a few explorations of the finger-printing method for the attribution of diffusion-based generation models[21, 30, 42], but these approaches add watermarks to generative models in advance, and thus do not align with our scenario.
3 Problem Statement & Threat Model
The task of intellectual property (IP) protection proposed by us encompasses two main components: (1) source-tracing the copyright of results generated by personalized generation methods to determine the ownership of the image, and (2) distinguishing between suspicious images found online as legal real photographs or illegal results obtained through personalized generation methods. In the case of illegal results, additional identification of the specific model used for generation (such as LoRA[18], InstantID[40], etc.) is required.
❑ Task and Method Definition For the source-tracing task, we aim to identify the copyright of suspicious images from the Internet. For the attribution task, we aim to first determine whether the image is AIGC-generated, and if so, we further judge the specific generation method of the image. Box-free watermarking refers to a kind of watermarking technique that embeds invisible watermarks into the model outputs, so that both the model and the outputs can be protected.
❑ Stealer’s Objective The stealer aims to acquire a collection of images requiring copyright protection, such as celebrity portraits or animated characters, and to use personalized generation models to produce images containing features resembling those present in the acquired images, such as facial characteristics.
❑ Stealer’s Knowledge The stealer’s objective is limited to images and does not involve models or network architectures. Therefore, the scenario we set up is a "box-free" environment, wherein neither the watermark we add nor the stealer’s attack involves the structure of the model. Furthermore, the stealer is unaware of the method we use to add the watermark. Given that the current situation involves image IP theft by the stealer, proactive model watermarking is not applicable to our scenario (it is obvious that stealers would not voluntarily add watermarks to their own models).
❑ Stealer’s Capability The stealer has access to the Internet and can obtain the images we need to protect, as well as ample resources to train personalized generation models.
❑ Security Requirements Our proposed watermark should satisfy two requirements: (1) robustness against potential degradation during transmission over networks, such as noise, JPEG compression, and other forms of degradation; (2) preservation of the quality of the image, ensuring that the watermark itself does not impact the visual effect of the image.
4 Methods
4.1 Overview
Motivation
Current proactive forensic watermarking primarily targets non-personalized image generation models, and research on forensic evidence for results generated by personalized generation models is lacking. Existing post-hoc methods for image IP protection mainly rely on proactive forensic watermarking techniques. Proactive forensic watermarking typically detects copyright watermarks in suspicious images, but it does not provide insight into which specific AIGC model generated the image, thereby hindering the establishment of comprehensive forensic evidence in practice.
For attribution tasks involving determining whether an image is AIGC-generated and inferring the specific generation method, current approaches mainly rely on passive detection methods such as classification[45, 11] and fingerprinting[49]. However, existing AIGC models can generate highly realistic images, making it challenging to discern their authenticity solely through passive approaches.
Motivated by the aforementioned insights, we propose a unified task of copyright source-tracing and image attribution and employ a combined approach utilizing both proactive watermarking and passive detection methods. By unifying these two tasks, we meet the needs of realistic protection and forensics, and enable better performance on the attribution task.
Pipeline
The user embeds a watermark image into the original images, producing watermarked images that can be publicly uploaded to the Internet. A stealer obtains the watermarked images from the Internet and illegally generates additional images using a personalized generation model. The Internet also contains legal images similar to the originals. When users encounter suspicious images on the Internet, they can decode them with a watermark decoder. Personalized generated images decode to the copyright watermark, while images that are not personalized generation results decode to a black image, which serves as a benign signal. If the copyright watermark is decoded, the user can further use our attribution network to determine which personalized generation model generated the image, thus completing the evidential chain. An illustration of our proposed framework is presented in Fig.2.
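The decision flow of this pipeline can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `decode` and `attribute` are hypothetical callables standing in for the watermark decoder and the attribution network, images are nested lists with pixel values in [0, 1], and the mean-intensity test with `benign_threshold` is an assumed way of recognizing the black benign signal.

```python
def mean_intensity(image):
    """Average pixel value of a 2-D image given as a list of rows."""
    total = sum(sum(row) for row in image)
    count = sum(len(row) for row in image)
    return total / count


def trace_and_attribute(image, decode, attribute, benign_threshold=0.05):
    """Decode first; run the attribution network only if a watermark is found.

    `decode`, `attribute` and `benign_threshold` are hypothetical names used
    for illustration only.
    """
    decoded = decode(image)
    # A (nearly) black decoded image is treated as the benign signal.
    if mean_intensity(decoded) < benign_threshold:
        return {"is_generated": False, "watermark": None, "method": None}
    # A watermark was recovered: keep it for source-tracing, then attribute.
    return {"is_generated": True, "watermark": decoded, "method": attribute(image)}
```

Running the cheap proactive check first means the classifier only ever sees images that already carry evidence of personalized generation.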
4.2 Proactive Watermark for Source-Tracing
Model Watermarking Methods
To safeguard copyright and IP in images, we construct a model-agnostic protection scenario and implement box-free watermarking for copyright information embedding and source-tracing. Specifically, we embed the watermark information into the output distribution of the image processing model through a network and extract the watermark using a corresponding extraction network. Besides considering robustness against traditional digital image processing, our approach also defends against model extraction attacks, as box-free watermarking does not require the extractor to have knowledge of the target model’s internal details.
Training and Inference of Watermark
Given a set of original images requiring protection, we apply imperceptible image watermarks using an embedding network, producing watermarked images. The watermark encoder is implemented using a reversible network[53]; the details of the network architecture and training process, including the loss function, are in AppendixD. These watermarked images are then publicly uploaded online. Subsequently, stealers download these images and use them to train personalized generation models such as LoRA[18]. The generated results can be decoded by a decoding network to obtain the embedded image watermark, enabling source-tracing of copyright and IP. The watermark decoding network shares the same structure as the watermark embedding network; because the network is reversible, the decoding process is the inverse of the embedding process. Additionally, when decoding legal images, the decoding network should output a black image, serving as a benign signal. The loss function is expressed as Eq.1:
(1) |
where the generated images are indexed by the generation method that produced them, and several generation methods are adopted in training. A detailed training process is shown in Algo.1.
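As a reading aid, one plausible form of such a decoding objective is sketched below. It is an assumption based on the description above, not the paper's Eq.1: the symbols $D$ (watermark decoder), $W$ (copyright watermark image), $x_g^{(i)}$ (an image generated by the $i$-th of $N$ generation methods), $x_l$ (a legal image), and the squared-error distance are all introduced here for illustration.

```latex
\mathcal{L}_{\text{dec}}
  = \sum_{i=1}^{N} \bigl\lVert D\bigl(x_g^{(i)}\bigr) - W \bigr\rVert_2^2
  + \bigl\lVert D(x_l) - \mathbf{0} \bigr\rVert_2^2
```

Under this reading, the first term pushes decoded generated images toward the watermark, and the second pushes decoded legal images toward the black image.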
To simulate the potential degradations that may occur during image transmission in real-world network environments, we have introduced three types of degradations to our training data, which include Gaussian noise, Poisson noise, and JPEG compression. The specific settings for these degradations are detailed in Sec.5.1.
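The noise degradations can be illustrated with a small standard-library sketch. It is not the authors' training code: the function names are hypothetical, pixels are assumed to be 8-bit values in a flat list, the range 1 to 16 is interpreted here as the Gaussian standard deviation (an assumption), and JPEG compression is omitted because it needs an image codec.

```python
import math
import random


def add_gaussian_noise(pixels, sigma, rng):
    """Add zero-mean Gaussian noise to 8-bit pixel values, clipped to [0, 255]."""
    return [min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in pixels]


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) sample with Knuth's multiplication method."""
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def add_poisson_noise(pixels, rng):
    """Replace each pixel by a Poisson sample whose mean is the pixel value."""
    return [min(255, sample_poisson(p, rng)) for p in pixels]


def random_degrade(pixels, rng):
    """Apply one randomly chosen noise degradation to a flat list of pixels."""
    if rng.random() < 0.5:
        sigma = rng.uniform(1.0, 16.0)  # assumed reading of the 1-to-16 range
        return add_gaussian_noise(pixels, sigma, rng)
    return add_poisson_noise(pixels, rng)
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible across runs.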
4.3 Scalable Proactive and Passive Attribution
Hierarchical Proactive-and-Passive Mechanism for Attribution
Our attribution task consists of two components: (1) determining whether suspicious images found on the Internet are generated by personalized generation models, and (2) if indeed generated by an AIGC model, identifying the specific model used. Since we can detect watermarks on AIGC-generated results, we can design a mechanism that combines proactive and passive methods. Initially, we utilize proactive watermarks, which are relatively easy to detect, to differentiate between benign legal images and AIGC-generated results. Subsequently, we employ passive methods to attribute the specific generation method.
For discerning the specific generation method, we draw inspiration from[45] and utilize an efficient visual backbone to classify the results generated by different methods. The loss function is given in Eq.2:
(2) |
where the loss is the cross-entropy between the output of the attribution classifier for an image and the label of that image.
Incremental Learning Strategy for Scalable Attribution
In real-world scenarios, the types of personalized generation models are constantly evolving and updating. Fine-tuning the existing model every time a new personalized generative model appears incurs a significant training cost and suffers from the catastrophic forgetting caused by training on new tasks. To efficiently attribute new generation methods without catastrophic forgetting, we employ a regularization-based incremental learning strategy[39], wherein we preserve the knowledge of the original model while learning the new generation methods. Following the regularization method of generating new training samples[26], we apply the following training strategy:
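For reference, the cross-entropy of a single prediction can be computed as in the sketch below. It is a generic, framework-free illustration rather than the paper's training code; `logits` stands for the raw class scores produced by a classifier and `label` for the index of the true generation method, both names chosen here for illustration.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed stably via log-sum-exp."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_norm - logits[label]


def batch_loss(batch_logits, labels):
    """Mean cross-entropy over a batch of logits and their labels."""
    losses = [cross_entropy(z, y) for z, y in zip(batch_logits, labels)]
    return sum(losses) / len(losses)
```

With uniform scores over three classes the loss equals log 3, and it shrinks as the score of the correct class grows.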
(1) Prepare an extra dataset by generating new images using the existing generation methods. The extra dataset is smaller than the training dataset, which includes images generated by the newly appeared personalized model. In our experiments, we generate 200 additional images for each method (see Sec.5.1).
(2) Fine-tune the attribution network using the extra dataset and the training dataset, via the loss function in Eq.3:
(3) |
Note that the set of generation methods now includes the newly added one. A hyper-parameter weights the incremental part of the loss, and the related ablation studies are shown in Sec.5.3. By adding a light-weight extra dataset to the training process, the original attribution network is able to learn new knowledge without forgetting the previous task. The detailed process is shown in Algo.2.
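The role of the weight can be illustrated with the sketch below. It assumes the fine-tuning objective is an additive combination of the mean loss on the new training data and a weighted mean loss on the extra dataset; this additive form, and the names `losses_new`, `losses_extra` and `weight`, are assumptions made for illustration, not the paper's Eq.3.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def incremental_loss(losses_new, losses_extra, weight):
    """Combine per-sample losses from the new data and the extra dataset.

    `weight` scales the incremental (extra-dataset) part; the additive
    form is an assumption made for illustration.
    """
    return mean(losses_new) + weight * mean(losses_extra)
```

Setting `weight` to zero recovers plain fine-tuning on the new data, while larger values place more emphasis on the samples from the extra dataset.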
5 Experiments
5.1 Implementation Details
| Portrait IP | Method | Watermark: NeurIPS | | Watermark: IceShore | |
|---|---|---|---|---|---|
| Diva | Zhang et al.[51] | 42.4496 | 14.7672 | 46.4082 | 17.1526 |
| Diva | Zhang et al.[51]† | 43.7054 | 28.2597 | 47.4658 | 38.4593 |
| Diva | Ours | 49.3972 | 38.2023 | 50.7175 | 48.9043 |
| Sportsman | Zhang et al.[51] | 36.4110 | 14.7998 | 39.6772 | 16.8941 |
| Sportsman | Zhang et al.[51]† | 41.5621 | 28.3615 | 40.7506 | 37.8944 |
| Sportsman | Ours | 50.2480 | 44.9456 | 50.8997 | 44.2858 |
| Actor | Zhang et al.[51] | 41.0348 | 16.7669 | 44.5886 | 12.4662 |
| Actor | Zhang et al.[51]† | 42.9745 | 28.3445 | 45.8301 | 37.9649 |
| Actor | Ours | 50.2719 | 53.7781 | 50.4068 | 47.7241 |
| Actress | Zhang et al.[51] | 37.3238 | 14.8029 | 35.0916 | 17.0501 |
| Actress | Zhang et al.[51]† | 39.3804 | 28.0742 | 37.2112 | 37.8417 |
| Actress | Ours | 50.5714 | 44.3980 | 50.6827 | 42.1217 |
❑ Dataset Due to the lack of existing data for IP protection, we collect a set of publicly available photos of celebrities from the Internet to serve as the IP information requiring protection, thereby proposing an IP protection dataset. Considering the diversity and fairness of the data, we select portraits of celebrities of different genders and skin colors, labeled as "Diva" (175 images), "Sportsman" (94 images), "Actor" (177 images), and "Actress" (206 images). To protect the privacy of the celebrities, their real names are not used. All images are cropped to include the face and its surrounding area, and resized to 256×256 pixels. Then, for each IP, we utilize four personalized generation models (LoRA[18], InstantID[40], PhotoMaker[25], DreamBooth[32]), with each method producing 1,000 images separately; the prompts are generated by large language models (LLMs) such as ChatGPT[1]. We split the dataset into training and validation sets in an 8:2 ratio. For the copyright watermark images, we use the logo of NeurIPS and a photo of an ice shore, denoted as NeurIPS and IceShore in the following experiments. More details of our dataset are in AppendixE.
❑ Network Structure Our network architecture consists of an embedding network and a decoder for the watermark, as well as a classification network for attributing the specific generation method. For the watermark network, we employ the reversible network structure from EditGuard[53]. Regarding the attribution network, we follow the setting of[45] and utilize EfficientFormer[24] as the backbone network for the vision task. All networks are trained from scratch. The detailed network architectures and training process of watermark embedding can be found in AppendixD.
❑ Settings For the watermark source-tracing task, we set the learning rate to 1e-4 and the batch size to 1, and use three kinds of degradations, including Gaussian noise, Poisson noise, and JPEG compression, to simulate the loss of image quality in real network transmissions. We set the variation of Gaussian noise from 1 to 16, and the JPEG compression quality randomly from 70 to 95. For the attribution task of specific generation methods, we set the learning rate to 2e-3 and the batch size to 32. In incremental learning, we additionally generate 200 images for each method, with the remaining settings the same as above. All experiments are done on 4 NVIDIA 3090 GPUs.
5.2 Evaluation of Source-Tracing and Attribution Task
Evaluation of Source-Tracing
[Visual comparison; column labels: Generated, Ours, Zhang et al.[51], Zhang et al.[51]† (repeated for two groups of images)]
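| Portrait IP | Method Type | Method | AIGC Detection | Method Judge | Overall Accuracy |
|---|---|---|---|---|---|
| Diva | Passive | ResNet[16] | 23.58% | 33.34% | 24.96% |
| Diva | Passive | KNN[14] | 92.36% | 65.03% | 63.50% |
| Diva | Passive | Prompt Inversion[23] | - | 69.54% | - |
| Diva | Passive | EfficientFormer[24] | 86.28% | 95.00% | 91.80% |
| Diva | Proactive | Multi-Watermark | 82.14% | 78.57% | 80.01% |
| Diva | Proactive & Passive | Ours | 96.77% | 95.00% | 95.92% |
| Sportsman | Passive | ResNet[16] | 24.61% | 34.97% | 25.00% |
| Sportsman | Passive | KNN[14] | 97.27% | 58.40% | 59.04% |
| Sportsman | Passive | Prompt Inversion[23] | - | 61.33% | - |
| Sportsman | Passive | EfficientFormer[24] | 90.96% | 95.80% | 90.97% |
| Sportsman | Proactive | Multi-Watermark | 89.58% | 51.76% | 59.76% |
| Sportsman | Proactive & Passive | Ours | 98.05% | 95.80% | 96.69% |
| Actor | Passive | ResNet[16] | 24.92% | 31.10% | 24.93% |
| Actor | Passive | KNN[14] | 89.45% | 67.86% | 68.40% |
| Actor | Passive | Prompt Inversion[23] | - | 58.66% | - |
| Actor | Passive | EfficientFormer[24] | 87.03% | 98.83% | 87.10% |
| Actor | Proactive | Multi-Watermark | 86.88% | 50.81% | 59.43% |
| Actor | Proactive & Passive | Ours | 93.54% | 98.83% | 96.07% |
| Actress | Passive | ResNet[16] | 23.67% | 34.98% | 24.90% |
| Actress | Passive | KNN[14] | 92.17% | 66.63% | 67.46% |
| Actress | Passive | Prompt Inversion[23] | - | 62.00% | - |
| Actress | Passive | EfficientFormer[24] | 87.09% | 93.38% | 89.69% |
| Actress | Proactive | Multi-Watermark | 92.06% | 65.08% | 76.06% |
| Actress | Proactive & Passive | Ours | 92.55% | 93.38% | 92.99% |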
For the box-free watermark source-tracing method, we compare against the approach of Zhang et al.[51], with evaluation metrics including (1) the difference between the image before and after watermark embedding, and (2) the accuracy of decoding the watermark from generated images. The quantitative results are presented in Tab.1, and the visualization of watermark embedding and source-tracing is shown in Fig.3 and Tab.2. Both quantitative and qualitative results show that our watermark framework preserves the quality of images and that the watermark can be extracted correctly. We have also conducted a security analysis of our watermarking method and that of Zhang et al.[51]; the detailed description and results are in AppendixA.
Evaluation of Attribution
For the attribution task, we compare against common visual baselines, including the supervised ResNet[16] and the nearest-neighbor method KNN[14], to demonstrate that this task is challenging and non-trivial. We also compare our method to Prompt Inversion[23], which uses an LLM to obtain possible prompts for the AIGC image and uses them to find the most similar generative model. Subsequently, to validate the effectiveness of our proposed proactive watermarking and passive detection methods, we compare them with the direct use of EfficientFormer for 4-class classification (1 label for real images and 3 labels for the 3 generation models). Additionally, a seemingly straightforward alternative, which we name Multi-Watermark, is to let the decoder output different images for different generation methods, thereby simultaneously achieving copyright source-tracing and method attribution. We evaluate this alternative and find that it cannot effectively accomplish the task. The results of the comparison are shown in Tab.3.
Please note that there are some box-free watermarking, attribution, and source-tracing methods that are not included in our evaluation results, and the comparison and explanation of why they are not available in our settings are listed in AppendixB.
| Portrait IP | Method | Knowledge Forget |
|---|---|---|
| Diva | Vanilla Fine-Tuning | 4.11% |
| Diva | Incremental Strategy (Ours) | 0.35% |
| Sportsman | Vanilla Fine-Tuning | 25.63% |
| Sportsman | Incremental Strategy (Ours) | 0.71% |
| Actor | Vanilla Fine-Tuning | 3.39% |
| Actor | Incremental Strategy (Ours) | 1.52% |
| Actress | Vanilla Fine-Tuning | 8.27% |
| Actress | Incremental Strategy (Ours) | 2.29% |
5.3 Ablation Studies
In this subsection, we demonstrate the enhancement achieved through the joint proactive watermarking and passive detection mechanism, as well as the reduction in knowledge forgetting brought about by incremental learning strategy compared to direct fine-tuning methods.
❑ Effect of Incremental Learning Strategy
For the attribution of newly appeared personalized models, we fine-tune the models obtained in Sec.4.3 using (1) direct fine-tuning and (2) the incremental learning strategy separately. The quantitative results are shown in Tab.4, which shows that our incremental strategy significantly mitigates knowledge forgetting in the attribution network. The metric Knowledge Forget is calculated from the accuracy of the fine-tuned model and the accuracy of the original model[15].
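A forgetting metric of this kind can be computed as in the sketch below. It assumes that forgetting is measured as the drop in accuracy on the previously learned generation methods, from the original model to the fine-tuned one; this specific form and the function names are assumptions for illustration, not necessarily the exact definition used in the paper.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match their labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def knowledge_forget(acc_original, acc_finetuned):
    """Accuracy lost on old methods after fine-tuning (assumed definition)."""
    return acc_original - acc_finetuned
```

Under this assumed definition, a value near zero means the fine-tuned model still attributes the earlier generation methods almost as well as before.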
❑ Choice of Hyper-parameter In Sec.4.3, we introduced the loss function used to fine-tune the original attribution network with an extra supplementary dataset, in order to reduce knowledge forgetting. This loss function incorporates a hyper-parameter that balances the two constituent parts of the loss. To validate its influence, we perform a grid search using the "Sportsman" IP, with the findings depicted in Fig.5.
5.4 Method Analysis
❑ Robustness Analysis To evaluate the robustness of our source-tracing watermark, we subject the images in the test set to Gaussian noise and JPEG compression, with degradation intensities exceeding those set during the training process. The effects are visualized in Fig.5. Results show that our watermark is robust enough to common degradations in Internet transmission.
❑ Different Base Models in Personalized Generation Some recent personalized generative models, such as PhotoMaker[25], use pre-trained base models and generate images via the base model. If the stealer trains a base model on their own, instead of using the one provided by the official code or API, source-tracing could fail. We test this possible scenario using PhotoMaker as an example; the detailed results in AppendixA show that our watermark remains detectable when unseen base models are used.
6 Conclusion
To safeguard the intellectual property (IP) and portrait rights of images, we introduce the task of copyright source-tracing and attribution of AIGC misuse methods in this paper. We propose a novel watermarking method that combines proactive and passive approaches, enabling the decoding of copyright watermarks from images generated by personalized models and identifying the specific generation methods. Furthermore, we employ an incremental learning strategy to efficiently attribute newly emerging personalized methods, thereby avoiding catastrophic forgetting issues caused by new data. To evaluate our proposed method, we prepare a dataset consisting of a series of portraits and their corresponding personalized generation results. Experimental results on this dataset demonstrate the effectiveness of our proposed approach. Our method can be extended to various applications involving copyright protection and defense against illegal personalized model generation.
References
- [1]https://openai.com/chatgpt/.
- [2]Yossi Adi, Carsten Baum, Moustapha Cisse, Benny Pinkas, and Joseph Keshet.Turning your weakness into a strength: Watermarking deep neural networks by backdooring.In 27th USENIX Security Symposium (USENIX Security 18), pages 1615–1631, 2018.
- [3]Benedikt Boehm.Stegexpose-a tool for detecting lsb steganography.arXiv preprint arXiv:1410.6656, 2014.
- [4]Andrew Brock, Jeff Donahue, and Karen Simonyan.Large scale gan training for high fidelity natural image synthesis.arXiv preprint arXiv:1809.11096, 2018.
- [5] Huili Chen, Bita Darvish Rouhani, Cheng Fu, Jishen Zhao, and Farinaz Koushanfar. Deepmarks: A secure fingerprinting framework for digital rights management of deep learning models. In Proceedings of the 2019 on International Conference on Multimedia Retrieval, pages 105–113, 2019.
- [6] Davide Alessandro Coccomini, Nicola Messina, Claudio Gennaro, and Fabrizio Falchi. Combining efficientnet and vision transformers for video deepfake detection. In International conference on image analysis and processing, pages 219–229. Springer, 2022.
- [7] Prafulla Dhariwal and Alexander Nichol. Diffusion models beat gans on image synthesis. In Advances in Neural Information Processing Systems (NeurIPS), 2021.
- [8] Himanshu Dutta, Aditya Pandey, and Saurabh Bilgaiyan. Ensembledet: ensembling against adversarial attack on deepfake detection. Journal of Electronic Imaging, 30(6):063030–063030, 2021.
- [9] Wanying Ge, Jose Patino, Massimiliano Todisco, and Nicholas Evans. Explaining deep learning models for spoofing and deepfake detection with shapley additive explanations. In ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 6387–6391. IEEE, 2022.
- [10] Sharath Girish, Saksham Suri, Sai Saketh Rambhatla, and Abhinav Shrivastava. Towards discovery and attribution of open-world gan generated images. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 14094–14103, 2021.
- [11] Michael Goebel, Lakshmanan Nataraj, Tejaswi Nanjundaswamy, Tajuddin Manhar Mohammed, Shivkumar Chandrasekaran, and BS Manjunath. Detection, attribution and localization of gan generated images. arXiv preprint arXiv:2007.10466, 2020.
- [12] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial networks. Communications of the ACM, 2020.
- [13] Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, and Aaron C Courville. Improved training of wasserstein gans. In Advances in Neural Information Processing Systems (NeurIPS), 2017.
- [14] Gongde Guo, Hui Wang, David Bell, Yaxin Bi, and Kieran Greer. Knn model-based approach in classification. In On The Move to Meaningful Internet Systems 2003: CoopIS, DOA, and ODBASE: OTM Confederated International Conferences, CoopIS, DOA, and ODBASE 2003, Catania, Sicily, Italy, November 3-7, 2003. Proceedings, pages 986–996. Springer, 2003.
- [15] Jiangpeng He, Runyu Mao, Zeman Shao, and Fengqing Zhu. Incremental learning in online scenario. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 13926–13935, 2020.
- [16] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
- [17] Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. In Advances in Neural Information Processing Systems (NeurIPS), 2020.
- [18] Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. LoRA: Low-rank adaptation of large language models. In International Conference on Learning Representations (ICLR), 2022.
- [19] Ziheng Huang, Boheng Li, Yan Cai, Run Wang, Shangwei Guo, Liming Fang, Jing Chen, and Lina Wang. What can discriminator do? towards box-free ownership verification of generative adversarial networks. In Proceedings of the IEEE/CVF international conference on computer vision, pages 5009–5019, 2023.
- [20] Tero Karras, Timo Aila, Samuli Laine, and Jaakko Lehtinen. Progressive growing of gans for improved quality, stability, and variation. arXiv preprint arXiv:1710.10196, 2017.
- [21] Changhoon Kim, Kyle Min, Maitreya Patel, Sheng Cheng, and Yezhou Yang. Wouaf: Weight modulation for user attribution and fingerprinting in text-to-image diffusion models. arXiv preprint arXiv:2306.04744, 2023.
- [22] Junnan Li, Dongxu Li, Caiming Xiong, and Steven Hoi. Blip: Bootstrapping language-image pre-training for unified vision-language understanding and generation. In International Conference on Machine Learning (ICML), 2022.
- [23] Meiling Li, Zhenxing Qian, and Xinpeng Zhang. Regeneration based training-free attribution of fake images generated by text-to-image generative models. arXiv preprint arXiv:2403.01489, 2024.
- [24] Yanyu Li, Geng Yuan, Yang Wen, Ju Hu, Georgios Evangelidis, Sergey Tulyakov, Yanzhi Wang, and Jian Ren. Efficientformer: Vision transformers at mobilenet speed. Advances in Neural Information Processing Systems, 35:12934–12949, 2022.
- [25] Zhen Li, Mingdeng Cao, Xintao Wang, Zhongang Qi, Ming-Ming Cheng, and Ying Shan. Photomaker: Customizing realistic human photos via stacked id embedding. arXiv preprint arXiv:2312.04461, 2023.
- [26] Zhizhong Li and Derek Hoiem. Learning without forgetting. IEEE transactions on pattern analysis and machine intelligence, 40(12):2935–2947, 2017.
- [27] Chumeng Liang and Xiaoyu Wu. Mist: Towards improved adversarial examples for diffusion models. arXiv preprint arXiv:2305.12683, 2023.
- [28] Ruinan Ma, Yu-an Tan, Shangbo Wu, Tian Chen, Yajie Wang, and Yuanzhang Li. Unified high-binding watermark for unconditional image generation models. arXiv preprint arXiv:2310.09479, 2023.
- [29] Guangyu Nie, Changhoon Kim, Yezhou Yang, and Yi Ren. Attributing image generative models using latent fingerprints. In Andreas Krause, Emma Brunskill, Kyunghyun Cho, Barbara Engelhardt, Sivan Sabato, and Jonathan Scarlett, editors, Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Research, pages 26150–26165. PMLR, 23–29 Jul 2023.
- [30] Guangyu Nie, Changhoon Kim, Yezhou Yang, and Yi Ren. Attributing image generative models using latent fingerprints. In International Conference on Machine Learning, pages 26150–26165. PMLR, 2023.
- [31] Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
- [32] Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Yael Pritch, Michael Rubinstein, and Kfir Aberman. Dreambooth: Fine tuning text-to-image diffusion models for subject-driven generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
- [33] Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising diffusion implicit models. In International Conference on Learning Representations (ICLR), 2021.
- [34] Yang Song, Prafulla Dhariwal, Mark Chen, and Ilya Sutskever. Consistency models. arXiv preprint arXiv:2303.01469, 2023.
- [35] Jingxuan Tan, Nan Zhong, Zhenxing Qian, Xinpeng Zhang, and Sheng Li. Deep neural network watermarking against model extraction attack. In Proceedings of the 31st ACM International Conference on Multimedia, pages 1588–1597, 2023.
- [36] Yusuke Uchida, Yuki Nagai, Shigeyuki Sakazawa, and Shin’ichi Satoh. Embedding watermarks into deep neural networks. In Proceedings of the 2017 ACM on international conference on multimedia retrieval, pages 269–277, 2017.
- [37] Thanh Van Le, Hao Phung, Thuan Hoang Nguyen, Quan Dao, Ngoc N Tran, and Anh Tran. Anti-dreambooth: Protecting users from personalized text-to-image synthesis. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 2116–2127, 2023.
- [38] Guanjie Wang, Zehua Ma, Chang Liu, Xi Yang, Han Fang, Weiming Zhang, and Nenghai Yu. Must: Robust image watermarking for multi-source tracing. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 5364–5371, 2024.
- [39] Liyuan Wang, Xingxing Zhang, Hang Su, and Jun Zhu. A comprehensive survey of continual learning: Theory, method and application. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024.
- [40] Qixun Wang, Xu Bai, Haofan Wang, Zekui Qin, and Anthony Chen. Instantid: Zero-shot identity-preserving generation in seconds. arXiv preprint arXiv:2401.07519, 2024.
- [41] Tao Wang, Yushu Zhang, Shuren Qi, Ruoyu Zhao, Zhihua Xia, and Jian Weng. Security and privacy on generative data in aigc: A survey. arXiv preprint arXiv:2309.09435, 2023.
- [42] Yuxin Wen, John Kirchenbauer, Jonas Geiping, and Tom Goldstein. Tree-rings watermarks: Invisible fingerprints for diffusion images. Advances in Neural Information Processing Systems, 36, 2024.
- [43] Hanzhou Wu, Gen Liu, Yuwei Yao, and Xinpeng Zhang. Watermarking neural networks with watermarked images. IEEE Transactions on Circuits and Systems for Video Technology, 31(7):2591–2601, 2020.
- [44] Xiaoshuai Wu, Xin Liao, and Bo Ou. Sepmark: Deep separable watermarking for unified source tracing and deepfake detection. In Proceedings of the 31st ACM International Conference on Multimedia, pages 1190–1201, 2023.
- [45] Katherine Xu, Lingzhi Zhang, and Jianbo Shi. Detecting image attribution for text-to-image diffusion models in rgb and beyond. arXiv preprint arXiv:2403.19653, 2024.
- [46] Tianyun Yang, Juan Cao, Qiang Sheng, Lei Li, Jiaqi Ji, Xirong Li, and Sheng Tang. Learning to disentangle gan fingerprint for fake image attribution. arXiv preprint arXiv:2106.08749, 2021.
- [47] Tianyun Yang, Ziyao Huang, Juan Cao, Lei Li, and Xirong Li. Deepfake network architecture attribution. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, pages 4662–4670, 2022.
- [48] Ning Yu, Larry Davis, and Mario Fritz. Attributing fake images to gans: Analyzing fingerprints in generated images. arXiv preprint arXiv:1811.08180, 2:3, 2018.
- [49] Ning Yu, Vladislav Skripniuk, Sahar Abdelnabi, and Mario Fritz. Artificial fingerprinting for generative models: Rooting deepfake attribution in training data. In Proceedings of the IEEE/CVF International conference on computer vision, pages 14448–14457, 2021.
- [50] Ning Yu, Vladislav Skripniuk, Dingfan Chen, Larry Davis, and Mario Fritz. Responsible disclosure of generative models using scalable fingerprinting. arXiv preprint arXiv:2012.08726, 2020.
- [51] Jie Zhang, Dongdong Chen, Jing Liao, Weiming Zhang, Huamin Feng, Gang Hua, and Nenghai Yu. Deep model intellectual property protection via deep watermarking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(8):4005–4020, 2022.
- [52] Lvmin Zhang and Maneesh Agrawala. Adding conditional control to text-to-image diffusion models. arXiv preprint arXiv:2302.05543, 2023.
- [53] Xuanyu Zhang, Runyi Li, Jiwen Yu, Youmin Xu, Weiqi Li, and Jian Zhang. Editguard: Versatile image watermarking for tamper localization and copyright protection. arXiv preprint arXiv:2312.08883, 2023.
- [54] Xuanyu Zhang, Youmin Xu, Runyi Li, Jiwen Yu, Weiqi Li, Zhipei Xu, and Jian Zhang. V2a-mark: Versatile deep visual-audio watermarking for manipulation localization and copyright protection. arXiv preprint arXiv:2404.16824, 2024.
- [55] Xulu Zhang, Xiao-Yong Wei, Wengyu Zhang, Jinlin Wu, Zhaoxiang Zhang, Zhen Lei, and Qing Li. A survey on personalized content synthesis with diffusion models. arXiv preprint arXiv:2405.05538, 2024.
Appendix / supplemental material
This appendix supplements the main text with: (1) additional experiments on our method in Appendix A; (2) a comparison between our method and related works that are similar but not applicable to our comparison experiments in Appendix B; (3) an introduction to personalized generation methods in Appendix C; (4) the detailed network architecture and training of watermark embedding in Appendix D; (5) details of the collection and personalized generation of our IP dataset in Appendix E; and (6) limitations and discussion of our work in Appendix F.
Appendix A Additional Experiment Results
Analysis of Different Base Models
We take PhotoMaker as an example and select YamerMIX (hosted on CivitAI) as the base model for testing, which differs from RealVisXL (hosted on HuggingFace), the base model used during training. The two models produce noticeably different backgrounds and styles; visualizations are shown in Tab. 5. The results show that our method can also successfully trace the source for unseen base models.
Tab. 5 (images omitted): Generated Result | Decoded Watermark | Generated Result | Decoded Watermark
Security Analysis
To validate the security of our proposed watermark framework, we run steganalysis with StegExpose [3] on images watermarked by Zhang et al. [51] and by our framework. Both methods embed a 256×256 RGB image. The detection set mixes watermarked and original images in equal proportions. We vary the detection threshold of StegExpose [3] over a wide range and draw the receiver operating characteristic (ROC) curve in Fig. 6. The closer a curve lies to the reference line, the harder it is for the detector to detect the corresponding method, and the more secure that watermarking method is considered to be. (In the ideal case the curve nearly coincides with the reference line, which corresponds to about 50% accuracy in judging whether an image contains a watermark, i.e., random guessing.) Fig. 6 shows that for the watermarks NeurIPS and IceShore, the curve of our method is closer to the reference line than that of Zhang et al. [51], supporting the security of our proposed watermark.
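The threshold sweep behind such an ROC curve can be sketched as follows. This is a minimal illustration only: the function names and the toy scores are hypothetical, and the scores stand in for whatever per-image output the steganalysis tool produces. A detector whose curve sits on the reference line has an area under the curve (AUC) of about 0.5.

```python
import numpy as np

def roc_curve_points(scores, labels, thresholds):
    """Return (fpr, tpr) arrays, one point per threshold.

    scores: detector output per image (higher = more likely watermarked).
    labels: 1 for watermarked images, 0 for original images.
    Thresholds are visited from high to low, with +inf and -inf added
    so the curve always runs from (0, 0) to (1, 1).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pos, neg = labels == 1, labels == 0
    ts = [np.inf] + sorted((float(t) for t in thresholds), reverse=True) + [-np.inf]
    fpr, tpr = [], []
    for t in ts:
        flagged = scores >= t  # images the detector calls "watermarked"
        tpr.append((flagged & pos).sum() / pos.sum())
        fpr.append((flagged & neg).sum() / neg.sum())
    return np.array(fpr), np.array(tpr)

def auc(fpr, tpr):
    """Area under the ROC curve via the trapezoidal rule."""
    dx = np.diff(fpr)
    return float(np.sum(dx * (tpr[1:] + tpr[:-1]) / 2.0))

# Toy data (hypothetical): a detector that separates the classes perfectly,
# and one that gives every image the same score.
labels = [1, 1, 0, 0]
thresholds = np.linspace(0.0, 1.0, 11)
auc_separable = auc(*roc_curve_points([0.9, 0.8, 0.2, 0.1], labels, thresholds))
auc_uninformative = auc(*roc_curve_points([0.5, 0.5, 0.5, 0.5], labels, thresholds))
print(auc_separable, auc_uninformative)
```

Under this reading, a more secure watermark is one for which the detector's AUC is closer to that of the uninformative case.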
Appendix B Explanation of Methods Not Available in Experiments
The box-free watermarking method of Huang et al. [19] focuses on copyright verification for GAN-based methods and requires a discriminator specific to the GAN architecture, whereas our work focuses on diffusion-based generation methods. The attribution methods [50, 49] are based on GAN fingerprinting, whereas our scenario is diffusion-based personalized generation; in addition, the watermarks added in these works are bit messages, whereas our watermark is an image. For the source-tracing methods [38, 44], the watermarks are also bit messages, whereas our watermark is an image.
Appendix C Personalized Generation Methods
Diffusion probabilistic models [17, 33, 31] have introduced a new paradigm for generative models, particularly for personalized generation [55]. LoRA [18] incorporates a low-rank representation into the diffusion model, enabling it to generate specific character traits at a relatively minor training cost. However, LoRA still requires training, and a high-performing LoRA model needs dozens to hundreds of images, along with several hours of training time. Caption texts corresponding to these images must also be prepared. DreamBooth [32] learns the character information associated with a specific token, such as "sks person". Adding this token to the generation prompt in Stable Diffusion [31] yields an image of that character. To reduce training costs and make personalized image generation more efficient, several training-free methods have been proposed. InstantID [40] builds on a pre-trained ControlNet [52] model and, from a single image, can generate a personalized image matching that person’s face ID. PhotoMaker [25], on the other hand, directly extracts the character’s face ID information from an image and generates a personalized image for that ID, allowing faces to be overlaid.
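To make the low-rank idea behind LoRA [18] concrete, the following sketch applies a rank-4 update to a single frozen weight matrix. The layer sizes, initialisation scale, and variable names are illustrative assumptions, not settings from our experiments; the point is only that the trainable factors are far smaller than the weight they adapt.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, rank = 64, 64, 4  # illustrative sizes, not from the paper

W = rng.normal(size=(d_out, d_in))        # frozen pretrained weight
A = rng.normal(size=(rank, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, rank))               # trainable up-projection, zero-initialised

def lora_forward(x, W, A, B, scale=1.0):
    """Compute (W + scale * B @ A) @ x without forming the full update."""
    return W @ x + scale * (B @ (A @ x))

x = rng.normal(size=(d_in,))
y = lora_forward(x, W, A, B)

full_params = W.size           # parameters touched by full fine-tuning
lora_params = A.size + B.size  # parameters trained by the low-rank update
print(full_params, lora_params)
```

Because B starts at zero, the adapted layer initially reproduces the frozen model exactly, and only A and B change during training.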
Appendix D Network Architecture and Training Watermark Embedding
Our watermark embedding and decoding network is adapted from EditGuard [53]. Since our watermark contains no bit message, we modify the network into an invertible structure with images as both input and output. The detailed architecture and training process are as follows:
Watermark Encoder. It embeds a 2D image watermark into the original image, forming a container image. We do not use the bit encoder structure.
Copyright Extractor. It extracts copyright information from the container image and is robust against degradations including noise and JPEG compression.
Invertible Blocks. They are used in the encoder and extractor for precise multimedia information recovery through the Discrete Wavelet Transform (DWT) and enhanced affine coupling layers.
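The property that lets one set of weights serve for both embedding and extraction can be illustrated with a standard affine coupling layer. The sketch below is not the enhanced coupling layer of our network: the two fixed random linear maps are hypothetical stand-ins for learned subnetworks, and the DWT step is omitted. It only shows that the inverse pass recovers the inputs exactly with the same parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-ins for learned subnetworks: fixed random linear maps.
W_s = rng.normal(size=(4, 4)) * 0.1
W_t = rng.normal(size=(4, 4)) * 0.1

def coupling_forward(x1, x2):
    """Forward pass: y1 = x1, y2 = x2 * exp(s(x1)) + t(x1)."""
    s = np.tanh(x1 @ W_s)  # bounded log-scale keeps exp() well behaved
    t = x1 @ W_t
    return x1, x2 * np.exp(s) + t

def coupling_inverse(y1, y2):
    """Inverse pass with the same weights: x2 = (y2 - t(y1)) * exp(-s(y1))."""
    s = np.tanh(y1 @ W_s)
    t = y1 @ W_t
    return y1, (y2 - t) * np.exp(-s)

x1 = rng.normal(size=(3, 4))
x2 = rng.normal(size=(3, 4))
y1, y2 = coupling_forward(x1, x2)
r1, r2 = coupling_inverse(y1, y2)
print(np.allclose(r1, x1), np.allclose(r2, x2))
```

The scale and shift are computed only from the untouched half, which is why they can be recomputed identically during inversion.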
Training Watermark Embedding. The watermark encoder takes as inputs an original image and a watermark image. Because we use the same invertible network as both watermark encoder and decoder, reversing the direction of input and output decodes the watermark image with the same network and shared weights. The training target consists of two parts: (1) the watermark image is decoded correctly, and (2) the image embedded with the watermark is similar to the original image. The specific loss is given in Eq. 4:
(4) |
The network is designed to achieve precise copyright information recovery. It provides a proactive approach to revealing copyright information suitable for various AI-generated content (AIGC) methods. More detailed information is shown in Fig. 7.
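The two-part training target described in this appendix can be sketched as a sum of a watermark-recovery term and a container-fidelity term. The choice of mean squared error for both terms, the weight `lam`, and the function name are assumptions made for illustration and are not taken from Eq. 4.

```python
import numpy as np

def watermark_loss(original, container, watermark, decoded, lam=1.0):
    """Two-term objective: watermark recovery plus container fidelity.

    original:  image before embedding
    container: image after embedding the watermark
    watermark: watermark image that was embedded
    decoded:   watermark image recovered by the reverse pass
    lam:       assumed weight balancing the two terms
    """
    recover = np.mean((decoded - watermark) ** 2)    # part (1): decode correctly
    fidelity = np.mean((container - original) ** 2)  # part (2): stay close to the original
    return float(recover + lam * fidelity)

# Toy check on random arrays standing in for 256x256 RGB images.
rng = np.random.default_rng(0)
original = rng.random((256, 256, 3))
watermark = rng.random((256, 256, 3))
container = original + 0.01 * rng.normal(size=original.shape)
decoded = watermark + 0.05 * rng.normal(size=watermark.shape)
print(watermark_loss(original, container, watermark, decoded))
```

Raising `lam` favours imperceptibility of the embedded watermark, while lowering it favours recovery accuracy.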
Appendix E Details of Dataset Collection and Generation
We download photos of four well-known figures, referred to as "Diva", "Sportsman", "Actor", and "Actress", from websites such as Google Images. We first filter out photos that are unsuitable as training data, such as group photos or blurry images. Then, using OpenCV’s face detection method, we locate the face in each image and crop a square region centered on it, and finally resize the crop to 256×256. After this process, the four figures retain 175, 94, 177, and 206 photos, respectively.
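The face-centered square crop can be sketched as below. The helper name, the `scale` margin, and the clamping rule are assumptions for illustration; the face box is taken to be an `(x, y, w, h)` tuple such as a face detector would return, and the final resize to 256×256 is left to any image library.

```python
def square_crop_box(face_box, image_size, scale=1.6):
    """Return (left, top, side) of a square crop centred on a face box.

    face_box:   (x, y, w, h) of the detected face, in pixels
    image_size: (width, height) of the image
    scale:      assumed margin factor around the face
    The square is shrunk to fit the image and shifted to stay inside it.
    """
    x, y, w, h = face_box
    width, height = image_size
    side = min(int(round(max(w, h) * scale)), width, height)
    cx, cy = x + w / 2.0, y + h / 2.0          # face centre
    left = int(round(cx - side / 2.0))
    top = int(round(cy - side / 2.0))
    left = max(0, min(left, width - side))     # clamp horizontally
    top = max(0, min(top, height - side))      # clamp vertically
    return left, top, side

# A face in the middle of a 400x400 image, and one touching the top-left corner.
print(square_crop_box((100, 100, 100, 100), (400, 400)))
print(square_crop_box((0, 0, 100, 100), (400, 300)))
```

Clamping means that faces near the border yield crops that are no longer exactly centered, which is one plausible way to avoid padding.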
For the training of LoRA and DreamBooth, we use the BLIP [22] image captioning method to generate a caption for each image. These image-text pairs are then used to train LoRA and DreamBooth. PhotoMaker and InstantID do not require training.
For the prompts used in image generation, we use large language models (LLMs) such as ChatGPT [1] to generate them, with instructions such as: "Please generate 100 prompts to generate a photo of an actor." During actual generation, we add the following positive prompts: "high quality, high resolution, high definition, great face, colorful", and the following negative prompts: "(asymmetry, worst quality, low quality, illustration, 3d, 2d, painting, cartoons, sketch), open mouth, grayscale". After all images are generated, they are randomly divided into training and validation sets in an 8:2 ratio. We select several images from the dataset and present them in Fig. 8.
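The prompt assembly and the 8:2 split can be sketched as follows. The positive and negative prompt strings are quoted from the text above; the helper names, the comma used to join the base prompt with the positive prompts, and the fixed random seed are assumptions for illustration.

```python
import random

POSITIVE = "high quality, high resolution, high definition, great face, colorful"
NEGATIVE = ("(asymmetry, worst quality, low quality, illustration, 3d, 2d, "
            "painting, cartoons, sketch), open mouth, grayscale")

def build_prompt(base_prompt):
    """Return (prompt, negative_prompt) for one generation call."""
    # Joining with a comma is an assumption about how the prompts are combined.
    return f"{base_prompt}, {POSITIVE}", NEGATIVE

def split_train_val(items, train_ratio=0.8, seed=0):
    """Shuffle items and split them into training and validation lists."""
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded for reproducibility
    cut = int(round(len(items) * train_ratio))
    return items[:cut], items[cut:]

prompt, negative = build_prompt("a photo of an actor on stage")
train, val = split_train_val([f"img_{i:04d}.png" for i in range(100)])
print(prompt)
print(len(train), len(val))
```

Using a private `random.Random(seed)` instance keeps the split reproducible without altering the global random state.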
Appendix F Limitations and Discussion
Although our proposed method shows promise for IP protection, it still has limitations: (1) the dataset used for training and validation is not extensive enough; (2) our approach only supports method-specific detection and attribution, so it needs an extra training process for each new stealing method. A self-adaptive approach covering most existing personalized generation methods would be more flexible.
This study pioneers a method that combines proactive watermark detection with passive attribution to safeguard image IPs from illegal personalized generation. In future work, our approach could extend to other types and even other modalities of IP protection tasks. Furthermore, there is a lack of research on methods analogous to GAN fingerprinting for diffusion-based personalized generation models, which would distinguish between models based on features of their generated results. Such methods could enable training-free detection and attribution of AIGC models.