A. The model and its weights are directly accessible to the attacker
B. The model endpoint (API) is compromised to access the model without limits
C. Model extraction, in which the attacker uses the model to label unlabelled data and trains a surrogate model
Access scenarios for ML models. (A) A white-box setting allows the attacker full access to the model and all of its parameters but not (necessarily) to the model’s training data. (B) In a black-box scenario, the attacker has no direct access to the model but instead interacts with it over an application programming interface (API).
Process of a model extraction attack. The attacker holds auxiliary data from a similar distribution as the target model’s training data. Through query access, the attacker obtains corresponding labels for the auxiliary data. From that data and the labels, a surrogate model can be trained that exhibits similar functionality to the original model.
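The extraction process above can be sketched end to end. This is a minimal, self-contained NumPy toy: the target model, the nearest-centroid surrogate, and all data here are illustrative stand-ins, not any particular attack implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical target model: the attacker has only black-box query access.
def target_model(x):
    # Labels points by which side of a fixed hyperplane they fall on.
    return (x @ np.array([1.0, -2.0]) > 0.5).astype(int)

# Step 1: the attacker holds auxiliary data drawn from a similar
# distribution as the target model's training data.
aux_data = rng.normal(size=(1000, 2))

# Step 2: query access to the API yields labels for the auxiliary data.
labels = target_model(aux_data)

# Step 3: train a surrogate on (aux_data, labels). A nearest-centroid
# classifier stands in for a full training pipeline here.
centroids = np.stack([aux_data[labels == c].mean(axis=0) for c in (0, 1)])

def surrogate_model(x):
    dists = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

# The surrogate mimics the target model's functionality on fresh inputs.
test_data = rng.normal(size=(500, 2))
agreement = (surrogate_model(test_data) == target_model(test_data)).mean()
```

Even this crude surrogate agrees with the target on the large majority of fresh inputs, which is why query access alone is enough to put a model's IP at risk.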
Categories of watermarking techniques
- Embedding watermarks into model parameters
- Using pre-defined inputs as triggers
- Trigger dataset creation based on original training data
- Robust watermarking
- Unique watermarking
- Fingerprinting
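The trigger-based categories above share one idea: the owner teaches the model to answer a secret set of inputs with deliberately "wrong" labels, and later uses those inputs as proof of ownership. A minimal sketch, assuming a memorizing 1-nearest-neighbour model as a stand-in for a trained network:

```python
import numpy as np

rng = np.random.default_rng(1)

# Ordinary training data: the label is the sign of the feature sum.
train_x = rng.normal(size=(200, 4))
train_y = (train_x.sum(axis=1) > 0).astype(int)

# Owner-chosen trigger inputs with deliberately "wrong" labels: these points
# have a large positive feature sum, yet the owner assigns them label 0.
trigger_x = rng.normal(loc=5.0, size=(5, 4))
trigger_y = np.zeros(5, dtype=int)

# "Training" by memorization: a 1-nearest-neighbour model over the union of
# clean data and the trigger set embeds the watermark.
all_x = np.vstack([train_x, trigger_x])
all_y = np.concatenate([train_y, trigger_y])

def model(x):
    nearest = np.linalg.norm(all_x[None, :, :] - x[:, None, :], axis=2).argmin(axis=1)
    return all_y[nearest]

# Ownership check: a (possibly stolen) copy still answers the triggers the
# "wrong" way, while behaving normally on clean inputs.
watermark_match = (model(trigger_x) == trigger_y).mean()
```

Because an honestly trained model would almost never label these trigger points 0, a perfect match on the trigger set is strong evidence the model is a copy.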
Not entirely, but it is fairly robust. One could generate output with the GPT model and then use another model to reword it. Replacing only a few words, however, is still likely to preserve the signature in text generated by GPT, ChatGPT, and InstructGPT.
The shallower the network, the easier it is to remove or evade the watermark. Watermarking techniques that use a separate set of nodes for “tagging” are also relatively easy to remove.
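Why dedicated tagging nodes are fragile can be shown on a toy one-layer network. In this sketch (all weights, biases, and the pruning heuristic are illustrative assumptions), the tagging neurons fire only on the secret trigger, so they are almost silent on ordinary inputs, and a simple activation-based pruning pass singles them out:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical shallow network: 8 neurons carry the main task, and 2 dedicated
# "tagging" neurons fire only when the input aligns with a secret trigger.
W_task = rng.normal(size=(8, 16))
trigger = rng.normal(size=16)
w_tag = trigger / np.linalg.norm(trigger)
W = np.vstack([W_task, w_tag, w_tag])
b = np.concatenate([np.zeros(8), [-2.0, -2.0]])  # tag neurons need alignment > 2

def layer(x):
    return np.maximum(W @ x + b, 0.0)  # ReLU activations

# The owner's watermark check: the tagging neurons activate on the trigger.
owner_check = layer(trigger)[8:].max() > 0

# The attacker profiles activations on ordinary data; the tagging neurons
# almost never fire there, so they stand out as the quietest units...
clean = rng.normal(size=(500, 16))
mean_act = np.maximum(clean @ W.T + b, 0.0).mean(axis=0)

# ...and pruning the two least-active neurons strips exactly the watermark.
pruned_away = set(mean_act.argsort()[:2].tolist())
```

The pruned set is precisely the two tagging neurons (indices 8 and 9), leaving the task neurons, and thus the model's useful behaviour, intact.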
Closing thoughts
We hear a lot about ethics in AI, but it remains a fuzzy concept; IP is more concrete. Regulating IP in AI is as important as the AI itself, and it needs a concrete framework. Research by experts in cryptography, AI, and IP protection is invaluable for safeguarding a potentially trillion-dollar industry.