Model Extraction Attacks

Model Extraction Attacks are a class of cybersecurity threat in which an adversary replicates a machine learning model by repeatedly querying it and analyzing its responses. Because each query's output leaks information about the model's decision boundary, an attacker with only black-box API access can train a surrogate model that closely mimics the original, without ever seeing its architecture, parameters, or training data. A successful extraction undermines intellectual property and competitive advantage, and the stolen surrogate can also serve as a stepping stone to further attacks, such as crafting adversarial examples offline. Defending against model extraction requires layered measures: rate limiting to cap how much information an attacker can collect, output obfuscation (for example, returning top-1 labels rather than full confidence scores), and monitoring for anomalous query patterns.
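
A minimal sketch of the black-box querying step is shown below, using a local scikit-learn classifier as a stand-in for the proprietary victim model; in a real attack the adversary would see only a remote prediction API. All names and parameter choices here are illustrative, not a specific documented attack recipe.

```python
# Sketch of black-box model extraction: query the victim, collect
# labeled pairs, train a surrogate on the stolen labels.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# --- Victim: a proprietary model the attacker can only query ---
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
victim = RandomForestClassifier(random_state=0).fit(X, y)

def query_victim(inputs):
    """Black-box oracle: the attacker observes only predicted labels."""
    return victim.predict(inputs)

# --- Attacker: generate synthetic queries, label them via the oracle ---
rng = np.random.default_rng(0)
queries = rng.normal(size=(5000, 20))   # attacker-chosen inputs
stolen_labels = query_victim(queries)   # victim's responses

# Train a surrogate that mimics the victim's decision boundary
surrogate = LogisticRegression(max_iter=1000).fit(queries, stolen_labels)

# Fidelity: how often the surrogate agrees with the victim on fresh inputs
test = rng.normal(size=(1000, 20))
fidelity = accuracy_score(query_victim(test), surrogate.predict(test))
print(f"surrogate/victim agreement: {fidelity:.2%}")
```

The key measurement is fidelity (agreement with the victim) rather than accuracy on ground-truth labels: the attacker's goal is to copy the model's behavior, and more queries generally buy higher fidelity, which is why query budgets matter to defenders.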
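
On the defense side, the following sketch shows how rate limiting and output obfuscation might be combined in front of a deployed model. The `ProtectedEndpoint` wrapper and its parameters are hypothetical illustrations, not any particular library's API.

```python
# Sketch of two mitigations: a sliding-window rate limit per client,
# and returning only coarse top-1 labels instead of probability vectors.
import time
from collections import defaultdict, deque

class ProtectedEndpoint:
    def __init__(self, model, max_queries=100, window_seconds=60.0):
        self.model = model
        self.max_queries = max_queries
        self.window = window_seconds
        self.history = defaultdict(deque)  # client_id -> query timestamps

    def predict(self, client_id, inputs):
        now = time.monotonic()
        q = self.history[client_id]
        # Rate limiting: discard timestamps outside the sliding window
        while q and now - q[0] > self.window:
            q.popleft()
        if len(q) >= self.max_queries:
            raise RuntimeError("rate limit exceeded")
        q.append(now)
        # Output obfuscation: expose only the top-1 label, never the
        # full confidence scores, reducing the signal leaked per query
        return self.model.predict(inputs)

# Illustrative usage with the victim model from the sketch above:
#   endpoint = ProtectedEndpoint(victim)
#   labels = endpoint.predict("client-42", test[:5])
```

Truncating outputs trades a small amount of API usefulness for a large reduction in per-query leakage, while rate limiting raises the cost of collecting the large query sets that extraction typically requires; comprehensive monitoring then flags clients whose query distributions look synthetic.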