Where are the assets that need to be protected?
Security and Privacy considerations in Artificial Intelligence & Machine Learning, Part 2: The New Assets
Note: This is Part 2 of a series of articles on ‘Security and Privacy in Artificial Intelligence & Machine Learning’. Here are the links to all articles (so far): Part-1, Part-2 (this article)
In the previous article, we looked at the key challenges that emerge when the AI & ML ecosystem and its workflows are considered end to end from a ‘traditional’ cybersecurity standpoint (without going much into the AI & ML specifics of the workflows). In this part, we will begin zooming in on the AI & ML components, starting with an exploration of the interesting new assets that AI & ML bring into the picture. This exercise will make us cognizant of these assets, so that we can treat them alongside other critical information assets when we perform threat modeling and choose security techniques for the end-to-end system. With the assets identified, it becomes easier to consider the ways attackers can breach the Confidentiality, Integrity or Availability (CIA) of these assets and the impact each type of breach may have on the business, and to systematically ensure that we design and implement appropriate protections against those attacks.
In the previous article, we discussed how large volumes of sensitive business data may be involved at various points in end-to-end AI & ML workflows, and how we are already looking at a handful of security and privacy challenges owing to (a) a profusion of new tools and frameworks, (b) new types and combinations of systems and sub-systems and (c) the new stakeholders involved. We will move forward from there and peel back the other layers to identify interesting assets ‘downstream’ of the volumes of business data that are used for ‘learning’.
Interesting ‘new assets’ that AI & ML introduce
At a very simple level, a lot of ML algorithms (especially ones concerned with prediction or classification) essentially try to work on a numerical problem that looks like:
y = w.x + b
Here ‘x’ represents the inputs (or features) and ‘y’ the corresponding outputs or outcomes as observed in past data.
So, in a home sales prediction context, ‘x’ may be the attributes of homes that influence their price (such as the built-up area, the yard size, the locality, the condition, etc.) and ‘y’ may be the prices that home sales have fetched in the last few months. The task of the algorithm is to discover the optimal ‘w’ and ‘b’ that explain the past data and that can be used to make good future predictions ‘y’ given previously unseen inputs ‘x’. This process of working out the ‘w’ and the ‘b’ (referred to collectively as ‘weights’ or ‘w’ hereafter) is called ‘training’ or ‘learning’.
Once the algorithm ‘learns’ the weights ‘w’, we can use them to predict what a home newly placed on the market will likely sell for. (In most real-world problems, the evaluation of ‘w’ involves computationally intensive and expensive operations on very large matrices.)
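To make the ‘training’ and ‘prediction’ steps concrete, here is a minimal sketch in Python using NumPy. The home attributes, the prices and the ‘true’ weights are all made up for illustration, and ordinary least squares stands in for whatever learning algorithm a real system would use.

```python
import numpy as np

# Hypothetical past sales. Each row of X is one home:
# [built-up area (sq ft), yard size (sq ft)].
X = np.array([
    [1000.0, 200.0],
    [1500.0, 300.0],
    [2000.0, 250.0],
    [2500.0, 500.0],
])

# Synthetic prices generated from known weights, so we can check
# that training recovers them. Real data would be noisy.
true_w = np.array([150.0, 20.0])
true_b = 50_000.0
y = X @ true_w + true_b

# 'Training': find the w and b that best explain the past data.
# Appending a column of ones lets least squares learn b along with w.
X_aug = np.hstack([X, np.ones((X.shape[0], 1))])
solution, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
w, b = solution[:-1], solution[-1]


def predict(x_new, w, b):
    """Prediction is just a dot product with the weights plus the bias."""
    return x_new @ w + b


# 'Prediction' for a home the model has never seen.
new_home = np.array([1800.0, 400.0])
predicted_price = predict(new_home, w, b)
print(w, b, predicted_price)
```

Note that once ‘w’ and ‘b’ are known, the training data is no longer needed to make predictions. That is one reason the learned weights deserve attention as an asset in their own right.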
Against the backdrop of this brief overview, let us look at the interesting new assets that emerge from AI & ML:
1. Features
In many ML problems, data scientists work closely with domain experts to devise the best ‘representation’ of the data for machine learning algorithms. This is called ‘feature engineering’ and can take a lot of effort and insight. Domain experts supply the intuitions about which features are worth considering (and which are not), while data scientists or statisticians work out the most appropriate ways to factor those features in. A good choice and combination of features can yield better results than a poor one even when both start from the same training data, and that makes the artifacts of ‘feature engineering’ important assets from a data protection standpoint.
(These days, it is becoming more common for the model to ‘learn’ these features by itself ― especially in larger systems. The technique is called ‘feature learning’ or ‘representation learning’ and the rationale is to feed in all available inputs and let the algorithm (internally) figure out which features matter and which don’t. When that happens, the ‘features’ remain internal to the model. That is, there is no explicit artifact called ‘features’ to worry about protecting. However, where features are hand-engineered they represent a valuable artifact that needs to be treated just like any other information asset.)
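As an illustration of what a hand-engineered feature artifact might look like, here is a small Python sketch for the home sales example. Every field name, every transformation and the list of ‘premium’ localities is hypothetical. The point is only that such code captures domain knowledge that someone had to work out.

```python
import math

# Hypothetical domain knowledge: localities that experts say command a premium.
PREMIUM_LOCALITIES = {"riverside", "old-town"}


def engineer_features(home):
    """Turn a raw home record (a dict) into a numeric feature vector."""
    built_up = home["built_up_sqft"]
    yard = home["yard_sqft"]
    return [
        math.log(built_up),                      # compress the wide range of areas
        yard / (built_up + yard),                # share of the plot that is yard
        home["sale_year"] - home["year_built"],  # age of the home at sale
        1.0 if home["locality"] in PREMIUM_LOCALITIES else 0.0,
    ]


raw_home = {
    "built_up_sqft": 2000.0,
    "yard_sqft": 500.0,
    "year_built": 1990,
    "sale_year": 2020,
    "locality": "riverside",
}
features = engineer_features(raw_home)
print(features)
```

Anyone who obtains a function like this learns which attributes the organization found predictive and how it combines them, without needing access to the training data.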
A typical ‘deep’ neural network
2. Model Hyper-parameters
Most machine learning algorithms have several ‘settings’ that can be tweaked to modify the behavior of the algorithm. These settings can be thought of as ‘design choices’ that define the physical characteristics and behavior of the underlying machine learning model. For example, when a linear regression model is trained iteratively (e.g., with gradient descent), the ‘learning rate’ influences how fast the model converges (or fails to converge) in its search for the optimal weights. In the case of deep neural networks (see pic above), there are many other choices, such as the number of layers (the depth of the network), the number of neurons in each layer (the height of the layer, more commonly called its ‘width’), the batch size to use during training, the number of passes to make over the training data, the optimization method to use, and so on.
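The effect of a single setting such as the learning rate can be seen in a few lines of Python. The sketch below is only an illustration: the data is synthetic, the function name `train_linear` and the specific values are assumptions, and plain batch gradient descent stands in for a production optimizer.

```python
import numpy as np


def train_linear(x, y, learning_rate, epochs):
    """Fit y = w*x + b by batch gradient descent.

    Returns (w, b, final mean squared error).
    """
    w, b = 0.0, 0.0
    for _ in range(epochs):
        error = (w * x + b) - y
        # Gradients of the mean squared error with respect to w and b.
        w -= learning_rate * 2.0 * np.mean(error * x)
        b -= learning_rate * 2.0 * np.mean(error)
    mse = float(np.mean(((w * x + b) - y) ** 2))
    return w, b, mse


# Synthetic data generated from w = 3, b = 1.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 3.0 * x + 1.0

# A set of hyper-parameters that works for this data.
hyper_parameters = {"learning_rate": 0.05, "epochs": 2000}
w, b, mse = train_linear(x, y, **hyper_parameters)

# Same data, same algorithm, but a learning rate that is too large:
# each step overshoots and the error grows instead of shrinking.
_, _, mse_bad = train_linear(x, y, learning_rate=0.5, epochs=50)

print(w, b, mse, mse_bad)
```

With the first setting, ‘w’ and ‘b’ should end up close to 3 and 1. With the second, training diverges. The `hyper_parameters` dictionary is tiny here, but it is exactly the kind of artifact that takes experimentation to arrive at.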
These settings are called “hyper-parameters” because their choice influences the eventual “parameters” (i.e., the coefficients or weights) that the model learns from the training data. In larger problems, dozens of such choices may be involved, and it takes much work to discover and settle upon a combination that provides the desired outcomes. In problem contexts where the data itself is not unique (e.g., image recognition in a scenario where millions of images are available to all parties), these hyper-parameters represent a competitive edge. In other words, once you have invested a lot of hard work to create a model that has started producing great results, the respective hyper-parameters are no different from any other ‘high value asset’ (HVA) for your organization, and it becomes important to think about protecting them wherever they may reside.
3. Weights or Coefficients
Similar to hyper-parameters, the weights/coefficients (the ‘w’ and the ‘b’ from the “y = w.x + b” above) learned by the model represent all the invaluable ‘insights’ that the model has gleaned from the millions of records of data it pored over in the training phase. A future prediction from the model is a simple (and often quick) mathematical operation on the new data point using these weights. Just like hyper-parameters, these weights are ‘reusable’. Moreover, they are even more ready-made for reuse than hyper-parameters. Using a technique called ‘transfer learning’, other