GAML-MuD2 IT

Game Theory and Machine Learning for Multi Domain Deception in IoT

Topics

GAML-MuD2 IT

Design of contexts

Prerequisite for the two topics

Building effective defensive deception requires understanding the complexity of the target environment, the nature and processes of attacks, and the strengths and weaknesses of existing countermeasures. Existing studies often treat these aspects only in general terms rather than in detail. For instance, advanced persistent threat (APT) attacks are complex and multi-stage, and the target system spans both cyber and physical layers. This thrust therefore prepares game-theoretic and machine learning defensive deception techniques for real systems. The activities in this thrust are specified below.

Revisiting network architecture

This activity investigates the network architecture parameters that matter for protection, such as points of vulnerability and network layers.

Revisiting APT

In this task, APT techniques are dissected while taking into account the complexity of deployed networks. The analysis relies on attack representations (attack graphs, networks), the inter-relations between domains, and the different episodes of an attack.

Revisiting defenses

This task formalizes the aspects that characterize APT defenses, such as intrusion detection systems, incident response and firewalls, for use in designing static and dynamic games for cyber-physical systems (CPS).

Revisiting defensive deception

This activity formalizes the aspects that characterize defensive deception against APTs in static and dynamic games.

Topic 1

Game theoretical defensive deception

Once APTs and existing countermeasures in IoT-related systems have been dissected and characterized, the main objective of this thrust is to design approaches for multi-domain and multi-objective deceptive games.

Modeling multi-objective games

Unlike single-objective games, where each player seeks to maximize or minimize a single objective function, multi-objective games (Cho et al. 2020; Asgharnia et al. 2022) involve several objectives and therefore several objective functions. For example, the defender may seek to reinforce concealment through address mutation, fingerprint anonymization and configuration diversification (Duan et al. 2018). Such a game has vector-valued utility functions, so a trade-off between the objectives must be found. Multi-objective games can be used to model and resolve scenarios in which actors face several competing or conflicting objectives, which makes it possible to account for the different dimensions of security and defense in a complex cyber environment. In cyber defense, these games can help to better understand the trade-offs and interactions between security objectives, to evaluate defensive strategies, and to make optimal decisions that respect the constraints and preferences of the actors involved.

These games have a few characteristic ingredients. They take more than one objective into account. In general these objectives may be independent, but in our context of cyber resilience they are linked, and some may be more important than others. Players have different preferences among the objectives. Because of the multiple objectives and the need to consider individual preferences, decision making is generally more complex than in single-objective games. Multi-objective games often require negotiation and cooperation between players to reach satisfactory agreements: players can exchange information, propose compromises, or establish alliances to achieve outcomes that benefit all participants. We will therefore build a cooperative game model to formalize the cooperation of defenders.

We assume that defenders can form coalitions in order to be more effective against different attacks. The characteristic function of the game evaluates the impact of each coalition of defenders in protecting the system. Well-known values and power indices from cooperative game theory will be used to measure the influence of each defender in the collective defense strategy. This will identify the actors whose withdrawal (or betrayal) would cause the most damage. Collective defense can then be strengthened by monitoring the most influential defenders more closely, or by making dummy defenders more visible in the defense strategies when they exist, and creating them otherwise. Using the interaction indices defined for cooperative games, we will measure the degree of interaction between defenders in order to know, if one defender fails, which other defender is best able to counter the attacker. We will evaluate the needs of the different coalitions of defenders and use a suitable cooperative solution concept to determine the optimal allocation of resources among them. A cost-effectiveness analysis will also determine the optimal configurations (partitions or coverings of the set of defenders) that best counter possible attacks.
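As an illustration of how a value can measure each defender's influence in a coalition, the sketch below computes exact Shapley values for a toy cooperative game. The choice of the Shapley value is an assumption made here for illustration; the project text does not name a specific value or index. The defenders, the attack classes in `COVERAGE`, and the characteristic function `protection` are all hypothetical.

```python
from itertools import permutations


def shapley_values(players, v):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += v(with_p) - v(coalition)
            coalition = with_p
    return {p: totals[p] / len(orderings) for p in players}


# Hypothetical defenders and the attack classes each one can block.
COVERAGE = {
    "ids": {"scan", "malware"},
    "firewall": {"scan"},
    "honeypot": {"lateral"},
    "logger": set(),  # dummy defender: adds nothing to any coalition
}


def protection(coalition):
    """Hypothetical characteristic function: distinct attack classes blocked."""
    covered = set()
    for defender in coalition:
        covered |= COVERAGE[defender]
    return float(len(covered))


values = shapley_values(list(COVERAGE), protection)
most_influential = max(values, key=values.get)
```

In this toy game the defender with the largest value is the one whose withdrawal removes the most protection, and the dummy defender receives a value of zero. Exact enumeration grows factorially with the number of defenders, so larger games would need sampling or a closed-form index.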

Consideration of multi-domain notion

As developed above, multi-objective game theory can help model the management of defense resources in a cyber environment. Defenders must allocate their limited resources, such as computing power, storage capacity, or security specialists, to achieve multiple objectives such as intrusion detection, incident response, and data protection. These objectives can be expressed across the cyber and physical layers of the target system simultaneously. Treating the inter-related layers as domains leads to the modeling of multi-domain games. In the second part of this research thrust, we will formalize the interactions between domains through a multi-domain game. Building on the results obtained for multi-objective games in the first part, we will analyze this game to expose and understand the relationships between attacks from different domains. The equilibria (optimal solutions) of the game will provide a better understanding of the defenders' strategies. To sum up, we will first introduce and study a multi-objective game in a single domain and then use the results to introduce and study multi-domain games. Likewise, a multi-domain game can be cast in the classical single-objective setting.
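The trade-off between objectives can be made concrete with a Pareto filter over candidate resource allocations. The sketch below is only an illustration under stated assumptions: the allocation names and their utility vectors (detection, incident response, data protection, each scored in [0, 1]) are hypothetical and are not results of the project.

```python
def dominates(a, b):
    """True if vector a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates):
    """Keep the candidates whose utility vector is not dominated by any other."""
    return {
        name: utility
        for name, utility in candidates.items()
        if not any(dominates(other, utility) for other in candidates.values())
    }


# Hypothetical utility vectors: (detection, incident response, data protection).
ALLOCATIONS = {
    "detection_heavy": (0.9, 0.4, 0.5),
    "response_heavy": (0.5, 0.9, 0.4),
    "balanced": (0.7, 0.7, 0.6),
    "underfunded": (0.4, 0.4, 0.4),
}

front = pareto_front(ALLOCATIONS)
```

Only the non-dominated allocations remain in `front`. Choosing among them requires the players' preferences over the objectives, which is where the game-theoretic analysis comes in.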

Collaborative deception with commutable profiles

In this research thrust, collaboration in both attack and defence is considered. Allies, whether in the same CPS environment or not, work together strategically to reduce the attack surface exposed to their enemies. Enemies, whether in the same organization or not, collaborate to maximize their gains. However, the literature does not consider the fact that an ally can behave as a spy within the group of defenders. In other words, a defender agent can switch roles and become an attacker, both to increase the chances that the attack succeeds and to undermine defensive deception. To reinforce its objectives, it uses camouflage to hide the fake and malicious behaviours it exploits to deceive the real defenders. An attacker can likewise switch to become a defender. These situations often occur in real scenarios related to espionage attacks. In this thrust, three main questions should be answered: (i) How can the parameters of profile commutation be formalized for the game? (ii) Which games are most appropriate to characterize this situation? (iii) How can deception games be designed to respect these situations while considering the commutation of profiles?

Topic 2

ML-based Defensive Deception

In this thrust, computational and artificial intelligence artifacts are coupled to automate the evolution of the defensive deception process and to build the resulting memory that a defender can rely on. This work package is intended to leverage ML-based techniques that are resilient to adversarial attacks and able to capture the profiles of deceptive, real and malicious objects in the system. The contribution of this thrust is fivefold: (i) to couple computational and artificial intelligence techniques to create and attach temporal and robust/reliable profiles to any component involved in the interaction between the attacker and the defender; (ii) to build a model, relying on these profiles, that virtually represents the whole deception game, including all the interactions between the players and the inter-layer flows; (iii) to investigate transfer learning and other techniques that can optimize the learning processes during the deception so that the defender can quickly observe and anticipate; (iv) to model the adversarial attacks that may be exploited to degrade the performance of classifiers, and to design adequate mitigation approaches; and (v) to consider the real-time aspects of the deceptive process in order to build evolutive intelligence that can guide the defender on the actions to perform based on the intents of the attacker. The following gives more details on each research sub-thrust.

Generating intelligence

The possible attack paths across layers will be represented using a cyber-attack graph generated from the vulnerabilities associated with each node of the network. An attack graph is a directed graph whose nodes are pre- or post-conditions of an exploit and whose edges indicate that satisfying a pre-condition enables an exploit's post-condition. Attack graphs form the knowledge of the attacker. The defender's goal, which is to take deceptive actions that prevent the attacker from taking control of the network resources, can be designed as a deceptive graph guided by the structure of the attack graph. Knowledge is built from attack graphs that contain the possible paths from the entry point (cyber, physical, node or layer) to the target point. The deceptive graph is built accordingly, with adequate attributes based on the placement of decoys. Throughout the process, these graphs are dynamically fed and recorded as conditional rules that guide movement between nodes. Concept drift will also be monitored, and mitigation methods such as Adaptive WINdowing (ADWIN) (Bifet & Gavalda 2007) will be tested to ensure temporality and robustness. Software-defined networking (SDN) techniques, which abstract networks into programmable software to control security aspects, are expected to be used to generate large amounts of traffic flow data and to train ML models to distinguish real objects from fake decoy objects. Mechanisms are proposed to profile machine learning adversarial attacks based on the attacker's strategies. Existing work generally does not consider that an attacker may plan to evade ML detectors. This aspect should be considered when designing the games. After profiling, this activity should minimize the effects of ML attacks on the delivery performance.
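A minimal sketch of the attack-graph idea, assuming a small hypothetical network: conditions are nodes, an edge means that holding one condition enables the next, and a decoy is placed on the intermediate condition shared by the most attack paths. The graph contents and the "most shared node" placement heuristic are illustrative assumptions, not the project's method.

```python
# Hypothetical attack graph: condition -> conditions it enables.
ATTACK_GRAPH = {
    "internet_access": ["web_server_shell", "vpn_credentials"],
    "web_server_shell": ["db_access", "plc_gateway_access"],
    "vpn_credentials": ["plc_gateway_access"],
    "db_access": [],
    "plc_gateway_access": ["plc_control"],
    "plc_control": [],
}


def attack_paths(graph, source, target, path=None):
    """Enumerate all simple paths from source to target by depth-first search."""
    path = (path or []) + [source]
    if source == target:
        return [path]
    found = []
    for nxt in graph.get(source, []):
        if nxt not in path:  # avoid revisiting a condition
            found.extend(attack_paths(graph, nxt, target, path))
    return found


def best_decoy_node(paths):
    """Pick the intermediate node that appears on the largest number of paths."""
    counts = {}
    for p in paths:
        for node in p[1:-1]:  # skip the entry point and the target
            counts[node] = counts.get(node, 0) + 1
    return max(sorted(counts), key=counts.get)


paths = attack_paths(ATTACK_GRAPH, "internet_access", "plc_control")
decoy = best_decoy_node(paths)
```

Here the entry point is on the cyber side and the target on the physical side, so each enumerated path crosses layers. A deceptive graph could attach decoy attributes to the selected node and be regenerated whenever the attack graph is updated.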

Adversarial attacks mitigation

In this activity, generative models such as Defense-GAN, which relies on Generative Adversarial Networks (GANs), are studied to protect the different classifiers against adversarial attacks. These techniques will also be explored to intentionally generate fake images of the systems in order to increase concealment from the attacker (Lopes Antunes and Llopis Sanchez 2023). Building the deception games therefore involves considering the vulnerable parameters of machine learning that the attacker may exploit.

Training optimization during deception

In this activity, transfer learning will be explored to avoid costly re-training. The constraint is to find a model that fits the deceptive environment and the type of attacks well. Incremental and online learning will be investigated to find schemes that add the current instance, and even a newly detected class of attack, to the previous knowledge instead of regenerating the knowledge from scratch. Feature relevance techniques would also be useful to compress the information needed to build the deception and attack intelligence.
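The following sketch illustrates incremental learning with a deliberately simple model: a nearest-centroid classifier whose class centroids are updated one instance at a time and which accepts a previously unseen attack class without re-training. The classifier, the two-dimensional feature vectors and the class labels are hypothetical choices made for illustration only.

```python
class OnlineCentroidClassifier:
    """Nearest-centroid classifier updated incrementally, one instance at a time."""

    def __init__(self):
        self.centroids = {}  # label -> running mean of feature vectors
        self.counts = {}     # label -> number of instances seen

    def partial_fit(self, x, label):
        """Fold one instance into the model; an unseen label creates a new class."""
        if label not in self.centroids:
            self.centroids[label] = list(x)
            self.counts[label] = 1
            return
        self.counts[label] += 1
        n = self.counts[label]
        centroid = self.centroids[label]
        for i, xi in enumerate(x):
            centroid[i] += (xi - centroid[i]) / n  # running-mean update

    def predict(self, x):
        """Return the label whose centroid is closest in squared Euclidean distance."""
        def sq_dist(c):
            return sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        return min(self.centroids, key=lambda lbl: sq_dist(self.centroids[lbl]))


clf = OnlineCentroidClassifier()
# Hypothetical normalized flow features and labels.
stream = [
    ((0.1, 0.2), "benign"), ((0.2, 0.1), "benign"),
    ((0.9, 0.8), "scan"), ((0.8, 0.9), "scan"),
]
for features, label in stream:
    clf.partial_fit(features, label)

# A newly detected attack class is added without revisiting earlier data.
clf.partial_fit((0.1, 0.9), "exfiltration")
```

Each update touches only one centroid, so the cost of adding an instance or a class does not depend on how much data was seen before. A real deployment would likely need a richer model and a forgetting mechanism.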

Reinforcement Learning (RL) and Deep Reinforcement Learning (DRL)

In this activity, deception games formulated as Markov Decision Processes (MDPs) and solved with RL and DRL techniques such as Q-learning will be designed to find an optimal deployment policy for deceptive resources, such as honeypots, in the different domains or layers of the CPS (Li et al. 2022). This orientation has three main advantages: exploiting RL rewards to formulate the players' utility functions, designing evolutive learning without prior datasets, and hybridizing games with ML parameters. RL and DRL require behaviours to be defined manually through reward functions, which drive learning from the environment. This process takes a lot of time and is therefore not appropriate for defining every possible set of rules and rewards in environments that need a high degree of flexibility and adaptability. For systems in military domains, Imitation Learning (IL) techniques (Ogenyi et al. 2019; Zare et al. 2023) might be exploited to provide demonstrations, thus removing the need to develop explicit reward functions. IL could be leveraged to accelerate learning during deception and to design the utility functions by imitating an expert's behaviour through the provision of demonstrations.
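A minimal tabular Q-learning sketch for honeypot placement, under strong simplifying assumptions: the attacker crosses two layers, each offering two candidate nodes, and follows a fixed stochastic policy; the defender chooses which node of the next layer hosts the honeypot. The transition probabilities in `P_NODE0`, the rewards and the hyperparameters are hypothetical.

```python
import random

# Hypothetical attacker model: at layer s the attacker moves to node 0 of the
# next layer with probability P_NODE0[s], otherwise to node 1.
P_NODE0 = {0: 0.9, 1: 0.1}
ACTIONS = (0, 1)  # which node of the next layer hosts the honeypot


def step(state, action, rng):
    """Simulate one attacker move; return (next_state, reward), None = terminal."""
    attacker_node = 0 if rng.random() < P_NODE0[state] else 1
    if attacker_node == action:
        return None, 1.0   # attacker lands on the honeypot
    if state == 1:
        return None, -1.0  # attacker passes the last layer and reaches the target
    return state + 1, 0.0  # attacker advances to the next layer


def q_learning(episodes=20000, alpha=0.05, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in P_NODE0 for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state is not None:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            next_state, reward = step(state, action, rng)
            future = 0.0 if next_state is None else max(
                q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            state = next_state
    return q


q_table = q_learning()
policy = {s: max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in P_NODE0}
```

The learned policy places the honeypot on the node the attacker is most likely to visit at each layer. The reward here is hand-written, which is exactly the manual effort that imitation learning is meant to reduce.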

Need more information?

For more information, feel free to email us at any time at the address below.

[email protected]
(237) 694 485 416 / (237) 696 465 767

Copyright 2024

University of Ngaoundere