End-to-end Information Flow Security Model for Software-Defined Networks

Software-defined networking (SDN) is a novel networking paradigm that has become an enabling technology for many modern applications such as network virtualization, policy-based access control and many others. Software provides flexibility and fast-paced innovation in networking; however, it is complex by nature, so there is a growing need for means of assuring its correctness and security. Abstract models for SDN can tackle these challenges. This paper addresses confidentiality and some integrity properties of SDNs. These properties are critical for multi-tenant SDN environments, since the network management software must ensure that no confidential data of one tenant are leaked to other tenants despite sharing the same physical infrastructure. We define a notion of end-to-end security in the context of software-defined networks and propose a semantic model in which it is possible to reason about confidentiality and to check that confidential information flows do not interfere with non-confidential ones. We show that the model can be extended to reason about networks with secure and insecure links, which can arise, for example, in wireless environments. The article is published in the authors’ wording.


Introduction
The traditional approach to networking assumes that a network is built from vendor-specific hardware tightly coupled with proprietary software that implements distributed protocols. Protocols can provide various services including topology discovery, routing, access control, quality of service and other features. Network operators must install these devices and configure every protocol they intend to use. This tight integration of forwarding and control functionality within proprietary devices restricts innovation and slows down the introduction of new network services. Bringing open standards and programmability to networks is the key motivation for software-defined networks (SDN).
Software-defined networks have drawn a lot of attention in recent years and provide a rich set of concepts for centralized management of modern networks. The main aim of SDN is to provide general principles of packet forwarding and to decouple control software from forwarding devices. This makes it possible to bring innovations to networks without changing the underlying hardware, relying only on a well-defined standard collection of packet-processing functions that forms the body of the OpenFlow standard [9]. A software controller provides centralized management and orchestration of the whole network by inspecting network packets and installing forwarding rules on the switches under its management.
However, the standard does not solve security problems, which are a great challenge in today's networking [12]. The centralized control of SDN can help in enforcing security strategies; however, the lack of models makes this problem challenging [4]. We can discuss the security of SDN in three dimensions: integrity, availability and confidentiality. Integrity means that no data is corrupted due to internal or external events or misconfiguration. This problem was the focus of [1], where the authors propose a model checking-based approach to find configuration inconsistencies that can lead to network partitioning. Availability means that data are available when needed. To some extent this property is achieved by load balancing in SDN [8].
Confidentiality means that secret data cannot be inferred by an attacker or leaked unintentionally. This policy can be imposed by using access control lists, encryption, etc. One of the recent attempts to introduce access control lists to SDN is [5]. However, access control does not prevent leaks of confidential data through improperly configured or buggy software [11]. The confidentiality property can be understood in a broad sense, so we focus on end-to-end confidentiality. We assume that an attacker can observe non-confidential entities of the network and has limited access to the network infrastructure. Confidentiality can be achieved to some extent when network resources are separated from each other in slices [7]; however, slices isolate flows in the network and are therefore too restrictive. Since network control in SDN is implemented in software, it is natural to try to apply security methods that were developed for programming languages [11].
There is extensive work on semantic foundations of network programming languages that can provide a solid basis for reasoning about networks. One of the first attempts was the Frenetic language [6], which provided abstractions for SDN programming and means for combining these abstractions in a meaningful and consistent way. The NetKAT project [2] defined a semantics that can help to prove reachability in networks (which is an integrity property) and address several security properties at once; however, the decision procedure for this formalism has PSPACE complexity. Focusing only on confidentiality may reduce the complexity of verification. The confidentiality property was investigated for programming languages [11] and implemented for model [3] and industry-level languages [13]. This approach is based on rigorous semantic rules that impose restrictions on information flows in programming languages.
In this paper we propose a framework for checking confidentiality on-the-fly for modern SDNs that conform to the OpenFlow standard. We introduce a set of semantic rules that help us to verify that a controller application does not allow information flows that violate confidentiality. We assume that the network consists of high and low security nodes, and later we extend our concept to a model of a network that can contain secure and insecure links.
Consider a simple model of a software-defined network. Let us assume that the network consists of endpoints, or hosts, that generate data traffic and a set of uniform intermediate nodes that forward the traffic. These forwarding devices are OpenFlow switches that conform to the standard [9]. There is a single node representing a controller application that manages all the switches over secure channels. Thus, a network can be represented as a graph where the nodes are either hosts or switches, and the edges are links.
An OpenFlow switch contains a set of physical or logical ports, which are interfaces for passing packets between the switch and the network. According to the specification [10], the OpenFlow switch consists of an OpenFlow channel, one or more flow tables, and a group table. The OpenFlow channel is used for managing the switch and for passing relevant data about the traffic under management to the controller. Flow tables provide means for forwarding and processing packets. The controller can add, update or delete flow entries in flow tables. Such an entry consists of match fields, counters and a set of instructions to apply to matching packets. The group table enables additional methods of forwarding by representing a set of ports as a single entity. Group tables therefore do not represent a fundamentally different abstraction and can be modeled via flow tables, so we exclude them from consideration.
Each arriving packet is matched against flow table entries starting from the first one. If a match is found, the instructions associated with this flow entry are executed. If the packet matches no table entry, the outcome depends on the table-miss flow entry. Such a packet can be passed to the controller, dropped or handed to the next flow table. We will assume that the packet is passed to the controller.
The match field is a predicate which partitions the set of all flows passing through the network. The standard [10] proposes that the match field is a conjunctive predicate where each conjunct can impose conditions on various packet headers including Ethernet, IP, TCP, etc. Each flow has a source and a destination host. Thus, the match field can be modeled as m_src ∧ m_dst, where m_src and m_dst are conjuncts for matching the source and destination hosts of the flow, respectively.
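The lookup described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the names Packet, FlowEntry and lookup are hypothetical, a packet carries just a source and a destination address, and a table miss is reported as None to stand for "pass the packet to the controller".

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Packet:
    src: str  # source host address
    dst: str  # destination host address


@dataclass
class FlowEntry:
    m_src: Callable[[str], bool]  # conjunct on the source host
    m_dst: Callable[[str], bool]  # conjunct on the destination host
    actions: List[str]            # instructions applied on a match

    def matches(self, pkt: Packet) -> bool:
        # The match field is modeled as the conjunction m_src AND m_dst.
        return self.m_src(pkt.src) and self.m_dst(pkt.dst)


def lookup(table: List[FlowEntry], pkt: Packet) -> Optional[List[str]]:
    """Return the actions of the first matching entry, or None on a table miss."""
    for entry in table:
        if entry.matches(pkt):
            return entry.actions
    return None  # table miss: in this model the packet goes to the controller


# Hypothetical one-entry table: traffic from 10.0.0.1 to 10.0.0.2 leaves via port 2.
table = [FlowEntry(lambda s: s == "10.0.0.1", lambda d: d == "10.0.0.2", ["Output(2)"])]
```

Entries are scanned in order, so an earlier entry shadows a later one that matches the same packet.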
Counters are variables that contain statistical information about flows. Since counters have no direct impact on forwarding, we exclude them from consideration.
Let us consider instructions that can be executed if a packet matches a flow table entry. The standard proposes that instructions are lists of actions. Some of these actions are required to be implemented by switch designers and the rest are optional.
The actions are executed in the order specified by the list and are applied immediately to the packet. We consider only the following actions:
• Output(port). This action specifies the port to which the associated packet will be forwarded.
• Drop. The packet can be discarded from the network using this action.
• Set. The optional Set action modifies packet header fields, such as IP and MAC addresses, various tags, etc.
• Delete. This action deletes flow entries according to a match.
We limit ourselves to the listed actions in order to capture the most relevant OpenFlow processing features without overloading the model.
The packet processing model is the following. Upon receiving an incoming packet p, the controller emits an ordered list of match fields, each of which is paired with an action. This list is installed on the switch.
The controller software implements specific network applications, of which there are many. For example, the controller can implement a simple hub application, where it installs forwarding rules on a switch such that an incoming packet is flooded to all switch ports except the ingress port. Another application is a learning switch, where the controller determines which subnets are reachable from different switch ports and installs forwarding rules in such a manner that an incoming packet goes to a port from which its destination host is reachable; otherwise the packet is flooded. The controller can also implement various security policies, for example, forwarding packets from authenticated hosts and dropping packets from other hosts.

End-to-end Security Model for SDN
The controller application gathers all the information about the network under management, so we can assume that the security level of each endpoint is known. The security level can be discovered using some kind of protocol or can be defined ad hoc. For the sake of simplicity we assume that there are two security levels of endpoints: high and low. Since a host is identified by its IP address, we can assume that the controller maps the space of IP addresses of the network under management into a set of security levels. We denote the security level of a host h as h : low or h : high.
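One way a controller could map IP addresses to security levels is sketched below with Python's standard ipaddress module. The subnet list HIGH_SUBNETS and the function security_level are hypothetical and chosen only for illustration; the model itself does not prescribe how the mapping is obtained.

```python
import ipaddress

# Hypothetical configuration: every address in these subnets is a high host.
HIGH_SUBNETS = [ipaddress.ip_network("10.0.0.0/24")]


def security_level(ip: str) -> str:
    """Map a host address to 'high' or 'low' (assumed two-level lattice)."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in HIGH_SUBNETS):
        return "high"
    return "low"
```

With this assumed configuration, security_level("10.0.0.2") yields "high" and any address outside 10.0.0.0/24 yields "low".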
For further discussion we need means for reasoning about sets of hosts. The network itself, or any of its subnets, may aggregate hosts with different security levels. We define security predicates exists and forall that give us a security type for a set of hosts {h_1, ..., h_n}, as shown in Fig. 1.
If the set of hosts is homogeneous, i.e. all hosts have the same security level, the predicate forall can be typed with the same security type as any host in the set. On the other hand, the exists predicate is high only if the set contains a high host. This predicate cannot be typed as low; it will be seen later that we only need to check whether a high security host can be reached.
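The two predicates can be sketched as partial functions that return a security type when one can be inferred and None otherwise. The function names follow the text; the dictionary level, mapping each host to its security level, and the treatment of the empty set as untypable are assumptions of this sketch.

```python
from typing import Dict, Iterable, Optional

HIGH, LOW = "high", "low"


def forall(hosts: Iterable[str], level: Dict[str, str]) -> Optional[str]:
    """Type of a homogeneous host set; None if the set is mixed (or empty)."""
    levels = {level[h] for h in hosts}
    return levels.pop() if len(levels) == 1 else None


def exists(hosts: Iterable[str], level: Dict[str, str]) -> Optional[str]:
    """'high' if some host in the set is high; never typed as 'low'."""
    return HIGH if any(level[h] == HIGH for h in hosts) else None


# Hypothetical hosts for illustration.
level = {"a": HIGH, "b": LOW, "c": LOW}
```

Here forall(["b", "c"], level) gives "low", forall(["a", "b"], level) cannot be typed, and exists(["a", "b"], level) gives "high".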
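One of the primary functions of the controller is routing, which is essentially reasoning about reachability in networks. We model the network as a graph, so we are faced with a graph reachability problem; in what follows, reachable(s, port) denotes the set of hosts that can be reached from switch s through its port port. More accurate algorithms that take into account network policies can be found in [2].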
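A minimal breadth-first sketch of reachable(s, port) is given below. The graph encoding (a dictionary links from switch to port to neighbour, plus a set hosts) is an assumption, and so is the choice not to route back through the starting switch s, which keeps the result specific to the chosen port.

```python
from collections import deque
from typing import Dict, Set


def reachable(links: Dict[str, Dict[int, str]], hosts: Set[str],
              s: str, port: int) -> Set[str]:
    """Hosts reachable from switch s when leaving through the given port."""
    start = links[s][port]
    seen = {s, start}          # assumption: paths do not re-enter switch s
    queue = deque([start])
    found: Set[str] = set()
    while queue:
        node = queue.popleft()
        if node in hosts:
            found.add(node)    # hosts are endpoints and do not forward traffic
            continue
        for neighbour in links.get(node, {}).values():
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return found


# Hypothetical two-switch topology.
hosts = {"h1", "h2", "h3"}
links = {"s1": {1: "h1", 2: "s2"}, "s2": {1: "s1", 2: "h2", 3: "h3"}}
```

For this topology, reachable(links, hosts, "s1", 2) is {"h2", "h3"}.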
The data plane of the network is represented by switches that use flow tables to implement network policies. Each flow table entry contains a match field that is modeled as a predicate match = m_src ∧ m_dst. We define the functions src and dst that map a predicate to the sets of source and destination hosts, respectively, for which the predicate is true.
The next part of the model is a packet processing context. When the OpenFlow switch cannot match the packet to any flow table entry, the model assumes that the packet is forwarded to the controller. The controller can examine the headers of the packet and determine the host that emitted it. The security type of the host determines the packet processing context, so we can analyze whether the controller generates a secure response to the packet or not.
A security-type system can help to reason about the security type of a single interaction between a switch and a controller. Figure 2 presents typing rules for instructions that can be installed on a switch s by the controller in response to a packet pkt. We can use the presented security-type system for inferring a type of the interaction. If the type can be inferred, the interaction is secure; otherwise it allows leaks of confidential data.
Let us consider the proposed set of typing rules in more detail. Rule 4 assigns a type to a packet processing context in such a way that the context pc agrees with the security type of the source host of the packet pkt. The packet processing context is a virtual action in the list formed by the controller.
For the Drop action (rule 5) we strictly isolate flows of different security levels, that is, the source host of the flow must correspond to the context of the action. Such a type setting prevents interference between packet processing contexts and actions of different security levels. Violating this can lead to a covert channel in which low hosts discover that a high host installs a Drop action by observing occasional drops. Assigning the low type to the Drop action ensures that under a low security packet processing context a drop can occur only for low security flows. The non-interference property holds even if we allow a low security packet processing context to drop high security flows, since no information about high security flows can be inferred. However, we do not allow this, and thereby guarantee that the integrity of high security flows cannot be broken by low hosts.
The type of the Output(port) action depends heavily on the matching condition (rules 6-7). If the match forwards traffic to high security hosts, there must be a high security host reachable from the port, i.e. the premises exists(dst(match)) : high and exists(reachable(s, port)) : high must hold (rule 6). In this case the security context of Output(port) is high. If the source of the traffic is a low security host, i.e. forall(src(match)) : low holds (rule 7), the traffic can be forwarded anywhere and the security context of this action is low. The Output(port) action cannot be typed if the match condition specifies that traffic from high security hosts must be forwarded to a low security host. If this is the case, forall(src(match)) cannot be typed as low and exists(dst(match)) cannot be typed as high, so the premises of both rules do not hold.
Rules 8-9 for the Delete action guarantee that the eviction of flows from the flow table of the switch is done in the respective security context. So, a low packet cannot be a reason to remove high matches and vice versa.
Rule 10 guarantees that a low security flow cannot become a high security flow by changing the source address of the packet. By imposing this condition we achieve a certain level of integrity, since a low packet cannot become a high packet that may later influence other high security flows. Rule 11 ensures that a high security flow stays high, so that no information leaks to the low security plane. In both rules, pattern denotes the data that have to be written to the packet header.
The controller can respond with several actions at once, so we must have means for inferring a security type for a list of actions. This can be done using rule 12, which assigns a security type to a composition. Here, A and B can be either single actions or lists of actions.
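The way these rules combine can be illustrated with a small checker. It is a sketch under explicit assumptions, not the system of Fig. 2 itself: only the packet processing context, Drop and Output are covered (rules 4-7 as described in the text), and composition (rule 12) is assumed to require that every action can be given the same type as the context. Action, possible_types and type_of_list are hypothetical names; reach stands for precomputed values of reachable(s, port).

```python
from typing import Dict, List, NamedTuple, Optional, Set

HIGH, LOW = "high", "low"


class Action(NamedTuple):
    kind: str                   # "Output" or "Drop"
    src: Set[str]               # src(match)
    dst: Set[str]               # dst(match)
    port: Optional[int] = None  # only used by Output


def possible_types(a: Action, level: Dict[str, str],
                   reach: Dict[int, Set[str]]) -> Set[str]:
    """All security types that can be inferred for one action."""
    src_levels = {level[h] for h in a.src}
    types: Set[str] = set()
    if a.kind == "Drop":
        # Rule 5 (as described): the type follows a homogeneous source set.
        if len(src_levels) == 1:
            types |= src_levels
    elif a.kind == "Output":
        dst_high = any(level[h] == HIGH for h in a.dst)
        port_high = any(level[h] == HIGH for h in reach[a.port])
        if dst_high and port_high:   # rule 6
            types.add(HIGH)
        if src_levels == {LOW}:      # rule 7
            types.add(LOW)
    return types


def type_of_list(pkt_src: str, actions: List[Action], level: Dict[str, str],
                 reach: Dict[int, Set[str]]) -> Optional[str]:
    """Type of the whole response, or None if it cannot be typed."""
    pc = level[pkt_src]              # rule 4: context follows the packet source
    for a in actions:                # assumed reading of rule 12: all must agree
        if pc not in possible_types(a, level, reach):
            return None
    return pc


# Hypothetical hosts and per-port reachability on one switch.
level = {"h1": HIGH, "h2": LOW, "h3": HIGH}
reach = {1: {"h1"}, 2: {"h2"}, 3: {"h3"}}
```

Under these assumptions a response to a packet from the high host h1 that outputs towards the low host h2 cannot be typed, while a low packet may be output anywhere.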
The proposed rules constitute a security-type system which describes what security type must be assigned to a list of actions. This list of actions is formed by a controller in response to a packet incoming from the switch. The packet specifies the first action in the list, called a packet processing context. If the whole list can be typed using the proposed security-type system, the list ensures non-interference among flows of different security levels and fulfills some integrity properties.
This security-type system can be further extended to SDNs with insecure links. We define an insecure link as a channel that cannot be trusted because it is exposed to everyone, like a Wi-Fi medium, or because it is a public channel shared by various tenants. This setting leads to a new confidentiality violation, since high data traffic may be forwarded to an insecure link. Note that any link can be secured using traffic encryption. We propose the following extension to our model. Let us assume that every link has a security level (high for secure links and low for insecure ones) that is known to the controller, in the same way as for endpoints. We must also provide means of reasoning about secure paths in the network.
Let us define a function reachable_s(s, port) that calculates the set of hosts that are reachable via paths in which every link is secure. Since the controller has the information about the network graph, this can be done using breadth-first search or by taking into account current network policies [2].
Since a confidentiality flaw can occur when high traffic is forwarded to an insecure link, we only need to refine rule 6, which is used for inferring the type of the Output action for high traffic. We propose the following change: the reachability premise of rule 6 is stated in terms of reachable_s(s, port) instead of reachable(s, port). Thus, we allow high traffic only on those switch ports that start with a secure link and from which the destination host can be reached using a secure path.
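The effect of restricting reachability to secure links can be sketched as follows. The encoding is an assumption: each port maps to a pair (neighbour, secure), and output_can_be_high is a hypothetical helper that checks the two premises for typing an Output action as high, with reachability computed over secure links only.

```python
from collections import deque
from typing import Dict, Set, Tuple

Links = Dict[str, Dict[int, Tuple[str, bool]]]  # switch -> port -> (neighbour, secure?)


def reachable_secure(links: Links, hosts: Set[str], s: str, port: int) -> Set[str]:
    """Hosts reachable from (s, port) along paths made of secure links only."""
    first, secure = links[s][port]
    if not secure:
        return set()            # the port itself starts with an insecure link
    seen = {s, first}
    queue = deque([first])
    found: Set[str] = set()
    while queue:
        node = queue.popleft()
        if node in hosts:
            found.add(node)
            continue
        for neighbour, ok in links.get(node, {}).values():
            if ok and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return found


def output_can_be_high(dst: Set[str], level: Dict[str, str], links: Links,
                       hosts: Set[str], s: str, port: int) -> bool:
    """Assumed refined premises: a high destination and a high host on a secure path."""
    dst_high = any(level[h] == "high" for h in dst)
    path_high = any(level[h] == "high" for h in reachable_secure(links, hosts, s, port))
    return dst_high and path_high


# Hypothetical topology: port 3 of s1 is an insecure (e.g. wireless) link.
hosts = {"h1", "h2", "h3", "h4"}
level = {"h1": "high", "h2": "high", "h3": "high", "h4": "low"}
links = {
    "s1": {1: ("h1", True), 2: ("s2", True), 3: ("h3", False)},
    "s2": {1: ("s1", True), 2: ("h2", True), 3: ("h4", False)},
}
```

In this example h3 is a high host, yet output to port 3 of s1 cannot be typed as high because the link is insecure.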
This shows that the proposed model can be used as a basis for reasoning about various aspects of confidentiality in software-defined networks.
We consider a learning switch application as an example. The switches in the network initially have no flow entries and forward incoming packets to the controller. The controller examines each packet and stores in its internal database the source address of the packet along with the port on which it was received. The port and packet headers are forwarded to the controller as an OpenFlow packet-in message. The next time the switch receives a packet destined to an address that was seen earlier, the controller can infer the port to which the packet must be forwarded. If the port cannot be determined, the packet is flooded to all the switch ports.

Algorithm 1 Learning switch algorithm
1: pkt ← packet arrived to the controller
2: port ← from which port pkt received
3: if find(src(pkt)) is null then
4:   push(src(pkt), port)
5: end if
6: fport ← find(dst(pkt))
7: if fport is null then
8:   for all switch port i other than port do
9:     emit (src(pkt), dst(pkt))×Output(i)
10:   end for
11: else
12:   emit (src(pkt), dst(pkt))×Output(fport)
13:   emit (dst(pkt), src(pkt))×Output(port)
14: end if

A simple algorithm for the learning switch is shown as Algorithm 1. The input data for the algorithm is an incoming packet pkt and the port port on which it has been received. The controller maintains an internal database, which can be implemented as a hash table supporting the following operations:
• push(address, port). The operation creates a mapping between the address and the port in the internal database.
• find(address). This is a query to the database which returns the port number associated with address, or null if there is no such association.
There is an emit operator in our language which appends an action to the list of instructions destined to the switch. The list is sent to the switch when the algorithm stops. Then we can analyze the list and determine whether it is secure or not.
Algorithm 1 checks whether a mapping between the source address of pkt and port exists. If there is no such mapping, it is created in lines 3-5. Thereafter, we check whether we have learned the port to which we can forward the packet pkt (line 6). If no such port exists, we flood the packet to all ports except the ingress port (lines 8-10). Otherwise, we emit forwarding rules which set up a duplex channel between the source and destination hosts of the packet (lines 12-13). We assume that entries responsible for flooding packets will eventually be evicted from switches and replaced by direct forwarding entries.
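The behaviour described above can be sketched in Python as follows. This is an illustration based on the prose description rather than a transcription of the listing: the class LearningSwitch, the method packet_in and the representation of an emitted rule as a ((src, dst), action) pair are assumptions.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Rule = Tuple[Tuple[str, str], str]  # ((source, destination), action)


class LearningSwitch:
    def __init__(self, ports: Iterable[int]) -> None:
        self.ports = list(ports)      # ports of the managed switch
        self.db: Dict[str, int] = {}  # internal database: address -> port

    def push(self, address: str, port: int) -> None:
        self.db[address] = port

    def find(self, address: str) -> Optional[int]:
        return self.db.get(address)   # None plays the role of null

    def packet_in(self, src: str, dst: str, port: int) -> List[Rule]:
        """Handle one packet and return the list of emitted rules."""
        emitted: List[Rule] = []
        if self.find(src) is None:    # learn where the source lives
            self.push(src, port)
        fport = self.find(dst)
        if fport is None:
            # Unknown destination: flood to every port except the ingress one.
            for i in self.ports:
                if i != port:
                    emitted.append(((src, dst), f"Output({i})"))
        else:
            # Known destination: set up a duplex channel.
            emitted.append(((src, dst), f"Output({fport})"))
            emitted.append(((dst, src), f"Output({port})"))
        return emitted
```

The returned list is what a checker based on the typing rules would inspect before the rules are sent to the switch.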
Recall the network from Fig. 3. Assume that the controller database is empty and there are no forwarding rules at the switches, so each switch sends a packet-in message to the controller upon receipt of a packet. A security flaw arises as soon as the first packet travels from any high security host. For example, if the host 10.0.0.2 sends a packet pkt to the host 10.0.2.1, a list of flooding rules is emitted by the controller to switch 1 according to lines 8-10 of Algorithm 1. For the considered action the premises of rule 6 cannot be inferred. Likewise, forall(src(match)) : low cannot be inferred, hence the only premise for rule 7 does not hold. Thus, the considered action cannot be typed, so the whole list cannot be typed.
Algorithm 2 proposes an enhanced version of the learning switch. This version is free from many security leaks, but let us analyze it formally. The algorithm breaks into two parts. The first one is represented by lines 8-19, where packets from low sources are processed. If the output port cannot be identified, the packet is flooded to all ports of the switch (lines 9-11); otherwise forwarding rules are installed on the switch. These rules include one which redirects packet pkt to the destination host (line 13) and another which either creates a channel in the opposite direction (line 15) or sets the action to Drop if the opposite forwarding rule would form a route from a high host to a low host (line 17). The second part of the algorithm processes packets from high sources (lines 21-31). If the destination for such a high packet is a low host, we drop the packet (line 22).
Assume that fport is null, so the packet must be flooded to all ports except port (lines 9-11). The controller then emits packet-out messages, each of which can be typed as low using rule 7, since forall(src(match)) : low holds for packets from low sources.
Applying rule 12, we have that the composition of these actions is typed as low. Thus, the whole list of emitted actions is typed and these actions are safe. Assume that fport is not null; then the controller emits an action in line 13 whose safety can be ensured using the same inference as in the flooding case above. The second action of the list depends on the security type of dst(pkt). If it is low, the action in line 15 is emitted. Its source dst(pkt) is low, so by rule 7 the action is typed as low. Thus, the security types of all emitted actions agree, so the whole list can be typed as low. If dst(pkt) is high (line 17), only the high type can be inferred for the Drop action by rule 5. This means that the security type of the Drop action from line 17 does not agree with the security type of the previous actions and the packet processing context, which are low. Thus, the Drop action cannot be considered safe. Indeed, low packets must not trigger drops of packets originating from high security hosts. If we carefully examine the code, we

Conclusion
Security is challenging in networking and must be further investigated for software-defined networks. There is a lack of formal models for security analysis [4], and this paper proposes an approach based on a formal security-type system. This system ensures that the controller application does not violate security properties such as confidentiality and, to some extent, integrity. We have extended the proposed system so that it can verify new confidentiality properties in the case of insecure network links. The security system can be implemented as a software module of the controller that checks whether network applications violate security properties.

Fig 2. Security-type system for SDN

Fig 3. A sample network with high and low security hosts
Algorithm 2 Secure learning switch algorithm
1: pkt ← packet arrived to the controller
2: port ← from which port pkt received
3: if find(src(pkt)) is null then
...

If the controller does not find the port to forward the packet, the packet is flooded but only to high ports (lines 25-27), otherwise forwarding rules are installed (lines 29-30). Let us show how security properties of Algorithm 2 can be proved. If the condition in line 8 is true, the following holds for lines 8-19 by rule 1: