Report - Makhlouf Hadji

Université Pierre et Marie Curie - Paris 6
École Doctorale EDITE de Paris
École Doctorale Informatique, Télécommunications et Électronique

Habilitation à Diriger des Recherches
Discipline: Informatique

Presented by

Makhlouf HADJI

Mathematical Optimization Methods for NFV and Cloud Resource Allocation Problems

Committee:

Mr. Luigi ATZORI, Examiner, Associate Professor, Cagliari University, Italy
Mr. Prosper CHEMOUIL, Reviewer, Research Director, Orange Labs, France
Mr. Bernard COUSIN, Reviewer, Professor, Rennes 1 University, France
Mr. Filip DE TURCK, Examiner, Professor, Ghent University, Belgium
Mr. Ahmed KAMAL, Examiner, Professor, Iowa State University, USA
Mr. Dusit NIYATO, Reviewer, Associate Professor, Nanyang Technological University, Singapore
Mr. Guy PUJOLLE, Examiner, Professor, Pierre & Marie Curie University, France
Mr. James ROBERTS, Invited, Professor, Lincs Laboratory, France
Mr. Djamal ZEGHLACHE, Examiner, Professor, IMT Télécom SudParis, France


To Camélia and Louise


The formulation of a problem is often more essential than its solution, which may be merely a matter of mathematical or experimental skill. To raise new questions, new possibilities, to regard old problems from a new angle, requires creative imagination and marks real advance in science. - Albert Einstein


ACKNOWLEDGMENTS

First, I would like to deeply thank the reviewers of this manuscript, Pr. Bernard Cousin, Dr. Prosper Chemouil and Dr. Dusit Niyato, for their valuable and interesting evaluation, remarks and suggestions. I am very thankful to Pr. Filip De Turck, Dr. Luigi Atzori, Pr. Ahmed Kamal, Pr. Guy Pujolle, Pr. Jim Roberts and Pr. Djamal Zeghlache for agreeing to examine my HDR manuscript and for the time they spent evaluating it.

I am particularly obliged to Pr. Djamal Zeghlache, who helped me enormously by giving explanations on interesting and prominent research areas such as cloud computing and network functions virtualization. I am very thankful to James Roberts for carefully reading all the chapters of this manuscript, for giving corrections and suggestions that improved the text and the layout, and for helping me with useful background information.

In preparing this manuscript, I have profited greatly from the support and help of many friends and colleagues, to whom I would like to express my gratitude. I am also very thankful to all the IRT SystemX members for their support.

I am extremely grateful to my wife Selma and my daughters Camélia and Louise for their devoted support and patience. I would like to apologize to them for all the weekends I could not spend with them.


ABSTRACT

Combinatorial optimization has its roots in combinatorics, operations research, and theoretical computer science. A main motivation is that thousands of real-life problems can be formulated as abstract combinatorial optimization problems. Resource optimization in cloud computing and in the domain of Network Functions Virtualization (NFV) is required to reduce cost and improve infrastructure utilization. To ensure system efficiency, we need new algorithms that scale well and converge rapidly to optimal solutions.

In this manuscript, we address NP-Hard combinatorial optimization problems in the cloud computing and NFV areas. We focus on two cloud problems: virtual resource repacking and critical resource placement in a cloud federation. In addition, we consider the placement and chaining of virtualized network functions (VNFs) in a physical network substrate. We provide near-optimal placement of VNF chains in order to deliver, on demand, virtual networks with dedicated network functions according to a tenant-specified chaining of their VNFs.

Our contribution in these areas consists in providing novel, scalable and cost-efficient algorithms based on graph theory, b-Matching, matroids and the Perfect 2-Factor theory to optimally solve the above problems. All of our algorithms have low complexity, confirmed by simulation and performance evaluation results.

Our manuscript also aims to stimulate research on combinatorial optimization applied to cloud computing and NFV, areas that are changing the way infrastructures and networks are developed, deployed and put into production. Thus, we provide throughout this manuscript some open problems of interest to researchers and industrial companies.

Keywords: Cloud Computing, NFV, Combinatorial Optimization, Matroids, Perfect 2-Factor, Graph Theory, Forwarding Graphs


RÉSUMÉ

Combinatorial optimization stems from well-known research fields such as operations research and theoretical computer science. The theory is motivated by the fact that many problems can be formulated as combinatorial optimization problems. Resource optimization in the cloud and NFV domains is a prerequisite for reducing infrastructure costs. To reach this goal, new algorithmic solutions are needed that scale and that converge rapidly to the optimal solution of the problem at hand.

In this manuscript, we address combinatorial optimization problems that are hard in the sense of algorithmic complexity, in two emerging domains, namely the cloud and NFV. These two domains are complementary and make it easier to offer on-demand virtual network services, while enabling fair sharing of resources and ever lower operating costs for both service providers and end users.

In this report we address placement and resource allocation problems in the cloud and propose efficient approximation schemes that converge rapidly to good-quality solutions, including for large problem instances. These algorithms derive from matroid theory, b-Matching theory and graph theory, which lets us vary the mathematical modeling of the problems according to their difficulty. We evaluated the performance of our algorithms on different scenarios of request topologies and infrastructure graphs, and showed that our approaches perform well in terms of convergence time, scalability and total cost of the solution found.

Finally, it is important to mention that this report also aims to propose new research directions and open scientific challenges in the fields of cloud computing and NFV.

Keywords: Cloud Computing, NFV, Combinatorial Optimization, Matroids, Perfect 2-Factor, Graph Theory


Table of Contents

1 Introduction

2 Virtual Machines Placement and Repacking Optimization
  2.1 Introduction
  2.2 Placement and Consolidation Algorithms
    2.2.1 A smart placement is not enough
    2.2.2 b-Matching formulation
    2.2.3 Graphic matroid and greedy algorithm
  2.3 Summary of Results
  2.4 Conclusion and Future Work

3 Critical Resource Allocation in Cloud Federations
  3.1 Introduction
  3.2 Critical Placement Algorithms
    3.2.1 Problem complexity
    3.2.2 Mathematical program formulations
    3.2.3 Critical Node Detection (CND) algorithm
    3.2.4 Critical Edge Detection (CED) algorithm
  3.3 Summary of Results
  3.4 Conclusion and Future Work

4 Virtualized Network Functions Chaining and Placement Problem
  4.1 Introduction
    4.1.1 NFV definition
    4.1.2 State of the art
    4.1.3 Our contribution to NFV
  4.2 VNF Placement and Chaining Algorithms
    4.2.1 Perfect 2-Factor algorithm
    4.2.2 Algorithm based on multi-stage graph construction
  4.3 Summary of Results
  4.4 Conclusion and Future Work

5 Research Perspectives and Challenges
  5.1 VNF Placement and Chaining
  5.2 virtual Content Delivery Networks (vCDN) Placement and Migration: an ETSI Use-Case
  5.3 Optimal Split of BBU Functions in C-RAN

A Mathematical Tools for Combinatorial Optimization Problems
  A.1 Combinatorial Optimization: Some Basic Examples
    A.1.1 Flows problem
    A.1.2 b-Matching problem
    A.1.3 Matroids
  A.2 Cloud Computing and NFV

List of Tables

2.1 Time resolution of the four algorithms
3.1 CND algorithm's performance analysis
3.2 CED algorithm's performance analysis
3.3 Scalability analysis: 100 providers and requests with 50 nodes
4.1 Comparison using Interoute-Network
4.2 Comparison with state of the art of [45]

List of Figures

2.1 System model and placement problem
2.2 VM to server mapping weighted edge
2.3 Complete bipartite graph construction
2.4 Optimal VMs repacking example
2.5 SLA violations comparison
2.6 Cost comparison
3.1 Cloud Federation System Model
3.2 Example of a Gomory-Hu tree of a graph G
3.3 Example of outcome of the CND algorithm cost computation
3.4 Example of CED algorithm cost computation
4.1 ETSI VNF-Forwarding Graph view
4.2 System Model: Physical infrastructure (left side) and SFC request (right side)
4.3 Cycle representation of a VNF chain
4.4 Multi-stage graph construction example
4.5 Convergence time of our algorithms
4.6 Service request acceptance ratio over time
5.1 vCDN placement and migration problem
5.2 C-RAN Macroscopic Architecture View
5.3 BBU functions splits in C-RAN
A.1 ETSI's NFV architecture
A.2 NFV MANO architecture

Chapter 1

Introduction

Cloud computing is a large-scale distributed solution for providing compute, storage and communications resources and services on demand through data centers. Cloud computing has advantages over traditional computing, such as avoiding capital investments and operational expenses for users through pay-as-you-go metered services and resource multiplexing and sharing.

Network Functions Virtualization (NFV) enables provisioning of on-demand networking services and facilitates the development of agile networking services tuned to application and tenant requirements. NFV leverages virtualization technologies to realize network multi-tenancy, with the network infrastructure shared by multiple tenants. Cloud computing and NFV are closely connected: virtualization techniques used in the cloud (hypervisors, containers such as Docker, ...) serve to virtualize network functions (VNFs) that are deployed on a physical substrate and thereby benefit from cloud advantages such as elasticity, scalability and multi-tenancy.

In the literature, the algorithms most frequently used to cope with combinatorial optimization problems in cloud computing and NFV are exact methods built on simple mathematical models. These models are often described by a few inequalities that represent the problem constraints in an integer linear program. Unfortunately, they lead to solutions that do not scale with problem size, and this cannot be remedied with the rather limited number of inequalities associated with the models. There is consequently a need for other approaches and models that are more likely to scale while providing good and, whenever possible, near-optimal solutions.

In this manuscript, we propose new approximate and competitive algorithms, often based on theoretical propositions guaranteeing rapid convergence towards near-optimal or optimal solutions. Our algorithms are based on exact formulations of the convex hulls of the considered combinatorial optimization problems, enlarged by new cutting planes or valid inequalities, which allow us to find efficient linear relaxations of NP-Hard problems.

Our manuscript discusses approximation algorithms and complexity in the areas of cloud computing and NFV, which are transforming networks and providing the flexibility and agility required to produce network services that support applications. We provide solutions to NP-Hard problems that have turned out to have particularly interesting consequences in terms of convergence time and scalability for large problem instances.

As a central objective of this manuscript, we aim to stimulate research on combinatorial optimization applied to cloud computing and NFV. Thus, we provide throughout this manuscript some open problems and questions that have attracted the interest of researchers and industrial companies.

This manuscript is divided into four main chapters discussing three combinatorial optimization problems in the areas of cloud computing and NFV. The last chapter is dedicated to further research and open problems.

After the short introduction provided in Chapter 1, Chapter 2 goes into the prominent problem of virtual machine placement and repacking. It presents new and scalable algorithms based on b-Matching theory and matroid theory that cope efficiently with this problem.

Chapter 3 considers resource allocation problems in a cloud federation to host end-user requests (interconnected virtual machines) according to their level of criticality. We investigate and explore new algorithms based on Gomory-Hu techniques, which can be found in the appendix. These techniques allow us to considerably reduce the complexity of the considered problem when identifying critical nodes and links simultaneously.

In Chapter 4, we consider end-user requests as interconnected virtual resources (virtual machines, or virtualized network functions that will be introduced in the next chapters) under sequencing constraints. These virtual resources are modeled as directed graphs to be judiciously deployed on a physical network substrate. We investigate scalable and cost-efficient algorithms based on graph theory and the theory of the Perfect 2-Factor to obtain optimal solutions for graph requests with exactly 3 nodes or VNFs.

In Chapter 5 we propose open questions and research challenges that will be addressed in the future and that concern the scientific community and the industry working on cloud computing and NFV.

Finally, in the appendix, we survey some combinatorial optimization problems together with well-known and appropriate solutions from the state of the art. The appendix outlines the main ideas found in the literature to cope with NP-Hard problems.

Chapter 2

Virtual Machines Placement and Repacking Optimization

In this chapter, we focus on solving the problem of virtual machine placement and repacking in cloud data centers. As this problem is NP-Hard, we investigate efficient and rapid algorithms to cope with scalability and convergence issues. Our contribution in this chapter consists in proposing approximation algorithms based on b-Matching theory and matroid theory to guarantee rapid convergence to near-optimal or optimal solutions. We compare our algorithms with Bin-Packing and Best-Fit approaches, which act as references and provide performance bounds.

2.1 Introduction

Cloud computing has the advantage over traditional computing of avoiding capital investments and operational expenses for users through pay-as-you-go metered services, while allowing providers to increase their revenues. These services are deployed and offered by multiple Cloud providers as virtual resources and services at infrastructure, platform and software levels (IaaS, PaaS and SaaS) [4]. Achieving such benefits and increasing revenues requires very efficient resource utilization and strict respect of quality of service. Without repacking and auto-scaling of virtual resources (VMs), a smart initial placement alone cannot efficiently reduce costs, since all expenses matter: hosting, energy consumption, maintenance, configuration and management costs.

In this chapter, we focus on optimal repacking algorithms that reduce overall cost, improve utilization and find the best tradeoffs between these conflicting goals. An exact mathematical model is proposed to derive a set of appropriate algorithms for the repacking problem. The repacking is achieved via migrations while minimizing costs and disruptions when relocating the VMs. We seek algorithms that scale well, minimize SLA violations and converge reasonably fast. These goals lead us to b-Matching algorithms and to a greedy algorithm applied to a graphic matroid representation of the repacking problem. Traditional Bin-Packing and a simple Best-Fit algorithm are used for comparison and as benchmarks, since they provide lower and upper bounds on performance. Best-Fit converges very fast but is very inefficient, while Bin-Packing is optimal but does not scale with problem size.

In the following, we summarize some related work to clarify and position our contribution to the placement and consolidation problem. More details on our contribution can be found in [28].

A server consolidation algorithm called "Sercon" was proposed by Murtazaev et al. in [54] to minimize simultaneously the number of used servers and the number of migrations needed to achieve consolidation. They compare their algorithm with a well-known placement heuristic, FFD (First-Fit Decreasing), to solve the Bin-Packing problem at hand. Sercon is found to be efficient, but since it is a heuristic it cannot always find the optimal solution. Our goal is to find optimal solutions, and we have instead used b-Matching and matroid theory, which can guarantee the efficiency of the solutions.

Sedaghat et al. [67] address automation of horizontal versus vertical elasticity in Clouds. They analyze the price-performance tradeoffs with respect to VM sizes to handle increasing load. They use a repacking approach combined with auto-scaling strategies (vertical and horizontal elasticity) and show a cost saving varying between 7% and 60% in total resource utilization cost. The proposed solution is based on a set of heuristically chosen parameters, which can also lead to undesired suboptimal solutions.

In one of our papers [22], we propose an integer linear programming approach to VM placement including an energy optimization criterion. The provided formulation is an extension of Bin-Packing. We take into account departures of VMs when jobs end. A consolidation algorithm based on an exact method is associated with the placement strategy to improve performance. The algorithms are integer linear programs that unfortunately do not scale to large instances. In this work, we seek faster, linear and scalable algorithms and use migrations for run-time reorganization of already instantiated resources.

A dynamic placement for geographically distributed Clouds presented in [72] determines where to best place applications to minimize hosting costs while respecting key performance requirements. Using control theory and game theory, the authors treat demand and dynamic pricing fluctuations jointly. This is a macroscopic study of the cloud placement problem with dynamic resource allocation.

A tight bound is proposed in [14] for the Bin-Packing resolution of the placement problem: the authors derive a new bound for the First Fit Decreasing algorithm to approximate the optimal solution. Dynamic placement of VMs in a physical infrastructure without taking into account migrations and consolidation of servers is not sufficient, as it leads to under-utilization of physical servers.

In Goudarzi et al. [23], VM replication is used to reduce the energy consumption of servers. The authors create multiple copies of the VMs and then use dynamic programming and local search methods to place these copies on the best physical servers. Scalability is however not guaranteed for a large number of servers and VMs. We seek algorithms that scale to thousands of VMs and servers and provide the optimal solution.

2.2 Placement and Consolidation Algorithms

In the following, we start the resolution of the repacking problem of already placed and instantiated VMs, and then investigate new virtual machine migration solutions to optimize resource consumption in a cloud data center.

2.2.1 A smart placement is not enough

We start the repacking problem model from an initial VM placement solution and search for an optimal consolidation via migrations of virtual resources in the physical infrastructure. Figure 2.1 depicts the placement problem for N VMs on K available servers. Cloud services are characterized by elastic requirements from users, applications and services. These induce variable workloads on the system, which in turn require consolidation to use resources more efficiently and to avoid the costs of suboptimal utilization. VMware [32], for example, runs VM consolidation in a random way every 5 minutes to improve efficiency and system capacity. Consolidation also provides the opportunity to shut down emptied (and empty) hosts to reduce energy consumption whenever appropriate.

Figure 2.1: System model and placement problem

2.2.2 b-Matching formulation

VM repacking resorts to consolidation by moving VMs between hosts to maintain optimal placement when dynamic changes are significant enough to require an adaptation. Placement needs to be dynamically and regularly updated to remain optimal and to minimize hosting costs, energy consumption and operations costs. In this chapter we focus on horizontal migrations in a system of VMs already instantiated on a physical infrastructure.

We consider a hosting cost, noted $H_e = H_{ik}$, when $VM_i$ is currently hosted by a server $S_k$ (also noted $k$). The hosting of $VM_i$ by server $k$ is represented as an edge $e = (i, k)$, whose initial extremity $i = I(e)$ corresponds to a VM and whose terminal extremity $k = T(e)$ represents the host or server (see Figure 2.2 (a)). Furthermore, if our optimization recommends moving $VM_i$ from server $k'$ to another server $k$, we include a reconfiguration cost $R_e = R_{ik}$, as depicted in Figure 2.2 (b).

Figure 2.2: VM to server mapping weighted edge

Based on this configuration, one can construct a new weighted bipartite graph $G = (V \cup S, E)$, where $V$ is the set of vertices representing the virtual machines currently hosted, and $S$ is the set of all available (powered ON) servers (see Figure 2.3). $E$ is the set of weighted edges between $V$ and $S$, constructed as described below: there is an edge $e = (i, k)$ between each VM $i$ and each available server $k$, and the weight of $e$ is given as follows:

1. if VM $i$ is currently running on server $k$, then the edge $(i, k)$ has a weight $w_{ik} = H_{ik}$, which represents the hosting cost of VM $i$ on server $k$.
2. if VM $i$ is hosted on a server $k' \neq k$, then we define a weight $w_{ik} = H_{ik} + R_{ik}$, which adds to the hosting cost on server $k$ the reconfiguration cost of moving VM $i$ from $k'$ to $k$.

Figure 2.3: Complete bipartite graph construction
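The edge-weight rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_edge_weights`, the dictionary layout and the numeric costs are hypothetical and do not come from the manuscript.

```python
def build_edge_weights(hosting_cost, reconfig_cost, current_host):
    """Return {(vm, server): w_ik} for the complete bipartite graph G.

    hosting_cost[vm][server]  plays the role of H_ik
    reconfig_cost[vm][server] plays the role of R_ik
    current_host[vm]          is the server currently running the VM
    """
    weights = {}
    for vm, costs in hosting_cost.items():
        for server, h in costs.items():
            if current_host[vm] == server:
                # Case 1: the VM stays where it is, hosting cost only.
                weights[(vm, server)] = h
            else:
                # Case 2: the VM would migrate, add the reconfiguration cost.
                weights[(vm, server)] = h + reconfig_cost[vm][server]
    return weights


# Hypothetical toy data: two VMs, both currently hosted on server s1.
hosting_cost = {"vm1": {"s1": 4, "s2": 2}, "vm2": {"s1": 3, "s2": 5}}
reconfig_cost = {"vm1": {"s1": 1, "s2": 1}, "vm2": {"s1": 2, "s2": 2}}
current_host = {"vm1": "s1", "vm2": "s1"}

weights = build_edge_weights(hosting_cost, reconfig_cost, current_host)
```

In this toy instance, moving vm1 to s2 costs 2 + 1 = 3, which is cheaper than leaving it on s1 at cost 4, so a cost-minimizing repacking would favor the migration if s2 has room.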

We also consider resource limitation constraints in terms of maximum available CPU, memory, ... capacity in servers to circumscribe the search space.

We now introduce the well-known "minimum weight b-Matching problem", on which one of our combinatorial optimization solutions is built. The b-Matching problem is a generalization of the minimum weight Matching problem and is defined as follows (see [39] for more details):

Definition 1 Let $G$ be an undirected graph with integral edge capacities $u : E(G) \to \mathbb{N} \cup \{\infty\}$ and numbers $b : V(G) \to \mathbb{N}$. Then a b-Matching in $G$ is a function $f : E(G) \to \mathbb{N}$ with $f(e) \le u(e)$ for all $e \in E(G)$, and $\sum_{e \in \delta(v)} f(e) \le b(v)$ for all $v \in V(G)$, where $\delta(v)$ represents the set of edges incident to $v$.

To simplify notation, with no loss of generality, we use $E$ and $V$ for the edges and vertices of $G$. From the definition, finding a minimum weight b-Matching in a graph $G$ consists in identifying $f$ such that $\sum_{e \in E} c_e f(e)$ is minimum, where $c_e$ is a cost associated with edge $e$. This problem can be solved in polynomial time since the full description of its convex hull is given in [39].

Proposition 1 Let $G = (V \cup S, E)$ be a weighted complete bipartite graph built as described in Figure 2.3. Then, finding an optimal VM repacking solution is equivalent to finding an uncapacitated ($u \equiv \infty$) minimum weight b-Matching solution, where $b(v) = 1$ if $v \in V$ ($v$ is a VM) and

\[ b(v) = \min\left\{ |V| ;\ \left\lceil \frac{CPU_v}{\sum_{e \in \delta(v)} cpu_{I(e)}} \right\rceil \right\} \quad \text{if } v \in S. \]

Proof: The bipartite graph is constructed with the following conditions:

• For each vertex $v \in V$, $b(v) = 1$: each VM is assigned to one and only one server.
• For all $v \in S$, we set
\[ b(v) = \min\left\{ |V| ;\ \left\lceil \frac{CPU_v}{\sum_{e \in \delta(v)} cpu_{I(e)}} \right\rceil \right\} \]
to assign different VMs to a server according to its capacity or resource limit. A proof on how to characterize $b$ is given in the sequel.
• Each vertex $v \in V$ is linked to a vertex $j \in S$ through an edge $e$ with weight $w_e = H_e + R_e \mathbb{1}_e$, where for each edge $e = (i, j)$: $\mathbb{1}_e = \mathbb{1}_{(i,j)} = 1$ if VM $i$ is not currently hosted on server $j$, and $\mathbb{1}_e = 0$ otherwise.

Starting from this constructed graph, a solution to the minimum weight b-Matching problem on $G$ is equivalent to finding a set of servers hosting all the VMs at minimum overall cost. This is equivalent to finding the optimal placement of VMs in hosts that leads to minimum global cost. □

Proposition 1 can easily be generalized to other resources such as RAM, Disk, etc.
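To make Definition 1 concrete, the following sketch checks whether a candidate function f is a b-Matching and computes its weight. The helper names `is_b_matching` and `matching_weight`, and all numeric values of b and c, are assumptions chosen for illustration; an edge missing from `u` is treated as uncapacitated, as in Proposition 1.

```python
def is_b_matching(f, u, b):
    """Check Definition 1 for f given edge capacities u and vertex bounds b.

    f and u are dicts keyed by edge (v, w); b is a dict keyed by vertex.
    An edge absent from u is treated as uncapacitated (u = infinity).
    """
    load = {v: 0 for v in b}
    for (v, w), value in f.items():
        # f(e) must be a non-negative value not exceeding u(e).
        if value < 0 or value > u.get((v, w), float("inf")):
            return False
        load[v] += value
        load[w] += value
    # The total of f over the edges incident to v must not exceed b(v).
    return all(load[v] <= b[v] for v in b)


def matching_weight(f, c):
    """Sum of c_e * f(e) over all edges carrying a value."""
    return sum(c[e] * value for e, value in f.items())


# Hypothetical instance: b(v) = 1 for VMs, small capacities for servers.
b = {"vm1": 1, "vm2": 1, "s1": 1, "s2": 2}
u = {}  # uncapacitated edges
c = {("vm1", "s1"): 4, ("vm1", "s2"): 3, ("vm2", "s1"): 3, ("vm2", "s2"): 7}

f_ok = {("vm1", "s2"): 1, ("vm2", "s1"): 1}
f_bad = {("vm1", "s1"): 1, ("vm2", "s1"): 1}  # s1 would host 2 VMs, b(s1) = 1

feasible_ok = is_b_matching(f_ok, u, b)
feasible_bad = is_b_matching(f_bad, u, b)
cost_ok = matching_weight(f_ok, c)
```

The check does not enforce integrality of f; adding it would be a one-line extension if fractional inputs are possible.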


To formulate our model mathematically, we associate a continuous decision variable $x_e$ with each edge $e$ in the bipartite graph. As shown in Figure 2.3, each edge links a VM to a server. After optimization, if the decision is $x_e = 1$ then VM $i$ ($i = I(e)$, initial extremity) will be hosted by server $j$ ($j = T(e)$, terminal extremity). Since the solution of a b-Matching problem is based on solving a linear program, an integer solution of the minimum weight b-Matching is found in polynomial time. This is equivalent to the optimal solution of the repacking problem described in this chapter.

According to the different costs listed previously, we assign each VM to the best server with minimum cost. We can thus formulate the objective function as follows:

\[ \min Z = \sum_{e \in E,\ e = (i,j)} (H_e + R_e \mathbb{1}_{ij})\, x_e \tag{2.1} \]

where

\[ \mathbb{1}_{ij} = \begin{cases} 1, & \text{if } VM_i \text{ is not currently hosted in server } j; \\ 0, & \text{otherwise.} \end{cases} \]

This optimization is subject to a number of linear constraints. For instance, the Cloud provider has to consider repacking of all VMs, and each VM will be assigned to one and only one server. This is represented by (2.2), where $\delta(v)$ is the set of edges incident to $v$ in $G$:

\[ \sum_{e \in \delta(v)} x_e = 1, \quad \forall v \in V \tag{2.2} \]

Each server $v$ has a resource capacity limit expressed as an integer upper bound on its number of neighbors (VMs), noted $b(v)$, as in (2.3):

\[ \sum_{e \in \delta(v)} x_e \le b(v), \quad \forall v \in S \tag{2.3} \]

In other words, if $cpu_{I(e)}$ is the amount of CPU required by $VM_{I(e)}$ and $CPU_v$ ($v$ is a server) is the amount of CPU available in server $v$, then the resource limitation constraints are given as:

\[ \sum_{e \in \delta(v)} cpu_{I(e)}\, x_e \le CPU_v, \quad \forall v \in S \tag{2.4} \]

Concerning the upper bound optimization problem, more details can be found in the literature (see [69] and [40] for example). In our case, we propose to use the upper bound in (2.5), refined as the optimization process progresses:

\[ b(v) = \min\left\{ |V| ;\ \left\lceil \frac{CPU_v}{\sum_{e \in \delta(v)} cpu_{I(e)}} \right\rceil \right\}, \quad \forall v \in S \tag{2.5} \]
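As a sanity check of objective (2.1) under constraints (2.2) and (2.3), a toy instance can be solved by plain enumeration. This is a brute-force illustration only, exponential in the number of VMs, and not the polynomial b-Matching resolution discussed in the text; the function name, the weights (already combined as $H_e + R_e \mathbb{1}_{ij}$) and the bounds $b(v)$ are hypothetical.

```python
from itertools import product


def repack_brute_force(weights, vms, servers, b):
    """Enumerate every VM -> server assignment and keep the cheapest feasible one."""
    best_cost, best_assignment = None, None
    for choice in product(servers, repeat=len(vms)):
        # (2.2) holds by construction: each VM is given exactly one server.
        # (2.3): a server v may not receive more than b(v) VMs.
        if any(choice.count(s) > b[s] for s in servers):
            continue
        # (2.1): total hosting plus reconfiguration cost of this assignment.
        cost = sum(weights[(vm, s)] for vm, s in zip(vms, choice))
        if best_cost is None or cost < best_cost:
            best_cost, best_assignment = cost, dict(zip(vms, choice))
    return best_cost, best_assignment


vms = ["vm1", "vm2", "vm3"]
servers = ["s1", "s2"]
# Hypothetical edge weights w_e for the complete bipartite graph.
weights = {
    ("vm1", "s1"): 4, ("vm1", "s2"): 3,
    ("vm2", "s1"): 3, ("vm2", "s2"): 7,
    ("vm3", "s1"): 2, ("vm3", "s2"): 6,
}
b = {"s1": 2, "s2": 2}  # assumed bounds, not derived from (2.5)

best_cost, best_assignment = repack_brute_force(weights, vms, servers, b)
```

Here s1 is the cheapest host for every VM except vm1, but $b(s1) = 2$ forbids hosting all three there, so the bound (2.3) is what shapes the optimum.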

Proposition 2 Using the upper bound in (2.5), the constraints (2.3) dominate the resource limitation constraints given by (2.4).


Proof: For each server $v \in S$, define $cpu_{min} = \min\{cpu_{I(e)},\ \forall e \in \delta(v)\}$, the minimal amount of resource (CPU in our case) required by the VMs soliciting server $v$. Then, one can write:

\[ \sum_{e \in \delta(v)} cpu_{I(e)}\, x_e \ \ge\ cpu_{min} \sum_{e \in \delta(v)} x_e, \quad \forall v \in S \tag{2.6} \]

Using (2.4), we get:

\[ cpu_{min} \sum_{e \in \delta(v)} x_e \ \le\ \sum_{e \in \delta(v)} cpu_{I(e)}\, x_e \ \le\ CPU_v, \quad \forall v \in S \tag{2.7} \]

and deduce the following valid inequality:

\[ \sum_{e \in \delta(v)} x_e \ \le\ \frac{CPU_v}{cpu_{min}}, \quad \forall v \in S \tag{2.8} \]

Constraint (2.3) dominates (2.8) only if we have $b(v) \le \left\lfloor \frac{CPU_v}{cpu_{min}} \right\rfloor$. Thus, for at least 2 VMs, and by assuming that a server cannot host more than $|V|$ VMs, we propose the following upper bound taking into account resource limitation constraints:

\[ b(v) = \min\left\{ |V| ;\ \left\lceil \frac{CPU_v}{\sum_{e \in \delta(v)} cpu_{I(e)}} \right\rceil \right\}, \quad \forall v \in S \]

□

According to Proposition 2, the resource limitation constraints (2.4) are not required in our model, as they are dominated by (2.3). The new dominant valid inequality for our model is given by:

\[ \sum_{e \in \delta(v)} x_e \ \le\ b(v) = \min\left\{ |V| ;\ \left\lceil \frac{CPU_v}{\sum_{e \in \delta(v)} cpu_{I(e)}} \right\rceil \right\}, \quad \forall v \in S \tag{2.9} \]

Using the b-Matching model with the resource limitation constraints (and inequality (2.9)) enables the use of the complete convex hull of b-Matching and makes the problem easy in terms of combinatorial complexity theory. Reference [39] gives a complete description of the b-Matching convex hull expressed in constraints (2.2) and (2.3). These two families of constraints are reinforced by the blossom inequalities (2.10) to get integer optimal solutions with continuous variables:

$$\sum_{e \in E(G(A))} x_e + x(F) \le \left\lfloor \frac{\sum_{v \in A} b_v + |F|}{2} \right\rfloor, \quad \forall A \subseteq V \cup S \tag{2.10}$$

where $F \subseteq \delta(A)$, $\sum_{v \in A} b_v + |F|$ is odd, $x(F) = \sum_{e \in F} x_e$, and $\delta(A)$ is the set of edges $(i,j)$ with $i \in A$ and $j \notin A$. $E(G(A))$ represents the set of edges of the subgraph $G(A)$ generated by a subset of vertices $A$. An in-depth study of

blossom constraints (2.10) can be found in [24] and [43]. Based on the bipartite graph $G$, we constructed a linear reduction of the repacking problem to the b-Matching problem. The blossom constraints (2.10) are added to our model to get optimal integer solutions of the repacking problem, whose model is finally given by (2.11), where $b(v)$, $v \in S$, is set as proposed in (2.5):

$$
\begin{array}{lll}
\min & Z = \sum_{e \in E,\, e=(i,j)} (H_e + R_e \mathbb{1}_{ij})\, x_e & \\
\text{s.t.} & \sum_{e \in \delta(v)} x_e = 1, & \forall v \in V; \\
& \sum_{e \in \delta(v)} x_e \le b(v), & \forall v \in S; \\
& \sum_{e \in E(G(A))} x_e + x(F) \le \left\lfloor \frac{\sum_{v \in A} b_v + |F|}{2} \right\rfloor, & \forall A \subseteq V \cup S; \\
& F \subseteq \delta(A), \ \sum_{v \in A} b_v + |F| \text{ is odd}; & \\
& x_e \in \mathbb{R}^+, & \forall e \in E;
\end{array}
\tag{2.11}
$$

We define an SLA violation rate as the percentage of overused servers (servers that run out of space) in terms of resources (CPU for example), after the repacking solution. This definition is needed since model (2.11) is a linear program providing a quasi-optimal solution respecting the resource limitation constraints with a negligible amount of SLA violations.
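The SLA violation rate defined above can be computed directly from a repacking solution. The following is a minimal sketch, not the evaluation code used in this work: the function name, the dictionary-based data layout and the toy instance are illustrative assumptions, and the rate is taken here over all servers of the instance (the text does not say whether unused servers are counted).

```python
from collections import defaultdict


def sla_violation_rate(assignment, cpu_demand, cpu_capacity):
    """Percentage of servers whose hosted CPU demand exceeds their capacity.

    assignment:   dict vm -> server (one server per VM, as in constraint (2.2))
    cpu_demand:   dict vm -> CPU required by the VM
    cpu_capacity: dict server -> CPU available on the server
    """
    load = defaultdict(float)
    for vm, server in assignment.items():
        load[server] += cpu_demand[vm]
    # Assumption: a server is "overused" when its load strictly exceeds capacity.
    overused = [s for s in cpu_capacity if load[s] > cpu_capacity[s]]
    return 100.0 * len(overused) / len(cpu_capacity)


# Hypothetical instance: 4 VMs packed on 2 of 4 servers; only s0 is overloaded.
assignment = {"vm0": "s0", "vm1": "s0", "vm2": "s1", "vm3": "s1"}
cpu_demand = {"vm0": 3, "vm1": 2, "vm2": 1, "vm3": 1}
cpu_capacity = {"s0": 4, "s1": 4, "s2": 4, "s3": 4}
rate = sla_violation_rate(assignment, cpu_demand, cpu_capacity)
print(rate)
```

In this toy instance one server out of four is overloaded, so the rate is 25%.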

2.2.3

Graphic matroid and greedy algorithm

In addition to the exact linear program based on the b-Matching approach, we seek a polynomial algorithm that can scale to thousands of VMs and servers without SLA violations. Since the exact solution optimizes repacking in linear time, we seek an algorithm with similar properties and criteria, and with no SLA violations. We propose a new algorithm that solves the repacking problem based on matroid theory, introduced in the following definition (more details on matroids can be found in [58]):

Definition 2 A matroid $M = (E, \mathcal{F})$ is a structure in which $E$ is a finite set of elements and $\mathcal{F}$ is a family of subsets of $E$ such that:
1. $\emptyset \in \mathcal{F}$.
2. If $A \in \mathcal{F}$ and $B \subseteq A$, then $B \in \mathcal{F}$.
3. If $A, B \in \mathcal{F}$ and $|B| > |A|$, then $\exists e \in B \setminus A$ such that $A \cup \{e\} \in \mathcal{F}$.

Using the bipartite graph $G = (V \cup S, E)$ (see Figure 2.3) described in Section 2.2.2, the optimal solution of the repacking problem consists in hosting each VM in one server. Similarly, in the bipartite graph $G$, each vertex $v \in V$ will be assigned to exactly one vertex $k$ in $S$, and each vertex $k \in S$ can be a neighbor of different vertices in $V$. This yields a solution as presented in Figure 2.4, showing a forest of trees optimally linking servers and VMs.

Figure 2.4: Optimal VMs repacking example

Proposition 3 Let $G = (V \cup S, E)$ be a simple bipartite graph as shown in Figure 2.3. By relaxing server limited capacity constraints, $M = (E, \mathcal{F})$ is a matroid, with $\mathcal{F} = \{I \subseteq E,\ I \text{ is a forest of trees}\}$.

Proof
• Condition (1) of Definition 2 is trivial. To prove condition (2) of Definition 2, we suppose that $A \in \mathcal{F}$, and according to the definition of $\mathcal{F}$, $A$ is a forest of trees. Thus, if $B \subseteq A$, then the connected components of $B$ are also trees, even after deleting one or multiple edges of $A$. This leads us to easily conclude that $B \in \mathcal{F}$.
• To prove condition (3) of Definition 2, we write $A = \cup_{i=1}^{k} A_i$ where the $A_i$ represent the connected components (trees) of $A$. Then, for all $i = 1, \ldots, k$, we suppose $G_i = (T_i, A_i)$, where $G_i$ is a tree with $|T_i|$ vertices and $|A_i|$ edges. This leads us to deduce the number of vertices of $A$, given by $n_A = \sum_{i=1}^{k} |T_i| = |A| + k$. Similarly, we define $B \in \mathcal{F}$, and suppose $B = \cup_{j=1}^{t} B_j$. The number of nodes of $B$ is then given by $n_B = \sum_{j=1}^{t} |T'_j| = |B| + t$. By using $|B| > |A|$, two cases are discussed:
  1. If $t > k$: then $|B| + t > |A| + k$ and $n_B > n_A$. In other words, $B$ reaches more vertices than $A$, and there exists a vertex $x$ covered by $B$ and not by $A$. Suppose that $e \in B$ is an edge which contains $x$ as one of its two extremities; we finally deduce that $A \cup \{e\} \in \mathcal{F}$.
  2. If $n_B < n_A$: We suppose that the edges of $B$ connect each couple of nodes in $A$ in the same connected component $A_i$. Reasoning by contradiction, we suppose that there is no edge $e \in B \setminus A$ such that $A \cup \{e\} \in \mathcal{F}$. This means that each edge $e \in B$ links two vertices in the same component $A_i$ and forms a cycle. In this case, the number of edges of $B$ will satisfy $|B| \le |V_1| + |V_2| + \ldots + |V_k|$, then $|B| \le |A|$, which contradicts our hypothesis $|B| > |A|$. □

In fact, this matroid is well known in the literature (see [58]) and called the graphic matroid. Based

on the bipartite graph $G$, we look for a tree decomposition with a minimum cost. In other words, we look for an optimal basis of the graphic matroid. One can apply a greedy algorithm to optimally solve the repacking problem. The server limited capacity constraints are strong in our case. In fact, they influence the choice of a server to host VMs. To introduce these constraints in our solution, we propose a simple modification in the greedy algorithm as illustrated in Algorithm 1 (note that $w_e = H_e + R_e \mathbb{1}_e$).

Algorithm 1 Greedy Algorithm
  Put $F = \emptyset$; sort the edges so that $w_{e_1} \le w_{e_2} \le \ldots \le w_{e_m}$;
  for $i = 1$ to $m$ do
    if $F \cup \{e_i\} \in \mathcal{F}$ then
      if $\mathrm{cpu}(I(e_i)) \le \mathrm{CPU}(T(e_i))$ then
        $F := F \cup \{e_i\}$
        $\mathrm{CPU}(T(e_i)) \mathrel{-}= \mathrm{cpu}(I(e_i))$
      end if
    end if
  end for
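The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation evaluated in this chapter: the independence test $F \cup \{e_i\} \in \mathcal{F}$ is realized with a union-find structure (an edge is accepted only if it does not close a cycle), edges are given as hypothetical `(weight, vm, server)` tuples, and VM and server names are assumed to be distinct. The sketch follows the pseudo-code literally and adds no extra check beyond the forest and CPU tests.

```python
def greedy_repacking(edges, cpu_demand, cpu_capacity):
    """Greedy over the graphic matroid with a CPU capacity test.

    edges:        iterable of (weight, vm, server) tuples (hypothetical layout)
    cpu_demand:   dict vm -> CPU required
    cpu_capacity: dict server -> CPU available
    Returns the chosen (vm, server) edges and the remaining capacities.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    remaining = dict(cpu_capacity)
    chosen = []
    for weight, vm, server in sorted(edges):  # w_e1 <= w_e2 <= ... <= w_em
        root_vm, root_server = find(vm), find(server)
        if root_vm == root_server:
            continue  # the edge would close a cycle: F + e is not a forest
        if cpu_demand[vm] <= remaining[server]:
            parent[root_vm] = root_server      # F := F + e
            chosen.append((vm, server))
            remaining[server] -= cpu_demand[vm]
    return chosen, remaining


# Hypothetical instance with two VMs and two servers.
edges = [(1, "vm0", "s0"), (2, "vm1", "s1"), (3, "vm1", "s0"), (4, "vm0", "s1")]
chosen, remaining = greedy_repacking(
    edges, {"vm0": 2, "vm1": 3}, {"s0": 4, "s1": 3}
)
print(chosen, remaining)
```

On this instance the two cheapest edges are accepted; the two remaining edges pass the forest test but are rejected by the CPU test.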

2.3

Summary of Results

The linear program solver CPLEX [71] is used to assess the performance of the b-Matching solution. The algorithms are compared in terms of the convergence time needed to find their respective solutions (optimal, or near optimal with some SLA violations). Table 2.1 reports the results of the assessment and clearly shows the known scalability problem of the Bin-Packing algorithm, whose resolution times become prohibitive for the instances with |V| = 3000 and |S| = 500, 700, 900. The performance of the Best-Fit comes as no surprise since it simply picks the first available least loaded host without any care for cost. The advantage of the greedy approach is that it totally eliminates SLA violations, at the price of higher costs. The main difference between the two proposed approaches is that the linear program algorithm is very efficient in resolution time and cost-effectiveness, with a small SLA violation rate (less than 10%). We also note the rapid solution of the Best-Fit algorithm (a few seconds for large instances), which gives more expensive repacking solutions as shown in Figure 2.6. Our algorithms, b-Matching and greedy (used to solve the graphic matroid representation of the repacking problem), find optimal and near optimal solutions (b-Matching can be tuned to limit its SLA violations) and perform much better in terms of the tradeoff between convergence time and optimality. The greedy solution has comparable convergence speed for up to 1000 VMs hosted in 200, 400 and 700 servers. The greedy requires longer convergence times (order of minutes) beyond these values but remains valuable since it always finds good solutions. The b-Matching algorithm provides the best tradeoff between convergence time, optimality, scalability and cost. With respect to convergence time, as seen in the b-Matching column of Table 2.1, it converges in a few seconds for the scenario with 3000

Table 2.1: Time resolution of the four algorithms

|V|     |S|     b-Matching    Bin-Packing   Greedy        Best-Fit
                Time (sec)    Time (sec)    Time (sec)    Time (sec)
50      12      0             0             0             0
50      25      0             0.02          0             0
50      40      0             0.04          0.02          0
100     25      0             0.02          0.04          0
100     50      0.02          0.04          0.04          0
100     75      0.04          0.06          0.1           0.02
500     100     0.05          0.4           0.56          0.06
500     250     0.1           1.30          1.09          0.20
500     350     0.4           2             1.88          0.30
1000    200     0.4           4             2.45          0.44
1000    400     0.5           4.2           3.77          0.6
1000    700     0.8           9.8           5.96          1
2000    300     0.6           5.8           19.2          1.3
2000    500     1.2           14.4          24.7          2.2
2000    700     2             36            28.4          3.6
3000    500     1.6           205           149.3         5.4
3000    700     2.2           >4H           355.6         7.6
3000    900     3.2           >4H           534.2         9.6

VMs and 900 servers (in 3.2 s, compared to 9.6 s for Best-Fit, roughly 9 min (534.2 s) for greedy and more than 4 hours for Bin-Packing). In fact, the scalability test has been pushed to |V| = 4000 VMs and |S| = 1000 servers, especially for the b-Matching approach, for which the convergence time was found to be 5 s and the percentage of SLA violations 1.5%. To get a better grasp of the relative performance of the algorithms, an instance of 1000 VMs hosted in a number of physical servers ranging from 50 to 800 is used to evaluate the algorithms' SLA violations. This result should be analyzed jointly with the convergence time, the cost and the percentage of unused servers. Table 2.1 and Figures 2.5 and 2.6 reveal the characteristics of the algorithms. Figure 2.5 shows that the b-Matching SLA violations remain below 7% (value obtained for 50 servers) and decrease with an increasing number of servers, to vanish when a large number of servers is used (> 700). This means that the b-Matching becomes globally more efficient when more servers are used and thus improves with scale. The b-Matching algorithm achieves the best cost performance since it consistently incurs the smallest cost, very close to that of the Bin-Packing, which does not scale (as seen in Table 2.1, where its time to find a solution is the worst: more than 4 hours for 3000 VMs and 700 or more servers). The greedy algorithm cost is higher than that of b-Matching and Bin-Packing, but it consistently finds good solutions without any SLA violations. If no SLA violations are tolerated or allowed, the greedy algorithm should be used even if it costs a little more than the b-Matching solution and uses more servers (see Table 2.1). The b-Matching provides the best tradeoffs with rather low penalties (SLA violations < 7%). These violations vanish for large infrastructures with thousands

Figure 2.5: SLA violations comparison

Figure 2.6: Cost comparison

of hosts and virtual resources. In summary, the b-Matching and the greedy algorithms scale well. Both find optimal solutions in convergence times in line with operational systems requirements, with a clear advantage for b-Matching. The Bin-Packing finds optimal solutions but does not scale, and the Best-Fit cost is prohibitive despite its speed and simplicity. For small infrastructures hosting a small number of VMs, the Bin-Packing can be used, but both the b-Matching and the greedy are quite competitive.

2.4

Conclusion and Future Work

This chapter addresses the problem of VMs repacking in Cloud infrastructures. An exact linear programming algorithm for VMs live migrations using a b-Matching model is derived and evaluated. The b-Matching algorithm improves as the numbers of servers and VMs increase, gradually reducing SLA violations from an acceptable number to zero for large instances. The exact algorithm optimizes the various costs associated with VMs repacking. Under stringent requirements, to guarantee repacking with no SLA violations, a greedy algorithm based on matroid theory is proposed. The greedy algorithm is quite efficient for medium scale problems and can thus be combined with the b-Matching, which is very effective at large scale. As the repacking problem is NP-Hard, a prominent research challenge is to facilitate the scalability of the proposed solutions when dealing with large numbers of VMs and servers. Thus, we will investigate the combination of the greedy approach with the b-Matching algorithm by using the greedy solution as an upper bound in the convex hull of the b-Matching approach to accelerate convergence. Another ongoing extension of our work includes security constraints that will be introduced as affinity and anti-affinity relationships between VMs and hosts. This constrained repacking problem becomes more complex and requires in-depth Branch and Cut studies relying on the linear

formulation of the b-Matching model. While this chapter deals with the VMs placement problem, the next chapter will consider a cloud federation of K involved providers, each with a limited quota of resources. The objective of the considered federation consists in judiciously hosting interconnected VMs, represented by connected graphs (where the nodes are VMs and the edges are virtual links), on the available resources of the cloud federation. Thus, the next chapter will discuss resource allocation problems when end-user demands are more complex (connected graphs), under critical and limited resources constraints.

Chapter 3

Critical Resource Allocation in Cloud Federations

This chapter addresses cloud federation involving multiple providers dealing with hosting tenant services according to criticality and costs criteria. We propose cost-efficient placement approaches taking into account resources criticality highlighted by a Gomory-Hu tree transformation. This transformation classifies services and their relationships according to their level of criticality.

3.1

Introduction

We focus on a cloud federation involving multiple providers hosting more or less critical tenant services. Services such as central control and management, security, protection and high availability, as well as key and often solicited servers and network nodes, require special attention and increased guarantees. Typical intelligent placement solutions take the level of criticality of nodes (services) and links (relationships) in a composite service request into account only partially or indirectly. Often high availability and costs determine the selection and placement decisions. A more direct approach is proposed to take the importance of resources into account by transforming the initial service requests, expressed in the form of a service graph, into a tree representation that classifies nodes and links according to their importance. The proposed placement algorithms run at each provider in a distributed fashion. The algorithms rely on hosting and interconnecting costs information and on resources (quotas) made available to the federation. They are based on a smart transformation of the requested tenant service graph (virtual resources) using the Gomory-Hu tree [39] to reduce the problem complexity. At the same time, this transformation allows us to detect critical nodes and edges in the graph, to put the focus on critical services and give them priority when selecting the hosting infrastructures. Our contribution in this chapter (which can also be found in [27]) consists in proposing novel algorithms to cope with the problem of cost-efficient placement of virtual services (represented by graphs of interconnected VMs) in a federation that takes into account the importance of certain services or

functions in the service request. Often some nodes are critical and have higher node degrees, indicating the number of connections they have with other nodes. Some nodes can also be dominant and central because a large number of paths and flows transit through them. The same can be said of dominant links, which also represent key resources. These critical nodes and links should be treated with care and priority and assigned to the most robust hosting resources. Our work focuses on improving and optimizing two operations:

• Gomory-Hu request transformation: it reveals dominant nodes and links. The transformation (described in the sequel) translates the problem into a placement of trees of smaller sizes instead of mapping the original graphs of much larger sizes;
• Placement optimization: based on the Gomory-Hu request transformation, we investigate various algorithms to achieve cost-efficient and critical-resource-aware placement.

Before describing our scalable algorithms based on the Gomory-Hu transformation and classification, we briefly review related work on resource allocation problems in cloud federations. Samaan et al. [65] and Mashayekhy et al. [48] rely on game theory to efficiently share available resources from multiple providers and improve revenues in hybrid clouds. The authors of [65] proposed an economic model to share the available resources of different and selfish cloud providers. The proposed solution models the interaction between the cloud providers as a repeated game for Virtual Machines (VM) outsourcing. The providers can offer all unused capacity in a spot market to guarantee revenue maximization. The proposed model only takes into account the pricing of VMs and the available capacity of the providers. The inter-provider networking costs between VMs distributed in different cloud federation providers are not considered.
In [48], a simple integer linear programming approach is used to model the federation of providers aiming at improving their revenues when sharing their infrastructures. They resort to a cloud federation game mechanism to address complexity by dynamically cooperating to form a federation and determine each provider's profit. The authors did not consider any interconnection costs between providers either. Konstanteli et al. [38] focus on optimal cloud federation resources placement across providers. They take into account computing, networking and storage costs at each provider and use heuristic algorithms to find feasible solutions in acceptable times, without however guaranteeing optimality for large problem instances. Petri et al. [61] propose the "CometCloud" based federation system that aims at reducing task processing delays and maximizing profit for each site or provider. Two federation models are used: a federation where providers interact through direct communication, and aggregated federations where sites use a distributed coordination space to interact. They analyze the effect of policies on task completion and site utilization. Task completion is out of scope in our work, which focuses on hosting and networking costs. Parak et al. [59] address cloud interoperability management frameworks and propose an interoperability solution named rOCCI, while Gang et al. [11] investigate resource sharing and data processing systems deployment in multiple clouds and identify the required features for cloud

based federated data management. A state of the art on cloud federation and a related conceptual layered architecture is provided by Assis et al. in [6]. Lapacz et al. [41] propose an architecture to enhance inter cloud communication at the networking level without, however, optimizing the sharing of resources within the federation. Luo et al. [46] proposed a novel network stitching mechanism by translating VLAN tags dynamically and combined several networking technologies such as virtual switches and NetFPGA [1] into a coherent platform enabling dynamic configuration of virtual networks. The efficient allocation and scheduling of physical resources among different virtual networks is not addressed. Bermbach et al. [8] analyze the impact of five compute redundancy strategies on the availability, on processing time, on job distribution across different clouds, on security and costs. Networking aspects, constraints and costs are not taken into account. These networking constraints that affect and modify considerably cloud federation modeling, optimization and algorithmic complexity are on the contrary central in our work. Previous work on resource allocation in cloud federations can be found in Rebai et al. [63]. In fact, this reference is the closest to our work reported in this chapter. An exact mathematical formulation of the optimization that includes networking and traffic exchanged between the providers is proposed. A complete description of the problem integrating valid inequalities spanning networking costs, hosting costs, and available resources (quota of VMs) within each provider is presented. This model can solve the problem in acceptable times for small and medium federation sizes. Another exact mathematical formulation of the cloud federation resource allocation problem is presented by Hadji et al. in [30]. 
A pricing model is combined with an exact formulation of the cloud federation optimization problem to make resource allocation decisions based on local hosting costs and dynamic pricing of outsourcing and insourcing costs by the federation members. The networking or inter provider connectivity costs are not taken into account in the model. Compared to this prior art, we propose a more complete model including hosting and inter providers’ networking costs and resource sharing optimization algorithms that converge faster and scale with problem size.

3.2

Critical Placement Algorithms

The tenants and end-users requests are expressed in the form of resource graphs composed of nodes (virtual machines seen as graph vertices) and links (graph edges connecting the virtual machines) with specified hosting and transport capabilities to be met by the placement solutions (see Figure 3.1). The links that connect the nodes in the federation are weighted by a bandwidth or flow transport capacity and are intra or inter provider links. The requests are received by a cloud provider $j$ whose objective is to make the best allocation and placement decisions to improve revenue. Thus, we investigate new solutions that jointly reduce the execution time while proposing cost efficient placement using algorithms that can scale with problem size. There are $K$ cloud providers involved in the same federation, with hosting costs $Host_j$ for a provider $j$, $j = 1, 2, \ldots, K$. The inter provider connectivity cost to link a virtual resource in provider $j$ to a resource hosted in provider $i$ is $\gamma_{ji}$, which reflects the cost of the bandwidth expected to be

consumed on the link $ji$ connecting these two resources from the same consumer service request received by provider $j$. Figure 3.1 depicts these costs in a cloud federation with a focus on provider $j$'s view.

Figure 3.1: Cloud Federation System Model

Consequently, when a virtual machine $VM_a$ from the initial request is in $j$ and another $VM_b$ is in $i$ and they are the extremities of the edge $ji$ as illustrated in Figure 3.1, the total cost (for a provider $j$) of interconnecting and hosting these two resources (2 VMs for example) is given by:

$$\Gamma_j = \gamma_{ji} + Host_j + Host_i \quad \text{with } i \neq j \tag{3.1}$$

where $Host_i$ can typically be considered as the insourcing cost of provider $i$ to provider $j$. Thus, in addition to reducing problem complexity, our contribution also aims at minimizing the defined cost $\Gamma_j$ of provider $j$ at the same time. The interconnectivity cost is mainly due to the amount of traffic that is expected to flow from provider $j$ to provider $i$ according to the requested service needs. This can be expressed for example as:

$$\gamma_{ji} = tr_{ji} \times R_{ji} \tag{3.2}$$

where Rji is the cost of a bandwidth unit on edge ji and trji is the expected end to end interaction traffic, measured in bandwidth units, between the end points (resources, e.g. virtual machines) hosted in providers j and i and interconnected by the link ji. In general, the cost γji is governed by the federation agreements and especially the established service level agreement between providers j and i. For this reason, the proposed algorithms in this work use mainly γji as the cost on edge ji for optimization purposes.
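As a small worked illustration of (3.1) and (3.2), the sketch below computes $\gamma_{ji}$ and $\Gamma_j$ for one inter-provider edge. The function names and all numerical values (traffic, unit cost, hosting costs) are hypothetical and only serve to show how the two formulas combine.

```python
def interconnect_cost(traffic_units, unit_cost):
    """gamma_ji = tr_ji * R_ji, as in (3.2)."""
    return traffic_units * unit_cost


def total_cost(gamma_ji, host_j, host_i):
    """Gamma_j = gamma_ji + Host_j + Host_i, as in (3.1), for i != j."""
    return gamma_ji + host_j + host_i


# Hypothetical values: 16 bandwidth units at 1.15 per unit,
# hosting costs of 2.0 at provider j and 1.0 at provider i.
gamma = interconnect_cost(traffic_units=16, unit_cost=1.15)
cost = total_cost(gamma, host_j=2.0, host_i=1.0)
print(round(gamma, 2), round(cost, 2))
```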

3.2.1

Problem complexity

The goal is to place virtual nodes (typically virtual machines in the virtual resources graph request) in the federation physical nodes. This consists of selecting a subset of cloud providers that can host the virtual nodes and their connectivity. The distributed and interconnected federation resources are represented by a graph $G_{Fed} = (V_{Fed}, E_{Fed})$. The virtual resources graph request is similarly defined as $G_{Req} = (V_{Req}, E_{Req})$. Each node $a$ (typically a VM) of $G_{Req}$ requires an amount $cpu_a$ of compute resources and each associated edge $e \in E_{Req}$ requires an amount of bandwidth $BW_e \in \mathbb{R}^+$. The nodes $V_{Fed}$ of the federation graph $G_{Fed}$ are the nodes made available by the federation providers. The links interconnecting the providers and their nodes are represented by the edges $e \in E_{Fed}$ of $G_{Fed}$. We consider the following case:

• $\exists e \in E_{Req}$, $BW_e > 0$ (demands with bandwidth requirements): The problem consists in optimally placing the requested VMs on providers' nodes while also satisfying the network connectivity requirements. This is very similar to the well known NP-Hard problem of Virtual Network Embedding [20] [50]. Since our problem is as complex, it is consequently also NP-Hard.

This complexity has motivated our search for lower complexity models and solutions in order to scale and converge towards near optimal solutions in practical times compatible with operational requirements.

3.2.2

Mathematical program formulations

In order to come up with efficient, scalable and near optimal solutions, we start from the linear integer program formulations proposed in [63]. As mentioned, with the exact mathematical models proposed in [63], only small size federation networking problems can be handled in practical times. Our contribution in this manuscript proposes a Gomory-Hu based classification of the resource request graphs to reduce complexity and scale. The Gomory-Hu transformation produces a tree where nodes and links are tagged with a degree reflecting their importance in the original service graph. Figure 3.2 illustrates an example of a Gomory-Hu tree transformation. The critical nodes and links identified by the Gomory-Hu tree are given placement priority and are hosted first, if possible, in the solicited provider j's infrastructure. Serving critical nodes and links from within enables provider j to more easily respect established agreements, provides stronger protection and security commitments and potentially ensures higher availability. Secondary nodes and links, from the degree standpoint, can be placed in the cooperating providers of the federation using simple criteria such as minimum cost and maximum revenue objectives. Producing a Gomory-Hu tree classification of the original service request, as depicted in the intermediate step in Figure 3.1, changes the problem to the allocation of critical nodes or links and leads to a less complex and scalable solution.

Figure 3.2: Example of a Gomory-Hu tree of a graph G

A node centric algorithm (Critical Node Detection - CND) and an edge centric algorithm (Critical Edge Detection - CED) are proposed and evaluated. The algorithms first place critical nodes or critical links in house (in provider j). They assign (outsource), in a second phase, other less important nodes or links to other providers. Note that the algorithms make sure that any outsourced request does not loop back to its source. In Figure 3.2, the flow between nodes (5) and (2) is equal to min{8, 6, 7, 6} = 6 in the depicted tree. The output of the Gomory-Hu transformation is used as input to the CND and CED algorithms described in detail in the sequel. The algorithms use the node degrees and the edge weights produced by the Gomory-Hu tree to achieve the placement.
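The flow value read from a Gomory-Hu tree is the smallest edge weight on the unique path between the two nodes. The sketch below illustrates this reading rule only. The tree topology is a hypothetical stand-in (the layout of Figure 3.2 is not reproduced here), chosen so that the path from node 5 to node 2 carries the weights 8, 6, 7, 6 quoted in the text; the function names are illustrative.

```python
def build_tree(weighted_edges):
    """Adjacency map of an undirected weighted tree."""
    tree = {}
    for u, v, w in weighted_edges:
        tree.setdefault(u, {})[v] = w
        tree.setdefault(v, {})[u] = w
    return tree


def tree_path_min(tree, source, target):
    """Smallest edge weight on the unique source-target path of a tree."""
    stack = [(source, None, float("inf"))]
    while stack:
        node, prev, bottleneck = stack.pop()
        if node == target:
            return bottleneck
        for neighbor, weight in tree[node].items():
            if neighbor != prev:  # never walk back along the same edge
                stack.append((neighbor, node, min(bottleneck, weight)))
    raise ValueError("target not reachable from source")


# Hypothetical tree: path 5-6-1-3-2 with weights 8, 6, 7, 6 and a side branch 1-4.
tree = build_tree([(5, 6, 8), (6, 1, 6), (1, 3, 7), (3, 2, 6), (1, 4, 9)])
flow = tree_path_min(tree, 5, 2)
print(flow)
```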

3.2.3

Critical Node Detection (CND) algorithm

As hinted previously, we first construct a Gomory-Hu tree of the graph request as shown in Figure 3.2, where $GH = (V_{GH}, E_{GH})$ represents the obtained Gomory-Hu tree. $V_{GH}$ is the set of vertices and $E_{GH}$ is the set of edges of $GH$. The number of vertices of $GH$ is $N$ and the number of edges is $M = N - 1$. Two thresholds are used to manage nodes and links according to their degrees or importance. These thresholds $\alpha$ and $\beta$ are selected as the average vertex degree and the average communication bandwidth (value associated to edges) between VMs of the derived Gomory-Hu tree. The thresholds are used for the critical node and critical link approaches respectively (CND and CED algorithms as depicted in Figures 3.3 and 3.4). Nodes with degrees higher than the threshold will be hosted by provider $j$ itself and the other ones can be hosted elsewhere by other providers if appropriate. The same principle applies to the links. These two approaches, specified and described in the sequel, compose the solutions advocated by this chapter to reduce complexity and scale. The CND approach consists in favoring node hosting while respecting the overall service graph request from the end users and hence the relationships between nodes. The emphasis is on the most prominent VMs. This is achieved by applying the Gomory-Hu tree search algorithm on the virtual service graph requests to obtain a simple tree that highlights critical nodes by ranking them according to their node degrees. This ranking is used to host critical nodes in the provider

j in charge of the end user request and responsible for respecting established agreements. The Gomory-Hu tree sets the degree of each VM $v$, which is declared a critical node if its degree ($d_{GH}$) reaches a threshold $\alpha$:

$$d_{GH}(v) \ge \alpha = \frac{1}{N} \sum_{v \in V_{GH}} d_{GH}(v) \tag{3.3}$$

The pseudo-code of the CND algorithm is illustrated in "Algorithm 2".

Algorithm 2 Critical Node Detection (Provider j)
  Select $S^* = \{v \in V_{GH} \,/\, d_{GH}(v) \ge \alpha\}$
  We define $S^* = \{VM^*_1, VM^*_2, \ldots, VM^*_l\}$ and $S^*_o = \{VM^*_{(1)}, VM^*_{(2)}, \ldots, VM^*_{(l)}\}$, where $d_{GH}((1)) \ge d_{GH}((2)) \ge \ldots \ge d_{GH}((l))$
  The nodes that are not critical are in the relative complement of $S^*$, $\bar{S} = V_{GH} \setminus S^*$. This leads to: $\bar{S} = \{VM^{\dagger}_1, VM^{\dagger}_2, \ldots, VM^{\dagger}_t\}$, with $t + l = N$.
  Based on these defined variables and sets, the CND algorithm operates formally as follows:
  for $i = 1$ to $l$ do
    if $CPU_j - cpu_{(i)} > 0$ then
      $VM_{(i)}$ is hosted within Provider $j$
    else
      $\bar{S} = \bar{S} \cup \{VM_{(i)}\}$
    end if
  end for
  In order to host the other nodes in other providers than $j$, we define $F = \{Pr_1, \ldots, Pr_{j-1}, Pr_{j+1}, \ldots, Pr_k\}$ and $F_o = \{Pr_{(1)}, Pr_{(2)}, \ldots, Pr_{(j-1)}, Pr_{(j+1)}, \ldots, Pr_{(k)}\}$, where $\gamma_{j(1)} + Host_1 \le \gamma_{j(2)} + Host_2 \le \ldots \le \gamma_{j(j-1)} + Host_{(j-1)} \le \gamma_{j(j+1)} + Host_{(j+1)} \le \ldots \le \gamma_{j(k)} + Host_k$
  We also define $\bar{S}_o = \{VM^{\dagger}_{(1)}, VM^{\dagger}_{(2)}, \ldots, VM^{\dagger}_{(t)}\}$, where $cpu_{(1)} \ge cpu_{(2)} \ge \ldots \ge cpu_{(t)}$
  The remaining nodes are placed on a best fit basis:
  for $v = 1$ to $t$ do
    for $i = 1$ to $k$; $i \ne j$ do
      if $CPU_{(i)} - cpu_{(v)} > 0$ then
        $VM^{\dagger}_{(v)}$ is hosted within Provider $(i)$
        EXIT
      end if
    end for
  end for

Essentially, critical nodes are optimally placed in provider j, all nodes that are less critical or are

critical but cannot be placed (because there is no room left for hosting or there is no solution) are outsourced to other providers $i \ne j$. The non critical nodes are placed on a best fit basis. Figure 3.3 illustrates how the nodes are mapped and assigned to the providers when the CND algorithm is used.
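The two phases of the CND algorithm can be sketched as follows. This is a minimal illustration under assumptions, not the evaluated implementation: node degrees, CPU demands, capacities and provider costs are hypothetical inputs; each partner is described by a hypothetical `(cost, name, capacity)` tuple whose cost stands for $\gamma_{ji} + Host_i$; and, unlike the pseudo-code, the sketch explicitly decrements a provider's remaining CPU after each placement.

```python
def cnd_placement(degree, cpu, home_capacity, partners):
    """Critical Node Detection sketch for provider j.

    degree:        dict vm -> degree in the Gomory-Hu tree
    cpu:           dict vm -> CPU demand
    home_capacity: CPU available at provider j
    partners:      list of (cost, name, capacity), cost = gamma_ji + Host_i
    """
    alpha = sum(degree.values()) / len(degree)          # threshold (3.3)
    critical = sorted((v for v in degree if degree[v] >= alpha),
                      key=lambda v: degree[v], reverse=True)
    others = [v for v in degree if degree[v] < alpha]

    placement = {}
    for vm in critical:
        if home_capacity - cpu[vm] > 0:                 # strict test, as in Algorithm 2
            placement[vm] = "j"
            home_capacity -= cpu[vm]                    # assumption: capacity is consumed
        else:
            others.append(vm)                           # falls back to outsourcing

    ranked = sorted(partners)                           # cheapest provider first
    free = {name: capacity for _cost, name, capacity in ranked}
    for vm in sorted(others, key=lambda v: cpu[v], reverse=True):
        for _cost, name, _capacity in ranked:
            if free[name] - cpu[vm] > 0:
                placement[vm] = name
                free[name] -= cpu[vm]
                break                                   # EXIT in the pseudo-code
    return alpha, placement


# Hypothetical 6-node tree: nodes 5 and 6 have degree 3, the others are leaves.
degree = {1: 1, 2: 1, 3: 1, 4: 1, 5: 3, 6: 3}
cpu = {v: 1 for v in degree}
alpha, placement = cnd_placement(degree, cpu, home_capacity=3,
                                 partners=[(1.15, "p1", 5), (2.0, "p2", 5)])
print(alpha, placement)
```

With these hypothetical inputs the two high-degree nodes stay in provider j and the four leaves go to the cheapest partner.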

Figure 3.3: Example of outcome of the CND algorithm cost computation

The CND algorithm in Figure 3.3 decides to host the critical nodes (light blue nodes) in provider j. The rest of the requested nodes (dark blue nodes) are hosted by provider 1, which offers the best cost γ. The total cost for this CND example is (4 × 0.5) + (16 × 1.15) + (2 × 0.5) = 21.4 (the total bandwidth of the virtual links between nodes 5, 6 and nodes 1, 2, 3, 4 is equal to 16 in Figure 3.2 (or Figure 3.3)).

3.2.4

Critical Edge Detection (CED) algorithm

In the obtained Gomory-Hu tree, and according to networking costs and resources quotas made available by the providers for outsourcing, we optimize the hosting and the placement of virtual requests by first selecting greedy edges in GH with bandwidth requirement exceeding the threshold β, and then select the best cloud providers (in terms of aggregate costs) to host the remaining nodes. For example β can be defined or set as follows:

$$\beta = \frac{1}{N-1} \sum_{e \in E_{GH}} B_e \qquad (3.4)$$

where Be is the bandwidth requirement of the edge e in the Gomory-Hu tree. For each edge e, we note by I(e) the initial extremity of e and by T (e) its terminal extremity in the Gomory-Hu tree. Using these notations, the CED pseudo-code is illustrated in Algorithm 3.
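Before turning to the pseudo-code, the threshold (3.4) and the resulting split between critical and non critical edges can be illustrated as follows. This is a minimal sketch: the function name `critical_edges` and the dictionary input are assumptions, and the Gomory-Hu tree is taken as given (its construction is not shown).

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]

def critical_edges(tree_bw: Dict[Edge, float]) -> Tuple[float, List[Edge], List[Edge]]:
    """Return (beta, critical edges, non critical edges), both lists by decreasing bandwidth."""
    if not tree_bw:
        raise ValueError("the Gomory-Hu tree must have at least one edge")
    # A tree on N nodes has N - 1 edges, so beta in (3.4) is the mean edge bandwidth.
    beta = sum(tree_bw.values()) / len(tree_bw)
    ranked = sorted(tree_bw, key=lambda e: -tree_bw[e])
    crit = [e for e in ranked if tree_bw[e] >= beta]
    rest = [e for e in ranked if tree_bw[e] < beta]
    return beta, crit, rest
```

For a hypothetical 6-node tree with bandwidths `{(5, 6): 10, (1, 5): 8, (1, 2): 4, (2, 3): 2, (3, 4): 1}` (values not taken from Figure 3.2), the threshold is β = 25 / 5 = 5, so edges (5, 6) and (1, 5) are critical and the three others are left to the best fit step.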


Algorithm 3 Critical Edge Detection (Provider j)

    We define $E^* = \{e \in E_{GH} : B_e \ge \beta\}$, write $E^* = \{e_1, e_2, \ldots, e_{m_1}\}$, and define $E^*_o = \{e_{(1)}, e_{(2)}, \ldots, e_{(m_1)}\}$, with edges ranked according to their weights $B_{e_{(1)}} \ge B_{e_{(2)}} \ge \ldots \ge B_{e_{(m_1)}}$;
    We also define the relative complement of $E^*$ as $\bar{E} = E_{GH} \setminus E^*$, leading to $\bar{E} = \{e'_1, e'_2, \ldots, e'_{m_2}\}$, with $m_1 + m_2 = N - 1$, pointing at less critical edges to place using a best fit policy;
    The critical edges are placed first by making sure both edge extremities have enough CPU hosting capabilities for the end points;
    for i = 1 to $m_1$ do
        if $CPU_j - (cpu_{I(e_i)} + cpu_{T(e_i)}) > 0$ then
            $e_{(i)}$ is hosted within Provider j;
        else
            $\bar{E} = \bar{E} \cup e_{(i)}$
        end if
    end for
    Non critical edges mapping:
    $F = \{Pr_1, \ldots, Pr_{j-1}, Pr_{j+1}, \ldots, Pr_k\}$;
    $F_o = \{Pr_{(1)}, Pr_{(2)}, \ldots, Pr_{(j-1)}, Pr_{(j+1)}, \ldots, Pr_{(k)}\}$ where $\gamma_{j(1)} + Host_1 \le \gamma_{j(2)} + Host_2 \le \ldots \le \gamma_{j(j-1)} + Host_{(j-1)} \le \gamma_{j(j+1)} + Host_{(j+1)} \le \ldots \le \gamma_{j(k)} + Host_k$.
    We also define $\bar{E}_o = \{e'_{(1)}, e'_{(2)}, \ldots, e'_{(m_2)}\}$ where $B_{e'_{(1)}} \ge B_{e'_{(2)}} \ge \ldots \ge B_{e'_{(m_2)}}$;
    Check edge hosting and mapping:
    for v = 1 to $m_2$ do
        if $(I(e'_{(v)}) \in F \wedge T(e'_{(v)}) \notin F) \vee (I(e'_{(v)}) \notin F \wedge T(e'_{(v)}) \in F) \vee (I(e'_{(v)}) \notin F \wedge T(e'_{(v)}) \notin F)$ then
            for i = 1 to k; i ≠ j do
                if $CPU_{(i)} - cpu_{I(e'_{(v)})} > 0$ (resp. $CPU_{(i)} - cpu_{T(e'_{(v)})} > 0$) then
                    VM $I(e'_{(v)})$ (resp. VM $T(e'_{(v)})$) is hosted within Provider (i)
                    EXIT;
                end if
            end for
        end if
    end for


Just like in the CND case, after all critical links have been hosted in provider j by the CED algorithm, we proceed by placing non critical links e (whose Be < β) in other providers than j. A best fit approach is adopted for this second step. Figure 3.4 shows how edges are mapped and assigned to the providers when the CED algorithm is used.

Figure 3.4: Example of CED algorithm cost computation

The CED algorithm elects provider j as host of the critical edges (dashed links in Figure 3.4). The rest of the edges (i.e. (1; 3)) are hosted by provider 1, which proposes the lowest cost γ. The total cost for the CED example is (4 × 0.5) + (12 × 1.15) + (2 × 0.5) = 16.8, where 12 is the total bandwidth of the virtual links between nodes 1, 3 and nodes 2, 4, 5, 6 in Figure 3.2 (or Figure 3.4).

3.3 Summary of Results

The algorithms are compared in terms of convergence time to find their respective solutions, rejection rate of end-user requests, and cost gap compared to the optimal solution. This gap is defined as follows for the CND method:

$$Gap_{CND} = \frac{Cost_{CND} - Cost_{B\&B}}{Cost_{B\&B}} \times 100 \qquad (3.5)$$

where $Cost_{CND}$ is the total cost found by the CND algorithm, and $Cost_{B\&B}$ is the cost of the Branch and Bound algorithm. The same expression holds for the CED algorithm. Parameters α and β are set by the provider j as indicated in formulas (3.3) and (3.4) respectively.

Table 3.1 reports the performance of the CND algorithm using the three cited criteria. The costs achieved by the CND algorithm are far from the optimum found by the Branch and Bound algorithm. This is expected since the CND algorithm hosts critical nodes in provider j in priority, to respect security constraints, even if the hosting costs of the privileged provider j are high. This leads to feasible but costly solutions and hence to large gaps compared to the optimum. The CND solution nevertheless has negligible rejection rates for small instances and federations, and no rejection of requests for federation sizes of 6 providers or more in Table 3.1. The CND algorithm finds feasible solutions faster than the Branch and Bound, as shown for the worst case scenario of a graph size of 8 VMs and a federation size of 10 providers (4 milliseconds for CND versus times in excess of 8 minutes for the Branch and Bound, as seen in Table 3.1).

Table 3.1: CND algorithm's performance analysis

    Federation   Graph   GAP     Reject     Time     B&B Time
    Size         Size    (%)     Rate (%)   (msec)   (msec)
    3            4       28.22   1          0        1
    3            6       32.20   2          1        6
    3            8       41.46   11         3        78
    6            4       43.68   0          1        3
    6            6       51.17   0          2        173
    6            8       57.42   0          4        12 sec
    10           4       41.62   0          1        13
    10           6       57.37   0          2        2518
    10           8       61.12   0          4        > 8 min
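As a small illustration of how a GAP entry follows from (3.5), the helper below computes the relative gap in percent. The function name and the cost values in the usage note are hypothetical, since the manuscript reports the gaps but not the underlying costs.

```python
def gap_percent(cost_heuristic: float, cost_optimal: float) -> float:
    """Relative cost gap of (3.5), in percent, with respect to the Branch and Bound cost."""
    if cost_optimal <= 0:
        raise ValueError("the optimal cost must be positive")
    return (cost_heuristic - cost_optimal) / cost_optimal * 100.0
```

For instance, a heuristic cost of 128.22 against an optimal cost of 100 gives a gap of about 28.22%.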

Table 3.2 presents the performance of the CED algorithm using the same criteria. The CED algorithm performs close to the Branch and Bound, with much smaller gaps and improved rejection rates compared to CND, and convergence times of the same order (a few milliseconds). The rejection rate of the CED algorithm does not exceed 2%, and the CED algorithm scales better with size: the gap does not degrade as significantly as with CND.

Table 3.2: CED algorithm's performance analysis

    Federation   Graph   GAP     Reject     Time     B&B Time
    Size         Size    (%)     Rate (%)   (msec)   (msec)
    3            4       4.03    1          1        1
    3            6       4.37    2          1        6
    3            8       6.48    2          3        78
    6            4       1.99    0          1        3
    6            6       1.53    0          2        173
    6            8       3.60    0          4        12 sec
    10           4       0.58    0          1        13
    10           6       1.12    0          2        2518
    10           8       0.70    0          4        > 8 min

To assess the scalability of our proposed approaches, a scenario with more cloud providers and larger requests was also used. Evaluation conditions with 100 providers and requests involving 50 nodes led to the results reported in Table 3.3.

Table 3.3: Scalability analysis: 100 providers and requests with 50 nodes

    Metric            CND      CED
    Time (sec)        6.3      6.41
    Reject rate (%)   0        0
    Cost              8014.8   1966.8

For this type of instance, involving many providers and requests with many nodes, the two algorithms (i.e. CED and CND) find near optimal solutions rather quickly, in less than 7 seconds, as shown in Table 3.3. The gaps compared to the optimum are also low and no request is rejected. Since the Branch and Bound is known not to scale with size, the proposed CED approach appears to be a good candidate to address scalability without sacrificing the quality of the solutions.

3.4 Conclusion and Future Work

Three approaches to solve the cloud federation resource placement and allocation problem in the presence of hard placement constraints, for security or protection reasons, were presented, evaluated and compared. The proposed algorithms use the Gomory-Hu tree transformation of user requests to identify critical nodes and links. These resources are hosted in priority over other less critical nodes while respecting their placement and allocation constraints. The proposed CED algorithm performs well and finds near optimal solutions.

Future work will further explore the combination of the methods and joint (one shot) node and link placement constraints to enhance the quality of the solutions. In fact, a one shot optimization method taking into account node and link placement at the same time may reduce the cost of the solutions. Another ongoing extension of our work consists in proposing a mathematical model based on a genetic approach. This approach starts by randomly generating an initial population (parents) to be used in a crossover operation to produce new children. Then, a mutation operation can be invoked to enhance the quality of the solution, which converges towards the optimum.

In this chapter, we formulated the resource (interconnected VMs) allocation problem within a cloud federation as a VM placement problem under critical and limited resource constraints. In the next chapter, we will consider the placement and chaining of virtual resources (such as VMs) in cloud data centers. End-user demands are then represented as directed graphs to be deployed judiciously in a physical infrastructure according to constraints such as the sequence of the demand chains and resource limitations.

Chapter 4

Virtualized Network Functions Chaining and Placement Problem

We consider in this chapter the problem of placement and chaining of oriented graphs of Virtualized Network Functions (VNFs) in a physical substrate. We discuss the complexity of this problem and propose two new and scalable approaches, based on graph theory and on a 2-Factor approach, that converge in negligible times towards near optimal solutions.

4.1 Introduction

4.1.1 NFV definition

Network Functions Virtualization (NFV) enables the provisioning of on-demand networking services, including connectivity as a service, and facilitates the development of agile networking services tuned to application and tenant requirements. NFV leverages virtualization technologies to realize network multi-tenancy, with the network infrastructure shared by multiple tenants. NFV implements networking functions as software to be executed on industry standard physical nodes or commodity hardware. The software components, known as virtualized network functions (VNFs), can be deployed, migrated, shut down and upgraded easily. Typical VNFs are DPIs, firewalls, load balancers, routers/switches, etc. NFV can reduce cost and time to market when delivering network services compared with, for instance, middleboxes deployed on expensive and proprietary hardware, which induce high costs and slow down access to the market.

In this chapter, the focus is on optimal Service Function Chaining (SFC) as defined by IETF [62], also known as VNF-Forwarding Graph (VNF-FG) in ETSI [16], according to demand. Interested readers can refer to [51] for additional insight on the history of NFV and its relationship with SDN and cloud computing. Nevertheless, we provide in Figure 4.1 an example of VNF chains to be deployed on a Network Functions Virtualization Infrastructure (NFVI). More details on Service Function Chains (SFC) or Forwarding Graphs can be found in [55] and [31], for example.

Figure 4.1: ETSI VNF-Forwarding Graph view

4.1.2 State of the art

In [2], Abbasi et al. discussed VNF placement in the cloud and addressed the VNF forwarding graph (a term introduced by ETSI and often confused with service chaining) placement problem. They proposed a mathematical model based on linear integer programming to cope with the VNF chaining and placement optimization problem when taking into account two essential constraints: i) cloud computation constraints and ii) physical networking resource limitations. In other words, the authors of [2] investigated the optimization of VNF chaining placement considering a tradeoff between the computation and the communication overhead. They only optimized the network resources on the physical substrate by reducing the communication costs, without taking into account the bandwidth required on the VNF chains.

The problem of mapping and scheduling VNFs is addressed in [52]. Mijumbi et al. proposed three greedy solutions and a tabu-search based heuristic to realize online mapping and scheduling of the VNFs simultaneously. In all of the proposed algorithms, the authors used acceptance ratio, cost and revenue as criteria to optimize. The authors consider only VNFs as nodes in the mapping and scheduling processes, without any arcs (or links) between the VNFs. Consequently, they do not address the sequencing between the addressed processes and hence do not consider any chaining or bandwidth requirement.

Another reference in NFV management is provided by the project Stratos [21]. The authors of this paper proposed a detailed architecture to orchestrate VNFs outsourced to a remote cloud when taking into account various constraints such as traffic engineering, VNF horizontal scaling, etc. It is important to mention that the VNF deployment solution proposed in [21] depends merely upon the input workload.

Moens et al. [53], Bernardetta et al. [9] and Cohen et al. [12] propose mathematical models for VNF chain placement with routing constraints. The proposed models, however, describe a limited number of linear constraints and can only capture a very small portion of the problem convex hull. The proposed exact solutions do not scale for large problem instances. We need deeper modeling that can better (ideally completely) characterize the convex hull of the VNF placement and chaining problem, in order to find near optimal solutions in a few seconds and to scale with problem size. We propose two new and competitive graph theory approaches that have the desired properties of finding solutions quickly for the VNF chaining problem and of scaling with problem size.

4.1.3 Our contribution to NFV

In our contribution to NFV (see Khebbache et al. [36] for more details), we are interested in providing optimization algorithms for the ETSI VNF-FG use case, where service providers acquire cloud and networking services and resources in the form of a graph to use in their own service design. Cloud Service Providers expect a placement of their VNFs that ensures the routing of their application flows according to the VNF sequencing specified in their service function chains. Consequently, the requests have the form of a chained service graph composed of a number of directed and dependent forwarding paths, as described by the ETSI-NFV VNF-FG use case in [17] (Use Case #4: VNF Forwarding Graphs), which provides the desired details for interested readers. Note also that addressing requests in the VNF-FG sense automatically covers a subset of the Service Function Chaining problem that is currently the subject of IETF SFC [55], [31].

The aim of this chapter is to propose efficient algorithms for VNF chain placement that find good solutions and scale with problem size, both in requested chains and in infrastructure graphs. In order to find solutions close to the optimum, we propose an exact approach based on the "Perfect 2-Factor theory", presenting a complete mathematical programming description of a variant of the problem. Next to this exact solution, we propose a fast heuristic using a multi-stage graph construction of the problem. These approaches are:

1. An exact algorithm based on the Perfect 2-Factor theory to solve rapidly the case of VNF chains composed of 3 VNFs.

2. An approach based on a multi-stage graph representation: this heuristic is based on the construction of a new extended multi-stage graph representing the servers available to host the required VNFs. The chains corresponding to the VNF forwarding graphs are placed using a maximum flow between the vertices of the different stages of the multi-stage graph representation. This is achieved while respecting the sequencing imposed by the requested chains.

4.2 VNF Placement and Chaining Algorithms

This section introduces our mathematical models for the VNF chaining and placement problem. Figure 4.2 illustrates the problem with the forwarding graph requests on the right side (directed graph). The hosting infrastructure topology (undirected graph) is depicted on the left side, with servers annotated by their available processing capabilities (measured for instance in number of available CPUs) and links weighted by their remaining bandwidth (expressed in Mbps). The objective is to map the request (ideally optimally) on the infrastructure to host the VNFs so that the service chains are respected in terms of VNF sequencing and associated compute and networking requirements.

Figure 4.2: System Model: Physical infrastructure (left side) and SFC request (right side)

4.2.1 Perfect 2-Factor algorithm

If we represent each VNF chain of this forwarding graph by a cycle as depicted in Figure 4.3 by adding a fictitious arc between the first VNF and the last VNF composing a chain, the VNF chains placement problem will consist of finding a subset of cycles in the physical graph capable of hosting the VNFs and meeting both the chain sequencing and flow requirements. Solving the VNF chain and placement problem reduces to finding all possible cycles verifying the VNF chains link and node requirements. To do so, we need to explore the substrate graph for cycles that meet the resources (links) availability constraints. This is known as the Perfect 2-Factor problem that happens to have polynomial time complexity [39].

Figure 4.3: Cycle representation of a VNF chain


The Perfect 2-Factor problem consists of selecting edges of the substrate graph so that every node has a degree (number of neighbors) of 2 in the selected subgraph. This is equivalent to finding cycles in the substrate graph, with minimum capacities, verifying arc and node resource limitations or constraints. Thus, we propose to use the convex hull (linear programming description) of the Perfect 2-Factor polytope to cope with the problem for VNF chains with 2 arcs (see Figure 4.3). For VNF chain requests with at least 3 arcs, we extend the convex hull of the VNF chaining and placement problem by adding new valid inequalities to accelerate convergence to the optimal solution.

Let $C_e$ be the residual bandwidth on the link e ($e \in E_s$, the set of physical edges) of the substrate graph, and $R_c$ the required bandwidth (flow rate) of the VNF chain c. Finding a solution to the VNF chaining and placement problem consists of finding edges $e \in E_s$ and nodes v ($v \in V_s$, the set of physical servers) with sufficient capacity (enough bandwidth and CPU respectively) to host the demand. For the sake of rapid convergence, we need to eliminate right away all the edges of $E_s$ that cannot serve the demand. We achieve this elimination by minimizing the following utility function:

$$\min \sum_{e \in E_s,\, e=(i,j)} (C_e - R_c)\, 1_+\, x_e \qquad (4.1)$$

where $x_e$ is a binary variable indicating whether an edge e is selected in the solution ($x_e = 1$) or not ($x_e = 0$). Since we aim at reducing the consumption of the total bandwidth, we use:

$$1_+ = \begin{cases} 1, & \text{if } C_e - R_c > 0; \\ \infty, & \text{otherwise.} \end{cases}$$

To describe the convex hull of the Perfect 2-Factor problem, we use equality (4.2) to ensure that all the nodes/servers in the substrate graph will have a degree 2, and will thus be covered by a cycle.

$$\sum_{e \in \delta(v)} x_e = 2 \qquad (4.2)$$

where $\delta(v)$ represents the set of edges that are incident to the node v ($v \in V_s$). The following family of valid inequalities (known as Blossom Inequalities, see [39] for more information) reinforces the previous constraints to get integer solutions of the considered problem using continuous variables. They are given by:

$$\sum_{e \in E(G(A))} x_e + x(F) \le \left\lfloor \frac{2|A| + |F|}{2} \right\rfloor \qquad (4.3)$$

where $A \subseteq V_s$, $F \subseteq \delta(A)$, and $\sum_{v \in A} 2 + |F|$ is odd. Blossom inequalities are essentially used to solve NP-Hard problems in Operations Research. They are for instance used to solve the Traveling Salesman Problem (see [68] for details). Thus, the mathematical model of the VNF chaining and placement problem is similar to the linear program of the Perfect 2-Factor given as follows:


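$$
\begin{array}{ll}
\min Z = \displaystyle\sum_{e \in E_s,\, e=(i,j)} (C_e - R_c)\, 1_+\, x_e & \\
\text{S.T.:} & \\
\displaystyle\sum_{e \in \delta(v)} x_e = 2, & \forall v \in V_s; \\
\displaystyle\sum_{e \in E(G(A))} x_e + x(F) \le \left\lfloor \frac{2|A| + |F|}{2} \right\rfloor, & \forall A \subseteq V_s,\ F \subseteq \delta(A),\ \sum_{v \in A} 2 + |F| \text{ is odd}; \\
x_e \in \mathbb{R}^+, & \forall e \in E_s.
\end{array}
\qquad (4.4)
$$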
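To make the objective and the degree constraint of (4.4) concrete, the following sketch enumerates, on a very small graph, all edge subsets in which every node has degree 2 and keeps the cheapest one. It is a brute-force illustration with exponential cost, not the linear program itself; the function names and the example capacities are assumptions.

```python
import math
from itertools import combinations
from typing import Dict, List, Optional, Tuple

Edge = Tuple[int, int]

def edge_cost(capacity: float, demand: float) -> float:
    """Coefficient (C_e - R_c) * 1_+ of (4.1): infinite when the edge cannot carry R_c."""
    return capacity - demand if capacity - demand > 0 else math.inf

def min_cost_2factor(nodes: List[int], cap: Dict[Edge, float],
                     demand: float) -> Tuple[float, Optional[Tuple[Edge, ...]]]:
    """Enumerate edge subsets in which every node has degree 2; keep the cheapest."""
    best_cost, best_edges = math.inf, None
    # A 2-factor has exactly |V| edges, since the degrees sum to 2|V|.
    for subset in combinations(sorted(cap), len(nodes)):
        degree = {v: 0 for v in nodes}
        for u, v in subset:
            degree[u] += 1
            degree[v] += 1
        if any(d != 2 for d in degree.values()):
            continue                      # violates the degree constraint (4.2)
        cost = sum(edge_cost(cap[e], demand) for e in subset)
        if cost < best_cost:
            best_cost, best_edges = cost, subset
    return best_cost, best_edges
```

On a hypothetical 4-server graph with capacities `{(0, 1): 30, (1, 2): 25, (2, 3): 40, (0, 3): 20, (0, 2): 8, (1, 3): 50}` and a demand of 10, edge (0, 2) receives an infinite coefficient, so the only finite 2-factor is the cycle 0-1-2-3-0 with cost 20 + 15 + 30 + 10 = 75.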

The mathematical model (4.4) finds, in polynomial time, cycles in the substrate graph that satisfy the required flow of each VNF chain c in the forwarding graph. Moreover, this model also minimizes resource consumption by assigning the minimum amount of available resources that meets the VNF chain requests. More details on the Perfect 2-Factor problem and the associated linear programming approach (4.4) can be found in [39]. To place a VNF chain c with a length $l_c$ (number of arcs of the chain), we should find cycles with a length at least equal to $l_c$ (this enables hosting the required VNFs by allocating one VNF to each node in the worst case), and this is not necessarily verified in the model (4.4). Indeed, we have the following result:

Proposition 4 The VNF chaining and placement problem can be entirely formalized to reach near-optimal/optimal solutions in negligible time for VNF chains c with $l_c = 2$ (i.e. chains with 3 VNFs).

Proof For VNF chains with 2 arcs (a chain with 2 or 3 VNFs), we add a fictitious arc from the last VNF to the first VNF in the forwarding graph, as shown in Figure 4.3. This fictitious arc is weighted by the throughput required by the VNF chain. This flow is equal to 10 Mbps in Figure 4.3. This transforms the VNF chain into a VNF cycle. We can then search for cycles (of size 3) meeting the required flow. That is, we look for all existing cycles of length 3 in the substrate graph. This problem is equivalent to the polynomial Perfect 2-Factor problem, which will be used to find cycles in polynomial time even for large problem instances. As the polytope and the convex hull of the Perfect 2-Factor problem are completely described (see (4.4) and [39]), we will use this description to find in polynomial time all the cycles of the substrate graph with length equal to the size of our VNF chains (i.e. 3 VNFs). We conclude that optimally placing VNF chains with 3 VNFs amounts to finding cycles meeting the required flows in polynomial time. □

For VNF chains with at least 3 arcs ($l_c \ge 3$, i.e. VNF chains with at least 4 VNFs), our chaining and placement problem is NP-Hard (see [9] for instance), and we should investigate new valid inequalities to accelerate the resolution time needed to reach the optimal/exact solution. Therefore, a heuristic algorithm is proposed in the next section to cope with the scalability issue.
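The cycle search used in the proof can be illustrated by a direct enumeration of the 3-cycles whose edges can all carry the chain flow. This is a naive sketch rather than the polyhedral approach of (4.4); the function name, the CPU filter (`cpu_per_vnf`) and the example values are assumptions.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Edge = Tuple[int, int]

def feasible_triangles(cap: Dict[Edge, float], cpu: Dict[int, int],
                       demand: float, cpu_per_vnf: int = 1) -> List[Tuple[int, int, int]]:
    """List the 3-cycles whose edges all keep a positive residual bandwidth for `demand`."""
    def ok(u: int, v: int) -> bool:
        c = cap.get((u, v), cap.get((v, u)))   # undirected lookup
        return c is not None and c - demand > 0
    # Only servers able to host at least one VNF are candidates.
    servers = sorted(v for v in cpu if cpu[v] >= cpu_per_vnf)
    return [t for t in combinations(servers, 3)
            if ok(t[0], t[1]) and ok(t[1], t[2]) and ok(t[0], t[2])]
```

With capacities `{(0, 1): 30, (1, 2): 25, (0, 2): 8, (2, 3): 40, (1, 3): 50, (0, 3): 20}`, one free CPU or more on every server and a 10 Mbps chain, the candidate cycles are (0, 1, 3) and (1, 2, 3); every cycle through edge (0, 2) is discarded.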

4.2.2 Algorithm based on multi-stage graph construction

To cope with scalability issues, and to consider requests with at least 3 VNFs, we propose a heuristic approach using graph theory. The principal objective function and constraints of this approach are similar to those given by (4.4). Our heuristic uses a multi-stage graph construction detailed in the sequel.

Let M = (V, A, K) be the multi-stage graph constructed from the set V of all available physical servers, the set A of arcs as defined below, and the number of levels K, which is equal to the total number of requested VNF types. Thus, each level k of the graph M, with k = 1, . . . , K, corresponds to one of the involved VNF types. To populate the graph M, we assume that each server is capable of hosting a number of VNFs of different types, and that their combination results in an amount of required resources available on the physical server. From the multi-stage graph construction standpoint, this means that if a given node, say $s^*$, can host both VNF $f_1$ and VNF $f_2$, then $s^*$ will appear at the VNF $f_1$ and VNF $f_2$ levels of the multi-stage graph. A node can appear in as many levels as the VNF types it can handle. The set of arcs in the multi-stage graph follows a rule that makes sure that VNFs from the same service chain are assigned to hosting nodes without exceeding the available resources in the selected nodes:

• There exists an arc from the server i at level k (noted by $i_k$) to the server j at level k + 1 (noted by $j_{k+1}$).

This construction rule ensures that VNFs of the same service chain can share the same server according to the available server capacity (CPU in our work). In addition, the arc $(i_k; j_{k+1})$ has a weight given by the maximum flow that can be routed from the server $i_k$ to the server $j_{k+1}$ in the physical substrate. Figure 4.4 represents an example of the extended multi-stage graph with arbitrary values of available resources and capacities. We considered forwarding graphs with 3 VNFs and a physical infrastructure with 4 servers.
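The construction rule can be sketched as follows. The function name `build_multistage`, the `can_host` mapping and the `max_flow` callback are hypothetical: the maximum flow between two servers is assumed to be provided by an external routine, and arcs between two copies of the same server are given an infinite weight on the assumption that co-located VNFs need no substrate bandwidth.

```python
import math
from typing import Callable, Dict, List, Set, Tuple

Node = Tuple[str, int]                     # (server name, level)
Arcs = Dict[Tuple[Node, Node], float]

def build_multistage(chain: List[str], can_host: Dict[str, Set[str]],
                     max_flow: Callable[[str, str], float]) -> Tuple[List[List[Node]], Arcs]:
    """One level per VNF of the chain; arcs link consecutive levels only."""
    # Level k lists every server able to host the k-th VNF type.
    levels = [[(s, k) for s in sorted(can_host) if vnf in can_host[s]]
              for k, vnf in enumerate(chain, start=1)]
    arcs: Arcs = {}
    for lower, upper in zip(levels, levels[1:]):
        for i in lower:
            for j in upper:
                # Assumption: two VNFs on the same server need no substrate bandwidth.
                arcs[(i, j)] = math.inf if i[0] == j[0] else max_flow(i[0], j[0])
    return levels, arcs
```

For a chain `["f1", "f2", "f3"]` and four servers where `s1` hosts `f1` and `f2`, `s2` hosts `f2`, `s3` hosts `f3` and `s4` hosts `f1` and `f3`, the graph has two nodes per level and eight arcs, and server `s1` appears at both levels 1 and 2.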

Figure 4.4: Multi-stage graph construction example

To find the optimal locations of the VNFs in the order expressed in the chain requests, we use a flow marking process (with a symbol +) starting from the fictitious node T to the fictitious sink S, as depicted in Figure 4.4. The marking process operates as described below until node S is reached, with details on the procedure provided in Algorithm 4. Note that Figure 4.4 shows the remaining number of CPUs on the hosting nodes: the allocated CPUs (barred values) are removed from the available compute resources to indicate a transition from the current state to a new state.

Algorithm 4 Multi-stage Graph Algorithm

    Input: Substrate graph, forwarding graph.
    Output: A mapping of the forwarding graph on the physical substrate.
    Step 1: Create the multi-stage graph M according to the description given above;
    Step 2: For each chain c of the forwarding graph, apply a flow marking process from T to S;
    Step 3: If node S is marked (for all the chains), then a solution has been found. Otherwise, the request is rejected.

Marking process description

• Starting from T (already marked), select the node (at level K) with minimum (or maximum) processing capability;

• Suppose that a node $j_K$ is selected and marked. Update the capability of server j at each level h (1 ≤ h < K) in which j appears. Explore all the predecessors $i_{K-1}$ of $j_K$, and mark one of the predecessors (or the predecessor) representing an extremity of an arc with a minimum residual resource (bandwidth). If more than one predecessor has the same capacity, then choose one which has an extremity node with a minimum (or maximum) processing capability.

• Repeat this process until the node S is reached and marked.
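The marking steps above can be sketched for a single chain as follows. Several details are assumptions made for the illustration and are not specified in the manuscript: each VNF consumes one CPU, an arc is usable only if its weight is at least the chain demand, staying on the same server costs no bandwidth, arc weights are not updated after a placement, and CPU updates are committed only when the whole chain is placed.

```python
import math
from typing import Dict, List, Optional, Tuple

def mark_chain(levels: List[List[str]], cpu: Dict[str, int],
               bw: Dict[Tuple[str, str], float], demand: float,
               prefer_min: bool = True) -> Optional[List[str]]:
    """Mark one server per level, from level K back to level 1; None means rejection."""
    sign = 1 if prefer_min else -1
    trial = dict(cpu)                      # commit CPU updates only if the chain fits

    def arc(s: str, t: str) -> float:
        # Assumption: staying on the same server consumes no substrate bandwidth.
        return math.inf if s == t else bw.get((s, t), 0.0)

    path: List[str] = []
    nxt: Optional[str] = None              # server marked at the level above
    for level in reversed(levels):
        feasible = [s for s in level
                    if trial[s] >= 1 and (nxt is None or arc(s, nxt) >= demand)]
        if not feasible:
            return None                    # the last node cannot be marked
        # Smallest sufficient arc first, then min (or max) CPU, then name.
        pick = min(feasible, key=lambda s: (0.0 if nxt is None else arc(s, nxt),
                                            sign * trial[s], s))
        trial[pick] -= 1                   # one CPU per VNF (assumption)
        path.append(pick)
        nxt = pick
    cpu.update(trial)
    return path[::-1]
```

Calling the function repeatedly on the same `cpu` dictionary mimics successive chain requests: each accepted chain lowers the residual CPUs, and a call returns `None` once some level has no feasible server left. Setting `prefer_min=False` gives the "max" variant of the heuristic.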

4.3 Summary of Results

To assess the performance of our algorithms, we use the following metrics:

1. Execution Time: the average time required by the algorithms to find near optimal solutions;

2. Acceptance Ratio: the average number of requests (VNF-FG) accepted for hosting in the physical substrate.

We start by assessing the execution time of the described algorithms to find near optimal solutions for the requested VNF chaining and placement. We considered different scenarios with physical network sizes in the [10, 150] interval, and the parameters of the VNF-FG requests are as described above. Figure 4.5 depicts the execution time of the multi-stage and the 2-Factor algorithms. This execution time remains below 0.05 sec for the worst case of this scenario for the multi-stage method.


Figure 4.5: Convergence time of our algorithms


Figure 4.6: Service request acceptance ratio over time

The execution time of the 2-Factor solution is negligible (0.45 sec). The proposed approaches scale well even for large graph instances (up to 150 servers used in the performance evaluation), with a significant advantage for the multi-stage algorithm.

Figure 4.6 depicts the acceptance ratio of the multi-stage (min and max) and 2-Factor approaches. Clearly, the 2-Factor method performs better than the two variants of the multi-stage algorithm, as it explores the entire solution space in polynomial time thanks to the description of the convex hull given in (4.4). This leads to acceptance ratios often close to 100%, whereas the multi-stage approaches (min and max) achieve acceptance ratios close to 95% at the end of the simulation time.

We compare our proposed methods to the state of the art using metrics (acceptance ratio and cost) that allow fair comparisons. Indeed, depending on the scenarios, evaluation conditions and metrics used, some comparisons with previous work are not possible (we considered requests with a number of VNFs uniformly distributed in the [5, 10] interval). We have nevertheless selected a number of viable comparisons involving real traces from the Interoute-Network [34] (a network with 110 nodes and 148 edges). The results are summarized in Table 4.1.

Table 4.1: Comparison using Interoute-Network

                           Multi-stage-max   Multi-stage-min   [64]
    Acceptance Ratio (%)   99.9              95                93
    Cost                   200               192               180

Table 4.1 reports the embedding cost of our proposed algorithms and of the state of the art approach [64]. The embedding cost of [64] is lower than the cost of our proposed approaches, but this is achieved at the expense of quality, with more sub-optimal placements in [64], which accepts fewer requests (93% only). Our algorithms accept more requests, which naturally costs more. Looking at acceptance ratio and cost jointly, we outperform [64].


To assess the scalability of our approaches, we compare their performance with [45] and report the outcomes in Table 4.2 in terms of the execution times required to find the solutions for two cases: VNF chains with 3 nodes on a physical substrate of 50 and of 200 servers.

Table 4.2: Comparison with state of the art of [45]

    Average Time (s)   Multi-stage (min/max)   2-Factor   [45]
    50 nodes           0.005                   0.13       60
    200 nodes          ≈ 0.05                  0.45       1000

Our proposed algorithms significantly improve the execution time compared to [45]. This gap is even larger for large graph sizes (case of 200 servers). Our algorithms are much faster and find solutions in under a second (at most 0.45 seconds in the worst case for the evaluated scenarios), compared to the execution times of [45], which are much higher, in the order of 1000 seconds.

4.4 Conclusion and Future Work

We presented in this chapter a complete mathematical description of the convex hull of the chaining and placement problem, using a 2-Factor mathematical model for forwarding graphs with exactly two arcs (or 3 VNFs). Since the problem is NP-Hard in the more general case, we proposed a new heuristic solution based on the construction of a multi-stage graph. The heuristic improves the state of the art in execution time and acceptance ratios, and scales well with increasing forwarding graph sizes and infrastructure sizes. Our solutions are competitive in terms of performance and complexity compared to the current state of the art.

In future work, we will consider the VNF chaining and placement problem for VNF-FGs with at least 4 VNFs. The problem becomes harder in terms of algorithmic complexity, and we will investigate new valid inequalities to accelerate convergence to the optimal solution. This is detailed in Chapter 5. Another ongoing extension of this chapter consists in considering server and link reliability constraints in the VNF placement and chaining problem. We will investigate the use of polyhedral aspects to find optimal solutions in acceptable execution times.

Chapter 5

Research Perspectives and Challenges

The aim of this chapter is to identify new research and technological challenges in cloud computing and NFV, with a focus on 5G networks. These challenges concern the VNF placement and chaining problem under reliability constraints, to deal with large forwarding graphs; the virtual Content Delivery Networks (vCDN) placement and migration problem, as already proposed by ETSI as a use case (see reference [17]); and a last emergent domain of 5G networks, in which we address the problem of Base Band Unit (BBU) function splits in Cloud Radio Access Networks (C-RAN).

5.1 VNF Placement and Chaining

We recall that VNF-FGs placement and chaining problem is addressed in Chapter 4, and new scalable solutions are proposed invoking an exact mathematical formulation based on the Perfect 2Factor theory to deal with the placement of VNF-FGs with 3 VNFs. To cope with the NP-Hardness of this problem for VNF chains with at least 4 VNFs, and to judiciously place these forwarding graphs in acceptable times, we need new valid inequalities to accelerate convergence time to the optimal solution. For a chain c, with a length lc (number of arcs), we propose new valid inequalities allowing to find cycles with length at least equal to lc . This means that we will investigate solutions excluding all the cycles of length lower than lc . The following proposition describes a new valid inequality to enhance considerably the necessarily time resolution of the VNF-FGs placement and chaining problem to reach optimal solutions for chains with at least 4 VNFs. Moreover, the new inequality consists to better describe the convex hull of the placement and chaining of VNF-FGs, by eliminating all the infeasible candidates not verifying cycle length constraints. Indeed, this allows to solve practical problem instances in acceptable times. 57


Proposition 5 For a given VNF chain c, the following inequality is valid for the VNF-FGs placement and chaining problem:

$$
x(U) \le |U| - 1, \quad \forall\, U \subseteq V_s,\ 3 \le |U| \le l_c \tag{5.1}
$$

Note that x is the {0; 1} incidence vector of the convex hull of the VNF-FGs placement and chaining problem. Using this new valid inequality (5.1) which represents a facet of the VNF-FGs placement and chaining problem, one can formulate a new mathematical model verifying all the previously cited requirements and constraints. For each given chain c with a flow requirement $R_c$, we obtain a new formulation given as follows:

$$
\begin{aligned}
\min\ Z = {} & \sum_{e \in E,\, e=(i,j)} (C_e - R_c)^{1+}\, x_e \\
\text{S.T.:}\quad
& \sum_{e \in \delta(v)} x_e = 2, && \forall v \in V_s; \\
& \sum_{e \in E(G(A))} x_e + x(F) \le \left\lfloor \frac{2|A| + |F|}{2} \right\rfloor, && \forall A \subseteq V_s,\ F \subseteq \delta(A),\ \sum_{v \in A} 2 + |F| \text{ is odd}; \\
& x(U) \le |U| - 1, && \forall U \subseteq V_s,\ 2 \le |U| \le l_c; \\
& x_e \in \{0; 1\}, && \forall e \in E_s;
\end{aligned}
\tag{5.2}
$$
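As an illustration of the degree constraints of model (5.2), the following minimal Python sketch checks whether a 0/1 vector x selects exactly two edges at each node and, if so, decomposes the selected edges into cycles. The function name `two_factor_cycles` and the toy instance are hypothetical and not part of the formulation; edges are assumed to be undirected and stored as node pairs.

```python
from collections import defaultdict

def two_factor_cycles(nodes, x):
    """Return the cycles selected by a 0/1 edge vector x, or None if some
    node does not have exactly two selected incident edges."""
    adj = defaultdict(list)
    for (i, j), val in x.items():
        if val == 1:
            adj[i].append(j)
            adj[j].append(i)
    if any(len(adj[v]) != 2 for v in nodes):
        return None  # degree constraint of (5.2) is violated
    cycles, seen = [], set()
    for start in nodes:
        if start in seen:
            continue
        cycle, prev, cur = [start], None, start
        seen.add(start)
        while True:
            a, b = adj[cur]
            nxt = b if a == prev else a  # do not walk back along the same edge
            if nxt == start:
                break
            cycle.append(nxt)
            seen.add(nxt)
            prev, cur = cur, nxt
        cycles.append(cycle)
    return cycles

# Hypothetical candidate: two disjoint triangles on nodes 1..6.
x = {(1, 2): 1, (2, 3): 1, (1, 3): 1, (4, 5): 1, (5, 6): 1, (4, 6): 1}
cycles = two_factor_cycles(range(1, 7), x)
print([len(c) for c in cycles])  # [3, 3]
```

Assuming that x(U) denotes the sum of $x_e$ over edges with both endpoints in U, a cycle on k nodes gives x(U) = k > k − 1, so each reported cycle with at most $l_c$ nodes corresponds to a violated third constraint of (5.2).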

It is important to mention that inequalities (5.1) are strongly related to the selection of the set of vertices U in $V_s$. To add these inequalities to our mathematical model, we must generate subsets of vertices U and check whether (5.1) is violated. The subsets U should be judiciously chosen; in our case, as we solve instances of practical size, one can enumerate some cases and add them to the linear program. A few generated sets U can be enough to quickly find the optimal solution. Finding all the sets U of nodes that violate inequalities (5.1) can be an NP-Hard separation problem, owing to the exponential number of possible subsets U of $V_s$. Thus, the separation problem for inequalities (5.1) is another ongoing research challenge, aimed at efficiently finding subsets U that violate these constraints. For instance, Lawler et al. [68] proposed an algorithm based on network flow techniques to efficiently separate inequalities (5.1). After deploying the VNFs of the VNF-FGs on the obtained cycles, we will investigate new solutions to verify the feasibility of the chains obtained within these cycles. For example, flow techniques (see reference [29]) may be used to accelerate this verification.
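The enumeration of candidate subsets U mentioned above can be sketched as follows. This is only a brute-force illustration for small instances, not the flow-based separation of [68]; the function names and the toy solution are hypothetical, and x(U) is assumed to denote the sum of $x_e$ over the edges having both endpoints in U.

```python
from itertools import combinations

def x_of(U, x):
    """Sum of x_e over edges with both endpoints in U (assumed meaning of x(U))."""
    U = set(U)
    return sum(val for (i, j), val in x.items() if i in U and j in U)

def violated_subsets(nodes, x, l_c, min_size=3, eps=1e-9):
    """Enumerate the subsets U with min_size <= |U| <= l_c and x(U) > |U| - 1."""
    violated = []
    for size in range(min_size, l_c + 1):
        for U in combinations(sorted(nodes), size):
            if x_of(U, x) > size - 1 + eps:
                violated.append(U)
    return violated

# Hypothetical candidate solution: two disjoint triangles on nodes 1..6.
x = {(1, 2): 1, (2, 3): 1, (1, 3): 1, (4, 5): 1, (5, 6): 1, (4, 6): 1}
cuts = violated_subsets(range(1, 7), x, l_c=3)
print(cuts)  # [(1, 2, 3), (4, 5, 6)]
```

Each returned subset yields one inequality (5.1) that can be added to the linear program before re-solving. The number of subsets grows exponentially with $l_c$, which is why this enumeration is only reasonable for instances of practical size.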

Reliability constraints

Next, we focus on the variant of the VNF placement and chaining problem that takes into account physical server and link reliability constraints. Hence, we propose to investigate new methods to identify and detect node and link failures before deploying new, robust placement algorithms. Moreover, these solutions should be dynamic, so as to take optimal decisions according to network changes. This future work will address the following:
• Placement and orchestration of VNF chains to avoid node and link failures,
• New and cost-efficient migration algorithms according to SLA violation constraints,
• Integration of the results into the Management and Orchestration (MANO) component of the ETSI NFV architecture, to test and compare our results with the state of the art.

5.2

virtual Content Delivery Networks (vCDN) Placement and Migration: an ETSI Use-Case

Among the network applications meeting the virtualization challenges through softwarization, Content Delivery Networks are one of the promising and important research challenges to be addressed under the NFV and cloud paradigms. vCDN is the software that executes CDN services over a virtualized infrastructure. Content caches distributed over different nodes of an operator’s network should be easy to update and migrate, to enhance end users’ QoS while reducing the operational costs of caching providers. Despite the existence of placement techniques used by caching providers, we think that these solutions lack the optimization needed to efficiently reduce the operational costs of CDN providers. We believe that the performance of vCDN services (especially video services) can be enhanced by considering various network parameters when adopting NFV techniques. This improves the overall caching performance and gives a clear view of where and how to migrate a vCDN. We will address the problem of placement and migration of vCDNs (a use-case addressed by ETSI in [17]) under various constraints such as the positions of the access points of the demands and the available network resources (see Figure 5.1 for more details). We will investigate mathematical models describing the convex hull of the vCDN placement and migration problem when considering the operator network constraints. This mathematical formulation has to minimize the cost of content migration and the costs of streaming and link utilization, and should reach optimal solutions even for large problem instances. As the vCDN placement and migration problem is a well known NP-Hard problem, our contribution consists in finding optimal solutions in negligible times even for large problem instances. To this end, we will investigate new methods using graph theory and linear programming to reach optimal or near optimal solutions of this problem.


Figure 5.1: vCDN placement and migration problem

5.3

Optimal Split of BBU Functions in C-RAN

Cloud Radio Access Networks (C-RAN) are used by network operators to offer radio access services using the cloud computing and NFV paradigms (see [60] for more details). The objective of C-RAN is to offer dynamic and more agile services by modifying the classical architecture of cellular networks, in which each Base Band Unit (BBU) is connected to one Remote Radio Head (RRH), causing wasted resources and high energy consumption. Figure 5.2 illustrates the envisioned C-RAN architecture.

Figure 5.2: C-RAN Macroscopic Architecture View


As mentioned in the literature (see Nickein [56] and Jingchu et al. [44], for example), C-RAN consists in virtualizing the BBUs to construct a BBU pool that is shared by various RRHs at the same time. This BBU pool is dynamically deployed on a set of available physical servers to absorb demand fluctuations and peaks. In other words, base stations are provisioned on demand, according to end-user positions and demands. Thus, Cloud RAN allows network operators to be more agile and to serve a large number of end-users at the same time while reducing the energy consumption of the data centers hosting the BBU pools. Nevertheless, to benefit from the advantages of C-RAN, network operators face research and technological challenges that should be discussed and solved to improve network performance and the Quality of Experience (QoE). As an ongoing research direction, we will focus on how to optimally split the BBU functions to improve C-RAN performance (see Figure 5.3).

Figure 5.3: BBU functions splits in C-RAN

Figure 5.3 illustrates the set of functions of a BBU that are needed to process the uplink and downlink demands of mobile end-users. It is important to note that these functions communicate with each other to guarantee the feasibility of the uplink and downlink operations. To enhance network performance and reduce resource consumption, we investigate solutions that judiciously split these functions in order to find a good trade-off between the necessary computation time and the latency. This split is subject to the following constraints:
1. Reduce fronthaul network resource consumption,
2. Reduce the latency by deploying certain BBU functions in the same pool, although this can degrade computation time.

Our contribution to C-RAN consists in proposing a new model of an optimal BBU functions split under these constraints. Our approach represents the uplink and downlink operations as directed and weighted graphs to be deployed or mapped on a larger physical graph. This is very similar to the Virtual Network Embedding problem, which we discussed and solved efficiently in our previous work [26]. Reference [26] uses graph theory to propose rapid solutions that can reach the optimum in negligible times even for large problem instances. As our approach is based on an exact mathematical formulation that always finds the optimal solution for practical problem instances, we believe that our proposal on BBU functions split will outperform the existing state-of-the-art solutions.

Bibliography

[1] http://netfpga.org/2014/.
[2] Abbasi, Z., Xia, M., Shirazipour, M., and Takács, A. An optimization case in support of next generation NFV deployment. In 7th USENIX Workshop on Hot Topics in Cloud Computing, HotCloud ’15, Santa Clara, CA, USA, July 6-7, 2015 (2015).
[3] Ahuja, R. K., Magnanti, T. L., and Orlin, J. B. Network flows: Theory, algorithms, and applications. Prentice Hall (1993).
[4] Armbrust, M. Above the Clouds: A Berkeley View of Cloud Computing.
[5] Armbrust, M., Fox, A., Griffith, R., Joseph, A. D., Katz, R. H., Konwinski, A., Lee, G., Patterson, D. A., Rabkin, A., Stoica, I., and Zaharia, M. Above the clouds: A berkeley view of cloud computing. Technical Report Identifier: EECS-2009-28 (2009).
[6] Assis, M., Bittencourt, L., and Tolosana-Calasanz, R. Cloud federation: Characterisation and conceptual model. In Utility and Cloud Computing (UCC), 2014 IEEE/ACM 7th International Conference on (Dec 2014), pp. 585–590.
[7] Ben-Ameur, W., Hadji, M., and Ouorou, A. Networks with unicyclic connected components and without short cycles. Electronic Notes in Discrete Mathematics 36 (2010), 961–968.
[8] Bermbach, D., Kurze, T., and Tai, S. Cloud federation: Effects of federated compute resources on quality of service and cost. In Cloud Engineering (IC2E), 2013 IEEE International Conference on (March 2013), pp. 31–37.
[9] Bernardetta, A., Dallal, B., Bouet, M., and Stefano, S. Virtual Network Functions Placement and Routing Optimization. In IEEE Int. Conference on Cloud Networking (http://www.ieee-cloudnet.org/, Canada, Oct. 2015).
[10] Biha, M. D., and Mahjoub, A. Steiner k-edge connected subgraph polyhedra. J. Comb. Optim. 4, 1 (2000), 131–144.
[11] Chen, G., Jagadish, H., Jiang, D., Maier, D., Beng, C., Kian-Lee, T., and Wang Chiew, T. Federation in cloud data management: Challenges and opportunities. Knowledge and Data Engineering, IEEE Transactions on 26, 7 (July 2014), 1670–1678.


[12] Cohen, R., Lewin-Eytan, L., Naor, J., and Raz, D. Near optimal placement of virtual network functions. In Computer Communications (INFOCOM), 2015 IEEE Conference on (April 2015), pp. 1346–1354.
[13] Dahl, G. An introduction to convexity, polyhedral theory and combinatorial optimization.
Dósa, G. The tight bound of first fit decreasing bin-packing algorithm is ffd(i) [...] 0}.

Definition A.2 An augmenting path, denoted by µ, is a path from s to t in $R_f$.

Definition A.3 The residual capacity of µ is denoted by $c_f(\mu) = \min\{c_f(u) \mid u \in \mu\}$.

Definition A.4 A cut in R = (V, E, c) is a partition of the vertices of V given by $(S, \bar{S})$, where $s \in S$ and $t \in \bar{S}$. The capacity of this cut is given by $c(S, \bar{S}) = \sum_{i \in S,\, j \in \bar{S}} c(i, j)$.

The following theorem establishes the relationship between a maximum flow and a minimum cut.


Mathematical Tools for Combinatorial Optimization Problems

Theorem A.1 Let f be a flow in R = (V, E, c). The three following statements are equivalent:
1. f is a maximum flow in R,
2. There is no augmenting path in $R_f$,
3. There exists a cut $(S, \bar{S})$ such that $|f| = c(S, \bar{S})$.

We note that the Ford-Fulkerson approach has an $O(|f^{*}| \cdot |E|)$ algorithmic complexity, where $|f^{*}|$ is the value of a maximum flow (for integer capacities).
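To make Theorem A.1 concrete, the following minimal Python sketch augments a flow along shortest paths of the residual network until no augmenting path remains, then reads a cut from the nodes still reachable from s. It uses the Edmonds-Karp variant (breadth-first search) of the Ford-Fulkerson approach, which is a choice made here for simplicity; the function name and the toy network are hypothetical.

```python
from collections import deque, defaultdict

def max_flow_min_cut(capacity, s, t):
    """capacity: dict {(u, v): c} of arc capacities.
    Returns (flow value, source side S of a cut, capacity of that cut)."""
    residual = defaultdict(int)
    adj = defaultdict(set)
    for (u, v), c in capacity.items():
        residual[(u, v)] += c
        adj[u].add(v)
        adj[v].add(u)  # reverse arcs exist in the residual network

    def bfs():
        # Breadth-first search over arcs with positive residual capacity.
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        return parent

    value = 0
    while True:
        parent = bfs()
        if t not in parent:  # statement 2: no augmenting path is left
            break
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[a] for a in path)  # residual capacity (Def. A.3)
        for (u, v) in path:
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
        value += bottleneck

    S = set(parent)  # nodes reachable from s in the final residual network
    cut = sum(c for (u, v), c in capacity.items() if u in S and v not in S)
    return value, S, cut

# Hypothetical network with source 's' and sink 't'.
capacity = {('s', 'a'): 3, ('s', 'b'): 2, ('a', 'b'): 1, ('a', 't'): 2, ('b', 't'): 3}
value, S, cut = max_flow_min_cut(capacity, 's', 't')
print(value, cut)  # 5 5
```

On this toy network the flow value equals the capacity of the returned cut, which is statement 3 of the theorem.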

Gomory-Hu algorithm

In the following, we give another approach to find the maximum flow between each pair of nodes in a given graph. We introduce the Gomory-Hu tree algorithm and its application to connected graphs. In-depth descriptions and details can be found in [39], for example.

Definition A.5 For a given connected graph G = (V, E) and a capacity function $c : E \to \mathbb{R}_+$, a Gomory-Hu tree for G is a tree $GH = (V_{GH}, E_{GH})$ such that for each edge $e = (s, t) \in E_{GH}$, δ(U) is a minimum capacity s − t cut of G, where U is either of the two components of GH \ e, and the capacity of δ(U) is the sum of the capacities of the edges connecting U to the rest of the graph.

A Gomory-Hu tree can be found through N − 1 applications (N is the number of vertices in G) of a minimum capacity cut algorithm. In the following, we present the Gomory-Hu algorithm for a graph G = (V, E), where V is the set of N nodes and E is the set of M edges. The Gomory-Hu tree transformation compacts the original graph structure via successive cuts and reduces the number of edges to N − 1 rather than M. This algorithm has a complexity of $O(N^2 \sqrt{M})$. A Gomory-Hu transformation is provided in Algorithm 5, and a detailed graphic example is shown in Chapter 3.
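The idea of answering all pairwise minimum cut queries with N − 1 cut computations can be sketched in Python as follows. Note that this sketch uses Gusfield's contraction-free simplification, which builds an equivalent-flow tree (the minimum cut value between two nodes is the smallest weight on their tree path), and not the Steiner-cut procedure of Algorithm 5. All function names and the toy capacity matrix are hypothetical, and the minimum cut routine is a plain Edmonds-Karp implementation chosen for brevity.

```python
from collections import deque

def min_cut(cap, s, t):
    """Edmonds-Karp on a symmetric capacity matrix.
    Returns (cut value, set of nodes on the side of s)."""
    n = len(cap)
    res = [row[:] for row in cap]
    value = 0
    while True:
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and res[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:  # t is unreachable: the flow is maximum
            return value, {v for v in range(n) if parent[v] != -1}
        b, v = float('inf'), t
        while v != s:  # bottleneck of the augmenting path
            b = min(b, res[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:  # push b units along the path
            u = parent[v]
            res[u][v] -= b
            res[v][u] += b
            v = u
        value += b

def equivalent_flow_tree(cap):
    """Gusfield's simplification: N - 1 minimum cut computations."""
    n = len(cap)
    parent, weight = [0] * n, [0] * n
    for s in range(1, n):
        t = parent[s]
        weight[s], side = min_cut(cap, s, t)
        for v in range(s + 1, n):
            if v in side and parent[v] == t:
                parent[v] = s
    return [(s, parent[s], weight[s]) for s in range(1, n)]

def tree_min_cut(tree, n, a, b):
    """Smallest edge weight on the tree path between a and b."""
    adj = {v: [] for v in range(n)}
    for u, v, w in tree:
        adj[u].append((v, w))
        adj[v].append((u, w))
    stack = [(a, -1, float('inf'))]
    while stack:
        u, prev, best = stack.pop()
        if u == b:
            return best
        stack.extend((v, u, min(best, w)) for v, w in adj[u] if v != prev)

# Hypothetical 4-cycle 0-1-2-3-0 with capacities 3, 1, 4, 2.
cap = [[0, 3, 0, 2],
       [3, 0, 1, 0],
       [0, 1, 0, 4],
       [2, 0, 4, 0]]
tree = equivalent_flow_tree(cap)
print(tree_min_cut(tree, 4, 0, 2))  # 3
```

The tree has N − 1 = 3 edges, and every pairwise query is then answered by a path lookup instead of a new flow computation.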

A.1.2

b-Matching problem

The b-Matching problem is a simple generalization of the Matching problem, which consists in identifying a set of edges M in a graph G such that every node of G is covered by at most one edge of M. More details on Matching theory can be found in [24, 39], for example. Given $b : V \to \mathbb{Z}_+$, a b-Matching is a function $x : E \to \mathbb{Z}_+$ such that $x(\delta(v)) \le b(v)$, $\forall v \in V$. A perfect b-Matching is obtained by replacing the above inequality by the equality $x(\delta(v)) = b(v)$, $\forall v \in V$. The b-Matching polytope can be found in [39], and is given as follows:

Combinatorial Optimization : Some Basic Examples


Algorithm 5 Gomory-Hu tree transformation algorithm
Input: G = (V(G), E(G))
Output: T_G = (V(T_G), E(T_G))
E(T_G) ← ∅
Q = {V(G)}
while Q ≠ ∅ do
    S ← pull(Q) //Pull the first element from Q
    {S1, S2} ← Minimum-Steiner-Cut(S, T_G)
    //Update the T_G structure
    V(T_G) ← {V(T_G) \ S} ∪ {S1, S2}
    E(T_G) ← E(T_G) ∪ (S1, S2) //w.r.t a cut size λ_{S1,S2}
    //Update the queue Q
    if |S1| > 1 then
        Q ← Q ∪ S1
    end if
    if |S2| > 1 then
        Q ← Q ∪ S2
    end if
end while

$$
\begin{aligned}
& 0 \le x(e) \le 1, && \forall e \in E \\
& x(\delta(v)) \le b(v), && \forall v \in V \\
& x(E(W)) + x(F) \le \left\lfloor \frac{b(W) + |F|}{2} \right\rfloor, && \forall W \subseteq V,\ F \subseteq \delta(W),\ (b(W) + |F|) \text{ is odd}
\end{aligned}
\tag{A.3}
$$

A.1.3

Matroids

A matroid is a combinatorial abstraction of linear dependence in vector spaces and of certain aspects of graphs. It is becoming an increasingly central object in combinatorics, especially in combinatorial optimization (see [24]). A matroid M = (E, F) (where E is a finite set and F is a set of subsets of E) satisfies:
1. ∅ ∈ F.
2. If A ∈ F and B ⊆ A, then B ∈ F.
3. If A, B ∈ F and |B| > |A|, then ∃e ∈ B \ A such that A ∪ {e} ∈ F.

The elements of F are called independent sets of the matroid. Two standard examples of matroids are the following (see [42], [49], and [58] for example):
• Graphic matroid: Let G be a graph, and E the set of edges of G. If F = {J ⊆ E : G = (V, J) is a forest}, then (E, F) is a graphic matroid.
• Transversal matroid: Let G = (V1 ∪ V2, E) be a bipartite graph, and let F = {J ⊆ V1 : there exists a Matching in G covering J}. Then (V1, F) is a transversal matroid.
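The three matroid axioms can be checked by brute force on a small graphic matroid, as in the following Python sketch. The independence test is a standard union-find forest check; the function names and the toy graph are hypothetical, and the exhaustive enumeration is only meant for very small ground sets.

```python
from itertools import combinations

def is_forest(n, edges):
    """Independence oracle of the graphic matroid: True if `edges` has no cycle."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # adding (u, v) would close a cycle
        parent[ru] = rv
    return True

def check_matroid_axioms(ground, independent):
    """Exhaustively verify axioms 1-3 for the family defined by `independent`."""
    family = [frozenset(J) for r in range(len(ground) + 1)
              for J in combinations(ground, r) if independent(J)]
    fam = set(family)
    if frozenset() not in fam:  # axiom 1
        return False
    for A in family:  # axiom 2 (removing one element at a time suffices)
        if any(A - {e} not in fam for e in A):
            return False
    for A in family:  # axiom 3 (exchange property)
        for B in family:
            if len(B) > len(A) and not any(A | {e} in fam for e in B - A):
                return False
    return True

# Hypothetical graph: a triangle 0-1-2 plus a pendant edge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
ok = check_matroid_axioms(edges, lambda J: is_forest(4, J))
print(ok)  # True
```

A family that is not a matroid fails the exchange test: for example, on the ground set {1, 2, 3}, the family {∅, {1}, {2}, {3}, {1, 2}} violates axiom 3 with A = {3} and B = {1, 2}.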

A.2

Cloud Computing and NFV

As the cloud computing paradigm becomes popular and more accessible, the concept of cloud services expands and enables cloud providers to increase their revenues while satisfying various end-user requests. Some of these services are deployed and offered by multiple providers as virtual resources at the infrastructure level (IaaS services) (see references [5], [57] and [15] for more details). To increase revenues, cloud providers require very efficient resource utilization and strict respect of quality of service. Cloud federation is an additional paradigm that can be used to increase revenue for providers through the sharing of available physical and virtual resources. Federation can make the cloud computing business more profitable by supporting multiple concurrent services [30]. Cloud computing is dominated by key actors such as Amazon, Google and Microsoft that offer cloud resources, platforms and software as a service using different pricing models. Some research challenges in the cloud consist in providing connectivity as a service by relying on heterogeneous resources offered by a network under various constraints such as QoS, fair resource sharing, and delay (see [10], [47], [35], and [40] for example). Network Functions Virtualization (NFV) enables the provisioning of on-demand networking services, including connectivity as a service, and facilitates the development of agile networking services tuned to application and tenant requirements. NFV leverages virtualization technologies (hypervisors, containers such as Docker, etc.) to realize network multi-tenancy, with the network infrastructure shared by multiple tenants. NFV implements networking functions as software to be executed on industry-standard physical nodes or commodity hardware. The software realizations or components, known as Virtualized Network Functions (VNFs), can be deployed, migrated, relocated, shut down, activated and upgraded more easily.
Typical VNFs are Deep Packet Inspection (DPI), firewalls, classifiers, load balancers, routers/switches, NATs, DNSs, etc. NFV can reduce cost and time to market when delivering network services compared with, for instance, middleboxes deployed on expensive and proprietary hardware, which induce high costs and slow down access to the market. In Mijumbi et al. [51], one can find additional insight on the history of NFV and its relationship with cloud computing. In addition, there exist working groups on NFV such as the European Telecommunications Standards Institute (ETSI) [19] and the Internet Engineering Task Force (IETF) [33]. These working groups introduced interesting NFV research challenges such as Service Function Chaining (SFC) [62] (introduced by IETF), also known as VNF-Forwarding Graph (VNF-FG) in ETSI [16]. An NFV reference architecture is also proposed by ETSI, as illustrated by Figure A.1.


Figure A.1: ETSI’s NFV architecture

The NFV architecture depicted in Figure A.1 is composed of a first layer containing a set of VNFs implemented in software to be deployed on a physical infrastructure. The second layer consists of the servers that form the NFV Infrastructure (NFVI), offering virtualized resources such as RAM, CPU cores, etc. These resources are managed and orchestrated by the NFV Management and Orchestration (MANO) component, which contains cost-efficient solutions that manage and allocate resources efficiently to end-users and tenants. For the sake of clarity, Figure A.2 provides more details on the MANO architecture as described in [18]. Thus, our optimization algorithms described in the earlier chapters can be deployed in the orchestrator (NFVO) to better manage resource allocation according to demand requirements.


Figure A.2: NFV MANO architecture