To meet the demands of the Exascale era and facilitate Big Data analytics in the cloud while maintaining flexibility, cloud providers will have to offer efficient virtualized High Performance Computing clusters in a pay-as-you-go model. As a consequence, high performance network interconnect solutions, like InfiniBand (IB), will be beneficial. Currently, the only way to provide IB connectivity on Virtual Machines (VMs) is by utilizing direct device assignment. At the same time to be scalable, Single-Root I/O Virtualization (SR-IOV) is used. However, the current SR-IOV model employed by IB adapters is a Shared Port implementation with limited flexibility, as it does not allow transparent virtualization and live-migration of VMs. In this paper, we explore an alternative SR-IOV model for IB, the virtual switch (vSwitch), and propose and analyze two vSwitch implementations with different scalability characteristics. Furthermore, as network reconfiguration time is critical to make live-migration a practical option, we accompany our proposed architecture with a scalable and topology agnostic dynamic reconfiguration method, implemented and tested using OpenSM. Our results show that we are able to significantly reduce the reconfiguration time as route recalculations are no longer needed, and in large IB subnets, for certain scenarios, the number of reconfiguration subnet management packets (SMPs) sent is reduced from several hundred thousand down to a single one.
Permanent URL (for citation purposes)