Performance evaluation of Windows virtual machines on a Linux host

Virtualization has expanded dramatically in recent years and is now ubiquitous in the modern IT industry, since it provides numerous benefits to companies and individual users. It increases the efficiency, flexibility and scalability of IT equipment by enabling different software-based environments on a single physical machine. Each virtual machine is a separate instance that is completely independent of the computer hardware and runs on emulated hardware. The emulated hardware is managed by a virtualization tool and provides fewer resources than the physical hardware. This paper presents a performance evaluation of three virtual machines running three recent versions of the Windows operating system, namely Windows 7™ Professional, Windows 8.1™ Professional and Windows 10™ Professional, on a host computer system running Linux Ubuntu. The performance measurement results show that Windows 7 is the most suitable virtual operating system, since it achieves the best performance when run on a Linux host.


Introduction
Over the years, virtualization of computers and operating systems has grown into one of the keystone technologies, and today it is ubiquitous in the modern IT industry, from huge data centres to personal computers, and is used by the majority of organizations and IT companies. Its biggest advantage is that it enables heterogeneous services to be hosted on shared physical infrastructure. By using virtualization, users can have several different software-based environments for various purposes on one computer system, which reduces expenses while boosting efficiency, flexibility, scalability and agility [1]. Virtualization enables the installation of one or more virtual computer systems, known as virtual machines, inside an existing computer system run by a host operating system. Each virtual machine is a tightly isolated software environment that is completely independent of the computer hardware and runs on emulated hardware, usually with fewer resources than the physical hardware provides [2]. Therefore, it is necessary to evaluate the performance of various virtual machines on the same host in order to select the most efficient ones, which provide the best hardware utilization and achieve the best performance. In particular, we are interested in key performance metrics related to the computer components with the biggest impact on performance: Central Processing Unit (CPU) scheduling, memory management, graphics subsystem management and disk drive management [4].
This paper, to some extent, continues our work described in [3], where we used the reverse logic in order to study how different host operating systems influence virtual machine performance. However, it remains unknown which virtual machine achieves the best performance while running on an identical host. This brings new challenges in the field of computer system performance evaluation, since there is still no standard and proven experimental method, setup, process or approach for virtual machine performance measurement and the evaluation of its results.
In this paper, we study the performance of three different virtual machines on an identical host computer system. Linux Ubuntu is used as the host operating system, and the three latest versions of the Windows operating system, namely Windows 7™ Professional, Windows 8.1™ Professional and Windows 10™ Professional, are used as virtual machine operating systems. Three different benchmark applications are used for performance measurements conducted under identical and controlled conditions for all three operating systems. The performance evaluation shows that Windows 7 still has the best performance when used as a virtual operating system on a Linux Ubuntu host, compared to the newer versions of the Windows operating system, Windows 8.1 and Windows 10. The main reason is that Windows 8.1 and Windows 10 cannot take advantage of their improved architecture and new features, since these require more hardware resources than are available through the emulated hardware of a virtual machine.
The paper is organized as follows. Section 2 presents related work, while Section 3 describes the virtualization process and the virtualization tool used, VirtualBox. The versions of the Windows and Linux operating systems used are described in Section 4, while Section 5 presents the benchmark applications. The performance measurement setup, methodology and hardware impact analysis are presented in Section 6. Section 7 presents the performance evaluation and analysis of the results. Section 8 concludes the paper.

Related work
Performance testing plays a fundamental and irreplaceable role in the field of software assessment, especially in guaranteeing the quality, performance and reliability of an operating system [5]. However, it remains a challenge in the field of virtual machines and operating systems, since it is a complex and long-term process that requires equal conditions and a controlled environment for all tested systems.
In our previous work, we studied several different aspects of operating system performance on personal computers. This paper, to some extent, continues our work described in [3], where we studied the influence of three different host operating systems, namely Windows XP, Windows Vista and Windows 7, on the performance of a virtual machine running Windows Vista. Performance measurement was conducted with five different benchmark applications and by performing two resource-demanding operations: video encoding and data compression. Based on the performance evaluation results, it was concluded that using Windows 7 as the host operating system provides the best performance for the virtual operating system. In [4], we continued our work in the area of operating system performance evaluation. We performed a performance evaluation of three different versions of the Windows operating system, namely Windows XP, Windows Vista and Windows 7, in two different environments (low-end and high-end computer systems). The evaluation was conducted with a set of benchmark applications in five different areas: CPU scheduling, memory, graphics subsystem, disk drive management and network performance. The performance measurement results showed better performance of Windows XP in the majority of tests on the low-end computer system when compared to Windows Vista and Windows 7. On the high-end computer system, the newer operating systems showed improved performance in the areas of memory management and graphics display, but the other areas showed equal or lower performance than that achieved by Windows XP. Furthermore, a performance measurement process and a performance evaluation model for Windows operating systems were developed, and a similar approach is used in this work.
Performance evaluation of virtual machines is one of the less addressed topics in operating system performance research. Consequently, only a few virtual machine performance studies can be found in the literature. In [6], a study of the performance of the most typical virtualization techniques under typical networked denial of service (DoS) attacks is presented. The authors showed that, even under a light DoS attack, all virtualization techniques suffer greater performance degradation than the same services running on non-virtualized servers. Paravirtualization and Hardware Virtual Machine are the most affected due to their inherent virtualization structure, while container-based virtualization is less exposed to performance degradation. In [7], a performance evaluation of a Windows XP virtual machine operating system focused on Input/Output (I/O) read performance was conducted. The results show that several factors, such as virtual machine cache configurations, access modes and request sizes, affect I/O throughput. In order to increase the read performance, a unified virtual machine cache that can support more than one virtual machine synchronously was developed.

Virtualization
Virtualization [8] can be considered as a framework or methodology for dividing the resources of computer hardware into multiple execution environments by applying one or more concepts or technologies such as hardware and software partitioning, time-sharing, partial or complete machine simulation, emulation and others. Virtualization uses software to simulate real hardware and create a virtual computer system. Virtualization can apply to applications, servers, storage and networks [9].
The main benefits [10] of virtualization are that it can increase IT agility, scalability and flexibility while significantly reducing costs. Workloads are deployed faster, performance increases and operations become automated, resulting in IT that is cheaper to own and operate and much simpler to manage. Further benefits include:
• Reducing capital and operating costs,
• Minimizing downtime,
• Increasing productivity, efficiency and agility,
• Faster server provisioning and deployment,
• Enabling business continuity and disaster recovery.
A virtual computer system [8] is known as a virtual machine: an isolated software container with an operating system and applications inside. Each virtual machine is completely independent. Multiple virtual machines can be placed on a single computer system, enabling several operating systems and applications to run on just one host. A thin layer of software between the virtual machine and the host, called a hypervisor or virtual machine manager, allows multiple operating systems to share a single hardware host. The task of the virtual machine manager is to handle resource and memory allocation for the virtual machines, ensuring that they cannot disrupt each other, and to provide interfaces for higher-level administration and monitoring tools [11]. System virtualization [12] has been widely used for a variety of applications:
• Consolidation of physical servers,
• Isolation of guest operating systems,
• Intrusion and fault tolerance,
• System migration,
• Entire system backup,
• Creating a personal cloud system,
• Software debugging and testing.
The virtualization layer, or platform, maps requests from a virtual machine to physical requests and supports virtual environments with software approaches. A virtual environment can be provided by several different methods and at several different levels of abstraction [3]. Those levels are the Instruction Set Architecture (ISA), the Hardware Abstraction Layer (HAL), the operating system and the user level. ISA-level virtualization emulates another architecture by translating from one ISA to another, sometimes by rewriting instructions. Virtualization at the HAL level exploits the similarity in architecture between the virtual and host machines and uses the native hardware to execute certain instructions without emulation. The HAL-level virtualization used in the experiments is shown in Figure 1.
Operating system level virtualization, also called container-based virtualization, is a method for deploying and running distributed applications without launching an entire virtual machine for each application. Multiple isolated systems (containers) run on a single control host and access a single kernel. User level virtualization separates a user from a desktop (operating system and applications) and allows a user session to traverse multiple desktops, operating system versions and application delivery methods. It runs as a virtual instance on top of the underlying desktop components, separated from the desktop assets. There are several different levels of isolation with different resource requirements, isolation strength, performance overhead, scalability and flexibility [13]. Virtual machines have better isolation and separation from the host machine when the virtualization layer is closer to the hardware [14]. However, more resources are then required and flexibility is lower.

VirtualBox
VirtualBox [15] is a powerful cross-platform virtualization application, developed initially by Innotek GmbH and currently owned by Oracle. VirtualBox runs on existing Intel or AMD-based computer systems, whether they are running Windows, Linux, Macintosh or Solaris hosts. It also supports a large number of guest operating systems, including Windows (NT 4.0, 2000, XP, Server 2003, Vista, Windows 7, Windows 8, Windows 10), DOS/Windows 3.x, Linux (2.4, 2.6, 3.x and 4.x), Solaris and OpenSolaris, OS/2, and OpenBSD. VirtualBox is being actively developed with frequent releases and has a long list of features, supported guest operating systems and platforms it runs on. Some of the main VirtualBox features are:
• Portability: VirtualBox runs on a large number of 32-bit and 64-bit host operating systems,
• No hardware virtualization required: VirtualBox does not require processor features built into newer hardware, such as Intel VT-x or AMD-V, so it can be used even on older hardware,
• Guest additions: software packages that can be installed inside supported guest systems to improve their performance and to provide additional integration and communication with the host system,
• Great hardware support: guest multiprocessing, USB device support, full ACPI support, multiscreen resolutions, built-in iSCSI support and PXE network boot,
• Virtual machine groups: a feature that enables users to organize and control virtual machines collectively, as well as individually,
• Remote machine display: the VirtualBox Remote Desktop Extension allows high-performance remote access to any running virtual machine.

Operating systems
One of the most popular Linux distributions, Ubuntu 16.04.2 LTS (Xenial Xerus), was used as the host operating system [16]. It is based on the Debian architecture and is open source, with both community and professional support. Ubuntu is suitable for both desktop and server use. The current Ubuntu release supports the Intel x86 (IBM-compatible PC), AMD64 (x86-64), ARMv7, ARMv8 (ARM64), IBM POWER8, IBM zSeries (zEC12/zEC13) and PowerPC architectures. It also supports the virtual file system feature [17], an object-oriented form of file system implementation that gives the user identical access to all files, regardless of the file system to which they belong. Ubuntu includes a wide range of software, covering every standard desktop application: word processing and spreadsheet applications, browsers, web server software, email software, programming languages and tools, and even games. Many additional software packages are available from the built-in Ubuntu Software Center. Ubuntu operates under the GNU General Public License and all software installed on Ubuntu is free software. Emphasis has also been put on security, so developers are trying to make Ubuntu secure out-of-the-box. To achieve that, user programs run with low privileges, most network ports are closed to prevent hacking, and disk drive encryption is also available.
Windows operating systems are the most widely used desktop operating systems, and in this paper the three latest versions of the Windows operating system, namely Windows 7 Professional, Windows 8.1 Professional and Windows 10 Professional, are used as virtual machine operating systems. Windows 7 was built upon the Windows Vista core architecture. The main focus during Windows 7 development was on responsiveness to the user, and the main development goal was to improve performance in key user scenarios. This was achieved by improving existing kernel features such as ReadyBoost, ReadyBoot, the memory and desktop window managers, simultaneous multithreading and the timer management API, by removing the kernel dispatcher lock and the memory manager physical frame number global lock, and by adding new features such as user-mode scheduling, a unified background process manager, DirectX 11, core parking, etc. [18].
Windows 8.1 [19] brings further improvements in operating system optimization and performance by adding a number of new features and upgrading the existing ones. It also uses a modular component design, so that each component of the operating system is defined as a separate and independent unit or module that supports hardware independence. Furthermore, it includes an extensive support architecture with built-in self-diagnostics and troubleshooting. The most important improvements include shorter startup and shutdown times through a hybrid shutdown technology, an improved Unified Extensible Firmware Interface (UEFI) with better security and processor protection, USB 3.0 and DirectX 11.2 support, a new Windows Imaging (WIM) format that dramatically reduces the size of image files, and new Hyper-V machine virtualization technology that makes it possible to run more than one 32-bit or 64-bit operating system at the same time on the same computer system [20]. Error detection for devices and failure detection for disk drives are also automated, as is the detection of performance issues, which include slow application startup, slow boot, slow standby/resume, slow shutdown, memory leaks and failing memory. The new Windows mini-kernel, MinWin, reduces the Windows core to its absolute minimum by reducing all dependencies.
Windows 10 [21] brings numerous improvements in deployment, servicing, management and networking, but the biggest emphasis was put on security. Furthermore, a new development logic introduces a new way to build, deploy and service Windows with the "Windows as a service" approach, which delivers small feature updates twice a year instead of significant feature revisions every few years. Some of the deployment improvements include a self-service deployment for Windows 10 devices called Windows AutoPilot, Windows 10 Automatic Redeployment and a new Hyper-V virtual machine gallery with automatic checkpoints. Servicing is improved by a new settings user experience for delivery optimization, new policies in Windows Update for Business, etc. Management includes new kiosk configuration and management features (multi-app scenarios, simplified lockdown configurations), a co-management ability to manage Windows 10 devices using Configuration Manager and Intune at the same time, etc. Always On virtual private network is introduced as a new networking feature that keeps remote computers and devices always connected to the organization's network. The most important security features include Windows Defender advanced threat protection, Windows Defender application and exploit guards, Windows information protection, improved warning prompts for end users, PIN protection in BitLocker, removal of server message block version 1 from clean installs, etc. Several general improvements include the ability to run Windows 10 on the ARM64 architecture, access to all files without using up device storage through OneDrive files on-demand, improved battery life for laptops, performance tuning for storage subsystems, etc. [22].

Benchmark applications
Benchmark applications can be used for measuring performance of a complete computer system or just of a specific component. The following components have the greatest impact on the performance of the entire system: CPU, memory, graphics subsystem and disk drive. In this paper, three different benchmark applications were used to test performance of different virtual Windows operating systems on a Linux host with the same hardware in every experiment. Used benchmark applications are described below.

FinalWire AIDA64 Extreme v5.90.4200
AIDA64 [23] is a system information, diagnostic and benchmarking tool for a wide range of home and industry users. It provides a large number of methods to measure system performance. These benchmarks are synthetic, so their results show only the theoretical performance of the system. In this paper, the following tests were used:
• Memory tests: memory bandwidth benchmarks (read, write, copy) and memory latency benchmarks,
• CPU tests: mathematical operations, photo processing, compression and encryption,
• Disk drive tests: measuring the performance of storage devices.

Futuremark PCMark 8 v2.2.282
PCMark [24] is a complete benchmark tool for Windows operating systems. It is widely used for performance testing of various types of computers (PC, laptop, tablet) at home and in the office. It includes numerous benchmark tests, all of which can be run individually, making it possible to choose the benchmark that best describes the purpose of the tested computer system. Each test produces a score, which can be used to evaluate computer system performance. In this paper, the following suites were used:
• Creative benchmark: measures the system's ability to perform a series of entertainment and media tasks (web browsing, photo editing, video editing, media to go, mainstream gaming, video group chat, etc.),
• Storage benchmark: measures the performance of Solid State Disk (SSD) drives, Hard Disk Drives (HDDs) and hybrid drives using traces recorded from a selection of popular applications and video games (Adobe tools, Office tools, World of Warcraft, Battlefield, etc.).
Each performed test consists of several tasks that need to be completed. Some of these tasks are: load document, copy, save, apply filter, transcode, resize and others. The final result $x_{final}$ is calculated as the geometric mean of the measurement results $x_i$ of all individual tasks, as shown in Equation (1), where $n$ is the number of individual tasks.

$$x_{final} = \left( \prod_{i=1}^{n} x_i \right)^{1/n} \quad (1)$$
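The geometric mean used for the final test result can be illustrated with a minimal Python sketch. The function name and the sample values are hypothetical and are not taken from the benchmark tool; the sketch assumes that all task results are positive numbers.

```python
import math


def geometric_mean(task_results):
    """Return the geometric mean of the positive task results x_i."""
    if not task_results:
        raise ValueError("at least one task result is required")
    if any(x <= 0 for x in task_results):
        raise ValueError("task results must be positive")
    # Average the logarithms instead of multiplying the raw values,
    # which avoids overflow when there are many large results.
    log_sum = sum(math.log(x) for x in task_results)
    return math.exp(log_sum / len(task_results))


# Hypothetical task results, for illustration only.
example_tasks = [2.0, 8.0]
example_score = geometric_mean(example_tasks)
```

Compared with an arithmetic mean, the geometric mean keeps a single task with a very large result from dominating the combined score.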

PassMark PerformanceTest v8.0.1037
PassMark PerformanceTest [25] consists of 32 standard benchmark tests that are available in five test suites, which enable detailed performance measurement and benchmarking of a computer system. In this paper, the following test suites were used:
• CPU tests: mathematical operations, compression, encryption and physics,
• Memory tests: memory access speeds and latency,
• 2D graphics tests: graphical user interface elements, vectors, bitmaps, text and fonts,
• Disk drive tests: reading, writing and seeking within disk drive files, and input/output operations per second.
Every test suite consists of several different performance measurement tests that provide separate measurement results $x_i$, which are combined into one overall result $R_n$ for each test suite, as shown in Equation (2), where $c_i$ represents the constant of each test suite and $w_i$ represents the weighting factor of each individual test.

Performance measurement setup, methodology and hardware impact analysis

Performance measurement setup
The performance measurement process is conducted on a high-end desktop computer system with the software and hardware configuration shown in Table 1. As the host operating system, the latest version of Linux Ubuntu 16.04.2 for desktop computers and laptops with long-term support was used. The virtual operating systems and the emulated hardware characteristics of the virtual machines used in our experiments are shown in Table 2. For the virtual operating systems, similar editions of the three latest versions of 64-bit Windows were used: Windows 7 Professional, Windows 8.1 Professional and Windows 10 Professional.
There are several main editions of every Windows operating system, and the Professional editions were selected for each tested operating system since they are the most equivalent among the different versions and add features oriented towards business environments and power users. All operating systems were installed with default settings, and immediately after installation the latest available updates were installed through Windows Update. This is crucial in a performance evaluation, since updates include enhancements that could improve the performance, stability and security of an operating system. The other crucial element of the performance measurement setup is using the newest device drivers for each hardware component since, like Windows updates, they mainly include enhancements that enable better performance.
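As an illustration of how identical emulated hardware could be assigned to all three guests, the following Python sketch builds `VBoxManage` command lines without executing them. The resource values and virtual machine names are hypothetical placeholders, since the values from Table 2 are not reproduced here; the sketch assumes the `createvm` and `modifyvm` subcommands and the 64-bit Windows guest type identifiers of the VirtualBox 5.x command-line tool.

```python
import shlex

# Hypothetical emulated hardware settings (NOT the values from Table 2).
VM_SETTINGS = {"memory_mb": 4096, "cpus": 2, "vram_mb": 128}

# Hypothetical VM names mapped to VirtualBox guest OS type identifiers.
GUESTS = {
    "Win7Pro": "Windows7_64",
    "Win81Pro": "Windows81_64",
    "Win10Pro": "Windows10_64",
}


def vboxmanage_commands(name, ostype, settings):
    """Build (but do not run) the commands that create and size one VM."""
    create = ["VBoxManage", "createvm", "--name", name,
              "--ostype", ostype, "--register"]
    modify = ["VBoxManage", "modifyvm", name,
              "--memory", str(settings["memory_mb"]),
              "--cpus", str(settings["cpus"]),
              "--vram", str(settings["vram_mb"])]
    return [create, modify]


def all_commands():
    """Return the commands for every guest, using one shared settings dict."""
    commands = []
    for name, ostype in GUESTS.items():
        commands.extend(vboxmanage_commands(name, ostype, VM_SETTINGS))
    return commands


if __name__ == "__main__":
    # Print the command lines so they can be reviewed before being run.
    for command in all_commands():
        print(shlex.join(command))
```

Keeping a single settings dictionary for all guests makes it harder to give one virtual machine different resources by accident, which would otherwise bias the comparison.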

Performance measurement methodology
To ensure an accurate, reliable and repeatable performance measurement process and results, to avoid errors, and to ensure equal experimental conditions for all operating systems during the measurements, the performance measurement setup and procedure shown in Algorithm 1 was followed. The first three steps were conducted only once, since they concern host operating system management. The following two steps were repeated for every tested virtual operating system, and the last four steps were repeated for every benchmark application. To ensure the accuracy of the results, every measurement was repeated five times under the same working conditions, and the arithmetic mean of the five repetitions, as shown in Equation 3, was calculated as the final result for every metric, where $x_j$ is the result of the $j$-th repetition. Furthermore, the final performance measurement results are reported rounded to two decimal places, since the third decimal digit represents a deviation that is too small and falls within the measurement error.

$$\bar{x} = \frac{1}{5} \sum_{j=1}^{5} x_j \quad (3)$$
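The averaging and rounding step described above can be sketched in Python as follows. The function name and the sample readings are hypothetical; the sketch assumes exactly five repetitions per metric, as stated in the text.

```python
from statistics import fmean

REPETITIONS = 5  # every measurement was repeated five times


def final_result(repetitions):
    """Arithmetic mean of the repetitions, rounded to two decimal places."""
    if len(repetitions) != REPETITIONS:
        raise ValueError(f"expected {REPETITIONS} repetitions, "
                         f"got {len(repetitions)}")
    return round(fmean(repetitions), 2)


# Hypothetical readings of one metric, for illustration only.
example_readings = [10.0, 10.5, 9.5, 10.25, 9.75]
example_final = final_result(example_readings)
```

Rejecting an incomplete set of repetitions, rather than averaging whatever is available, keeps every reported value based on the same number of runs.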
Performance measurement results are compared by using the Windows 7 results as a reference value and calculating the percentage difference of the Windows 8.1 and Windows 10 values with respect to the Windows 7 values, as shown in Equation 4, where $x_{OS}$ is the result of the compared operating system and $x_{W7}$ is the corresponding Windows 7 result.

$$\Delta_{OS} = \frac{x_{OS} - x_{W7}}{x_{W7}} \cdot 100\% \quad (4)$$

The performance evaluation was based on the comparison of the virtual machine performance measurement results in all tests of each benchmark application. The results are represented by means of the following general metrics, which cover the four major parts of an operating system with the greatest impact on performance (memory management, CPU scheduling, graphics subsystem and disk drive management):
• Number of points obtained in benchmark applications,
• Number of performed operations,
• Speed of specific tests,
• Time required to complete complex operations.
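The comparison against the Windows 7 reference value can be sketched as a small Python helper. The exact form of the formula is an assumption based on the description of Equation 4 in the text (difference relative to the reference, expressed in percent), and the sample values are hypothetical.

```python
def percent_difference(value, reference):
    """Percentage difference of `value` with respect to `reference`."""
    if reference == 0:
        raise ValueError("reference value must be non-zero")
    return (value - reference) / reference * 100.0


# Hypothetical benchmark scores, for illustration only.
windows7_score = 200.0   # reference value
windows10_score = 190.0  # compared value
example_diff = percent_difference(windows10_score, windows7_score)
```

For metrics where a higher value is better (points, operations, speed), a negative result means the compared system is slower than Windows 7; for time and latency metrics, where lower is better, the interpretation of the sign is reversed.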
On the host operating system besides the virtual machine and on the virtual operating systems besides the benchmark applications, no other applications were installed and no additional data files were placed on the SSD drive. Furthermore, during experiments there was no user activity on the system and the network was disconnected.

Hardware impact on performance measurement results
Every new version of an operating system is more complex than the previous ones and adds numerous new features and capabilities. This trend is also evident in computer hardware development. However, although the complexity is growing, improved performance is expected with every new version of an operating system and hardware. A virtual operating system runs inside a virtual machine on allocated hardware resources, which could influence the overall system performance. In order to eliminate possible issues affecting the performance measurement results, the hardware impact on system performance is analysed and briefly described below. All recommendations from the literature on how to avoid the negative impact of hardware on operating system performance were considered, and the computer system used in this paper was prepared accordingly. The number of CPU cores dramatically influences system performance, but at most the number of available physical cores (real cores, not hyper-threads) should be used. Since advanced memory management is present in all virtual Windows operating systems, the allocated memory size should be as large as possible. However, the amount of memory requested by the virtual operating system must be available as free memory on the host operating system when attempting to start the virtual machine, and it will not be available to the host while the virtual machine is running. The amount of allocated video memory $N_{bytes}$ determines which resolutions and colour depths $color_{dpt}$ (in bits) are available, as shown in Equation 5, where $N_{vpix}$ represents the number of vertical pixels, $N_{hpix}$ represents the number of horizontal pixels and $N_{scr}$ represents the number of screens. Furthermore, extra memory might be required for any activated display acceleration setting.

$$N_{bytes} = \frac{color_{dpt}}{8} \cdot N_{vpix} \cdot N_{hpix} \cdot N_{scr} \quad (5)$$

Since the display resolution, as well as the number of allocated virtual and physical monitors, influences system performance, identical settings must be used in all tested virtual operating systems. A virtual machine uses disk drive resources by connecting a virtual disk drive to a virtual storage controller. Only one partition should be created in all experiments, since the number of volumes on the disk drive and the location of system files can influence performance measurement results [15]. Furthermore, device drivers can have a huge impact on the overall system performance, but there is no research study in the literature that defines a model or setup for the usage or impact of device drivers on system performance. This is also due to the fact that a large number of different hardware manufacturers constantly publish newer versions of device drivers. However, the newest device drivers should be used, since they are expected to provide the best performance [4].
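The relation between video memory, resolution, colour depth and number of screens can be illustrated with a short Python sketch. The function name and the display configuration are hypothetical; the sketch assumes the colour depth is given in bits per pixel and ignores any extra memory needed for display acceleration.

```python
def min_video_memory_bytes(color_depth_bits, vertical_pixels,
                           horizontal_pixels, screens=1):
    """Minimum video memory in bytes for the given display configuration."""
    if min(color_depth_bits, vertical_pixels, horizontal_pixels, screens) <= 0:
        raise ValueError("all arguments must be positive")
    total_bits = color_depth_bits * vertical_pixels * horizontal_pixels * screens
    # Round up to whole bytes in case the colour depth is not a multiple of 8.
    return (total_bits + 7) // 8


# Hypothetical configuration: one 1920x1080 screen at 32-bit colour depth.
example_bytes = min_video_memory_bytes(32, 1080, 1920, 1)
example_mib = example_bytes / (1024 * 1024)
```

In this hypothetical configuration a single Full HD screen at 32-bit colour depth needs 8,294,400 bytes, or roughly 7.9 MiB, and the requirement scales linearly with the number of screens.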

Experimental results and analysis
In this paper, performance measurements of computer system components were conducted with benchmark applications installed on a virtual operating system, with the main emphasis on the following components: CPU, memory, graphics subsystem and disk drive. To determine which operating system provides the best performance on a virtual machine, the performance measurement results of the different virtual operating systems on the Ubuntu host are compared, and the performance metrics used represent real-world performance.
Performance measurement results are shown in Figure 2 and in Tables 3-5. The AIDA64 memory tests show the best performance for Windows 10, which is 1.09% to 6.98% better than Windows 7 and 0.93% to 7.6% better than Windows 8.1, as shown in Figure 2(a). Only the latency test shows better performance for Windows 8.1, by about 3%, while Windows 7 and Windows 10 show similar performance, as shown in Table 3. In contrast to the AIDA64 results, PassMark PerformanceTest shows the best memory performance for Windows 7. For example, the memory write test shows a 2.79% better result than Windows 8.1 and a 7.65% better result than Windows 10, and the memory read uncached test shows 5.87% better performance than Windows 8.1 and 9.24% better than Windows 10. The PassMark memory tests are shown in Figure 2(h). An interesting fact is that Windows 10 consumes the largest amount of memory during the experiment, 12% more than Windows 8.1 and 20.65% more than Windows 7, as shown in Table 5.
It can be concluded that Windows 10 uses an advanced memory management system, which stores frequently used applications and files in memory for faster access and thus leaves less memory available for other, less frequently used applications and files. When comparing CPU performance with AIDA64, all operating systems show similar performance, within 1% deviation, which falls within the measurement error, as shown in Figure 2(b) and in Table 3; see also Figure 2(f) and Table 5. The biggest difference appears in the Prime number test, in which Windows 8.1 is 1.46% better than Windows 7 and 7.78% better than Windows 10. When measuring SSD drive performance, the AIDA64 tests show similar results for all operating systems, all within 2%, as shown in Figure 2(c). The PCMark disk drive tests again show similar results among all operating systems, as shown in Figure 2(e) and in Table 4. The PassMark tests also show similar results for all operating systems, with Windows 8.1 as the winner and Windows 7 as the runner-up, as shown in Table 5. From these disk drive tests, it can be concluded that all tested operating systems use similar disk drive management and that there are no significant differences in their performance on a virtual machine.
Owing to the graphics limitations of virtual machines and their inability to gain full access to GPU resources, only 2D performance was tested. Windows 7 shows 1.52% to 15.48% better results than Windows 8.1 and 2.93% to 33.72% better results than Windows 10, as shown in Figure 2(g). The only test in which Windows 8.1 is better is Image rendering, with 5.42% better performance, as shown in Table 5. Therefore, it can be concluded that the improvements in the graphics subsystems of Windows 8.1 and Windows 10 do not yield better performance when working with low-end GPUs, such as the one emulated on the virtual machine, since they are oriented towards more demanding 3D environments. Using PCMark, some typical situations that involve working with media and entertainment content were tested. These tests partially use CPU, GPU and disk drive resources. The results show the best performance for Windows 10 in almost all tests, as shown in Figure 2(d) and in Table 4, since a big emphasis during Windows 10 development was put on efficient multimedia management. Only the Video Editing and Music to go tests show better performance for Windows 7 and Windows 8.1.

Conclusion
Virtualization technology radically transforms traditional computing, since it enables multiple virtual software environments to run on a single physical hardware system, resulting in increased efficiency, flexibility and scalability of those systems. In this paper, the performance of three virtual machines running the three recent versions of the Windows operating system, namely Windows 7 Professional, Windows 8.1 Professional and Windows 10 Professional, on a Linux host was measured, compared and evaluated experimentally. In the memory tests, AIDA64 shows the best results for Windows 10, in contrast to PassMark, which shows the best results for Windows 7. The CPU tests show very similar performance among all tested operating systems, within 1%, which falls within the measurement error. The GPU performance measurement results show the best performance for Windows 7 in almost all tests. When comparing disk drive performance, all tests show similar performance for all operating systems, which again falls within the measurement error. The obtained experimental results lead to the conclusion that Windows 7 should be used as a virtual operating system on a Linux Ubuntu host, since it shows the best performance and the two other, later versions of the Windows operating system require more hardware resources, which are not available through the emulated hardware of a virtual machine.