This repository was archived by the owner on Jan 28, 2023. It is now read-only.

Added support for Linux hosts #108

Merged 5 commits on Nov 8, 2018


AlexAltea
Contributor

@AlexAltea AlexAltea commented Oct 16, 2018

Motivation

Although Linux support for HAXM was previously dismissed (#4) because KVM already exists, there are still reasons for having Linux support in this project:

  • Single cross-platform hypervisor API: Although QEMU supports most hypervisors, smaller cross-platform projects need to deal with multiple backends: WHPX on Windows, KVM on Linux, HVF on macOS. HAXM can finally fill that gap and cover the 3 major platforms.
  • Extra features: Although KVM is a more mature hypervisor, HAXM is not a subset of it. As of today, there are certain guests that only run on HAXM [1], due to the recent emulator changes that added support for x86 instructions missing in KVM/Xen.
  • Kernel module: KVM is deeply integrated into Linux. Patching it requires recompiling and installing the kernel. There's kvm-kmod, but it's high-maintenance: Linux/KVM patches easily break kvm-kmod, and kvm-kmod patches easily break older kernels. In contrast, HAXM can be built and installed in a few seconds.
  • Permissive licensing: There's no QEMU-compatible hypervisor that supports Linux and offers a permissive license (e.g. BSD, MIT).

Additionally, we could take the chance to refactor the codebase a bit.

Changes

  • Normalized prefixes: Some platform-specific functions such as smp_mb or __fls had no hax_/asm_ prefix, so they clashed with functions declared in kernel headers.
  • Added Linux-specific code: Headers at include/linux/*.h, sources at linux/*.c, alongside Makefile+Kbuild files.
  • Added Linux to TravisCI build matrix: It should make maintenance easier.
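
To illustrate the prefix normalization, here is a minimal user-space sketch. The name hax_sketch_fls is hypothetical and is not taken from the HAXM sources. Linux kernel headers already declare their own __fls, so a driver that defines an unprefixed helper with that name would not compile against them, while a prefixed helper avoids the collision:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical hax_-prefixed helper (name is an assumption, not HAXM code).
 * Returns the zero-based index of the most significant set bit. */
static inline uint32_t hax_sketch_fls(uint32_t value)
{
    uint32_t index = 0;

    assert(value != 0);  /* the result is undefined for 0 */
    while (value > 1) {
        value >>= 1;
        index++;
    }
    return index;
}
```

The body is a plain loop so that the sketch stays portable. A real driver would more likely map the prefixed name onto a compiler builtin or an inline-assembly instruction.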

Pending

Essential tasks:

  • HAXM builds/links under Linux.
  • Finishing implementation of Linux-specific functions.
  • Testing changes.
  • Adding build instructions.

Optional tasks that imply major changes to the codebase (needs discussion!):

  • Single ioctl definitions: Define ioctls only once in hax_interface.h via HAX_IOCTL(name, code, size). Then implement the HAX_IOCTL macro in hax_interface_{darwin,windows,linux}.h to make the actual platform-specific ioctl definitions. This will reduce code duplication.
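
One possible shape for this, as a rough sketch only: the shared header lists each ioctl once, and a platform header supplies HAX_IOCTL to turn the list into concrete request codes. Everything prefixed with SKETCH below is hypothetical, and the bit layout in HAX_SKETCH_ENCODE is a made-up stand-in for what the platform's own encoding macro (for example _IOWR on Linux or CTL_CODE on Windows) would produce:

```c
#include <stdint.h>

/* Shared part (stand-in for hax_interface.h): each ioctl is listed once
 * as HAX_IOCTL(name, code, size). Names, codes and sizes are invented. */
#define HAX_SKETCH_IOCTL_LIST             \
    HAX_IOCTL(SKETCH_VERSION, 0x20, 8)    \
    HAX_IOCTL(SKETCH_CREATE_VM, 0x21, 4)

/* Platform part (stand-in for hax_interface_linux.h).
 * Made-up encoding: size in bits 16 and up, magic 'H' in bits 8..15,
 * code in bits 0..7. */
#define HAX_SKETCH_ENCODE(code, size) \
    (((uint32_t)(size) << 16) | ((uint32_t)'H' << 8) | (uint32_t)(code))

/* Expand the shared list into platform-specific constants. */
#define HAX_IOCTL(name, code, size) \
    HAX_IOCTL_##name = HAX_SKETCH_ENCODE(code, size),
enum hax_sketch_ioctl {
    HAX_SKETCH_IOCTL_LIST
};
#undef HAX_IOCTL
```

With this layout, adding an ioctl means touching only the shared list. Each platform header keeps a single macro definition instead of its own copy of every request code.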

@AlexAltea AlexAltea force-pushed the linux branch 4 times, most recently from af67435 to b6693c6, October 16, 2018 09:05
@raphaelning
Contributor

Thanks, Alexandro. Great to see you back with another big PR!

Windows and Mac have been and will still be the Intel HAXM team's priority, and realistically, we wouldn't add a third host OS to our support matrix. However, if the open source community wants this feature and can provide enough help, there's no reason for us to stand in the way, not to mention that this is a great chance to improve the overall code quality of the project ;-)

We'll accept this PR on these conditions:

  1. It doesn't cause any regression on Windows and Mac hosts.
  2. You will develop the QEMU-side patch for testing this PR, and do your own testing on Linux. (We'll try to run some simple tests on a Linux host before merging it, but that won't be enough coverage.)
  3. You will be the maintainer of all the Linux-specific code that this PR adds to the source tree. In other words, in case someone else submits a PR for such code in the future, we will invite you to the review and will accept or reject the change based on your feedback.

What do you think? I have some other tasks in hand at the moment, but I'll try to respond to the major changes you have proposed in the next few days. The minor changes all look fine to me :-)

@AlexAltea
Contributor Author

AlexAltea commented Oct 16, 2018

Sounds good to me! I will provide support for the Linux frontend. Additionally, the planned changes will ensure easier maintenance in the future (e.g. by reducing code duplication in platform-specific code).

I will comment on this PR to discuss any further major changes that might affect generic or Windows/macOS-specific code. Once the patch is ready, I will run tests on the three platforms (of course, this implies taking care of QEMU-side patches myself) and let you know once everything is ready.

Also on the topic of testing: After completing this PR, it might be a good moment to set up something like Travis CI? This should also reduce maintenance effort.

@raphaelning
Contributor

Sounds good to me! I will provide support for the Linux frontend. Additionally, the planned changes will ensure easier maintenance in the future (e.g. by reducing code duplication in platform-specific code).

Thanks! Indeed.

Regarding the first proposed major change:

Combine platform-specific folders. Moving {darwin,windows,linux}/ to X/{darwin,windows,linux}/, to avoid polluting the root folder. Folder "X" could be called drivers, platforms, etc.

Agreed. Right now we have three kinds of files in {darwin,windows}/:

a) Makefiles (e.g. windows/IntelHaxm.vcxproj).
b) C/C++ source files that implement for the host OS the required interface specified by a shared header (e.g. both windows/hax_host_mem.c and darwin/.../hax_host_mem.cpp implement include/hax_host_mem.h).
c) Other C/C++ source files specific to the host OS, mainly to implement the driver interface specified by the host OS (e.g. windows/hax_entry.c).

Since drivers only covers (c), I'd prefer platforms.

Two more questions related to this topic:

  1. Currently, shared headers (as described by (b) above) exist in both core/include/ and include/, and it's not very clear to me how to choose between them when we create a new shared header. Is there a better scheme?
  2. Have you thought about using CMake to generate the Linux makefiles, with an eye to eventually doing so for Windows and Mac as well using a shared CMake configuration? This could be a crazy idea, because last time I checked, CMake didn't have mature support for WDK projects.

I'll comment on the other proposed major changes later.

@raphaelning
Contributor

Combine documentation: Moving API.md to X/api.md, and add platform-specific build instructions in X/building.md or X/building-{darwin,windows,linux}.md to avoid polluting README.md. Folder "X" could be called docs, etc.

Agreed. docs is a good name.

Single ioctl definitions: Define ioctls only once in hax_interface.h via HAX_IOCTL(name, code, size). Then implement the HAX_IOCTL macro in hax_interface_{darwin,windows,linux}.h to make the actual platform-specific ioctl definitions. This will reduce code duplication.

Good idea! Agreed.

Normalized integer types: The codebase mixes intN_t and intN. It would be convenient to pick one variant and use it consistently.

Agreed. I see that you have started doing this and have chosen [u]intN_t over [u]intN, which is the harder choice (because the latter types are more heavily used throughout the codebase), but also the right one, I'd say (because the former types are more familiar, as they resemble the standard types in <stdint.h>).

The question is what's the best strategy to implement such a ubiquitous change, so as to avoid rebase headaches. Here's what I think:

  1. Don't mix it with other changes.
  2. Create separate PR(s) for it, which can be merged before this one.
  3. Split it into multiple commits/PRs, each one covering as few as one source file (e.g. vcpu.c) or folder, which can be reviewed and merged quickly.

Please let me know if this is reasonable or if you have better ideas.

@raphaelning
Contributor

Also on the topic of testing: After completing this PR, it might be a good moment to set up something like Travis CI? This should also reduce maintenance effort.

Good idea! I just started reading the Travis CI documentation, and it seems to work best with Linux Makefile projects. Since there's a risk of a common (core) patch causing a regression on Linux hosts, maybe we can set up Travis CI for Linux first, which will at least tell us right away if a patch breaks the build on Linux. I'm not sure if it can help us run QEMU-based tests, though. Neither do I know if it's possible to use it for Windows and Mac builds.

@AlexAltea
Contributor Author

Have you thought about using CMake to generate the Linux makefiles, with an eye to eventually do so for Windows and Mac as well using a shared CMake configuration?

I haven't considered it yet, but that would be much more convenient. For now, we could keep each platform-specific build script (Makefile, Xcode files, VS files) in its corresponding platforms/* folder, and merge everything under a single root-level CMakeLists in a future PR.

The question is what's the best strategy to implement such a ubiquitous change, so as to avoid rebase headaches. Here's what I think:

Agreed, I'll submit all refactoring-related changes as a separate PR.

I'm not sure if it can help us run QEMU-based tests, though.

We can execute arbitrary scripts on Travis (build scripts run in sudo-enabled virtual machines), so it should be possible as long as their hardware supports it and they virtualize EPT to allow nested virtualization.

Neither do I know if it's possible to use it for Windows and Mac builds.

Travis can handle Linux, macOS and Windows:
https://docs.travis-ci.com/user/reference/overview/#virtualization-environments

@AlexAltea
Contributor Author

AlexAltea commented Oct 18, 2018

Currently, shared headers (as described by (b) above) exist in both core/include/ and include/, and it's not very clear to me how to choose between them when we create a new shared header. Is there a better scheme?

Include folders give me quite some headaches. There are two issues:

  1. The current approach to including headers is unusual: we spell out the "include" folders directly in the paths, e.g. #include "../include/*.h", which most C/C++ codebases rarely do. Instead, those include directories should be added to the Makefile/projects so that we can simply write #include <*.h>.

  2. It's not clear what each include folder should contain. I see 5 types of headers:

    • Headers of core consumed by platforms (e.g.: core/include/hax_core_interface.h):
      Right now they are in core/include, which might be the appropriate place.
    • Headers of core consumed by core (e.g.: core/include/vmx.h).
      Right now they are in core/include. Maybe they should be next to their sources: that is, moving them to core, e.g. core/vmx.h. There's no reason for having them separate from sources, since there are no external consumers. This is already the case for platforms headers consumed by platforms (see below).
    • Headers for platforms consumed by core (e.g.: include/hax_host_mem.h).
      Right now they are in include... Does that make sense?
    • Headers for platforms consumed by platforms (e.g.: windows/hax_win.h).
      Right now they are each next to their sources.
    • Headers consumed by QEMU (e.g.: include/hax_interface.h).
      Right now they are in include, which seems alright since they are the only definitions that HAXM needs to export for 3rd party software.

Anyway, this comment is just brainstorming on my side. I still need to give this problem more thought.

@AlexAltea AlexAltea force-pushed the linux branch 6 times, most recently from 389d262 to 1c550f9, October 19, 2018 01:15
@AlexAltea AlexAltea force-pushed the linux branch 2 times, most recently from 584c8fb to 32d1d2e, October 19, 2018 03:28
@AlexAltea
Contributor Author

Combine platform-specific folders. Moving {darwin,windows,linux}/ to X/{darwin,windows,linux}/

@raphaelning Regarding this point, I've noticed there's a dirs file in the root-directory pointing to core and windows, each containing a sources file. This file seems to generate a sources.props file. Do you know exactly how this happens (which tool is responsible for that)? Are these files necessary at all?

@raphaelning
Contributor

Ah, great question. Yes, I do know what they are.

When I first took over the project, it didn't even have a Visual Studio project (.sln or .vcxproj). Instead, it was using the WDK 7.1 build system to build the Windows driver. That build system was pretty straightforward/basic: you just need to provide very simple makefiles like dirs and sources, and then run build.exe.

Later we decided to migrate to WDK 10 and Visual Studio 2015. I did that in two steps:

  1. Convert the dirs-based project to a Visual Studio solution, using a tool available in Visual Studio 2012. I think it was the Nmake2MsBuild CLI tool, or its GUI equivalent. The tool is not available in later releases of Visual Studio.
  2. Open the VS2012 solution in VS2015 and proceed with the format upgrade.

Step 1 ended up generating all the VS files we see now. Since they are autogenerated, they are much more complicated than the original dirs and sources files (if you remember, we used to have a lot of build configurations, Win{7, 8, 8.1} {Debug, Release}, which were created by the conversion tool). For example, sources.props was generated from sources. The tool also created the awkwardly-named dirs-Package project, and made it the default project of HaxmDriver.sln.

I believe we can safely delete the old makefiles (dirs, sources, Makefile.inc, etc.). But then, you may still consider HaxmDriver.sln and the dirs-Package folder as polluting the root of the source tree. If we somehow make CMake work for Windows, we'll be able to get rid of all the VS files.

@raphaelning
Contributor

Include folders give me quite some headaches. There are two issues:

I completely agree. Thanks for the really nice summary.

Headers of core consumed by core (e.g.: core/include/vmx.h).
Right now they are in core/include. Maybe they should be next to their sources: that is, moving them to core, e.g. core/vmx.h. There's no reason for having them separate from sources, since there are no external consumers. This is already the case for platforms headers consumed by platforms (see below).

I think vmx.h is somewhat similar to ia32_defs.h. They define constants and data structures according to Intel SDM, and those definitions are not unique to HAXM. In some sense, they are similar to hax_types.h. Therefore, although they seem to be consumed by core at the moment, they can be consumed by platform sources as well (or at least we should allow that). Shall we move them to a separate folder under include/?

Headers for platforms consumed by core (e.g.: include/hax_host_mem.h).
Right now they are in include... Does that make sense?

No. And I should take the blame. I created that header but wasn't sure where to put it, and ended up making the wrong decision.

@AlexAltea
Contributor Author

AlexAltea commented Nov 7, 2018

@raphaelning Regarding the 32-bit Linux host issues I found: Now with proper logging I see:

[   73.853623] haxm_warning: -------- HAXM v7.3.2 Start --------
[   98.103785] haxm_warning: hax_alloc_pages: HAX_MEM_LOW_4G is ignored
[   98.559302] haxm_error: VM entry failed: RIP=0000fff0
[   98.559311] haxm_error: VMfailValid. Prev exit: 0. Error code: 7 (VMX_ERROR_ENTRY_CTRL_FIELDS_INVALID)

The Intel SDM Vol. 3C states in Section 30.4, "VM Instruction Error Numbers", that this code means:

VM entry with invalid control field(s) (see notes b and c)

  • b. VM-entry checks on control fields and host-state fields may be performed in any order. Thus, an indication by error number of one cause does not imply that there are not also other errors. Different processors may give different error numbers for the same VMCS.
  • c. Error number 7 is not used for VM entries that return from SMM that fail due to invalid VM-execution control fields in the executive VMCS. Error number 25 is used for these cases.

Any clues on how to debug which control field(s) might be invalid?


EDIT#1: I've added VMCS dumps for a 32-bit Windows host and a 32-bit Linux host, before and after the following snippet in cpu_vmx_run at cpu.c (both using the latest revision of the linux branch):

    result = asm_vmxrun(vcpu->state, vcpu->launched);

    vcpu->is_running = 0;
    vcpu_save_guest_state(vcpu);
    vcpu_load_host_state(vcpu);

These are the files: Does anything look suspicious?


EDIT#2: After diffing the files, aside from many expected differences in HOST_* fields, I've noticed a mismatch in VMX_EXIT_CONTROLS:

-haxm_warning: 400c VMX_EXIT_CONTROLS: 36dff # windows
+haxm_warning: 400c VMX_EXIT_CONTROLS: 236fff # linux

@maronz

maronz commented Nov 7, 2018

@AlexAltea I'm afraid I owe you an apology. Before my last comment (about the race condition in the install script) I did not read the script carefully enough. [ I guess it being late in the evening at my end did not help things. ]

After having done a quick test of a0b270b, and still (!!) noticing the permission/ownership issue, I've had another look at the script, and have now come to the conclusion that we've got a "cart before the horse" situation. I now suggest rearranging the critical steps into the following order:

  1. create the group (if necessary),
  2. create the 'udev' rule, and
  3. then load the kernel module

With this sequence of events I'm now confident that any attempt at "fixing" the permission/ownership afterwards won't be required. The current order of processing is 3, 1, 2 (i.e. load the module first, then set up the group and the 'udev' rule), and that seems to explain (in hindsight) what has been observed up to now.

I'm attaching a proposal of a re-worked 'haxm-install.sh' script for your perusal.
haxm-install.sh.gz

@AlexAltea
Contributor Author

@maronz Thanks a lot for tracking down the issue! I've added your patched file and amended the commit.

@raphaelning
Contributor

After diffing the files, aside from many expected differences in HOST_* fields I've noticed a mismatch in VMX_EXIT_CONTROLS

Thanks for doing all the hard work to track this issue down to this simple diff! Apparently, the Linux driver incorrectly sets bit 9 (Host address-space size) and bit 21 (Load IA32_EFER), both of which should be 0 for a 32-bit host OS (especially when the host kernel is 32-bit). With this information, it's very easy to identify the code that sets these bits:

haxm/core/vcpu.c

Lines 1360 to 1367 in 24fdff5

    #ifdef HAX_ARCH_X86_32
        if (is_compatible()) {
            exit_ctls = EXIT_CONTROL_HOST_ADDR_SPACE_SIZE | EXIT_CONTROL_LOAD_EFER |
                        EXIT_CONTROL_SAVE_DEBUG_CONTROLS;
        } else {
            exit_ctls = EXIT_CONTROL_SAVE_DEBUG_CONTROLS;
        }
    #endif

As I've pointed out in the above comment, the solution is to make sure is_compatible() evaluates to 0 on Linux hosts.
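
The two bit positions can be double-checked by XOR-ing the VMX_EXIT_CONTROLS values from the Windows and Linux dumps. A minimal sketch, in which the SKETCH_ names are hypothetical and only the bit positions (9 and 21) and the two dumped values come from this thread:

```c
#include <stdint.h>

/* Bit positions as named in the comment above; macro names are invented. */
#define SKETCH_EXIT_HOST_ADDR_SPACE_SIZE (1u << 9)
#define SKETCH_EXIT_LOAD_EFER            (1u << 21)

/* Returns a mask of the bits that differ between two VMCS field values. */
static inline uint32_t sketch_changed_bits(uint32_t first, uint32_t second)
{
    return first ^ second;
}
```

Calling sketch_changed_bits(0x36dff, 0x236fff) yields 0x200200, which is exactly bit 9 plus bit 21.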

@AlexAltea
Contributor Author

AlexAltea commented Nov 8, 2018

@raphaelning Thank you! At first, I was sceptical about is_compatible() being the culprit, so I never bothered changing it and testing. But that did indeed fix the problem. :-)

Now that 64-bit and 32-bit Linux hosts are on equal footing, I'm going to fix the lingering VM/VCPU issues. After some checking, I figured out that I hadn't implemented the open and release (i.e. close) file operations for them. I've already done that locally, which should fix the issue, but during hax_clear_vcpumem(cv->tunnel_vcpumem); a kernel panic occurs, caused by vm_munmap.

I'll investigate why this happens and report back.

@raphaelning
Contributor

The new commits look good to me, but the lingering device nodes issue is somehow still not completely fixed:

(Before the test, I rebooted my computer and confirmed that /dev/hax* didn't exist.)

  1. The first QEMU VM creates /dev/hax_vm/vm00 and /dev/hax_vm00/, but does not delete them in the end.
  2. The second VM somehow manages to "reuse" vm00 instead of creating vm01. In the end, it doesn't delete vm00 either.
  3. The third and fourth VMs are run in parallel. One of them reuses vm00, and the other creates vm01. In the end, we have both vm00 and vm01 nodes lingering around.

BTW, we have tested this PR (in its current state) on Windows and Mac without running into any issue, so we're ready to merge it. But if you want to add any Linux-specific patches, we can wait.

@AlexAltea
Contributor Author

@raphaelning Great! None of the remaining issues require changing core or Windows/Mac-specific code, so the tests won't be needed again. Once everything is ready from my side, I'll rewrite the commit history as:

  • Normalized prefixes of functions/types
  • Added support for Linux hosts
  • Added Linux to TravisCI build matrix
  • Manual for building/testing HAXM on Linux

the lingering device nodes issue is somehow still not completely fixed

This seems to be caused by the inode modification, which also caused the devices to be shown in red with unknown type ? (permissions ?rw-rw----). Reverting ceed27e fixes both issues, but makes sudo necessary again. I'll investigate and fix this.

@raphaelning
Contributor

@AlexAltea Sounds good, thanks! My previous comment coincided with 8f59564, but I was testing without this commit. It looks like an important fix though.

@raphaelning
Contributor

I can confirm that edcd0ef has eradicated lingering VM/vCPU device nodes, and those device nodes now show up with the correct file type (c) in ls -l output. Awesome!

BTW, in the final patch set, I'd like to see d673697 ("Ensuring all VM/VCPU ioctl exit paths decrement the refcounter") remain a separate commit.
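
The title of that commit suggests a pattern in which a handler takes a reference on entry and drops it on every way out. A minimal user-space sketch of that pattern, assuming a single cleanup label; sketch_vcpu, sketch_get, sketch_put, sketch_vcpu_ioctl and SKETCH_CMD_RUN are hypothetical names and are not taken from the HAXM sources:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_CMD_RUN 1u  /* hypothetical command code */

/* Hypothetical stand-in for a VM/vCPU object with a reference counter. */
struct sketch_vcpu {
    int refcount;
};

static void sketch_get(struct sketch_vcpu *vcpu)
{
    vcpu->refcount++;
}

static void sketch_put(struct sketch_vcpu *vcpu)
{
    assert(vcpu->refcount > 0);  /* a put without a matching get is a bug */
    vcpu->refcount--;
}

/* Every path, success or error, leaves through "out", so the reference
 * taken on entry is always released. */
static int sketch_vcpu_ioctl(struct sketch_vcpu *vcpu, unsigned int cmd,
                             const void *arg)
{
    int ret = 0;

    sketch_get(vcpu);
    if (arg == NULL) {
        ret = -EFAULT;
        goto out;
    }
    if (cmd != SKETCH_CMD_RUN) {
        ret = -ENOSYS;
        goto out;
    }
    /* ... the real work for SKETCH_CMD_RUN would go here ... */
out:
    sketch_put(vcpu);
    return ret;
}
```

An early return that skipped sketch_put would leave the counter above zero, which is one way an object (and its device node) could outlive the VM that created it.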

@maronz

maronz commented Nov 8, 2018

@AlexAltea Congratulations!! The last commit (edcd0ef) appears to have really done the trick.

I've been (silently) following the progress over the course of the day, and even on a 32-bit host the '/dev/hax_vm*/vcpu*' entries are appearing, as a new VM gets started, and disappearing (!!), as a VM run comes to an end.

@AlexAltea
Contributor Author

AlexAltea commented Nov 8, 2018

@raphaelning Sure thing! I'll insert that one as a separate commit right after "Added support for Linux hosts". If you want, I can submit an identical patch for MacOS in another PR.

The only thing pending is fixing the vm_munmap issue mentioned at #108 (comment). However, since the call is commented out, it doesn't cause any problems for users: the only drawback is that we cannot unmap the "HAXM tunnel" page from userland. But at that point our main consumer (QEMU) is already finished, so exiting the process will free that page.

With this in mind, I'd say we could proceed to merge this (after rewriting the commit history). Sounds good? Of course, I'll keep providing support for the Linux platform: fixing vm_munmap as well as maintaining the rest of the code will have top priority for me now.

Some platform-specific functions such as `smp_mb` or `__fls` had no hax_/asm_-prefix, clashing against functions declared in kernel headers.

Signed-off-by: Alexandro Sanchez Bach <[email protected]>
@AlexAltea
Contributor Author

All done!

@raphaelning
Contributor

If you want, I can submit an identical patch for MacOS in another PR.

That would be great, thanks!

There's a small problem with the latest PR, though:

$ ll platforms/linux/*.sh
-rw-rw-r-- 1 myuser   myuser   441 Nov  8 16:11 platforms/linux/haxm-install.sh
-rw-rw-r-- 1 myuser   myuser   375 Nov  8 16:11 platforms/linux/haxm-uninstall.sh

As you can see, these files are not executable, so sudo make install gives me this error:

./haxm-install.sh
make: execvp: ./haxm-install.sh: Permission denied
Makefile:35: recipe for target 'install' failed
make: *** [install] Error 127

Could you fix this in a99ba16 and force push?

@AlexAltea
Contributor Author

@raphaelning Fixed!

@raphaelning raphaelning merged commit 2d7fa8b into intel:master Nov 8, 2018
@AlexAltea AlexAltea deleted the linux branch November 8, 2018 09:43
@krytarowski
Contributor

Great work!
