-
Notifications
You must be signed in to change notification settings - Fork 527
[Windows Build] Implement MMAP for mmap_data_loader.cpp #5164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There is no sys/mman.h or posix-compatible mmap() implementation on Windows. The extension data loaders use it to map in data files, so adding an implementation. Test-run, & .\cmake-out\extension\data_loader\test\Debug\extension_data_loader_test.exe --gtest_brief=1 --gtest_filter=MmapDataLoader* Running main() from ...\executorch\third-party\googletest\googletest\src\gtest_main.cc [==========] 8 tests from 1 test suite ran. (50 ms total) [ PASSED ] 8 tests.
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/5164
Note: Links to docs will display an error until the docs builds have been completed. This comment was automatically generated by Dr. CI and updates every 15 minutes. |
Hi @jsidewhite! Thank you for your pull request and welcome to our community. Action RequiredIn order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you. ProcessIn order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with If you have received this in error or have any questions, please contact us at [email protected]. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm so sorry for the delay! I totally missed this PR. This looks like a good approach.
|
||
#include <errno.h> | ||
#include <io.h> | ||
#include <windows.h> | ||
|
||
#include "mman_windows.h" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Please use angle brackets and the full path to the header
- .cpp files should include their .h file first to demonstrate that the .h files don't need earlier includes to work cleanly
#include <errno.h> | |
#include <io.h> | |
#include <windows.h> | |
#include "mman_windows.h" | |
#include <executorch/extension/data_loader/mman_windows.h> | |
#include <errno.h> | |
#include <io.h> | |
#include <windows.h> | |
@@ -6,12 +6,11 @@ | |||
* LICENSE file in the root directory of this source tree. | |||
*/ | |||
|
|||
#include <executorch/extension/data_loader/mman.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move this include down to the group on line 17. Tests should always include their header-under-test first to demonstrate that the header can be included cleanly without needing other includes.
#ifdef __cplusplus | ||
extern "C" { | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ExecuTorch is always C++, so no need for this or the matching one below
#ifndef _SYS_MMAN_H_ | ||
#define _SYS_MMAN_H_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use #pragma once
#include <sys/mman.h> | ||
#include <unistd.h> | ||
|
||
ET_INLINE long get_os_page_size(){return sysconf(_SC_PAGESIZE)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if this will compile without a semicolon. And the formatting is off
ET_INLINE long get_os_page_size(){return sysconf(_SC_PAGESIZE)} | |
ET_INLINE long get_os_page_size() { | |
return sysconf(_SC_PAGESIZE); | |
} |
#include <sys/mman.h> | ||
#include <unistd.h> | ||
|
||
ET_INLINE long get_os_page_size(){return sysconf(_SC_PAGESIZE)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that we're wrapping this, I'd like to return size_t
instead of long
to be more consistent with the types that ET uses.
@@ -6,17 +6,16 @@ | |||
* LICENSE file in the root directory of this source tree. | |||
*/ | |||
|
|||
#include <executorch/extension/data_loader/mman.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move this include down to the group on line 21 so that this .cpp file includes its .h file first.
size_t map_size = range.size; | ||
#ifdef _WIN32 | ||
// On Windows, don't mmap-in memory past end of on-disk file. | ||
// | ||
// The Windows implementation of mmap uses CreateFileMapping which returns | ||
// error STATUS_SECTION_TOO_BIG (0xc0000040) if we try to map past the end | ||
// of the last page of a file mapped in as read-only. | ||
if (range.start + range.size > file_size_) { | ||
map_size = file_size_ - range.start; | ||
} | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable to always do this, if it works on unix systems; we're trying to minimize the number of ifdefs in common code.
size_t map_size = range.size; | |
#ifdef _WIN32 | |
// On Windows, don't mmap-in memory past end of on-disk file. | |
// | |
// The Windows implementation of mmap uses CreateFileMapping which returns | |
// error STATUS_SECTION_TOO_BIG (0xc0000040) if we try to map past the end | |
// of the last page of a file mapped in as read-only. | |
if (range.start + range.size > file_size_) { | |
map_size = file_size_ - range.start; | |
} | |
#endif | |
size_t map_size = range.size; | |
if (range.start + map_size > file_size_) { | |
// Clamp to the end of the file. | |
// | |
// The Windows implementation of mmap uses CreateFileMapping which returns | |
// error STATUS_SECTION_TOO_BIG (0xc0000040) if we try to map past the end | |
// of the last page of a file mapped in as read-only. | |
map_size = file_size_ - range.start; | |
} |
srcs = [ | ||
"mmap_data_loader.cpp", | ||
"mman_windows.cpp" | ||
] if host_info().os.is_windows else [ | ||
"mmap_data_loader.cpp" | ||
], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like this would avoid repeating the common part.
srcs = [ | |
"mmap_data_loader.cpp", | |
"mman_windows.cpp" | |
] if host_info().os.is_windows else [ | |
"mmap_data_loader.cpp" | |
], | |
srcs = ["mmap_data_loader.cpp"] + | |
["mman_windows.cpp"] if host_info().os.is_windows else [], |
], | ||
headers = [ | ||
"mman.h", | ||
"mman_windows.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please also make this header conditional on is_windows
@jsidewhite can I offer help on adding changes here? |
There is no sys/mman.h or posix-compatible mmap() implementation on Windows. The extension data loaders use it to map in data files, so adding an implementation.
Test-run,
& .\cmake-out\extension\data_loader\test\Debug\extension_data_loader_test.exe --gtest_brief=1 --gtest_filter=MmapDataLoader*
Running main() from ...\executorch\third-party\googletest\googletest\src\gtest_main.cc
[==========] 8 tests from 1 test suite ran. (50 ms total)
[ PASSED ] 8 tests.
For issue #4661