Daniel Colascione
2012-12-11 00:58:33 UTC
Emacs "make bootstrap" runs Emacs as a compiler, generating .elc files from .el
files. The build system runs Emacs once for each .el file we compile, of which
there are thousands. Now, Emacs takes about two seconds to start on my system,
so compiling thousands of files takes a while; the actual .el to .elc
compilation is nearly instantaneous.
According to xperf, Emacs spends most of its startup time re-reading emacs.exe
code pages from disk.
~/edev/trunk.nox/src
$ time ./emacs --batch -Q --eval '(kill-emacs)'
real 0m2.236s
user 0m0.015s
sys 0m0.015s
~/edev/trunk.nox/src
$ time ./emacs --batch -Q --eval '(kill-emacs)'
real 0m2.343s
user 0m0.062s
sys 0m0.016s
We shouldn't need to read this file more than once. After the first run, the
system should be able to read emacs.exe from the standby list, not the disk.
Now, if we run emacs.exe from cmd, not bash, that's exactly what happens:
C:\Users\dancol\edev\trunk.nox\src
type bench-emacs.cmd
@echo off
echo %TIME%
.\emacs --batch -Q --eval "(kill-emacs)"
echo %TIME%
C:\Users\dancol\edev\trunk.nox\src
.\bench-emacs
16:39:46.31
16:39:48.73
C:\Users\dancol\edev\trunk.nox\src
.\bench-emacs
16:39:50.91
16:39:50.96
C:\Users\dancol\edev\trunk.nox\src
.\bench-emacs
16:39:51.32
16:39:51.37
I came up with a simple test case that reproduces in cmd the behavior I see when
I run Emacs from bash. I've reproduced the program below. Here, I've compiled
a.exe with -DSLOW:
C:\Users\dancol\edev\trunk.nox\src
type .\bench-emacs2.cmd
@echo off
%TMP%\a.exe emacs.exe
echo %TIME%
.\emacs --batch -Q --eval "(kill-emacs)"
echo %TIME%
C:\Users\dancol\edev\trunk.nox\src
.\bench-emacs2
Success
16:41:55.12
16:41:57.24
C:\Users\dancol\edev\trunk.nox\src
.\bench-emacs2
Success
16:41:57.62
16:41:59.69
C:\Users\dancol\edev\trunk.nox\src
.\bench-emacs2
Success
16:42:00.05
16:42:02.20
Here's the program that generates a.exe:
#define UNICODE 1
#define _UNICODE 1
#include <windows.h>
#include <stdio.h>

int
main(int argc, char* argv[])
{
  HANDLE file;
  HANDLE section;
  LARGE_INTEGER size;
  BYTE Buffer[64*1024];
  DWORD BytesRead = 0;

  if (argc < 2) {
    fprintf(stderr, "usage: %s FILE\n", argv[0]);
    return 1;
  }

  /* Open the file for reading, sharing it with every other user. */
  file = CreateFileA(argv[1],
                     SYNCHRONIZE | GENERIC_READ,
                     FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                     NULL,
                     OPEN_EXISTING,
                     FILE_ATTRIBUTE_NORMAL,
                     NULL);
  if (file == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "CreateFile: 0x%lx\n", GetLastError());
    return 1;
  }

  if (!GetFileSizeEx(file, &size)) {
    fprintf(stderr, "GetFileSizeEx: 0x%lx\n", GetLastError());
    return 1;
  }

  /* Look at no more than the first 64k of the file. */
  if (size.QuadPart > 64*1024) {
    size.QuadPart = 64*1024;
  }

#if defined FAST
  /* Ordinary buffered read of the start of the file. */
  if (!ReadFile(file, Buffer, sizeof (Buffer), &BytesRead, NULL)) {
    fprintf(stderr, "ReadFile: 0x%lx\n", GetLastError());
    return 1;
  }
  printf("Read %lu bytes\n", BytesRead);
#elif defined SLOW
  /* Create a section object for the file, but never map a view of it. */
  section = CreateFileMapping(file, NULL, PAGE_READONLY,
                              size.HighPart, size.LowPart, NULL);
  if (!section) {
    fprintf(stderr, "CreateFileMapping: 0x%lx\n", GetLastError());
    return 1;
  }
  CloseHandle(section);
#else
#error Define FAST or SLOW
#endif

  CloseHandle(file);
  printf("Success\n");
  return 0;
}
As you can see, a.exe merely creates a section object for emacs.exe; it doesn't
even map it into memory. Still, after running a.exe on emacs.exe, the system
reloads all of emacs.exe's code pages the next time we run emacs.exe.
If we build a.exe with -DFAST instead of -DSLOW, then a.exe grabs the first 64k
of emacs.exe using ordinary, buffered ReadFile instead of trying to create a
section object. When compiled this way, a.exe seems to have no effect on Emacs
startup time:
C:\Users\dancol\edev\trunk.nox\src
.\bench-emacs
16:48:38.25
16:48:40.54
C:\Users\dancol\edev\trunk.nox\src
.\bench-emacs
16:48:42.03
16:48:42.08
C:\Users\dancol\edev\trunk.nox\src
.\bench-emacs
16:48:42.38
16:48:42.43
a.exe with -DSLOW mimics what av::fixup does when trying to determine whether an
executable is a Cygwin program. If av::fixup used ordinary ReadFile instead of
memory-mapped IO, program start performance would improve drastically, at least
for my workload.
I'm running Windows Server 2008 R2. I'm not running any AV products, disk scanners, or other
exotic pieces of software. CYGWIN=detect_bloda reports nothing.
$ uname -a
CYGWIN_NT-6.1-WOW64 xyzzy 1.7.17(0.262/5/3) 2012-10-19 14:39 i686 Cygwin