This paper on a malloc() replacement that DOES COMPACTION even for C/C++ is making the rounds: https://arxiv.org/pdf/1902.04738.pdf
Scarily beautiful.
@zwol @fluffy I'd love to see a survey of how programs use signal handlers 🧐 It's hard for me to swallow these together:
* sigsegv as "game over, program is buggy"
* sigsegv as "perfectly legitimate way to catch writes to a page, who's your VMM now"
And of course I'm biased, but using this allocator on a memory safe language seems like a wonderful opportunity.
@federicomena @zwol chained handlers are a thing and segfaults are already a natural part of how mmap() et al work. Not to mention virtual memory in general. There’s definitely some care you need to take in chaining sigsegv due to the prevalence of crash reporters and such though. I don’t see this as something you’d want to LD_PRELOAD willy-nilly
@fluffy @zwol yeah, the paper mentions that they do their wait-until-remapped thing in the sigsegv handler only if the pointer in question is known to the Mesh allocator. I haven't read the code to see if it chains to other handlers if it isn't, but
a) that sounds like the right thing to do, anyway;
b) I have absolutely no clue how they deal with the order of initialization of the handlers :)
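For anyone following along, the "handle the fault only if the address is ours, otherwise chain" idea can be sketched roughly like this. This is not Mesh's code: every name here (tracked_page, on_segv, catch_write_demo) is made up for illustration, it tracks a single read-only page instead of an allocator's heap, and it assumes Linux, where a write to a read-only mapping raises SIGSEGV and mprotect() is a plain syscall (POSIX does not list it as async-signal-safe).

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical stand-ins for "memory the allocator knows about". */
static char *tracked_page;
static size_t tracked_len;
static struct sigaction previous_action;      /* whoever was installed before us */
static volatile sig_atomic_t faults_handled;  /* how many faults we claimed */

static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)tracked_page;

    /* Ours: unprotect the page and return so the faulting store is retried. */
    if (tracked_page != NULL && addr >= base && addr < base + tracked_len &&
        mprotect(tracked_page, tracked_len, PROT_READ | PROT_WRITE) == 0) {
        faults_handled++;
        return;
    }

    /* Not ours: hand the fault to the previously installed handler, if any. */
    if (previous_action.sa_flags & SA_SIGINFO) {
        previous_action.sa_sigaction(sig, info, ctx);
    } else if (previous_action.sa_handler != SIG_DFL &&
               previous_action.sa_handler != SIG_IGN) {
        previous_action.sa_handler(sig);
    } else {
        /* No earlier handler: restore the default action and return, so the
           retried instruction faults again and the process dies as usual. */
        signal(sig, SIG_DFL);
    }
}

/* Maps one read-only page, writes to it, and returns the byte read back. */
int catch_write_demo(void)
{
    tracked_len = (size_t)sysconf(_SC_PAGESIZE);
    void *mem = mmap(NULL, tracked_len, PROT_READ,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    assert(mem != MAP_FAILED);
    tracked_page = mem;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGSEGV, &sa, &previous_action);
    assert(rc == 0);

    volatile char *p = tracked_page;
    p[0] = 42;                 /* faults, the handler unprotects, the store retries */
    int seen = p[0];

    sigaction(SIGSEGV, &previous_action, NULL);  /* put the old handler back */
    munmap(tracked_page, tracked_len);
    tracked_page = NULL;
    return seen;
}
```

Note that the sketch dodges question b) entirely: it only knows about handlers installed before it, and anything that calls sigaction() afterwards silently replaces it unless that code chains too.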
@federicomena @zwol yeah and I assume the allocator itself tries to encourage offsets to be non-overlapping in the first place. But that’s just preloading the fragmentation instead...
@federicomena @zwol I guess my big question about this is whether it’s actually beneficial in real-world usage - intuitively I feel like the chance of two pages having non-overlapping allocation offsets isn’t going to be much higher than the chance of a page having no allocations at all, and the OS VMM can already defragment physical allocations in the latter case.
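To make "non-overlapping allocation offsets" concrete, here is a toy model of the check, assuming a page is described by a bitmap with one bit per object slot. The 64-slot size and the names (slot_bitmap, can_mesh, mesh) are invented for this sketch and are not taken from the paper's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One bit per object slot in a page; 64 slots is an arbitrary choice here. */
typedef uint64_t slot_bitmap;

/* A page with no live objects at all. */
bool is_empty(slot_bitmap page)
{
    return page == 0;
}

/* Two pages can share one physical page if no slot is live in both. */
bool can_mesh(slot_bitmap a, slot_bitmap b)
{
    return (a & b) == 0;
}

/* Occupancy of the single physical page left after meshing a and b. */
slot_bitmap mesh(slot_bitmap a, slot_bitmap b)
{
    assert(can_mesh(a, b));
    return a | b;
}
```

An empty page is the special case where one bitmap is zero, so it passes can_mesh() against anything. How often two non-empty pages pass in a real heap is exactly the open question above, and this sketch says nothing about it.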
@federicomena @zwol ah ok, I only skimmed beyond the overview :)
@federicomena @fluffy I was thinking about this with my “occasional contributor to glibc” hat on, and the issue there is that signal handlers are process globals, which means libraries mustn’t touch them. Also, come to think of it, nothing stops you from using sigprocmask to block delivery of SIGSEGV and (hurriedly writes test program) this causes both Linux and BSD kernels to kill the process instead of invoking a handler.
@federicomena @fluffy Chained handlers are a thing, yes, but not one I would consider reliable enough to use in the guts of malloc. That’s me though.
@zwol @federicomena yeah that’s absolutely fair and I would expect anyone who’s using this allocator to specifically know they’re doing it and only use LD_PRELOAD as a reliable means of overriding the libc one. Because overriding a default allocator in C++ at the language level is a gigantic pain in the butt.
@fluffy @federicomena I’m pretty seriously thinking about writing a variant that stops the world during copies and doing some benchmark bake-offs. It might not be slower, and stopping the world is, oddly enough, something the C library *can* safely do
@federicomena @fluffy as another thing https://sourceware.org/git/?p=glibc.git;a=blob;f=nptl/allocatestack.c;hb=HEAD#l1120 can do. (yes, it's horrible.)
@zwol thanks, now I can't unsee that loop. But it *is* pretty clever 😮
@federicomena @fluffy yikes, that might kill any chance of putting it into a general purpose C library: signal handlers in general can only be set by the application