If you need to allocate lots of memory, you should be using mmap(). If you need it faulted in, you use MAP_POPULATE. Try to use MAP_HUGETLB to economize on TLB cache entries; you may need to set "vm.nr_hugepages=16384" (or something) in /etc/sysctl.conf (or someplace) to reserve them.
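A minimal sketch of what that looks like on Linux (the flags are Linux-specific and the size is arbitrary); if no hugepages have been reserved, the MAP_HUGETLB attempt fails and you fall back to ordinary pages:

```cpp
#include <sys/mman.h>
#include <cstddef>
#include <cstdio>

int main() {
    const std::size_t len = std::size_t(1) << 30;  // 1 GiB, chosen arbitrarily
    // Ask for huge pages, pre-faulted; fall back to normal pages if no
    // hugepages have been reserved via vm.nr_hugepages.
    void *p = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE | MAP_HUGETLB,
                   -1, 0);
    if (p == MAP_FAILED)
        p = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);
    if (p == MAP_FAILED) { std::perror("mmap"); return 1; }
    // ... use the memory ...
    munmap(p, len);
}
```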
If you are allocating lots of memory and don't know about hugepages, you badly need to learn.
If your program has threads, unmapping memory is generally to be avoided.
> If your program has threads, unmapping memory is generally to be avoided.
Could you expand on this? I recently encountered a situation where allocating and freeing many large allocations using mmap() seemed to eventually cause problems with thread creation, but I had assumed that it was probably because the virtual address space had become too fragmented, which of course would not solely be a result of unmapping. Or maybe that fragmentation is what you're referring to, and I'm just reading too much into that sentence.
mmap is an operating system interface, or at the very least a part of POSIX. It isn't part of the C++ standard, and so you cannot rely on it being present on systems that don't claim extensive POSIX compatibility. As for MAP_POPULATE and MAP_HUGETLB, try using those on a POSIX-ish OS that isn't Linux and see how far you get.
Every OS nowadays has mmap, by some spelling, and they all support hugepages, likewise. A program that doesn't use its host OS doesn't, generally, do much.
Perhaps you've not come across the idea of cross-platform development?
I have a large fairly successful cross-platform program that does a hell of a lot. All of our interactions with the OS are wrapped, either by libstdc++ or by the portability library that we use.
We would never allow a direct call to mmap() from the main application code ... we don't even allow direct calls to most of the POSIX API since we have to run on Windows too (without WSL or whatever MS' current POSIX layer is).
"By some spelling" is precisely why almost all programmers are better off using "new X[N]" than mmap, unless you actually think that code like this is good:
Actually Arduino code does tend to use dynamic allocation for things like strings - it's a lot easier than the alternative, they tend to have quite a lot of memory these days (e.g. 1 MB) and hobby projects don't need to be super-reliable usually.
One megabyte or ten isn't enough to bother with mapping, and on Arduino you don't typically have memory mapping hardware anyway. "Big" means gigabytes these days, sometimes hundreds of them.
> If you actually want to measure the memory allocation in C++, then you need to ask the system to give you s bytes of allocated and initialized memory. You can achieve the desired result in C++ by adding parentheses after the call to the new operator
Is this an actual thing? You can force reification of overcommitted memory just by calling operator() on it?
I don’t think that guarantees the memory gets committed. The compiler/standard library combination is allowed to know how the OS behaves, so if it, say, initializes new pages with 0x00, it can use that information to skip the initialization of that memory.
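For reference, the parenthesized form the article is talking about is value-initialization. A minimal illustration (whether the zeroing actually commits pages is, as noted, implementation- and OS-dependent):

```cpp
#include <cstddef>

int main() {
    const std::size_t n = 1 << 20;
    char *a = new char[n];    // default-initialized: contents are indeterminate
    char *b = new char[n]();  // value-initialized: every byte reads as zero
    delete[] a;
    delete[] b;
}
```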
However, doing this is a very strong code smell. At the very least, using `new` and assigning to a raw pointer is a sign that the C++ developer is managing memory manually and is likely to hit a lot of problems, including memory leaks or segmentation faults. Also, many would forget that this calls `operator new[]()`, not `operator new()`, and might confuse the trailing parentheses with the placement forms `operator new(...)` or `operator new[](...)`. And the developer might also forget that `new` could throw an exception... [0]
The developer should instead be using, at a minimum, a `std::unique_ptr<char[]>` [1]. Or, IMO, a `std::vector<char>` which reminds the developer not only of the pointer but also of the count of bytes which have been allocated (.capacity()) and also of the valid range which has been initialized (.size()).
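A quick sketch of the vector version, assuming nothing beyond the standard library, just to show where the two numbers live:

```cpp
#include <vector>

int main() {
    std::vector<char> buf;
    buf.reserve(1 << 20);  // capacity(): storage allocated, elements not yet constructed
    buf.resize(4096);      // size(): this many bytes now exist and are zero-initialized
}
```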
IMO, if the developer wanted a pointer to a byte array then it's a lot easier to use `malloc()` than trying to remember all the different ways you can get screwed by `operator new`:
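(The snippet being referred to isn't shown here; presumably it was roughly this, a malloc'd buffer handed straight to a unique_ptr with std::free as its deleter:)

```cpp
#include <cstdlib>
#include <memory>

int main() {
    const std::size_t n = 1 << 20;
    // malloc'd buffer owned by a unique_ptr, with std::free as the deleter.
    std::unique_ptr<char, void (*)(void *)> buf(
        static_cast<char *>(std::malloc(n)), &std::free);
    if (!buf) return 1;
    // ... fill buf.get() ...
}
```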
> using `new` and assigning to a raw pointer is a sign that the C++ developer is managing memory manually and is likely to hit a lot of problems including memory leaks
There is no need to badmouth all C++ developers. Plenty of us can keep our memory straight just fine. Just because you don't like pointers does not mean the rest of us can't use them perfectly safely.
I think the evidence is pretty clear that programmers can't, in general, use bare pointers properly. The very best write use-after-free, double-free, stack-smashing, and all manner of other memory-related bugs. We can see the evidence of this in the CVEs, the syzkaller bugs, etc. If you haven't been bitten by a serious instance of one of these, then you just haven't written very much C++.
IMO the debate about whether programmers can safely handle bare pointers is over. They can't. The only question is whether smart pointers help enough to make the extra line noise worth it.
And the guys that implement smart pointers are what, exactly? Uberprogrammers? Is there a certificate to get into that club? What about compiler authors, that deal with, gasp, optimizations of these bare pointers? Kernel developers? Embedded system engineers?
It is OK to stay above a certain level of abstraction consciously, but stating that one cannot go below it safely is just settling for mediocrity.
> And the guys that implement smart pointers are what, exactly? Uberprogrammers? Is there a certificate to get into that club? What about compiler authors, that deal with, gasp, optimizations of these bare pointers? Kernel developers? Embedded system engineers?
One cannot do any of those things safely by hand. Compilers, kernels, and embedded systems do indeed do these things; they also have bugs.
You cannot safely control an internal combustion engine's valve timings by hand, but it is clearly possible to implement this automatically (and fairly simply). Some things are much easier for machines to do than humans, even though the machines are designed and built by humans.
There is, and the large majority fails at it, as proven by the CVE database, in spite of the rigorous process for contributing code to the Linux kernel, for example.
That's not the point. Smart pointer implementations aren't some dark magic; they're just the opposite: mostly fairly simple to implement (shared_ptr is a little trickier), and obviously right by construction: the compiler does the hard work of making sure the constructors and destructors get called appropriately (which is a little harder, but still far, far easier than making sure that malloc() and free() match in all actual uses).
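To make "fairly simple to implement" concrete, here's a bare-bones single-owner pointer; just a sketch, not a stand-in for std::unique_ptr (it omits custom deleters, arrays, swap, comparisons, etc.):

```cpp
#include <utility>

// Minimal single-ownership pointer: the destructor, not the programmer,
// is responsible for matching every new with a delete.
template <typename T>
class scoped_ptr {
    T *p_;
public:
    explicit scoped_ptr(T *p = nullptr) : p_(p) {}
    ~scoped_ptr() { delete p_; }
    scoped_ptr(const scoped_ptr &) = delete;             // no copying: one owner
    scoped_ptr &operator=(const scoped_ptr &) = delete;
    scoped_ptr(scoped_ptr &&o) noexcept : p_(o.p_) { o.p_ = nullptr; }
    scoped_ptr &operator=(scoped_ptr &&o) noexcept {
        std::swap(p_, o.p_);                              // old pointer dies with o
        return *this;
    }
    T &operator*() const { return *p_; }
    T *operator->() const { return p_; }
    T *get() const { return p_; }
};
```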
> just because you don't like pointers, does not mean the rest of us can't use them perfectly safely.
I actually use pointers quite safely. I just no longer see a need to ever return an allocation from `new` to a raw pointer unless I'm implementing my own pointer class.
Herb Sutter's correct [0]. `std::unique_ptr` or `std::shared_ptr` or some other pointer container should __always__ hold new-allocated objects. Just like you should __always__ wear your seatbelt. It's no danger to you or anyone else, it's slightly inconvenient, and it saves a metric ton of headaches about "what if?". Because in reality a pointer container explicitly marks the scope of the allocation and if you're not wanting to use C++'s scoping rules then why are you using C++?
The statement you quoted is simply a fact, not badmouthing. If there is any badmouthing by implication, it wouldn't be "all" C++ developers, just those who don't realize they're doing something spooky when their code looks like Foo *foo = new Foo.
Why do you think that marks a troll? Have you never had to use a C library which wants you to use the library's structure allocator / deallocator? In that case, this translates very well to malloc/dealloc.
@unlinked_dll has it correct: the C++ way to allocate an array of chars is `std::vector<char>` (or `std::array<char, size>`).
Doesn't that break when std::free does not have the default C++ calling convention? I think `decltype(std::free)` would be easier to read anyway, but writing a deleter class that calls std::free in its operator() has the advantage that it will not use any memory in the unique_ptr thanks to the empty base class optimization.
I was thinking about calling conventions because std::free can use extern "C", but the function pointer declaration in C++ code points to an extern "C++" function, and the standard says these are two distinct types. Are they guaranteed to be compatible, since they're only linkage specifiers, not calling conventions? I'm not sure.
std::function would still work in this case because pointers to C functions can be invoked through the normal call syntax.
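For reference, the empty deleter class mentioned above would look roughly like this; the size claim is what mainstream implementations do in practice rather than a standard guarantee:

```cpp
#include <cstdlib>
#include <memory>

// An empty deleter class: unique_ptr can apply the empty base optimization,
// so the smart pointer stays the size of a raw pointer on the usual
// implementations (the standard doesn't spell this out, but in practice it holds).
struct free_deleter {
    void operator()(void *p) const noexcept { std::free(p); }
};

int main() {
    std::unique_ptr<char[], free_deleter> buf(
        static_cast<char *>(std::malloc(1 << 20)));
    // ... use buf.get() ...
}
```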
Sometimes you just need a buffer for a syscall. Newing a buffer is canonical, and unique_ptr isn't free enough to use in the most performance-sensitive contexts.
I worked with a team which used raw pointers instead of unique_ptr because the smart pointers used more memory (twice as much, I think). The core data structure was a tree, and the tree was several hundred GB in size, so they were doing whatever they could to shave off memory.
Unique pointers imply ownership of the underlying memory. If you have multiple references to the same data, or in your case a tree, it might make sense to use raw pointers (or references) to access the tree data. It would not make sense to grant ownership of the same memory to two unique_ptr, or make copies of data unnecessarily.
In that case, you'd have something like `std::vector` or `std::unique_ptr` which owns the actual allocation (and governs the scoping and therefore deallocation), while using `vector.data()` or `unique_ptr.get()` to pass around an unowned pointer. All raw pointers can then be considered to be unowned by convention.
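A small sketch of that convention, with hypothetical names; the vector owns the bytes and everything else sees only a non-owning pointer:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical consumer that only reads the buffer: takes an unowned pointer.
std::size_t checksum(const char *data, std::size_t len) {
    std::size_t sum = 0;
    for (std::size_t i = 0; i < len; ++i)
        sum += static_cast<unsigned char>(data[i]);
    return sum;
}

int main() {
    std::vector<char> buf(1 << 20);     // the vector owns the allocation
    checksum(buf.data(), buf.size());   // raw pointer is a non-owning view
}   // buffer freed here, when the owner goes out of scope
```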
I gave up on C++ in part because the language changes too much.
I paid for lessons on it. The std:: stuff didn't exist, templates were a recent invention, and people were starting to agree that putting the overload keyword everywhere was bad style. Mostly, people were just excited about // comments and cout.
I keep hearing about how C++ is supposed to be done, and it seems to change every year. I don't know what people do with the older code as it becomes unfashionable with "very strong code smell". Updating it is risky busywork, kind of like porting to Python 3, with potential for serious bugs. Leaving it in place will make developers differently unhappy.
It isn't easy getting everybody on a team to agree on what subset of C++ is OK to use. People will sneak in their must-have feature. You'd have better luck getting agreement between emacs and vi.
C++ makes no pretense of ever planning to be "done". "Done" is another word for dead. Even COBOL is still evolving.
Failing to keep up with evolving languages is called stagnation. Everyone is free to stagnate, but I do not advise it.
You use the subset of the language supported by the compiler you have. Code using newer features is better because the new features were added for sound engineering reasons, not just to be different.
It's the opposite of sound engineering. Every version of C++ contains a new feature that does 11% of the proper solution because the feature in the last version only did 9% of the proper solution. Despite all that churn the language is still as far behind ML as ever.
What is the parent comment not keeping up with? Was it a reference to the ‘11%’ or to C++ being ‘behind ML’? Just curious to see what your objection was.
C has changed much less over the years. Arguably C89 was the most important change, C99 brought a bunch of small quality of life improvements (which you might or might not need, arguably the most important are stdint and snprintf), C11 got us a threading model, C17 is a bugfix.. After C89, no new version of the standard really changed the way you write idiomatic C without code smells.
Fact is, it did change; it even introduced breaking changes, like gets() and Annex K removal in C11.
And you are missing the compiler extensions that I also mentioned.
It did not change even more because nowadays C is left for UNIX clones and embedded development, having been superseded by other languages at most corporations; even its major compilers have been rewritten in C++.
I don’t know if the parent is ‘missing the compiler extensions [you] mentioned’ or is just not addressing them. But I will take a stab: compiler extensions may be neat/effective/revolutionary, but they are unequivocally not a part of a programming language. That is why they are compiler-specific extensions. I could code a compiler extension for LLVM that is a conservative, generational garbage collector and have the extension flag be --no-more-manual-memory, but no one is then going to say that, because that extension exists, C++ is a garbage-collected language.
tl;dr -> compiler extensions are by definition not part of a language, no matter how useful, and therefore do not count towards the argument.
The other big complaint I have about C++ is that it is too high-level. It isn't very good for low-level control. :-)
Writing start-up code and linker scripts to support C++ code running without the libraries is not easy. All sorts of things, such as tables of constructors, need a solution. Typically that would involve a lot of assembly code. I'd need to disable lots of C++ functionality unless I wanted to write complicated things like a stack unwinder for exceptions.
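For example, a freestanding build typically has to walk the constructor table itself; this sketch assumes a GNU-style toolchain whose linker script provides __init_array_start/__init_array_end:

```cpp
// Sketch of what the startup code must do by hand without the C++ runtime:
// call every global constructor before handing control to the application.
extern "C" {
typedef void (*init_fn)();
extern init_fn __init_array_start[];  // provided by the linker script
extern init_fn __init_array_end[];

void run_static_constructors() {
    for (init_fn *fn = __init_array_start; fn != __init_array_end; ++fn)
        (*fn)();  // invoke each C++ global constructor in order
}
}
```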
It's just easier to use plain old C. The result is smaller too, which is important.
It's weird feeling surrounded by web programmers who think Python and Javascript are the lowest you'd ever want to go.
C++ is annoyingly high-level if you are trying to write code to go in a flash chip. The code starts running at a specific address as soon as the CPU starts. There is no OS unless your code implements one. There is no C++ library unless you port one.
Very minor nitpick, but IMHO, this article creates a little unnecessary confusion by insisting upon a particular meaning of the word "allocate" without being very clear it's doing that.
That is, it uses "allocate" to mean "make ready for immediate use with no further (lazy) processing". Another perfectly reasonable definition would be "guarantee to be possible to use".
The distinction would arise on, for example, a system which doesn't over-commit memory but also doesn't fault pages in upon allocation. On such a system, allocation might give you a rock solid guarantee that you can write to (and read from) that memory but it wouldn't give you any guarantee about how fast or slow that would happen on initial access.
Personally, I prefer the second definition, but that's not really the point. The point is to be clear and avoid confusion.
This is a completely ridiculous article. The whole concept is flawed.
New/delete like this are for smallish general purpose allocation, for example of objects, where we want to keep the code at a fairly high level in C++ and not think about low-level concepts like "bytes" or allocation mechanics. Or performance.
Conversely, an app written in C++ that needs huge swaths of memory for manipulating raw bytes and needs high performance would not likely use new char[] or calloc/malloc directly at all. It would directly interface with the OS via mmap, or indirectly via some domain-relevant library, for example OpenCV.
We might even still exclusively use the C++ language to write the low-level portions of the code, but if you need to interface with the OS in specific controlled ways or write performance-oriented code, you are not going to do it using only high-level C++ operations. You are going to call exactly what you need to interface with the system directly. If you want mmap(), you'd just call mmap() from C++. If you want sbrk(), you'd call sbrk(). If you don't like the system new/delete and malloc/free, you could use something like dlmalloc and even remap new/delete to it, or to some custom slab allocator.
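A sketch of the "remap new/delete" option, with my_alloc/my_free as hypothetical stand-ins for dlmalloc or a slab allocator:

```cpp
#include <cstdlib>
#include <new>

// Placeholder allocator hooks; swap in dlmalloc, a slab allocator, etc.
static void *my_alloc(std::size_t n) { return std::malloc(n); }
static void  my_free(void *p) { std::free(p); }

// Replacing the global allocation functions routes plain new/delete
// (and new[]/delete[]) through the custom allocator.
void *operator new(std::size_t n) {
    if (void *p = my_alloc(n)) return p;
    throw std::bad_alloc{};
}
void operator delete(void *p) noexcept { my_free(p); }
void *operator new[](std::size_t n) { return operator new(n); }
void operator delete[](void *p) noexcept { operator delete(p); }
```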
Secondly, as pointed out by others, the author isn't benchmarking C++. He's benchmarking glibc malloc and the Linux mmap. We'd expect a C program using malloc/calloc to have exactly the same timings.
Thirdly, the reasoning about initialization is fallacious. If your app needs to allocate (for some ??? reason) 32 GB of memory, you would NOT automatically zero it first, unless you actually needed it to be zero, or you wanted your app to waste a bunch of time. Zeroing huge arrays is not required for good security. We're mostly benchmarking memset here, not even malloc/mmap. So it's completely apples/oranges to compare an mmap benchmark with a zero-fill benchmark. I almost expect a follow-up article that points out accessing x[i] is much faster when x is a raw array of integers than when x is a std::map of strings.
Now read the footnotes. The author is misunderstanding what "idiomatic C++" means. RAII is the best answer I can come up with for idiomatic: C++ doesn't have a built-in guard concept, so we creatively mis-use constructors and destructors to get a similar result.
Footnote 2 proves the author knows bupkis about how C++ or any part of the system actually works. He's semi-admitting as much.
I'd hope for much better from a CS professor. Stick to benchmarking this:
Anonymous mmap() zeros memory, so why is new wasting its cycles doing so too? I doubt any modern C++ standard library is using brk/sbrk in this day and age.
By definition the libraries present a portable interface to underlying system resources and are full of platform-specific code (I was a Cygnus founder, so I spent plenty of time deep in these issues at a time when there was more diversity). All POSIX-ish systems I know of (e.g. Linux, BSD, Solaris) return pages of zeros from anonymous mmap. I consider it a bug if they post-process in this case.
In a comment the blog author says he's using Ubuntu and glibc malloc. And glibc malloc has a threshold (128 kB or thereabouts) where it switches to using mmap(). So in practice this benchmark is a test of how the Linux kernel implements the mmap() syscall, and how the compiler implements the zeroing loop.
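On glibc you can even move that crossover point yourself; a small sketch (mallopt and M_MMAP_THRESHOLD are glibc-specific, and the numbers are arbitrary):

```cpp
#include <malloc.h>   // glibc-specific: mallopt, M_MMAP_THRESHOLD
#include <cstdlib>

int main() {
    // Raise the malloc-to-mmap crossover; the default is roughly 128 kB
    // and adapts dynamically unless you pin it like this.
    mallopt(M_MMAP_THRESHOLD, 1 << 20);   // requests of ~1 MiB and up use mmap
    void *p = std::malloc(1 << 21);        // this allocation now comes from mmap
    std::free(p);                          // mmap'd chunks are munmapped, not cached
}
```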
> And glibc malloc has a threshold (128 kB or thereabouts) where it switches to using mmap()
Very similar on Windows. In modern CRT, malloc is a thin wrapper over HeapAlloc WinAPI. That one has a threshold (512kb or 1MB depending on 32- or 64-bit process) where it switches to VirtualAlloc.
K&R actually contains an example malloc()/free() implementation using a "freelist" - i.e. a list of spare memory blocks held as a linked list in the "unallocated" (or de-allocated) blocks. Memory is retrieved from the OS using the ugly sbrk() interface.
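A heavily simplified sketch of that scheme, not the actual K&R code (no splitting, coalescing, or alignment handling):

```cpp
#include <cstddef>
#include <cstdint>
#include <unistd.h>   // sbrk; POSIX, purely for illustration

// Each block carries a small header; free blocks are chained on a list.
struct Block {
    std::size_t size;
    Block *next;       // next block on the free list
};

static Block *free_list = nullptr;

void *my_malloc(std::size_t n) {
    // First-fit search of the free list.
    for (Block **pp = &free_list; *pp; pp = &(*pp)->next) {
        if ((*pp)->size >= n) {
            Block *b = *pp;
            *pp = b->next;           // unlink from the free list
            return b + 1;            // payload starts after the header
        }
    }
    // Nothing suitable: grow the heap with sbrk().
    void *mem = sbrk(static_cast<intptr_t>(sizeof(Block) + n));
    if (mem == reinterpret_cast<void *>(-1)) return nullptr;
    Block *b = static_cast<Block *>(mem);
    b->size = n;
    return b + 1;
}

void my_free(void *p) {
    if (!p) return;
    Block *b = static_cast<Block *>(p) - 1;
    b->next = free_list;             // push the block back onto the free list
    free_list = b;
}
```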
If the pages don’t already exist (as indicated by the given timings), this is a test of the OS and has little to do with the language.
It’s a poorly posed question. C++ runs in many environments.