Cpluzplus

Comments for Sutter's Mill

Comment on Recommended reading: Why mobile web apps are slow (Drew Crawford) by Zeckul

@HerbSutter: "GC'd object graphs are antithetical to performance" -> I don't see what "GC'd" has to do with the performance of object graphs vs arrays. Object graphs have poor data locality whether they're GCed or not.

"you can still get arrays but you're fighting against the system" -> How are you fighting the system? Arrays and value types are first-class in .NET and popular .NET languages. The syntax, semantics and performance characteristics are the same. If using first-class citizens of the runtime and language is fighting the system then worse should be said about automatic pointers, memory pools and such techniques in C++.

"control over hot/cold data separation and cache line layouts both of which are hard in managed environments that often don't give you the control you need to specify things like alignment directly" -> .NET offers the same control over data layout as C++. Alignment is certainly a valid example where things are easier in C++, but it's not exactly rocket science to do it in managed code either.

The article applied to JavaScript, and I find generalizing it to all GC'd/managed environments unfair. .NET and C# were clearly designed with performance and low-level control in mind, unlike JavaScript, and short of writing intrinsics or inline assembly there are precious few cases where dropping down to native code is necessary. It's entirely possible to avoid GC by managing object pools and arrays exactly like it's done in C++, and it's not "fighting the system" any more than trying to do the same in C++.
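
A minimal C++ sketch of the object-pool technique mentioned above. `Bullet` and `BulletPool` are hypothetical names chosen for illustration, not taken from any library. All storage is allocated once up front, so steady-state use never calls the allocator; the same shape can be written in C# over an array of structs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical payload type.
struct Bullet {
    float x = 0.0f, y = 0.0f;
    bool alive = false;
};

// Fixed-capacity pool: one contiguous allocation, reused for the pool's lifetime.
class BulletPool {
public:
    static constexpr std::size_t npos = static_cast<std::size_t>(-1);

    explicit BulletPool(std::size_t capacity) : items_(capacity) {
        free_.reserve(capacity);
        // Push indices in reverse so slot 0 is handed out first.
        for (std::size_t i = capacity; i > 0; --i) free_.push_back(i - 1);
    }

    // Returns a slot index, or npos when the pool is exhausted.
    std::size_t acquire() {
        if (free_.empty()) return npos;
        std::size_t i = free_.back();
        free_.pop_back();
        items_[i] = Bullet{};
        items_[i].alive = true;
        return i;
    }

    // Returns a slot to the pool; the slot must currently be in use.
    void release(std::size_t i) {
        assert(i < items_.size() && items_[i].alive);
        items_[i].alive = false;
        free_.push_back(i);
    }

    Bullet& operator[](std::size_t i) { return items_[i]; }
    std::size_t available() const { return free_.size(); }

private:
    std::vector<Bullet> items_;       // contiguous element storage
    std::vector<std::size_t> free_;   // indices of unused slots
};
```

Handing out indices rather than pointers is a design choice for this sketch: indices stay valid if the storage is ever moved, which is also what makes the pattern translate directly to a managed array.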

Comment on Recommended reading: Why mobile web apps are slow (Drew Crawford) by Herb Sutter

@Zeckul: Quick reply — yes, the emphasis is on “graphs” which popular GC-based languages are built around and encourage. Having said that, GC’d object graphs add an additional layer of performance overhead because they need to be traversed by the system, which adds extra memory operations and in some cases contention with program threads/cores.

You can get array allocation, but you have to fight the system’s natural way of working and you don’t get full support. For example, common limitations/gotchas of arrays in managed languages include: you can only use them with true contiguity for a subset of types (typically fundamental/value types; arrays of big-Oh Objects aren’t contiguous but are really arrays of references); you can’t make use of contiguity unless you pin (that’s a major performance penalty); you can only make arrays up to a 32-bit index size; multidimensional arrays are not contiguous; typically there’s no alignment control; and/or other limitations. BTW, I said “GC-based languages” in the previous paragraph because the language design itself often assumes a GC, making node-based allocation and GC semantics inherent in the language in places. You’re fighting those assumptions and that normal common-path way of working when you opt for arrays, that’s all.

Arrays just aren’t used as much in managed code, whereas they’re the recommended default container ([] and std::vector) in C and C++ code. I haven’t done this experiment, but try counting the mentions of techniques that use arrays in books/articles about managed code vs. books/articles about native code. Note: If you want to try this experiment, be careful when you count, because types called “*Array*” in C# and Java are not always actually arrays in the contiguous sense we mean! — which is symptomatic of what I’m talking about.

Comment on Recommended reading: Why mobile web apps are slow (Drew Crawford) by pjmlp

C# and Java are not the only languages with GC; quite a few others offer the same memory control that C and C++ do, in addition to GC.

As mentioned, so far they have failed to make a dent in the mainstream due to lack of corporate support, but that doesn’t mean we should take C# and Java as the only examples of how GC can be implemented in systems languages, or of how it performs.

Comment on Recommended reading: Why mobile web apps are slow (Drew Crawford) by Zeckul

@HerbSutter: thanks for clarifying your point. The equivalent of an array of MyClass in C# would be an array of MyClass* or MyClass& in C++; the equivalent of an array of MyClass in C++ would be an array of MyStruct in C#. Let’s not compare apples to oranges. .NET arrays and Lists are contiguous blobs of memory exactly like native arrays and std::vectors. It’s up to the programmer, in both cases, to make use of that contiguity for data locality as need be.
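
The two layouts being compared can be sketched on the C++ side as follows. This is an illustration under assumptions: `Vec3`, `ValueArray`, `RefArray`, `stride_bytes` and `sum_x` are hypothetical names, and the comments mapping each layout to a C# array type follow the equivalence stated above rather than any measurement.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Vec3 { float x, y, z; };  // hypothetical element type

// Elements stored inline, back to back: the layout of a C# MyStruct[].
using ValueArray = std::vector<Vec3>;

// Contiguous array of pointers, each element a separate heap object:
// the layout of a C# MyClass[].
using RefArray = std::vector<std::unique_ptr<Vec3>>;

// Byte distance between the first two elements; requires a.size() >= 2.
inline std::ptrdiff_t stride_bytes(const ValueArray& a) {
    assert(a.size() >= 2);
    return reinterpret_cast<const char*>(&a[1]) -
           reinterpret_cast<const char*>(&a[0]);
}

// Linear walk over inline elements: a single stream of memory.
inline float sum_x(const ValueArray& a) {
    float s = 0.0f;
    for (const Vec3& v : a) s += v.x;
    return s;
}

// Same loop, but every element costs an extra pointer dereference,
// and the pointed-to objects may be scattered across the heap.
inline float sum_x(const RefArray& a) {
    float s = 0.0f;
    for (const auto& p : a) s += p->x;
    return s;
}
```

In the value array the stride between neighbours is exactly `sizeof(Vec3)`; in the pointer array only the pointers are adjacent, which is the distinction both sides of this thread are drawing.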

I understand your point on different emphasis, but even then I find it unfair since, as I said, value types are first class citizens of .NET; they’re fully supported and easy to write and use. It’s not like value types were an esoteric feature (as could be said of many C++ features ;) ). For example, all geometric and math primitives in the XNA Framework are value types. If someone cares about performance and avoiding GC overhead it’s entirely feasible to rely heavily on value types and contiguous arrays in managed code, and it’s not particularly difficult or obscure code to write either. At any rate it’s certainly less obscure than most C++ code out there.

“you can't make use of contiguity unless you pin (that's a major performance penalty)” -> You get the data locality performance benefits whether you pin or not, but granted, you can only use pointer arithmetic if you pin. That said, pinning only hurts performance if the array is small (LOH is never moved in memory) and if GC happens to run while it’s pinned; if you only pin for brief amounts of time as is idiomatic with the fixed statement, and do some manual memory management (object pools, value types etc) to avoid putting too much pressure on the GC, the impact can be kept minimal.

“you can only make arrays up to a 32-bit index size” -> Not likely to be an issue even for most performance-sensitive programs. One can always allocate unmanaged memory if need be, exactly as they would in native code.

“multidimensional arrays are not contiguous” -> What is called a “multidimensional array” in .NET *is* contiguous in memory ( http://stackoverflow.com/a/597790/154766 ), although array accesses are unfortunately not inlined by the CLR. Jagged arrays require an indirection per dimension, but they do in C++ as well: an unsigned char** is two layers of indirection, exactly like a C# byte[][].
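
For illustration, the rectangular-versus-jagged distinction looks like this in C++. `Grid` and `Jagged` are hypothetical names; `Grid` is a sketch of a single row-major buffer (the layout claimed above for a .NET multidimensional array), while `Jagged` allocates each row separately, like a byte[][].

```cpp
#include <cstddef>
#include <vector>

// Rectangular grid in one contiguous row-major buffer.
class Grid {
public:
    Grid(std::size_t rows, std::size_t cols)
        : cols_(cols), data_(rows * cols) {}

    // Element (r, c) lives at offset r * cols + c: one multiply-add, no indirection.
    unsigned char& at(std::size_t r, std::size_t c) {
        return data_[r * cols_ + c];
    }

    const unsigned char* data() const { return data_.data(); }

private:
    std::size_t cols_;
    std::vector<unsigned char> data_;
};

// Jagged equivalent: an array of rows, each row its own allocation,
// so reaching an element takes one indirection per dimension.
using Jagged = std::vector<std::vector<unsigned char>>;
```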

“typically there's no alignment control” -> There’s no built-in wrapper for alignment control but nothing stops someone from allocating a chunk of unmanaged memory (using aligned malloc if need be) and using pointers in C#. While this is not particularly easy, I don’t find C++ to make things so much easier in that regard either.
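
For comparison, here is roughly what the C++ side of aligned allocation looks like. This sketch assumes C++17 and uses the aligned form of `operator new` rather than an aligned malloc; `kAlign`, `AlignedDelete`, `AlignedFloats` and `make_aligned_floats` are hypothetical names, and 64 bytes is an assumed cache-line size.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>

constexpr std::size_t kAlign = 64;  // assumed cache-line size

// Deleter that releases storage obtained from the aligned operator new.
struct AlignedDelete {
    void operator()(float* p) const noexcept {
        ::operator delete(p, std::align_val_t{kAlign});
    }
};

using AlignedFloats = std::unique_ptr<float[], AlignedDelete>;

// Allocates n floats whose first element is aligned to kAlign bytes,
// and initializes every element to zero.
inline AlignedFloats make_aligned_floats(std::size_t n) {
    void* raw = ::operator new(n * sizeof(float), std::align_val_t{kAlign});
    float* p = static_cast<float*>(raw);
    std::uninitialized_fill_n(p, n, 0.0f);
    return AlignedFloats(p);
}
```

Even here the alignment has to be threaded through both the allocation and the matching deallocation, which is some support for the point that C++ does not make this trivial either.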

“True (fully contiguous) arrays just aren't used as much in managed code, whereas they're the recommended default container ([] and std::vector) in C and C++ code.” -> Strange, I thought [] and List were the recommended default containers in C# as well. If you browse MSDN samples for C# those are by far the most commonly used collections. But if managed code literature makes heavier use of fancy data structures in .NET, perhaps it is simply because they’re more abundant and easier to use.

Similar to how C++ lets you opt-in to use certain “heavyweight” features (virtual methods, exceptions), C# lets you opt-out of GC and type checking (value types, unsafe code). Certainly the defaults and emphasis are different, but let’s not underestimate the possibilities of managed code.


Tuesday, July 16, 2013

FeedaMail: Comments for Sutter’s Mill