[#311528] binary-trees heap tuning


2009-03-07 15:47
Submitted by:
Isaac Gouy (igouy-guest)
Assigned to:
Nobody (None)
Benchmark Design and Description

Detailed description
For many language implementations, the binary-trees programs are all about GC, and GC work can be evaded by setting a larger initial heap, hinting that the heap will be in some size range, controlling when GC happens, etc.

We've taken an arbitrary approach - just default settings for all language implementations.

Maybe there's a better (rather than different arbitrary) approach?

Nothing persuasive has been suggested yet.

Followups:

Date: 2009-03-07 16:11
Sender: Isaac Gouy

"Note: these programs are being measured with the default initial heap size - the measurements may be very different with a larger initial heap size or GC tuning."

Date: 2009-08-20 13:53
Sender: Adrien Nader

I first commented about OCaml and its Gc.minor_heap_size setting. AFAIK, the optimal value is equal to the size of the L2 cache, which means it could be set automatically.
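
For illustration, the same setting can also be changed from inside a program through the standard Gc module. The sketch below is only an assumption about how one might do it: the helper name set_minor_heap_kib and the 8 MiB figure are made up for the example and do not come from any benchmark submission. Note that minor_heap_size is expressed in words, not bytes.

```ocaml
(* Sketch: enlarge the minor heap from inside the program.
   The size used below is an arbitrary illustration, not a recommended value. *)

(* Number of machine words in one KiB (128 on 64-bit, 256 on 32-bit). *)
let words_per_kib = 1024 / (Sys.word_size / 8)

(* Hypothetical helper: ask the runtime for a minor heap of about [kib] KiB. *)
let set_minor_heap_kib kib =
  Gc.set { (Gc.get ()) with Gc.minor_heap_size = kib * words_per_kib }

let () =
  let before = (Gc.get ()).Gc.minor_heap_size in
  (* Assumed value: 8 MiB, standing in for "about the size of a cache level". *)
  set_minor_heap_kib 8192;
  let after = (Gc.get ()).Gc.minor_heap_size in
  Printf.printf "minor heap: %d -> %d words\n" before after
```

Whether a given size actually helps would still have to be measured on each machine.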

I think OCaml's default value is so low because it hasn't been changed in years and doesn't discriminate between processor architectures: a Core i7 quad running in 32-bit mode is treated like a Pentium 1, and a bigger minor_heap_size would be terrible for the latter. On the other hand, some compilers, such as gcc with its -mtune=native, do discriminate.

That makes the situation a bit biased, since compiler flags can be used.
Actually, for OCaml we could set the GC settings with command-line args too. :P

Also, this doesn't really avoid GC work; the benchmarks are probably too short-lived for the GC to start working. When trying GC settings for the benchmarks, I found that trying to make the GC work less and trade memory for speed gave no improvement[1].
We could actually check that by running the programs with the OCAMLRUNPARAM="v=0x1FF" environment setting.
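
A rough way to check the same thing from inside a program is to compare the collection counters before and after the allocation-heavy part. This is only a sketch: the tree type, make, count and the helper with_gc_counts are written for this example and are not the benchmark's actual code.

```ocaml
(* A binary tree in the spirit of the benchmark (illustrative only). *)
type tree = Leaf | Node of tree * tree

(* Build a complete tree of the given depth. *)
let rec make depth =
  if depth = 0 then Leaf else Node (make (depth - 1), make (depth - 1))

(* Count every node, leaves included. *)
let rec count = function
  | Leaf -> 1
  | Node (l, r) -> 1 + count l + count r

(* Hypothetical helper: run [f] and return its result together with the
   number of minor and major collections that happened during the call. *)
let with_gc_counts f =
  let s0 = Gc.quick_stat () in
  let result = f () in
  let s1 = Gc.quick_stat () in
  (result,
   s1.Gc.minor_collections - s0.Gc.minor_collections,
   s1.Gc.major_collections - s0.Gc.major_collections)

let () =
  let nodes, minor, major = with_gc_counts (fun () -> count (make 18)) in
  Printf.printf "nodes=%d minor=%d major=%d\n" nodes minor major
```

If the counters barely move for a given program, that would suggest the GC is not doing much work during the run.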

I wouldn't risk an explanation as to why this setting gives a very good speedup for OCaml in some benchmarks, for I am no GC guru and haven't studied the benchmarks enough. But I think it has more to do with allocation speed than deallocation speed. Maybe it is also that OCaml uses two heaps, so a bigger young/minor/small one means less work to move data between the two, and maybe also better "caching" or use of the CPU caches.

I think it'd be fair to allow them, but I'd hate to see ten GC settings in each program (but it's like the gcc compilation settings: most don't help, usually).

[1] Actually, they gave some improvement, but really not much, and I dismissed them because I didn't enjoy having three more almost useless params.
