[#311523] Haskell GHC: parallel binary-trees

View Trackers | Archive One | Export CSV

2009-03-03 01:59
Submitted by:
Don Stewart (dons-guest)
Assigned to:
Nobody (None)
Haskell GHC
Haskell GHC: parallel binary-trees

Detailed description
Modernisation of the binary-trees code, in light of the new parallel GC in GHC 6.10.x

Compile with:

ghc -O2 -fasm --make -threaded A.hs

Run with:

$ /A 20 +RTS -N4 -H340M

Since GHC 6.10.x the parallel garbage collector will use the same number of threads by default as the -N flag.


Note: The GHC User's Guide also suggests:

"set your -H value high (3 or more times the maximum residency)"

When using the parallel GC.

The -H flag:

"provides a “suggested heap size” for the garbage collector. The garbage collector will use about this much memory until the program residency grows and the heap size needs to be expanded to retain reasonable performance."

I'm not sure how this differs to the -A flag (which sets the initial allocation area, and was ruled against the "initial allocation size" rule). -H seems to allow the heap to grow and shrink.

Either way, this program does much better if we follow the GHC User's Guide.

Should using the -H flag be not accepted, I suggest we put that in the "interesting alternatives", and then either do one of two things:

* Work out a better heap growth algorithm for GHC's default heap.


* Allow GC hints to some fixed size for all entries, so that parallelism
is visible in the GC.

Followups: Sort comments antichronologically

Date: 2009-03-04 17:48
Sender: Don Stewart

In one entry there's a comment:

"Setting the heap size cuts-out all that pesky heap-growth. (We should really set both the initial heap-size and the maximum heap-size but for now we'll accept the default settings.)"

This seems like a good idea. Either:

* you go with your default size


* you can use a particular minimum and maximum.

Date: 2009-03-07 15:37
Sender: Isaac Gouy

As you say in your analysis "yikes, we’re wasting a lot of time cleaning up after ourselves" - well the point of these binary-trees programs is the GC not the computation of binary trees.

My guess is that the heap grows and shrinks without -H otherwise the program would fail.

Larger heaps evade GC work.

At first glance some "fixed size" for all entries might seem fair but as tree nodes will likely be different size for different language implementations, and some will not allow a "fixed size" to be set - it's still pretty arbitrary.

Probably worth some kind of feature request.
Date: 2009-03-07 15:47
Sender: Isaac Gouy
Date: 2009-03-07 18:02
Sender: Don Stewart

This is reasonable, thanks for taking the time.

Some other entries that look like they might be in violation of the "default only" rule:

* Lua #3

("collectgarbage("setstepmul", 0) -- sometimes it helps much. For this benchmark ~ 10%"

* Python
import sys, gc

* It was also suggested that "java -server" (used by Scala #4 and Java) is essentially a GC / heap setting hint as well, which is interesting.
Date: 2009-03-07 19:41
Sender: Isaac Gouy

> "It was also suggested that "java -server" (used
by Scala #4 and Java) is essentially a GC / heap setting hint
as well, which is interesting."

It might be interesting if any evidence was provided that it was true or if there weren't explicit GC / heap setting hints for Java which do make a massive difference even when -server is used.

Attached Files:

Size Name Date By Download
2 KiBA.hs2009-03-03 01:59dons-guestA.hs


Field Old Value Date By
close_date2009-03-07 19:412009-03-07 19:41igouy-guest
close_date2009-03-07 15:472009-03-07 15:47igouy-guest
close_date2009-03-07 15:372009-03-07 15:37igouy-guest
ResolutionNone2009-03-07 15:37igouy-guest
status_idOpen2009-03-07 15:37igouy-guest
File Added3111: A.hs2009-03-03 01:59dons-guest
Powered By FusionForge