Jonathan King wrote:
> On Sat, Jul 5, 2008 at 10:17 PM, Stephen Montgomery-Smith
> <EMAIL:PROTECTED> wrote:
>> Stephen Montgomery-Smith wrote:
>>> Thanks. Yes it does look like the various threads are working against
>>> each other. I will try to rewrite the program to see if I can mitigate
>>> this effect. Should have a version 1.16 out in a few days.
>> Well, my early attempts are not working out very well. So probably
>> there will be no version 1.16. If anyone else wants to try out my
>> program, I will still be very grateful.
>
> Do you have any feeling for whether this would work better/worse for
> network-type supercomputers? This is hardly my field, but if this is a
> processor contention issue or something, then maybe that's the
> approach you should explore. (And I wish I had ready access to
> Biowulf):
>
> http://biowulf.nih.gov/
>
> jking
My suspicion is that it would work quite badly on a cluster. The
different processors really do need to communicate with each other quite
a lot, and bandwidth would become an issue. But I might be wrong.
The issue with 3-year-old Xeons (e.g. Adam Proctor's computer) seems to
be this: apparently they were extremely bad at synchronizing different
threads.
The programs can work in two ways. The first way is this:
thread() {
    do some of the work;
    exit;
}
main() {
    for (do the loop) {
        subdivide the work;
        pthread_create(thread);
        pthread_create(thread);
        pthread_create(thread);
        pthread_join the 3 threads;
    }
}
That is, each time we go through the loop we recreate the threads. This
adds a lot of thread creation overhead.
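
To make the first method concrete, here is a minimal C sketch of it. It is
not the code from my program: the work (summing an array), the names
sum_slice and sum_recreate, and the struct slice are all made up for
illustration, and the thread count is fixed at 3 to match the pseudocode.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NTHREADS 3

/* One thread's share of the work (hypothetical example: summing an array). */
struct slice {
    const double *data;
    size_t begin, end;
    double partial;
};

/* Thread body: do some of the work, then exit. */
static void *sum_slice(void *arg)
{
    struct slice *s = arg;
    double acc = 0.0;
    for (size_t i = s->begin; i < s->end; i++)
        acc += s->data[i];
    s->partial = acc;
    return NULL;
}

/* Repeat the work `passes` times, creating and joining NTHREADS threads on
 * every pass. Returns the sum computed by the last pass. */
double sum_recreate(const double *data, size_t n, int passes)
{
    double total = 0.0;
    for (int p = 0; p < passes; p++) {
        pthread_t tid[NTHREADS];
        struct slice s[NTHREADS];

        /* Subdivide the work and start one new thread per slice. */
        for (int t = 0; t < NTHREADS; t++) {
            s[t].data = data;
            s[t].begin = n * (size_t)t / NTHREADS;
            s[t].end = n * (size_t)(t + 1) / NTHREADS;
            s[t].partial = 0.0;
            int rc = pthread_create(&tid[t], NULL, sum_slice, &s[t]);
            assert(rc == 0);
            (void)rc;
        }

        /* Wait for every thread to exit before the next pass. */
        total = 0.0;
        for (int t = 0; t < NTHREADS; t++) {
            pthread_join(tid[t], NULL);
            total += s[t].partial;
        }
    }
    return total;
}
```

Calling sum_recreate(d, 9, 4) on an array d of 9 doubles creates and
destroys 12 threads in total, which is where the overhead comes from.
Compile with -pthread.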
The second method is this:
thread() {
    while (1) {
        wait for condition-work;
        do some of the work;
        set condition-done;
    }
}
main() {
    pthread_create(thread);
    pthread_create(thread);
    pthread_create(thread);
    for (do the loop) {
        subdivide the work;
        set condition-work 3 times;
        wait for condition-done 3 times;
    }
}
The second creates the threads once, and then only synchronizes them on
each pass through the loop. The second way should be much faster, and
that is exactly what I do.
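
Here is a minimal C sketch of the second method, again not taken from my
program. The names struct pool, worker_main and sum_persistent are
hypothetical, the work is just summing an array, and I am assuming one
mutex, two condition variables and a "generation" counter as the way to
signal condition-work and condition-done. Other schemes (barriers,
semaphores) would also do.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NTHREADS 3

struct pool {
    pthread_mutex_t lock;
    pthread_cond_t work_ready;  /* "condition-work" */
    pthread_cond_t work_done;   /* "condition-done" */
    unsigned long generation;   /* bumped once per pass */
    int pending;                /* workers still busy in this pass */
    int quit;
    const double *data;
    size_t n;
    double partial[NTHREADS];
};

struct worker { struct pool *pool; int id; };

static void *worker_main(void *arg)
{
    struct worker *w = arg;
    struct pool *p = w->pool;
    unsigned long seen = 0;
    for (;;) {
        /* Wait for condition-work: a generation we have not handled yet. */
        pthread_mutex_lock(&p->lock);
        while (p->generation == seen && !p->quit)
            pthread_cond_wait(&p->work_ready, &p->lock);
        if (p->quit) {
            pthread_mutex_unlock(&p->lock);
            return NULL;
        }
        seen = p->generation;
        pthread_mutex_unlock(&p->lock);

        /* Do some of the work: sum this worker's slice. */
        size_t begin = p->n * (size_t)w->id / NTHREADS;
        size_t end = p->n * (size_t)(w->id + 1) / NTHREADS;
        double acc = 0.0;
        for (size_t i = begin; i < end; i++)
            acc += p->data[i];

        /* Set condition-done: the last worker to finish wakes main. */
        pthread_mutex_lock(&p->lock);
        p->partial[w->id] = acc;
        if (--p->pending == 0)
            pthread_cond_signal(&p->work_done);
        pthread_mutex_unlock(&p->lock);
    }
}

/* Create NTHREADS threads once, then reuse them for `passes` passes.
 * Returns the sum computed by the last pass. */
double sum_persistent(const double *data, size_t n, int passes)
{
    struct pool p = { .generation = 0, .pending = 0, .quit = 0,
                      .data = data, .n = n };
    pthread_t tid[NTHREADS];
    struct worker w[NTHREADS];
    double total = 0.0;

    pthread_mutex_init(&p.lock, NULL);
    pthread_cond_init(&p.work_ready, NULL);
    pthread_cond_init(&p.work_done, NULL);
    for (int t = 0; t < NTHREADS; t++) {
        w[t].pool = &p;
        w[t].id = t;
        int rc = pthread_create(&tid[t], NULL, worker_main, &w[t]);
        assert(rc == 0);
        (void)rc;
    }

    for (int pass = 0; pass < passes; pass++) {
        pthread_mutex_lock(&p.lock);
        p.pending = NTHREADS;
        p.generation++;                       /* set condition-work */
        pthread_cond_broadcast(&p.work_ready);
        while (p.pending > 0)                 /* wait for condition-done */
            pthread_cond_wait(&p.work_done, &p.lock);
        total = 0.0;
        for (int t = 0; t < NTHREADS; t++)
            total += p.partial[t];
        pthread_mutex_unlock(&p.lock);
    }

    /* Tell the workers to exit, then clean up. */
    pthread_mutex_lock(&p.lock);
    p.quit = 1;
    pthread_cond_broadcast(&p.work_ready);
    pthread_mutex_unlock(&p.lock);
    for (int t = 0; t < NTHREADS; t++)
        pthread_join(tid[t], NULL);
    pthread_cond_destroy(&p.work_done);
    pthread_cond_destroy(&p.work_ready);
    pthread_mutex_destroy(&p.lock);
    return total;
}
```

Here only 3 threads are ever created, however many passes are made, but
every pass costs a wake-up and a completion signal per thread. That
synchronization cost is the part that seems to hurt on the older Xeons.
Compile with -pthread.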
But, for example, the well-known fftw package at fftw.org did it the
first way, and only recently switched to the second way in beta
versions, as previously the second way wasn't any faster. I can only
guess that this is related to remarks I have heard to the effect that
on the older Xeon processors the second way really ended up being
very slow.
Stephen