On Fri, 3 Nov 2006, Stephen Montgomery-Smith wrote:
On Fri, 3 Nov 2006, Mike Miller wrote:
On Fri, 3 Nov 2006, Stephen Montgomery-Smith wrote:
On Fri, 3 Nov 2006, Mike Miller wrote:
We have written a program in C++ and we have also written it with a
slightly different algorithm in Python. We want to see how much slower
the Python program is than the C++ program. We get these results:
Python:
------
real 10m43.357s
user 9m51.050s
sys 0m23.420s
C++:
------
real 8m16.668s
user 5m38.360s
sys 2m37.780s
This is my understanding.
You want to add the user and kernel (sys) times together to get the total CPU time.
[snip good info]
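As a quick check of that advice, here is a minimal sketch that adds user and sys for the two runs quoted above. The helper names (to_seconds, cpu_seconds) are made up for illustration; only the timing figures come from the thread.

```python
def to_seconds(stamp):
    """Convert a time(1)-style stamp such as '9m51.050s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# Figures copied from the `time` output quoted in this thread.
timings = {
    "Python": {"real": "10m43.357s", "user": "9m51.050s", "sys": "0m23.420s"},
    "C++": {"real": "8m16.668s", "user": "5m38.360s", "sys": "2m37.780s"},
}


def cpu_seconds(entry):
    """Total CPU time: user time plus kernel (sys) time."""
    return to_seconds(entry["user"]) + to_seconds(entry["sys"])


for name, entry in timings.items():
    cpu = cpu_seconds(entry)
    real = to_seconds(entry["real"])
    print(f"{name}: cpu={cpu:.3f}s real={real:.3f}s")

ratio = cpu_seconds(timings["Python"]) / cpu_seconds(timings["C++"])
print(f"Python/C++ CPU ratio: {ratio:.2f}")
```

By this measure the Python run used about 614.47 s of CPU and the C++ run about 496.14 s, so the Python program comes out roughly 1.24 times slower on CPU time. Note that the C++ run spends nearly a third of its CPU time in the kernel.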
What does your program do?
It reads in a large amount of data (a few million lines) and processes
every line by replacing certain strings with others based on a hash table.
Then it writes the reformatted data into a collection of about 300 files.
So one big file comes in and 300 smaller gzipped files go out.
The C++ program reads the whole big file into memory, reformats it and
writes it out to 300 files, one at a time. The Python program reads in the
big file one line at a time and writes each processed line immediately to
one of the 300 files, keeping all 300 open at once.
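For readers who want to picture the Python approach, here is a rough sketch of the streaming version. It is not the actual script: the whitespace-separated field format, the routing rule in bucket_for, and the output file names are all assumptions made up for illustration.

```python
import gzip
import os
import zlib
from contextlib import ExitStack


def reformat(line, table):
    # Assumed format: whitespace-separated fields, each looked up in the table.
    return " ".join(table.get(field, field) for field in line.split())


def bucket_for(line, n_files):
    # Hypothetical routing rule: hash the first field to pick an output file.
    fields = line.split()
    first = fields[0] if fields else ""
    return zlib.crc32(first.encode("utf-8")) % n_files


def split_stream(in_path, out_dir, table, n_files=300):
    """Read in_path one line at a time and write each reformatted line
    straight to one of n_files gzipped outputs, all held open at once."""
    os.makedirs(out_dir, exist_ok=True)
    with ExitStack() as stack:
        # ExitStack closes every output file when the block exits.
        outputs = [
            stack.enter_context(
                gzip.open(os.path.join(out_dir, f"part{i:03d}.gz"), "wt")
            )
            for i in range(n_files)
        ]
        with open(in_path) as source:
            for line in source:
                target = outputs[bucket_for(line, n_files)]
                target.write(reformat(line, table) + "\n")
```

Only one input line is held in memory at a time, which is why this style needs so little RAM. The cost is 300 open file descriptors, which is usually well under the per-process limit (check with ulimit -n).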
The Python script surely uses minimal memory, while the C++ program uses
lots of memory. The Python program opens many files at once, but this
seems not to be a problem under most conditions, while the C++ program keeps
only one output file open at a time.
Thanks for the tips, Stephen.
Mike
I'm trying to account for the huge kernel time. Is the file you read in very
big? Maybe the C++ program uses a lot of swap space on the drive. That would
contribute to a huge kernel time.
Another possibility: the reformatting of the data in the C++ program
involves a lot of calls to malloc, which in turn will probably call things
like mmap. This would push a lot of the work from user space into the
kernel. But this would not be a bad thing in and of itself, if it actually
leads to slightly better performance. (Whereas using the swap space a lot
strikes me as excessively ugly.)
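One way to tell these two explanations apart is to look at page-fault counts next to the user/sys split: heavy swapping tends to show up as major page faults, while plain allocation shows up mostly as minor faults. Below is a small sketch using Python's resource module (Unix only). It measures the Python process itself; usage_delta and allocate_buffers are hypothetical names for illustration, and the same counters are available to a C++ program through getrusage(2).

```python
import resource


def usage_delta(work):
    """Run work() and report the CPU time and page faults it cost this process."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    work()
    after = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "user": after.ru_utime - before.ru_utime,               # seconds in user mode
        "sys": after.ru_stime - before.ru_stime,                # seconds in the kernel
        "minor_faults": after.ru_minflt - before.ru_minflt,     # served without disk I/O
        "major_faults": after.ru_majflt - before.ru_majflt,     # required disk I/O
    }


def allocate_buffers():
    # Stand-in workload: allocate 64 zero-filled buffers of 1 MiB each.
    buffers = [bytearray(1 << 20) for _ in range(64)]
    return len(buffers)


print(usage_delta(allocate_buffers))
```

If the sys time rises together with major faults, swap is the likely culprit; if major faults stay near zero, the kernel time is more likely coming from allocation or file I/O.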
_______________________________________________
members mailing list
EMAIL:PROTECTED
http://mlug.missouri.edu/mailman/listinfo/members