MLUG: Re: [MLUG] Linux time command -- "kernel mode" and "user mode"



On Fri, 3 Nov 2006, Stephen Montgomery-Smith wrote:



On Fri, 3 Nov 2006, Mike Miller wrote:

On Fri, 3 Nov 2006, Stephen Montgomery-Smith wrote:

On Fri, 3 Nov 2006, Mike Miller wrote:

We have written a program in C++ and we have also written it with a slightly different algorithm in Python. We want to see how much slower the Python program is than the C++ program. We get these results:

Python:
------
real    10m43.357s
user    9m51.050s
sys     0m23.420s

C++:
------
real    8m16.668s
user    5m38.360s
sys     2m37.780s

This is my understanding.

To compare the CPU cost of the two programs, you want to add the user and kernel (sys) times together.
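As a rough check of that advice, here is a small Python sketch that sums user and sys for the two runs quoted above. The helper names `parse_time` and `cpu_seconds` are made up for this illustration, and the parser assumes the `NmS.SSSs` format shown in the output above.

```python
import re

def parse_time(value):
    """Convert a field such as '9m51.050s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unrecognized time format: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)

def cpu_seconds(user, sys):
    """Total CPU time: user-mode time plus kernel-mode (sys) time."""
    return parse_time(user) + parse_time(sys)

# Figures copied from the timings quoted in this thread.
python_cpu = cpu_seconds("9m51.050s", "0m23.420s")
cpp_cpu = cpu_seconds("5m38.360s", "2m37.780s")

print(f"Python CPU: {python_cpu:.3f} s")
print(f"C++ CPU:    {cpp_cpu:.3f} s")
print(f"ratio:      {python_cpu / cpp_cpu:.2f}")
```

By this measure the Python run used about 614.5 s of CPU and the C++ run about 496.1 s, so Python cost roughly 1.24 times as much CPU, even though the user times alone suggest a larger gap.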

[snip good info]

What does your program do?

It reads in a large amount of data (a few million lines) and processes every line by replacing certain strings with others based on a hash table. Then it writes the reformatted data into a collection of about 300 files. So one big file comes in and 300 smaller gzipped files go out.


The C++ program reads the whole big file into memory, reformats it and writes it out to 300 files, one at a time. The Python program reads in the big file one line at a time and writes each processed line immediately to one of the 300 files, keeping all 300 open at once.

The Python script surely uses minimal memory while the C++ program uses lots of memory. The Python program opens many files at once, but this seems not to be a problem under most conditions, while the C++ program keeps only one output file open at a time.
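For readers who want to picture the streaming approach, here is a minimal sketch of it. This is not the actual script: the function name `split_stream`, the `replacements` dict, and the `route` callback that picks an output file are all assumptions made for illustration, and the plain substring replacement stands in for whatever the real hash-table lookup does.

```python
import gzip
import os

def split_stream(input_path, out_dir, replacements, route):
    """Stream the input line by line into gzipped outputs kept open together.

    replacements: dict mapping old substring -> new substring (hypothetical).
    route: function mapping a processed line to an output file stem (hypothetical).
    Returns the sorted list of output stems that were written.
    """
    os.makedirs(out_dir, exist_ok=True)
    outputs = {}  # stem -> open gzip text handle
    try:
        with open(input_path, "rt") as source:
            for line in source:  # only one line is held in memory at a time
                for old, new in replacements.items():
                    line = line.replace(old, new)
                stem = route(line)
                handle = outputs.get(stem)
                if handle is None:
                    # Open each output lazily and leave it open until the end.
                    handle = gzip.open(os.path.join(out_dir, stem + ".gz"), "wt")
                    outputs[stem] = handle
                handle.write(line)
    finally:
        for handle in outputs.values():
            handle.close()
    return sorted(outputs)
```

With about 300 outputs this stays well under the common default limit of 1024 open files per process, which fits the observation that keeping them all open is usually not a problem.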

Thanks for the tips, Stephen.

Mike

I'm trying to account for the huge kernel time. Is the file you read in very big? Maybe the C++ program uses a lot of swap space on the drive. That would contribute to a huge kernel time.

Another possibility: the reformatting of the data in the C++ program may involve a lot of calls to malloc, which in turn will probably call things like mmap. This would push a lot of the work from user mode to the kernel. But this would not be a bad thing in and of itself if it actually leads to slightly better performance. (Whereas using the swap space a lot strikes me as excessively ugly.)
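One way to tell these two explanations apart might be to look at page-fault counts alongside the CPU times. The sketch below uses Python's standard `resource` module on a Unix system; the helper name `child_usage` and the stand-in child command are assumptions for illustration, and the real C++ binary would be substituted for the command. Major faults count pages that had to be read from disk (swap, but also ordinary file-backed pages), so a large number is only a hint of swapping. Many minor faults together with high sys time would be more consistent with the malloc/mmap explanation.

```python
import resource
import subprocess
import sys

def child_usage(argv):
    """Run argv to completion and return the CPU time and page faults it added."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    subprocess.run(argv, check=True)
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "user_s": after.ru_utime - before.ru_utime,
        "sys_s": after.ru_stime - before.ru_stime,
        # Minor faults: pages mapped in without any disk I/O.
        "minor_faults": after.ru_minflt - before.ru_minflt,
        # Major faults: pages that had to be read from disk.
        "major_faults": after.ru_majflt - before.ru_majflt,
    }

# Stand-in workload (an assumption): a child that allocates about 50 MB.
usage = child_usage([sys.executable, "-c", "data = bytearray(50_000_000)"])
for name, value in usage.items():
    print(f"{name}: {value}")
```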








_______________________________________________ members mailing list EMAIL:PROTECTED http://mlug.missouri.edu/mailman/listinfo/members

