MLUG: Re: [MLUG] Linux time command -- "kernel mode" and "user mode"

On 11/3/06, Stephen Montgomery-Smith <EMAIL:PROTECTED> wrote:
Jonathan King wrote:
> On 11/3/06, Stephen Montgomery-Smith <EMAIL:PROTECTED> wrote:

>> Or another possibility.  The reformatting of the data in the C++ program
>> involves a lot of calling malloc, which in turn will probably call things
>> like mmap.  This would push a lot of the tasks from the user to the
>> kernel.  But this would not be a bad thing in and of itself, if it actually
>> leads to slightly better performance.  (Whereas using the swap space a
>> lot strikes me as excessively ugly.)
>
>
> Something does seem pretty screwy here. He says he's keeping *300*
> files open in the Python version, while the C++ is doing things one at
> a time. I'm thinking the only thing that's keeping Python in the
> ballpark is that it's doing something better with caching writes and
> then doing all of them at once. Would it be possible to post the code,
> or at least a skeleton that shows the i/o? My guess is that as soon as
> they saw the code, half of the C++ dudes on the list could point to
> some crucial inefficiency you could fix.
>
> Or, maybe Python really can be pretty clever here.

My suspicion is that this is the kind of problem where C++ isn't going to
give you much of an edge.  So I think that writing it in Python is a
good way to go.

The caching you talk of does take place, but actually it takes place in
just about any decent file-io implementation.  For example, 'printf' in
C is buffered.  It only writes the data out once the stdio buffer has
filled up (BUFSIZ bytes, typically a few kilobytes), or at each newline
when the output is going to a terminal.  The only call that always
writes exactly what you hand it, even a byte at a time, is the system
call 'write' (or printf if you first call 'setbuf' or 'setvbuf', which
let you turn off or tune the buffering).
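
The same buffering can be observed from Python. What follows is a minimal sketch, not anything from the original program: the helper name sizes_before_and_after_flush is made up for illustration, and it assumes the default buffer is larger than the 10 bytes written (it is normally several kilobytes). It writes 10 bytes and checks the file size on disk before and after flush().

```python
import os
import tempfile


def sizes_before_and_after_flush(buffering):
    """Write 10 bytes and report the on-disk size before and after flush().

    buffering=-1 asks for Python's default buffered writer;
    buffering=0 (binary mode only) sends every write straight to the OS.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb", buffering=buffering) as f:
            f.write(b"0123456789")
            before = os.path.getsize(path)  # what the OS has seen so far
            f.flush()
            after = os.path.getsize(path)   # buffer has been pushed out
        return before, after
    finally:
        os.remove(path)


if __name__ == "__main__":
    print("buffered:  ", sizes_before_and_after_flush(-1))
    print("unbuffered:", sizes_before_and_after_flush(0))
```

With the default buffer nothing reaches the file until flush() (or close), while the unbuffered handle hands each write to the kernel immediately. That is the difference between one system call per buffer and one system call per write.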

Stephen

If you are interested in speeding this up, I would highly recommend Python's profiler. http://docs.python.org/lib/profile.html

I've increased performance of Python programs by orders of magnitude
after looking over the profiler's output for a few minutes.  One trick
is to inline very small functions that get called a lot.  Function
calls are not free in Python.
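
A minimal sketch of that workflow, using the cProfile and pstats modules from the standard library. The functions square, sum_squares_with_calls, sum_squares_inlined and profile_report are hypothetical stand-ins made up for illustration, not code from the program under discussion.

```python
import cProfile
import io
import pstats


def square(x):
    # Hypothetical tiny helper, called once per loop iteration.
    return x * x


def sum_squares_with_calls(n):
    total = 0
    for i in range(n):
        total += square(i)  # one Python function call per element
    return total


def sum_squares_inlined(n):
    total = 0
    for i in range(n):
        total += i * i  # same arithmetic, helper body inlined by hand
    return total


def profile_report(func, *args):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


if __name__ == "__main__":
    for func in (sum_squares_with_calls, sum_squares_inlined):
        result, report = profile_report(func, 200000)
        print(func.__name__, result)
        print(report)
```

In the first report the helper shows up as its own line with one call per element, which is the kind of entry worth looking for. How much time inlining actually saves depends on the program and the Python version, so it is worth re-profiling after the change.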

Also, make sure you are calling Python with "-O" to turn on the
optimizer.  The byte-compiled files will end with ".pyo" instead of
".pyc".  I haven't measured much improvement from the optimizer,
however.  As far as I can tell it does little beyond stripping assert
statements.
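
A quick way to see what "-O" changes is to run the same snippet in a child interpreter with and without the flag. This is only a sketch: run_snippet and SNIPPET are names made up for illustration, and it assumes the PYTHONOPTIMIZE environment variable is not set.

```python
import subprocess
import sys

# Code for the child interpreter: reports whether the assert was
# executed, then prints the value of the built-in __debug__ flag.
SNIPPET = """
try:
    assert False, "boom"
    print("assert skipped")
except AssertionError:
    print("assert ran")
print(__debug__)
"""


def run_snippet(code, optimize):
    """Run code in a child interpreter, with or without -O; return stdout."""
    cmd = [sys.executable]
    if optimize:
        cmd.append("-O")
    cmd += ["-c", code]
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stdout.strip()


if __name__ == "__main__":
    print("without -O:", run_snippet(SNIPPET, optimize=False).splitlines())
    print("with -O:   ", run_snippet(SNIPPET, optimize=True).splitlines())
```

Under "-O" the assert is compiled away and __debug__ is False. Code that does not lean heavily on asserts should therefore not be expected to run noticeably faster.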

Regards,
Mark
EMAIL:PROTECTED
--
You think that it is a secret, but it never has been one.
 - fortune cookie

_______________________________________________
members mailing list
EMAIL:PROTECTED
http://mlug.missouri.edu/mailman/listinfo/members