MLUG: Re: [MLUG] String manipulation in C

The strings are quite long, and there are a *bunch* of them. My pattern
strings to be matched are short, from 5 to about 50 letters, but the
strings I'm matching them against are about 1000 letters long. There are
~600,000 of the short strings and 29,000 of the long ones.

MATLAB was originally suggested to me for the matching, since the
strings come from a data analysis project. MATLAB will match a pattern
string multiple times within a source string, but the result matrices
get very big very quickly and this is exceedingly slow. Running 8000
pattern strings against 15,000 source strings took my X2 4200+ with 4 GB
RAM 15 hours and produced 2.2 GB of output data matrices. (On a side
note, MATLAB for Linux x86_64 leaks memory like a sonofagun. Leave it
open with no data set loaded and nothing running and it will eat a GB of
RAM within several hours.) This is unacceptably slow, and I thought that
writing a C program to do the regex matches would be faster. I guess I
could use Perl, but I don't know Perl and I do know at least some C.
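
To make the C idea concrete, here is a rough sketch of what the inner
matching step might look like. It assumes the patterns are plain literal
substrings rather than true regular expressions (if they really are
regexes, POSIX regcomp/regexec would be the thing to look at instead).
The function name count_matches and its parameters are made up for
illustration. It records only the offsets where a match starts instead
of filling a dense result matrix, which should keep the output far
smaller than what MATLAB produced.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count every occurrence of `pattern` in `source`,
 * including overlapping ones. If `hits` is not NULL, the 0-based start
 * offsets of the first `max_hits` matches are stored there. */
size_t count_matches(const char *source, const char *pattern,
                     size_t *hits, size_t max_hits)
{
    assert(source != NULL && pattern != NULL);

    size_t count = 0;

    /* Assumption: an empty pattern is treated as "no match". */
    if (pattern[0] == '\0')
        return 0;

    const char *p = source;
    while ((p = strstr(p, pattern)) != NULL) {
        if (hits != NULL && count < max_hits)
            hits[count] = (size_t)(p - source);
        count++;
        p++;    /* step one character so overlapping matches are found */
    }
    return count;
}
```

The outer program would then just loop over every (source, pattern) pair
and call this once per pair, writing out only the pairs with a nonzero
count.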


Jack


On Sun, 2007-07-01 at 22:03 -0400, Jonathan King wrote:

> But all that said...unless your strings are absolutely huge, this
> would seem to be the kind of thing that Perl could do even better than
> C (at least the code would be shorter).
> 
> jking



_______________________________________________
members mailing list
http://mlug.missouri.edu/mailman/listinfo/members