This is a really simple trick I have come across, but I'm not sure it's
very useful. When you're storing string records in a database, there is
the question of how to separate those records. You can use a special
character (tab, comma, etc.) as the separator, and then insert an escape
character within each record wherever the special character occurs
naturally and is not meant to be a separator. Then you also have to
insert an escape in front of every escape character that occurred
natively in the text. That may be a lot of characters to insert. OK,
here's the idea: by using two (different) characters as a single
separator mark, you can get away with substituting only one of those
characters.
Example: we want to store the strings XEV and KAI as records and use VK
as our separator. Replace all occurrences of V in both records with VV,
and then you can safely join them together with VK as XEVVVKKAI. Note
that the separator can be found cheaply by regular expression matching:
V's inside a record only ever come in pairs, so a run of V's that ends
in a separator has an odd length (VVV here). We need the second
character ('K' in this case) to tell us whether the even number of V's
at the separator border belongs to the first or the second record: the
ones before the K belong to the first, the ones after it to the second.
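As a rough illustration of the example above, here is a minimal Python
sketch. The function names (join_records, split_records,
separator_offsets) are made up for this sketch and are not from any
library; VK is the separator from the example.

```python
import re


def join_records(records, sep="VK"):
    """Double the first separator character in each record, then join."""
    first, second = sep
    if first == second:
        raise ValueError("the two separator characters must differ")
    return sep.join(r.replace(first, first + first) for r in records)


def split_records(data, sep="VK"):
    """Undo join_records with a single left-to-right scan."""
    first, second = sep
    records, current = [], []
    i = 0
    while i < len(data):
        if data[i] != first:
            current.append(data[i])
            i += 1
            continue
        nxt = data[i + 1] if i + 1 < len(data) else ""
        if nxt == first:        # doubled character: one literal 'first'
            current.append(first)
        elif nxt == second:     # single 'first' then 'second': separator
            records.append("".join(current))
            current = []
        else:
            raise ValueError("malformed input at offset %d" % i)
        i += 2
    records.append("".join(current))
    return records


def separator_offsets(data, sep="VK"):
    """Find separators by regex: an odd run of 'first' followed by 'second'."""
    first, second = (re.escape(c) for c in sep)
    pattern = "(?<!%s)(?:%s%s)*%s%s" % (first, first, first, first, second)
    # The separator itself is the last two characters of each match.
    return [m.end() - 2 for m in re.finditer(pattern, data)]


joined = join_records(["XEV", "KAI"])
print(joined)                     # XEVVVKKAI
print(split_records(joined))      # ['XEV', 'KAI']
print(separator_offsets(joined))  # [4]
```

The scanner never needs to look more than one character ahead, which is
what the second separator character buys you.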
So when is this strictly better than an escape character system? When
the expected number of occurrences of the separator character and the
escape character together in a record is greater than 1 (that covers
the one extra character the two-character mark costs per separator,
assuming V itself is rare in the data). But I did not come across this
idea while trying to optimize database storage. I was lazy and wanted
to use a standard search-and-replace function which can only replace
one pattern at a time. If you're using the escape system and do not
want to write a custom string matching function, you'll have to call
the search-and-replace twice: once for the special character, and once
for the escape character (in reverse order, actually: the escape
character has to be handled first). With this system you only replace
one pattern.
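To make the search-and-replace point concrete, here is a small Python
sketch. It assumes tab as the separator and backslash as the escape
character for the classic scheme (the post does not name either), and
the function names are hypothetical.

```python
def escape_classic(record, sep="\t", esc="\\"):
    """Classic escaping: two replace passes, escape character first.

    Doing the passes the other way round would also double the escape
    characters that were just inserted in front of the separators.
    """
    return record.replace(esc, esc + esc).replace(sep, esc + sep)


def escape_two_char(record, first="V"):
    """Two-character separator scheme: a single replace pass."""
    return record.replace(first, first + first)


print(repr(escape_classic("a\tb\\c")))  # 'a\\\tb\\\\c'
print(escape_two_char("XEV"))           # XEVV
```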
Sorry for the rant, just had to get it down in writing to get my
thoughts together. Now if we just patent it and lobby the government to
restrict export of technologies using the "multi-character delimiter
algorithm," we're home free.
Paul
--
To unsubscribe, go to http://mlug.missouri.edu/members/edit.php
Archives are available at http://mlug.missouri.edu/list-archives/