Here's an interesting use of what I was just taught:
perl -pe 's/^>(From \S+\s+\S+\s+\S+\s+\d+\s+\d+:\d+:\d+\s+\d+)/$1/'
This can be used to turn a Listserv archive log back into a proper mbox
file. It strips the '>' from the front of the "From " line that begins
each individual message. (The pattern is anchored with '^', so it can
match at most once per line and no /g modifier is needed.) The first line
of each message in a proper mbox file matches this perl regular expression:
^From \S+\s+\S+\s+\S+\s+\d+\s+\d+:\d+:\d+\s+\d+
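Coincidentally, I had another regexp problem today. I needed some sort of
grep command to grab the lines that match a given string in the fourth
column of data in a text file. The columns are separated by a single
space. This seems to work, where $string stands for the string we want
to find in the fourth column of infile:
egrep '^[\!-~]+ [\!-~]+ [\!-~]+ $string ' infile > outfile
Note that $string here is a placeholder to be replaced by the literal
text: inside single quotes the shell will not expand a variable of that
name. The trailing space after $string also means a line only matches if
it has a fifth column.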
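If the search string lives in a shell variable, one way to get it into the pattern is to close the single quotes around it. The sketch below makes some assumptions of its own: the helper name match_col4 and the sample data are invented, the string is assumed to contain no regexp metacharacters, and the pattern ends in '( |$)' instead of a plain space so that a fourth column at the end of the line also matches (a deliberate change from the command above).

```shell
# Hypothetical helper: print the lines of file $2 whose fourth
# space-separated column is exactly the string $1.
# Assumption: $1 contains no regexp metacharacters.
match_col4() {
    # The variable sits between two single-quoted pieces so the shell
    # expands it; LC_ALL=C makes the range [!-~] follow ASCII order.
    LC_ALL=C grep -E '^[!-~]+ [!-~]+ [!-~]+ '"$1"'( |$)' "$2"
}

# Made-up sample data.
infile=$(mktemp)
cat > "$infile" <<'EOF'
a1 b1 c1 foo e1
a2 b2 c2 bar e2
a3 b3 foo d3 e3
a4 b4 c4 foo
EOF

result=$(match_col4 foo "$infile")
printf '%s\n' "$result"
rm -f "$infile"
```

This prints the first and last sample lines. The third line has "foo" in column three, so it is skipped: '[!-~]+' cannot run across a space, which is what pins each piece of the pattern to one column.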
The regexp '[\!-~]' matches any ASCII character from '!' (33) through
'~' (126). That includes the punctuation !"#$%&'()*+,-./ and {|}~ as well
as everything covered by '[0-z]'. Note that '[0-z]' is not just letters
and digits: it also matches the characters between '9' and 'A' (:;<=>?@)
and between 'Z' and 'a' ([\]^_`).
In other words, '[\!-~]' matches any printable ASCII character except
for space. If you want to match space too, use '[\ -~]'. (Ranges follow
ASCII order in the C locale; other locales may order them differently.)
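The ranges can be checked directly against a handful of characters. This is only a sketch: the helper name in_range and the choice of test characters are mine, and LC_ALL=C is set so that grep treats the ranges in ASCII order.

```shell
# Hypothetical helper: keep only the input lines that consist of exactly
# one character matching the bracket expression given as $1.
in_range() {
    LC_ALL=C grep -E "^$1\$"
}

# One test character per line: punctuation, a space, a letter, a digit.
chars=$(printf '%s\n' ':' '@' '[' '_' ' ' '~' 'q' '7')

alnum_range=$(printf '%s\n' "$chars" | in_range '[0-z]' | tr -d '\n')
printable=$(printf '%s\n' "$chars" | in_range '[!-~]' | tr -d '\n')
with_space=$(printf '%s\n' "$chars" | in_range '[ -~]' | tr -d '\n')

printf '[0-z] matched: "%s"\n' "$alnum_range"   # ":@[_q7"
printf '[!-~] matched: "%s"\n' "$printable"     # ":@[_~q7"
printf '[ -~] matched: "%s"\n' "$with_space"    # ":@[_ ~q7"
```

So '[0-z]' picks up ':', '@', '[' and '_' along with the letter and digit, '[!-~]' adds '~' but still rejects the space, and '[ -~]' accepts everything in the sample.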
Mike
"man ascii" produces this (among other things):
0 NUL 1 SOH 2 STX 3 ETX 4 EOT 5 ENQ 6 ACK 7 BEL
8 BS 9 HT 10 NL 11 VT 12 NP 13 CR 14 SO 15 SI
16 DLE 17 DC1 18 DC2 19 DC3 20 DC4 21 NAK 22 SYN 23 ETB
24 CAN 25 EM 26 SUB 27 ESC 28 FS 29 GS 30 RS 31 US
32 SP 33 ! 34 " 35 # 36 $ 37 % 38 & 39 '
40 ( 41 ) 42 * 43 + 44 , 45 - 46 . 47 /
48 0 49 1 50 2 51 3 52 4 53 5 54 6 55 7
56 8 57 9 58 : 59 ; 60 < 61 = 62 > 63 ?
64 @ 65 A 66 B 67 C 68 D 69 E 70 F 71 G
72 H 73 I 74 J 75 K 76 L 77 M 78 N 79 O
80 P 81 Q 82 R 83 S 84 T 85 U 86 V 87 W
88 X 89 Y 90 Z 91 [ 92 \ 93 ] 94 ^ 95 _
96 ` 97 a 98 b 99 c 100 d 101 e 102 f 103 g
104 h 105 i 106 j 107 k 108 l 109 m 110 n 111 o
112 p 113 q 114 r 115 s 116 t 117 u 118 v 119 w
120 x 121 y 122 z 123 { 124 | 125 } 126 ~ 127 DEL