- To: MLUG discussion <EMAIL:PROTECTED>
- Subject: [MLUG - DISCUSSION] HTML problem with "<base href=" and "<a name="
- From: Mike Miller <EMAIL:PROTECTED>
- Date: Sat, 1 Dec 2007 10:20:41 -0600 (CST)
- Delivery-date: Sat, 01 Dec 2007 10:20:51 -0600
- Envelope-to: EMAIL:PROTECTED
- Reply-to: MLUG Off-Topic Discussion <EMAIL:PROTECTED>
- Sender: EMAIL:PROTECTED
If you download a web page (using wget, say), you can insert a line
like this into the <head> of the saved file:
<base href="http://en.wikipedia.org">
That line helps a great deal because almost all of the links in the
file now work. The browser finds the style sheets and the page looks
good. The only remaining problem is that the original document had some
links of this form:
<a href="#whatever">
These are meant to link to the part of the file that contains this tag...
<a name="whatever">
...but the change of base href effectively transforms them into this:
<a href="http://en.wikipedia.org/#whatever">
And that no longer points at the anchor in the local file.
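
To illustrate why this happens, here is a minimal Python sketch that
resolves a fragment-only link against a base URL with the standard
library's urllib.parse.urljoin. It follows the generic URL resolution
rules, so a browser may normalize the result slightly differently (for
example by adding a slash before the #). The variable names are only
for illustration.

```python
from urllib.parse import urljoin

base = "http://en.wikipedia.org"   # the value given in <base href>
link = "#whatever"                 # a fragment-only link from the page

# The fragment is attached to the base URL, not to the local file,
# so the link now points at the remote site.
resolved = urljoin(base, link)
print(resolved)
```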
So, what's the best way of dealing with this? The only way I can see to
make it work is to drop base href, search for every relative link in the
file, and rewrite it as an absolute URL on the original site, leaving
the "#whatever" links alone. That is a bit annoying and I probably won't
do it, because I don't care that much that the few name links don't work.
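
The rewrite-every-link approach described above could be sketched
roughly as follows in Python. This is only an illustration under some
assumptions: the function name absolutize_links and the example page URL
are made up, only href and src attributes are handled, and the regular
expression is a rough match for quoted attribute values rather than a
real HTML parser.

```python
import re
from urllib.parse import urljoin

# Rough match for href="..." or src="..." with single or double quotes.
ATTR_RE = re.compile(r'''\b(href|src)\s*=\s*(["'])(.*?)\2''', re.IGNORECASE)

def absolutize_links(html, page_url):
    """Make relative href/src values absolute against page_url.

    Links that start with '#' are left untouched so they keep
    pointing at anchors inside the local file.
    """
    def fix(match):
        attr, quote, value = match.group(1), match.group(2), match.group(3)
        if value.startswith("#"):
            return match.group(0)  # same-page anchor: keep it local
        return f"{attr}={quote}{urljoin(page_url, value)}{quote}"
    return ATTR_RE.sub(fix, html)

# Hypothetical example input and page URL.
sample = ('<link href="/style.css">'
          '<a href="#whatever">x</a>'
          '<a href="/wiki/Foo">y</a>')
result = absolutize_links(sample, "http://en.wikipedia.org/wiki/Bar")
print(result)
```

With this approach no base href is needed: style sheets and ordinary
links point at the remote site, while the name links stay inside the
saved file.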
Know any good tricks? Maybe there is an option in wget to make it do
this. I want the saved page to use the remote site for CSS, etc. -- I
don't want to download all of that (I already know how to make wget do
that).
Mike
_______________________________________________
discussion mailing list
EMAIL:PROTECTED
http://mlug.missouri.edu/mailman/listinfo/discussion