MLUG: Re: [MLUG - DISCUSSION] Re: HTML problem with "<base href=" and "<a name="
On Sat, 1 Dec 2007, Michael wrote:

What are you trying to do? Wget will usually handle links correctly regardless of whether the page uses base or not. You shouldn't need to insert anything or use any special options.


I want to download Wikipedia pages, usually for record albums, and keep the HTML locally, but still get the embedded images, CSS, etc., from Wikipedia. I usually do it like this, where "$1" is the URL of the Wikipedia page:

lynx -source "$1" | perl -pe 's#<head>#<head>\n<base href="http://en.wikipedia.org">#' > file.html

I'm just grabbing the original source file and adding the base href.

What you are saying about wget is not true. Try it. There are various wget options that are supposed to deal with the issues I'm trying to deal with, but they do not work correctly with Wikipedia pages (e.g., the -k option).

Mike
