GENERAL
========

This is a mirror of Compaq's website for back issues of the DIGITAL
Technical Journal, at the time of writing (17th of November 2003)
available at:

    http://www.research.compaq.com/wrl/DECarchives/DTJ/

All content mirrored here is created and owned by HP/Compaq, unless
otherwise stated. No modifications to the content have been made except
for those mentioned in the MIRROR section below.

This mirror archive was created and is made available for preservation
purposes only.

This document, the complete archive and a tarball of both can be found at:

    http://netilium.org/~mad/dtj/

MIRROR
=======

A few minor modifications to the HTML code were made during the mirroring
process; they are described below. The scripts below use GNU-specific
extensions of "find", but otherwise use only standard UNIX commands and
command-line arguments.

The bulk of the site was mirrored using GNU wget:

---------------------------------------------------------------------
wget --mirror --no-parent \
     'http://www.research.compaq.com/wrl/DECarchives/DTJ/'
---------------------------------------------------------------------

A number of references to absolute URLs existed. First we needed to
fetch these files:

---------------------------------------------------------------------
DECURL="www.research.compaq.com/wrl/DECarchives/DTJ/"
find . -type f -iname "*.htm*" | \
  xargs -n1 sed -n 's/.*\/info\/\([^ "]*\).*/\1/p' | \
  while read LINE ; do if [ \! -f "${DECURL}${LINE}" ] ; then \
    wget --force-directories "http://${DECURL}${LINE}" ; else \
    echo "We already have ${LINE}. (Won't fetch)" ; fi ; done
---------------------------------------------------------------------

We also need to change all absolute URLs to relative ones:

---------------------------------------------------------------------
cd "www.research.compaq.com/wrl/DECarchives/DTJ"
find . -type f -iname '*.htm*' -exec grep -l '/info/' {} \; | \
  while read LINE ; do
    echo "Fixing '${LINE}'."
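    # SLASHES holds one "..\/" (an escaped "../") for each directory
    # level of ${LINE} below the current directory.
    SLASHES="`echo -n ${LINE} | sed 's/[^/]//g' | cut -c2- | sed 's/\//..\\\\\//g'`"
    ed "${LINE}" <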