There's a single line you can add to your web host's control panel that will automatically archive your content.
LISTEN CLOSELY AND YOU'LL HEAR THE OCEAN
Ever run commands in DOS? You've used a shell. A "shell" in the computer world is a place where you enter commands and run files by name rather than clicking around different windows.
Most web hosts let you operate a shell remotely. This means that you can type commands in a window on your computer that are actually run on your web host, thousands of miles away.
I'd like you to log in to your shell now. If you can't do it by going into DOS and typing "telnet your.domain.here", your web host probably uses "SSH" -- a secure shell. You'll have to ask your host how to log in to the shell; they might tell you to download a program called "PuTTY" and give you instructions on how to use it.
If you can't log in to your shell, or aren't allowed to, you'll just have to sit back and watch what I do.
Now that you're logged in, type: echo hi
The next line will print: hi
Try this: date +%Y
This prints the current year. That's 2004 for me.
So what if we combined the two? Try: echo date +%Y
Well, that doesn't work, because the computer thinks you're trying to echo the TEXT "date +%Y" instead of the output of the actual COMMAND. What we have to do here is surround that text in what are called "back quotes". Unix will evaluate everything enclosed in back quotes (by evaluate, I mean it'll treat that text as if it were entered as a command, and put the command's output in its place).
Your back quote key should be located in the upper-left corner of your keyboard, under the Esc key.
PIPE DOWN, OVER THERE...
Type this in: echo `date +%Y`
That gives us "2004". You could even do something like this: echo `dir`
Which puts the directory listing all on one line.
But now, we put our newfound knowledge to good use. Unix has another neat feature called output redirection (a close cousin of piping), which means "take everything you would normally output to the screen here, and shove it into whatever file I tell you to." So say I had something like this:
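Here's a small sketch of the same idea, in case it helps to see it side by side. Most Unix shells also accept $( ) as another way of writing back quotes; the variable names below are made up purely for illustration.

```shell
# Back quotes and $( ) both run the enclosed command
# and substitute its output in place.
year_backquotes=`date +%Y`
year_dollar=$(date +%Y)

echo "Back quotes say: $year_backquotes"
echo "Dollar-parens say: $year_dollar"

# Without either one, echo just prints the text itself.
echo date +%Y
```

The last line prints the literal text, which is exactly the mistake from earlier.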
echo "hey" > test.txt
Now type "dir" and you'll see a new file, test.txt, that wasn't there before. View it over the web, or FTP it to your computer, do whatever you have to, to read the file. It should contain the word "hey".
Likewise, dir > test.txt would store the directory listing in "test.txt".
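One thing worth trying for yourself: a single > starts the file over each time, while >> adds to the end of it. The sketch below assumes your host has the mktemp command (most Linux hosts do) and works in a throwaway directory so nothing important gets overwritten; the names workdir and file are only for illustration.

```shell
# Make a scratch directory to experiment in.
workdir=$(mktemp -d)
file="$workdir/test.txt"

echo "hey" > "$file"      # creates the file (or replaces its contents)
echo "there" > "$file"    # > starts over: the file now holds only "there"
echo "again" >> "$file"   # >> appends: the file now holds "there" then "again"

cat "$file"
```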
HERE TODAY, GONE TOMORROW
But say we wanted that text file to be named according to the current date. You already have the pieces to figure all that out, if you think about it. Type: date --help to get a listing of all the possible ways to represent the date. The ones you want to represent the year, month and day are %Y, %m, and %d (capitalization *is* important here).
This is what you want: echo `date +%Y%m%d.html`
Running this today, January 8th, 2004, results in: 20040108.html
I've just echoed this year, followed by this month and this day, with an ".html" at the end. This will be our output file.
Now, to redirect into it: echo "hey" > `date +%Y%m%d.html`
If this sort of thing were to run every day, it would save "hey" to a file called 20040108.html today, and tomorrow to a file called 20040109.html, then 20040110.html, and so on.
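If you'd like to check that for yourself without cluttering your home directory, here's a minimal sketch. It assumes mktemp is available on your host, and the names workdir and today are just ones I picked.

```shell
# Scratch directory, so the test file doesn't land anywhere important.
workdir=$(mktemp -d)

# Work out today's file name once, then reuse it.
today=`date +%Y%m%d.html`

# Save "hey" into a file named after today's date.
echo "hey" > "$workdir/$today"

# Show what was created.
ls "$workdir"
```

Storing the name in a variable first also means the date is only looked up once, even if you use the file name several times.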
The easy part now is figuring out what you want archived. I use wget, which takes an option naming the output file, so we don't need to use redirection. Here's an example of how to use wget to save the page "http://www.google.com" to a file named after today's date:
wget http://www.google.com --output-document=`date +%Y%m%d.html`
PUT IT TOGETHER
And now, to set up your crontab. I won't explain how crontabs work, just that they're the equivalent of the Windows Task Scheduler, which automatically runs a particular command at a given date and time. The following will save http://www.google.com to a different filename every day.
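0 0 * * * wget http://www.google.com --output-document=`date +\%Y\%m\%d.html` > /dev/null 2>&1
Note the backslashes: cron treats an unescaped % in a command as a line break, so inside a crontab each % has to be written as \%.
Keep in mind that if you want to put the file in a special directory, just put the path in, e.g. change what's in the "output document" parameter to: `date +/home/user/wwwroot/your.host/\%Y\%m\%d.html`
I've sent the output to /dev/null because wget saves the file for us, and there's no reason to do anything else with the output. The 2>&1 on the end sends wget's progress messages to the same place, since wget prints those as error output rather than normal output.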
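Because cron gives an unescaped % in a command special treatment, another option is to move the command into a small script and have cron run that instead. What follows is only a sketch: the file name archive.sh, the ARCHIVE_DIR variable, and the two function names are all my own inventions, and the default path is a placeholder you'd change to suit your host.

```shell
#!/bin/sh
# archive.sh (hypothetical name): save one page under today's date.

# Directory to save into; the default path is only a placeholder.
ARCHIVE_DIR=${ARCHIVE_DIR:-/home/user/wwwroot/your.host}

# Print the full path of today's archive file, e.g. .../20040108.html
archive_path() {
    echo "$ARCHIVE_DIR/`date +%Y%m%d`.html"
}

# Fetch the URL given as $1 into today's file.
# --quiet turns off wget's messages, so cron has no output to deal with.
archive_page() {
    wget --quiet "$1" --output-document="`archive_path`"
}

# Only fetch when a URL is passed on the command line.
if [ $# -gt 0 ]; then
    archive_page "$1"
fi
```

With the script saved and made executable (chmod +x archive.sh), the crontab line could shrink to something like: 0 0 * * * /home/user/archive.sh http://www.google.com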