Tip: Pipe your cron jobs to /dev/null if you aren't doing anything with the output, because some hosts e-mail you the results and no one needs an extra piece of useless e-mail every day.
Just change http://www.google.com to the page of your choice. However it's important to know that the "archive" you're taking will only be a snapshot of that page on a particular day.
What I mean by that is, if you're archiving a blog page every day, this archiver won't go back and fetch the page as it looked on some particular day; it just saves whatever is there at the moment it runs. So it's not useful for everything, but it's good if you have access to a page that changes regularly, say once a day, whose results you'd like to store.
Add that line above into your crontab file. These days every host has a control panel so there should be a place in there to add cron jobs. If you'd like the archiver to run at a time other than midnight, or if it should run weekly, monthly, or whatever, try this tool I've made for you:
http://www.robertplank.com/cron
I've designed it the same way Task Scheduler is set up: you can enter a certain time, run only on weekdays, or run only on certain days of the week. Anything you want.
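If you'd rather write the schedule by hand, here is a minimal sketch of how the five schedule fields in a crontab entry line up. The cron_entry helper and the /path/to/archive-job command are made up for illustration (they aren't from this article), and the script only prints entries; it doesn't install anything.

```shell
# Print a crontab entry from its five schedule fields plus a command.
# Field order: minute, hour, day of month, month, day of week.
cron_entry() {
    printf '%s %s %s %s %s %s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

job="/path/to/archive-job"   # hypothetical placeholder, not a real command

cron_entry 0 0 '*' '*' '*' "$job"    # every day at midnight
cron_entry 30 6 '*' '*' 1-5 "$job"   # 6:30 in the morning, weekdays only
cron_entry 0 0 '*' '*' 0 "$job"      # weekly, Sundays at midnight
cron_entry 0 0 1 '*' '*' "$job"      # monthly, on the 1st at midnight
```

In the day-of-week field, 0 is Sunday and 1-5 covers Monday through Friday. The asterisks are quoted so the shell passes them through instead of expanding them to file names.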
This tip doesn't take care of everything... for example, wget won't save the images on a page unless they're referenced by full URLs. In the next installment of this article series I'll be showing you how you can use PHP to make up for some of the things wget can't do (like grabbing images).
Here's my solution: http://www.jumpx.com/tutorials/commandline/get.zip
It's not the most perfect script in the world, but it should do what you want most of the time. If you'd like to delve into what it does, I've commented all the functions and a few of the important parts of the code.
ARGUMENTS (NOT THE SHOUTING KIND)
But wait, you want to use it in a crontab, which is run from the command line. You can't just do something like:
php get.php?url=http://www.google.com
Because it'll try looking for a *file* named all that, complete with the question mark and all. So what if you have ten different URLs to grab off ten different crontabs, but you only want one script? How would you do all that? It's a long brutal ordeal so prepare yourself. Ready?
php get.php url=http://www.google.com
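Yeah, that's all there is to it. PHP's pretty cool like that: it takes the arguments after the file name and stores them in the same array you'd check for a query string anyway. (That's how PHP's CGI binary behaves, the one that prints HTTP headers; the CLI binary hands you command-line arguments in $argv instead.)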
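To see why the first form fails, it helps to look at how the shell splits each command into words before PHP ever runs. This is a small sketch using a made-up count_args helper (it is not part of get.php); the quotes are only there so the shell doesn't treat the question mark as a wildcard.

```shell
# Report how many separate arguments a command would receive.
count_args() {
    echo "$#"
}

# One word: PHP is handed a single "file name", question mark and all.
count_args 'get.php?url=http://www.google.com'    # prints 1

# Two words: the script name, then a separate url=... argument.
count_args get.php 'url=http://www.google.com'    # prints 2
```

With the space in place, PHP gets a real file name first and everything after it as arguments to the script.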
One thing you might notice is that every time you run PHP from the command line, it gives you something like this:
Content-type: text/html
X-Powered-By: PHP/4.3.3
your output here...
Those first couple of lines are the HTTP headers. But we're not using HTTP (not loading it from a browser), so on the command line it's better to call php with the "-q" option, like this:
php -q get.php url=http://www.google.com
The "q" stands for quiet, and will refrain from giving you the HTTP headers. If you're just piping the script to /dev/null (to nothing) in a crontab, it doesn't really make a difference but you should try to make this a habit when running PHP from the command line.
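Putting the pieces together, here is a rough sketch of a dry run that prints one crontab entry per URL, each calling the same script with -q and throwing away the output. The script path, the second URL, the midnight schedule, and the make_entries helper are all assumptions for illustration; swap in your own.

```shell
# Hypothetical values: adjust the script path and URL list for your host.
script="/home/you/get.php"
urls="http://www.google.com http://www.example.com"

# Print one crontab entry per URL (a dry run; nothing gets installed).
make_entries() {
    for url in $urls; do
        # ">/dev/null 2>&1" discards normal output and error messages alike.
        printf '0 0 * * * php -q %s url=%s >/dev/null 2>&1\n' "$script" "$url"
    done
}

make_entries
```

Each printed line can be pasted into your crontab or your host's control panel. The order of the redirects matters: standard output goes to /dev/null first, then standard error is pointed at the same place.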
That's enough for you to at least get started. If you still feel like poking about with the things PHP can do on the command line, you can try prompting a user for keyboard input, like this:
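<?php
echo "Give me your name: ";
$data = fopen("php://stdin", "rb");
$input = "";
while (1==1) {
    // Read one character at a time until Enter is pressed or input ends
    $chunk = fread($data, 1);
    if ($chunk === false || $chunk === "" || $chunk == "\n" || $chunk == "\r") break;
    $input .= $chunk;
}
fclose($data);
echo "Hello $input!\n";
?>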
Remember, that only works when PHP is run from the shell.
If you have PHP installed in Windows on a local machine of yours, you can also see what happens when you try to read from (and write to) file handles like "COM1:" and "LPT1:" ... yep, you guessed it, the serial port and printer port. If PHP isn't installed on the computer you're using now then don't bother. But it is possible to use PHP to print and interact with your peripherals as well.
You're welcome.

Robert Plank is the creator of Lightning Track, Redirect Pro, Rotatorblaze, and others.
An easy way to display the content saved by this article's script is explained in chapters 15 and 16 of his book, "Simple PHP": http://www.simplephp.com
You may reprint this article in full in your newsletter or web site.