Writing Internet Clients with Perl - Fast and Easy


Submitted on: 1/1/2002 3:58:06 PM
By: T. E. Geek 
Level: Intermediate
User Rating: By 11 Users
Compatibility:5.0 (all versions)

Users have accessed this article 10091 times.
 
 
     A brief introduction to the libwww-perl module. No advanced knowledge assumed. Learn how to request webpages and then parse them within Perl. Easy to learn in under five minutes.

 
 
Terms of Agreement:   
By using this article, you agree to the following terms...   
1) You may use this article in your own programs (and may compile it into a program and distribute it in compiled format for languages that allow it) freely and with no charge.   
2) You MAY NOT redistribute this article (for example to a web site) without written permission from the original author. Failure to do so is a violation of copyright laws.   
3) You may link to this article from another website, but ONLY if it is not wrapped in a frame. 
4) You will abide by any additional copyright restrictions which the author may have placed in the article or article's description.

Introduction to writing your very own Web Clients


Welcome to this brief tutorial. It outlines the creation of simple Perl scripts that can request HTML data from the internet, store it in a file, mirror it, print it, and search it.
Perl: You've learned how to write scripts, parse text, blah blah... now what? Sure, it's nice to save a text file to your hard drive... rename it... and... uh- save another text file after that. But surely there has to be something else- besides CGI- that Perl can be used for. Here's an answer to that question- just one of the thousands found at www.cpan.org. Within five minutes you'll be writing Perl scripts capable of retrieving webpages. Cool, eh?
Let's begin.
The first step to writing a www-aware Perl script is to download the libwww-perl module. This module features a non-object-oriented package and an object-oriented package. For your convenience, I'll cover the non-object-oriented one (it's simpler, and it handles everything in this tutorial). To download this Perl module, simply go here:

http://www.cpan.org/authors/id/GAAS/libwww-perl-5.63.tar.gz

For those wishing to do this on the Windows platform, you can find this at the ActiveState website (www.ActiveState.com, I believe)- if not, search either Google or www.cpan.org for a Windows version of libwww-perl. This tutorial was written on a Linux machine, so I can't point you to the exact Windows download location.
Once you've downloaded it, untar/unzip it to a new directory. Open a shell (or a DOS prompt), change to the directory you unpacked it into, and type the following:

perl Makefile.PL
-wait for this program to complete. If Perl cannot find this file, type dir on DOS, or ls on Unix, to view the contents of the directory, then type perl followed by whichever file ends in .PL (capitals).

Next, type
make

-wait for this to finish, then type
make test

-and then

make install

Once you've typed all four of these commands (perl Makefile.PL -> make -> make test -> make install), the libwww-perl module will be installed where your copy of Perl can find it.
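If you want to confirm the install worked before going any further, you can ask Perl to load the module. This is only a minimal sketch: the sub name lwp_simple_installed is made up for this example and is not part of libwww-perl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Returns 1 if this copy of Perl can load LWP::Simple, 0 otherwise.
# (lwp_simple_installed is a name invented for this sketch.)
sub lwp_simple_installed {
    return eval { require LWP::Simple; 1 } ? 1 : 0;
}

if (lwp_simple_installed()) {
    # VERSION reports the version number of the module that was loaded.
    print "LWP::Simple is installed (version ", LWP::Simple->VERSION, ")\n";
}
else {
    # $@ holds the reason the module could not be loaded.
    print "LWP::Simple could not be loaded: $@";
}
```

If it reports that the module could not be loaded, rerun the four install commands and watch for errors.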
Next, open up your favorite Perl editor, and create a new file. In this file, you must say- "Hey, Perl-- I want to use the libwww-perl thingie I just installed, so stick it in my code", which can be roughly translated into Perl-speak by typing

use LWP::Simple;

So, not much of a program yet, eh? LWP stands for libwww-perl (duh)... which should help you remember it. What next? Well, you've got a program that knows this module's there- but... how do you use it?
Simply! This package is called "Simple" for just that reason. Go figure.
What you'll find is that your copy of Perl has suddenly been expanded to house several brand spanking new functions- that's right, simple, old-fashioned functions.



1. get($url);
2. getstore($url, $filename);
3. getprint($url);
4. mirror($url, $filename);


Oohh, ahhh! I'm sure you guys could memorize those right now. Let me tell you what each of those does, and how you're expected to use them.
First, you've got get($url). Simply take a scalar variable- let's say $html- and assign it the result of get("http://get_this_url/"). You can replace the string inside the get function with the URL of whatever website you'd like to get. So, this is an example of doing just that:

use LWP::Simple;
$html = get("http://www.megathink.com");
print $html;

A program that gets a webpage's HTML and crams it into a scalar
Don't forget the use LWP::Simple; at the top! Leaving it out will fudge everything up.
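One thing the short version above skips is what happens when the request fails: get() returns undef, so printing the result blindly gives you a warning or nothing at all. Here is a hedged sketch of the same idea with a check added. The fetch_page sub and the command-line handling are my own additions for illustration, not something LWP::Simple provides.

```perl
use strict;
use warnings;
use LWP::Simple;

# Fetch a page and return its HTML, or undef (with a warning) on failure.
# get() itself returns undef when the request does not succeed.
# (fetch_page is a helper name invented for this sketch.)
sub fetch_page {
    my ($url) = @_;
    my $html = get($url);
    warn "Could not fetch $url\n" unless defined $html;
    return $html;
}

# Usage: perl fetch.pl http://www.megathink.com
if (@ARGV) {
    my $html = fetch_page($ARGV[0]);
    print $html if defined $html;
}
```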
Let's say you wanna do this real quick- eliminate the need for a variable at all. Well- you've got the getprint($url); function to do that! Simply type getprint("http://www.whatever.com/");, and the program will automatically print the HTML of that website.
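getprint() also hands back the HTTP status code of the request, and LWP::Simple exports is_success() to test it. A small sketch of how you might use that, assuming you want the script to exit non-zero on failure (print_page is a name I made up for the example):

```perl
use strict;
use warnings;
use LWP::Simple;

# Print a page to STDOUT and report whether the request succeeded.
# getprint() returns the HTTP status code; on failure it also prints
# the status line to STDERR.
# (print_page is a helper name invented for this sketch.)
sub print_page {
    my ($url) = @_;
    my $status = getprint($url);
    return is_success($status);
}

# Usage: perl show.pl http://www.whatever.com/
if (@ARGV) {
    print_page($ARGV[0]) or exit 1;
}
```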
What if you wanna store the HTML to a file on your hard drive- for backup? Or, let's say you want to start your own cache of websites. That's just as easy! Type getstore("http://url", "stored.htm");, and badda bing, badda boom- you've got a new .htm file in your directory, loaded to the brim with whatever URL you requested. A working example, you say? Sure!
use LWP::Simple;
getstore("http://www.megathink.com", "temp.htm");

Well- that's cool, huh? No? You want to store a website only when it has changed-- like Google does, for example? That's no big deal-- change the function in the last example ("getstore") to mirror- leaving the parameters as they are- and the program will only download and store the HTML file if the page has been modified since the copy already stored on disk. Cool? Yup!
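Both getstore() and mirror() return the HTTP status code, so you can tell a fresh download from an unchanged page from a failure. The following is a rough sketch under my own assumptions: the mirror_page sub and its three result strings are invented for the example, while RC_NOT_MODIFIED and is_success() are exported by LWP::Simple.

```perl
use strict;
use warnings;
use LWP::Simple;

# Mirror a URL to a local file and describe what happened.
# mirror() returns the HTTP status code of the request.
# (mirror_page and its result strings are invented for this sketch.)
sub mirror_page {
    my ($url, $file) = @_;
    my $status = mirror($url, $file);
    return "unchanged" if $status == RC_NOT_MODIFIED;  # 304: local copy is current
    return "updated"   if is_success($status);         # a fresh copy was saved
    return "failed ($status)";
}

# Usage: perl mirror.pl http://www.megathink.com temp.htm
if (@ARGV == 2) {
    print mirror_page(@ARGV), "\n";
}
```

Run it twice against the same page: the second run should usually report "unchanged", provided the server supports the If-Modified-Since check.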
At this point, you have a ton of things you can do. Let's say you want to check for dead links. Simply get the HTML into a variable ($html = get("http://www.google.com/");), and use a couple of search strings and splits on it until you find all the <a href> tags-- next, pull out the href= value of each, and add each URL to an array. Create a loop (foreach $i (@array_of_links) {}) to cycle through each, and attempt to connect to them. If the link is bad, get() will return undef (an undefined value, which counts as false). Otherwise, get() will return the HTML. I don't want to ruin this for you- since I'm sure you'd love to try it on your own [yeaaaa, riiight].
Another "creative" idea is to write a proxy. If you run Apache on your machine (or IIS for you Windows users), you can now use these functions in CGI programs. YUP! Think of the bucks you can make for writing a proxy to get past that godforsaken NetNanny, or Bess, software your school/home/office forces onto you. Simply write a script to accept a query_string of a URL, and use getprint() to display it. Cha-ching!
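Going back to the dead-link idea, here is one way it could look. Treat it as a sketch, not the one true answer: it uses a regular expression instead of the search-strings-and-splits approach, it only checks absolute http:// and https:// links, and the extract_links and dead_links subs are names I made up. A regex is a rough stand-in for real HTML parsing; the HTML::LinkExtor module from CPAN is a sturdier choice for serious use.

```perl
use strict;
use warnings;
use LWP::Simple;

# Pull the href value out of every <a ...> tag in a chunk of HTML.
# (extract_links is a helper name invented for this sketch.)
sub extract_links {
    my ($html) = @_;
    my @links;
    while ($html =~ /<a\b[^>]*?\bhref\s*=\s*(["'])(.*?)\1/gis) {
        push @links, $2;
    }
    return @links;
}

# Fetch a page and return the links on it that could not be retrieved.
# (dead_links is a helper name invented for this sketch.)
sub dead_links {
    my ($page_url) = @_;
    my $html = get($page_url);
    die "Could not fetch $page_url\n" unless defined $html;

    my @dead;
    for my $link (extract_links($html)) {
        next unless $link =~ m{^https?://}i;    # skip relative and mailto: links
        push @dead, $link unless defined get($link);
    }
    return @dead;
}

# Usage: perl deadlinks.pl http://www.google.com/
if (@ARGV) {
    print "$_\n" for dead_links($ARGV[0]);
}
```

Since get() downloads every linked page in full, this gets slow on pages with many links; LWP::Simple also exports head(), which asks only for the headers and may be a lighter way to test each link.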
I hope you enjoyed reading this tutorial, and continue to contribute to the free Perl community. It was my pleasure writing this lil' file.
Please post feedback so I know whether or not I'm actually helping.

Happy New Year!


Other User Comments
1/2/2002 6:42:03 PM:T. E. Geek
I just wanted to say thank you to those who have voted. I'm always encouraged when someone benefits from something I do. Thank you :-) (If you'd like me to write a tutorial on anything you don't understand, simply post it- I'll write it asap)

 
1/18/2002 5:42:05 AM:Scratch Monkey
Very nice tutorial, well written and easy to understand even for a complete Perl lameo such as myself. Keep up the good work, looking forward to more of your tutorials in the future.

 
1/18/2002 5:23:34 PM:T. E. Geek
Thank you very much for your warm comment :-) I'm open to suggestions: If you guys have anything you'd like a tutorial written for- just post it as a comment. If no one says anything, I'll write a tutorial on writing a POP3 client- and eventually- how to use the AIM module to create chat bots ;-) Thanks for the support!

 
1/22/2002 4:54:06 PM:Tr1pX
Write a tutorial for making a client server application where you describe how to make the client communicate with the server. Please send a reply to this to tr1px@hackermail.com

 
1/26/2002 9:30:39 AM:kamal
this script is so good. umph!

 
2/19/2002 3:30:13 PM:Terry Paul
Thanks for sharing your knowledge of the subject. It's really kewl that there are still people like you out there doing good and sharing what you know! Keep up the awesome work bro!

 
3/3/2002 1:49:03 PM:Taylor
I'd like to know how to connect to AIM using Cold Fusion, ASP, VB, or Perl.

 
3/10/2002 4:02:53 AM:Flaxus
I got a good laugh and learned something useful at the same time. Thanks a million

 
3/18/2002 3:59:53 AM:Jay
How to use these functions is explained in the wwwlib manpage... This article was not needed.

 
3/19/2002 12:11:33 AM:T. E. Geek
That's a good point. It should also be noted that C++, C, php, perl, sql, vb, java, pascal, kylix, delphi, xml, JavaScript, VBScript, Bash, Ata, Lisp, Basic, Cobol, and for you mac users, Applescript are all various other computer-related utilities which come with documentation. Surely, one could master any of these languages quickly and easily with the documentation provided by each. (continued)

 
3/19/2002 12:11:53 AM:T. E. Geek
(continuation) My only question is why amazon.com offers all of those superfluous books on C++, OpenGL, ad nauseam, when anyone could simply learn what they crave through documentation. Documentation is not always user-friendly. I had hoped that this brief introduction would be a bit more friendly and easy to understand than the documentation provided. Furthermore, had you read the title of this article, you would realize that this posting was intended for those that do not know about LWP to begin with. Although this article was not intended for a user of your level, I still regret that my work was not to your liking.

 
3/20/2002 12:13:25 AM:T. E. Geek
On a happier note, as soon as life releases me from my utterly boring school-related responsibilities, I am going to write a tutorial on the Net-AIM module; perhaps even some basic chatter-bot theory. Thank you for being so supportive of this article! I promise many more in the future.

 
4/5/2002 1:44:31 AM:Marcel
First of all thx for your Manual. I still keep on smiling. In order to provide your Manual, i want to complete the Information for Windows-Perl-Users. - Since "ActivePerl 5.6.0.613" the LWP-Module is included ('think it was distributed in the Year 2000...) - It fits ;-) Last not least: This article was needed. Keep on going this way.

 
4/10/2002 7:39:15 PM:Allen
##########################
use LWP::Simple;
use strict;
use LWP::UserAgent;
use CGI qw ( :standard);
print "Content-type: text/html\n\n";
my $url='http://www.yahoo.com';
my $con = get $url;
print "$con";
########################
Questions:
1) It works fine and gets the whole page info of http://www.yahoo.com but PROBLEM: if I switch to a page to this web page, get nothing. Steps Replace: my $url='http://merchantaccount.quickbooks.com/j/mas/signup';
2) When I am in a webpage, such as yahoo page, I would like to select a radio button and press "Next" to continue. How could I modify the above to do it. Need help on it. Thanks Allen...

 
5/3/2002 3:23:28 PM:Ddl_Smurf
Hey, thanks, that looks simple enough. I do not code in perl, but it looks like a great language. I have experience with Delphi and VB, and C, and quite a few others, but I'm really having trouble getting through perl tutorials. Could you direct me to one that is as simple and friendly as yours ? Thanks, Best regards.

 
5/3/2002 8:25:06 PM:T. E. Geek
Well! I did write an introductory tutorial to the perl language itself awhile back ;-) If you're interested, parse the perl planetsourcecode directory, and see if you can find it. It should help you get your hands dirty ;-)

 

Copyright© 1997 by Exhedra Solutions, Inc. All Rights Reserved.  By using this site you agree to its Terms and Conditions.  Planet Source Code (tm) and the phrase "Dream It. Code It" (tm) are trademarks of Exhedra Solutions, Inc.