perl

Screen Scraping HTML

14 minute read Published: 2005-04-06

We've all found useful information on the web. Occassionally, its even necessary to retrieve that information in an automated fashion. It could be just for your own amusement, possibly a new web service that hasn't yet published an API, or even a critical business partner who only exposes a web based interface to you.

Of course, screen scraping web pages is not the optimal solution to any problem, and I highly advise you to look into APIs or formal web services that will provide a more consistent and intentional programming interface. Potential problems could arise for a number of reasons.

Regular Expression Primer

19 minute read Published: 2004-03-24

"Regular Expression" is a fancy way to say "pattern matcher." Humans can match patterns with relative ease. A machine has a bit more difficulty deciphering patterns, especially in text. As computing became more powerful, the methods for matching text grew into more flexible dialects.

Regular expressions can be one of the toughest concepts to grasp and use effectively in any programming language. Perl is no exception as its regular expressions engine is perhaps the most advanced regex engine in existence. Its power and flexibility also serve to confuse and intimidate many new comers. It is important to understand the Regular Expression engine as its often the cause of serious bottlenecks in programs of all shapes and sizes.


find me: