Pages tagged screenscraping:

HTML Scraping with scRUBYt! for Fun and Profit
http://advent2008.hackruby.com/past/2008/12/23/html_scraping_with_scrubyt_for_fun_and_profit/

Navigation is fairly obvious, I guess. Besides fetch, which should always be present as the first step, the other actions are fill_textfield, fill_textarea, click_link, check_checkbox, check_radiobutton, select_option and submit; if you can’t submit the form automatically for some reason, click_by_xpath is the last resort.
Peter Szinek walks us through the process of scraping data from web sites with scRUBYt!. Impress your friends (and even your mother!) this Christmas with your slick data mining skillz!
Web Spidering and Data Extraction with scRUBYt! | Ruby Pond
http://rubypond.com/articles/2008/12/09/web-spidering-and-data-extraction-with-scrubyt/
David Ziegler's Blog - A Python Script to Automatically Extract Excerpts From Articles
http://blog.davidziegler.net/post/122176962/a-python-script-to-automatically-extract-excerpts-from
I recently had to write a script that takes a link to an article and returns a title and brief excerpt or description of that article. Ideally, the excerpt should be the first few sentences from the body of the article. The first thing I struggled with was something I thought would be trivial: fetching the contents of the webpage.
text=re.compile("DOCTYPE")
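The excerpt idea described above (fetch a page, then return its title and the first few sentences of the body) can be sketched roughly as follows. This is not the script from the linked post: it is a minimal illustration using only the Python standard library, and the names ExcerptParser, fetch and extract_excerpt are hypothetical. It assumes the article body lives in <p> elements and that sentences end with ".", "!" or "?", which will not hold for every page.

```python
import re
import urllib.request
from html.parser import HTMLParser


class ExcerptParser(HTMLParser):
    """Collect the <title> text and the text of each <p> element."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.paragraphs = []
        self._in_title = False
        self._in_p = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "p":
            # Start a new paragraph buffer for every <p> we see.
            self._in_p = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "p":
            self._in_p = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_p:
            self.paragraphs[-1] += data


def fetch(url, timeout=10):
    """Download a page and decode it (hypothetical helper; assumes UTF-8 if no charset is sent)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def extract_excerpt(html, max_sentences=2):
    """Return (title, excerpt), where excerpt is the first few sentences of <p> text."""
    parser = ExcerptParser()
    parser.feed(html)
    # Normalise whitespace inside each paragraph and skip empty ones.
    body = " ".join(" ".join(p.split()) for p in parser.paragraphs if p.strip())
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", body)
    return parser.title.strip(), " ".join(sentences[:max_sentences])
```

In use, something like extract_excerpt(fetch(url)) would give a (title, excerpt) pair. The doctype declaration is ignored by the standard-library parser, so no separate DOCTYPE filter is needed in this sketch.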