scRUBYt! 0.3.1 released


Hello all,

scRUBYt! version 0.3.1 has been released with plenty of new features
and bugfixes based on your feedback. Enjoy!

============
What's this?
============

scRUBYt! is a very easy to learn and use, yet powerful Web scraping
framework based on Hpricot and mechanize. Its purpose is to free you
from the drudgery of web page crawling, looking up HTML tags,
attributes, XPaths, form names and other typical low-level web scraping
woes by figuring these out from your examples copy'n'pasted from the Web
page.

===========
What's new?
===========

[NEW] complete rewrite of the output system, creating
       a solid foundation for more robust output functions
       (credit: Neelance)
[NEW] logging - no annoying puts messages anymore!
       (credit: Tim Fletcher)
[NEW] can index an example - e.g.
       link 'more[5]'
       semantics: give me the 6th element with the text 'more'
[NEW] can use XPath checking an attribute value, like
       "//div[@id='content']"
[NEW] default values for missing elements (first version was done in
       0.2.8 but it did not work for all cases)
[NEW] possibility to click a button by its text (instead of its index)
       (credit: Nick Merwin)
[NEW] clicking radio buttons
[NEW] can click on image buttons (by specifying the name of the button)
[NEW] possibility to extract a URL in one step, like so:
       link 'The Difference/@href'
       i.e. give me the href attribute of the element matched by the
       example 'The Difference'
[NEW] new way to match an element of the page:
       div 'div[The Difference]'
       means 'return the div which contains the string "The Difference"'.
       This is useful if the XPath of the element is non-constant across
       the same site (e.g. sometimes a banner or ad is added, sometimes
       not etc.)
[NEW] clicking image maps; at the moment this is achieved by specifying
       an index, like
       click_image_map 3
       which means click the 4th link in the image map
[FIX] automatically replacing \240 (non-breaking space) with a space
       in the preprocessing phase
[FIX] correctly downloading an image if the src
       attribute has a leading space, as in
       <img src=' /files/downloads/images/image.jpg'/>
[FIX] Other misc fixes - a ton of them!
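
To make the new example notations above concrete, here is a minimal
sketch in plain Ruby of how such example strings could be taken apart.
It is not scRUBYt!'s actual implementation: the parse_example helper
and the hash keys it returns are hypothetical names chosen for
illustration only, and the real framework may resolve ambiguous cases
differently.

```ruby
# Hypothetical helper (NOT part of scRUBYt!): splits an example string
# into its parts, following the notations listed in the changelog.
def parse_example(example)
  case example
  when /\A(.+)\[(\d+)\]\z/
    # 'more[5]' -> text 'more', zero-based index 5 (i.e. the 6th match)
    { kind: :indexed, text: $1, index: $2.to_i }
  when /\A(\w+)\[(.+)\]\z/
    # 'div[The Difference]' -> the div containing the given string.
    # Checked after the indexed form, so 'div[5]' counts as indexed.
    { kind: :containing, tag: $1, text: $2 }
  when %r{\A(.+)/@([\w:-]+)\z}
    # 'The Difference/@href' -> attribute of the element matched by text
    { kind: :attribute, text: $1, attribute: $2 }
  else
    # Anything else is a plain text example
    { kind: :text, text: example }
  end
end

# Usage: inspect how each notation from the list above is interpreted.
p parse_example('more[5]')
p parse_example('div[The Difference]')
p parse_example('The Difference/@href')
```

Note that the index is zero-based, which is why 'more[5]' selects the
6th element and click_image_map 3 clicks the 4th link.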

========
Comments
========

The win32 version is being built as I write this, so it will
be available soon.

Please keep the feedback coming - bug reports, questions, suggestions
are warmly welcome at the scRUBYt! forum - http://agora.scrubyt.org.

Cheers,
The scRUBYt! team - http://scrubyt.org
