Description: agCrawler is a Grid-customised version of Apache Nutch.
Abstract: agCrawler is a customised version of Apache Nutch (http://nutch.apache.org/), a highly extensible and scalable open-source Web crawler. Its main goal is to discover resources on the Web (i.e. URLs), starting from a set of Web sites defined by the user.
The application is available for download at https://github.com/agrisfao/agrotagger/tree/master/crawler. It is a command-line application that comes with a set of Bash scripts, which can be run in a Linux environment.
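The URL discovery described in the abstract can be illustrated with a minimal sketch. It is not agCrawler's or Nutch's actual code: it only shows the general idea of breadth-first discovery from user-defined seed sites. The names discover, LinkExtractor, fetch, max_depth and the PAGES sample data are all hypothetical, and pages are served from an in-memory dictionary so that the example runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def discover(seeds, fetch, max_depth=2):
    """Breadth-first discovery of URLs reachable from the seed URLs.

    `fetch` is a hypothetical callable that returns the HTML of a URL
    as a string, or None when the page cannot be retrieved.
    """
    seen = set(seeds)
    queue = deque((url, 0) for url in seeds)
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # pages at the depth limit are recorded but not expanded
        html = fetch(url)
        if html is None:
            continue
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop the #fragment part.
            absolute, _ = urldefrag(urljoin(url, href))
            if absolute not in seen:
                seen.add(absolute)
                queue.append((absolute, depth + 1))
    return seen


# Hypothetical sample "Web": URL -> HTML content.
PAGES = {
    "http://example.org/": '<a href="/a">A</a> <a href="b.html#top">B</a>',
    "http://example.org/a": '<a href="http://example.org/">home</a> <a href="/c">C</a>',
    "http://example.org/b.html": "",
    "http://example.org/c": '<a href="/d">D</a>',
}

# /d is only linked from a page at the depth limit, so it is not reported.
found = discover(["http://example.org/"], PAGES.get, max_depth=2)
```

A production crawler such as Nutch additionally handles politeness rules, URL filtering, persistence and distributed execution, none of which is modelled here.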