transmogrify.webcrawler-1.2.1-7.lbn19.noarch

Package Attributes
RPM transmogrify.webcrawler-1.2.1-7.lbn19.noarch.rpm
Architecture noarch
Size 1215826
Created 2019/09/30 06:55:35 UTC
Package Specification
Summary Crawling and feeding html content into a transmogrifier pipeline
Group Application/Internet
License ZPL
Home Page http://pypi.python.org/packages/source/t/transmogrify.webcrawler/transmogrify.webcrawler-1.2.1.zip
Description

A source blueprint for crawling content from a site or local html files.

Webcrawler imports HTML from a live website, from a folder on disk, or from a folder on disk containing HTML that originally came from a live website and may still have absolute links referring to that website.

To crawl a live website, supply the crawler with a base HTTP URL to start crawling from. Every other URL you want from the site must start with this base URL.
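
The base URL rule can be illustrated with a small sketch. This is not the package's own code: the helper name `in_scope`, the example URLs, and the use of Python 3's `urllib.parse` are all assumptions made for illustration. It only shows the idea that a discovered link is kept when its absolute form starts with the base URL.

```python
from urllib.parse import urljoin, urldefrag


def in_scope(base_url, href, page_url=None):
    """Return the absolute URL for href if it starts with base_url, else None.

    Hypothetical helper: resolves href against the page it was found on
    (or against base_url when no page is given) and drops any #fragment.
    """
    absolute = urljoin(page_url or base_url, href)
    absolute, _fragment = urldefrag(absolute)
    return absolute if absolute.startswith(base_url) else None


# Example values are made up for illustration.
base = "http://example.com/docs/"
links = [
    "intro.html",                 # relative link, resolves under the base
    "/docs/api/index.html#top",   # under the base once the fragment is dropped
    "/blog/post.html",            # same host, but outside the base prefix
    "http://other.example.org/",  # different site
]

kept = [url for url in (in_scope(base, href) for href in links) if url]
for url in kept:
    print(url)
```

Under this sketch, a base of `http://example.com/docs/` would keep pages below `/docs/` and skip everything else on the host, so choosing a base that is too narrow silently excludes content.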

Requires
rpmlib(PayloadFilesHavePrefix)  
rpmlib(FileDigests)  
rpmlib(CompressedFileNames)  
rpmlib(PartialHardlinkSets)  
rpmlib(PayloadIsXz)  
Provides
python2.7dist(transmogrify.webcrawler)
python2dist(transmogrify.webcrawler)
transmogrify.webcrawler
Obsoletes
transmogrify.webcrawler-egginfo