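A Detailed Look at the Scrapy Crawler Framework: Portia, Visual Scraping for Scrapy

Any discussion of scraping web data and page content has to mention the Scrapy framework. Pyspider, however, is much simpler to use precisely because it ships with a web UI. Given a visual UI of its own, Scrapy becomes considerably more capable.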

Portia is a tool that allows you to visually scrape websites without any programming knowledge. With Portia you can annotate a web page to identify the data you wish to extract, and Portia will work out from these annotations how to scrape data from similar pages.
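
To make the idea of annotation-driven extraction concrete, here is a minimal sketch in plain Python. It is not Portia's implementation or API. It assumes an annotation can be reduced to a (tag, CSS class) pair per field, and the names AnnotatedExtractor, extract, and ANNOTATIONS, as well as the sample page, are hypothetical. The point is only that a mapping recorded once on a sample page can be replayed on pages that share its layout.

```python
from html.parser import HTMLParser


class AnnotatedExtractor(HTMLParser):
    """Collects text from the first element matching each annotation.

    Illustration only: an annotation here is a hypothetical
    (tag, css_class) pair, not Portia's real annotation format.
    """

    def __init__(self, annotations):
        super().__init__()
        self.annotations = annotations  # field name -> (tag, css class)
        self.result = {}
        self._current = None  # field whose text is being captured

    def handle_starttag(self, tag, attrs):
        if self._current is not None:
            return  # already capturing a field; ignore nested matches
        classes = (dict(attrs).get("class") or "").split()
        for field, (want_tag, want_class) in self.annotations.items():
            if field not in self.result and tag == want_tag and want_class in classes:
                self._current = field
                self.result[field] = ""
                break

    def handle_data(self, data):
        if self._current is not None:
            self.result[self._current] += data

    def handle_endtag(self, tag):
        # Stop capturing at the first closing tag of the annotated type.
        if self._current is not None and tag == self.annotations[self._current][0]:
            self.result[self._current] = self.result[self._current].strip()
            self._current = None


def extract(html, annotations):
    """Apply the same annotations to any page with a similar layout."""
    parser = AnnotatedExtractor(annotations)
    parser.feed(html)
    return parser.result


# Hypothetical annotations, as if recorded once on a sample product page.
ANNOTATIONS = {"title": ("h1", "name"), "price": ("span", "price")}

page = (
    '<html><body><h1 class="name"> Blue Widget </h1>'
    '<span class="price">$9.99</span></body></html>'
)
print(extract(page, ANNOTATIONS))
```

The sketch stops capturing at the first closing tag of the annotated type, so it does not handle an annotated element that contains another element of the same tag. A real tool has to cope with that and with pages whose markup varies.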

Try it out

To try Portia for free without installing anything, sign up for an account at Scrapinghub and use the hosted version.

Running Portia

The easiest way to run Portia is using Vagrant.

Clone the repository:

git clone https://github.com/scrapinghub/portia

Then inside the Portia directory, run:

vagrant up

For more detailed instructions, and alternatives to using Vagrant, see the Installation docs.

Documentation

Documentation is available online. Its source files can be found in the docs directory of the repository.

