Gerapy
Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Scrapyd-Client, Scrapyd-API, Django and Vue.js.
Documentation
Documentation is available online at https://docs.gerapy.com/ and https://github.com/Gerapy/Docs.
Support
Gerapy is developed for Python 3.x. Python 2.x may be supported later.
Usage
Install Gerapy with pip:
pip3 install gerapy
After installation, follow the steps below to run the Gerapy server.
If Gerapy installed successfully, the gerapy command is available. If not, check the installation.
First use this command to initialize the workspace:
gerapy init
This creates a folder named gerapy. You can also specify the name of the workspace:
gerapy init <workspace>
Then cd into this folder and initialize the database:
cd gerapy
gerapy migrate
Next, create a superuser:
gerapy createsuperuser
Then start the server:
gerapy runserver
Then visit http://localhost:8000 to use Gerapy. The admin management backend is available at http://localhost:8000/admin.
To make Gerapy reachable from other hosts, pass a host and port to runserver:
gerapy runserver 0.0.0.0:8000
The server then listens on all network interfaces on port 8000.
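The steps above (init, migrate, createsuperuser, runserver) can also be scripted. The following is a minimal sketch, not part of Gerapy itself: the function names setup_commands and bootstrap are hypothetical, and it assumes the gerapy command is on the PATH and that the workspace does not exist yet.

```python
import shutil
import subprocess


def setup_commands(host_port=None):
    """Return the gerapy commands to run inside the workspace, in order."""
    runserver = ["gerapy", "runserver"]
    if host_port:  # e.g. "0.0.0.0:8000" to listen on all interfaces
        runserver.append(host_port)
    return [
        ["gerapy", "migrate"],          # initialize the database
        ["gerapy", "createsuperuser"],  # may prompt for credentials
        runserver,                      # blocks until the server is stopped
    ]


def bootstrap(workspace="gerapy", host_port=None):
    """Create the workspace, then run each setup command inside it."""
    if shutil.which("gerapy") is None:
        raise RuntimeError("gerapy command not found; check the installation")
    # Equivalent to: gerapy init <workspace>
    subprocess.run(["gerapy", "init", workspace], check=True)
    for command in setup_commands(host_port):
        # cwd=workspace plays the role of "cd <workspace>" in the manual steps
        subprocess.run(command, cwd=workspace, check=True)
```

Calling bootstrap() mirrors the manual steps with the default workspace name, while bootstrap("crawlers", "0.0.0.0:8000") would use a workspace named crawlers (an example name) and listen on all interfaces.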
In Gerapy, you can create a configurable project and then configure it and generate Scrapy code automatically. This module is still unstable, and we are working to refine it.
You can also drag an existing Scrapy project into the projects folder. After you refresh the web page, it appears on the Project Index Page marked as un-configurable, but you can still edit the project through the web page.
To deploy, go to the Deploy Page. First build your project and add a client on the Client Index Page, then deploy the project by clicking the deploy button.
After deployment, you can manage jobs on the Monitor Page.
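Since Gerapy is built on Scrapyd, it can help to check a client directly when a deployment or job does not show up. The sketch below is not Gerapy code; it assumes a client is a Scrapyd service on Scrapyd's default port 6800 and uses Scrapyd's listjobs.json and daemonstatus.json endpoints. The helper names and the project name demo are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def scrapyd_url(host, endpoint, port=6800, **params):
    """Build a URL for a Scrapyd JSON endpoint such as listjobs.json."""
    url = f"http://{host}:{port}/{endpoint}"
    if params:
        url += "?" + urlencode(params)
    return url


def list_jobs(host, project, port=6800, timeout=5):
    """Fetch the pending, running and finished jobs of one project."""
    url = scrapyd_url(host, "listjobs.json", port, project=project)
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example (requires a reachable Scrapyd service):
# print(list_jobs("localhost", "demo"))
```

The same scrapyd_url helper can target daemonstatus.json to confirm that the client is running before adding it on the Client Index Page.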
Docker
Just run this command:
docker run -d -v ~/gerapy:/app/gerapy -p 8000:8000 germey/gerapy
Gerapy then runs on port 8000. You can log in with the temporary admin account (username: admin, password: admin). Change the password afterwards for safety.
Command Usage:
docker run -d -v <workspace>:/app/gerapy -p <public_port>:<container_port> germey/gerapy
Specify the workspace to mount as the Gerapy workspace with -v <workspace>:/app/gerapy and the server port with -p <public_port>:<container_port>.
If you run Gerapy with Docker, you can visit the Gerapy site at, for example, http://localhost:8000 right away; no other initialization is needed.
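The Docker command template above can be filled in programmatically, for instance when starting several instances on different ports. This is a small sketch under stated assumptions: the function name docker_run_command is hypothetical, and the defaults simply mirror the 8000:8000 mapping shown earlier.

```python
import os


def docker_run_command(workspace, public_port=8000, container_port=8000):
    """Compose the docker run invocation for the germey/gerapy image."""
    # Expand "~" ourselves, because no shell is involved when the list
    # is passed to subprocess.run
    workspace = os.path.expanduser(workspace)
    return [
        "docker", "run", "-d",
        "-v", f"{workspace}:/app/gerapy",         # mount the workspace
        "-p", f"{public_port}:{container_port}",  # publish the server port
        "germey/gerapy",
    ]
```

The returned list can be passed to subprocess.run(..., check=True); for example, docker_run_command("/srv/gerapy", 9000) publishes the container's port 8000 on host port 9000 (the path /srv/gerapy is only an example).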
Communication
If you have any questions or ideas, you can open Issues or Pull Requests. Your suggestions are really important to us. Thanks for your contribution.