Didn't get any work done again today.
First off, someone already fixed some SpiderKeeper bugs here, but when I tried it I found another one: with too many scheduled jobs, nothing runs at all. The culprit is the APScheduler library, so I patched it again and added two new options:

--executor  ThreadPoolExecutor size, default: 30
--process   processpool (ProcessPoolExecutor) size, default: 3

If things get stuck, raise these two values (see the sketch below for what they presumably control). I haven't hit any other bugs so far; if that sounds too risky, just drive scrapyd directly, and if I do run into problems I'll post an update.
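For context, this is roughly the knob those two flags turn: the sizes of APScheduler's executor pools. A minimal sketch under that assumption; the values and variable names are illustrative, not SpiderKeeper's actual code:

```python
# Sketch: sizing APScheduler's executors, which is presumably what
# --executor and --process tune inside SpiderKeeper.
from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.executors.pool import ThreadPoolExecutor, ProcessPoolExecutor

executors = {
    # 'default' runs ordinary jobs; if the pool is too small, queued crawls
    # pile up and the scheduler appears "stuck" once you have many schedules.
    'default': ThreadPoolExecutor(30),      # maps to --executor (default 30)
    'processpool': ProcessPoolExecutor(3),  # maps to --process (default 3)
}

scheduler = BackgroundScheduler(executors=executors)
scheduler.start()
```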
One more reminder: SpiderKeeper and scrapyd ship with essentially no password protection out of the box (SpiderKeeper's basic auth defaults to admin/admin), so be sure to put them behind something like Nginx; exposing them directly is dangerous.
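For example, a hypothetical Nginx reverse-proxy config that adds its own basic auth in front of SpiderKeeper; the domain and htpasswd path are placeholders, not anything SpiderKeeper ships with:

```nginx
server {
    listen 80;
    server_name spiderkeeper.example.com;       # placeholder domain

    auth_basic "Restricted";
    auth_basic_user_file /etc/nginx/.htpasswd;  # create with: htpasswd -c /etc/nginx/.htpasswd youruser

    location / {
        proxy_pass http://127.0.0.1:5000;       # SpiderKeeper's default port
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```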
A scalable admin UI for spider services.
Currently supported spider service: scrapyd
pip install spiderkeeper-2-1
spiderkeeper [options]
Options:
-h, --help show this help message and exit
--host=HOST host, default:0.0.0.0
--port=PORT port, default:5000
--username=USERNAME basic auth username, default: admin
--password=PASSWORD basic auth password, default: admin
--type=SERVER_TYPE access spider server type, default: scrapyd
--server=SERVERS servers, default: ['http://localhost:6800']
--database-url=DATABASE_URL
SpiderKeeper metadata database default: sqlite:////home/souche/SpiderKeeper.db
--no-auth disable basic auth
-v, --verbose log level
--executor ThreadPoolExecutor size, default: 30
--process processpool (ProcessPoolExecutor) size, default: 3
example:
spiderkeeper --server=http://localhost:6800
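If the scheduler stalls under many jobs, you can also raise the pool sizes here; I'm assuming the new flags take values the same way the other options do, and 60/5 are just illustrative numbers:

spiderkeeper --server=http://localhost:6800 --executor=60 --process=5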
Visit:
- web UI: http://localhost:5000
1. Create a project
2. Use [scrapyd-client](https://github.com/scrapy/scrapyd-client) to generate an egg file:
scrapyd-deploy --build-egg output.egg
3. Upload the egg file (make sure the scrapyd server is running; see the quick check below)
4. Done, enjoy!
- API (Swagger): http://localhost:5000/api.html
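To verify scrapyd is actually up before uploading, you can hit its status endpoint (default port 6800):

curl http://localhost:6800/daemonstatus.json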
This project is licensed under the MIT License.
Contributions are welcome!