### Run: ###
* Run `python tokopedia_crawler.py`
### Configuration: ###
* Ensure that the required tables have already been created.
* Copy the sample configuration: `cp conf.json.sample conf.json`
* Install the Zyte certificate: https://docs.zyte.com/smart-proxy-manager/next-steps/fetching-https-pages-with-smart-proxy.html#fetching-https-pages-with-smart-proxy
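
A minimal sketch of how a script could check that `conf.json` exists before starting. The helper name `load_config` is hypothetical and not part of this repository, and no particular keys are assumed, since the contents of `conf.json.sample` are not described here.

```python
import json
from pathlib import Path


def load_config(path="conf.json"):
    """Load the crawler configuration, failing early if the file is missing.

    Hypothetical helper for illustration; the real crawler may read its
    configuration differently.
    """
    config_path = Path(path)
    if not config_path.is_file():
        raise FileNotFoundError(
            f"{config_path} not found; copy conf.json.sample to conf.json first"
        )
    with config_path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

Failing early with a clear message is easier to debug than a missing-key error partway through a crawl.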
### Notes: ###
* A cron job can be set up to run the 'Master' every minute.
* It is expected to capture all product URLs in ~107 minutes.
* It makes only 2 API calls per minute (3 in the first minute) to prevent IP blocking.
* Any number of slaves can be added.
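
As a rough check on the figures above, the sketch below totals the API calls for a full run. It assumes the run lasts exactly 107 minutes and that the stated rate (3 calls in the first minute, 2 in every later minute) holds throughout; the function names are illustrative only.

```python
def calls_in_minute(minute):
    """API calls made in a given minute (1-indexed): 3 in the first, 2 after."""
    if minute < 1:
        raise ValueError("minute is 1-indexed")
    return 3 if minute == 1 else 2


def total_calls(minutes):
    """Total API calls over a run lasting the given number of minutes."""
    return sum(calls_in_minute(m) for m in range(1, minutes + 1))


# 3 calls in minute 1, plus 2 calls in each of the remaining 106 minutes.
print(total_calls(107))  # 215
```

Under these assumptions a complete run makes about 215 API calls.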