This repository documents a full-stack web scraping platform for use in machine learning.
Architecture
We first break the architecture into four distinct components: Front-End, API, Scrapers, and Database. The user submits input, such as a YouTube URL, through a form on the front-end, which sends it to the API. The scrapers, invoked through the API, then pull the necessary data and save it to the database, after which the data is served back to the front-end.
The tech stack is as follows:
- Front-End - JavaScript
- API - Express
- Scraper - Puppeteer
- Database - MySQL (TypeORM)
We also need Node.js, npm, and MySQL.
The architecture consists of four components:
Front End
For the front-end we will have a header, an input box, and a button. Below these we will have boxes that render the relevant information from JSON. The form sends its data to the API.
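The render step could be sketched as a small function that turns the JSON returned by the API into the markup for one render box. This is only an illustration: the field names follow the database columns listed in this document, while the function names `escapeHtml`, `renderCreator`, and `renderCreators` and the CSS class name are assumptions.

```javascript
// Escape text before placing it in HTML so scraped values cannot inject markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Build the markup for one render box from a creator record.
// Assumed record shape: { id, name, avatar, channelURL }.
function renderCreator(creator) {
  return [
    '<div class="creator-card">',
    `  <img src="${escapeHtml(creator.avatar)}" alt="${escapeHtml(creator.name)}">`,
    `  <a href="${escapeHtml(creator.channelURL)}">${escapeHtml(creator.name)}</a>`,
    '</div>',
  ].join('\n');
}

// Render a list of records, one box per creator.
function renderCreators(creators) {
  return creators.map(renderCreator).join('\n');
}
```

In the browser, the returned string could be assigned to a container element's `innerHTML` once the JSON has been fetched from the API.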
API
We create a single route with two methods, GET and POST. We use Node.js and Express, a simple backend framework.
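A minimal sketch of that single route follows, written as two Express-style handlers so the logic can be read on its own. The route path `/creators`, the port, the handler names, and the in-memory array standing in for the database are all assumptions, not the repository's actual code.

```javascript
// In-memory stand-in for the database so the sketch runs on its own.
const creators = [];

// GET handler: return every stored creator as JSON.
function listCreators(req, res) {
  res.json(creators);
}

// POST handler: accept a channel URL from the front-end form.
// In the full platform this is where the scraper would be called.
function addCreator(req, res) {
  const channelURL = req.body && req.body.channelURL;
  if (typeof channelURL !== 'string' || channelURL.trim() === '') {
    res.status(400).json({ error: 'channelURL is required' });
    return;
  }
  const creator = { id: creators.length + 1, channelURL: channelURL.trim() };
  creators.push(creator);
  res.status(201).json(creator);
}

// Wiring with Express (assumes `express` and `body-parser` are installed):
//   const express = require('express');
//   const bodyParser = require('body-parser');
//   const app = express();
//   app.use(bodyParser.json());
//   app.route('/creators').get(listCreators).post(addCreator);
//   app.listen(3000);
```

Keeping the handlers separate from the wiring means they can be exercised with plain objects in place of `req` and `res`, without starting a server.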
Scraper
This function takes a URL, reaches out to YouTube, fetches the relevant data, and then stores it in the database.
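The fetch half of the scraper might look like the sketch below, which receives an already-launched Puppeteer browser and returns a record for the caller to store. The function name `scrapeChannel` is hypothetical, and both CSS selectors are placeholders rather than YouTube's real markup; they would need to be replaced after inspecting the actual page.

```javascript
// Scrape a channel page with an already-launched Puppeteer browser.
// The browser is passed in so the caller controls launch and shutdown:
//   const puppeteer = require('puppeteer');
//   const browser = await puppeteer.launch();
// NOTE: both selectors are placeholders, not YouTube's real markup.
const NAME_SELECTOR = '#channel-name';
const AVATAR_SELECTOR = '#avatar img';

async function scrapeChannel(browser, channelURL) {
  const page = await browser.newPage();
  try {
    await page.goto(channelURL, { waitUntil: 'networkidle2' });
    // $eval runs the callback in the page against the first matching element.
    const name = await page.$eval(NAME_SELECTOR, (el) => el.textContent.trim());
    const avatar = await page.$eval(AVATAR_SELECTOR, (el) => el.src);
    return { name, avatar, channelURL };
  } finally {
    // Close the tab even if navigation or a selector lookup fails.
    await page.close();
  }
}
```

Passing the browser in, rather than launching it inside the function, lets one browser instance be reused across requests and keeps the function easy to exercise with a stand-in object.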
Database
We use MySQL here. Each record stores an id, name, avatar, and channelURL.
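Since the stack pairs MySQL with TypeORM, the four columns could be described with a definition object in the shape that TypeORM's `EntitySchema` accepts. The entity name `Creator`, the table name `creators`, and the column types are assumptions; only the column names come from this document.

```javascript
// Plain definition object in the shape TypeORM's EntitySchema accepts.
// With TypeORM installed it could be wrapped like this:
//   const { EntitySchema } = require('typeorm');
//   module.exports = new EntitySchema(creatorDefinition);
const creatorDefinition = {
  name: 'Creator',        // entity name (assumed)
  tableName: 'creators',  // table name (assumed)
  columns: {
    // Auto-incremented primary key.
    id: { primary: true, type: 'int', generated: true },
    // Column types below are assumed; adjust to the real data.
    name: { type: 'varchar' },
    avatar: { type: 'varchar' },
    channelURL: { type: 'varchar' },
  },
};
```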
To run the program
First, go into the server directory and initialize the project
$ npm init
Install all the necessary packages
$ npm install express
$ npm install body-parser
Run the index.js script
$ node index.js
Thanks to Aron from Uber