forked from perception/dss
Daily Stormer Utilities
This is a little project to help me get out of my funk.
Goals
- Crawl all dailystormer articles.
- Save crawl results as JSON files.
- Be able to run the crawler again to get new articles.
  - This implies that I can keep track of what I've already crawled.
- Make it possible to do a full text search on those articles.
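The "save as JSON" and "keep track of what I've already crawled" goals could be sketched roughly as below. This is only a sketch under assumptions: the `CrawlStore` class, the `seen.json` index file, and the hash-based file names are all hypothetical and are not taken from the project itself.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");
const crypto = require("crypto");

// Hypothetical on-disk layout: one JSON file per article, plus a
// seen.json index listing every URL that has already been saved.
class CrawlStore {
  constructor(dir) {
    this.dir = dir;
    this.indexPath = path.join(dir, "seen.json");
    fs.mkdirSync(dir, { recursive: true });
    // Reload the index from a previous run, if there is one.
    this.seen = new Set(
      fs.existsSync(this.indexPath)
        ? JSON.parse(fs.readFileSync(this.indexPath, "utf8"))
        : []
    );
  }

  // True if this URL was saved in this run or an earlier one.
  has(url) {
    return this.seen.has(url);
  }

  // Writes the article as JSON and records its URL.
  // Returns false (and writes nothing) if the URL was already saved.
  save(article) {
    if (this.seen.has(article.url)) return false;
    // Assumed naming scheme: SHA-1 of the URL, so names are filesystem-safe.
    const name =
      crypto.createHash("sha1").update(article.url).digest("hex") + ".json";
    fs.writeFileSync(
      path.join(this.dir, name),
      JSON.stringify(article, null, 2)
    );
    this.seen.add(article.url);
    fs.writeFileSync(this.indexPath, JSON.stringify([...this.seen]));
    return true;
  }
}
```

Rewriting the whole index on every save is simple but slow for large crawls; an append-only log or a small database would probably scale better.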
Questions
Crawling
The recursive crawl that fetches old articles and the crawl that picks up new ones should be separate. The updater can assume a full crawl has already been done and just pull new articles down from the RSS feed. The recursive crawl doesn't have that luxury: it must find past articles and treat newly discovered tags and categories as new crawl targets.
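To make the difference concrete, here is a rough sketch of the two modes. Everything in it is an assumption for illustration: `recursiveCrawl`, `newFromFeed`, and the injected `fetchListing` function (which is assumed to return the article URLs and the further tag/category URLs found on one listing page) are hypothetical names, and no real fetching or RSS parsing is shown.

```javascript
// Recursive crawl: breadth-first walk over a frontier of listing pages.
// `fetchListing(url)` is a hypothetical async function that resolves to
// { articles: [...urls], listings: [...urls] } for one listing page.
async function recursiveCrawl(startUrls, fetchListing, seenArticles = new Set()) {
  const frontier = [...startUrls];
  const visitedListings = new Set(startUrls);
  const found = [];

  while (frontier.length > 0) {
    const url = frontier.shift();
    const { articles, listings } = await fetchListing(url);

    // Record articles that have not been seen before.
    for (const article of articles) {
      if (!seenArticles.has(article)) {
        seenArticles.add(article);
        found.push(article);
      }
    }

    // Newly discovered tags and categories become new crawl targets.
    for (const listing of listings) {
      if (!visitedListings.has(listing)) {
        visitedListings.add(listing);
        frontier.push(listing);
      }
    }
  }
  return found;
}

// Updater: assumes the big crawl is done, so it only needs the URLs
// from the feed and keeps the ones not yet crawled.
function newFromFeed(feedUrls, seenArticles) {
  return feedUrls.filter((url) => !seenArticles.has(url));
}
```

The updater never touches the frontier, which is why it can stay much simpler than the recursive crawl.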