Small dockerized tool to regularly scrape http://knack.news and write all posts to an sqlite database.

Knack-Scraper does exactly what its name suggests: it scrapes knack.news and writes the posts to an SQLite database for later use.

Example for .env

NUM_THREADS=8
NUM_SCRAPES=100
DATABASE_LOCATION='./data/knack.sqlite'
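How the scraper consumes these values is not shown here. As a rough sketch, assuming the three variables end up in the process environment (for example via docker-compose), they could be read like this. The function name `load_settings` and the fallback defaults are illustrative assumptions, not taken from the project's code.

```python
import os


def load_settings(env=os.environ):
    """Read the three settings from a mapping such as os.environ.

    Hypothetical helper: the defaults below simply mirror the example
    .env values and are an assumption, not the project's real defaults.
    """
    # Some .env loaders keep surrounding quotes, so strip them defensively.
    location = env.get("DATABASE_LOCATION", "./data/knack.sqlite").strip("'\"")
    return {
        "num_threads": int(env.get("NUM_THREADS", "8")),
        "num_scrapes": int(env.get("NUM_SCRAPES", "100")),
        "database_location": location,
    }
```

Calling `load_settings()` with no argument reads the current environment; passing a plain dict makes it easy to try out different values.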

Run once

python main.py
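
After a run, the resulting database can be inspected with Python's built-in `sqlite3` module. The sketch below makes no assumption about the schema: it only lists whatever tables exist in the file. The helper name `list_tables` is made up for this example.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in the SQLite file at db_path."""
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master is SQLite's built-in catalog of schema objects.
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

With the example .env above, `list_tables("./data/knack.sqlite")` would show which tables the scraper created.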