Small Web list
This is a fork of kagisearch/smallweb.
The purpose of this Small Web repo is to collect the RSS/Atom feeds of small, individual websites (personal blogs, independent creators) rather than big commercial output, and to make sure that all URLs are human-verified.
The repo over at Kagi was a great start, but eventually the CEO started vibe-adding domains that absolutely don't belong in such a list. So I've decided to fork it and remove any lines he's added since 2025-11-01.
The plan for this fork
I've kept the current 'upstream' files in the .original.txt files for now. The current .txt files are the original ones with the lines Vlad added since that date stripped out, so those files SHOULD MOSTLY BE FINE. BUT because I wouldn't be surprised if accepting PRs was also pretty loose at some point, I figured it wouldn't be a bad idea to start a manual review process. That's what the .verified.txt files are for. They're empty at the start of this fork, and the plan is to fill them over time as I go over the URLs in the .txt files. Once everything is verified, the .verified.txt files will go away and we'll stay with the normal .txt files. So if you have an application that uses the Kagi smallweb .txt files, you can use this repo instead. (I wouldn't use the .verified.txt files, because those will be much smaller and will eventually get deleted anyway once everything has been verified.)
I've also removed the app/Docker stuff from the original repo. I don't plan to maintain a viewing app or combined RSS feed myself. I'm not against someone contributing that, BUT I'm more in favor of you maintaining your own app while this repo sticks to maintaining the list.
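To see how the three kinds of files relate, a small sketch like the following may help. It assumes only the file names used in this repo; the helper names `lines_only_in` and `review_progress` are made up for illustration and are not part of the repo.

```shell
# Print lines that are in the first file but not in the second.
# Pure awk, so neither input needs to be sorted.
lines_only_in() {
    awk 'FILENAME == ARGV[1] { seen[$0] = 1; next } !($0 in seen)' "$2" "$1"
}

# Report review progress: non-empty lines in the verified file
# versus non-empty lines in the main list.
review_progress() {
    verified=$(awk 'NF { n++ } END { print n + 0 }' "$1")
    total=$(awk 'NF { n++ } END { print n + 0 }' "$2")
    echo "$verified of $total feeds verified"
}

# Example calls (run from the repo root):
#   lines_only_in smallweb.original.txt smallweb.txt    # feeds stripped from upstream
#   lines_only_in smallweb.txt smallweb.verified.txt    # feeds still awaiting review
#   review_progress smallweb.verified.txt smallweb.txt
```

The same calls should work for the smallyt and smallcomic files, since they follow the same naming scheme.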
What has yet to be done
- Maybe reconsider some of the guidelines for inclusion? (already made some changes like removing the recency criteria)
- Consider removing the English-only requirement; this seems like a very American thing to have been decided in the original repo. BUT it probably wouldn't be a bad idea to have languages listed or something, although that would then result in a CSV instead of a txt.... 🤷 tbd (writing this now: maybe we could have a CSV and then generate various .txt files based on certain criteria?)
- Remove YouTube channels that have now gone above the 100k subscriber limit.
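The CSV idea in the list above is still tbd, but a rough sketch of what generating a .txt from a CSV could look like is shown here. Everything about the CSV is an assumption: the file name `smallweb.csv`, the header row, the two-column `url,lang` layout, and URLs that contain no commas. The helper name `feeds_for_lang` is hypothetical too.

```shell
# Print the feed URLs for one language from a hypothetical CSV with a
# header row and two columns: url,lang. Output is sorted in byte order.
feeds_for_lang() {
    awk -F, -v want="$2" 'NR > 1 && $2 == want { print $1 }' "$1" | LC_ALL=C sort
}

# Example call (smallweb.csv does not exist in this repo yet):
#   feeds_for_lang smallweb.csv en > smallweb.txt
```

A real version would probably want a proper CSV parser once URLs with commas or quoted fields show up.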
⚠️ Guidelines for site inclusion to the list ⚠️
Add a new personal blog RSS/Atom feed to the list. Rules:
- If submitting your own website, you must add at least 2 other sites that are not yours (and are not in list yet) in the same commit.
- Locate and submit the RSS/Atom feed of the website. Place it in the file so that the file remains sorted.
- Content must be in English (currently, other languages are not accepted).
- No illegal or NSFW content.
- No auto generated, LLM generated or spam content. (see below for the nuance)
- Only personal blogs may be submitted (no multi-author blogs).
- The site must not contain any form of advertisement or undisclosed affiliate links.
- Site should not have popups (newsletter signup, cookie selection, etc.).
- A YouTube channel must not post more than twice a week.
- A Youtube channel must have fewer than 100,000 subscribers.
For comics:
- Must be independently created art (no AI generated content)
- RSS feed must show the full comic in the feed
- No commercial syndicated comics
Hint: To extract the RSS link from a YouTube channel, you can use this tool.
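Before committing, the "remains sorted" rule can be checked locally with something like the sketch below. It assumes the lists are kept in plain byte order (`LC_ALL=C`), which this README does not actually state, so adjust the locale if the files turn out to be sorted differently. The function name `check_list` is made up for this example.

```shell
# Check that a list file is sorted (byte order assumed) and has no
# exact duplicate lines. Prints a short report; returns 1 on a problem.
check_list() {
    if ! LC_ALL=C sort -c "$1" 2>/dev/null; then
        echo "not sorted: $1"
        return 1
    fi
    dupes=$(LC_ALL=C sort "$1" | uniq -d)
    if [ -n "$dupes" ]; then
        echo "duplicate lines in $1:"
        echo "$dupes"
        return 1
    fi
    echo "ok: $1"
}

# Example call:
#   check_list smallweb.txt
```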
Remove a site or a channel
Remove a website if:
- It does not adhere to the above guidelines.
- In the removal request, state which guideline it breaks.
Clicking "Remove website" will open the small web list for editing in a new tab, where you can locate and remove the website feed in question. Make sure to add the reason for removal in the comments.
Small web is beautiful
What is the Small Web exactly? Recommended reading:
- https://neustadt.fr/essays/the-small-web/
- https://benhoyt.com/writings/the-small-web-is-beautiful/
- https://wychwitch.neocities.org/finals/why
- https://ar.al/2020/08/07/what-is-the-small-web/
- https://news.ycombinator.com/item?id=29768197
Info
smallweb.txt - Contains the feeds of indexed blogs
smallyt.txt - Contains the feeds of indexed YouTube channels
smallcomic.txt - Contains the feeds of indexed independent comics
yt_rejected.txt - Contains the list of YouTube channels that were reviewed (in an automated way) and rejected
Sources
Small web
The original list of small web blogs has been assembled from various sources including:
- https://github.com/outcoldman/hackernews-personal-blogs
- https://news.ycombinator.com/item?id=22273224
- https://news.ycombinator.com/item?id=15154903
- https://news.ycombinator.com/item?id=30245247
- https://news.ycombinator.com/item?id=29758396
- https://news.ycombinator.com/item?id=27302195
- https://github.com/rushter/data-science-blogs
- https://github.com/kilimchoi/engineering-blogs#-individuals
- https://github.com/ysfelouardi/awesome-personal-blogs?search=1
- https://ooh.directory/blogs/personal/
- https://indieblog.page/all
- https://biglist.terraaeon.com
- https://tech-blogs.dev
- https://hn-blogs.kronis.dev/all-blogs.html
- https://dm.hn
- https://uses.tech
- https://nownownow.com
- https://personalsit.es
YouTube channels
The seed list for YouTube channels has been assembled from these HN discussions:
- https://news.ycombinator.com/item?id=32220192
- https://news.ycombinator.com/item?id=25647657
- https://news.ycombinator.com/item?id=32378309
- https://news.ycombinator.com/item?id=20385679
- https://news.ycombinator.com/item?id=24374979
- https://news.ycombinator.com/item?id=24589474
- https://news.ycombinator.com/item?id=24671019
- https://news.ycombinator.com/item?id=35120777
- https://news.ycombinator.com/item?id=12702651
- https://news.ycombinator.com/item?id=17202615
- https://news.ycombinator.com/item?id=29666539
Useful commands
Show duplicate domains:
awk -F/ '{print $3}' smallweb.txt | sort | uniq -d | while read -r domain; do echo "$domain"; grep -F "$domain" smallweb.txt; echo ""; done
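One caveat with the command above: grep matches the domain as a substring, so a duplicate `a.example` also pulls in lines for `sub.a.example`. A possible variant that compares the host field exactly is sketched below. It assumes every line is a full URL of the form `scheme://host/...`; the function name `duplicate_hosts` is only for illustration.

```shell
# Print "host<TAB>feed" for every feed whose host appears more than
# once in the file. The file is read twice: once to count, once to print.
duplicate_hosts() {
    awk -F/ '
        NR == FNR { count[$3]++; next }                  # first pass: count feeds per host
        $3 != "" && count[$3] > 1 { print $3 "\t" $0 }   # second pass: feeds on shared hosts
    ' "$1" "$1" | LC_ALL=C sort
}

# Example call:
#   duplicate_hosts smallweb.txt
```

Sorting the output keeps feeds from the same host next to each other, which makes the result easy to scan.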