It’s been a while since I posted anything here, partly because I couldn’t (my hosting provider did not update its PHP versions) and partly because I’ve been busy, both with regular work and with other things related to project “Podmix”.
I started collecting podcast feed links in September 2017 because I had an itch to scratch: I wanted to build a podcast search engine, as the ones I found online sucked or were broken in one way or another. I was soon up to 30,000 feed links. Not enough, so I expanded and dug deeper and deeper: 80,000 podcast feeds. I kept on digging and finding new directories, scraped them and defined fairly strict rules for my collection: a feed should have published an episode within the last 12 months and should have at least one audio episode.
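As a rough illustration of those two rules (not the actual code behind the collection), a minimal Python sketch could look like this. The function name is_active_feed and the 365-day cutoff are made up for the example, and it assumes a plain RSS feed where episodes are item elements with a pubDate and an enclosure of an audio/* type.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
import xml.etree.ElementTree as ET


def is_active_feed(rss_xml: str, now: datetime, max_age_days: int = 365) -> bool:
    """Hypothetical check: at least one audio episode and one recent episode."""
    root = ET.fromstring(rss_xml)
    cutoff = now - timedelta(days=max_age_days)
    has_audio = False
    has_recent = False
    for item in root.iter("item"):
        # Rule 1: at least one episode with an audio enclosure.
        enclosure = item.find("enclosure")
        if enclosure is not None and enclosure.get("type", "").startswith("audio/"):
            has_audio = True
        # Rule 2: at least one episode published after the cutoff.
        pub_date = item.findtext("pubDate")
        if not pub_date:
            continue
        try:
            published = parsedate_to_datetime(pub_date)
        except (TypeError, ValueError):
            continue  # unparseable date, skip this episode
        if published.tzinfo is None:
            # Assumption: treat dates without a usable offset as UTC.
            published = published.replace(tzinfo=timezone.utc)
        if published >= cutoff:
            has_recent = True
    return has_audio and has_recent
```

Here now has to be a timezone-aware datetime, for example datetime.now(timezone.utc). Real feeds are messier than this (Atom, broken dates, missing enclosure types), so treat it as a starting point only.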
In September 2020 I was up to about 800 thousand active podcast feeds and started thinking about a way of publishing all these feeds in a searchable form. It took a lot of thinking, as exporting and moving the information in a structured way was a bit of a hassle. The export required almost a gigabyte of disk space; after trimming the data and excluding some information I got it down to some 700 MB.
I searched around the internet for other solutions and found PodcastIndex.org, which aligned with my ambitions and goals, so I shot them an email and promptly joined. After some de-duplication I added my indexed feeds to the common index on PodcastIndex and bumped up the numbers a bit. We had a large number of feeds in common, so my initial contribution was a few hundred thousand.
The PodcastIndex project aims to make any podcast discoverable. Anyone + dog should be able to record, produce and publish their own podcast, and no one should be able to stop them. Any podcast should be discoverable without bias, filter bubbles or algorithms impacting the results.
“The core, categorized index will always be available for free, for any use.” (cut from the first page of PodcastIndex.org)
… and …
Mission and Goal (cut from the first page of PodcastIndex.org):
Preserve podcasting as a platform for free speech.
Re-tool podcasting to a platform of value exchange that includes developers with podcasters and listeners.
As PodcastIndex operates on the value-for-value model, everyone donates time, money or talent. Many of the people involved do several of these things: developers put in time and talent, others make just monetary contributions, and any combination of these is good.
I’ve continued to dig for podcast feeds in the lesser-known and dark corners of the internet, to find, validate and add new feeds to PodcastIndex. We are currently up to 2.9 million feeds, which will probably adjust down to somewhere around 2.8 million once de-duplication has been done. I have also worked on a massive set of ISO 639-1 and ISO 639-3 language codes to use for tagging podcast feeds with languages. This is a chunk of about 7000 different languages and dialects, and nowhere on the internet are there search options this fine-grained. Together with geo-location data for these languages, there could be locality search for languages and dialects: imagine expats, refugees and immigrants listening to podcasts from their home region, district or even village.
There are also some headaches around the podcasts. Some larger hosters are very bad at describing their feeds: think multiple feeds with the title “April” or just “2018”, where the landing page link describes the podcast better. I could code up fixes for these particular hosters, but that would create a Sisyphus-like task of always needing to re-describe feeds for them.
So far PodcastIndex has defined a specification for new tags in the podcast:* namespace for use with RSS/Atom feeds. These include the podcast:locked tag, which prevents feeds from being imported by other hosters (and also prevents duplication), community-created chapters and transcripts, and crypto-currency payments through the Bitcoin Lightning Network, where micro-payments in Satoshis are sent while listening. These payments can be split in many ways to pay hosting, producers, app developers and so on. The payment problem has more or less been solved; podcasts won’t need advertising in the future.
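To make the splitting idea a bit more concrete, here is a small Python sketch of dividing one payment between recipients by share. The function name, the recipient names and the share numbers are all hypothetical and not taken from the PodcastIndex specification. Handing the rounding leftovers to the recipients with the largest remainders is just one possible choice.

```python
from typing import Dict


def split_payment(total_sats: int, shares: Dict[str, int]) -> Dict[str, int]:
    """Split total_sats between recipients in proportion to their shares."""
    total_shares = sum(shares.values())
    if total_sats < 0 or total_shares <= 0:
        raise ValueError("need a non-negative amount and positive shares")
    # Satoshis are whole units here, so round every payout down first.
    payouts = {
        name: total_sats * share // total_shares
        for name, share in shares.items()
    }
    # Whatever rounding left over is handed out one satoshi at a time,
    # starting with the recipients that lost the most to rounding.
    leftover = total_sats - sum(payouts.values())
    by_remainder = sorted(
        shares,
        key=lambda name: (total_sats * shares[name]) % total_shares,
        reverse=True,
    )
    for name in by_remainder[:leftover]:
        payouts[name] += 1
    return payouts
```

For example, split_payment(100, {"producer": 90, "hosting": 5, "app": 5}) gives the producer 90 satoshis and the other two 5 each, and the payouts always add up to the amount that was sent.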
If you’re interested in podcasts, are a podcast producer or a podcast hoster, please join PodcastIndex. We need sharp minds: people who can think outside the box and imagine new features we can add. You need to bring an open mind and great ideas, no batteries included.