Create a feeds.jsonl file with the following format:
{"url": "https://pod.url1.com/FFFFF", "name": "Podname XYZ"}
{"url": "https://pod.url2.com/FFFFF", "name": "Podname ABC"}
{"url": "https://pod.url3.com/FFFFF", "name": "Podname DEF"}
Next, run the script to download all the missing episodes and metadata. By default, they are stored under the pods directory, with one sub-directory per podcast.
If you plan on serving the archive over the web, we recommend using a different directory, such as /srv/www/petit-pois/pods.
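Before downloading, it can be worth checking that feeds.jsonl is well-formed. The following is a minimal sketch, not part of the tool itself: the function name `load_feeds` is hypothetical, and it only assumes the two keys shown in the format above (`url` and `name`).

```python
import json


def load_feeds(path):
    """Read a feeds.jsonl file and return its entries as a list of dicts.

    Hypothetical helper: raises ValueError, with the line number, if a line
    is not valid JSON or lacks a non-empty "url" or "name".
    """
    feeds = []
    with open(path, encoding="utf-8") as handle:
        for lineno, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between entries
            try:
                entry = json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{lineno}: invalid JSON ({exc.msg})") from exc
            if not isinstance(entry, dict):
                raise ValueError(f"{path}:{lineno}: expected a JSON object")
            missing = [key for key in ("url", "name") if not entry.get(key)]
            if missing:
                raise ValueError(f"{path}:{lineno}: missing {', '.join(missing)}")
            feeds.append(entry)
    return feeds
```

Calling `load_feeds("feeds.jsonl")` either returns the parsed entries or points at the first offending line. Once the file checks out, the download command is: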
python3 download_podcasts.py \
--archive_dir /srv/www/petit-pois/pods
Again, if you're interested in serving the archive, we don't want to expose the podcasts to just anyone, so we need to create a token for each podcast. This is done by running the generate_token_map.py script:
sudo python3 generate_token_map.py \
--archive_dir /srv/www/petit-pois/pods \
--map_file /etc/nginx/podcast_tokens.map
Next, run the script to generate the feeds, with the optional inclusion of a token map file:
python3 generate_feeds.py \
--archive_dir /srv/www/petit-pois/pods \
--base_url http://pods.yourdomain.com/ \
--map_file /etc/nginx/podcast_tokens.map
Now, each podcast will have an archive.xml file in its directory.
If you want to serve the files using a web server, there are a few options. The next section gives an example using Nginx.
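As a quick check that the feeds were generated, a short Python sketch such as the following could report which podcast directories contain an archive.xml and how many episodes each lists. It assumes the generated files are ordinary RSS documents with `<item>` entries, which this guide does not state explicitly; the function name `summarize_feeds` is hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def summarize_feeds(archive_dir):
    """Map each podcast sub-directory name to its <item> count.

    The value is None when the sub-directory has no archive.xml.
    Assumption: archive.xml is an RSS document with <item> elements.
    """
    summary = {}
    for pod_dir in sorted(Path(archive_dir).iterdir()):
        if not pod_dir.is_dir():
            continue  # skip stray files next to the podcast directories
        feed = pod_dir / "archive.xml"
        if not feed.exists():
            summary[pod_dir.name] = None
            continue
        root = ET.parse(feed).getroot()
        summary[pod_dir.name] = len(root.findall(".//item"))
    return summary
```

For example, `summarize_feeds("/srv/www/petit-pois/pods")` returns a dictionary keyed by directory name, so a `None` value flags a podcast whose feed has not been generated yet.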
This tool is meant for personal archival, preservation, and research use only. It helps you download and locally serve podcast episodes and metadata to create a self-hosted or offline archive.
Please make sure you're respecting copyright laws and the original creators' terms of use. Many podcasts are protected by copyright, and redistributing or republishing them (especially publicly) without permission might be illegal.
Before archiving or sharing anything, it’s a good idea to:
- Check the podcast's license or usage terms
- Look for any Creative Commons indicators
- Read up on fair use if you're in the U.S., or fair dealing in other countries such as the U.K.
Install Nginx:
sudo apt update && sudo apt install nginx
Create a config file (e.g., /etc/nginx/sites-available/petit-pois):
map $secure_token $podcast_dir {
    default "";
    include /etc/nginx/podcast_tokens.map;
}
server {
    server_name podcasts.archive.example.com;
    location ~ ^/secure/([^/]+)/(.+)$ {
        set $secure_token $1;
        set $filename $2;
        if ($podcast_dir = "") {
            return 403;
        }
        # Optional debug logging
        error_log /var/log/nginx/podcast_debug.log info;
        root /srv/www/petit-pois/pods;
        try_files /$podcast_dir/$filename =404;
    }
    # Optional: deny bare token URLs like /secure/abc123/
    location ~ ^/secure/([^/]+)/?$ {
        return 403;
    }
    location /pods/ {
        deny all;
    }
    location = / {
        deny all;
    }
    ###### 🔐 TLS CONFIG ######
    listen 443 ssl; # managed by Certbot
    ssl_certificate /etc/letsencrypt/live/podcasts.archive.example.com/fullchain.pem; # managed by Certbot
    ssl_certificate_key /etc/letsencrypt/live/podcasts.archive.example.com/privkey.pem; # managed by Certbot
    include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
    ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot
}
server {
    if ($host = podcasts.archive.example.com) {
        return 301 https://$host$request_uri;
    } # managed by Certbot
    listen 80;
    server_name podcasts.archive.example.com;
    return 404; # managed by Certbot
}
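To make the token lookup in this config more concrete, here is a rough Python model of how a request path is resolved: the location regex captures the token and filename, the token is looked up in the map, and an unknown token yields 403. This is only an illustration, not how Nginx works internally. The names `parse_token_map` and `resolve`, the sample token, and the sample directory are hypothetical, and the sketch assumes the map file holds standard Nginx map entries of the form `token directory;`.

```python
import re

# Same patterns as the two regex locations in the config above.
SECURE_RE = re.compile(r"^/secure/([^/]+)/(.+)$")
BARE_TOKEN_RE = re.compile(r"^/secure/([^/]+)/?$")


def parse_token_map(text):
    """Parse Nginx map entries (`key value;`) into a dict.

    Assumption: one entry per line; blank lines and # comments are skipped.
    """
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, value = line.rstrip(";").split(None, 1)
        mapping[key] = value.strip().strip('"')
    return mapping


def resolve(path, token_map, root="/srv/www/petit-pois/pods"):
    """Return (status, file_path) roughly as the secure location would.

    The model does not check that the file exists; Nginx's try_files
    would answer 404 for a missing file.
    """
    match = SECURE_RE.match(path)
    if match is None:
        if BARE_TOKEN_RE.match(path):
            return 403, None  # bare token URL such as /secure/abc123/
        return 404, None  # other locations are not modelled here
    token, filename = match.groups()
    podcast_dir = token_map.get(token, "")  # map default is ""
    if podcast_dir == "":
        return 403, None
    return 200, f"{root}/{podcast_dir}/{filename}"
```

With a map containing the hypothetical entry `abc123 podname-xyz;`, the path `/secure/abc123/archive.xml` resolves to `/srv/www/petit-pois/pods/podname-xyz/archive.xml`, while any other token is rejected with 403.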
Enable the site and reload Nginx:
sudo ln -s /etc/nginx/sites-available/petit-pois /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx
Further information on Nginx and web server configuration is outwith the scope of this guide.
In some cases you may already have local MP3 files (or partial archives) and want to generate a valid podcast feed without downloading from an RSS source.
For this, use bootstrap_local_podcast.py.
This script creates per-episode metadata files so that the existing feed generator can consume them unchanged.
Use bootstrap_local_podcast.py if:
- You already have episode audio files locally
- Some episodes are missing or private
- The original feed no longer exists
- You want a complete historical feed, even with gaps
The bootstrap script consumes a JSONL (JSON-per-line) file describing episodes.
Each line represents one episode, with the following format:
{"episode":"Episode Title","date":"YYYY-MM-DD","file":"audio_file.mp3"}
The archive I had contained only this information; if you have more, you may want to modify or extend the script.
When an episode has "file": null, this means the episode is missing.
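Before bootstrapping, a metadata file in this format could be checked with a small sketch like the one below. It is not part of bootstrap_local_podcast.py: the function name `summarize_episodes` is hypothetical, and it relies only on the three keys shown above (`episode`, `date`, `file`).

```python
import json
from datetime import datetime


def summarize_episodes(lines):
    """Split episode titles into (available, missing) lists.

    `lines` is any iterable of JSONL strings, such as an open file.
    An episode counts as missing when its "file" value is null.
    """
    available, missing = [], []
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        # Raises ValueError if the date does not parse as year-month-day.
        datetime.strptime(record["date"], "%Y-%m-%d")
        if record["file"] is None:
            missing.append(record["episode"])
        else:
            available.append(record["episode"])
    return available, missing
```

Passing an open file, for example `summarize_episodes(open("metadata.jsonl", encoding="utf-8"))`, gives a quick view of which episodes have audio and which are gaps in the archive.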
You can then run:
python3 bootstrap_local_podcast.py \
--jsonl metadata.jsonl \
--podcast_dir path/to/archive
If you place the output in /srv/www/petit-pois/pods, you can then run the rest of the serving pipeline.