I saw this post and I was curious what was out there.
https://neuromatch.social/@jonny/113444325077647843
I'd like to put my lab servers to work archiving US federal data that's likely to get pulled - climate and biomed data seem most likely. The most obvious strategy to me seems to be setting up mirror torrents on academictorrents. Anyone compiling a list of at-risk data yet?
shaarli bookmarks + hecat (shaarli_api importer + download_media/archive_webpages processors + html_table exporter for the HTML index)
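
To give a rough idea of what the last step of that chain produces, here is a minimal Python sketch that turns a list of bookmarks into an HTML table index. It is not hecat's code and does not call hecat. The `render_index` function and the sample entry are hypothetical, and the `url`/`title`/`tags` field names are an assumption modeled on what a Shaarli export typically contains.

```python
from html import escape


def render_index(bookmarks):
    """Render bookmark dicts as a minimal HTML table.

    Illustrative stand-in only: this is NOT how hecat's html_table
    exporter is implemented. Field names are assumed.
    """
    rows = []
    for b in bookmarks:
        url = escape(b["url"], quote=True)
        # Fall back to the URL when a bookmark has no title.
        title = escape(b.get("title") or b["url"])
        tags = escape(", ".join(b.get("tags", [])))
        rows.append(
            f'<tr><td><a href="{url}">{title}</a></td><td>{tags}</td></tr>'
        )
    header = "<tr><th>Title</th><th>Tags</th></tr>"
    return "<table>\n" + "\n".join([header] + rows) + "\n</table>"


# Hypothetical sample bookmark, for illustration only.
sample = [
    {
        "url": "https://example.org/climate.csv",
        "title": "Climate <raw> data",
        "tags": ["climate", "at-risk"],
    },
]
index_html = render_index(sample)
```

Titles and tags are HTML-escaped so that a bookmark with odd characters in its title cannot break the index page.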