With all the talk about content being lost as people delete their comments and posts, and as subs go private or otherwise protest, I wondered whether people have an appetite for doing some work to move key bits of reddit history over to kbin/Lemmy for posterity.
Actually, even subreddits are affected by the 1000-item listing limit, IIUC, so there would be limits on what content we could discover without an external source.
I guess we could grab from the Pushshift torrents and use API access to grab as much as we can of the last couple of months? (Pushshift lost access at the start of May iirc, so that’s where the gap would start.) Also, getting stuff from subs that are still private in protest would be a problem.
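For what it's worth, here is a rough sketch of the "dump first, then fill the gap" idea. It assumes the dump has already been decompressed into newline-delimited JSON and that each record has `subreddit` and `created_utc` fields; I haven't checked this against every dump file. The function names `filter_dump` and `gap_start` are made up for illustration.

```python
import json
from typing import Iterable, Iterator, Optional


def filter_dump(lines: Iterable[str], subreddits: Iterable[str]) -> Iterator[dict]:
    """Yield records from NDJSON lines that belong to the wanted subreddits.

    Assumption: one JSON object per line with a "subreddit" field.
    """
    wanted = {name.lower() for name in subreddits}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip damaged lines rather than abort the whole run
        if not isinstance(record, dict):
            continue
        if str(record.get("subreddit", "")).lower() in wanted:
            yield record


def gap_start(records: Iterable[dict]) -> Optional[int]:
    """Return the newest "created_utc" seen, i.e. where a backfill would begin.

    Assumption: "created_utc" is a Unix timestamp stored as a number or a string.
    """
    latest = None
    for record in records:
        try:
            ts = int(float(record["created_utc"]))
        except (KeyError, TypeError, ValueError):
            continue
        if latest is None or ts > latest:
            latest = ts
    return latest
```

Whatever `gap_start` returns for a sub is the point from which anything newer would have to come from some other source, which is exactly the part that gets awkward for subs that are still private.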
Basically not a fan of the API approach.