Edit: obligatory explanation (thanks mods for squaring me away)…
What you see via the UI isn’t “all that exists”. Unlike Reddit, where everything is a black box, Lemmy has a lot more eyeballs that can see “under the hood”. Any instance admin, proper or rogue, gets a ton of information that users won’t normally see. The attached example demonstrates that while users will only see upvote/downvote tallies, admins can see who actually performed those actions.
Edit: To clarify, not just YOUR instance admin gets this info. This is ANY instance admin across the Fediverse.
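For anyone wondering how that works, here is a rough sketch in Python. It assumes votes travel between instances as ActivityPub-style `Like`/`Dislike` activities that each carry an `actor` field; the URLs, usernames, and the two helper functions are made up for illustration and are not Lemmy's actual code.

```python
from collections import Counter

# Hypothetical vote activities as a receiving instance might store them.
# All actor and object URLs below are invented for this example.
incoming = [
    {"type": "Like", "actor": "https://a.example/u/alice", "object": "https://a.example/post/1"},
    {"type": "Like", "actor": "https://b.example/u/bob", "object": "https://a.example/post/1"},
    {"type": "Dislike", "actor": "https://c.example/u/carol", "object": "https://a.example/post/1"},
]

def public_tally(activities, post):
    """What a regular user sees: counts only, no identities."""
    counts = Counter(a["type"] for a in activities if a["object"] == post)
    return {"upvotes": counts["Like"], "downvotes": counts["Dislike"]}

def admin_view(activities, post):
    """What anyone with access to the stored activities can see: who voted how."""
    return [(a["actor"], a["type"]) for a in activities if a["object"] == post]

post = "https://a.example/post/1"
print(public_tally(incoming, post))
print(admin_view(incoming, post))
```

The point is that the tally is just a summary built from records that still name the voter, so any instance that receives those records can list them.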
There’s a huge difference between Reddit keeping our data “locked away” on their private server vs. a system that puts it all out in public view. You can bet your behind that Big Tech and governments are harvesting ALL of it as we speak. This is MUCH worse than Reddit just selling some data to a few third party actors.
I completely agree that sharing it with other instances is a problem.
This is super nitpicky, but even assuming Lemmy exposed a minute amount of the data that Reddit freely ships to whoever buys it (including governments), I actually think it’s far less likely to be seen. Social media companies are well known for freely giving access to anything law enforcement, governments or advertisers would like. Most, if not all, have exposed APIs that allow law enforcement, at least, to collect almost any data at their leisure. That data is packaged up by the orgs that hold it.
Scraping Lemmy for this information would require them to build their own solutions and backends to handle all the data. Here in the UK, our technically inept government famously broke their multi-billion COVID test-and-trace system because the Excel spreadsheet they used as a database ran out of rows…
Even assuming it’s true that all of these groups have bothered to make their own solutions and bought server space to store the data themselves for a relatively tiny (certainly until very recently) platform, the only data they get is who liked what post/comment.
That is a small snowflake compared to the iceberg that other social media organizations collect, package and sell. Facebook, for example, collect enough data that they earn more per user than Netflix.
Certainly, as Lemmy and ActivityPub gain more traction, this is a privacy hole which deserves some consideration, and should be immediately plugged. But I just don’t think it’s in the same solar system as exposing data to any social media site.