I was thinking about this recently… In a federated system, one that essentially copies all of your content from one instance to another, does a comment get deleted on every instance when you delete it? Is that even possible?
All online systems suffer from this problem.
Bots scrape websites daily, including the crawlers run by places like archive.org, which compile everything and save it for posterity. Half the time, your data is already saved by a third party, even if you delete it off a website.
Further, all databases have the option to flag something as “Deleted” and keep the original data while not showing the data on the main web page. Just because you “deleted” something online doesn’t fucking mean anything at all materially. It just means they are hiding it from end-users. The data is very likely still there. This is why people who are bulk-deleting their comments on Reddit are shocked to find those comments later restored… because they were never actually deleted to begin with. They were just flagged in the database as “deleted” and to not be shown to end-users.
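For anyone who hasn’t seen what a “deleted” flag looks like in practice, here is a minimal sketch using Python’s built-in sqlite3 module. The table layout and the function names (make_db, soft_delete, restore and so on) are made up for illustration; this is not Reddit’s schema or any real site’s code, just the general pattern.

```python
import sqlite3

def make_db():
    # In-memory database with a hypothetical comments table.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE comments ("
        "id INTEGER PRIMARY KEY, "
        "body TEXT NOT NULL, "
        "deleted INTEGER NOT NULL DEFAULT 0)"
    )
    return conn

def add_comment(conn, body):
    cur = conn.execute("INSERT INTO comments (body) VALUES (?)", (body,))
    return cur.lastrowid

def soft_delete(conn, comment_id):
    # "Deleting" only flips a flag; the body column is left untouched.
    conn.execute("UPDATE comments SET deleted = 1 WHERE id = ?", (comment_id,))

def restore(conn, comment_id):
    # Restoring is just flipping the flag back.
    conn.execute("UPDATE comments SET deleted = 0 WHERE id = ?", (comment_id,))

def visible_comments(conn):
    # What end-users get to see.
    rows = conn.execute("SELECT body FROM comments WHERE deleted = 0 ORDER BY id")
    return [row[0] for row in rows]

def stored_comments(conn):
    # What is actually sitting in the database.
    rows = conn.execute("SELECT body FROM comments ORDER BY id")
    return [row[0] for row in rows]
```

After soft_delete the comment disappears from visible_comments but is still returned by stored_comments, which is why a “deleted” comment can come back later with a single UPDATE.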
Unless you are running your own server and your own service, you are at the mercy of strangers who are in full control over whatever data you share with them.
This has always been true, since the beginning of the internet.
This is why parents in the ’90s told kids not to post personal stuff online.
Because once it is sitting on a hard drive on a server owned by someone else, it is no longer legally your data; it is now the data of whoever owns and operates that server and hard drive.
Sorry for this message being kind of aggressive, I am very tired of everyone just figuring this out for the first time and thinking somehow it only applies to the Fediverse.
It applies to every single service you sign up for on the internet. You’re storing your data with someone else, and you don’t control the server software, database software, or hardware. That data is no longer yours. You are effectively hanging out on someone else’s property, and what you do on their property is being recorded.
This is not a Fediverse problem, this is an Internet problem.
EDIT: Forgot to add, it’s also the problem that the Fediverse is trying to help solve by allowing individuals to run their own instances and thus be in greater control of what happens to their own data.
I disagree. It is not an internet problem, it is a result of the fundamental properties of data that we couldn’t change if we wanted to.
No, centralized social networks suffer less from this problem. If all data is stored on one platform, only that platform needs to delete it and it’s gone. If they don’t, they risk enforcement by authorities. In the fediverse, every instance has to delete it and there are too many to effectively enforce.
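To make the enforcement point concrete, here is a toy simulation in Python. It is not how Lemmy or Mastodon actually implement federation; the Instance class, the honors_deletes flag and the helper functions are all hypothetical. The only real-world assumption is that a delete is a message sent to other servers (ActivityPub defines a Delete activity), and that acting on it is up to each receiving server.

```python
class Instance:
    """A toy server that keeps its own copy of federated comments."""

    def __init__(self, name, honors_deletes=True):
        self.name = name
        self.honors_deletes = honors_deletes  # hypothetical policy switch
        self.comments = {}

    def receive_create(self, comment_id, body):
        # Every instance that receives the comment stores its own copy.
        self.comments[comment_id] = body

    def receive_delete(self, comment_id):
        # A delete is only a request; this instance decides whether to act on it.
        if self.honors_deletes:
            self.comments.pop(comment_id, None)

def federate_create(instances, comment_id, body):
    for instance in instances:
        instance.receive_create(comment_id, body)

def federate_delete(instances, comment_id):
    for instance in instances:
        instance.receive_delete(comment_id)

def instances_still_holding(instances, comment_id):
    # Names of the instances that still have a copy after the delete went out.
    return [i.name for i in instances if comment_id in i.comments]
```

If one of three instances is created with honors_deletes=False, instances_still_holding returns that one name after federate_delete, and the author has no way to tell from the outside.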
Just to add some nuance:
Companies do delete data on individuals once it has no more economic value to them, unless they’re required by regulation to retain it. Yes, it’s true the world is storing terabytes more data per day, but my company holds on to customer records for 5 years; if a customer doesn’t do business with us in those 5 years, we physically delete that data everywhere. There are many use cases like this where old data isn’t kept because it doesn’t make economic sense to keep it. Maybe that changes when there’s a next-gen Parquet file that can store a decade’s worth of records in a few KB, but at a certain point data does rot.
I think you have a pretty weird understanding of “privacy” if you think that you have it when posting a comment in a publicly-accessible forum.
If you post it in a place where I can find it, I can scrape it, store it, and use it for my own purposes, in perpetuity. You might be able to convince a government to tell me to stop, but there is no guarantee I haven’t stored it somewhere you and they don’t know about.
That’s simply the nature of information. You don’t get to control my memory. Once you’ve put an idea in my head, you don’t get to take it back. That idea you put in my head is now my idea. It’s my thought.
You can’t unring the bell. You can keep a thought private, or you can post it. But once you’ve posted it, you can’t make it truly private again.
I beg to differ. It’s indeed possible to scrape and store any comment indefinitely, but there are certainly ways to limit how widely and how often that happens. With rate limiting, bot detection and legal enforcement you can reduce the likelihood that someone will scrape and store all your comments. By accepting that everything will be scraped, you are unnecessarily conceding privacy.
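As a rough idea of what rate limiting means here, this is a minimal token-bucket sketch in Python. The TokenBucket class and its parameters are illustrative assumptions, not the limiter any particular forum software uses; a real deployment would keep one bucket per client or IP address and usually enforce it at the web server or proxy.

```python
import time

class TokenBucket:
    """Allow short bursts up to `capacity`, then a steady `refill_per_second`."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock  # injectable so the behaviour can be tested
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        # Top up tokens for the time elapsed since the previous call.
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        # Each request spends one token; with no tokens left it is rejected.
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A scraper hitting a bucket like this can still read everything eventually; it just takes longer, which is the sense in which rate limiting reduces scraping rather than prevents it.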
What the hell are you even talking about?
A post in a publicly accessible forum is a billboard on the highway. You put it up and anyone can read it. You have zero expectation of privacy after having done that.
Changing the speed limit on the highway (“Rate Limiting”) in no way affects the fact that you put up the billboard in the first place. People may be driving by a little slower, but they’re only reading what you chose to present for them to read.
Scraping does not infringe on privacy. The privacy infringement is that you made the post in the first place. Under normal circumstances, you are the only person at all capable of infringing on your privacy. Exceptions would be someone spoofing your credentials to create the post without your authorization, but someone who does that victimizes both you and the forum hosting your post.
What you’re talking about is more closely related to intellectual property protections like copyright. A musician can play their song over the radio without surrendering copyright protection. Nobody else can make (commercial) use of that song just because it has appeared in a public space.
Public chats are, well, public. If you are in a public chat, then everyone can see what you say. Encryption or any other attempt to make it private is silly here. If you are in a private, encrypted group, then only those people can see what you say (unless someone leaks). If you are in an E2EE personal chat with one other person, then only the 2 of you know what is being said. If you send a regular email, that is the same as a postcard and anyone can look at what it says. You choose where and how you want to speak and adjust accordingly.
If you’re talking to the public, nothing you say is private. That includes federated systems like Mastodon and Lemmy. If you want privacy and federation, use an encrypted Matrix chat. There’s still of course the caveat that the people you’re talking with can leak your chats, since they have a copy of them, so don’t talk to glowies.
What’s a glowie?
Look it up; it basically means federal agent/operative.
Posting something on a public forum will never be private, no matter where exactly. There are so many ways for this content to get “saved”, like web scrapers, web archives, screenshots, etc.
Never rely on being able to delete anything that has been published/posted. If you want privacy, don’t post it. Yes, some systems make it easier to delete a post, but you can never rely on it being deleted everywhere (someone could have made a screenshot, etc.).
If it leaves your box it’s no longer yours. Even if it doesn’t leave on the wire and you delete it from disk there are readily found forensic tools that can recover lost data if you get an old drive in hand. It has been said the internet never forgets, and it keeps being proven true time and again whenever someone gets called out for something they said 10 years ago.
Expect the future, own your past, make your marks and grow as you go.
Is it okay to encrypt a home server hard drive in this case?
That’s always an option, and my usual go-to when disposing of drives at least. It gets a bit scary to do so with the main prod data though, lose a key and everything is toast. If you have a solid means to keep crypto keys secure and redundant though by all means. It can put a hit on CPU and disk performance depending on how many random read/writes it has to do. I wouldn’t think it’s a great plan with a lot of fedi services just because of that factor. My mastodon instance has something like 116GB of attachment data in almost half a million objects, that’s a lot of encrypt/decrypt action to maintain.
I’m not all that concerned with ACTUAL privacy/encryption but rather more concerned with lower-level things like stalking, harassment, employers doing research about their employees’ non-work habits, insurance companies, etc.
I’m not talking about doing anything illegal and hiding from authorities who can use forensics on your data. Just general anti-corporate snooping and anti-harassment privacy protection.
Like, I feel more inclined to sign up and use something more like Raddle.me instead of lemmy because the owner of that site has a philosophical mission in favor of privacy.
“because the owner of that site has a philosophical mission in favor of privacy.”
Daniel Micay, the head programmer of GrapheneOS, thankfully stepped down from his position, but not before entirely torching the goodwill of Louis Rossmann, who liked GrapheneOS because it respected his privacy. Daniel accused Louis of trying to destroy the GrapheneOS project and threatened him with “exposure”, which Louis expertly documented. That led to the GrapheneOS developer stepping down because of how absolutely unhinged he looked making the accusation.
https://www.youtube.com/watch?v=4To-F6W1NT0
How are you so sure that the owner won’t pop off on you in such a way in the future? With Lemmy, at least you can 1. run your own instance and be in tighter control of your data, 2. contribute to the codebase if you really want to make it more secure, or 3. make your own fucking fork of the codebase that is more secure and privacy oriented. Raddle may be open source, but it doesn’t look like you’re encouraged to run your own Raddle.
Also, you’re still handing your data off to a stranger who has made promises. What about those promises makes you think this stranger will keep them? It’s still inherently a risk, even if they never end up doing anything nefarious. You just don’t know their mind and can’t know their mind, and being just a user instead of someone who actually knows them in person, you’re only basing it on promises they’ve made in an attempt to draw people to their service. Are you really sure the code that is running on Raddle.me is exactly the same as the open-sourced codebase? This is a question that regularly gets asked about Signal Messenger: is the code on the servers the same as what is actually released? How far does this “trust”, based on words alone, go?
To quote Mark Zuckerberg about people sharing information with him and why:
people just submitted it
i don’t know why
they “trust me”
dumb fucks
You know whose mind you can know and trust? Your own. Hence running your own instance.
And last but not least… You’re already here. You’re making a post about this here. You have an account. You have 23 posts and 352 comments. Sorry to say but you’re just not that worried about this issue, so this feels a little like concern trolling.