Dumb question: how do I know if an open source project is trustworthy?

BurningnnTree@lemmy.one · edit-2 6 months ago

Dumb question: how do I know if an open source project is trustworthy?

FizzyOrange@programming.dev · 6 months ago

Maybe all of the stars, forks, and discussions on the GitHub page are from fake accounts

All 9k stars, 10k PRs, 400 forks & professional web site are fake? Come on, this is about the most obviously not fake project I’ve seen!

How do you know when a product like this can be trusted?

The same way you tell if anything can be trusted - you look at the signals and see if they are suss. In this case:

Lots of stars
Lots of real code in the repo
Professional looking website with commercial pricing
Lots of issues
Good English

The amount of effort it would take to fake this for very little benefit is enormous.

Maybe I’m just being paranoid.

Yeah just a little!

sus@programming.dev · edit-2 6 months ago

All 9k stars, 10k PRs, 400 forks & professional web site are fake?

Technically, it is entirely possible to find a real existing project, make a carbon copy of the website (there are automated tools to accomplish this), then have a massive number of bots give 9K stars and make a lot of PRs, issues and forks (bonus points if these are also copies of actual existing issues/PRs), and generate a fake commit history (this should be entirely possible with git). A bunch of releases could be quickly generated too. Though you would probably notice pretty quickly that the timestamps don’t match, since I don’t think GitHub features like issues can have fake timestamps (unlike git).

Though I don’t think this has ever actually been done, there are services that claim to sell not only stars but issues, pull requests and forks too. Assuming the service is not just a scam in itself, any cursory look at the contents of the issues etc. would probably give away that they are AI generated.
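The timestamp mismatch mentioned above can be sketched as a quick check. This is only a rough heuristic, assuming (as the comment does) that issue creation times are set by the host, while git author dates can be set to anything by whoever makes the commit. The function name and the one-year `max_gap` threshold are made up for illustration.

```python
from datetime import datetime, timedelta


def history_looks_backdated(commit_dates, issue_dates, max_gap=timedelta(days=365)):
    """Return True if the git history claims to start long before any issue exists.

    commit_dates: dates read from git (the committer can set these freely).
    issue_dates:  creation times as reported by the hosting site.
    max_gap:      hypothetical tolerance; a real project may have moved hosts,
                  so a large gap is a reason to look closer, not proof of fakery.
    """
    if not commit_dates or not issue_dates:
        return False  # nothing to compare
    first_commit = min(commit_dates)
    first_issue = min(issue_dates)
    return first_issue - first_commit > max_gap
```

A repo whose commits claim to go back to 2015 while its oldest issue is from last month would be flagged; a repo whose first issue appears a few weeks after its first commit would not.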

FizzyOrange@programming.dev · 6 months ago

Yeah possible, but think of the amount of effort that would take!

magic_lobster_party@kbin.run · 6 months ago

There have been instances of popular and well-meaning projects being hijacked by hostile actors. A recent notable example is xz, but there’s also the event-stream npm package a few years ago that got infected with Bitcoin-stealing code.

Just because a project looks good now doesn’t mean it won’t turn bad in the future.

And you wouldn’t only need to audit the project. You’d also need to audit all of its dependencies. The xz backdoor made it into SSH. Who would think about looking into xz for vulnerabilities?
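To get a feel for how far "all of its dependencies" reaches, here is a minimal sketch that walks a dependency map and collects everything reachable from one package. The map and package names are hypothetical and heavily simplified; they are not the real dependency graph of any project mentioned here.

```python
from collections import deque


def transitive_dependencies(direct_deps, root):
    """Return every package reachable from root, excluding root itself."""
    seen = set()
    queue = deque(direct_deps.get(root, []))
    while queue:
        pkg = queue.popleft()
        if pkg in seen:
            continue  # already visited, also protects against cycles
        seen.add(pkg)
        queue.extend(direct_deps.get(pkg, []))
    seen.discard(root)  # root can reappear if the graph has a cycle
    return seen


# Hypothetical map: each key lists only its direct dependencies.
example = {
    "my-app": ["ssh-server"],
    "ssh-server": ["init-integration"],
    "init-integration": ["compression-lib"],
}
```

In this made-up example, `my-app` never mentions `compression-lib` directly, yet it ends up in the set of things you would have to audit.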

The amount of effort it would take to fake this for very little benefit is enormous.

The benefit of installing back doors can be enormous.

FizzyOrange@programming.dev · 6 months ago

A recent notable example is xz, but there’s also the event-stream npm package a few years ago that got infected with Bitcoin-stealing code.

They’re asking if the entire project is somehow fake, not if it’s a real project that got backdoored. A backdoor is obviously impossible to detect based on stars, language quality, and similar heuristic signals alone.

BurningnnTree@lemmy.one · 6 months ago

I agree it does look legitimate, I was just wondering what signs I should look out for in general. Like I’m sure fake GitHub engagement must be a thing, but I don’t know how widespread it is and I don’t know what the threshold is before a project can be considered definitely real. It sounds like you’re saying the level of engagement on this project is well beyond what can be considered sketchy, which is helpful information. Thanks

BananaTrifleViolin@lemmy.world · 6 months ago

As a software developer you should have a bit of a head start: you can read the code. One of the big pluses of open source projects is that it’s all there in the open. Even if you’re not familiar with the specific language used, you can see the source and get a rough idea of scope and complexity.

And look at the GitHub details like the age, the frequency of releases, commits, forks. Malicious projects don’t stick around for long on a host site like that, and they don’t get 1000s of stars or lots of engagement from legitimate users. It’s very difficult to fake that.
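As a rough illustration of the "age and release frequency" check, here is a small sketch that summarizes a list of release dates. How you obtain the dates (releases page, tags, an API) is left out, and the function and field names are invented for this example.

```python
from datetime import date
from statistics import median


def release_summary(release_dates, today):
    """Summarize project age and release cadence from a list of release dates."""
    ordered = sorted(release_dates)
    if not ordered:
        return {"age_days": 0, "releases": 0, "median_gap_days": None}
    # Days between each pair of consecutive releases.
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return {
        "age_days": (today - ordered[0]).days,
        "releases": len(ordered),
        "median_gap_days": median(gaps) if gaps else None,
    }
```

A project with years of history and releases spread evenly over that time reads very differently from one whose releases all landed in the same week. What counts as "normal" is a judgment call, so treat the numbers as a prompt to look closer rather than a verdict.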

Look at the project website. Real projects have active forums, detailed wikis, and evidence of user engagement. You’ll see people recommending the project elsewhere on the net if you search, or writing independent tutorials on how to deploy or use it, or reviews on YouTube etc. Look for testimonials and user experiences.

Also look at where the software is deployed and recommended. If it’s included in big-name Linux distros’ repos, that’s a good sign.

Look at all the things you’d be looking at for paid software to see it’s actually in use and not a scam.

And try it out - it’s easy to set up a VM and deploy something in a safe sandboxed environment to get a feeling for whether it does what it claims to do, whether that’s a cut-down system with Docker or an entire OS in the sandbox to stress test the software and put it through its paces.
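For the Docker route, a locked-down trial run might look something like this sketch. It only builds the command line; the image name in the usage note is a placeholder, and the memory and process limits are arbitrary assumptions. Many real applications need network access or a writable filesystem, so expect to loosen some of these. A container also isolates less strongly than a full VM.

```python
def sandbox_command(image, memory="512m"):
    """Build a `docker run` argv that starts an image with restricted privileges."""
    return [
        "docker", "run",
        "--rm",                  # remove the container when it exits
        "--network", "none",     # no network access
        "--read-only",           # root filesystem mounted read-only
        "--cap-drop", "ALL",     # drop all Linux capabilities
        "--memory", memory,      # cap memory use (value is an assumption)
        "--pids-limit", "256",   # cap the number of processes (value is an assumption)
        image,
    ]
```

To actually run it you would pass the list to `subprocess.run(sandbox_command("example/app:latest"), check=True)`, where `example/app:latest` stands in for whatever image you are evaluating.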

There are so many possible elements to doing “due diligence” to ensure it’s legitimate but also the right solution for your needs.

sus@programming.dev · 6 months ago

For a large project, you can probably look at the history of issues. If there are lots of issues that are 5 years old, it’s almost certainly legit.

Vector@lemmy.world · 6 months ago

Don’t forget the 300-comment-long “+1” feature request chains

grue@lemmy.world · edit-2 6 months ago

Trustworthy as opposed to what, some random proprietary product? Do you think you’re gonna somehow do better on that front with code that’s secret?

Now, don’t get me wrong: I’m not saying that every Free Software project is trustworthy. I’m just saying that as a first-pass screening criterion, rejecting everything that isn’t Free Software is a pretty good one.

Kecessa@sh.itjust.works · 6 months ago

At the same time it’s much easier to sue a company you know compared to LollipopCat35 from GitHub

grue@lemmy.world · edit-2 6 months ago

I feel like if the main advantage of something is that it’s easy to sue, it’s probably a bad choice to begin with. Instead, your criteria should probably be more about minimizing the chance of things going that wrong.

Free Software has an important advantage on that front too, by the way: you have the recourse of being allowed to fix it yourself. That is kinda the whole point of why RMS invented it in the first place, after all!

Kecessa@sh.itjust.works · edit-2 6 months ago

Minimizing the chance of things going that wrong… So not trusting anonymous people on the internet?

How many FOSS users are actually able to understand or fix the programs they use? Do you systematically check the code of everything you get from GitHub?

I understand the principle and I do use FOSS, I just don’t make myself believe that more than a ridiculously small minority of people actually check the code of what they’re installing.

hperrin@lemmy.world · 6 months ago

You don’t, really. And even a trustworthy open source project can be infiltrated or sabotaged. Basically, you just have to rely on the reputation of the organization or developers behind it.

RobotToaster@mander.xyz · 6 months ago

Personally I wouldn’t trust it.

First red flag🚩: there’s an “enterprise” self hosted version.

Second red flag🚩: It isn’t open source, the licensing structure is confusing 🚩, but it appears to be at best some mix of source available🚩 and open core🚩 (core available?).

BurningnnTree@lemmy.one · 6 months ago

Can you explain why the enterprise version is a red flag? Would you expect the company to make money some other way?

RonSijm@programming.dev · 6 months ago

It’s not a big red flag, but it indicates that the product is not fully open source. You can get the full community edition from GitHub, but for the self-hosted Enterprise version you have to contact sales.

So all the Enterprise features are most likely closed source, and when you buy/license it, you’ll just get the compiled version. And since their Cloud hosting model has a “Per 1,000 sessions/mo” model, their Enterprise self hosted model might have that as well. So it’ll have some kinda DRM/License managing, and maybe a “call home” to check your license or usage every once in a while

nik9000@programming.dev · 6 months ago

The point of the license combination they use is to allow the enterprise version to be open and live in the same repo as everything else. Dunno if that’s what they do, but that’s why the Elastic License exists.

SatansMaggotyCumFart@lemmy.world · 6 months ago

If you are looking for this program for free how much is the information that you are asking about worth to you?

BurningnnTree@lemmy.one · 6 months ago

What do you mean?

SatansMaggotyCumFart@lemmy.world · 6 months ago

I’m quite good at looking into this type of thing, but my services are not cheap.