Manjaro Linux prepares to enable telemetry by default (manjaro.org)
88 points by uldus on Nov 4, 2024 | 100 comments


Based on this thread, it seems like the initial idea was 'just enable it by default and let people disable it' but that quickly changed after the first objections came in. Whatever step the proposal is in right now seems to be focused on making a system to gather telemetry first, I can't see any definitive decision on opt-in vs opt-out (though opt-in through the installer/welcome screen seems to be the best solution I've seen).

This is not unlike Debian's telemetry collection, which also asks nicely if you'd like to share your information with the project.


Asking for it and not enabling it by default is the only way to abide by European and South American data protection laws.


My understanding (and I am not a lawyer) is that under European data protection law the important thing is to obtain user consent for this. I think there's a very reasonable argument that it is enough to inform the user that you collect telemetry and that, if they wish to avoid this, they should just build their own copy of the software (which provides a very easy-to-access opt-out that should satisfy everyone).

Although EU privacy and technology regulation is generally pretty ok, this seems to be one of those cases where their lack of technical skill or knowledge really shines through (other examples include the endless cookie banners and https://www.euronews.com/next/2024/07/22/microsoft-says-eu-t...)


Consent needs to be freely given; you can't nudge users into it and you can't hold access ransom over it. There's no way what you're suggesting would fly.


I've been told that if you have mandatory telemetry in your application that's fine because the user has a way to opt out (it's a free market and they don't have to use your software). I believe the territory where you add an opt-out is a bit murkier.


"Opt-in or pound sand" is explicitly not allowed.


I'm not an expert and not on either side, but couldn't a notice like "by agreeing to these terms you allow us to turn on telemetry by default, and you are free to simply not use this software instead" be allowed?


Nope, consent cannot be a prerequisite of using the service/software. If it is available in the EU (or UK, since they grandfathered in GDPR after Brexit), it must be usable with or without consent.

That is the reason many local non-EU ad-supported businesses (like local papers in the US) outright block all EU traffic. For example if I go to https://www.chicagotribune.com/ I get a blank page saying "This content is not available in your region".

Manjaro could do something similar by just blocking EU users from downloading it.


Absolutely NOT!


Why not? Can you cite a specific law text ?


I don't know the law, but "build it yourself lol" is hardly easy, especially for software that needs to be constantly updated for security.


I don't think a "reasonable person" from the perspective of a court (non-developer, non-technical, end-user) can be expected to know (or even learn) how to compile software in this way, not to mention other downsides it has (like lack of updates and possibility to create new bugs) so I don't think this would be allowed, but it's up to a judge to decide on a case by case basis, not us armchair experts.


> I don't think a "reasonable person" from the perspective of a court (non-developer, non-technical, end-user) can be expected to know (or even learn) how to compile software in this way

I mean I don't think the EU can oblige you to make your software available to people who don't know how to use a computer.


Well that's a hot take if I ever heard one.


It's good that we have operating systems that are easy to use (e.g. Mac OS, Windows), but this is not a priority for Linux desktop distributions (which is fine); what counts as easy to opt in/out of is very contextual.


Why did MS comply to the EU request on installations outside of the EU?

MS Windows with CrowdStrike BSOD'd for American airlines on American soil, after all.


> Why did MS comply to the EU request on installations outside of the EU?

Because it's really expensive to maintain two versions of the same kernel?


"Click yes to consent and continue installation, click no to exit the installer and be redirected to a manual on how to build your own copy" would be in violation of the "consent must be freely given" stipulation of the GDPR.

You are more likely to get a regulator to agree to a version without consent (by minimizing personal data and arguing that your legitimate interest outweighs the weight of the little PII you collect) than to get them to agree to your hostage situation.


While I get the point you are making, I find it a bit over the top that you'd consider agreeing to telemetry in exchange for using the free software as tantamount to being held hostage.

In case it needs to be said, I'm 100% in favor of strong privacy protection laws.


Does that only hold if the data collection contains PII and isn't considered necessary for the product?

Either way I expect Manjaro's collection would be an issue if it's opt-out, just curious how those edges of that law are defined.


I don't know how American data protection laws work in this sense, I've only read up on the GDPR. I don't think American data protection laws are any more strict than their European counterparts though.

You don't need to share this information for Manjaro's software to do its work so it's not necessary for the product. If it's strictly necessary, they may need to inform EU users, but don't need consent.

The edges of the law are pretty sharp. There are a few reasons for which data may be collected without consent, and "I want to see what kind of computers visit my website" isn't one of them. Most of the time, you'll need explicit consent (can't hide consent in the EULA or T&C).

This goes for anything containing PII. And, for the record, an IP address is considered PII in many cases. Pseudonyms also don't protect you.

Even with consent, collecting PII like this also adds a ton of extra overhead (suddenly you need to encrypt your database, serve information/correction/deletion requests from the people you've collected data about, you may not be allowed to host such data in the US, etc.) to the point I wouldn't even bother collecting this info from EU users. Foreign companies break the GDPR all the time and very few of them ever get fined, but when it comes to communities trying to do the right thing, the GDPR rightfully succeeds in making data collection expensive.


Manjaro doesn't have region-specific ISOs, so it sounds like this will end up being the global policy. However, international compliance isn't something every developer is aware of, so it may take time before the project releases a compliant version.


IMO asking for consent (or not collecting data at all) is always the right move, regardless of legal obligations. Might as well just ask everyone for consent.


This is the morally correct thing to do, but it does result in selection bias for any statistics gathered. It's hard to figure out a way to get good data, but users' rights must be respected.


Somehow, before the wide availability of constantly connected Internet, software got made. Perhaps constantly collecting data on your users is not required after all.


If your competition is collecting user data and you aren't, then they have a competitive advantage in understanding where to make future development investments.

It's really best to just kill the arms race and restrict data collection.


You can fight back by exposing how much data the competition collects. I buy devices that collect less data as a choice. Many others do too.


You don't get any meaningful stats from opt-in. Might as well not collect any data at all.


I agree. And as was said in a comment by the author in the thread:

    > True, that. I wasn’t even thinking about the GDPR when I wrote that. :man_facepalming:


I will never, ever understand how Manjaro has the audience it has, including SBC manufacturers, etc. They don't work with the community particularly well, and they have a horrible, awful track record of distro stewardship.

And now this.


I used to use vanilla Arch. I really liked it, didn't mind the install process, and was reasonably happy with package stability. But there were a couple times when Arch's bleeding edge rolling package releases caused issues for me. I tried Manjaro partly because they update packages after a bit more testing, and I've never had any issues like that with them. I'm ok with being a month or two behind Arch in exchange for more stability. Now I'm thinking of going back to Arch tho...


Installing Arch is unnecessarily difficult for most people. Manjaro is Arch without all the hassle. All the "cool" people seem to have switched to Manjaro's successor, EndeavourOS, though, which does the same thing but with a rebrand.

I've never had trouble with Manjaro because of anything Manjaro caused. The occasional error when updating because the AUR is ahead of the distro repo is never a problem. The worst problem I've ever caught them on was "forgetting to renew their HTTPS certificate that one time".

Manjaro has a tendency to fork and ship stuff that works well enough. They don't wait for the last two or three years in the long tail of development before general availability. That makes their software rather unstable on some platforms (mobile, Apple ARM) but on the other hand, their software is actually available, not just a tarball and a makefile.

The biggest issue Manjaro brings is the flood of users who don't know what they're talking about coming to forums like Arch Linux trying to solve their instability issues. Once Manjaro fully dies, those people will still exist, they'll just ask about "how to fix libxyz on endeavour" instead of "how to fix libxyz on manjaro".


The occasional error when updating from AUR because of Manjaro's arbitrarily updated system stack is absolutely a problem, as evidenced by the occasional error.

EndeavourOS is not the same as Manjaro; it uses Arch's binary repositories directly. The answer to "how to fix libxyz on endeavour" is exactly the same as "how to fix libxyz on arch", which means that the Arch wiki, one of the best Linux resources in general, directly applies. I straight up think Manjaro is a waste of time for everyone; it doesn't really do anything better and they've handled so many things so much worse.


It wasn't once, it was multiple times, and at least once, they recommended clock-rollback as a workaround.


installing arch is just partitioning a drive and pacstrapping

it is in NO way unnecessarily difficult

there's even an archinstall script for the lazy

just because there is no gui installer != unnecessarily difficult


Because Archinstall is scary, even if you could train an elementary-schooler to use it. Manjaro just repackages the same thing with a harder-to-use and more confusing GUI which people are more willing to give a spin.


> Manjaro just repackages the same thing with a harder-to-use and more confusing GUI which people are more willing to give a spin.

Lol, if it was harder to use people would use archinstall instead.


I disagree. Archinstall has better disk partitioning tools, more options and a more logical and clear setup menu. The reason people do not use it and prefer Manjaro instead is because it has a GUI, despite the fact that it is objectively a worse interface.

I say this as a former Arch/Manjaro user that had to reinstall his system every 2 months. Once you get over the "text on a screen is scary" phase it's immediately clear how Archinstall is a better installer. To be clear I consider neither OS suitable for daily driving with so many better alternatives out there, but Archinstall is feared by many for practically no reason.


Well maybe after decades of evidence developers will come to grips with the fact that, for most users, a GUI is always better than a text interface.

Or you can keep shouting into a void, your choice.


If you are a normal person, you should not be installing Arch Linux.

Manjaro doesn't change that by making it easier to install; it baits users in with AUR packages and inevitably disappoints them because it breaks your system even more often than Homebrew does.


Better in what metric?

The problem with users, of pretty much anything, is that they're stupid. Even me! I am stupid. When I buy a car, I'm not so worried about the drive train or the complexity of the transmission or how viscous the oil it uses is. Because that I don't understand. Pretty colors, sharp lines, and a good price make my decisions. But those are surface level.

GUIs may attract more users, perhaps that's a metric they're better in. That doesn't make them faster to use, or even easier to use! Users greatly skew towards familiarity, but typically familiarity in surface-level ways only.


I first used Arch over ten years ago and nobody told me about archinstall until a few weeks ago! I was using Ubuntu out of laziness because I was tired of manually installing Arch every time I got a new machine (until now, I use Arch (again, btw)).

I wonder how many Manjaro users are like me, and just don't realize that Arch ships an installer now.


> I first used Arch over ten years ago and nobody told me about archinstall until a few weeks ago!

It wasn't really an official option ten years ago; https://archlinux.org/news/installation-medium-with-installe... puts it in 2021 (though that's just when it was added to official install media; I assume it was available in some capacity before that)


Archinstall fckd up my EFI partition for dual-boot. Had to reinstall windows. It was probably my fault, but Manjaro just worked first time.


Isn't EndeavourOS quite a bit better? It's ranked higher on DistroWatch.com and its Visitor Rating is 8.48 compared to Manjaro's 8.07.


DistroWatch ratings are completely pointless and meaningless.


The rankings are but the ratings aren't.


It does seem like this collects a lot of unnecessary information. I know it may be useful to know some of the statistics and I don't think there is a problem with the general statistics per se (i.e. kernel version, CPU, GPU - things that are common enough that they won't easily identify a user). However, it looks like it also records a bunch of information on installed packages which, given how uniquely people set up their systems, could easily be an identifying data point.

I think this might've been an easier sell if they kept it to general data and used it for the purpose they originally stated - counting the number of users of Manjaro - as the depth of information they send is unnecessary for that.


This is so user-hostile. I do not want a spyware in an OS. All telemetry should be opt-in and voluntary.

Also, what would be the use of the telemetry? Find a reason to close a bug report with WONTFIX?


I agree it should always be opt-in, but there are plenty of reasons this data could help them make Manjaro better.

Their install process likely gives them data already on things like user hardware, Intel vs AMD for example. Telemetry could help find performance issues though, like if one processor type is always slow to boot or if AMD graphics cards have frame rate issues with certain tasks.


> but there are plenty of reasons this data could help them make Manjaro better

Then they should ask for a permission first.

> Telemetry could help find performance issues though

You don't need to have telemetry enabled for everyone, just find volunteers who are interested in contributing data, or make a performance test app and let users submit the report.

Also, as I understand, they don't collect framerate in current version.


I don't disagree, it should be opt-in.

> Also, as I understand, they don't collect framerate in current version.

That seems reasonable, I haven't used Manjaro for a couple years so I didn't take the time to see exactly what they want to track. That's just an example though, they wouldn't bother collecting the data if there is actually no value to them collecting it. I haven't seen anything about them selling data, so I'd expect the value is in improving the product.


Data collection from users used to drive decisions like UI/UX and product development. Because companies were actually interested in how people were using their products.

These days, however, I don't trust any telemetry even from the most trusted companies.


You need look no further than your favorite issue tracker. People can be vocally begging for X to be implemented and nothing happens. Why would I think telemetry will change the situation if the PM in charge does not want/care to do it?


Perhaps their partners and sponsors want it.


> The less I know about users of the software I write, the better.

This is one of my favorite completely irrational HN takes that is guaranteed to show up in a thread like this.


The less I know about users of the software I write, the better.

> we need at least some data about how Manjaro is being used by so many people around the world in order to show that the project has a future and also to plan for that future.

Charitably, this implies they want to plan around things like infrastructure scaling. Why can't they just look at present demands on their infrastructure? (i.e. why bother with a proxy metric when you have the real metrics right there?)


This can also be useful if you are looking for funding e.g. grants etc. to show how relevant your project is.

Just to add a different interpretation: To me this does not imply planning on infrastructure at all.


> The Manjaro project is backed by Manjaro GmbH & Co. KG, an open source driven company.

I'm guessing it's more likely they are looking for investments and need to show some credible numbers.


> we need at least some data about how Manjaro is being used

No, you don't. It is my computer, not yours.


And it's their software, not yours. They are free to require telemetry if they choose to do so. You're also free not to use their software if you don't like it.

FWIW, I think it should be opt-in by default, but I think it's reasonable (aside from adhering to necessary privacy laws) for a project to choose the policy they want.


> And it's their software, not yours.

Arguably, the point of open source software (or at least GPL licensed software) is that the owner of the hardware gets to "own" the software and can do what they like with it.


In that scenario they certainly can. They just have to take the source, modify it to suit their needs, and recompile it. They are 100% within their rights to do that.


I would have thought they'd find details on different architectures or specific chips of more interest. Personally, I don't mind allowing telemetry to open source projects, but it should always be opt-in rather than on by default.


> The less I know about users of the software I write, the better.

Said no successful project ever.


Isn't the Windows release cycle based on slapping a new clunky interface on top of the old one, independently of what users want or say?


They can and should count users only by updates.

Make a package that is required and only changed upon each release (containing for instance /etc/os-release) and count how many distinct IP addresses download it.
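A rough sketch of that counting step, assuming the server writes access logs in the common log format (client IP in field 1, request path in field 7). The function name and the package path fragment are made up for illustration; this is not Manjaro's actual setup.

```shell
# Count distinct client IPs that requested a path containing a given fragment.
# Assumes common/combined log format: IP is field 1, request path is field 7.
# count_distinct_ips and the example fragment are hypothetical.
count_distinct_ips() {
    path_fragment="$1"
    awk -v frag="$path_fragment" 'index($7, frag) > 0 { print $1 }' \
        | sort -u \
        | wc -l \
        | tr -d ' '
}

# Usage: count_distinct_ips "release-marker" < /var/log/nginx/access.log
```

Repeated downloads from one address collapse into a single count, which is the point of deduplicating on the IP field.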


They don't host all of the mirrors themselves so they'd be missing out on most of the data that way.


They can host all the default mirrors themselves, either by serving the actual data (with a CDN) or by redirecting requests to a 3rd party mirror by GeoIP.


Which would be an insanely expensive way to collect very little usable information. Mirrors are run for free by volunteers, a worldwide CDN to replace free mirrors would be prohibitively costly for a relatively small distro like Manjaro.


Presumably they can use Cloudflare or similar for free or relatively cheap, or perhaps find someone willing to gift CDN service to a Linux distribution.

Redirecting the requests to the closest existing mirror also obviously drastically reduces bandwidth requirements and should presumably be doable with a few (for redundancy) colocated dedicated servers running efficient server software.


Wouldn't the cost of the service scale basically linearly with the user base? So a small distro would have smaller costs?

I agree with the core point that tracking users via package updates is a bad idea, but I don't think this specific argument is very great.


> distinct IP addresses

NAT would like to laugh at you


It's an approximation, but if you don't do that, someone creating and updating a container in a loop (or intentionally running wget in a loop) will skew the statistics arbitrarily.

I don't think there is a more precise way to do it without compromising privacy. Obviously if you are willing to compromise privacy you can send information such as whether the distribution is running in a container or VM and the NIC MAC address and a persistent identifier, but that would be unacceptable since it is not in the interests of users to send that.


> > distinct IP addresses

> NAT would like to laugh at you

Not just NAT. On many ISPs, power cycling the CPE changes the IP address.


Manjaro has been a trash arch fork for some time. https://endeavouros.com would be a better choice these days.


What makes it better? I have yet to notice any differences between the two.


If they just want to count unique installs, then the comment on that thread I believe partially works:

    set -x
    stat / | grep -i "birth:" | awk '{print $2}'
Then use the output piped to md5sum to make a user-id.

And then finally use an opt-in on demand ephemeral instantiation of Tor to submit the results so they can not get the real IP if they wanted to. In my opinion all telemetry should be opt-in and provide a text/plain preview of what is going to be submitted ahead of time. This gives the system owner a chance to back out of posting the telemetry should they see something sensitive. set -x to show an audit trail for what commands were executed in plain text.
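Roughly what the hash-and-preview part could look like, as a sketch only. The function names are hypothetical, and it assumes GNU coreutils (`md5sum`, and `stat -c %W` for the birth time in seconds). Note that the `awk '{print $2}'` one-liner above keeps only the date portion of the birth time, so installs made on the same day would share an ID; hashing the full timestamp avoids that.

```shell
# Derive an opaque install ID from a filesystem birth timestamp and show a
# plain-text preview before anything is sent. Function names are hypothetical.

# Hash an arbitrary string (here: the birth time of /) into a hex ID.
make_install_id() {
    printf '%s' "$1" | md5sum | awk '{ print $1 }'
}

# Print exactly what would be submitted, so the owner can review it first.
preview_report() {
    printf 'install_id=%s\n' "$(make_install_id "$1")"
}

# Usage on a GNU system (stat -c %W prints 0 if the birth time is unknown):
#   birth=$(stat -c %W /)
#   preview_report "$birth"   # inspect, then submit only after explicit opt-in
```

The submission step itself (Tor or otherwise) is left out on purpose; the sketch only covers generating and previewing the ID.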


    md5sum /etc/fstab
might be viable too, for GPT systems that contain randomised volume IDs (don't think mine's changed since install). Although then there's a chance it could persist between installs if someone decides to re-use their existing partitions, which would be creepy.


If you want easy Arch, there's EndeavourOS. It's rolling instead of "stable", but in my experience Manjaro's actual stability is overrated; an update straight up trashed the bootloader once.


I've had a manjaro update break an install 3 times in a several year span. Ironically, given the vetted package system, it was the only Linux distro to ever break an install for me.

Since switching to endeavour a couple years ago I've never had an update break anything.


I like RebornOS. I also got frustrated with Manjaro; I didn't have it installed for more than a day. I immediately ran into package conflicts trying to set it up because their repo hadn't been updated with the versions that AUR packages depended on.


Been using EndeavourOS after stability issues with Manjaro...


In addition to that, my experience is that Manjaro is terribly slow. No idea why.


It's simply illegal in many countries to make this opt-out. For such data coming from your OS you need an OPT-IN!


Is it though? Microsoft .NET has telemetry that you always have to opt out of. Dark patterns come with the territory (of software vendors not respecting the user much): the setting not sticking and being overridden after an update, and of course the shell command that you kinda have to google each time, where you set a parameter to "1" and get no verification that you have indeed successfully disabled telemetry.
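For reference, the opt-out Microsoft documents for the .NET SDK is an environment variable. A minimal sketch; where you persist it (a profile file, a system-wide environment file) is an assumption about your setup:

```shell
# Opt out of .NET SDK/CLI telemetry for this shell and its child processes.
# DOTNET_CLI_TELEMETRY_OPTOUT is the variable documented by Microsoft;
# persisting it in a file such as ~/.profile is an assumption about your setup.
export DOTNET_CLI_TELEMETRY_OPTOUT=1

# Confirms only that the variable is set, not that telemetry is actually off.
printenv DOTNET_CLI_TELEMETRY_OPTOUT
```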


I’m curious, how does that apply to open source projects? Who would they go after for redress in large distributed communities like Manjaro?


I'd guess that they'd target the system that collects the data as usually that isn't distributed (I'd be more concerned if telemetry was collected and then available for anyone to peruse rather than just the Manjaro organisation itself).


I don't think the project will be able to survive the decision to turn on telemetry by default.

There are many different distros just like Manjaro.


Every shitty move, no matter how bad for all involved - is blindly copied by open source. If Google decided to distribute heroin and spoons to all developers, cause thats "what sherlock did" - one day later all of open source would chase the pink dragon.


Glad I left for Debian + make/checkinstall.

Manjaro is one of the messiest distros I’ve used.


How confident are you that systemd isn't sending telemetry from your machine?


The Debian project would patch that out if it were ever implemented.

Here's a list of package telemetries they couldn't, or wouldn't, disable:

https://wiki.debian.org/PrivacyIssues#Privacy_issues_in_Debi... ("Privacy issues in Debian packages")


Fairly as I don’t use systemd.


If you don't mind me asking, what init manager are you using instead?


openrc


> Until now what has been done, was counting systems via ping.manjaro.org. These pings are sent from Manjaro systems via the NetworkManager.

I really hope that's not the default out of the box behaviour.


Most operating systems do this. Ubuntu uses http://connectivity-check.ubuntu.com, the Arch wiki suggests http://nmcheck.gnome.org/check_network_status.txt (used to default to http://www.archlinux.org/check_network_status.txt), Debian uses http://network-test.debian.org/nm if you install `network-manager-config-connectivity-debian`. Of course Apple, Google, and Microsoft have servers of their own.

You can disable the check, of course, or upload a text file somewhere you control to maintain the connectivity check without sharing your IP address with your distro.
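A sketch of both options as a NetworkManager drop-in. The `[connectivity]` section and its `uri` and `interval` keys come from NetworkManager.conf; the function name, file name, and example URL are hypothetical. As I understand it, the file at your own URL should start with the text "NetworkManager is online" unless you also set `response`.

```shell
# Write a NetworkManager drop-in that either disables the connectivity check
# or points it at a URL you control. Function name, file name and example URL
# are hypothetical; [connectivity], uri and interval are NetworkManager.conf keys.
write_connectivity_conf() {
    conf_dir="$1"   # normally /etc/NetworkManager/conf.d
    uri="$2"        # empty string means "disable the check"
    if [ -z "$uri" ]; then
        # interval=0 turns connectivity checking off
        printf '[connectivity]\ninterval=0\n' > "$conf_dir/20-connectivity.conf"
    else
        printf '[connectivity]\nuri=%s\n' "$uri" > "$conf_dir/20-connectivity.conf"
    fi
}

# Usage (as root), then restart NetworkManager:
#   write_connectivity_conf /etc/NetworkManager/conf.d "http://example.org/check.txt"
```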


To be fair, Arch's base NetworkManager package does the same thing:

https://wiki.archlinux.org/title/NetworkManager#Checking_con...

Though Arch has a decidedly more privacy-friendly stance on their ping endpoint: no logging at all.


It sounds like it is. What's wrong with that?



