Another interesting technique to fingerprint users online is called GPU Fingerprinting [1] (2022).
Codenamed 'DrawnApart', the technique relies on WebGL to count the number and speed of the execution units in the GPU, measure the time needed to complete vertex renders, handle stall functions, and more.
Browsers should come with a default software renderer and behave as they do for the mic and camera, where the site needs user permission to unlock the hardware GPU render path.
I sometimes unironically say that JavaScript is a privilege that should only be granted to websites that actually need it. Most of the web is text and images. No Turing-complete client-side runtime environment is required to display that.
But I would also accept all those multimedia APIs (canvas, WebGL, WebGPU, everything audio and video, including the <video> tag) and some others (e.g. service workers and everything else app-like) requiring a permission. Again, most websites don't need them, so given the abuse potential, there's no reason why they should be openly available.
You have NoScript to block all that, but it breaks even the simplest sites these days. Part of it is legitimate, like responsive design (though most of that can be done with CSS these days).
But most of it is bullshit tracking, anti-scraping and similar stuff.
Responsive sites could’ve been done with CSS a decade ago too. IIRC even IE6 has some support for flexbox and media queries. But people would rather pick up React and have a pile of JS do it for them.
You could just fingerprint the CPU then; every CPU behaves differently. Buy any number of the same CPU and you’ll see different aspects in every one of them.
The point is that the "they" who abuse this and the "they" who use it for legitimate reasons usually aren't the same people, and so the "they" who abuse this have no incentive not to out of some concern about their legitimate uses being curtailed.
this is of course the ideal, but it is somewhat not to the point; as vidar says, the 'they' who are fingerprinting you have only a limited intersection with the 'they' who are doing awesome things with webgl like shadertoy or https://mitxela.com/projects/model-viewer, which doesn't even have google analytics
a somewhat bigger problem is that to a very significant extent the actual owners of the machines are microsoft, google, and apple, not the users; they make the rules, and the users are lucky if the owners allow their code to run at all. under those circumstances, blocking fingerprinting is practically quite difficult, because the 'they' who want to fingerprint you and the 'they' who make the rules about what code runs on your machine are the same people, not two opposing groups
an additional problem is that an increasing part of the web is run by criminal elements like harvey weinstein and the rest of the mpaa, who will block you if they can detect you attempting to protect your privacy from them by blocking fingerprinting, even if apple decides it would be a good idea; cloudflare and google are perhaps the most prominent enforcers here, perhaps somewhat reluctantly
If you consider that watt hours is just a convenience unit for (3600) joules, then “1 gigawatt hour each day” correctly should be “3600 GJ/day” which works.
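For anyone who wants to check the unit arithmetic, here is a minimal sketch in Python. The function names are made up for illustration; the only facts used are that 1 Wh = 3600 J and that a day has 86,400 seconds.

```python
# 1 watt-hour is 1 watt sustained for 3600 seconds, i.e. 3600 joules.
JOULES_PER_WATT_HOUR = 3600.0
SECONDS_PER_DAY = 24 * 60 * 60


def gwh_to_gj(gwh: float) -> float:
    """Convert gigawatt-hours to gigajoules (the giga prefix carries over)."""
    return gwh * JOULES_PER_WATT_HOUR


def gwh_per_day_to_average_mw(gwh_per_day: float) -> float:
    """Average power in megawatts for a given energy use in GWh per day."""
    joules_per_day = gwh_per_day * 1e9 * JOULES_PER_WATT_HOUR
    watts = joules_per_day / SECONDS_PER_DAY
    return watts / 1e6


if __name__ == "__main__":
    print(gwh_to_gj(1.0))                            # 3600.0
    print(round(gwh_per_day_to_average_mw(1.0), 2))  # 41.67
```

So "1 GWh per day" is an energy rate of 3600 GJ/day, or a bit under 42 MW on average, whereas "gigawatts per day" is not an amount of energy at all.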
Hardware implementations of things like graphics routines can be hundreds of times more efficient than software implementations running on general purpose CPUs.
Try to decode mpeg video at HD resolution in software sometime.
Native GPU video decoding (and CSS and canvas GPU acceleration) can work without problems even if only WebGL is deactivated, which is what this was about.
You can probably use those to fingerprint too, though probably not as precisely.
> Try to decode mpeg video at HD resolution in software sometime.
Firefox only started shipping hardware-accelerated video decode a few versions ago. Until very recently, all my video playback was software decoded.
Surely the user will be taught to enable the hardware for video if they start seeing stutter. Or the browser can prompt the user to switch to "high-end graphics" if it detects prolonged video decoding.
If a website has no obvious case for using the GPU but is instead using it to fingerprint, then the user won't experience any slowdowns from a software renderer (as the fingerprinting is usually done relatively quickly).
If a website needs the GPU for their videos/graphics, but also incidentally wants to fingerprint you, you're shit out of luck in that case. But this is no worse than what we have current day.
> Do you have any concept of how many gigawatts per day that would waste?
It would still be way less than what large companies are burning on training proprietary LLMs. Do you think the ChatGPT model you use daily was a success on the first go? And in that same world, consumers should not even try to protect themselves from GPU fingerprinting?
> The same people who have spent the past 40 years getting confused and worked up about cookies?
Stop with the condescension. It's not about being confused; it's about mitigating genuine privacy concerns. We're not idiots, and dismissing genuine worries won't make the issues disappear.
If the user experiences stuttering while decoding a video, they won't learn to enable special permissions for the website but instead switch to a different browser that hasn't yet implemented this "feature".
And most websites most users visit will need the GPU to be remotely usable. For them enabling specific permissions for every website they visit is very inconvenient.
Perhaps a middle ground: default to software rendering when in Private Browsing mode, because obviously you want to be private, and default to hardware acceleration otherwise.
LibreWolf does this, actually: it initially blocks websites from using WebGPU (and canvas) by default and then gives you a popup to grant them permission.
What's the point? The capabilities of browsers are so vast they'll just find other ways to fingerprint.
Privacy in browsers is a lost cause. It's a 30+ year old technology that has become ridiculously bloated in scope, with privacy and security only considered as an afterthought.
I feel like these days (especially given the recent focus on side channel attacks) it is basically a given that adding uniform noise to something that leaks data does not work, because you can always take more samples and remove the noise. Why did Safari add this? I understand that needing more samples is definitely an annoyance to fingerprinting efforts, but as this post shows it's basically always surmountable in some form or the other.
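To make the "take more samples and remove the noise" point concrete, here is a minimal simulation sketch in Python. It is not Safari's actual scheme: the true value, the noise amplitude, and the assumption of fresh uniform noise on every reading are all hypothetical, chosen only to show how averaging converges.

```python
import random
import statistics


def noisy_reading(true_value: float, amplitude: float, rng: random.Random) -> float:
    """One reading with fresh uniform noise drawn from [-amplitude, +amplitude]."""
    return true_value + rng.uniform(-amplitude, amplitude)


def averaged_estimate(
    true_value: float, amplitude: float, samples: int, rng: random.Random
) -> float:
    """Average many independent noisy readings to estimate the hidden value."""
    readings = [noisy_reading(true_value, amplitude, rng) for _ in range(samples)]
    return statistics.fmean(readings)


if __name__ == "__main__":
    rng = random.Random(0)
    hidden = 124.04347  # hypothetical stable per-device value
    for n in (1, 100, 10_000):
        estimate = averaged_estimate(hidden, 0.001, n, rng)
        print(n, abs(estimate - hidden))
```

The error of the average shrinks roughly with the square root of the number of samples, so zero-mean noise only raises the cost of fingerprinting; it does not remove the leak.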
A lot of Apple's "privacy" features nowadays are marketing. It's privacy theater. What matters is whether they can tell a plausible story to the public, not whether it is technically effective.
That's a wild accusation to make without citations.
It doesn't even apply in this instance, since Apple's work on fingerprint resistance still results in real privacy improvements even when later shown to be imperfect. It means Apple has to improve what they've already done, not that what they've done so far is mere "marketing" or "theatre".
> That's a wild accusation to make without citations.
Shall I cite my list of CVE? Or perhaps it would be more interesting to cite my list of unfixed 0days.
> It doesn't even apply in this instance, since Apple's work on fingerprint resistance still results in real privacy improvements even when later shown to be imperfect. It means Apple has to improve what they've already done, not that what they've done so far is mere "marketing" or "theatre".
What does it say about Apple engineering that they keep shipping features with very obvious and/or predictable flaws?
I haven't seen marketing related to audio fingerprinting protection. Maybe Hanlon's applies here.
As for your point about the pattern of vulnerabilities: I'd attribute this to being closed source. They keep shipping security features with limited auditing, and only discover flaws in production.
In general, Apple is trying to market itself as the privacy company. "What happens on iPhone stays on iPhone", yadda yadda.
> Maybe Hanlon's applies here.
I think my view is in alignment with Hanlon's razor. I don't think it's necessarily malicious deception. Rather, Apple has a habit of shipping the laziest implementations and slapping a "privacy" label on them, but the public doesn't know that these are lazy half-measures.
> As for your point about the pattern of vulnerabilities: I'd attribute this to being closed source.
WebKit is open source.
> They keep shipping security features with limited auditing, and only discover flaws in production.
I don't think this is a closed/open source issue. It's just bad engineering.
Bad engineering yet state of the art. What are Chromium’s protections against web audio fingerprinting?
In the game of tracking, minor hurdles are great at stymying many actors.
And finally, your citation in response to someone saying they haven’t seen Apple market web audio fingerprinting protections has no references to said feature. Are you saying all the privacy features in that press release are a smokescreen? It’s quite unclear.
> What are Chromium’s protections against web audio fingerprinting?
I'm not aware of any. But they aren't advertising fingerprinting resistance either.
> In the game of tracking, minor hurdles are great at stymying many actors.
That's questionable.
> And finally, your citation in response to someone saying they haven’t seen Apple market web audio fingerprinting protections has no references to said feature.
There were multiple antifingerprinting methods in Safari 17. The linked articles referred to them collectively.
>> In the game of tracking, minor hurdles are great at stymying many actors.
>
> That's questionable.
It's basically indisputable. Ask any online advertising buyer about the effectiveness of audience targeting for Safari users versus the competition. Or consider the ability of the average website operator to adopt Fingerprint.js instead of whatever half-broken tool their usual audience measurement provider offers them.
> In general, Apple is trying to market itself as[...]
And I haven't heard any argument suggesting that the marketing is deceptive. Apple implemented numerous fingerprinting protections and nobody has demonstrated any of them to be "theatre" or mere "marketing", only that a security researcher was able to defeat one protection among many (and then published their work so Apple can solve for it in the next release). In reality, ALL such work is an ongoing battle between developers and security researchers.
In each subsequent reply you are shifting your stance in order to deflect away from this original claim, essentially by holding Apple to an impossible standard where anything less than perfection on the first try is equivalent to scamming the public with lies. Do you want to defend your original claim that "a lot of Apple's privacy features nowadays are marketing" and "privacy theater"?
“Apple’s quantity and resolution rate of security bugs undermine its privacy marketing” and “Apple’s privacy marketing is a lie” are two very different claims, and it seems like you meant to make the first. Even that claim though is unsupported since Safari users are definitely harder to track across the web in practice than Chrome users.
> Shall I cite my list of CVE? Or perhaps it would be more interesting to cite my list of unfixed 0days.
The list of vulnerabilities is not very informative for the same reason a trackers blocked statistic is not. It doesn’t give any baseline for comparison and may just be a reflection of how important and interesting to security researchers the target is.
I think you missed the point of my comment. When I said my list of CVE and my list of unfixed 0days, I meant that literally: CVE attributed by Apple to me, and unfixed 0days that I personally discovered. I wasn't making a "wild accusation".
No I understood exactly what you meant. The number of reports is not helpful data without a lot of other context, but you offered it as if it would be convincing or definitive. How many CVEs and 0-days have you filed against Audacity? Is it because that software is security bug free?
That's a rather bold claim, unless you're a mind-reader.
> The number of reports is not helpful data without a lot of other context, but you offered it as if it would be convincing or definitive.
I didn't give a number. I only said I have a list. It seems that you're still missing my point, which was simply that my knowledge of and experience with these specific technologies means that my original comment was not a "wild accusation". That's it, that's the whole point.
> How many CVEs and 0-days have you filed against Audacity?
I don't use Audacity, and I have no idea how it's relevant here.
> That's a rather bold claim, unless you're a mind-reader.
It seems like you have me confused with someone else in the thread who used the phrase "wild accusation" and are responding rudely. I think your original comment was needlessly exaggerated and inflammatory and defending it, instead of clarifying it, is a bad look. Clearly you have an axe to grind with Apple, and my advice to you is you should put a little more effort into hiding it if you want others to take you seriously.
> It seems like you have me confused with someone else in the thread who used the phrase "wild accusation"
No, I'm not confused. But that comment was the context for my mentioning CVE and 0days, which you decided to discuss yourself.
simondotau: "That's a wild accusation to make without citations."
me: "Shall I cite my list of CVE? Or perhaps it would be more interesting to cite my list of unfixed 0days."
you: "The list of vulnerabilities is not very informative for the same reason a trackers blocked statistic is not."
If you don't want to discuss my previous quoted comment, that's fine, but you have in fact mentioned it and continue to mention it. Thus, the context is very relevant.
> and are responding rudely.
Where exactly was I rude?
> I think your original comment was needlessly exaggerated and inflammatory and defending it, instead of clarifying it, is a bad look.
I'll respond to that comment, though it may take some time.
> Clearly you have an axe to grind with Apple
I've been a Mac user for more than 20 years, a professional Mac developer for more than 15, and I currently sell apps in the Mac App Store and iOS App Store. Do I have critiques of Apple? Yes, of course. However, they are the critiques of an insider who has no intention to leave the ecosystem.
You're throwing around your ego and responding with alleged claims of personal expertise when this discussion has nothing to do with that. You're deflecting with puffery, which is irrelevant with respect to your original claim, which I questioned. I don't care that you have a list and I don't care how long it is. You accused Apple of engaging in privacy theatre, and said that "many" privacy features were mere "marketing". Defend that claim or move on.
The term “security theater” has a specific meaning which is not a bug or less than perfect protection: per its creator, “Security theater refers to security measures that make people feel more secure without doing anything to actually improve their security.”
Private Relay is obviously not accurately described by that term, and any rule which classifies it as such would be useless because it would classify all browser security as theater: everyone has had bugs, and everyone has had to adopt more sophisticated defenses to counter more sophisticated attackers.
Yes, and privacy theater is clearly an attempt to apply the same concept to a closely related topic. I edited my comment to focus on the problem here: you started with this absurdly sweeping claim which you’ve been unable to meaningfully substantiate throughout the thread. Trying to dismiss something as theater based on a bug fixed in the beta period is not only self-contradictory (you’re tacitly admitting that it’s not theater now) but also almost useless as a heuristic because very few products never have bugs.
Now if we want to talk about guidelines, consider that the broad claim you originally made would have to be widely accepted in the industry not to need supporting evidence, at which point it wouldn’t be contributing anything; since the opposite is true, the guidelines about flame bait cover it. It could have gone in a potentially useful direction if you’d been willing to define your terms and support them with evidence, and that would have helped suggest less hyperbolic terms. For example, if you said that Apple could do better at vetting and implementing their features I doubt many people would disagree with you.
Built-in tracker blocking and the various ways Safari makes it hard to share user sessions with 3rd parties absolutely have a real effect. They also have real costs: part of Safari’s poor compatibility reputation comes from websites that are broken by its tracking prevention features. This is why Google claims they haven’t rolled out the same. If Apple only cared about the problem at a superficial level, why wouldn’t they do the same as Chrome and talk a big game about the problem but continuously delay changes?
I’d say the privacy report is the only real false security feature, but Apple was a laggard in that market. For all we know, they could have been trying to match features with Ghostery or Brave that teach consumers this is a feature you should expect from your browser. Users may also have needed education about that behavior in order to justify the compatibility regressions cookie blocking incurs. It’s impossible to know from the outside, but your body of evidence to support a really strong accusation is quite weak.
> If Apple only cared about the problem at a superficial level, why wouldn’t they do the same as Chrome and talk a big game about the problem but continuously delay changes?
If Safari behaved the same as Chrome, then Apple couldn't market Safari as more private than Chrome.
This is obviously untrue. People accuse Apple of marketing differences where none exist all the time. Thus the trope "X did it first" or "Y on Z is basically the same."
Well I’ll add to my response that I think it is delusional to think that (security) features ship bug free. It is a bar that _nobody_ can or has met. It is not how the software world works at large.
Do you have something more recent than a leak from over 2 years ago that has long been fixed? I'm curious why iCloud Private Relay is theatre at the moment.
And a VPN leaks a lot of information about your network activity to the operator, so by your standard it is privacy theater. Do you see why you’re coming across as having inconsistent standards and thereby perhaps an axe to grind?
iCloud Private Relay is used for all network activity from Safari which does not seem like a “limited amount of activity.”
> And a VPN leaks a lot of information about your network activity to the operator, so by your standard it is privacy theater.
A VPN isn't designed to keep your IP address hidden from the operator. iCloud Private Relay doesn't hide your IP address from Apple either. That's not the point, and everyone knows this in advance. The point is to keep your IP address hidden from the request destination servers.
Your logic is that any flaw in an implementation renders it useless. In the case of VPNs, operators can and do share information about clients to destination servers, law enforcement, and more out of band. Just because it involves a spreadsheet and not a WebRTC request does not mean it can be forgiven if you're going around making absolutist claims regarding efficacy.
> Please spell out the implication of that claim for me then.
I'll spell out my views below, but I want to start by noting that I don't agree with the way you've characterized them. Going all the way back to your initial reply, I don't like the way this leading question was phrased:
> What specific features do you allege exist just to mislead the general public?
I think Hanlon's razor is a false dilemma. With a big company like Apple, there's typically a combination of bureaucratic incompetence and marketing exaggeration. Clearly, Apple leadership has decided to make privacy a consumer differentiator for their products, so they have a financial incentive to hype privacy features as much as possible. As a consequence, Apple management would be eager to be pitched any and all privacy features from engineering; these may even lead to bonuses and promotions, though that's purely speculation on my part. Regardless of the personal motivations of employees, the company is pursuing privacy features in earnest and isn't intending for them to be fake. Nonetheless, the company also has the unfortunate habit of shipping half-baked features and implementations. This is driven largely by the artificial, forced march of the annual release schedule, which demands that great new features be continually announced at a certain time, whether they're ready or not. The situation is not unique to privacy features either; Apple's entire software product line is suffering in quality. Engineering simply doesn't have enough time to do things right, which results in new features that are superficial and/or flawed. You could say it's marketing-driven incompetence.
Several commenters have mentioned that all software has bugs, as if that were somehow profound, or as if I were somehow ignorant of software development as a software developer. (I actually had to spend some time fixing a bug before I wrote this reply.) But not all bugs are created equal. From my perspective, a bug that's discovered relatively quickly by someone else is worse than a bug that's discovered only years later, in the sense that it suggests insufficient QA on the part of the developers, who themselves should have noticed the bug before it shipped. And a bug in the primary functionality of a product or feature is worse than a bug in a more obscure part of the software. This is why I'm not impressed by the length of time since a bug was fixed; if a feature or product was shipped with an obvious, fundamental flaw in its main functionality, that's a stain on the reputation of the developers. And if they keep making such mistakes, why should you ever trust them to be competent? No bug fix can fix the bug writers.
I don't want to focus too much on iCloud Private Relay, though. It wasn't what I had in mind when I was writing my original comment, and I don't even use iCloud Private Relay myself. I mostly don't use a VPN, except on rare occasions. I've discussed iCloud Private Relay here only because you asked me about it.
It's been a busy afternoon/evening for me, so I've kind of run out of steam now on this comment, but I promised I would reply.
The essence seems to be that the web audio API has a lot of algorithms that do a lot of math, and every browser has a slightly different implementation, and the exact results depend on the operating system and CPU too. So if you use the web audio API to generate a small signal, all browsers will generate something that's really close, but the tiny differences can be used to help tell them apart.
But why would it vary in ways that are consistent run to run on one machine, but not consistent with the same process executed on another similar machine?
Every datapoint reduces the number of people it could belong to. CPU + browser + browser version + OS + major OS version can narrow it down by a lot.
Then add resolution, IP address location (which VPN they use is also a datapoint), which time they are active at, etc. and you can get a good almost-unique identifier.
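A rough way to put numbers on "every datapoint reduces the number of people it could belong to" is to count bits. The sketch below is only an illustration: the fractions are invented, and adding the bits together assumes the attributes are statistically independent, which real attributes are not.

```python
import math


def bits_of_identification(fraction_of_users: float) -> float:
    """Bits revealed by an attribute value shared by this fraction of users."""
    return -math.log2(fraction_of_users)


def total_bits(fractions: dict) -> float:
    """Sum of bits over attributes, ASSUMING they are independent."""
    return sum(bits_of_identification(f) for f in fractions.values())


def expected_anonymity_set(population: float, bits: float) -> float:
    """Expected number of users who still match after revealing `bits` bits."""
    return population / (2.0 ** bits)


# Hypothetical shares of users who have the same value as you for each attribute.
ASSUMED_FRACTIONS = {
    "browser + version": 1 / 20,
    "OS + major version": 1 / 10,
    "CPU / audio code path": 1 / 8,
    "screen resolution": 1 / 16,
    "coarse IP location": 1 / 500,
    "usual active hours": 1 / 6,
}

if __name__ == "__main__":
    bits = total_bits(ASSUMED_FRACTIONS)
    print(f"{bits:.1f} bits")
    print(f"{expected_anonymity_set(5e9, bits):.1f} users left out of 5 billion")
```

About 33 bits are enough to single out one person among roughly 8 billion, so a handful of individually weak signals adds up quickly.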
Maybe we should make the browser implementations consistent to the point they can't be told apart. Alternatively, we can reduce the precision of the results so that the tiny differences are deleted.
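The precision-reduction idea can be sketched in a few lines of Python. The two "fingerprint" values below are hypothetical stand-ins for slightly different results from two implementations; nothing here is a real browser API.

```python
def quantize(value: float, decimals: int) -> float:
    """Drop the low-order digits that carry implementation-specific differences."""
    return round(value, decimals)


if __name__ == "__main__":
    # Hypothetical results from two slightly different implementations.
    device_a = 124.04347527516074
    device_b = 124.04347657808103
    print(quantize(device_a, 3) == quantize(device_b, 3))  # True

    # Caveat: values that straddle a rounding boundary still differ.
    print(quantize(0.12349999, 3), quantize(0.12350001, 3))  # 0.123 0.124
```

Rounding collapses most nearby values into one bucket, but inputs that land near a bucket edge can still be told apart, so a fingerprinter who controls the input signal may be able to hunt for such edges.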
i think it comes from similar tricks that are played with webgl where there is a lot of entropy that comes from pc videocard drivers and the hardware itself.
it's a shame that browser people have to add noise to audio buffer handling to try and thwart it.
TL;DR different codepaths even within the same codebase (e.g. SIMD variants) can result in subtly different floating point results (iiuc, likely related to the fact that floating point math is unexpectedly sensitive to order of operations etc.)
Floats are deterministic, though (if they weren't, this wouldn't be a workable fingerprinting vector). Reordering of operations (etc.) in a way that would actually change the results needs to be done by human edits, or with compiler options like -ffast-math that explicitly allow the compiler to "break the rules" and make such changes. In either case, the concrete instructions emitted by the compiler will have deterministic behavior (and if they don't, that's a hardware bug).
What word would you use if sin(x) returns a different value on different platforms, or even on a different OS or library version? Sure smells like a function that depends on external state rather than simply its input.
Probably implementation details and compiler optimizations; float addition is not associative, for example. Implementing the same algorithm with the same formulas correctly can still lead to slightly different results.
Floating point addition is not associative, but it is still consistent. Getting different results is usually the result of using alternative algorithms or relaxing standards (that may, for example, reorder terms).
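A quick way to see the order-of-operations sensitivity for yourself, as a minimal Python sketch (Python floats are IEEE 754 doubles). Strictly speaking, IEEE 754 addition is commutative (a + b equals b + a) but not associative: regrouping the same three numbers can change the last bit. The helper names are made up for the example.

```python
def group_left(a: float, b: float, c: float) -> float:
    """Evaluate (a + b) + c, as a simple left-to-right loop would."""
    return (a + b) + c


def group_right(a: float, b: float, c: float) -> float:
    """Evaluate a + (b + c), as a reordered or vectorized path might."""
    return a + (b + c)


if __name__ == "__main__":
    a, b, c = 0.1, 0.2, 0.3
    print(group_left(a, b, c))    # 0.6000000000000001
    print(group_right(a, b, c))   # 0.6
    print(a + b == b + a)         # True: swapping operands changes nothing
    # Deterministic: the same grouping always gives the same answer.
    print(group_left(a, b, c) == group_left(a, b, c))  # True
```

Each grouping is perfectly repeatable, which is exactly why the result works as a fingerprint: it identifies which code path ran, not some random fluctuation.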
I don’t think the web spec implemented by the browser specifies the order of every operation, only the algorithm. If safari and chrome devs implement the audio api based on the spec, there can still be minor floating point differences because of the way they implemented the same calculations. That’s why they can fingerprint your browser versions with this.
That’s what it does: users on the same browser and same hardware should have identical fingerprints. It’s just one of multiple ways to narrow down your fingerprint.
Someone definitely correct me if I'm wrong, but the success of the fingerprinting workarounds here seems to boil down to the following choice wrt handling oscillator anti-aliasing in the Web Audio API spec:
"There are several practical approaches that an implementation may take to avoid this aliasing. Regardless of approach, the idealized discrete-time digital audio signal is well defined mathematically. The trade-off for the implementation is a matter of implementation cost (in terms of CPU usage) versus fidelity to achieving this ideal.
It is expected that an implementation will take some care in achieving this ideal, but it is reasonable to consider lower-quality, less-costly approaches on lower-end hardware."
AFAICT this means that the OscillatorNode output they are exploiting here is almost guaranteed to not be deterministic across browsers (or even in the same browser on different hardware). The non-determinism is based on whatever anti-aliasing method is chosen by the browser (or, possibly, multiple paths within the same browser which could get chosen based on the underlying hardware). This includes changes/fixes to the same anti-aliasing algos.
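To illustrate how two reasonable anti-aliasing choices yield different samples for the "same" oscillator, here is a minimal Python sketch. It does not reproduce any browser's OscillatorNode; the two generators (a naive sawtooth and an additive band-limited one), the frequency, the sample rate, and the sum-of-absolute-values "fingerprint" are all assumptions made for the example.

```python
import math


def naive_sawtooth(freq: float, sample_rate: int, n: int) -> list:
    """Ideal ramp sampled directly: cheap, but aliases badly."""
    return [2.0 * ((i * freq / sample_rate) % 1.0) - 1.0 for i in range(n)]


def bandlimited_sawtooth(freq: float, sample_rate: int, n: int) -> list:
    """Fourier-series sawtooth using only harmonics below the Nyquist frequency."""
    harmonics = int((sample_rate / 2) // freq)
    samples = []
    for i in range(n):
        t = i / sample_rate
        partial_sum = sum(
            math.sin(2.0 * math.pi * k * freq * t) / k
            for k in range(1, harmonics + 1)
        )
        samples.append(-(2.0 / math.pi) * partial_sum)
    return samples


def fingerprint(samples: list) -> float:
    """Collapse a buffer to one number by summing absolute sample values."""
    return sum(abs(s) for s in samples)


if __name__ == "__main__":
    naive = naive_sawtooth(10_000.0, 44_100, 64)
    limited = bandlimited_sawtooth(10_000.0, 44_100, 64)
    print(fingerprint(naive), fingerprint(limited))
```

Both functions are defensible implementations of a 10 kHz sawtooth, yet the buffers and the derived number differ, which is all a fingerprinter needs to tell the implementations apart.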
I don't really understand this choice of relegating anti-aliasing to the browser given that:
a) any high-quality audio app/library will want full control over how the signals they generate avoid aliasing and will not use these stock oscillators anyway, or
b) the kinds of web applications that would accept arbitrary anti-aliasing algos (and the consequent browser-dependent discrepancies therein) probably wouldn't care whether the anti-aliasing algo is hardcoded SIMD instructions or some 20MB javascript web audio helper framework
Edit 3: I wonder if the same kind of solution could be used here as was used by Hixie to standardize the HTML5 parser. Namely, just have some domain expert specify an exact, deterministic algo for anti-aliasing that works well enough, then have all the browsers use that going forward. I'd bet the only measurable perf hit would be to tutorials that show how to use the web audio api to generate signals from the stock anti-aliased oscillators. :)
IIRC it turned out that way in large part because realtime audio is very sensitive to performance hitches, and idiomatic JS is hitchy by nature due to relying on garbage collection, so they wanted to hoist as much as possible up into native code provided by the browser. If WASM had existed at the time it would have been easier to make the case for just exposing a simple raw audio interface instead.
Well... Mozilla had ASM.js at the time. In part to showcase their superior performance with certain portions of JS compared to V8 - at the time I remember things like console emulators preferring Mozilla's JS engine due to it offering more reliable performance than V8 on the tight loops and large switches. Mozilla was also demonstrating how their engine could offer comparable performance to Google NaCl in an image processing demo which was conceived to show how NaCl could cover limitations in V8 at the time.
I wonder if we might well have had more traction with Mozilla's approach and ASM.js if V8 had had similar features.
Oh well. Is what it is, and Mozilla (and Microsoft and Apple) did at least manage to get WASM which has been super useful even outside of browsers.
(That is, I'm pretty sure ASM.js uses the same trick WASM does given it was the predecessor - just preallocate a ton of memory in an array, and work with primitive types, and no GC to worry about most of the time)
I wonder why audio APIs are even available without giving a website permission? It feels like this could easily be fixed with a simple "This site would like to use your sound devices"-dialog.
It raises the question of whether the current networking stack is the one we want to have for the next 100 years. The internet in its current form has ruined a lot of the dream of personal computing because companies (and the state) are so asymmetrically powerful versus individuals. Should it be possible for my technology to send data to a server without my explicit approval?
I assumed a level of irony here, from fingerprint.com. It’s like if a website popped up popularising loopholes to get around tax burdens as an attempt to disgust the world into closing those loopholes.
Even if that’s wishful thinking, there’s still immense virtue in publishing this research and getting it out in the open. If an article gets published explaining how a particular brand of green backpack helps with shoplifting do we worry that everyone’s going to shoplift more? I’d err more on the side of knowing shops are more likely to catch on to the tactic.
Unfortunately in this case, the website does content marketing with known, easy to fix vulnerabilities presumably to put competition out of business while keeping unknown, harder to fix vulnerabilities as part of their "pro" products.
It seems like rather than adding a random amount to each sample (which lets them compute a mean by recreating the same audio and extracting out the differences), Safari could instead add randomness that is based on a key that rotates every hour. (Function of audio sample and key, so the noise would be the same in a given session, but useless for tracking an hour later).
If you averaged together ten such samples, you'd get something that approaches the true values from the device. The more samples you have, the closer it would get.
Fixing this would require removing the information leak entirely, not just masking it under a layer of random deviations.
The grandparent post accounted for exactly that criticism. By having the source of randomness fixed for a limited time period, a fingerprinting algorithm wouldn't be able to gather enough unique samples for averaging to be useful. And given the extremely fine differences in the floating point numbers, any injected noise would so overwhelm the data that you'd need hundreds, perhaps thousands of samples in order for averaging to be useful.
Wouldn’t it help if the noise added were deterministic based on origin? That way it can’t be averaged out by oversampling. So something like RNG_SEED = HMAC_SHA256(PERSISTENT_SECRET,Location.origin)
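A rough sketch of that seed idea, assuming a per-profile secret held by the browser. `PERSISTENT_SECRET`, `noise_for` and the amplitude are hypothetical names and values, not anything a real browser exposes.

```python
import hashlib
import hmac
import struct

PERSISTENT_SECRET = b"per-browser-profile secret"  # hypothetical


def noise_for(origin, amplitude=1.0):
    # Derive a deterministic offset in [-amplitude, amplitude] from the origin.
    digest = hmac.new(PERSISTENT_SECRET, origin.encode("utf-8"),
                      hashlib.sha256).digest()
    (n,) = struct.unpack(">Q", digest[:8])  # first 8 bytes as an unsigned int
    unit = n / 2**64                        # scale into [0, 1]
    return (2.0 * unit - 1.0) * amplitude


def noisy_reading(true_value, origin):
    # The same origin always sees the same perturbed value.
    return true_value + noise_for(origin)
```

Because repeated readings on one origin are identical, averaging them reveals nothing new, while two origins see unrelated offsets. Appending a time bucket to the HMAC message would give the rotating-key variant, though that part is only an assumption about how it might be wired up.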
The problem is that by being "that guy" you're probably giving them 10 bits or more of identification. If they can just scrape a few more bits from somewhere they'll have you uniquely identified.
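For a sense of scale, the "bits" arithmetic can be sketched as below. The population figure of 8 billion and the 1-in-1024 trait are assumptions picked for illustration, and adding bits together only holds if the signals are independent.

```python
import math

WORLD_POPULATION = 8e9  # rough assumption


def bits_from_share(share):
    # A trait shared by this fraction of users reveals -log2(share) bits.
    return -math.log2(share)


def bits_needed(population):
    # Bits required to single out one person in the population.
    return math.ceil(math.log2(population))


if __name__ == "__main__":
    print(bits_from_share(1 / 1024))      # a 1-in-1024 trait is 10 bits
    print(bits_needed(WORLD_POPULATION))  # about 33 bits for 8 billion people
```

So a single unusual setting that only one user in a thousand has already covers nearly a third of the roughly 33 bits needed.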
But, yeah, these guys can get on Golgafrinchan Ark B with the rest of the adtech industry as far as I am concerned.
Good luck. It's amazing how little of today's web is good old HTML. A while ago I visited a website that used Markdown - but that wasn't compiled into HTML and then statically served, oh no - it was rendered in JS client side. WTF.
Join me, and do it! There is a great Firefox extension called uMatrix, which makes it easy to disable JavaScript not just on a site-by-site basis, but also by subdomain (and easy to re-enable for sites that break without js).
I really don’t see how this can come up with more than a few thousand unique combinations. Browser type x browser version x os version x accelerator version x … what else? That doesn’t seem like enough variation to create anything remotely unique. I don’t get it.
This is similar. Audio algorithms often call OS functions and make use of CPU optimizations. One example they mentioned is the fast Fourier transform (FFT). All OSes include a version of that function but it tends to be optimized over time, and tends to behave differently on different CPUs depending on what SIMD instructions are available.
Couldn’t you just replace the prototype of the Audio API to return back whatever you wanted? The difficulty would be in getting enough fingerprints for your desired imitation but the article itself seems to have that information.
This.
That's why I feel we are all doomed regarding privacy. The only way we could maybe protect ourselves would be for all of us to send manipulated but plausible-looking average data.
Wait, is it just me, or is it not wild that there's a company openly advertising their fingerprinting services? Their landing page implies it's primarily for fraud detection / abuse prevention. But one of their customer testimonials is from Neiman Marcus boasting they increased the number of repeat customers they could identify.
> "With the adoption of Fingerprint, we can now recognize and personalize approximately 23% of total visits to NeimanMarcus.com, up from the previous baseline of 8-10%."
Of course, these companies have always been around. But this post reads like it walks the line between "our product defeats Apple's futile defenses" and "we care so much about user privacy we're white-hat cracking Apple's defenses".
And another "between-the-lines" joke is that this site doesn't throw up a cookie dialog when you load it. What a joke! "We don't need cookies to track you, haha!"
So they say this is for fraud prevention and that all other uses need consent.
On their front page they tell me how often I have visited and that my incognito mode does not prevent their tracking.
Isn’t that “other use”?
> Does Fingerprint Pro require consent?
> Our technology is intended to be used for fraud detection only; for this case, no user consent is required. However, any use outside of fraud detection must comply with GDPR user consent rules.
I expected this article to be published by some hackers or defenders of privacy like EFF, not by a company whose goal is to fingerprint people. Such dystopian times.
There’s a push to make every single last thing a normal application can do, available to web apps through some half-standardized JavaScript API or another. Generally google comes up with use cases, implements it in chrome, and tries to call it a standard. Then everyone complains when Apple doesn’t implement these standards fast enough, and that Safari is “holding back the web” or “the new IE” because it’s not keeping up with every last feature Chrome implements.
I would prefer websites just be websites and that we don’t have every single damned API available to whatever trashy site I accidentally click on, but I guess you and me are outliers here. Most people on HN seem to welcome every single JS API because web development is the only platform anyone seems to care about any more.
That’s how location services and notifications work today, and all it means is that websites just constantly nag me to enable them.
Things like this make for a more annoying web all around, because now it’s just one more tool sites can use to track me and increase engagement. (Edit: sibling poster chuckles said it way better than I can.)
If I had my way, JavaScript on the web would be limited to XMLHttpRequest and basic DOM manipulation and couldn’t do anything else. A totally separate “rich” JavaScript engine could be opted into by the user for any website that presents itself as an “application” like ones that legitimately want audio APIs like these. All these half-baked web app “standards” that google is forcing down our throats can be confined to that leper colony.
Then the most important bit: browsers could let me completely disable the “rich” engine, and I can go back to having a sane web experience again.
> That’s how location services and notifications work today, and all it means is that websites just constantly nag me to enable them.
It also means you can tell the browser to outright deny every request, thus avoiding even getting prompted. If a website detects the request was denied and still prompts you any other way, that’s an undeniable signal to close the tab and never return.
Right, I think the fact that these features exist at all means sites are gonna ask for them… even if your browser denies it, the site can easily pop up a dialog saying “hey you should give us notification access!”.
The result is that the web just keeps getting incrementally worse and worse. It’s all good intentions in creating these APIs but the result is that everything just gets more terrible.
I’m kinda surprised that no fork of Firefox has added both global and domain-scoped toggles for web feature support. I know there’s flags in about:config but that only covers some things and isn’t very user friendly.
That’d let users turn support for all the fancy bits off by default and enable them in the tiny handful of cases that they’re actually desired. This way as far as sites are concerned your browser simply doesn’t support those features and thus can’t nag you.
When a surveillance company (in this case Google) is leading the push, security against surveillance isn't on their list of requirements. In fact it's more of an anti-requirement, which escapes human judgement via design by committee or other anti-scrutiny technique. So then we end up with yet another insecure API that we've got to suffer for years as the browser makers who actually care about security painstakingly figure out how to mitigate the vulnerabilities in the original standard.
And I'm all for focusing on technical security, but it's worth mentioning that the biggest most concentrated win would be making commercial digital surveillance illegal (ie the path the GDPR tries to head in). Imagine if large public companies had to make their revenue by honest means instead of working as advanced persistent threats.
Audacity's an awesome piece of software that I've used many times. Never once have I thought "by golly this thing should be a website, and my web browser should be made to expose an audio graph API to every website I visit so that it can be so!"
Well, my application level firewall prompts me if Audacity attempts to connect to the internet, and I refuse it. Refusing firewall access to a web browser renders it useless for every website, not just an 'Audacity as a service' site.
Get with the times. Your privacy must be sacrificed so some random web app you have never heard of can do something no website should be able to do at all. Or maybe that’s just a pretense and not the real reason Google keeps adding all these APIs. People seem to forget that ChromeOS is literally Google trying to turn every computer into a thin client for their services.
What's key here is consent. (the real kind, not the EULA kind)
If you knowingly opt into being identified (like ticking an unticked "remember me" box that clearly explains the precise purpose of the identification) then it is okay.
If some asshat decides to do it without your informed, clear, and uncompelled consent, throw the fucker in the slammer.
Me too. But more likely they will do the opposite. Apple’s anti-fingerprinting is anticompetitive to the market for European data trackers or some such bullshit.
My charitable take is that it does take both ends of the spectrum to arrive at a solution that does not exactly satisfy everyone, but is an acceptable place to stop the impossible arms race. The unfortunate reality is that we are nowhere near the end of that race.
Admittedly, that was the first time I read about fingerprinting in this manner, and bypassing explicit privacy protections is definitely not something I would want for my future self (or that of my family).
In other words, I think you are right. Privacy probably needs to be codified. It may seem hard to do given existing entrenched interests, but you have to start somewhere. Not that long ago people thought buying people is 'just the way world works'. Things can change. Slowly, but they do.
> So as a user my preference not to be fingerprinted or tracked takes a back seat in the name of fraud detection?
the issue is murky for certain use cases. take payments for example. fingerprinting is used at scale in that field, and for good measure. you want to be able to know the risk associated with a user (chargebacks, fraud, etc).
> they are actively discussing how they are circumventing browser privacy protections.
I'd love to see a successful prosecution as something like a US CFAA violation, setting a precedent that puts the fear of god into the widespread slimy side of our field.
But I suspect it will have to be a non-US country leading that, because a lot of the US economy and power is now tied up in widespread slimy behavior of our field.
Did I read this correctly, and audio fingerprinting is mainly about identifying the browser version and OS or laptop in use, but it can't identify end users in a stable way?
Yeah, it doesn't tell a website who you are. Instead, it allows them to recognize you again when you come back to visit again, even if you clear cookies.
This is particularly a problem with big advertiser networks because they can track you across many sites you visit, even if you disable third-party cookies.
It has positive uses too, like preventing click fraud and concert ticket arbitrage.
>Instead, it allows them to recognize you again when you come back to visit again, even if you clear cookies.
I don't think that's what stockhorn said. stockhorn said it can only identify what browser, OS, and laptop model you're using. Someone else with the same browser, OS, and laptop model would have the same fingerprint. So audio fingerprinting couldn't precisely recognize you again when you come back again.
> Someone else with the same browser, OS, and laptop model would have the same fingerprint.
the collision rate of their ids is stated to be 0.05%
what they do is basically collect a lot of signals from the browser (audio processing stuff being only a part of it) and then compute an id on the server.
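As a very rough illustration of "many signals in, one id out", here is a sketch that hashes a canonical form of the collected signals. The signal names and the exact-hash approach are assumptions; a real service presumably does something fuzzier so that one changed signal doesn't produce a brand new id.

```python
import hashlib
import json


def visitor_id(signals):
    # Serialise the signals in a stable key order, then hash them.
    canonical = json.dumps(signals, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


if __name__ == "__main__":
    # Hypothetical signal names and values.
    print(visitor_id({"audio": 42.1, "ua": "ExampleBrowser/1.0", "tz": "UTC"}))
```

Each signal on its own may be shared by many users; it is the combination that narrows things down.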
My phone running Firefox for Android produced the same results as the sample data for Firefox on Windows which does seem to fit with this largely being a browser identification scheme
I think that is correct, but it still seems like an amount of leakage that could be further correlated with another trick.
There was previously a site which could indicate how globally unique your environment was (some combination of screen size, user-agent, fonts?, etc). Locking down to a specific hardware+browser configuration probably does a lot to remove anonymity.
Not the one I used, but this one actually looks better.
Just being Linux + Firefox is terrible for blending into the herd. Let alone everything else that leaks (having a desktop + GPU + good monitor basically destroys all remaining hope).
> Fingerprinting is used to identify bad actors when they want to remain anonymous. For example, when they want to sign in to your account or use stolen credit card credentials. Fingerprinting can identify repeat bad actors, allowing you to prevent them from committing fraud. However, many people see it as a privacy violation and therefore don’t like it.
This doesn't seem to acknowledge the use of fingerprinting in intentional violation of the privacy of ordinary people, for marketing profiling and just selling them out because someone is willing to pay.
On https://demo.fingerprint.com/ , they do start to hint at non-anti-fraud purposes, but the use case seems to be full of poo. (Logins or cookies are the way to do this. Anything else is trying to circumvent privacy mechanisms. And if they don't distinguish users perfectly, they're doubly violating privacy by then leaking private information between people.)
> Personalization -- Improve user experience and boost sales by personalizing your website with Fingerprint device intelligence. Provide your visitors with their search history, interface customization, or a persistent shopping cart without having to rely on cookies or logins.
> Heads up! -- Fingerprint Pro technology cannot be used to circumvent GDPR and other regulations and must fully comply with the laws in the jurisdiction. You should not implement personalization elements across incognito mode and normal mode because it violates the users expectations and will lead to a bad experience. -- This technical demo only uses incognito mode to demonstrate cookie expiration for non-technical folks.
Sounds a bit like a disingenuous bad actor doing CYA while demonstrating their capabilities, nudge, nudge, wink, wink.
"Funniest" part is that this page also tries to establish a webrtc connection which i know because my firewall told me browser tried to connect via nat-stun port to some server. Webrtc is a common way to fingerprint vpn users because in some setups it leaks your real ip.
Based on the article, it sounds like this doesn't activate a device's microphone at all. If it did, most (all?) browsers would give a pop-up requesting permission for that.
From the article: "In a nutshell, audio fingerprinting uses the browser’s Audio API to render an audio signal with OfflineAudioContext interface." It links to a previous article with more details:
This is using differences in the audio processing pipeline of the browser, they just use some input sound which could be taken from a file. The fingerprint is the slightly different output signal when applying filters to the input signal.
How is it possible that this produces enough variations to be usable without sampling some sort of audio source? The entire pipeline is digital, there is not any room for interference.
Please stop wasting everyone's time with your random assumptions as to why this does or doesn't work and just click on the link in the article to the detailed explanation of exactly how this works.
> The technique is called audio fingerprinting, and you can learn how it works in our [previous article].
It’s doing signal processing using floats, which can lead to differences in the result even if the implemented algorithm is identical. Float addition is not associative, so reordering some calculations, either in different implementations or with different compiler options, can lead to slightly different results. This just detects browser version and maybe OS/architecture; the same browser binary should still give the same results across different devices with the same hardware.
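The reordering effect is easy to see with plain IEEE 754 doubles. A small sketch, with values picked purely for illustration:

```python
def grouped_left(a, b, c):
    # Evaluate (a + b) + c.
    return (a + b) + c


def grouped_right(a, b, c):
    # Evaluate a + (b + c).
    return a + (b + c)


if __name__ == "__main__":
    print(grouped_left(0.1, 0.2, 0.3))   # 0.6000000000000001
    print(grouped_right(0.1, 0.2, 0.3))  # 0.6
    # Swapping the two operands of a single addition changes nothing:
    print(0.1 + 0.2 == 0.2 + 0.1)        # True
    # With very different magnitudes the grouping matters even more:
    print((1e16 + 1.0) - 1e16, (1e16 - 1e16) + 1.0)  # 0.0 1.0
```

A compiler or library that vectorises a loop is effectively changing this grouping, which is where the tiny output differences come from.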
They just generate a sine wave and do some processing on it. The results are very similar, but because the processing depends on functions like the fast Fourier transform, plus the exact algorithm in the browser code, tiny differences emerge.
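A toy model of that pipeline, assuming a 1 kHz tone, a tanh soft clipper standing in for the browser's processing, and a sum of absolute sample values as the final number. None of these choices come from the article; the real thing runs in the browser's Audio API, not Python.

```python
import math

SAMPLE_RATE = 44100  # assumed
FREQ_HZ = 1000.0     # assumed test tone
N_SAMPLES = 5000     # assumed buffer length


def render(n=N_SAMPLES):
    # Sine wave pushed through a soft clipper (hypothetical processing stage).
    return [math.tanh(2.0 * math.sin(2.0 * math.pi * FREQ_HZ * i / SAMPLE_RATE))
            for i in range(n)]


def reduce_forward(samples):
    # "Implementation A": accumulate left to right.
    total = 0.0
    for s in samples:
        total += abs(s)
    return total


def reduce_pairwise(samples):
    # "Implementation B": same sum, grouped as a balanced tree.
    if len(samples) == 1:
        return abs(samples[0])
    mid = len(samples) // 2
    return reduce_pairwise(samples[:mid]) + reduce_pairwise(samples[mid:])


if __name__ == "__main__":
    samples = render()
    print(repr(reduce_forward(samples)))
    print(repr(reduce_pairwise(samples)))
```

The two results agree to many digits and may differ only in the last few, and each one is perfectly repeatable, which is the property a fingerprint needs.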
It's about variations in the implementation of the digital pipeline that are traceable to the output. It has nothing to do with analog processing or interference.
But all iPhones of the same model have the same processor. Every iPhone 15 Pro Max, of which Apple sells hundreds of millions, has the same processor.
They don’t. If you read this post carefully, it just claims to be able to tell Intel Macs from ARM Macs. It can also distinguish from older Safari versions that don’t have the fingerprinting protection.
I think web browsers should already implement an API that allows developers to track any user in a "private" way, by generating a unique hash using your computer specs or something, and make it different for each website.
So, if you visit Google, your hash would be something like "h38kflak". If you're visiting twitter, the API would generate something different, so you won't be tracked across websites.
That way, even if you clean your cookies, you can still be identified as the same user.
The use case? Fraud detection and that kinda stuff. For example, you may create a web game where you allow users to play instantly without "creating" an account. So, an anonymous account would be created in the background, in order to log in. Any bad actor can just clear their cookies/storage to bypass a ban. IP banning isn't reliable, as multiple users may share an address.
It's a shame that we have to rely on web api hacks in order to fingerprint users for legitimate reasons, and that ends up in an eternal cat and mouse game, because anything you try today may be broken tomorrow.
Because users do not want to be tracked or fingerprinted. I don't care about fraud detection and I am not a fraudster so why do I have to be tracked? There is no way that a feature like that would not get abused in one way or the other.
________________
1. https://www.bleepingcomputer.com/news/security/researchers-u...