Popular open source project Moq criticized for quietly collecting data

By Ax Sharma


Open source project Moq (pronounced "Mock") has drawn sharp criticism for quietly including a controversial dependency in its latest release.

Distributed on the NuGet software registry, Moq sees over 100,000 downloads on any given day, and has been downloaded over 476 million times over the course of its lifetime.

Moq's 4.20.0 release from this week quietly included another project, SponsorLink, which caused an uproar among open source software consumers, who likened the move to a breach of trust.

Seemingly an open-source project, SponsorLink is actually shipped on NuGet as closed source and contains obfuscated DLLs that collect hashes of user email addresses and send these to SponsorLink's CDN, raising privacy concerns.

Moq breaks user trust
Last week, one of Moq's owners, Daniel Cazzulino (kzu), who also maintains the SponsorLink project, added SponsorLink to Moq versions 4.20.0 and above.

This move sent shock waves across the open source ecosystem, largely for two reasons: while Cazzulino has every right to change his project, he did not notify the user base before bundling the dependency, and the SponsorLink DLLs contain obfuscated code, making them hard to reverse engineer and not quite "open source."

"It seems that starting from version 4.20, SponsorLink is included," Germany-based software developer Georg Dangl reported, referring to Moq's 4.20.0 release.

"This is a closed-source project, provided as a DLL with obfuscated code, which seems to at least scan local data (git config?) and sends the hashed email of the current developer to a cloud service."

The scanning capability is part of the .NET analyzer tool that runs during the build process, and is hard to disable, warns Dangl.

"I can understand the reasoning behind it, but this is honestly pretty scary from a privacy standpoint."

SponsorLink describes itself as a means to integrate GitHub Sponsors into your libraries so that "users can be properly linked to their sponsorship to unlock features or simply get the recognition they deserve for supporting your project."

GitHub user Mike (d0pare) decompiled the DLLs, and shared a rough reconstruction of the source code. The library, according to the analyst, "spawns external git process to get your email."

It then calculates a SHA-256 hash of the email address and sends it to SponsorLink's CDN: hxxps://cdn.devlooped[.]com/sponsorlink.
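
The two local steps described above (asking git for the configured email, then hashing it) can be sketched roughly as follows. This is an illustration in Python, not the decompiled SponsorLink code: the function names are hypothetical, the exact git invocation and any normalization of the address are assumptions, and nothing is sent over the network.

```python
import hashlib
import subprocess
from typing import Optional


def read_git_email() -> Optional[str]:
    """Ask the local git installation for user.email (assumed invocation)."""
    try:
        result = subprocess.run(
            ["git", "config", "--get", "user.email"],
            capture_output=True,
            text=True,
            timeout=5,
            check=False,
        )
    except (OSError, subprocess.SubprocessError):
        return None  # git is missing, not runnable, or timed out
    email = result.stdout.strip()
    return email or None  # empty output means no email is configured


def hash_email(email: str) -> str:
    """Return the SHA-256 digest of the UTF-8 encoded email as hex."""
    return hashlib.sha256(email.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    email = read_git_email()
    if email is None:
        print("no git email configured")
    else:
        print(hash_email(email))
```

The point of the sketch is that both steps need no user interaction: any code running during a build can do the same.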

[Image: Telemetry code hidden within Moq and SponsorLink (GitHub)]

"Honestly Microsoft should blacklist this package working with the NuGet providers," writes Austin-based developer Travis Taylor.

"The author can't be trusted. This was an incredibly stupid move that's just created a ton of work for lots of people."


Developer defends change
In a comment, Cazzulino explained his reasons, admitting that the "4.20" version was "a jab so that people wouldn't take it so seriously."

"I've been testing the waters with SponsorLink for a while now (~6 mo since the announcement)," says Cazzulino.

"It has been hard getting actual feedback, so even if the comments are a "bit" harsh, I really appreciate it!"

Cazzulino further updated the SponsorLink project's README with a lengthy "Privacy Considerations" section, shown below, which clarifies that no actual email addresses, just their hashes, are being collected. The update landed a few hours ago, after the backlash emerged.

There was some concern that SponsorLink might be collecting your email without your explicit consent. This is incorrect, and can easily be verified by running Fiddler to see what kind of traffic is happening.

Specifically, the actual email is never sent when performing the sponsoring check. The email on your local machine is hashed with SHA256, then Base62-encoded. The resulting opaque string (which can never reveal the originating email) is the only thing used.

The only moment SponsorLink actually gets your email address (to perform the backend-side association of that opaque string with your actual email and GH user to link your sponsorship), is after you install the SponsorLink GitHub app and give it explicit permission to do so.

Also, the moment you suspend or uninstall the app, we delete all records associated with your account and your email(s).
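
The "SHA256, then Base62" step the README describes could look something like the sketch below. The alphabet order (digits, then uppercase, then lowercase) and the integer-based encoding are assumptions, since Base62 has no single standard form, and `opaque_id` is a hypothetical name; the string SponsorLink actually produces may differ.

```python
import hashlib

# Assumed alphabet order; Base62 variants differ on this.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def base62_encode(data: bytes) -> str:
    """Encode bytes as Base62 by treating them as one big-endian integer.

    Leading zero bytes are not preserved by this scheme.
    """
    number = int.from_bytes(data, "big")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def opaque_id(email: str) -> str:
    """Hash the email with SHA-256, then Base62-encode the raw digest."""
    digest = hashlib.sha256(email.encode("utf-8")).digest()
    return base62_encode(digest)
```

Note that the encoding only changes how the digest is written down. The same email always yields the same string, so the result is a stable identifier rather than an anonymous value.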

"The notice seems to be a reactive response to the online backlash rather than the project being upfront about what data was being harvested," Ankita Lamba, senior security researcher at Sonatype told BleepingComputer after spotting the update.

In the past, Cazzulino has also defended his decision to keep SponsorLink closed source and obfuscated so as to prevent some of its checks being bypassed. In his words, the opaque features of the library are "by design."

A potential privacy concern
The quiet inclusion of SponsorLink in projects such as Moq raises privacy concerns from both an ethical and a legal standpoint.

First comes the question of an obscure, closed source dependency (SponsorLink) being distributed via open source channels, and being included in popular OSS projects, such as GitInfo—which is also created by Cazzulino and downloaded millions of times.

Collecting hashes of email addresses may not be entirely anonymous either.

In theory at least, SponsorLink's developer could compare the harvested hashes against a database of email addresses leaked somewhere and identify users.
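
To make that concern concrete, here is a minimal sketch of such a lookup. It assumes plain, unsalted SHA-256 over the UTF-8 email, as the article describes; the function names and sample addresses are made up for illustration.

```python
import hashlib
from typing import Dict, Iterable


def sha256_hex(email: str) -> str:
    """Unsalted SHA-256 of the email, as a hex string."""
    return hashlib.sha256(email.encode("utf-8")).hexdigest()


def build_lookup(leaked_emails: Iterable[str]) -> Dict[str, str]:
    """Precompute hash -> email for every address in a leaked list."""
    return {sha256_hex(email): email for email in leaked_emails}


def reidentify(collected_hashes: Iterable[str], lookup: Dict[str, str]) -> Dict[str, str]:
    """Return the collected hashes that match a known leaked address."""
    return {h: lookup[h] for h in collected_hashes if h in lookup}


if __name__ == "__main__":
    lookup = build_lookup(["alice@example.com", "bob@example.com"])
    collected = [sha256_hex("bob@example.com"), sha256_hex("carol@example.com")]
    # Only bob's hash is recovered; carol was not in the leaked list.
    print(reidentify(collected, lookup))
```

Because the hash is unsalted and deterministic, the cost of the attack is one hash per candidate address, which is trivial for lists with millions of entries.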

"I consider your hashing more as a security by obscurity. Even hashed mail should be sent only after consent," states Michał Rosenbaum.

"I'd say serious concerns have now been raised. The vast majority of users don't even know this change has been made and would have a problem," states another software engineer, Kevin Walter.

"Trust with moq is now broken as has GDPR. This is underhanded to say the least. Be one of the good guys," Walter urged Cazzulino to be more transparent with regards to the obscure SponsorLink package.

In reaction, several developers threatened to discontinue use of Moq [1, 2] in favor of alternatives, and to build tools that would detect and block any projects that run SponsorLink.

Some went a step further, suggesting they would boycott projects that use SponsorLink or even report "SponsorLink" as malware to the NuGet registry [1, 2].

Although the controversial change to Moq has been rolled back in v4.20.2, for a reason that others have, yet again, called out, there remains a possibility of future Moq releases reintroducing a similar "feature."

BleepingComputer contacted SponsorLink's creator, Cazzulino, for comment prior to publishing but did not hear back.
 
This reminds me of the Audacity controversy a while ago, and the whole Sneedacity thing that came out of it.
 
Every tech company is collecting your data. Every single one. How is this still surprising people?
 
Every tech company is collecting your data. Every single one. How is this still surprising people?
This isn't a major tech company. This is an open-source project, whose maintainer decided to package a closed-source component which exfiltrates your personal data without disclosing it to you. The primary maintainer is a member of a boutique consultancy firm based out of Argentina. He was mad that people weren't sending him enough money to maintain his tool that he uses himself, and wrote a closed-source component that connects to GitHub Sponsors and regularly complains that you haven't paid him enough.

This is against open-source community norms, in addition to probably violating privacy laws in many jurisdictions. If you have a closed-source add-on to your open-source project ("open core"), you usually would have a separate package for it. If you have an open-source add-on to your open-source project, you can't charge for it because it would be trivially bypassed.

This isn't the first time people have done something similar. There was a problem with node.js maintainers panhandling for donations from your build script, which eventually resulted in the npm fund command to tell you how to donate to his crude linter.

If your "open-source software" is really shareware, label it accordingly and you won't have problems.
 
Ok, can someone TLDR what the fuck Moq is and why it's popular? I've never heard of it until now
 
Ok, can someone TLDR what the fuck Moq is and why it's popular? I've never heard of it until now
I was just about to ask the same thing. Fuck this journoscum for failing to do the most basic of reporting tasks -- explaining (even briefly) what the hell they're even talking about.

Also gotta lol at the now-standard response to any complaint about unsavory behavior by the developer (no matter how egregious): "it's my software, I can do what I want with it!" And of course describing any criticism as "harsh" or "an attack" or only being from a "vocal minority of users," heavily implying that their opinions don't matter since they're inconvenient.

It's especially scummy that these weasels always wait until a project is popular and in use by millions to pull this shit, trusting inertia to keep the project afloat and prevent forks that get rid of the garbage. Fuck these assholes.

ETA: I looked it up. Moq is a "mocking" library for .NET (C#). Mocking is a technique used in software testing where you're testing components that reach out to other systems (network requests to other services, like databases, REST APIs, etc.), but your test focus is the component, and not the other systems. Tests need to be as specific and "granular" as possible -- if you're testing a function called verifyLoginCredentials(request_from_client), you don't want to actually poke a remote authentication server or go to the trouble of spinning up your entire application just to submit a login request to it. Instead, you want to fake (or "mock up") both the request and the response from the remote server because you're focused on testing the function's behavior given specific conditions.

It's useful because you wind up testing exactly what you care about and nothing else. Your test doesn't need to "hope" the remote authentication server is alive, behaving itself, responding correctly to valid requests, or even that the test credentials are valid on that server. You're not testing the authentication server, so with mocks your test can focus on your code and just assume it's going to get a good response from the server. Conversely, you can also easily write a second test to see how your code behaves when it gets a bogus response from the server (you can mock that too) or even another type of failure (timeout, no route to host, DNS record not found, etc.).

You can do all that with no network connection (i.e. you can test your stuff on a laptop mid-flight over the Pacific) and no external dependencies.
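
For anyone who wants to see it in code, here's a rough sketch of the idea using Python's built-in unittest.mock instead of Moq (so this is not Moq's API, just the same technique). The function, the auth_client object and its authenticate method are all made up for the example.

```python
from unittest.mock import Mock


def verify_login_credentials(request, auth_client):
    """Return True only when the remote auth service accepts the login."""
    try:
        response = auth_client.authenticate(request["username"], request["password"])
    except TimeoutError:
        return False  # treat an unreachable server as a failed login
    return response.get("status") == "ok"


def test_accepts_valid_credentials():
    auth_client = Mock()  # stands in for the real remote server
    auth_client.authenticate.return_value = {"status": "ok"}
    request = {"username": "alice", "password": "s3cret"}
    assert verify_login_credentials(request, auth_client) is True
    auth_client.authenticate.assert_called_once_with("alice", "s3cret")


def test_rejects_bogus_response():
    auth_client = Mock()
    auth_client.authenticate.return_value = {"status": "denied"}
    request = {"username": "alice", "password": "wrong"}
    assert verify_login_credentials(request, auth_client) is False


def test_handles_timeout():
    auth_client = Mock()
    auth_client.authenticate.side_effect = TimeoutError("no response")
    request = {"username": "alice", "password": "s3cret"}
    assert verify_login_credentials(request, auth_client) is False


if __name__ == "__main__":
    test_accepts_valid_credentials()
    test_rejects_bogus_response()
    test_handles_timeout()
    print("all tests passed")
```

None of the three tests touches a network: the good response, the bad response and the timeout are all faked, which is the whole point.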

It's a seriously fucked up dick move for this developer to pull this shit. It's going to take a lot of work for people to migrate from this to whatever replaces it (it'll eventually happen no matter how much Moq's developer backs down or apologizes). You can pull this shit on private end-users, but corporations that depend on this sort of thing and have legal requirements to consider won't tolerate it. Anybody operating under HIPAA-controlled conditions is fucked, for example -- they'll have to pin their code to the latest version of Moq that doesn't have this shit in it immediately and then start scrambling to replace it. Government contractors, government agencies, etc., are all going to be affected by this.
 
I was just about to ask the same thing.
I found this from GitHub:
Moq (pronounced "Mock-you" or just "Mock") is the only mocking library for .NET developed from scratch to take full advantage of .NET Linq expression trees and lambda expressions, which makes it the most productive, type-safe and refactoring-friendly mocking library available. And it supports mocking interfaces as well as classes. Its API is extremely simple and straightforward, and doesn't require any prior knowledge or experience with mocking concepts.
But I still don't fucking know.

Edit: thank you @SIMIΔN. Next question, how the fuck is that getting 100k downloads per day?
 
Next question, how the fuck is that getting 100k downloads per day?
It gets downloaded any time anybody does a "clean rebuild" of a project that uses it (Windows, NuGet, and Visual Studio are all astonishingly terrible at caching unless you go to the trouble of setting it up right, which few people do), and lots of projects use it. It also gets downloaded again any time an automated build happens (because it typically gets done in a virtual machine or container -- the build starts "from scratch" each time).

A single developer working on a single project could very well be responsible for 50+ new fetches per day if he's doing build/staging tests all day in preparation for an upcoming release. Spread that across a bigger team and you might have one department triggering hundreds of downloads per day just by themselves.
 
This isn't a major tech company.

It gets downloaded any time anybody does a "clean rebuild" of a project that uses it (Windows, NuGet, and Visual Studio
Major enough that people shouldn't shocked-pikachuface when they turn around and resell your data.

If software is connected to the net, the data is sold as extra revenue.
 
Major enough that people shouldn't shocked-pikachuface when they turn around and resell your data.

If software is connected to the net, the data is sold as extra revenue.
I don't think you understand the concept of free and open source software. The whole reason this is a major controversy is that this is an open source project and not some big tech product. With open source you can actually verify and modify the code you're running to prevent just that.

The dev added some proprietary crap to the codebase and people can actually see it in there.
 