Vuvuzela: private messaging system that protects metadata (vuvuzela.io)
165 points by 0xmohit on Nov 30, 2016 | 45 comments


> To hide metadata, messages are routed through multiple servers, and each server adds noise. This makes the metadata incomprehensible, even to powerful nation-state adversaries.

I'm interested to learn more about these servers. Can anyone host one? How are they discovered/networked? Personally I'm still more inclined to stick with Ricochet.IM since it piggybacks Tor so there are already tons of servers out there.


I'm curious about ricochet, but not excited about a service that is integrated with Tor...I've heard a lot of things recently about Tor's insecurity. If this new thing is better about protecting users (as it seems it might be), then I'd definitely stay away from something interfacing with Tor.


A lot of the high-profile stories about people being busted over Tor recently have been related to vulnerabilities in Firefox/Tor Browser Bundle and the FBI running honeypots that can exploit those vulnerabilities.

Ricochet works by essentially starting up a Tor Hidden Service and then listening for messages from other users, so there's no exposure to browser-based attacks. It's end-to-end encrypted and less susceptible to traffic analysis attacks because it never hits an exit node.


honeypot has to be the most generous name for "federal government controlled and operated pedophilia distribution network" ever.


Brings new meaning to the adage about how online men are men, women are men, and kids are fbi agents...


You are ill-informed. Tor is as safe as it ever was. Just be smart when using it. Please visit /r/tor and learn about it yourself, instead of spreading discontent through vague wording and accepting media outlets at face value.


That sounds like an argument from ignorance (https://en.wikipedia.org/wiki/Argument_from_ignorance). You hear a lot of things about Tor in general because Tor is popular and has existed for many years. Vuvuzela is new, so we don't actually know whether it's secure or not.


What an appropriate name for software that hides a signal in the noise. Nothing can be heard over a vuvuzela.

For those who didn't watch the 2010 World Cup, or have blocked out the memory: https://www.youtube.com/watch?v=bKCIFXqhLzo


Couldn't Matrix clients adopt this as well, if only as an opt-in feature? Perhaps some clients could even use it by default, and then if such a client talks to a different client that has the feature opt-in, it would be automatically enabled for the opt-in client, too.

I imagine this would work better for Matrix due to its federated nature than it would for Signal.

The sub-2-second latency doesn't seem that bad, if it actually offers strong anonymity at that level. I would've thought it would be more like 15-20 seconds, which would probably be useless for all but targets who are actively under attack.



Vuvuzela: Scalable Private Messaging Resistant to Traffic Analysis

Their server is throwing a 500 error


That means that it's working with perfect privacy.


Vuvuzela seems great except your anonymity is guaranteed by a set of high availability and capacity servers.

Those servers need to be independently operated and resistant to global compromise.

We don't really have a good template for a system like this. We have centralized HA systems and decentralized high-churn systems like BitTorrent and Tor.

I've been intrigued by the observation that forthcoming "proof of stake" blockchain systems have very similar requirements in terms of availability and capacity to anonymous messaging systems. I wonder if we can use the nodes in a PoS system to bootstrap an anonymity system like Vuvuzela.


My model was to use ideologically different people in competing jurisdictions whose companies' products are used by governments and big business for critical stuff, thus reducing subversion risk and aligning the incentives of attack and defense a bit better.

Diverse, hardened OSes and CPUs too. Verified protocol stack. The usual.


There is a proposal from David Chaum called PrivaTegrity: https://eprint.iacr.org/2016/008.pdf

As for proof-of-stake algorithms: the efforts to do it in a strict way have all failed, and I have a feeling that it might be impossible.


> We don't really have a good template for a system like this.

Sounds like the distributed trust system Apache Milagro (incubating). http://milagro.incubator.apache.org/


There's very little on the GitHub page: 18 commits, with the last "real" commit (not just an organizational change) being in September. This might be a thing some day, but right now it's just another clever idea.


Author here.

I've been working on a follow-up system called Alpenhorn that addresses the bootstrapping problem in Vuvuzela: https://vuvuzela.io/alpenhorn-extended.pdf

I will release Alpenhorn and a new version of Vuvuzela in January.


There may be even less activity now, given that the author has moved to Google.


Only temporarily. I'm interning on the Go team, but still working on Vuvuzela and Alpenhorn in my spare time.


I love this! The only thing I'd want in addition is for it to be serverless and P2P, like BitTorrent.


It is impossible to build an anonymity system without some sort of centralization, due to the possibility of Sybil attacks. You need some trusted entity beforehand; otherwise your ISP can just simulate the whole network for you and never let you connect to the real network, without you even noticing.


You still need a mechanism for peer discovery.


Reserve a few ports and connect to random IP addresses until you've found a node. Works only for IPv4, though.


Yikes, no... we have much better ways than that.

Typically, bootstrapping from a list of a few thousand nodes with top uptime is a good starting point; you then use peer exchange between the nodes to find more, store them, and prefer them when possible. The BitTorrent DHT could also be (ab)used as a discovery mechanism.
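
As a rough illustration of that bootstrap-then-peer-exchange idea, here is a minimal sketch in Python. The `ask_peers` callback and the toy in-memory network are assumptions made up for this example; a real client would query nodes over the network and would also track uptime, which is left out here.

```python
from collections import deque

def discover_peers(bootstrap, ask_peers, limit=1000):
    """Breadth-first peer exchange starting from a bootstrap list.

    ask_peers(node) is a hypothetical callback returning the peers that
    node is willing to share; a real client would make a network call.
    """
    known = set(bootstrap)
    queue = deque(bootstrap)
    while queue and len(known) < limit:
        node = queue.popleft()
        for peer in ask_peers(node):
            if peer not in known:
                known.add(peer)
                queue.append(peer)
                if len(known) >= limit:
                    break
    return known

# Toy in-memory network standing in for real nodes (made-up data).
TOY_NETWORK = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d", "e"],
    "d": [],
    "e": ["a"],
    "f": ["a"],  # nobody lists "f", so it cannot be discovered from "a"
}

found = discover_peers(["a"], lambda node: TOY_NETWORK.get(node, []))
```

Starting from the single bootstrap node "a", the client learns about every node that some known node advertises, and never hears about "f".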


It's too easy to block those few thousand nodes and render the entire software useless. If I start Tor on my PC, it won't connect to any nodes, because all bootstrap hosts are blocked by my provider. It's centralization with all the related drawbacks.


Not really: those few thousand nodes are only used for the initial bootstrap. Once you've connected to the network, or gotten a node list from a friend, you'll have more than that. And it's very hard to act legally against thousands of nodes instead of a single entity, or to convince ISPs to block them all. Definitely an improvement over a centralized system. Some services will try to improve this further by sending each person who downloads only a random subset of nodes, using the hash of their IP as an RNG seed.
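
The "hash of their IP as an RNG seed" trick could look roughly like the sketch below. The function name, the use of SHA-256, and the example addresses are my own assumptions, not taken from any particular service.

```python
import hashlib
import random

def nodes_for_client(client_ip, all_nodes, count):
    """Pick a per-client subset of bootstrap nodes, seeded by the client's IP.

    The same IP always gets the same subset, so one client cannot collect
    the full list just by asking again. (Illustrative sketch only.)
    """
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    seed = int.from_bytes(digest[:8], "big")
    rng = random.Random(seed)
    # Sort first so the result does not depend on the input ordering.
    return rng.sample(sorted(all_nodes), min(count, len(all_nodes)))

# Made-up node list using a documentation address range.
ALL_NODES = ["198.51.100.%d" % i for i in range(1, 201)]
subset = nodes_for_client("203.0.113.7", ALL_NODES, 5)
```

A censor with many source addresses can still ask from each of them and merge the answers, so this only raises the cost of enumeration.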

Yes, an ISP who's willing to block a few thousand nodes will be able to take your shit down no matter what. That's why Tor uses private "bridge" servers for people in highly censored countries and tries to mask traffic as some other type. There's no way around that. But that doesn't mean decentralization is useless - it's incredibly useful in most countries and with most ISPs.

If you need to scan IPv4 space randomly to find a node, you can easily find all nodes. This takes about a day or two on a cheap VPS.
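
Back-of-the-envelope arithmetic for that estimate, as a small Python sketch. It assumes one probe per address and port, and ignores reserved ranges and retries, so the real numbers would be somewhat lower.

```python
IPV4_ADDRESSES = 2 ** 32        # 4,294,967,296 addresses, reserved ranges included
SECONDS_PER_DAY = 86_400

def scan_days(probes_per_second, ports=1):
    """Days needed to probe every IPv4 address on `ports` ports."""
    return IPV4_ADDRESSES * ports / probes_per_second / SECONDS_PER_DAY

def required_rate(days, ports=1):
    """Probes per second needed to finish a full scan in `days` days."""
    return IPV4_ADDRESSES * ports / (days * SECONDS_PER_DAY)

one_day_rate = required_rate(1)   # just under 50,000 probes per second
two_day_rate = required_rate(2)   # just under 25,000 probes per second
```

So "a day or two" corresponds to a sustained rate in the range of roughly 25,000 to 50,000 probes per second for a single port.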


FWIW here is the only data that Signal had available to turn over when requested by the government: https://twitter.com/whispersystems/status/783325788883955713


There is actually more than that stored in the server's database. Push messaging IDs are there, for example. But ignoring that, let's say the attacker is watching all the Signal server's connections. Who is talking to who can be determined based on size, direction and timing of traffic between clients and the server. The server could also be modified to log it.

The idea with projects such as Vuvuzela is to make metadata less usable.
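
To make the timing part concrete, here is a toy sketch of how an observer could guess who talks to whom from nothing but timestamps and direction. The event format, the one-second window, and the sample data are all assumptions for illustration; real traffic analysis would also use sizes and far more careful statistics.

```python
from collections import Counter

def likely_pairs(events, window=1.0):
    """Count upload -> download coincidences between different clients.

    events: list of (timestamp, client, direction), where direction is
    "up" (client to server) or "down" (server to client).
    """
    events = sorted(events)
    scores = Counter()
    for i, (t_up, sender, direction) in enumerate(events):
        if direction != "up":
            continue
        for t_down, receiver, other_direction in events[i + 1:]:
            if t_down - t_up > window:
                break  # events are sorted, so nothing later can match
            if other_direction == "down" and receiver != sender:
                scores[(sender, receiver)] += 1
    return scores

# Made-up observations of a server's connections.
observed = [
    (0.0, "alice", "up"), (0.2, "bob", "down"),
    (5.0, "carol", "up"), (5.3, "dave", "down"),
    (10.0, "alice", "up"), (10.1, "bob", "down"),
    (20.0, "bob", "up"), (20.4, "alice", "down"),
]
scores = likely_pairs(observed)
top_pair = scores.most_common(1)[0]
```

None of this needs message contents, which is why end-to-end encryption alone does not hide the metadata.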


I wonder, why even store those two pieces of information? I mean, they're not exactly essential.


Pure speculation, but backend tidying?

if ( days between acct creation and last check-in > N Days ): archive record; rm prod record;

There may be less identifiable means for these boring operations though. Just the first thing that popped into my head.


How is that connected to vuvuzela? It is not even subtle at this point.


HN hug of death might be throwing it off. I'm getting a '500 Internal Server Error' when I try to visit the link.

Here's the GitHub page: https://github.com/vuvuzela/vuvuzela


Interesting system, and a neatly picked name. But where can I find more information on how to verify that the person you are talking to is really who you think they are?


Awesome. What I want in addition is for others to never know Vuvuzela has been downloaded, installed, or used. That has implications up and down the stack, of course. But otherwise, this knowledge is enough to flag users of privacy-protecting technology.


Could you comment on how that would work?

I'm trying to imagine how I would hide that I downloaded an app on a phone using the appstore (unless you want this to end up like PGP, which is used by the crypto community and no one else because of the perceived complications vis-a-vis implementation.)


Embed the app in seemingly innocent other apps? If there were a raft of say, free useless game apps, there would be the element of innocence there, such as "My mother downloaded Patience, no idea it contained Vuvuzela"


I read their slides, and am reading through their paper. How is this different from steganography?

In my understanding, the set of all communiqués between every user and the Vuvuzela network approaches pseudorandom noise, among which the actual conversations are hidden.
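
One way to picture that uniformity is a client that sends exactly one fixed-size frame per round, whether or not it has anything to say. The sketch below is only an illustration of that general idea: the frame size and layout are invented, this is not Vuvuzela's actual wire format, and the encryption that would make real and dummy frames look alike on the wire is omitted.

```python
FRAME_SIZE = 256   # hypothetical fixed frame size in bytes
HEADER_SIZE = 2    # big-endian payload length

def frame_for_round(outbox):
    """Return exactly FRAME_SIZE bytes for this round.

    Sends the next queued message if there is one, otherwise a dummy
    frame with length 0. In a real system the frame would then be
    encrypted, so an observer could not tell the two cases apart.
    """
    payload = outbox.pop(0) if outbox else b""
    if len(payload) > FRAME_SIZE - HEADER_SIZE:
        raise ValueError("message too long for one frame")
    body = len(payload).to_bytes(HEADER_SIZE, "big") + payload
    return body + bytes(FRAME_SIZE - len(body))  # zero padding

def read_frame(frame):
    """Recover the payload; an empty payload means a dummy frame."""
    length = int.from_bytes(frame[:HEADER_SIZE], "big")
    return frame[HEADER_SIZE:HEADER_SIZE + length]

outbox = [b"hello"]
real_frame = frame_for_round(outbox)    # carries b"hello"
dummy_frame = frame_for_round(outbox)   # outbox is now empty
```

Both frames have the same length and would go out on the same schedule, so size and timing no longer reveal whether the user is actually talking.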


steganography is a concept; vuvuzela is an implementation of that concept


With steganography you make your message look like something else (e.g. a text message inside a JPEG). But Vuvuzela isn't doing that. They are making it look like noise, I guess. But who transmits noise?


Well, that's exactly the problem with systems like this.

If you transmit something that looks like noise, and no one else sends something that looks like that particular kind of noise, then you are raising a flag that says "No, really, please, capture this data, it's interesting."


How can you differentiate between different "types of noise"? If the traffic is cryptographically sound, the signal is indistinguishable from the noise. If the messages from multiple people have a guessable seed, in such a way that you can identify what is noise, it just means that the system is not cryptographically sound.


> How can you differentiate between different "types of noise"? If the traffic is cryptographically sound, the signal is indistinguishable from the noise.

Message length, relative timing, average bandwidth, ports used, source/destination addresses, activity punch-card.

I'd assume that this kind of traffic is identifiable to within near perfect certainty, which would also make it easy to block.

The situation is a bit similar to early cryptanalysis: it's easy to devise a cipher that makes text look random to the eye but is still easily cracked using statistics (e.g. frequency analysis). Just because traffic patterns look complex and random doesn't mean that there is a meaningful amount of entropy in them (and you need a lot of entropy to hide all the metadata: who talks with whom, and when). Just because bandwidth or packet frequency looks independent of user activity doesn't mean that it actually is.
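
For anyone who hasn't seen the frequency method, here is a minimal Python sketch using a Caesar shift as the simplest stand-in for "looks random, cracks easily". The sample sentence is made up, and the attack simply assumes the most common ciphertext letter stands for "e".

```python
from collections import Counter
import string

def caesar(text, shift):
    """Shift lowercase letters by `shift` places; leave everything else alone."""
    out = []
    for ch in text:
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        else:
            out.append(ch)
    return "".join(out)

def guess_shift(ciphertext):
    """Guess the shift by assuming the most frequent letter is 'e'."""
    letters = [ch for ch in ciphertext if ch in string.ascii_lowercase]
    most_common = Counter(letters).most_common(1)[0][0]
    return (ord(most_common) - ord("e")) % 26

plaintext = "the secret meeting is here between eleven and twelve"
ciphertext = caesar(plaintext, 7)          # looks like gibberish to the eye
recovered_shift = guess_shift(ciphertext)
recovered_text = caesar(ciphertext, -recovered_shift)
```

The ciphertext is unreadable at a glance, yet a single letter count recovers the key, which is the same gap between "looks random" and "is random" that applies to traffic patterns.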


[flagged]


This is not an acceptable comment on Hacker News. Please post civilly and substantively or not at all.

https://news.ycombinator.com/newsguidelines.html




