Ggwave: Tiny Data-over-Sound Library (github.com/ggerganov)
120 points by ingve on Feb 12, 2021 | hide | past | favorite | 27 comments


Thanks for posting this - I'm the author. I recently posted this to Show HN with a bit of background about the project [0].

My next goals for ggwave are to improve the mobile SDKs and provide more examples. If you want to help out, check out this issue [1].

I am also planning to add some even lower bit-rate protocols that are more robust, and hopefully remove the sound markers.

- [0] https://news.ycombinator.com/item?id=25761016

- [1] https://github.com/ggerganov/ggwave/issues/2


Have you looked at other types of modulation? Like spread-spectrum:

https://en.wikipedia.org/wiki/Spread_spectrum

This would potentially give many nice features, including superimposing multiple signals, so you could have full-duplex links and more people could use the audio channel at the same time.
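A toy sketch of that idea (purely illustrative, nothing here is from ggwave): with direct-sequence spreading, two transmitters can share the channel by multiplying each data bit with orthogonal chip sequences (Walsh codes), and a receiver correlating against one code ignores the other user's signal.

```python
# Toy direct-sequence spreading: two users share one channel via
# orthogonal chip codes. Bits are +1/-1.
CODE_A = [+1, +1, +1, +1, -1, -1, -1, -1]
CODE_B = [+1, -1, +1, -1, +1, -1, +1, -1]

def spread(bit, code):
    return [bit * c for c in code]

def despread(chips, code):
    # Correlate against one user's code; the other user's signal
    # cancels out because the codes are orthogonal (dot product 0).
    return +1 if sum(x * c for x, c in zip(chips, code)) > 0 else -1

# Both users transmit at once; the receiver separates them:
mixed = [a + b for a, b in zip(spread(+1, CODE_A), spread(-1, CODE_B))]
assert despread(mixed, CODE_A) == +1
assert despread(mixed, CODE_B) == -1
```

Real spread-spectrum systems add chip-level synchronization and much longer codes, but the superposition property is the same.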


Looks interesting. I haven't tried approaches fundamentally different from the one I am currently using. I only recently found that my approach is very similar to DTMF, but I use 6 tones instead of 2, and I have sound markers and error correction integrated into the protocol.
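For readers unfamiliar with DTMF-style signaling, here is a minimal sketch of the idea: encode a byte as simultaneous tones drawn from a fixed frequency bank. The sample rate, symbol length, and frequencies below are made-up illustration values, not ggwave's actual parameters, and this uses only 2 tones per symbol where ggwave uses 6.

```python
import math

# Hypothetical parameters for illustration -- NOT ggwave's actual values.
SAMPLE_RATE = 48000
SYMBOL_LEN = 1024            # samples per symbol
BASE_FREQ = 1875.0           # Hz, bottom of the tone bank
FREQ_STEP = 46.875           # Hz between adjacent tones

def encode_byte(value):
    """Encode one byte as two simultaneous tones (low/high nibble),
    DTMF-style. A real protocol adds markers and error correction."""
    lo, hi = value & 0x0F, value >> 4
    freqs = (BASE_FREQ + lo * FREQ_STEP,          # tone for the low nibble
             BASE_FREQ + (16 + hi) * FREQ_STEP)   # tone for the high nibble
    return [sum(math.sin(2 * math.pi * f * n / SAMPLE_RATE) for f in freqs) / 2
            for n in range(SYMBOL_LEN)]

samples = encode_byte(0xA5)  # one symbol's worth of floats in [-1, 1]
```

The receiver would run an FFT per symbol window and pick the strongest bin in each half of the bank.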


Fountain codes?


Any difference between this and the Quiet Modem Project? https://github.com/quiet/quiet

There are other data-over-sound libs listed here too https://github.com/ganny26/awesome-audioqr


This is a neat hack, and cool that there are examples for a wide variety of languages.

I've been using WebJack (https://github.com/publiclab/webjack) wrapped with a better coding scheme for some similar work. I'm curious why ggwave is intended to use such a low bitrate - I'm able to get pretty good free-air reliability at 1200 baud.


I didn't know about webjack - looks very interesting, thanks.

About why use such a low bitrate - I believe ggwave is a bit more pleasant to the human ear, although that is subjective, I guess. I also find the transmission quite reliable - at least on the few devices I am using. But I guess the main reason is that when I started coding this, I had basically zero knowledge of existing audio modulation schemes, and today that hasn't changed much :-) I find it more fun to experiment this way, even though I am probably rediscovering basic knowledge or missing some well-known approaches or techniques in this field.


Cool! I like minimodem: https://github.com/kamalmostafa/minimodem and heard from someone who compiled it on Termux. Would love it if ggwave let you choose the encoding.

It would be nice if ggwave made an APK available or published on F-Droid.


Shameless plug: I have one of these too :)

https://quiet.github.io/quiet-js/fsk (FSK demo)

https://quiet.github.io/quiet-js (standard modem demo)


As you happen to be here, I noticed that the "audible-fsk" profiles in https://quiet.github.io/quiet-js/lab.html seem to be broken, at least on my machine. I get this error:

> Sorry, it looks like there was a problem with this profile. Revert to your last working settings or load a preset, and then try again.

and in console

> error: fskmod_create(), samples/symbol must be in [2^_m, 2048]


Ah, sorry about that. I haven't updated the lab in some time and it's likely not tracking changes that went in to enable FSK mode. I'll make a note to come back to this later. The lab could use a lot of love in general.


Hi Brian, good to see an FSK version of quiet-js. Will definitely give it a try later.

Edit: fix typo


The demo I linked uses a slowed down profile with lots of error correction applied in an attempt to get longer ranges working. I've tried to expose all the knobs for this on the profile system to make it configurable though. Looks like yours is quite a bit faster.

Have you done much testing with long range and non line-of-sight transmission? The acoustic echoes present some interesting challenges.

edit: Oh also, it'd be nice to compare notes about mobile devices sometime. At least for Quiet, I've found the transmission quality is very dependent on exactly how the mic/speakers are set up, and it can be tricky to avoid hidden resamplers and noise cancellation.


> Have you done much testing with long range and non line-of-sight transmission?

Not much. I tested this today at home and ggwave does not perform too badly. Even without direct line-of-sight it is still able to pick up the data. Hard to give a quantitative description of the performance - it's good enough to make me satisfied :)

> Oh also, it'd be nice to compare notes about mobile devices sometime. At least for Quiet, I've found the transmission quality is very dependent on exactly how the mic/speakers are set up, and it can be tricky to avoid hidden resamplers and noise cancellation.

True. I am still not sure if noise cancellation helps or not. I guess it also depends on the noise cancellation implementation. When running in a browser, I usually disable it.


> The bandwidth rate is between 8-16 bytes/sec depending on the protocol parameters.

This makes me wonder - what's the ideal channel capacity for in-room audio with cell phone mic and speakers? And what are the limiting factors? If you had some really sophisticated channel coding, how good could the throughput ever get?

In the 90s we could do 56ish kilobits/sec over PSTN. What's state of the art now? Unchanged because of a lack of a market?
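For a rough upper bound, Shannon's formula C = B·log2(1 + SNR) gives a ballpark. The bandwidth and SNR figures below are assumptions for a quiet room with ordinary phone speakers/mics, not measurements:

```python
import math

def shannon_capacity(bandwidth_hz, snr_db):
    """Shannon limit C = B * log2(1 + SNR), in bits/sec."""
    return bandwidth_hz * math.log2(1 + 10 ** (snr_db / 10))

# Assumed: ~15 kHz of usable speaker/mic bandwidth at ~20 dB SNR.
print(round(shannon_capacity(15_000, 20)))   # on the order of 100 kbit/s

# Sanity check against the PSTN era: ~3.1 kHz at ~37 dB SNR lands in
# the same ballpark as V.34's 33.6 kbit/s.
print(round(shannon_capacity(3_100, 37)))
```

Real in-room audio falls far short of this bound because of echoes, speaker/mic frequency response, and the need to stay unobtrusive, which is where sophisticated channel coding and equalization would earn their keep.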


I would consider the FT8 family of amateur radio encodings to be the SOTA. Many of them, like WSPR, can be decoded below the noise floor using forward error correction. The bandwidth is very limited - short messages with callsigns, etc. https://physics.princeton.edu/pulsar/k1jt/


Imagining this as a tool for autonomous robots to communicate with people's devices without needing pairing or apps. Like R2D2, but producing articulate text.

I'm also already thinking about how to implement a batch-sending mesh protocol like UUCP for bots to have sessionless interconnect.

It seems so simple and obvious, but we didn't have robots the way we do now when 300 baud was around. Cool project!


Is it possible to hide these signals in a song? I mean, play the song + signal simultaneously and extract the signal.


Yes, absolutely. Either in audible frequencies but too quiet to notice, or perhaps in inaudible ones. Though inaudible frequencies might get filtered by whatever transmission medium you're using.
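A sketch of the "quiet carrier under loud music" idea: mix a low-amplitude high-frequency tone into audible content, then pick it back out with the Goertzel algorithm. All parameters here are illustrative, and the "song" is just a two-note chord:

```python
import math

SAMPLE_RATE = 48000
N = 4800                     # 0.1 s analysis window

def tone(freq, n, amp=1.0):
    return [amp * math.sin(2 * math.pi * freq * i / SAMPLE_RATE)
            for i in range(n)]

def goertzel_power(samples, freq):
    """Signal power at one frequency via the Goertzel algorithm."""
    coeff = 2 * math.cos(2 * math.pi * freq / SAMPLE_RATE)
    s1 = s2 = 0.0
    for x in samples:
        s1, s2 = x + coeff * s1 - s2, s1
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

# "Song": a loud audible chord; "signal": a quiet 17 kHz carrier at -26 dB.
# (|mix| can exceed 1 here; a real mixer would normalize before playback.)
song = [a + b for a, b in zip(tone(440, N), tone(660, N))]
mix = [s + d for s, d in zip(song, tone(17000, N, amp=0.05))]

# The carrier stands out clearly against a nearby empty frequency:
assert goertzel_power(mix, 17000) > 100 * goertzel_power(mix, 16500)
```

With real music you would also want to keep the carrier in a band the song leaves mostly empty, or shape its level to stay under the masking threshold.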


...and exactly the second thing I was going to try!


Still exploring, but the first thing I'm going to try is non-audible data over sound (like the ringtones).


Would love some feedback. Non-audible does not work on some devices, though. For example, there is this weirdness where in Safari the higher part of the audio spectrum is not captured, even though it is in other browsers [0]. I'd be interested if someone here has any idea what the issue could be.

[0] https://github.com/ggerganov/ggwave/issues/5


There's a possibility that it's a deliberate restriction to prevent the audio interface being used for data (i.e. exactly what your library does).

It would be pretty easy to use this for malicious purposes, such as device tracking.


Are IT departments worried about this analog hole that could enable employees to siphon away data from work? At some bigcos that worry about theft, CD drives and USB ports are disabled, but the headphone jack works.


This method is not that efficient for transferring loads of data.
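Some back-of-envelope numbers at the README's quoted 8-16 bytes/sec:

```python
def transfer_hours(size_bytes, rate_bytes_per_sec):
    """Hours needed to move size_bytes at a given bytes/sec rate."""
    return size_bytes / rate_bytes_per_sec / 3600

MB = 1024 * 1024
for label, size in (("1 MB", MB), ("10 MB", 10 * MB)):
    for rate in (8, 16):
        print(f"{label} at {rate} B/s: {transfer_hours(size, rate):.1f} h")
# Even 1 MB at the fastest quoted rate means ~18 hours of continuous audio.
```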


Sure, a gigabyte would take many transfers, but a few megabytes would be easy. That could be a lot of text.

Though true, I heard the Waymo guy took 10 GB or something - lots of PDF schematics.

Photographing the screen like Snowden is probably still the preferred method.


I had a look at something like this some time ago when I tried to implement a PoC to save backups on audio tapes ;) It never came to fruition since the storage capacity was abysmal.



