I also built an open-source (GPLv3) facial recognition program, called uWho (github.com/jwcrawley/uWho). Mine doesn't need anything like CUDA or OpenCL, and it runs on a six-year-old laptop at 1280x720 @ 15fps.
I ran it at our free, ticketless convention called Makevention (Bloomington, IN). Estimates were that 650-700 people showed up. My tracker counted 669 uniques, which I think is spot on.
I also wrote mine with privacy in mind. The database was a KNN over perceptual hashes of faces. The only data stored was the hash, which could verify a face but could not be used to generate the face. Considering the application (a maker/hacker con), I wanted to be sure that was the case. (The data only ever resided on that machine, and it's wiped now.)
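A minimal stdlib sketch of that idea (names are illustrative, not uWho's actual code): store only a fixed-length bit hash per face, then classify a new hash by k-nearest neighbors under Hamming distance.

```python
from collections import Counter

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two perceptual hashes."""
    return bin(a ^ b).count("1")

def knn_label(query: int, db: list, k: int = 3) -> str:
    """db is a list of (hash, person_id) pairs; return the majority
    label among the k stored hashes closest to the query hash."""
    nearest = sorted(db, key=lambda e: hamming(query, e[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Toy database: two "people" whose hashes cluster by Hamming distance.
db = [(0b11110000, "alice"), (0b11110001, "alice"), (0b00001111, "bob")]
print(knn_label(0b11110011, db, k=3))  # the two closest hashes are alice's
```

Only integers ever hit disk here, which is the privacy property being claimed; the sibling comment below questions how strong that property really is.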
I've halted work on the GUI version of it. Now I want to make it client/server, where the clients are RasPis (or other cheap compute with a camera) and the server is whatever good machine you have. Initially I'll reimplement the same algorithm, but I know KNN lookups get more expensive in time/CPU the more samples I accumulate.
OpenFace can optionally use a CUDA-enabled GPU, but it's not a requirement. The performance is almost real-time on a CPU. After detection (which varies depending on the input image size), the recognition takes less than a second. We have a few performance results on the FAQ at http://cmusatyalab.github.io/openface/faq/
I'm surprised (and skeptical) uWho can do detection+recognition at 15fps.
I would expect face detection alone in 1280x720 images to be much slower than 15fps. On my 3.7GHz CPU with a 1050x1400px image, dlib's face detector takes about a second to run. This is also my experience with OpenCV's face detector, which I noticed your code is using. Also OpenCV's face detector returns many false positives, especially in videos. See this YouTube video for an experimental comparison: https://www.youtube.com/watch?v=LsK0hzcEyHI
Also, I think "faces can't be generated from a perceptual hash" is a strong claim. One property of perceptual hashes is that hashes with a small Hamming distance between them come from more similar inputs (e.g., the same person). I wouldn't be surprised if a model could successfully map perceptual hashes back to faces given enough training data. I read a good paper about doing this (not specific to faces) but can't remember the reference now.
Edit: I just added some simple timing code to this sample OpenCV face detection project on my 3.60GHz machine: https://github.com/shantnu/FaceDetect
On the John Lennon image from the OpenFace FAQ, sized 1050x1400px, it takes 0.32 seconds, which is about 3fps. This is slightly quicker than dlib's detector on the same image, but it also returned a false positive.
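The timing added was presumably along these lines: wrap the detect call with a wall-clock timer and average over a few runs. (The detector below is a stand-in; on a real image you'd swap in something like `cv2.CascadeClassifier(...).detectMultiScale` or dlib's frontal face detector.)

```python
import time

def time_detector(detect, frame, runs: int = 5) -> float:
    """Average wall-clock seconds per detect(frame) call."""
    start = time.perf_counter()
    for _ in range(runs):
        detect(frame)
    return (time.perf_counter() - start) / runs

# Stand-in detector; replace with a real cascade/CNN detector and image.
def dummy_detect(frame):
    return []

avg = time_detector(dummy_detect, frame=None)
print(f"average: {avg:.6f}s per frame")
```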
I'm using a few tricks to speed up the performance. And I'm perfectly fine with you questioning the performance :) I encourage you to try it out.
My first problem/observation is that Haar cascades love running on a GPU due to their float-heavy nature, but dealing with them on a CPU frankly stinks. I was getting 1 frame per 10 seconds at 800x600 with the included Haar face detector. That's effectively unusable.
Turns out there are also LBP cascades, which are integer-based, and they run fast on a CPU. From my observations they produce many false positives, but they seem to have no issue with false negatives, so I grab all the faces, plus a few "junk" regions.
The speedup is that I can use an LBP cascade first and then throw each region of interest (a potential face) at a Haar-cascade eye detector. Since I'm then dealing with much smaller images, Haar runs acceptably. Literally, if
(eyes.size() > 0) { /* is a valid face... */ }
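In sketch form, that two-stage filter looks like this (the detectors here are stand-in callables; in the real pipeline they'd be `cv2.CascadeClassifier` instances loaded from the LBP face and Haar eye XML files):

```python
def confirmed_faces(frame, lbp_face_detect, haar_eye_detect):
    """Stage 1: a fast LBP cascade proposes face regions (some junk).
    Stage 2: a Haar eye cascade, run only on each small crop, keeps a
    region only if it contains at least one eye."""
    faces = []
    for (x, y, w, h) in lbp_face_detect(frame):
        # Crop the candidate region; Haar is cheap on a small crop.
        roi = [row[x:x + w] for row in frame[y:y + h]]
        if len(haar_eye_detect(roi)) > 0:
            faces.append((x, y, w, h))
    return faces
```

The point of the design is that the expensive detector never sees the full frame, only the handful of small candidate crops the cheap detector proposes.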
Then I proceed to use the built-in functions in OpenCV's contrib face library. The problems with that library are numerous. Mainly, the settings are provided without good descriptions, or are left at whatever defaults the underlying academic papers used.
Because I'm also an academic, I was able to get ahold of quite a few large face datasets. After doing so, I wrote a few small programs that attempted to calculate the ideal settings for the FaceRecognizer call, which I believe I did. (The settings are in the call, in the source.)
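That kind of settings search can be as simple as a grid search: score each candidate configuration against a labeled hold-out set and keep the best. A generic sketch (the `evaluate` function and parameter names here are illustrative stand-ins, not uWho's actual tuning code):

```python
from itertools import product

def grid_search(evaluate, grid: dict):
    """Try every combination of settings in `grid`; `evaluate` returns
    accuracy on a labeled hold-out set. Returns (best_settings, score)."""
    best, best_score = None, -1.0
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        settings = dict(zip(keys, values))
        score = evaluate(settings)
        if score > best_score:
            best, best_score = settings, score
    return best, best_score

# Stand-in for "train/test the recognizer on the face dataset".
def fake_eval(s):
    return 1.0 - abs(s["radius"] - 2) * 0.1 - abs(s["neighbors"] - 8) * 0.01

best, score = grid_search(fake_eval, {"radius": [1, 2, 3], "neighbors": [4, 8]})
print(best)  # {'radius': 2, 'neighbors': 8}
```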
Of course, I do get some slowdowns depending on how many faces there are (mainly: stay away from Google image searches for faces). But then again, four Haar passes on 50x50 crops is not bad at all.
I have just checked, and you are only using existing OpenCV functionality. OpenFace is based on state-of-the-art face recognition research using convolutional triplet networks.
You'll get no argument from me that I use OpenCV. It's the tool that I had, and it works effectively the way I programmed it. Also, I don't have a general-purpose GPU, so I can't do GPU-accelerated calculations.
This model is indeed a new and novel way of handling facial recognition. The only caveat for me is that I don't have a GPGPU. If I'm able to acquire one, I will undoubtedly use it instead!
The argument isn't that you use OpenCV: OpenFace also uses OpenCV. However, I think you should present your program as one that *uses* face recognition, not as a face recognition program. You are using, without crediting, off-the-shelf face recognition functionality already in OpenCV: https://github.com/jwcrawley/uWho/blob/2823479d5abf9f8f2de21...
Sorry, I wasn't angry at all. I was at a stoplight and read what you wrote, and responded with my voice to text (android). :)
And they are indeed different projects. I would dare say theirs has much better quality, though I've not used it. 120 degrees of freedom gives a great deal of unique clustering data. I know mine doesn't compare with that, but mine also doesn't require as much power.
You don't actually need a computer on-site with my software. There's a button that lets you easily load a video file and run the same classification on it.
Just record a video of the front doors and load it later. You don't freak people out, and you can quietly do the classification afterwards.
Absolutely. My problem was that of how to market something like this.
I have no clue how to do so, and the two people who contacted me fizzled out after initial contact. I did get it on Hackaday, where there was a bit of a spat between the "cool" and "evil" factions regarding this area. ( http://hackaday.com/2015/03/04/face-recognition-for-your-nex... )
It's also why I wanted a server/client architecture, where each client machine handles face tracking (of a face, not a specific face) and then uploads that image to the server, where it's processed to determine who it is. I wanted the interface to be a clean HTML5 app, so I'd have something pretty to bring to market.
With that setup, I could feed in data from 80 cameras to a beefy server and keep on chugging nicely.
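The flow described above can be sketched as a shared work queue: each camera client pushes cropped face images, and server workers pull and classify them independently, so adding cameras just adds producers. (Names and the in-process transport are illustrative; real RasPi clients would upload crops over the network.)

```python
import queue
import threading

jobs = queue.Queue()   # (camera_id, face_crop_bytes) pairs
results = []

def camera_client(cam_id, crops):
    """A cheap client (e.g. a RasPi) only detects and uploads face crops."""
    for crop in crops:
        jobs.put((cam_id, crop))

def server_worker():
    """The beefy server does the expensive 'who is this?' step."""
    while True:
        cam_id, crop = jobs.get()
        results.append((cam_id, f"person-{len(crop)}"))  # stand-in classifier
        jobs.task_done()

threading.Thread(target=server_worker, daemon=True).start()
for cam in ("door", "driveway"):
    camera_client(cam, [b"face1", b"face22"])
jobs.join()  # block until every uploaded crop has been classified
print(sorted(results))
```

Because recognition happens only on the server side, scaling to more cameras is a matter of adding worker threads (or processes) pulling from the same queue.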
If you'd like to talk further about this, my email is in the whois for crankylinuxuser.net :) My phone # is real there too.
No, not in practicality. It only changes the MAC when the iPhone is asleep. Since apps have so much background functionality, the phones almost never sleep.
Oh good, now the 7/11's will be able to detect who I am by my face through their security cameras and will do some targeted advertising to me as I walk the aisles.
What would be more practically valuable to them is to track how many times you visit 7/11 so they can determine how often their customers are returning and use that to make predictions about their future. Then they can sell that to their investors.
That's what my github.com/jwcrawley/uWho project does:
Facial recognition on an old computer with no GPGPU processing.
Now, any of the CV Dazzle "exploits" that screw up an LBP cascade's face detection will subvert my model, as does closing one's eyes. Aside from those small details, it was by and large a great success for my initial goal.
Now I'm looking at getting it into a client-server architecture, where I can recognize who lives here and who comes over regularly. The idea is an "aware front door" that can ring a doorbell or send a message (with who: text/picture) if we are away. Even crazier would be to allow bidirectional voice to us over a VoIP connection.
I see its purpose as part of a home IoT infrastructure that isn't "internet toilet"-grade uselessness.
It would be interesting to see if it's possible to recognize people in films.
I'm not sure if it's much harder or not. In a way, a video is more complicated than an image, but you have way more data to recognize a face.
Does anyone know if there is any work in that direction?
A plugin for VLC that could show you the name of any actor on demand would be really fun!
It's pretty trivial, since you know the actors in the movie and can run existing face recognition on the people on screen when the video is paused. Still, it would be quite neat to see it happen.
Actually, it's nice to see functionality like this out in the open. Attached to a wide-angle camera pointed at a local train station, it should be fairly easy to match faces to people commuting to work and leaving their houses and flats unguarded.
There was a nice book, "Database Nation", that described a case of scanning licence plates of cars crossing a bridge to see who's at home and who left for work. Made burglaries a lot easier.
"We do not support the use of this project in applications that violate privacy and security. We are using this to help cognitively impaired users to sense and understand the world around them."
I strongly, strongly support the open sourcing and wide distribution of this functionality.
Face recognition is widely used by large players now. Some of those players are bad, some are good. It is important that this technology becomes widespread so people understand it better.
I think they're acting in good faith, but I wish they'd acknowledge that disclaimers are not a particularly effective preventative measure. I agree that putting this provision in the license would help, although it would still be subject to interpretation.
I guess my main objection is that it's naive to expect to have the best of both worlds -- there are always tradeoffs, and the disclaimer doesn't acknowledge that tools like OpenFace cannot be released without negative consequences along with the good ones. It is what it is.
They're not saying that you can't use it for that, they're saying they're not supporting use cases that do that. The disclaimer really just means "please don't file bugs and expect help if your scenario is to monitor a public intersection" for example.
Depends if they put something related to it (probably formulated differently) in the license or not. If they do, then yes, I think it may stop some companies from using it.
To be fair, the likelihood of someone MITM'ing a connection to a docker container running on the localhost is near zero. Which is what the original issue that prompted those instructions was about[0].