Typefont – An algorithm to recognize the font of a text in a photo

Bluestrike2 · on April 30, 2017

It's a neat implementation for a very annoying (and common) scenario. I've used various font ID tools in the past and always found them rather limited. The vast majority are manual ones that ask you to identify particular characteristics of the font in question and then go from there. That can help, but it only gets you so far. Most of the time, I wound up posting an image to a typography forum to get an ID.

A few sites like My Fonts have an API that lets you generate font samples.[0] You could probably use that to automate generating your font database. Throw in a few other APIs (Google Fonts[1], Typekit[2], etc.) and some other type stores--along with some of the foundry websites that don't sell their fonts on other sites (Hoefler & Co., cough)--and you could build up a very rigorous database/recognition tool in short order.

0. https://dev.myfonts.com/#font_sample

1. https://developers.google.com/fonts/docs/developer_api

2. https://www.adobe.io/apis/creativecloud/typekit/docs/overvie...

mrspeaker · on April 30, 2017

My Fonts had a tool for doing this back in (at least) 2009... I remember because I tried to run the burning "NERDS" sign from "Revenge of the Nerds" through it. It didn't work very well, though: I had to resort to human intervention. http://www.mrspeaker.net/2009/01/21/revenge-of-the-font-nerd...

Keen for an online version of this project so I can see if it fares any better!

rendernos · on April 30, 2017

I changed the description of the repository and removed the AI tag, thank you and don't be evil!

nom · on April 30, 2017

Please stop using "AI" to describe simple algorithms like this. This has nothing to do with the field of artificial intelligence. The code just extracts individual characters from the image and compares them visually to a prepared database of known fonts, using the Hamming distance. You wouldn't call OCR an "AI algorithm", would you?

Edit: "nothing to do" is not accurate, but AI is definitely not an appropriate term to use here
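
To make the comparison step described above concrete, here is a minimal sketch of matching an extracted character against a font database by Hamming distance. It is not the project's actual code: the 5x5 bitmaps, the font names `ToySerif` and `ToySans`, and the helpers `hamming_distance` and `closest_font` are all hypothetical, chosen only to illustrate the idea.

```python
from typing import Dict, List

Glyph = List[str]  # rows of '0'/'1' characters, all the same width

# Hypothetical 5x5 bitmaps of the letter "I" in two made-up fonts.
FONT_DB: Dict[str, Glyph] = {
    "ToySerif": [
        "11111",
        "00100",
        "00100",
        "00100",
        "11111",
    ],
    "ToySans": [
        "00100",
        "00100",
        "00100",
        "00100",
        "00100",
    ],
}


def hamming_distance(a: Glyph, b: Glyph) -> int:
    """Count the pixels at which two equally sized bitmaps differ."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("bitmaps must have identical dimensions")
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def closest_font(glyph: Glyph, db: Dict[str, Glyph]) -> str:
    """Return the name of the database font whose bitmap is nearest to `glyph`."""
    return min(db, key=lambda name: hamming_distance(glyph, db[name]))


# A noisy "I" as it might come out of character extraction: one pixel flipped.
extracted: Glyph = [
    "11111",
    "00100",
    "00110",
    "00100",
    "11111",
]

print(closest_font(extracted, FONT_DB))
```

The lookup is a plain nearest-neighbour search over a fixed table, so nothing is learned from the input. A real tool would also have to normalise the size and alignment of the glyphs before comparing them.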

keppanaviimen · on April 30, 2017

"The code just extracts" please. Do you think extracting letters from an image is a joke? There is an engine that has been in development since 1985 for doing that. http://www.kloover.com/publications/Kluever_-_OCR_using_ANN....

nom · on April 30, 2017

Yes, I know it's a hard problem. Tesseract (the OCR library that is used in this project) has been in development since 1985, but even Google doesn't dare to call it an AI algorithm. OCR is of course a part that is required to build an artificial intelligence, but the term "AI" is just too overused nowadays, and calling every single algorithm capable of recognizing something in some data "AI" is poison to the field itself. Solving AI is a huge and daring problem and we should use the term appropriately.

hk__2 · on April 30, 2017

It’s not a joke, but it’s not AI either.

GordonS · on April 30, 2017

> You wouldn't call OCR an "AI algorithm", would you?

It depends on how it was implemented; if it used a neural network, for example, then yes, I would.

rsrsrs86 · on April 30, 2017

This is a canonical AI problem. This is a task usually performed by human beings that was found to be excellently performed by computers. It falls within the grand field of computer vision. It is a solved and trivial problem nowadays, but it sure is AI.

stevehiehn · on April 30, 2017

This problem is nice because, unlike with other datasets, I think you could automate the creation of the data. I'm thinking you could generate a couple million photos with text captions, all with known fonts.
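
As a rough sketch of what automated data creation could look like: start from clean glyphs of known fonts, perturb them, and keep the font name as the label. Everything here is an assumption for illustration. The toy 5x5 bitmaps stand in for text rendered by a real font rasterizer, and `TOY_FONTS`, `add_noise`, and `make_dataset` are hypothetical names.

```python
import random
from typing import Dict, List, Tuple

Glyph = List[str]  # rows of '0'/'1' characters

# Hypothetical clean glyphs (the letter "I") for two made-up fonts.
TOY_FONTS: Dict[str, Glyph] = {
    "ToySerif": ["11111", "00100", "00100", "00100", "11111"],
    "ToySans": ["00100", "00100", "00100", "00100", "00100"],
}


def add_noise(glyph: Glyph, flip_prob: float, rng: random.Random) -> Glyph:
    """Flip each pixel independently with probability `flip_prob`."""
    flip = {"0": "1", "1": "0"}
    return [
        "".join(flip[p] if rng.random() < flip_prob else p for p in row)
        for row in glyph
    ]


def make_dataset(
    fonts: Dict[str, Glyph],
    samples_per_font: int,
    flip_prob: float = 0.05,
    seed: int = 0,
) -> List[Tuple[Glyph, str]]:
    """Build (noisy glyph, font name) pairs; the label is known by construction."""
    rng = random.Random(seed)  # seeded so the dataset is reproducible
    dataset = [
        (add_noise(glyph, flip_prob, rng), name)
        for name, glyph in fonts.items()
        for _ in range(samples_per_font)
    ]
    rng.shuffle(dataset)
    return dataset


dataset = make_dataset(TOY_FONTS, samples_per_font=100)
print(len(dataset), "labelled samples")
```

Because every sample is derived from a font whose name is already known, no manual labelling is needed. Scaling up is mostly a matter of adding more fonts and more realistic distortions.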

aamederen · on April 30, 2017

Can this technique be used for recognizing handwriting of individuals?

rsrsrs86 · on April 30, 2017

Yes. You wouldn't necessarily need to break it down into individual letters, though. You would need pages and pages of writing from the individuals.

janpio · on April 30, 2017

Nice project, but:

> An artificial intelligence [...]

Really?

rendernos · on April 30, 2017

The term "artificial intelligence" is applied when a machine mimics "cognitive" functions that humans associate with other human minds. Understanding the shape in an image or reading the text from a photo can also be considered AI.

nom · on April 30, 2017

Extracting characters from an image and comparing them to a database of known fonts using a naive distance metric can be called AI? Really?

surajx · on April 30, 2017

The key term here is mimics - to be classified as intelligent, an algorithm should be able to significantly mimic human-level intelligence in your domain. For example, if your algorithm could read human handwriting and predict what font it was written in, or maybe what font it's closely related to, that would be something that's closer to the AI bandwagon.

jacquesm · on April 30, 2017

But that doesn't make it 'an AI'. That's a subtle difference but it matters.

amelius · on April 30, 2017

See also https://www.myfonts.com/WhatTheFont/

retox · on April 30, 2017

What The Font has been good to me in the past.

tyingq · on April 30, 2017

https://www.whatfontis.com/ as well, though it presents an irritating survey you have to decline to see the results.

chc · on April 30, 2017

This certainly sounds like an AI application. Do you have some reason to think it isn't?

hk__2 · on April 30, 2017

It uses OCR, then compares what it finds with everything in the database. It doesn’t “learn” anything from what it finds; in fact you can’t even train it because it uses a static database.

nom · on April 30, 2017

Yes, it is a necessary part of an AI, but I still wouldn't dare to call it an "AI Algorithm". Overuse of the term poisons the actual AI research field; its only use is to aid clickbait.

rsrsrs86 · on April 30, 2017

It is AI, it is just that not everything called AI is really interesting anymore. This should have attracted interest in the 80's.

mrcactu5 · on April 30, 2017

this is amazing ... how is this even possible?