This is a pretty old technique in the demoscene (it's called bump mapping there). There are probably a few hundred demos from the late 90s that use it on everything from Amigas to the DOS systems of the time.
Normal mapping has a long history. Jim Blinn did the original work in 1978 with "Simulation of Wrinkled Surfaces" [0], which precomputes the normal perturbation into a height texture ("bump map") and does some sleight-of-hand arithmetic to integrate it back. There was a ton of noise about putting it in hardware around 1998 or so.
The modern normal map was introduced by Kilgard as a rotation on top of the unperturbed surface normal [1] in surface tangent space, allowing it to be encoded with three channels, and compressed to two.
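For anyone wondering how three channels get squeezed into two: a tangent-space normal is unit length and points away from the surface, so z can be rebuilt from x and y. Here is a rough Python sketch of that idea. The function names and the 8-bit quantization are my own assumptions for illustration, not anything taken from [1].

```python
import math

def encode_two_channel(normal):
    """Store only x and y of a unit tangent-space normal as 8-bit values.

    Assumes z >= 0 (the normal points out of the surface), which is
    what makes dropping z recoverable.
    """
    x, y, _z = normal
    # Map [-1, 1] to [0, 255].
    return (round((x * 0.5 + 0.5) * 255), round((y * 0.5 + 0.5) * 255))

def decode_two_channel(rg):
    """Rebuild the full normal from the two stored channels."""
    x = rg[0] / 255 * 2 - 1
    y = rg[1] / 255 * 2 - 1
    # Unit length gives z = sqrt(1 - x^2 - y^2); clamp against
    # quantization error pushing the radicand below zero.
    z = math.sqrt(max(0.0, 1.0 - x * x - y * y))
    return (x, y, z)
```

The round trip is lossy only by the 8-bit quantization, so decoding an encoded (0.6, 0.0, 0.8) lands within about 0.01 of the original in each component.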
How the hell people like you have links to things like [0] will never cease to amaze me. This is why HN is irreplaceable imho. This place is too damn interesting.
Bump maps are subtly different from normal maps. A bump map is a grayscale map that holds height info (displacement along the face normal). It's a slightly smaller file, but it holds slightly less information.
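To make the difference concrete, here is a minimal Python sketch that turns a grayscale height map into tangent-space normals using central differences. The name `height_to_normals`, the `strength` parameter, and the clamped edge handling are assumptions of mine, and real tools differ in filter choice and sign conventions.

```python
import math

def height_to_normals(height, strength=1.0):
    """Derive per-texel unit normals from a 2D list of heights."""
    rows, cols = len(height), len(height[0])
    normals = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Central differences; indices are clamped at the borders,
            # so edge texels effectively get a halved one-sided slope.
            left = height[r][max(c - 1, 0)]
            right = height[r][min(c + 1, cols - 1)]
            up = height[max(r - 1, 0)][c]
            down = height[min(r + 1, rows - 1)][c]
            dx = (right - left) * 0.5 * strength
            dy = (down - up) * 0.5 * strength
            # The normal of the surface z = h(x, y) is (-dh/dx, -dh/dy, 1),
            # normalized to unit length.
            length = math.sqrt(dx * dx + dy * dy + 1.0)
            row.append((-dx / length, -dy / length, 1.0 / length))
        normals.append(row)
    return normals
```

A flat height map gives (0, 0, 1) everywhere, and a ramp tilts the normal away from the uphill direction. Going this way is easy. Going from a normal map back to heights means integrating, which is not always possible for an arbitrary normal map.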