Wikipedia Grappling with Deletion of IHT.com (thomascrampton.com)
39 points by functional-tree on May 10, 2009 | hide | past | favorite | 13 comments


> How can we count the links to the IHT in Wikipedia?

Someone didn't do their research. It's easy.

http://en.wikipedia.org/wiki/Special:LinkSearch/*.iht.com


Looks like there are 9450 of them (as of now). Quite a lot.

http://en.wikipedia.org/w/index.php?title=Special:LinkSearch...
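For anyone who'd rather not scrape Special:LinkSearch, the same count can presumably be pulled from the MediaWiki API's `list=exturlusage` query. A hedged sketch (parameter names are from my reading of the API docs; verify before relying on them):

```python
from urllib.parse import urlencode

def exturlusage_url(pattern, limit=500):
    """Build a MediaWiki API URL listing pages with external links matching `pattern`."""
    params = {
        "action": "query",
        "list": "exturlusage",   # external-URL usage list
        "euquery": pattern,      # search pattern, e.g. "*.iht.com"
        "eulimit": limit,
        "format": "json",
    }
    return "https://en.wikipedia.org/w/api.php?" + urlencode(params)

url = exturlusage_url("*.iht.com")
```

Paging through the results (the API caps each response) would give the full 9450.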


This seems like a colossal waste of valuable SEO linkjuice. Good lord! Would it be so hard to do a proper redirect showing the redesigned pages / branding?


The URL is the modern C pointer. But rather than pointing to objects we control, we now hold pointers to anything we want. We're at the mercy of everyone's ability to keep those objects alive, lest our URLs just become dangling pointers.


Which is like having a C pointer to an object controlled by a GC, and the GC does not know (or care) that you have a reference to one of its objects.


No need to make it that complicated. In this case, the NYT deliberately "free"'d the "memory", even though they knew it was still in use. "It's the caller's problem now."


Maybe we should use call by value: make a copy and link to the Internet Archive (when legal).


Definitely a good idea. I personally know that it's difficult to keep links around when they are implementation-dependent.

Look at my blog for example; with URIs like /comments/b31956d4-3dd1-11de-ba89-db8db9812c92/b5a70270-3dd1-11de-a405-db2750a175d7/b5a70270-3dd1-11de-a405-db2750a175d7.pod, it is going to be a nightmare to keep all links alive when I rewrite the software. In this case, I think I'm going to just kill them, since nobody links to individual comments. But I am not the NYT :)


That'll teach you not to use GUIDs for primary keys! :)

Use a regular auto-incrementing number, starting over at 1 for each story, and you'll have a much easier time transitioning software. (Plus you'll save (a little) on bandwidth.)

The only good reason I have ever heard of for using a GUID as a primary key is merging two databases: the GUIDs will not collide.
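The merge argument in miniature (a toy sketch, not anyone's actual schema): auto-increment keys from two databases collide on merge, while random UUIDs, with overwhelming probability, do not.

```python
import uuid

# Two "databases" keyed by auto-incrementing integers.
db_a = {1: "story A1", 2: "story A2"}
db_b = {1: "story B1", 2: "story B2"}
assert set(db_a) & set(db_b)  # integer keys collide on merge

# Re-key both with random UUIDs: collisions are astronomically unlikely.
db_a2 = {uuid.uuid4(): v for v in db_a.values()}
db_b2 = {uuid.uuid4(): v for v in db_b.values()}
assert not set(db_a2) & set(db_b2)

merged = {**db_a2, **db_b2}  # all four rows survive the merge
```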


No, that's not the problem. I would have the same problem if I used integer primary keys -- "/comments/4/87/98/293.txt". I would still have to somehow map that old scheme to my new one. (If I used the same storage system for my rewrite, my UUID keys would be fine; no need for integer primary keys.)

Additionally, the current version uses an immutable lockless storage system; if I had to increment keys I would have to lock. (The new version will also use UUIDs for the same reason.)
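The mapping problem the parent describes is independent of key type: any rewrite needs an explicit old-path-to-new-path table, and anything deliberately killed gets a 410. A hypothetical sketch (both URL schemes here are made up for illustration):

```python
# Hypothetical redirect table from the old URL scheme to the new one.
REDIRECTS = {
    "/comments/4/87/98/293.txt": "/posts/293",
}

def resolve(old_path):
    """Return an (HTTP status, new location) pair for an old-scheme URL."""
    new_path = REDIRECTS.get(old_path)
    if new_path is None:
        return 410, None      # Gone: the link was deliberately killed
    return 301, new_path      # permanent redirect preserves inbound links

status, location = resolve("/comments/4/87/98/293.txt")
```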


Not really; just change it to call-by-value with lazy-evaluation semantics (sort of). The solution is to design your webserver so that for every (static?) link it serves, it keeps both a cached/archived version and a link to the live location.
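A minimal sketch of that idea, with the HTTP client abstracted away as a `fetch` callable (everything here is illustrative, not a real server):

```python
cache = {}

def serve_link(url, fetch):
    """Serve an outbound link: refresh the cached copy on success,
    fall back to the cache when the live location is gone."""
    try:
        body = fetch(url)
        cache[url] = body  # lazily archive on first (and every) success
        return {"live": url, "body": body, "from_cache": False}
    except IOError:
        if url in cache:   # live location died; serve the archived copy
            return {"live": url, "body": cache[url], "from_cache": True}
        raise              # never fetched successfully, nothing to serve

# Illustrative fetchers standing in for a real HTTP client:
ok = lambda url: "<html>IHT story</html>"
def gone(url):
    raise IOError("404")

first = serve_link("http://iht.com/story", ok)    # live fetch, now cached
later = serve_link("http://iht.com/story", gone)  # live copy gone, cache wins
```

The cache fills lazily, so only links that were actually served get archived.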


BTW: IHT is the International Herald Tribune (a newspaper).

(Which is not mentioned until the very end of the article, and then only accidentally.)


They could have redirected it to the Internet Archive version of those pages?

Why don't they do that?
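Mechanically it would be a one-line redirect. The Wayback Machine serves snapshots at `web.archive.org/web/<timestamp>/<original-url>`, and as I understand it a partial timestamp like "2009" is resolved to the nearest capture (worth verifying against the Archive's docs before deploying):

```python
def wayback_redirect(dead_url, timestamp="2009"):
    """Build the Location header value for a redirect to the archived copy."""
    return f"https://web.archive.org/web/{timestamp}/{dead_url}"

location = wayback_redirect("http://example.com/some-story")
```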



