Wikipedia Grappling with Deletion of IHT.com (thomascrampton.com)
39 points by functional-tree on May 10, 2009 | hide | past | favorite | 13 comments


> How can we count the links to the IHT in Wikipedia?

Someone didn't do their research. It's easy.

http://en.wikipedia.org/wiki/Special:LinkSearch/*.iht.com


Looks like there are 9450 of them (as of now). Quite a lot.

http://en.wikipedia.org/w/index.php?title=Special:LinkSearch...
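For anyone who'd rather not scrape Special:LinkSearch, the same count can presumably be pulled from the MediaWiki API's `list=exturlusage` query. A hedged sketch (parameter names are from my reading of the API docs; verify before relying on them):

```python
from urllib.parse import urlencode

def exturlusage_url(pattern, limit=500):
    """Build a MediaWiki API URL listing pages with external links matching `pattern`."""
    params = {
        "action": "query",
        "list": "exturlusage",   # external-URL usage list
        "euquery": pattern,      # search pattern, e.g. "*.iht.com"
        "eulimit": limit,
        "format": "json",
    }
    return "https://en.wikipedia.org/w/api.php?" + urlencode(params)

url = exturlusage_url("*.iht.com")
```

Paging through the results (the API caps each response) would give the full 9450.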


This seems like a colossal waste of valuable SEO linkjuice. Good lord! Would it be so hard to do a proper redirect showing the redesigned pages / branding?


The URL is the modern C pointer. But rather than pointing to objects we control, we now hold pointers to anything we want. We're at the mercy of everyone's ability to keep those objects alive, lest our URLs just become dangling pointers.


Which is like having a C pointer to an object controlled by a GC, and the GC does not know (or care) that you have a reference to one of its objects.


No need to make it that complicated. In this case, the NYT deliberately "free"'d the "memory", even though they knew it was still in use. "It's the caller's problem now."


Maybe we should use call by value: make a copy and link to the Internet Archive (when legal).


Definitely a good idea. I personally know that it's difficult to keep links around when they are implementation-dependent.

Look at my blog for example; with URIs like /comments/b31956d4-3dd1-11de-ba89-db8db9812c92/b5a70270-3dd1-11de-a405-db2750a175d7/b5a70270-3dd1-11de-a405-db2750a175d7.pod, it is going to be a nightmare to keep all links alive when I rewrite the software. In this case, I think I'm going to just kill them, since nobody links to individual comments. But I am not the NYT :)


That'll teach you not to use GUIDs for primary keys! :)

Use a regular auto-incrementing number, starting over at 1 for each story, and you'll have a much easier time transitioning software. (Plus you'll save (a little) on bandwidth.)

The only good reason I have ever heard of for using a GUID as a primary key is merging two databases: the GUIDs will not collide.
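The merge argument in miniature (a toy sketch, not anyone's actual schema): auto-increment keys from two databases collide on merge, while random UUIDs, with overwhelming probability, do not.

```python
import uuid

# Two "databases" keyed by auto-incrementing integers.
db_a = {1: "story A1", 2: "story A2"}
db_b = {1: "story B1", 2: "story B2"}
assert set(db_a) & set(db_b)  # integer keys collide on merge

# Re-key both with random UUIDs: collisions are astronomically unlikely.
db_a2 = {uuid.uuid4(): v for v in db_a.values()}
db_b2 = {uuid.uuid4(): v for v in db_b.values()}
assert not set(db_a2) & set(db_b2)

merged = {**db_a2, **db_b2}  # all four rows survive the merge
```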


No, that's not the problem. I would have the same problem if I used integer primary keys -- "/comments/4/87/98/293.txt". I would still have to somehow map that old scheme to my new one. (If I used the same storage system for my rewrite, my UUID keys would be fine; no need for integer primary keys.)

Additionally, the current version uses an immutable lockless storage system; if I had to increment keys I would have to lock. (The new version will also use UUIDs for the same reason.)
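The mapping problem the parent describes is independent of key type: any rewrite needs an explicit old-path-to-new-path table, and anything deliberately killed gets a 410. A hypothetical sketch (both URL schemes here are made up for illustration):

```python
# Hypothetical redirect table from the old URL scheme to the new one.
REDIRECTS = {
    "/comments/4/87/98/293.txt": "/posts/293",
}

def resolve(old_path):
    """Return an (HTTP status, new location) pair for an old-scheme URL."""
    new_path = REDIRECTS.get(old_path)
    if new_path is None:
        return 410, None      # Gone: the link was deliberately killed
    return 301, new_path      # permanent redirect preserves inbound links

status, location = resolve("/comments/4/87/98/293.txt")
```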


Not really; just change it to call-by-value with lazy-evaluation semantics (sort of). The solution is to design your webserver so that for every (static?) link it serves, it keeps both a cached/archived version and a link to the live location.
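A minimal sketch of that idea, with the HTTP client abstracted away as a `fetch` callable (everything here is illustrative, not a real server):

```python
cache = {}

def serve_link(url, fetch):
    """Serve an outbound link: refresh the cached copy on success,
    fall back to the cache when the live location is gone."""
    try:
        body = fetch(url)
        cache[url] = body  # lazily archive on first (and every) success
        return {"live": url, "body": body, "from_cache": False}
    except IOError:
        if url in cache:   # live location died; serve the archived copy
            return {"live": url, "body": cache[url], "from_cache": True}
        raise              # never fetched successfully, nothing to serve

# Illustrative fetchers standing in for a real HTTP client:
ok = lambda url: "<html>IHT story</html>"
def gone(url):
    raise IOError("404")

first = serve_link("http://iht.com/story", ok)    # live fetch, now cached
later = serve_link("http://iht.com/story", gone)  # live copy gone, cache wins
```

The cache fills lazily, so only links that were actually served get archived.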


BTW: IHT is the International Herald Tribune (a newspaper).

(Which is not mentioned until the very end of the article, and then only accidentally.)


They could have redirected it to the Internet Archive version of those pages?

Why don't they do that?
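Mechanically it would be a one-line redirect. The Wayback Machine serves snapshots at `web.archive.org/web/<timestamp>/<original-url>`, and as I understand it a partial timestamp like "2009" is resolved to the nearest capture (worth verifying against the Archive's docs before deploying):

```python
def wayback_redirect(dead_url, timestamp="2009"):
    """Build the Location header value for a redirect to the archived copy."""
    return f"https://web.archive.org/web/{timestamp}/{dead_url}"

location = wayback_redirect("http://example.com/some-story")
```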



