Content Hashing

The idea is that similar content should produce similar hashes.
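One way to get this property is a SimHash-style fingerprint. The sketch below is only an illustration and is not taken from this page: the function name `simhash`, the word-level tokenization, the 64-bit width, and the use of MD5 as a token hash are all assumptions.

```python
import hashlib
import re


def simhash(text: str, bits: int = 64) -> int:
    """Return a fingerprint of up to 64 bits (hypothetical helper).

    Each word votes on every bit position; texts that share most of
    their words tend to end up with fingerprints differing in few bits.
    """
    if not 1 <= bits <= 64:
        raise ValueError("bits must be between 1 and 64")
    votes = [0] * bits
    for token in re.findall(r"\w+", text.lower()):
        # MD5 is used here only as a stable mixing function, not for security.
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        for i in range(bits):
            votes[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, vote in enumerate(votes):
        if vote > 0:
            fingerprint |= 1 << i
    return fingerprint


if __name__ == "__main__":
    a = simhash("the quick brown fox jumps over the lazy dog")
    b = simhash("the quick brown fox jumped over the lazy dog")
    # Number of differing bits between the two fingerprints.
    print(bin(a ^ b).count("1"))
```

Two fingerprints are then compared by counting the bits in which they differ: a small count suggests similar content, though it does not guarantee it.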

Candidate functions for comparing strings or hashes: PHP's levenshtein, similar_text and gmp_hamdist (Hamming distance); possibly libdistance?
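To make two of these measures concrete, here is a minimal Python sketch of Levenshtein (edit) distance on strings and Hamming distance on non-negative integers. It illustrates the measures themselves and does not reproduce the exact behaviour of the functions named above; the names `levenshtein` and `hamming_distance` are chosen here for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn `a` into `b`."""
    if len(a) < len(b):
        a, b = b, a  # keep the working row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        previous = current
    return previous[-1]


def hamming_distance(x: int, y: int) -> int:
    """Number of bit positions in which two non-negative integers differ."""
    if x < 0 or y < 0:
        raise ValueError("both values must be non-negative")
    return bin(x ^ y).count("1")


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))    # 3
    print(hamming_distance(0b1011, 0b0001))    # 2
```

Edit distance suits raw strings of different lengths, while Hamming distance fits fixed-width hashes, where only the number of differing bits matters.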

Comments

Some interesting ideas about topic map similarity, which may be related:
http://kill.devc.at/node/186
