For a file to be matched to an exact copy on the web, the computer calculates a hash value for the file and then searches for that same hash value.
There are different ways that hash values are figured. There is SHA-1, SHA-2, eDonkey, etc. That makes it harder to do a search, because the same file has a different hash value under each method. A search only works if both sides figured the hash the same way.
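To make that concrete, here is a rough sketch in Python using the standard `hashlib` module. The sample bytes and the `digests` helper are made up for illustration, and eDonkey is left out because it is built on MD4, which not every Python build provides.

```python
import hashlib

def digests(data: bytes) -> dict:
    """Hash the same bytes with several algorithms (hypothetical helper)."""
    return {
        name: hashlib.new(name, data).hexdigest()
        for name in ("md5", "sha1", "sha256")
    }

# Stand-in for the contents of a real file (assumed sample data).
sample = b"example file contents"

for name, value in digests(sample).items():
    print(name, value)
```

Every line printed is a different value for the same bytes, so a SHA-1 hash can only be looked up against other SHA-1 hashes.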
Also, take something like a photo file. You could zoom in to a single pixel, change the shade just a hair, and that file would have a completely new hash value.
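A quick way to see this for yourself is the sketch below. It assumes a tiny made-up 4x4 grayscale image stored as raw bytes rather than a real photo format, and it uses SHA-256 just as one example method. A real JPEG or PNG is compressed, but the idea is the same: change any byte and the hash changes.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hash of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical image: 16 pixels, all the same mid-gray shade (128).
pixels = bytearray([128] * 16)
original = sha256_hex(bytes(pixels))

# Make one pixel one shade lighter and hash again.
pixels[5] += 1
modified = sha256_hex(bytes(pixels))

# Count how many of the 64 hex digits differ between the two hashes.
differing = sum(a != b for a, b in zip(original, modified))

print(original)
print(modified)
print(differing, "of 64 hex digits differ")
```

The two hashes look nothing alike, which is why a hash search only finds exact copies and not slightly edited ones.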