3.13. Generate two checksums for a document: one by summing the bytes of the document, and one with the Unix command cksum. Edit the document and see whether both checksums change. Can you change the document so that the simple checksum does not change?
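As a starting point for 3.13, the following is a minimal sketch of a byte-sum checksum. The function name `byte_sum_checksum`, the 32-bit wrap-around, and the sample strings are assumptions made for illustration. `zlib.crc32` is used only as a stand-in for a CRC-style checksum: it is a different CRC variant from the one `cksum` computes, so its values will not match `cksum` output.

```python
import zlib


def byte_sum_checksum(data: bytes, modulus: int = 2**32) -> int:
    """Add all byte values, wrapping at `modulus` (an assumed checksum width)."""
    return sum(data) % modulus


# Two hypothetical documents containing the same bytes in a different order.
original = b"Tropical fish"
transposed = b"Tropical fihs"

# Addition ignores byte order, so the simple checksum cannot tell these apart.
print(byte_sum_checksum(original), byte_sum_checksum(transposed))

# A CRC depends on byte positions, so it does distinguish them.
# Note: zlib.crc32 is not the same CRC variant as the Unix cksum command.
print(zlib.crc32(original), zlib.crc32(transposed))
```

To compare against `cksum` itself, the same two documents can be written to files and passed to the command directly.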
3.14. Write a program to generate simhash fingerprints for documents. You can use any reasonable hash function for the words. Use the program to detect duplicates on your home computer. Report on the accuracy of the detection. How does the detection accuracy vary with fingerprint size?
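One possible skeleton for 3.14 is sketched below, assuming the usual simhash procedure: weight each word by its frequency, hash each word to b bits, add the weight to a component where the hash bit is 1 and subtract it where the bit is 0, then set each fingerprint bit to 1 where the total is positive. The choice of MD5 as the word hash, whitespace tokenization, and the names `word_hash`, `simhash`, and `hamming_distance` are all assumptions, not a prescribed solution.

```python
import hashlib
from collections import Counter


def word_hash(word: str, bits: int) -> int:
    """Map a word to a `bits`-bit integer (MD5 is an arbitrary choice; bits <= 128)."""
    digest = hashlib.md5(word.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (128 - bits)


def simhash(text: str, bits: int = 64) -> int:
    """Compute a `bits`-bit simhash fingerprint using word frequencies as weights."""
    weights = Counter(text.lower().split())  # naive whitespace tokenization
    totals = [0] * bits
    for word, weight in weights.items():
        h = word_hash(word, bits)
        for i in range(bits):
            # Add the weight where the hash bit is 1, subtract it where it is 0.
            if (h >> i) & 1:
                totals[i] += weight
            else:
                totals[i] -= weight
    fingerprint = 0
    for i, total in enumerate(totals):
        if total > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


doc_a = "tropical fish include fish found in tropical environments"
doc_b = "tropical fish include fish found in tropical waters"
for bits in (8, 16, 32, 64):
    print(bits, hamming_distance(simhash(doc_a, bits), simhash(doc_b, bits)))
```

Documents whose fingerprints differ in only a few bits are candidate near-duplicates. The distance threshold and the fingerprint size are the parameters to vary when reporting accuracy.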