There are several details you need to match for a particular CRC implementation. Even with the same polynomial, results can differ because of minor differences in how the data bits are handled, the initial value of the CRC (sometimes it's zero, sometimes 0xffff), and/or whether the bits of the CRC are inverted. For example, some implementations work from the low-order bits of the data bytes up, while others work from the high-order bits down (as yours currently does).
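To make those differences concrete, here is a minimal sketch of a bitwise CRC-16 with the usual knobs exposed. The names (`crc16_param`, `reflect`, `refin`, `refout`, `xorout`) are mine, not from your code or from any particular library, and this is an illustration rather than a drop-in replacement:

```c
#include <stddef.h>
#include <stdint.h>

/* Reverse the bit order of the low `width` bits of v. */
static uint16_t reflect(uint16_t v, int width)
{
    uint16_t r = 0;
    for (int i = 0; i < width; i++) {
        if (v & (1u << i))
            r |= (uint16_t)(1u << (width - 1 - i));
    }
    return r;
}

/* Bitwise CRC-16 (hypothetical helper, parameter names are illustrative).
   poly   : polynomial without the top bit, e.g. 0x8005
   init   : starting value of the CRC register
   refin  : non-zero = feed each data byte least significant bit first
   refout : non-zero = reverse the 16 CRC bits at the end
   xorout : value XORed into the final CRC (0xFFFF inverts it) */
static uint16_t crc16_param(const uint8_t *data, size_t len, uint16_t poly,
                            uint16_t init, int refin, int refout,
                            uint16_t xorout)
{
    uint16_t crc = init;
    for (size_t i = 0; i < len; i++) {
        uint8_t byte = refin ? (uint8_t)reflect(data[i], 8) : data[i];
        crc ^= (uint16_t)(byte << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000u)
                crc = (uint16_t)((crc << 1) ^ poly);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    if (refout)
        crc = reflect(crc, 16);
    return (uint16_t)(crc ^ xorout);
}
```

For the nine ASCII bytes "123456789" with polynomial 0x8005, `crc16_param(msg, 9, 0x8005, 0x0000, 1, 1, 0x0000)` gives 0xBB3D, changing only the initial value to 0xFFFF gives 0x4B37, and turning both reflections off with a zero initial value gives 0xFEE8. These correspond to the parameter sets commonly catalogued as CRC-16/ARC, CRC-16/MODBUS and CRC-16/BUYPASS.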
Also, you need to 'push out' the last bits of the CRC after you've run all the data bits through.
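As a rough sketch of what 'pushing out' means, assume a plain shift-register CRC with a zero initial value that takes bits most significant first. The function names and the `flush` flag are hypothetical and only there for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Shift one message bit into a 16-bit CRC register, hardware style:
   the bit enters at the low end, and the polynomial is XORed in
   whenever a 1 falls off the high end. */
static uint16_t crc16_shift_bit(uint16_t crc, int bit, uint16_t poly)
{
    int carry = (crc & 0x8000u) != 0;
    crc = (uint16_t)((crc << 1) | (bit & 1));
    if (carry)
        crc ^= poly;
    return crc;
}

/* Run all data bits through the register (most significant bit first).
   If `flush` is non-zero, follow them with 16 zero bits so that the
   last data bits are pushed all the way through the register. */
static uint16_t crc16_augmented(const uint8_t *data, size_t len,
                                uint16_t poly, int flush)
{
    uint16_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        for (int b = 7; b >= 0; b--)
            crc = crc16_shift_bit(crc, (data[i] >> b) & 1, poly);
    }
    if (flush) {
        for (int i = 0; i < 16; i++)
            crc = crc16_shift_bit(crc, 0, poly);
    }
    return crc;
}
```

With polynomial 0x8005 and the ASCII bytes "123456789", the flushed version returns 0xFEE8, which is the value the usual non-reflected, zero-initialised CRC-16 produces. Without the 16 extra zero bits the register holds a different value, because the last two bytes of data have not finished passing through it.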
Keep in mind that CRC algorithms were designed to be implemented in hardware, so some of the bit-ordering conventions may not make much sense from a software point of view.
If you want to match the CRC16 with polynomial 0x8005 as shown on the lammertbies.nl CRC calculator page, you need to make the following changes to your CRC function:
a) run the data bits through the CRC loop starting from the least significant bit instead of from the most significant bit
b) push the last 16 bits of the CRC out of the CRC register after you've finished with the input data
c) reverse the CRC bits (I'm guessing this step is a carryover from hardware implementations)
So, your function might look like: