Pro Tip: Perl’s Digest::MD5 hates Unicode (and so should you).
Here’s what I recently learned from perldoc Digest::MD5 (the hard way, of course):
Perl 5.8 support Unicode characters in strings. Since the MD5 algorithm is only defined for strings of bytes, it can not be used on strings that contains chars with ordinal number above 255. The MD5 functions and methods will croak if you try to feed them such input data.
Yes, that’s exactly what happened. I got a semi-cryptic error message. How to fix it?
What you can do is calculate the MD5 checksum of the UTF-8 representation of such strings. This is achieved by filtering the string through encode_utf8() function.
Of course! The exact opposite of what I’d done while trying to be a good Unicode Boy.
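For the record, here is a minimal sketch of the failure and the fix the perldoc describes. The variable names and the sample string are my own illustration, not anything from the module's documentation, and the exact wording of the error may differ between versions.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use Encode qw(encode_utf8);

# Illustrative input: a character string containing U+2603 (ordinal above 255).
my $text = "snowman: \x{2603}";

# Feeding the character string straight to md5_hex() croaks,
# so trap the error with eval instead of letting the script die.
my $direct = eval { md5_hex($text) };
print "direct call failed: $@" unless defined $direct;

# Encode to UTF-8 bytes first; MD5 is only defined for strings of bytes.
my $digest = md5_hex(encode_utf8($text));
print "digest of UTF-8 bytes: $digest\n";
```

The order that seems to work is: decode input into character strings on the way in, work with characters inside the program, and encode back to bytes only at the point where something (like MD5) needs bytes.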
I have a much longer blog post brewing in my head about how they never tell you in Computer Science classes that 80-90% of your “programming” time in the real world is dealing with failures, exceptional cases, and general debugging.