I ran across a very interesting story yesterday. Apparently genetic researchers are having some real trouble with their spreadsheets: important data is being wrongly interpreted by Excel as specific data formats and ends up getting mangled irreversibly, leading to data corruption. For example, the gene identifier “2310009E13” got converted to the floating point value “2.31E+13,” and the tumor suppressor DEC1 [Deleted in Esophageal Cancer 1] was being converted to ‘1-DEC.’
It brings to mind the floating point multiplication bug from a few years back, where again things got implicitly converted from one data type to another because the system had no type information to rely on.
Granted, this is Excel, and it would be beyond silly to force spreadsheet users to define every cell as “integer,” “float,” “date,” etc., but the point here is that there’s no good way to set a type definition when one would be useful. That can certainly be done; DWS allows you to define types for your variables, Delphi-style, or to use type inference instead. Likewise, in Boo, types are inferred by default, or defined explicitly when you define them. But in Excel, there’s no simple way to say “these values are strings and not anything else.” And before anyone says that Excel is not a programming language, the folks at Microsoft consider it to be one.
So remember, next time you hear someone talking about how horrible a “verbose” language like Delphi is compared to a “powerful, elegant” dynamic language, that all that syntax is there for a reason. It carries meaning, both for the compiler and also for human beings who go to read the code, and when that meaning isn’t available, we’re left with a computer program trying to read the user’s mind, which never works well.