Monday, 5 January 2015

Compounds are too hard for people

English makes a nice international language in most respects (spelling versus phonology aside). One very nice feature is that when you need to combine existing nouns into a word for a new thing, you just chain them together with spaces: printers become laser printers and later 3d printers, without worries or problems for machines processing these words (I wish; Google Blogger actually does draw a red squiggly line under my 3d). This is notably not so in German, which smashes the words together without spaces, maybe adding some letters in between, or a hyphen, who knows. The result is a word that is unknown to a computer. Finnish is similar to German in this respect.

Now one might guess that it would be easy to just add spaces and be done with it. That is not always so, as the case that prompted me to write this post made me realise again. In fact, whether you write the words with or without a space in Finnish marks the difference between a generic and a specific term: talon mies (house+gen man+nom) is the man of the house, but talonmies is just a janitor. Most literate Finnish users will of course get this distinction right, but it gets harder for a whole lot of word combinations: is salad's dressing a single term for salad dressing or just the dressing of a salad (it usually is a single term), does it change if you add the dressing to ice cream instead (it does), and why is salad dressing not a dressing made of salad? In these cases the distinction doesn't matter, though: if you get the less probable variant, it's all the same.

However, in cases where it does matter, even good writers will get it wrong, as exemplified by an email in my inbox about a trip to city X. Written with a space, X:n matka (X+gen trip+nom) is not semantically plausible, since it can only refer to a trip made by city X (as in Y:n matka, where Y is a person); it must be X:n-matka, a construct that will trip up most writers for sure, so I am not surprised that a fellow linguist wrote it that way.
(It is worth noting that autocorrect, as usual, will only let you write the wrong forms, so it may well be the culprit.)
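
To make the point about compounds being unknown to a computer a bit more concrete, here is a minimal Python sketch. It is only an illustration under loud assumptions: the tiny lexicon and the function names is_known and split_compound are made up for this post, and no real spell checker or morphological analyser is claimed to work this way.

```python
# Hypothetical toy lexicon, invented for illustration only.
LEXICON = {"talo", "talon", "mies", "matka", "salaatin", "kastike"}


def is_known(word, lexicon=LEXICON):
    """Naive word-list lookup: a word is known only if it is listed as such."""
    return word.lower() in lexicon


def split_compound(word, lexicon=LEXICON):
    """Return every way to cut `word` into a sequence of lexicon entries."""
    word = word.lower()
    if not word:
        return [[]]  # exactly one way to split the empty string
    splits = []
    for i in range(1, len(word) + 1):
        head = word[:i]
        if head in lexicon:
            # Try to split the remainder recursively after a known head.
            for rest in split_compound(word[i:], lexicon):
                splits.append([head] + rest)
    return splits


if __name__ == "__main__":
    print(is_known("talonmies"))        # the closed compound is not in the list
    print(split_compound("talonmies"))  # but its parts can be recovered
```

Even in this toy version the splitter can only recover the parts talon and mies. It has no way of telling whether the writer meant the janitor or the man of the house, which is exactly the decision that is left to the human, and that the human gets wrong.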