AI to help solve simple problems
The other day, my colleague wondered whether artificial intelligence could be used to determine the gloomiest Christmas song ever. I gave the question some thought and concluded that it ought to be possible. The question called for a text sentiment analysis, which is a routine task for modern AI applications. This convinced me that AI could be applied for the purpose. Despite its grandiose name, artificial intelligence mostly boils down to plain and simple statistical modelling. AI applications are, of course, highly automated and therefore well suited to the processing of large sets of data. They are also capable of identifying subtle interdependencies in data, which easily go undetected by humans. This was a great chance to test artificial intelligence for finding a solution to a rather simple problem!
Since Innofactor operates in the Nordic countries (Finland, Sweden, Norway and Denmark), we decided to also make a Nordic comparison of Christmas songs while we were at it. Which country had the most cheerful songs, and which country the gloomiest songs? My colleague, by the way, was pretty sure that “Tonttujen jouluyö” (the “tip-tap” song) would prove to be the most cheerful song.
Sentiment analysis of the words with the help of IMDb movie reviews and a group of tweets
We got down to business. First, we collected a number of popular Christmas songs from the different Nordic countries, looked up their lyrics and translated them into English using Google Translate. The reason for this was that sentiment analysis does not work equally well for the different Nordic languages.
I first tested an existing sentiment-analysis application, but it had trouble analysing some of the texts. Perhaps the words used in Christmas songs aren’t familiar enough for normal AI? I then decided to build up my analysis from smaller components. My model was based on the idea of determining the sentiment (either positive or negative) for each word in a song. The overall sentiment of the song would then equal the average of the sentiments of individual words.
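The averaging idea can be illustrated with a minimal Python sketch. It is not the code used for the analysis: the tiny word list, its values and the function names are made up for the example, and skipping words that have no known sentiment is an assumption.

```python
import re

# Hypothetical word-level sentiments, for illustration only:
# positive values are cheerful, negative values are gloomy.
WORD_SENTIMENT = {"joy": 1.0, "merry": 0.8, "cold": -0.6, "dark": -0.7}


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def song_score(lyrics, word_sentiment):
    """Return the average sentiment of the known words in the lyrics.

    Words missing from word_sentiment are skipped (an assumption).
    Returns None when no word in the lyrics has a known sentiment.
    """
    scores = [word_sentiment[w] for w in tokenize(lyrics) if w in word_sentiment]
    if not scores:
        return None
    return sum(scores) / len(scores)


cheerful = song_score("Merry, merry joy!", WORD_SENTIMENT)
gloomy = song_score("The night is cold and dark", WORD_SENTIMENT)
```

With these made-up values, the first line scores higher than the second, so it would be ranked as the more cheerful of the two.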
To do this, I needed a list of all the words in the songs and the sentiments attached to them. Not surprisingly, I found a solution to this, too, online. Various databases of texts that have been evaluated as positive or negative are openly available for text analysis. In this case, I used two such databases: IMDb movie reviews and a group of tweets, which had also been evaluated in this way. I extracted the words from my material and assigned a positive or negative value to each word in a movie review or tweet in accordance with the overall evaluation of the text. Depending on the source, the result for individual words was sometimes unexpected. For example, if a fan of gothic horror described a movie as being “wonderfully grim”, this resulted in the word “grim” getting a positive evaluation. However, since I used thousands of reviews, each individual word got a number of different results, so I expect that the average of these results closely corresponds to the typical sentiment of the word.
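As a rough sketch of this step, the Python below gives each word the label of every text it appears in and then averages those labels per word. The three example reviews, the +1/-1 label encoding and the function names are assumptions made for illustration; they are not taken from the actual data sets or from the original analysis.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_word_sentiment(labelled_texts):
    """Average, per word, the labels of the texts the word occurs in.

    labelled_texts is an iterable of (text, label) pairs, where the label
    is assumed to be +1 for a positive text and -1 for a negative one.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for text, label in labelled_texts:
        for word in tokenize(text):
            totals[word] += label
            counts[word] += 1
    return {word: totals[word] / counts[word] for word in totals}


# Made-up examples: one positive and two negative reviews.
reviews = [
    ("A wonderfully grim film", 1),
    ("Grim and boring", -1),
    ("Grim, slow, joyless", -1),
]
lexicon = build_word_sentiment(reviews)
```

In this toy example “grim” appears in one positive and two negative reviews, so its average comes out negative even though one reviewer used it approvingly. That is the effect the averaging relies on.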
Norway has the gloomiest songs – Finland the most cheerful
As for the Nordic comparison, the result of my analysis was something of a surprise. According to the model, the most cheerful Christmas songs, on average, come from Finland. An analysis based exclusively on the IMDb classification put Sweden slightly ahead of Finland. However, irrespective of the calculation method, Norway was home to the gloomiest songs. The following tables contain a list of the gloomiest and the most cheerful Christmas songs. You can also view the results interactively in this Power BI report.
The lists of both gloomy and cheerful songs are topped by tunes from the other Nordic countries, which I’m not familiar with. As for the Finnish songs, the model classified “Tonttu” (“The Elf”) as the gloomiest one. The lyrics translate as something like “The roofs are topped with sleet, the elf cannot get sleep”, which, admittedly, doesn’t sound particularly upbeat. Fourth on the list is, quite surprisingly, “No onkos tullut kesä” (“Is summer suddenly here?”), which is hardly a gloomy Christmas song. The lyrics apparently include individual words which, taken out of context, leave the listener with a sour feeling. The AI model I used was simple, and it probably gets stuck on individual word evaluations instead of the overall evaluation of texts. It is unable to identify words used in unusual contexts. The model should be developed further to interpret word clusters, and the tempo and melody of the songs might also offer more insight into the sentiment.
The list of the most cheerful songs is less surprising. The “tip-tap” song tops the list together with “En etsi valtaa loistoa”. In other words, my colleague was right!
The most joyful and depressing Christmas songs
"Score" measures the mood of the song. The bigger the number, the more joyful the song.
Tapio works at Innofactor as a data analyst and modeller. Earlier in his career he studied, among other things, the annual rhythm of plants, and the Mustikkaan model is a continuation of that work.