Big Data Analytics on student surveys

It’s the new “thing” – analytics applied to student responses to courses. And it is really quite scary.

To give an example, I will share my own results from a recently taught course of 22 students, of whom 10 filled out the survey. This is “small data”. It takes about 5-10 minutes (generously) to read and reflect upon the student feedback. Since I am sharing, they generally liked the course including guest lectures and excursions, but felt that one topic didn’t need as much time and that my Moodle page wasn’t well organised. All very helpful for the next time I run the course (note to self to start my Moodle page earlier and tweak the class schedule).

The problem is no longer the feedback, it is the “analytics” which now accompany it. The worst is the “word clouds”. I look at the word cloud for my course and see big words (these generally reflect the feedback, subject to an exception discussed below) and then smaller words and phrases. Now the smaller ones in a word cloud are obviously meant to be “less” important but these are really quite concerning, so much so that I initially panicked. They include “disrespectful/rude”, “unapproachable”, “not worthwhile”, “superficial” and “unpleasant”. Bear in mind the word cloud precedes the actual comments in my report. None of these terms (nor their synonyms) were used by ANY of the students (unless a poorly organised Moodle page could count as “unapproachable”). And they are really horrible things to say about someone, especially when there is no basis for these kinds of assertions in the actual feedback received.

The problem here is applying a “big data” tool to (very) small data. It doesn’t work, and it can be actively misleading. One of the word clouds (there are different ones for different “topics”) had the word “organised”. That came up because students were telling me my Moodle page was NOT well organised, but it would be easy to think at a quick glance that this was praise.
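A minimal sketch of how this can happen, assuming the word cloud is sized by simple word counts after common “stop words” (including “not”) have been removed. The tool behind the report may well work differently, and the comments and stop-word list below are invented for illustration; they are not the real survey responses.

```python
from collections import Counter
import re

# Hypothetical comments, made up for this sketch (NOT the actual student feedback).
comments = [
    "The Moodle page was not well organised.",
    "Moodle was not organised, so readings were hard to find.",
    "Loved the guest lectures and excursions.",
]

# Assumed stop-word list: short function words, including the negation "not".
STOP_WORDS = {"the", "was", "not", "so", "were", "to", "and"}

def word_counts(texts):
    """Count words across all comments after dropping stop words."""
    words = []
    for text in texts:
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in STOP_WORDS:
                words.append(word)
    return Counter(words)

counts = word_counts(comments)
# Words with the highest counts would be drawn largest in a word cloud.
print(counts.most_common(3))
```

In this sketch “organised” ends up among the most frequent words while “not” disappears entirely, so two complaints are displayed as what looks like praise. With only a handful of comments, a single word like this is enough to dominate the picture.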

So what is the point of this exercise? One imagines it might be useful if you have a course with hundreds of students (so that reading the comments would take an hour, say). But the fact that the word clouds can be actively misleading (as in “organised” above) demonstrates that you still need to read the comments to understand the context. Further, students often make subtle observations in comments (like the fact that too much time was spent on a particular topic) that are difficult to interpret in a word cloud where the phrases are aggregated and sprinkled around the place. So, it doesn’t really save time. The comments still need to be read and reflected on.

Big Data tools always sound very exciting. So much buzz! Imagine if we could predict flu epidemics from Google searches (that no longer works, by the way) or predict crime before it happens (lots of jurisdictions are trying this, particularly in the US). But the truth is more like the word cloud on student feedback – inappropriately applied, error prone, poorly understood by those deploying the tool, and thus often unhelpful. Data analytics CAN be a good tool, but it is a bit like a hammer: in the hands of those who don’t understand its function and limitations, everything looks like a nail.

Lyria Bennett Moses