Can Teaching be Measured?

By Justine Rogers

Last week UNSW had its second ‘Great Debate’, introduced last year as a fun, accessible way for the UNSW community to explore a serious and stirring topic. (For a post on last year’s, click here)

Each team: professor-manager, non-prof academic, and student.

The topic: Of Course Teaching Can be Measured (it’s a 5.3!).

I was on the affirmative (which I knew going in would be tough).

Given it was a private event for staff and students, I’ve written this assuming some version of the Chatham House Rule applies.

The affirmative’s arguments were:

  1. Teaching can be measured, albeit imperfectly, and certainly better and more reliably than it is now.
  2. Teaching needs to be measured to enhance the quality, rewards and status of teaching.

The negative’s arguments were:

  1. Teaching cannot be measured, only learning experiences and learning outcomes can. 
  2. Teaching measures are flawed and unreliable.

The negative committed to the empirical questions, whereas I tried (unsuccessfully in the 4 or so mins we had) to engage both sides in the wider empirical and normative argument suggested in affirmative point 2: whether there is some positive correlation between measurement, and motivation, quality and status, and therefore whether a more robust measurement of teaching is worthwhile.

I wish we’d had the format and time to examine this: whether this is true, or whether, using research measures as an example, such measures have too many biases, perverse incentives, and inefficient and/or demoralising effects to be of real value (even if they have superficial value).

I will share my main arguments here, some of which I am fairly convinced by, many posed as part of my role on the affirmative side, and some raised in the spirit of fun and provocation. Above all, I think the topic raised several questions that still need to be contemplated, many of which I’ve posted below – so please share your thoughts!

[This is not a transcript of what was said, but these were my own speaking notes.]

“This debate is actually bigger than the narrow methodological approach of the negative. I’ve taken some of this from Gunn and Fisk’s HEA paper: it’s about whether or not measurement supports in good faith motivation, teaching quality, rewards and status or whether it is ultimately trivialising of a mosaic human activity, as part of a flawed and potentially harmful performance review policy; for management to discipline staff, and market themselves to students and against other universities, regardless of what types of motivation and levels of quality actually exist.

If it is for the worst, or turns into these pernicious things then of all the panel members, I will be the worst affected. As a woman early career academic on the lowest status rung, I am the one on the panel who currently has the most to lose from any methodological flaws and misguided uses of measuring teaching excellence. But, if I can get on board with the affirmative case that good teaching can be measured, then you can too!

I want to state that flawed measures of teaching are a concern. And flawed uses of teaching measures are another concern. 

Flawed measures: There are repeated, controlled experiments showing that student evaluations are unreliable and significantly biased against women, who get lower marks from both male and female students (and this could potentially apply to other under-represented professional groups). Student evaluation results are also significantly correlated with students’ expectations of their course performance, that is, the grades they’re sitting on at that point, as distinct from the learning outcomes or actual final performance. [Click here for sources.]

I agree that students have a role here, and are mostly very good judges of teaching. But they’re not allowed to be great. Most of them don’t know what [the student evaluation of teaching (SET) form] does or how the scale works. They don’t think we get to access the feedback, unfiltered. Some write like they’re anonymously trolling online. Some seem to see it as a way to purge pre-exam anxiety, or feel generally put upon that they’re being asked to fill in a form during one of the final classes. We’re asking for feedback when the learning cycle has not been completed. We’re asking for considered assessment at a very stressful time. In many ways, it’s like asking someone to write a restaurant review on Yelp while they’re in the middle of signing divorce papers.

And there is real concern about flawed uses: Staff are worried about it creating a two-tiered system. I’ve heard ‘second class citizen’ used many times to describe academics who want to spend more time on their teaching, teaching innovation, and teaching scholarship. And there’s concern that teaching and research will become too divided – when, now, research feeds into teaching and vice versa. [See related Law School Vibe posts, here, here, and here].

[By now the negative side were vigorously scribbling down, and getting ready to claim me as their “fourth speaker”!]

BUT a lot of this is about a misunderstanding of what good measurement of teaching is, and what it is and should be used for. We’re not asking for teaching excellence to be measured:

  1. By [the SET] as it is now, let alone [the current SET form] only
  2. To decide hiring, firing, and staff rewards as its only function
    and finally, we’re not asking for teaching excellence to be measured:
  3. In a system where there is no teaching infrastructure and culture to make teaching excellence, and its measurement and recognition within and outside UNSW, possible and sustainable

So, no, it can’t just be [SET] and, during the summation or questions, I will hopefully be able to outline a better possible range of teaching excellence modalities and measures. But, for now, they might include data from peers and former students, publications, effective collaborations and uptake of innovations, leadership, and other impact metrics, showing a range of outcomes for students, disciplines and institutions. [I had with me, but no time to share, examples of operationalised teaching modalities, found here and here.]

We need more robust and recognisable measures than the ones we have, for promotion, improvement and innovation, pride and commitment, job security and mobility. 

The negative team may say or suggest that academics enjoy and achieve these things anyway, without the measures, but teaching is at the very least beset by research pressure, or by a feeling that your time would be better spent researching than putting extra time into ensuring quality teaching. That is the truth of the experiences and conversations of academics.

Now, most lecturers are probably fairly to very conscientious about their teaching anyway, because we’re obsessive approval-seekers who don’t get warmth and affection anywhere else in our lives, but why should it be left to that? Why shouldn’t it be recognised as hard work and properly rewarded? Why should it be a painful choice on your holiday, when others are swimming down at the beach, whether to spend the time on a paper or time flipping your course? Why should there be only one legitimate means of neglecting your family, friends and health? You should be able to ruin your holiday in any number of career-affirming ways!

Right now, there is less incentive than there should be for academic staff to spend extra time introducing blended or individualised learning or other goals of [UNSW’s new strategy], especially once you have designed the course or your lecture notes or lesson plans to a certain level. These innovations take time, effort, collaboration, an evidence-base, and the money and support to test their effectiveness.

So our argument is that good teaching can be measured, not perfectly but certainly better than it is currently, and that it also must be measured for the benefit of the University, staff and students.

But the bigger questions are around how this can be done in a transparent, robust and fair way, and for what uses?

To create and sustain a wider, enabling culture, how do we ensure financial parity between teaching and research, and the same funding entitlements and opportunities, so that academic teachers can be excellent? What cultural changes can we make to reduce biases in our measures?

These seem like big questions, but very similar questions have been raised and addressed – and are still being addressed and need to be resolved – in the research excellence context. So there seems no reason why they can’t be resolved for this other central part of the academic pursuit and of the role of the university.”

By clap-o-meter, the negative convincingly won. They had a tighter set of arguments, and made better use of the debate format. Given that at least two colleague-audience members (of some 150 or more in attendance) leapt to their feet in ovation, I wondered afterwards whether at least some of the energy behind the applause was about the wider context, even though the negative never addressed it: the anxiety about and resentment towards performance culture generally, and the prospect of another part of the academic pursuit being subject to potentially highly disciplinary and divisive measures. This might be simply my ego defences talking, but I’m not convinced it was primarily in support of the gold standard of experimental research design, ever elusive for most social science research, and rarely achieved in science…

What are your thoughts? What do you think about the moves in higher education towards formally measuring teaching excellence? Can and should it be done? How and for what ends? (Here is a case for its introduction and workings). Is any Uni getting this right, so that measurement is genuinely related to teaching quality and status? Or is this a dreadful development?

One thought on “Can Teaching be Measured?”

  1. In my opinion, much less can be measured than we think can be measured. For example, how do you measure a “potentially good future law student”? At UNSW we are switching from an answer that says “HSC rank” to one that includes another (better targeted) test. In the US they look at CVs, essays and sometimes interviews – which leads to a “CV point scoring” culture in many ways as students (and parents) try to give kids the perfect life history for entry. So the truth – we cannot measure it perfectly or even well. Similarly, with teaching quality. We can probably get the extremes – the brilliant, inspiring teacher versus the teacher who drones from written notes, but measurement usually implies a way to get at the in between. That is where we fail. And it isn’t just teaching, it is everything.
