29 November 2019

Student Evaluations In Promotion And Tenure

Stewart McKelvey


Stewart McKelvey is Atlantic Canada’s largest full-service law firm. We provide the legal expertise and innovative solutions clients need to move forward confidently.

Tenure and promotion applications are most often concerned with two main criteria: scholarship and teaching effectiveness. This raises the question of how something as subjective as "teaching effectiveness" can be accurately assessed. Student evaluation surveys have long been administered by post-secondary institutions across Canada and are often considered as part of the assessment of an academic's teaching. However, the use of student evaluation surveys was challenged in a recent arbitration between Ryerson University and its Faculty Association. The resulting award cost Ryerson the ability to use student evaluations as evidence of a professor's effectiveness (or lack thereof) in the classroom.

Arbitrator William Kaplan's decision has called into question both the reliability of student evaluations and whether they may be used in this context at all. Universities and colleges must therefore ensure that the language of their collective agreements explicitly allows student evaluation data to be used if they intend to rely upon it when making promotion decisions.


Faculty members at Ryerson University ("University") had been expressing concerns about the use of student survey data for at least 15 years before Kaplan issued his decision. Several discussions took place over the years and an online pilot project was rolled out to replace the traditional survey system. Nonetheless, faculty continued to take issue with the University's use of the data, which resulted in grievances being filed in 2009 and 2015, the latter of which led to a mediation-arbitration.

While the mediation resolved many of the issues, the practice of using student survey data to help measure teaching effectiveness remained outstanding. The dispute then proceeded to an interest arbitration, a process in which the parties agree to have an arbitrator decide terms of their collective agreement.

The Faculty's position was that the use of scoring averages was ineffective and inaccurate because student surveys failed to provide reliable data. They alleged a significant bias in many of the surveys and even possible violations of the Human Rights Code. Ultimately, they believed that student evaluations had no place in the evaluation of teaching effectiveness.

Ryerson argued that although student surveys were not solely determinative of a faculty member's teaching effectiveness, the questionnaires did allow common issues and concerns to be identified alongside the other methods of evaluation. In addition, Ryerson felt that changes to evaluative tools should be gradual and worked out through the University's internal processes.

Kaplan weighed the strengths and weaknesses of Ryerson's student survey system in arriving at his conclusion. In doing so, he relied heavily on the expert evidence of Professors Philip Stark and Richard Freishtat of UC Berkeley. Stark and Freishtat's evidence was that student surveys were biased based on an array of immutable personal characteristics including race, gender, accent, age and even a professor's attractiveness. This evidence led Kaplan to conclude that Ryerson's student surveys were "imperfect at best and downright biased and unreliable at worst."1

The most controversial use of the survey data was the practice of aggregating an average score which was then used to compare individual professors with other faculty as well as the University more broadly. Kaplan held that this practice was entirely inappropriate, stating, "The evidence is clear, cogent, and compelling that averages establish nothing relevant or useful about teaching effectiveness. Averages are blunt, easily distorted (by bias) and inordinately affected by outlier/extreme responses. Quite possibly their very presence results in inappropriate anchoring."2

How, then, can teaching effectiveness be measured if not by the students who attend the classes week in and week out?

Kaplan ultimately determined that the evidence before him demonstrated that teaching effectiveness was more reliably evaluated by assessing the applicant's teaching dossier in conjunction with their in-class peer evaluations. However, he also recognized that measuring teaching effectiveness is a difficult process and that student surveys have some place in the broader context of faculty assessment. As a result, he found a compromise to be in order. Ryerson student surveys can continue to be administered, but the data must be presented as frequency distributions, and the relevant decision-makers must be educated about the inherent limitations of student survey data in order to minimize any bias or unreliability. The compromise also came with a caveat: Ryerson student evaluations can no longer be used to determine the specific issue of teaching effectiveness.


In 2016, Memorial University of Newfoundland ("MUN") encountered this issue in an arbitration with its Faculty Association over a tenure denial. In that case, the collective agreement stated that a professor's application could include course evaluations if the professor chose to include them, and the grievor had done so voluntarily. In addition, a sample teaching dossier guide from the Canadian Association of University Teachers was attached as an appendix to the collective agreement. While acknowledging the inherent limitations of student evaluations, the guide went on to say that the data they produced was nonetheless capable of demonstrating impressions of the workload and the instructor's characteristics.

Dr. Philip Stark was also tendered as an expert witness in that arbitration. He was hired by the Faculty Association and produced a report very similar to the one he provided in the Ryerson arbitration. However, due to the wording of the collective agreement and sample teaching dossier, MUN was able to demonstrate that Dr. Stark's evidence was neither relevant nor necessary. Consequently, the arbitration panel found it inadmissible and excluded it. The panel specifically noted that both documents clearly allowed for the use of student evaluations in measuring teaching effectiveness. Ultimately, the arbitration panel upheld the denial of tenure.3


One thing is certain: unions will reference the Ryerson decision in an effort to exclude student survey data where it is not conducive to the success of a member's tenure application. While arbitrators' decisions are not binding on one another, Kaplan's reasoning may influence subsequent decisions should another arbitrator find his analysis persuasive. Consequently, it is imperative that universities and colleges take a close look at the wording of their collective agreements and consider whether they are at risk of being unable to use student evaluations as part of their assessment of a tenure applicant's teaching effectiveness.


1. Ryerson University v Ryerson Faculty Association, 2018 CanLII 58446 (ON LA) at page 5.

2. Ibid at page 7.

3. Memorial University of Newfoundland v Memorial University of Newfoundland Faculty Association (Rolland), (2016) unpublished. Stewart McKelvey was counsel on this case.

The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.
