Liu et al. Development and Validation of a Formative Evaluation Instrument
considerations on the measurement of teaching
quality from different perspectives. Zeitschrift Für
Pädagogik, 66, 138-155.
Fredricks, J. A., & McColskey, W. (2012). Handbook
of research on student engagement. Springer.
Galbraith, C. S., Merrill, G. B., & Kline, D. M. (2012).
Are student evaluations of teaching effectiveness
valid for measuring student learning outcomes in
business related classes? A neural network and
Bayesian analyses. Research in Higher Education,
53(3), 353-374.
Gravestock, P., & Gregor-Greenleaf, E. (2008). Student
course evaluations: Research, models and trends.
Higher Education Quality Council of Ontario.
Handelsman, J., Miller, S., & Pfund, C. (2007).
Scientific teaching. Macmillan.
Hativa, N., Barak, R., & Simhi, E. (2001). Exemplary
university teachers: Knowledge and beliefs
regarding effective teaching dimensions and
strategies. The Journal of Higher Education, 72(6),
699-729.
Henard, F., & Leprince-Ringuet, S. (2008). The path to
quality teaching in higher education.
https://www.oecd.org/education/imhe/41692318.pdf
Janssen, R., & De Boeck, P. (1999). Confirmatory
analyses of componential test structure using
multidimensional item response theory.
Multivariate Behavioral Research, 34(2), 245-268.
Kahu, E. R. (2013). Framing student engagement in
higher education. Studies in Higher Education,
38(5), 758-773.
Lehrer-Knafo, O. (2019). How to improve the quality
of teaching in higher education? The application of
the feedback conversation for the effectiveness of
interpersonal communication. EDUKACJA
Quarterly, 149(2).
Linacre, J. M. (2020). Winsteps® Rasch measurement
computer program. Winsteps.com
Liu, X. (2020). Using and developing measurement
instruments in science education: A Rasch
modeling approach (2nd ed.). IAP.
Mitchell, K. M., & Martin, J. (2018). Gender bias in
student evaluations. PS: Political Science &
Politics, 51(3), 648-652.
Nasser, F., & Fresko, B. (2002). Faculty views of
student evaluation of college teaching. Assessment
& Evaluation in Higher Education, 27(2), 187-198.
Pounder, J. (2007). Is student evaluation of teaching
worthwhile? An analytical framework for
answering the question. Quality Assurance in
Education, 15(2), 178-191.
Robitzsch, A., Kiefer, T., & Wu, M. (2019). TAM: Test
analysis modules. R package version 3.3-10.
https://CRAN.R-project.org/package=TAM
Rogaten, J., Rienties, B., Sharpe, R., Cross, S.,
Whitelock, D., Lygo-Baker, S., & Littlejohn, A.
(2019). Reviewing affective, behavioural and
cognitive learning gains in higher education.
Assessment & Evaluation in Higher Education,
44(3), 321-337.
RStudio Team (2018). RStudio: Integrated
development for R. RStudio, Inc.
http://www.rstudio.com/.
Shevlin, M., Banyard, P., Davies, M., & Griffiths, M.
(2000). The validity of student evaluation of
teaching in higher education: Love me, love my
lectures? Assessment & Evaluation in Higher
Education, 25(4), 397-405.
Smith, A. B., Rush, R., Fallowfield, L. J., Velikova,
G., & Sharpe, M. (2008). Rasch fit statistics and
sample size considerations for polytomous data.
BMC Medical Research Methodology, 8(1), 33.
Spooren, P., Brockx, B., & Mortelmans, D. (2013). On
the validity of student evaluation of teaching: The
state of the art. Review of Educational Research,
83(4), 598-642.
Stowell, J. R., Addison, W. E., & Smith, J. L. (2012).
Comparison of online and classroom-based
student evaluations of instruction. Assessment &
Evaluation in Higher Education, 37(4), 465-473.
Strauss, M. E., & Smith, G. T. (2009). Construct
validity: Advances in theory and methodology.
Annual Review of Clinical Psychology, 5, 1-25.
Torres Irribarra, D., & Freund, R. (2014). Wright Map:
IRT item-person map with ConQuest integration.
http://github.com/david-ti/wrightmap
Trowler, V. (2010). Student engagement literature
review. The Higher Education Academy, 11(1), 1-15.
Uttl, B., White, C. A., & Gonzalez, D. W. (2017).
Meta-analysis of faculty's teaching effectiveness:
Student evaluation of teaching ratings and student
learning are not related. Studies in Educational
Evaluation, 54, 22-42.
Van Zile-Tamsen, C. (2017). Using Rasch analysis to
inform rating scale development. Research in
Higher Education, 58(8), 922-933.
Wang, W. C., Chen, P. H., & Cheng, Y. Y. (2004).
Improving measurement precision of test
batteries using multidimensional item response
models. Psychological Methods, 9(1), 116-136.
Worthington, A. C. (2002). The impact of student
perceptions and characteristics on teaching
evaluations: A case study in finance education.
Assessment & Evaluation in Higher Education,
27(1), 49-64.
Yao, Y., & Grady, M. L. (2005). How do faculty make
formative use of student evaluation feedback? A
multiple case study. Journal of Personnel
Evaluation in Education, 18(2), 107.