Equating Principles in Assessment: A Literature Review in the Context of Education and Assessment

Authors

  • Muh. Fitrah, Graduate School, Research and Educational Evaluation, State University of Yogyakarta
  • Ilyas Ilyas, Graduate School, Research and Educational Evaluation, State University of Yogyakarta
  • Nur Rahmi Akbarini, Graduate School, Research and Educational Evaluation, State University of Yogyakarta
  • Oscar Oscar, Graduate School, Research and Educational Evaluation, State University of Yogyakarta
  • Edi Istiyono, Graduate School, Research and Educational Evaluation, State University of Yogyakarta
  • Widihastuti Widihastuti, Graduate School, Research and Educational Evaluation, State University of Yogyakarta

DOI:

https://doi.org/10.31764/ijeca.v6i3.19381

Keywords:

Equating principles, Educational measurement, Assessment analysis.

Abstract

This literature review examines the fundamental concepts and methods of equating in educational assessment. Equating, a complex statistical technique, plays an increasingly important role in ensuring fairness and accuracy in test-based assessment. The review began by identifying key themes concerning equating principles in education and assessment, then searched scholarly databases by keyword for relevant sources, selecting those that were up to date and methodologically robust. The result is a deeper understanding of equating in educational assessment. The study highlights the relevance of equating in education, where test results often inform critical decisions. It also discusses the practical implications of implementing equating in educational policy and assessment, as well as its impact on students, teachers, and educational institutions. In conclusion, a critical understanding of equating is essential to ensure fair, consistent, and meaningful assessment in an ever-evolving educational landscape.

Published

2023-12-01

Section

Articles