The recent development of MOOCs has provided instructors with exciting opportunities to teach a massive and diverse student population through platforms such as Coursera, EdX, and Udacity. However, large-scale participation also presents many pedagogical problems. One major problem is the assessment of MOOC assignments (e.g., design projects, artworks, and essays), as enrollment in a MOOC can reach hundreds of thousands of students and thus exceed the grading capacity of a single instructor.
In an attempt to solve this assessment problem, Coursera has introduced a peer grading system that guides MOOC students in using grading rubrics to evaluate and provide feedback on each other's work. However, the reliability and validity of such a peer grading system have yet to be verified, and little is known about how the peer grading activity affects students' MOOC learning experience. To address this research need, this study systematically investigates the peer grading results from a Coursera-based MOOC. The findings are expected to provide empirical evidence on the reliability, validity, and perceived effects of MOOC-scale peer grading, and to inform decisions on how to design, implement, and improve peer grading systems for future MOOCs.
REVIEW OF LITERATURE
Peer grading is hardly a novel concept and has been practiced in a wide range of subject domains. In addition to reducing instructors' workload, peer grading is also believed to bring many potential benefits to learning, including increased motivation (Bostock, 2000), enhanced social presence (Strijbos & Sluijsmans, 2010), and development of higher-order thinking and metacognition skills (Bostock, 2000; Mok, 2011; Strijbos & Sluijsmans, 2010).
The reliability and validity of peer grading have been supported by the literature. In general, the findings show good consistency among peer-assigned grades and a strong correlation between peer grading and instructor grading, indicating that peer grading is a reliable and valid form of learning assessment (Bouzidi & Jaillet, 2009; Cho et al., 2006; Falchikov & Goldfinch, 2000; Sadler & Good, 2006). However, it is important to note that these findings come from college degree courses with small or moderate enrollments. Their applicability in the MOOC context therefore remains largely unknown and is in need of further research.
RESEARCH CONTEXT AND QUESTIONS
This study investigates Coursera's peer grading system using data collected from "Maps and the Geospatial Revolution" (MGR), a MOOC offered by Penn State University in 2013. The final assignment of MGR is a map-design project, and each submitted assignment was graded by five randomly selected MOOC students. Specifically, we seek to answer the following two research questions:
RQ1. Is peer grading a reliable and valid assessment of student work in a MOOC?
RQ2. What are the perceived effects of peer grading on students' MOOC learning experience?
The primary data source is the relational database used internally by Coursera, which contains all instructor- and student-generated content. Peer grading data such as assignment ID, individual peer grades, final grade, and self-grade are stored in the "hg_assessment_metadata" portion of the database. The MOOC instructor will also review and grade a randomly selected 10% of the assignments. A second data source is the end-of-course survey, which includes six Likert-scale questions that collect students' opinions about the peer grading activity in their MOOC.
To answer RQ1, we calculated reliability as the general agreement among peer graders. Because the graders of each assignment are randomly selected by the Coursera system, so that each assignment is rated by a different set of graders, the Case 1 intraclass correlation (ICC) was chosen as the statistical method (Shrout & Fleiss, 1979). The validity of peer grading was measured by the similarity between peer-assigned and instructor-assigned grades using the Pearson correlation coefficient (r). We also used Pearson's r to examine the similarity between peer-assigned and self-assigned grades. RQ2 is answered with the end-of-course survey data, which provide a tallied summary of students' perceptions of the usefulness, fairness, and effects of peer grading on their MOOC learning experience.
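The Case 1 ICC described above can be sketched in a few lines of Python. This is a minimal illustration of the one-way random-effects formulas in Shrout and Fleiss (1979), not the analysis script used in this study; the function name `icc_one_way` and the sample grades are hypothetical, and the sketch assumes every submission has the same number of peer grades.

```python
from statistics import mean


def icc_one_way(ratings):
    """One-way random-effects ICC (Shrout & Fleiss, 1979, Case 1).

    ratings: list of rows, one row per graded submission, each row holding
    the k grades that submission received. Grader identity is ignored,
    because each submission may be graded by a different set of peers.
    Returns (single-measure ICC, average-measure ICC).
    """
    rows = [[float(score) for score in row] for row in ratings]
    n = len(rows)      # number of submissions
    k = len(rows[0])   # grades per submission (assumed equal for all rows)

    row_means = [mean(row) for row in rows]
    grand_mean = mean(row_means)  # equals the overall mean when rows have equal length

    # Between-submissions mean square
    bms = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    # Within-submissions mean square
    wms = sum(
        (score - m) ** 2 for row, m in zip(rows, row_means) for score in row
    ) / (n * (k - 1))

    icc_single = (bms - wms) / (bms + (k - 1) * wms)
    icc_average = (bms - wms) / bms
    return icc_single, icc_average


# Hypothetical peer grades (maximum 12) for four submissions, five graders each
sample_grades = [
    [9, 10, 8, 9, 11],
    [6, 7, 9, 5, 8],
    [12, 11, 10, 12, 9],
    [8, 8, 7, 10, 9],
]
single, average = icc_one_way(sample_grades)
print(f"single-measure ICC = {single:.2f}, average-measure ICC = {average:.2f}")
```

The single-measure value estimates the reliability of one peer's grade, while the average-measure value estimates the reliability of the mean of the k grades a submission receives.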
The ICC statistics for peer-assigned grades are presented in Table 1 and suggest that Coursera's peer grading system is a fairly reliable assessment of MOOC assignments. The low single-measure ICC (0.26) and the higher average-measure ICC (0.64) indicate that, while a single peer might not provide reliable grades, averaging the five peer-assigned grades (as Coursera does) can produce fairly reliable grading results.
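As a consistency check on the two reported values, the single- and average-measure ICCs in the one-way model are linked by the Spearman-Brown formula. The notation ICC(1,1) for the single-measure value and ICC(1,k) for the average of k graders follows Shrout and Fleiss (1979) and is introduced here for illustration only; with k = 5 graders per assignment:

```latex
\[
\mathrm{ICC}(1,k)
  = \frac{k \,\mathrm{ICC}(1,1)}{1 + (k-1)\,\mathrm{ICC}(1,1)}
  = \frac{5 \times 0.26}{1 + 4 \times 0.26}
  = \frac{1.30}{2.04}
  \approx 0.64
\]
```

The same formula suggests how the reliability of the averaged grade would be expected to change if each assignment were graded by more or fewer peers, assuming the single-grader reliability stays the same.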
The MOOC instructor is still in the process of assigning grades to the final MGR assignment. As a result, we cannot yet compare the peer-assigned grades with the instructor-assigned grades, but we will be able to share the validity results during our presentation. To examine the relationship between peer grading and self-grading, we also calculated the correlation between the two and found a small but significant correlation (r = 0.26, p < .01, N = 1820). Interestingly, the data also suggest that peer grades tend to be lower than self-grades (x̄ = 9.0 and 9.7, respectively, with a maximum grade of 12).
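The peer-versus-self comparison can be illustrated with a short Python sketch. It assumes one final peer-assigned grade and one self-assigned grade per student; the function name `pearson_r` and the grades below are hypothetical and are not drawn from the MGR data. The sketch computes only the coefficient and the mean difference, not the significance test.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical final peer grades and self-grades (maximum 12), one pair per student
peer_final = [9.0, 7.5, 10.0, 8.0, 11.0, 6.5]
self_grade = [10.0, 9.0, 10.0, 9.5, 12.0, 9.0]

r = pearson_r(peer_final, self_grade)
# Positive gap means students graded themselves higher than their peers did
mean_gap = mean(s - p for p, s in zip(peer_final, self_grade))
print(f"r = {r:.2f}, mean self-minus-peer gap = {mean_gap:.2f}")
```

The same function could be applied to peer-assigned and instructor-assigned grades once the instructor grades are available.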
The survey results show that MOOC students in general liked the peer-grading activity, believing they have received fair grades and useful feedback. Consistent with what the literature suggests, the students also believe peer grading benefited their MOOC learning in terms of motivation, social presence, and higher-order learning. The descriptive data of students' responses (N=2592) for each survey question are summarized in Table 2.
The initial statistical results show that Coursera's peer grading system can provide a reliable assessment of complicated MOOC assignments. However, the inter-rater reliability among peer graders is lower than we expected, and more analysis is needed to identify the causes. In addition, peer-assigned grades are significantly correlated with self-assigned grades but tend to be lower. The peer grading activity was also well received by the MOOC students, who believe it is useful and fair and that it has positively influenced their learning experience in various ways.