Assisting assessment of online collaborative student work

Our project starts from an observation: the work our students produce together in a wiki appears to be of higher quality than their normal essays. However, we would like to quantify and qualify this difference rather than rely on impressions. Our simple initial plan was to compare the two sources by hand, using defined indicators such as evidence of critical thinking and use of primary sources. These indicators can be tested by checking for the presence of certain words or word patterns.

Who am I?: 
Dr Jessie Paterson & Dr Christian Lange, School of Divinity

Project Outcomes: 
We have produced a report describing possible techniques that could be used to create a practical, automated tool for giving formative feedback on student written work. As part of this research, we produced a formalised list of quality criteria. This will be of direct practical use to students and staff in clarifying the criteria used for assessment.
Do the outcomes match the objectives?: 
The original aim had been to use informatics techniques to investigate the apparent quality differences between student work in the form of wikis and essays. However, this was intended only as a practical starting point to explore the broader issues of applying Informatics technology to teaching and learning in the humanities. As the project progressed it became clear that the wikis and essays were totally different, each being driven by different quality criteria. This made a meaningful comparison impractical. However, it also became clear that the techniques being studied would be capable of being used in a practical tool which could provide valuable formative feedback on student work. This became the focus of the remainder of the project, and has produced some very promising results.
Benefits for the future: 
The exploratory nature of the project has allowed us to investigate possible applications of informatics techniques to teaching and learning in the humanities. Both partners now have a much better understanding of the requirements and possibilities. In concrete terms, we have identified a specific application area which is currently very important (formative feedback), and convinced ourselves that these techniques could make a valuable contribution to creating a practical tool.
Opportunities for further research: 
We would like to take the research forward by attempting to create a prototype of a practical tool, based on the techniques studied. We currently think that a one-year project would be an appropriate length to explore this initially, perhaps as a Masters by Research. However, potential sources of funding for this are not currently clear.
Has the project created any new shared resources?: 
Only the full report on the findings.
Future research plans: 
We plan to explore possible sources of funding to take this work forward.
Publications and presentations: 
Poster at e-assessment, Dundee, September 2010. Abstracts submitted for a virtual paper at ICERI 2010 (International Conference of Education, Research and Innovation), Madrid, Spain, November 2010, and for a paper at Online Educa Berlin, December 2010 (we are still waiting to hear about acceptance).
Collaborations: 
Links between the School of Divinity and Informatics. Links with Francisco Iacobelli, Northwestern University, and Alastair Gill (now at the University of Surrey) on the computational linguistic techniques. These links will be important in taking the project forward.
Use of funding: 
The funding was used as planned, except that the monies allocated for dissemination could not be spent because the appropriate conferences fell beyond the end-of-year financial cutoff. Some of this money was used instead to add additional expertise to the project by commissioning a background technology report from Francisco Iacobelli. This proved extremely valuable.
How is it novel? What is exciting about it?: 
Automated essay marking has been on the agenda for years, but all the methods require training the systems and really need a large number of submissions of the same essay, something our student numbers don't provide! What this project proposes is a cruder but still valid approach: assessing quality by using linguistic techniques to search for known indicators of quality in the humanities. This project compares different pieces of work by the same student, but the approach could be applied more widely.
What will I do next? What opportunities will it open up?: 
If successful, we could try to get further external funding (so far we have failed!).
What constitutes success? How risky is it?: 
Having an automated technique for assessing the quality attributes of the work.
What resources do I bring to the project?: 
Subject and educational knowledge
What resources and expertise do I need?: 
We probably need someone with linguistics expertise to help build the system.
What shared resources, if any, will the project create?: 
A methodology and approach others could use or adapt?
What is the timescale?: 
We don't have one, but we currently have students working on a wiki project who will also be producing essays (we also have the same data from last year!).

Question 10

It probably means, what kind of resources do you need -- how much money? And perhaps help in other ways, like contacting the right kinds of people to advise on techniques etc.

Progress Month 1

During the first month of the project I collected and cleaned up the essay and wiki data for development and testing purposes. Because the essays and wikis contain many Arabic words, Iqbal is creating an Arabic-English lexicon, which can be used in our NLP system at a later stage.
I also started looking at related work on automated essay marking and identified a number of essay marking systems that are rule-based, statistical, or both. Next, I will take a closer look at NLP systems such as ETS I and BETSY to see what has been done previously and whether we can benefit from the existing systems.
I'm also looking at annotation tools we can use for creating training data for our system.

iDEA Lab Lunch 23.2.10

Here are our project poster and the slide for our talk:
http://tinyurl.com/yh7v9t5
http://tinyurl.com/yekh4cv

Rule-based NLP system

To develop our rule-based NLP system, we will start by running the essay and wiki data through the LT-TTT2 pipeline (http://www.ltg.ed.ac.uk/software/lt-ttt2), an XML-based software package for shallow linguistic processing of text, developed at the University of Edinburgh.

The TTT2 tool converts text into XML, tokenizes it, and performs sentence splitting, chunking and rule-based named entity recognition. Third-party tools perform part-of-speech tagging and lemmatizing.
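
As a rough illustration of the first stages, the sketch below does tokenization and sentence splitting in plain Python and writes the result as XML. It is not LT-TTT2 and does not use its rules: the regular expressions, the function names and the element names `text`, `s` and `w` are all assumptions made for this example.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: a sentence ends at ., ! or ? followed by whitespace and a capital letter.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")
# Assumption: a token is a word (optionally with internal ' or -) or one punctuation mark.
TOKEN = re.compile(r"\w+(?:['\-]\w+)*|[^\w\s]")


def split_sentences(text):
    """Split text into sentences using the naive boundary rule above."""
    return [s.strip() for s in SENTENCE_END.split(text) if s.strip()]


def tokenize(sentence):
    """Split one sentence into word and punctuation tokens."""
    return TOKEN.findall(sentence)


def to_xml(text):
    """Build <text><s id="s1"><w>...</w></s>...</text> (hypothetical element names)."""
    root = ET.Element("text")
    for i, sentence in enumerate(split_sentences(text), start=1):
        s = ET.SubElement(root, "s", id=f"s{i}")
        for token in tokenize(sentence):
            ET.SubElement(s, "w").text = token
    return root


sample = "The Qur'an describes paradise. Its fruit is ripe!"
doc = to_xml(sample)
print(ET.tostring(doc, encoding="unicode"))
```

A rule this simple will split wrongly after abbreviations such as "Dr. Smith", which is one reason to rely on an established pipeline rather than hand-written patterns.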

The results of part-of-speech tagging let us recognize the types of words in our data, e.g. nouns or adjectives, which play an important role in assessing quality against our list of criteria. For example, a sentence in an encyclopedic-style wiki entry might feature a succession of adjectives, e.g. “luscious, ripe fruit found in paradise”.
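
To make the adjective example concrete, here is a minimal sketch that finds successions of adjectives in text that has already been tagged. It assumes the tagger output is available as (word, tag) pairs with Penn Treebank-style tags (`JJ`, `JJR`, `JJS` for adjectives, `,` for commas); the function name and the hand-tagged sample are illustrative, not output from our pipeline.

```python
from itertools import groupby


def adjective_runs(tagged, min_len=2):
    """Return runs of at least min_len consecutive adjectives.

    tagged: list of (word, tag) pairs, Penn Treebank-style tags assumed.
    Commas are skipped so that "luscious, ripe" counts as one run.
    """
    without_commas = [(word, tag) for word, tag in tagged if tag != ","]
    runs = []
    for is_adjective, group in groupby(
        without_commas, key=lambda pair: pair[1].startswith("JJ")
    ):
        words = [word for word, _ in group]
        if is_adjective and len(words) >= min_len:
            runs.append(words)
    return runs


# Hand-tagged version of the example phrase (tags are an assumption).
tagged = [
    ("luscious", "JJ"), (",", ","), ("ripe", "JJ"), ("fruit", "NN"),
    ("found", "VBN"), ("in", "IN"), ("paradise", "NN"),
]
print(adjective_runs(tagged))  # [['luscious', 'ripe']]
```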

Shallow parsing or chunking gives us an insight into what the constituents (noun groups, verbs, verb groups, etc.) of a sentence are. The structure of the sentences is also one of our criteria for assessing quality. Chunking doesn’t specify the internal structure of the constituents or their role in the main sentence, though. This means that full parsing might be necessary at a later stage if the information given by chunking isn’t sufficient to determine quality.
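
The limitation is easy to see in a toy chunker. The sketch below groups determiners, adjectives and nouns into flat noun groups. It again assumes Penn Treebank-style (word, tag) pairs, and its single grouping rule is a simplification for illustration, not the chunking grammar used by LT-TTT2.

```python
def noun_groups(tagged):
    """Return flat noun groups: determiner/adjective/noun sequences ending in a noun."""
    groups, current = [], []

    def flush():
        # Drop trailing determiners/adjectives; keep the group only if a noun remains.
        while current and not current[-1][1].startswith("NN"):
            current.pop()
        if current:
            groups.append(" ".join(word for word, _ in current))
        current.clear()

    for word, tag in tagged:
        in_group = tag == "DT" or tag.startswith(("JJ", "NN"))
        # A determiner or adjective that follows a noun starts a new group.
        if current and current[-1][1].startswith("NN") and not tag.startswith("NN"):
            flush()
        if in_group:
            current.append((word, tag))
        else:
            flush()
    flush()
    return groups


# Hand-tagged sample sentence (tags are an assumption).
tagged = [
    ("The", "DT"), ("ripe", "JJ"), ("fruit", "NN"), ("grows", "VBZ"),
    ("in", "IN"), ("the", "DT"), ("garden", "NN"),
]
print(noun_groups(tagged))  # ['The ripe fruit', 'the garden']
```

The output is a flat list: nothing records that "the garden" belongs to a prepositional phrase modifying the verb, which is the kind of information only full parsing would give us.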

By default, the NER module is trained on English data only. To perform named-entity recognition correctly, we will incorporate the lexicon of Arabic expressions we identified at the start of the project into the NER module, so that the Arabic words are picked up and identified correctly. NER is important for identifying the use of names, locations, references and so on.
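
One simple way a lexicon can be applied is a greedy longest-match lookup over the token stream. The sketch below shows the idea only: the three transliterated entries and their glosses are illustrative and are not taken from our actual lexicon, and the function name is hypothetical. How the lexicon is wired into the LT-TTT2 NER module is not shown here.

```python
# Illustrative entries only; keys are lower-cased token tuples, values are English glosses.
LEXICON = {
    ("hadith",): "tradition",
    ("tafsir",): "exegesis",
    ("ahl", "al-kitab"): "people of the book",
}
MAX_ENTRY_LEN = max(len(key) for key in LEXICON)


def tag_terms(tokens):
    """Return (start, end, gloss) spans for lexicon entries, preferring longer matches."""
    lowered = [token.lower() for token in tokens]
    spans = []
    i = 0
    while i < len(lowered):
        # Try the longest candidate first so multi-word terms win over their parts.
        for n in range(min(MAX_ENTRY_LEN, len(lowered) - i), 0, -1):
            key = tuple(lowered[i:i + n])
            if key in LEXICON:
                spans.append((i, i + n, LEXICON[key]))
                i += n
                break
        else:
            i += 1  # no entry starts at this token
    return spans


tokens = "The tafsir discusses the ahl al-kitab at length".split()
print(tag_terms(tokens))  # [(1, 2, 'exegesis'), (4, 6, 'people of the book')]
```

Exact matching like this will miss spelling variants in transliteration (for example different uses of hyphens or diacritics), so a real lexicon would need to list or normalize them.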

The following post shows the full list of essay quality criteria we have developed.

Marking Criteria for Project (submitted for Iqbal Akhtar)

1.) References
a. Emphasis on primary sources is valuable
b. Web references < peer reviewed academic journal references
c. Introductory texts, class notes, and class readings < book references
d. Wikipedia and basic sites like Encyclopedia Britannica are less valuable
e. A high number of unique book references in footnotes and bibliography is valuable
f. Quality of referencing using social science conventions is valuable
g. Many quotations and extensive quotations are less valuable

2.) Terminology
a. Use of higher-order and unique words in text such as: framework, criticism, illuminates, Orientalism, etc. (word list to be created)
b. Variety in referencing is valuable
c. Number and level of foreign (Arabic) terminology (word list to be created)
d. Number of unique proper nouns is valuable
e. Word repetition is less valuable

3.) Grammar
a. Run-on sentences are less valuable
b. Dependent clauses and passive voice, which explain using commas and semicolons, are valuable

Three papers were used as examples: one each with excellent, average, and poor marks, from the 2009-2010 class.
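
To suggest how a few of these criteria might be turned into counts, here is a minimal sketch covering criteria 1b (web references), 1g (quotations), 2a (higher-order words), 2e (word repetition) and 3a (run-on sentences). Everything in it is an assumption for illustration: the word list is just the four examples from 2a, sentence length stands in for "run-on", the type-token ratio stands in for repetition, and the 40-word threshold is arbitrary. It is not a validated marking scheme.

```python
import re

# Illustrative subset from criterion 2a; the full word list is still to be created.
HIGHER_ORDER = {"framework", "criticism", "illuminates", "orientalism"}


def words(text):
    """Lower-cased word tokens, keeping internal apostrophes and hyphens."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())


def features(text, long_sentence=40):
    """Return crude per-text counts for a handful of the criteria above."""
    tokens = words(text)
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Only straight double quotes are recognized in this sketch.
    quoted = re.findall(r'"([^"]*)"', text)
    quoted_words = sum(len(words(q)) for q in quoted)
    return {
        # 2a: hits against the higher-order word list
        "higher_order_hits": sum(t in HIGHER_ORDER for t in tokens),
        # 2e: low ratio of distinct words to all words suggests repetition
        "type_token_ratio": len(set(tokens)) / len(tokens) if tokens else 0.0,
        # 1g: share of the text's words that sit inside quotation marks
        "quoted_share": quoted_words / len(tokens) if tokens else 0.0,
        # 3a: very long sentences as a rough proxy for run-ons
        "long_sentences": sum(len(words(s)) > long_sentence for s in sentences),
        # 1b: number of web addresses cited
        "web_references": len(re.findall(r"https?://\S+", text)),
    }


sample = (
    'Said offers a framework for criticism. '
    'He writes "the Orient was created" in Orientalism.'
)
print(features(sample))
```

Running the same function over the excellent, average and poor example papers would be a first check on whether any of these counts actually separate them.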

Status Update

This project started off as an exploration of possible applications for Informatics techniques to e-learning in the School of Divinity. As a concrete example, we wanted to look at whether we could derive anything interesting from trying to automatically compare student work in the form of traditional essays, with collaborative wikis. But this was always intended as a starting point, and we were hoping to find some interesting potential applications that might be suitable for future collaboration.

On the pedagogical side, we started with the observation that the quality of the student wiki work appeared to be better than that of the individual essays. But it has been very interesting to try and derive explicit criteria for the "quality" of the work. Interesting questions have been raised about the different styles and approaches when using wikis, as opposed to traditional essays - and it isn't clear how easily these can be compared.

We have identified a number of criteria, and we are currently refining these, attempting to clarify their importance, and to see how they might be identified automatically.

On the technical side, we have been dealing with the pre-processing (attempting to handle embedded Arabic phrases!) and researching potential tools and approaches. We are still hoping to demonstrate the automation of some criteria, but it is clear that a significant practical tool is beyond our resources at this stage.

However, as we had hoped, the collaboration has highlighted a few promising directions for the future. In particular, we have become very interested in the possibility of building a tool which would provide automatic formative feedback on student writing. This would be extremely valuable in practical terms. It appears that there has been enough work in this area to demonstrate the feasibility (e.g. [1]), at least in a more scientific subject, and we would be very interested in exploring the issues of creating a practical tool in the humanities context.

So, we now have a much clearer focus of where the collaboration might go next. For the remainder of the project, we want to explore the specific technical and pedagogical issues of this particular application. And we now have a final aim of creating a report on this work, together with a clear proposal for further funding. With this in mind, we are hoping to broaden the collaboration a little to include people with appropriate experience, so that we understand the context and related work properly.

It is still rather too early to be able to disseminate results, but we are planning to submit a short presentation to e-Assessment Scotland [2], and we have submitted a proposal to the Online Educa conference [3].

[1] http://www.informaworld.com/smpp/content~db=all~content=a725291193~tab=c...
[2] http://www.rsc-ne-scotland.org.uk/eas/
[3] http://www.online-educa.com/
