VALIDATION OF A RUBRIC TO ASSESS INNOVATION COMPETENCE

This paper addresses the development and validation of rubrics, materials and situations for the assessment of innovation competence. Research was carried out to verify the viability of the first draft of the assessment criteria, which led to refinement of the criteria and proposals to enhance the ensuing validation process that will include students and raters of different language backgrounds.


Introduction
The European model of curriculum design is based on competences, although little research can be found on the key competences that a person should possess to be considered innovative. Therefore, it would be desirable to have models that enable the description of how innovation competences are developed and attained (Marín-García et al., 2011).
The Community Innovation Surveys used in the European Union (EU) derive from the guidelines to coordinate innovation research in the EU that were defined in the Oslo Manual, first published in 1993 by the Organization for Economic Cooperation and Development (OECD). The surveys highlight the importance that the concept of innovation has in the professional contexts of the majority of university degree programs. Innovation should generate competitiveness in business through the development and management of knowledge.
This paper stems from the conception of innovation as a construct (Figure 1) that is based on individual, interpersonal and networking aspects, following the model proposed by Penttilä et al. (2011, 2012).
In order to acquire innovation competence it seems necessary to experiment with new teaching alternatives associated with active methodologies. However, it is also necessary to develop new assessment methods that are valid and that produce reliable results; this calls for the definition of the learning objectives and outcomes that are to be achieved. The first step is to define the skills and capacities that make up innovation competence and, subsequently, to create an instrument that will measure those skills and capacities.
With regard to the certification of the components of innovation competence, no formal system has been established to acquire and assess the skills and capacities involved. For example, within the higher education framework, there is no systematic evidence of how students acquire and develop generic competences. In the majority of cases, the acquisition of skills, knowledge and attitudes is inferred but not assessed appropriately, even when included as part of summative assessment.
The lack of a formal system of identification and validation of innovation competences probably applies to other generic competences that are included in current new degree programs. For this reason, the main objective of this paper is to present a first draft of an innovation competence barometer, i.e., a rubric that comprises the cataloging and assessment of the skills and capacities that constitute innovation competence.

Methodology
The proposal and validation of the assessment rubric for innovation competence presented in this paper follows the steps set forth in the Instrument Development and Construct Validation (IDCV) methodology of Onwuegbuzie et al. (2010). The first five of the ten steps have been completed and are addressed here.
1. Conceptualize the construct of interest
2. Identify and describe behaviors that underlie the construct
3. Develop initial instrument
4. Pilot-test initial instrument
5. Design and field-test revised instrument
To define the construct, the items in the rubric were initially extracted from in-depth interviews with three human resource managers from different firms well known for their innovation. In the second and third steps, the items were expanded after meeting with a focus group of 12 academics. The criteria were then revised taking into account the 9 generic competences defined by the Accreditation Board for Engineering and Technology (ABET), which have been adopted in the OECD initiative for the Assessment of Higher Education Learning Outcomes (AHELO), competences that have been described by many authors (McGourty et al., 2002; Parsons et al., 2005; Passow, 2012; Shuman et al., 2005; Villa et al., 2007; Marin et al., 2011; Garzón, 2010; Penttilä et al., 2011, 2012; or Montero, in press).
In the fourth step, the first list with 39 items was sent for review to 20 academic practitioners with experience in assessment. In a subsequent plenary session with 19 raters, the analysis, annotation and filtering of the items took place. After discussion of the quality and operability of the criteria, the number of items was reduced from 39 to 25, grouped in the three dimensions shown in Figure 1. There were thus 12 items in the individual dimension, 8 items in the interpersonal dimension and 5 items in the networking dimension.
All 19 raters, of varying expertise, simultaneously participated in the trial rating of a video recording of three students who had been placed in a situation with a task that would require a display of innovation in finding a solution. Raters had 3 choices: yes, student behaviour is observed; no, student behaviour is not observed; or not applicable (n/a), student behaviour cannot be assessed because there is no evidence. In the ensuing statistical analysis, items that were left blank were counted as missing marks.
This first testing session of the instrument produced 57 ratings, i.e., 3 students were assessed by 19 raters (R01…R19). Descriptive analysis of missing marks and score frequencies led to an initial filtering of items and raters (Doval Dieguez & Viladrich Segués, 2011; Onwuegbuzie et al., 2010; Viladrich Segués & Doval Dieguez, 2011). Multiple correspondence analyses identified possible groupings of variables according to score frequency. The objective of this technique is to reduce the dimensionality of a set of categorical variables, each of which consists of two or more categories (in this case, three categories: yes, no or n/a). The technique analyzes the relationship between the different categories of the variables and yields a bidimensional diagram as a result. The position of each variable category in the diagram is essential, since proximity indicates a relationship or association between variable categories, while distance or separation indicates the lack of one (Greenacre, 2008; Lizasoain & Joaristi, 2012; SPSS-inc., 1990).
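As an illustration of the technique, and not of the software actually used in this study, the following Python sketch performs a correspondence analysis on the indicator matrix built from categorical ratings. The function name `mca_indicator`, the integer coding of the responses (0 = no, 1 = yes, 2 = n/a) and the randomly generated demonstration data are assumptions introduced here.

```python
import numpy as np

def mca_indicator(codes, n_categories=3, n_dims=2):
    """Correspondence analysis of the indicator matrix of categorical ratings.

    codes: integer array (ratings x items); assumed coding 0 = no, 1 = yes, 2 = n/a.
    Returns category coordinates, principal inertias and (item, category) labels.
    """
    codes = np.asarray(codes)
    n_rows, n_items = codes.shape

    # Build one 0/1 column per (item, category) pair that actually occurs.
    columns, labels = [], []
    for item in range(n_items):
        for cat in range(n_categories):
            col = (codes[:, item] == cat).astype(float)
            if col.any():
                columns.append(col)
                labels.append((item, cat))
    Z = np.column_stack(columns)

    # Correspondence matrix with row and column masses.
    P = Z / Z.sum()
    r = P.sum(axis=1)
    c = P.sum(axis=0)

    # Standardized residuals from the independence model, then SVD.
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)
    U, s, Vt = np.linalg.svd(S, full_matrices=False)

    # Principal coordinates of the response categories.
    coords = (Vt.T * s) / np.sqrt(c)[:, None]
    return coords[:, :n_dims], s ** 2, labels

# Hypothetical demonstration data: 42 ratings of 5 items, drawn at random.
rng = np.random.default_rng(0)
demo = rng.integers(0, 3, size=(42, 5))
coords, inertia, labels = mca_indicator(demo)
```

In the resulting diagram, each row of `coords` is one response category of one item; categories that lie close together would be read as associated, and categories far apart as unrelated, as described above.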

Results
In the analysis of missing ratings (Figure 2), it can be observed that raters R09, R18, R05, R14 and R06 systematically left a high number of items blank when assessing the three students. Upon analyzing the results by item (Table 1), the items with the most missing ratings all belonged to the individual dimension:
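A tally of this kind can be sketched in a few lines of Python. The array layout (raters x students x items), the use of `np.nan` for a blank mark, the cut-off value and the demonstration data are all assumptions made for illustration; the paper does not state the threshold that was applied.

```python
import numpy as np

def missing_by_rater(ratings):
    """Count blank marks per rater.

    ratings: float array (raters x students x items); np.nan marks a blank.
    """
    return np.isnan(ratings).sum(axis=(1, 2))

def flag_raters(missing_counts, threshold):
    """Return labels (R01, R02, ...) of raters with more blanks than threshold."""
    return [f"R{idx + 1:02d}" for idx, n in enumerate(missing_counts) if n > threshold]

# Hypothetical data: 19 raters, 3 students, 25 items, all marked at first.
ratings = np.ones((19, 3, 25))
ratings[8, :, :10] = np.nan   # rater R09 leaves 10 items blank for every student
ratings[4, 0, :5] = np.nan    # rater R05 leaves 5 items blank for one student

counts = missing_by_rater(ratings)
flagged = flag_raters(counts, threshold=10)  # threshold is an assumed cut-off
```

With these invented data only R09 exceeds the assumed cut-off; in practice the cut-off would be chosen after inspecting the distribution shown in Figure 2.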
- 05 Critically evaluates the fundaments of ideas/actions
- 11 Uses resources ingeniously
- 12 Foresees how events will develop
- 16 Takes intelligent risks
- 17 Orients task towards target
However, regarding the items that could not be assessed in the situation that was used in the video recording (n/a), the items belonged to the networking dimension, with the exception of item 37 "Speaks foreign languages". The items with the most n/a were:
- 32 Applies ethical values
- 33 Is able to work in cooperation in multidisciplinary/multicultural contexts
- 38 Maintains relationships with all actors engaged in a local, regional or international endeavor
- 39 Knows where to go or whom to involve to overcome difficulties and to solve problems
The data in Table 1 show that the items can be grouped in three categories (Table 2). First, there are items that cannot be observed in any of the students in the situation that was assessed. Those items should not be discarded but rather could be used in sessions for self-assessment, or in testing situations that favor the appearance of the behavior described in the items.
The second category contains items that are observable but do not discriminate among the individual students. This could be because the items assess behavior that is present in any student, in which case it makes no sense to include these items in the criteria. Another possibility is that the three student participants happen to be similar in these aspects. Some of these items will have to be eliminated in further analyses because they are constant.
The third category includes the rest of the items, which proved to be observable in the testing situation that was used and, at the same time, discriminated among the students rated in their score totals.
02, 21, 25, 37
03, 04, 14, 17, 18, 23, 24
Observable and discriminate among the 3 students
05, 06, 10, 15, 16, 26, 27, 29, 11, 12, 33, 39
The multiple correspondence analyses were carried out with the three types of possible response (yes, no, n/a), after eliminating the 5 raters with a high number of missing ratings. As items 21, 32 and 37 were constant and thus not used in the analyses, 22 valid items remained, resulting in a bidimensional representation with projections of 66 response categories (Figure 3 and Table 3). The solution with three dimensions explains 71.4% of the variance in the data, with high internal consistency (Table 4). In general, the farther the point representing an item lies from the origin, the greater its weight in the definition of the factor. Item 38, situated almost at the origin, is an item with little variability (almost constant) in the cases rated.
The networking items have low factor loadings in Dimension 2 and moderate loadings in Dimensions 1 and 3. We suppose that with another set of data with more variability and situations more favorable to these items, the tendency would be more evident. The loadings of the interpersonal items were moderate in the three dimensions, although especially so in Dimensions 1 and 3. The individual items have low loadings in Dimension 3 and fall into three groups: those that have high loadings in Dimension 2 and almost none in Dimension 1 (items 02, 03 and 11), those that show the opposite pattern (items 12, 14 and 06) and those that are evenly distributed across both dimensions (items 10, 05, 04, 15 and 17).
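The three-way grouping of items described earlier in this section (not observable, observable but not discriminating, observable and discriminating) could be operationalized along the following lines. This is only a sketch: the function name `classify_item`, the two thresholds and the example marks are hypothetical, since the paper reports the grouping but not a numerical decision rule.

```python
def classify_item(ratings_by_student, na_share=0.5, min_spread=0.2):
    """Assign one item to one of three categories.

    ratings_by_student: one list of marks ('yes', 'no' or 'na') per student.
    na_share and min_spread are assumed thresholds, not values from the study.
    """
    all_marks = [mark for marks in ratings_by_student for mark in marks]

    # Mostly n/a: the behavior could not be observed in this situation.
    if all_marks.count("na") / len(all_marks) >= na_share:
        return "not observable"

    # Proportion of 'yes' among the marks that were actually observed.
    yes_rates = []
    for marks in ratings_by_student:
        observed = [mark for mark in marks if mark != "na"]
        if observed:
            yes_rates.append(observed.count("yes") / len(observed))

    # Nearly identical rates across students: the item does not discriminate.
    if max(yes_rates) - min(yes_rates) < min_spread:
        return "observable, does not discriminate"
    return "observable and discriminates"

# Hypothetical marks from three raters for each of three students.
example_na = [["na", "na", "yes"], ["na", "na", "na"], ["na", "no", "na"]]
example_constant = [["yes", "yes", "yes"], ["yes", "yes", "yes"], ["yes", "yes", "na"]]
example_discriminating = [["yes", "yes", "yes"], ["no", "no", "yes"], ["no", "no", "no"]]
```

A rule of this kind makes explicit that an item flagged as not discriminating may simply reflect three similar students, which is why such items are monitored rather than removed outright.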

Discussion
This paper has presented a three-category assessment model to serve as a barometer of innovation competence. An assessment rubric was created based on expert judgment and a review of the literature; the items were filtered after review by academics with experience in assessment. The paper has presented the results of the analyses carried out with the data obtained from a trial rating session with 19 raters. Results seem to indicate that there are differences between the three categories: individual, interpersonal and networking. Nevertheless, these results should be interpreted with caution since this was a pilot experience with only one recording. The need for rater training is apparent from the missing ratings and the fallback on the n/a option.
In future ratings with larger samples, items 21, 32, 37 and 38 should be monitored to see if they are constant, in order to decide on their elimination in the definitive version of the barometer. Items 12 (Foresees how events will develop) and 16 (Takes intelligent risks) seem difficult to assess, at least in a short testing situation such as the one used for the trial ratings here, and may warrant elimination. Similarly, several items may be redundant. For example, item 15 (Is tenacious) seems to associate closely with item 14 (Shows enthusiasm); item 18 (Transmits ideas coherently/effectively) is similar to item 23 (Uses dialogue to establish constructive relationships). Item 24 (Collaborates actively) is similar to item 25 (Contributes to group functioning). Likewise, item 37 (Speaks foreign languages) may be redundant because the behavior reflected in item 33 (Is able to work in cooperation in multidisciplinary/multicultural contexts) already presupposes speaking other languages in order to work in multicultural environments.
In conclusion, these results must be complemented with data obtained through experimentation in the classroom. For this reason, future research will focus on further field testing with students and raters of different language backgrounds in order to complete the five remaining steps of the IDCV (steps 6 to 10). Through the application of the teaching and learning methodology known as "Research Hatchery" (see Genco et al., 2012, or Lehto et al., 2011), a larger, longitudinal sample will be obtained, thus allowing for the completion of the validation of the rubric.

Figure 2. Number of missing ratings classified by rater

Figure 3. Bidimensional representation of the multiple correspondence analyses (solution with 3 axes)

6. Validate revised instrument: Quantitative analysis phase
7. Validate revised instrument: Qualitative analysis phase
8. Validate revised instrument: Mixed analysis phase: Qualitative-dominant crossover analyses
9. Validate revised instrument: Mixed analysis phase: Quantitative-dominant crossover analyses
10. Evaluate the instrument development/construct evaluation process and product

Table 2. Classification of items

Table 3. Item discrimination

Table 4. Internal consistency of the axes in the solution with three dimensions