[Tlhingan-hol] Certification Test Woes

Mon Mar 31 08:44:05 PDT 2014

On Mon, Mar 31, 2014 at 1:43 AM, Lieven <levinius at gmx.de> wrote:

>> classification number, not knowing the number would only lose people a
>> point or two on the question, and that would be a truly minor revision
>> to the test.
>>
>
> That's what makes it hard for grading. Does the half answer deserve a half
> point? Then why does an entirely translated sentence only deserve one
> point? But that's another topic :-)

This does expose what might be a weakness in the test design.  For Level 1
and 2 there are 20 questions, each question worth 5 points.  The answers to
some questions can be rather long (multiple words, multiple affixes)
compared to other questions (a single word or affix).  If the point of the
question is about affixes and you got the right affixes in the right
positions, but missed the root word, then you get partial credit.  But if
the answer is only a single word ("What is the suffix for augmentation?" or
whatever; not sure if that's a real question), then you either get it or
you don't.  In effect, mistakes are not all worth the same amount: the cost
of a mistake depends on the length of the response and the objective of the
question, and the student doesn't know what the test grader might actually
be looking for, beyond the specific wording of the question.

This is addressed in two ways: (a) by encouraging the test taker to provide
as much information in their response as possible.  Another way of saying
this is that if you don't know the answer, there's no penalty for guessing.
You'll lose all of the question's points for a blank, but you might get some
partial credit if you guess (or you might get lucky and get full credit if
you get it right).  And (b) by having a large test bank from which the
questions for any given test are drawn at random.  Over a range of tests
the proportion of low-value vs. high-value questions should be consistent,
and therefore fair.
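
To illustrate the sampling argument in (b), here is a minimal Python sketch.
It is only an illustration under made-up assumptions: the bank size (200),
the share of short single-word questions (30%), and the names draw_test,
short_fraction and simulate are all hypothetical and are not taken from the
actual KLCP test bank.  Only the 20 questions per test comes from the
description above.

```python
import random


def draw_test(bank, n_questions=20, rng=random):
    """Pick n_questions at random, without replacement, from the bank."""
    return rng.sample(bank, n_questions)


def short_fraction(test):
    """Fraction of questions on one test that are short (all-or-nothing)."""
    return sum(1 for q in test if q == "short") / len(test)


def simulate(n_tests, short_share=0.3, bank_size=200, seed=0):
    """Draw n_tests tests and summarize their short-question fractions.

    short_share and bank_size are assumed values for illustration only.
    Returns (average, smallest, largest) fraction across the drawn tests.
    """
    rng = random.Random(seed)
    n_short = round(bank_size * short_share)
    bank = ["short"] * n_short + ["long"] * (bank_size - n_short)
    fractions = [
        short_fraction(draw_test(bank, rng=rng)) for _ in range(n_tests)
    ]
    return sum(fractions) / n_tests, min(fractions), max(fractions)


if __name__ == "__main__":
    average, smallest, largest = simulate(2000)
    print(f"average={average:.3f} smallest={smallest:.2f} largest={largest:.2f}")
```

Under these assumptions the average over many tests stays close to the
bank's own proportion, while any single test can have noticeably more or
fewer short questions than that average.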

As I recall, we did discuss this at the time we were developing the KLCP.
The alternative is to balance the questions/answers so they all have the
same (or nearly the same) amount of content, and thus the same value.
Sounds nice on paper, but in practice this proved to be unworkable (how do
you measure the "content" of a question? how do you limit all questions to
a similar range of content? etc.), so we went with the mitigations
described above.  I can't decide if those mitigations are valid or a
rationalization, but that's why we get to revisit these issues with the
benefit of hindsight.

(As an aside: I'm assuming it's clear, but let me state it explicitly: I am
not opposed to discussing the weaknesses of the tests or finding ways to
improve them; I do not intend any of my statements to come off as
defensive, and I apologize if they do; and I think this type of review is
healthy for us as a community to better support and encourage new speakers
of the language, which is in line with the objectives of the KLCP overall.
I am grateful to Qov for starting this discussion, which I find very
interesting.  If we decide it would be in our interests to make
improvements, I'm happy to see the community pulling together in a positive
way.  In a way, the fact that we're having this discussion is evidence of
our success; when the tests were being developed the number of speakers who
could participate in discussions about the test was very small.  So yay!)

--Holtej