[Tlhingan-hol] Fwd: RE: Klingon Scrabble

Robyn Stewart robyn at flyingstart.ca
Fri Jan 4 21:27:14 PST 2013


I followed Felix's advice, deleted all occurrences 
of the most common character names, then deleted 
all X, Z and F and then redid the search. It's 
still a little skewed towards the letters that 
are the same in xifan as tlhIngan--I didn't strip 
out the pIqaD titles, so for example m is M in 
xifan but y is y, meaning that yay will be 
counted where it occurs in an 'ay' per but may won't be.
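
For anyone who would rather script this than do the Find & Replace by hand, the same substitute-then-count approach can be sketched in Python. This is only an illustration under stated assumptions: the function name count_klingon_letters and the sample string are made up, the placeholders X, Z and F are the ones described above, and the text is assumed to have had any pre-existing X, Z and F removed already.

```python
from collections import Counter

# Placeholder substitutions, applied in this order. gh must come before
# ng, otherwise "ngh" would become "Fh" rather than "nZ".
SUBSTITUTIONS = [("tlh", "X"), ("gh", "Z"), ("ng", "F")]

# Map the single-character stand-ins back to Klingon letters for display.
# A bare c only occurs in Klingon as the first half of ch.
RESTORE = {"X": "tlh", "Z": "gh", "F": "ng", "c": "ch"}


def count_klingon_letters(text):
    """Count letters, treating tlh, gh, ng and ch as single letters."""
    for digraph, placeholder in SUBSTITUTIONS:
        text = text.replace(digraph, placeholder)
    counts = Counter()
    for char in text:
        # Assumption: after the substitutions a lowercase h is only the
        # second half of ch (already counted via its c), so it is skipped.
        # Stray h from non-Klingon words is dropped by this as well.
        if char.isspace() or char == "h":
            continue
        counts[RESTORE.get(char, char)] += 1
    return counts


# Made-up sample text, not taken from the novel.
sample = "tlhIngan Hol vIghoj 'ach jIjatlhlaHbe'"
for letter, freq in count_klingon_letters(sample).most_common():
    print(freq, letter)
```

Because the count is case sensitive, q and Q, and H and the h of ch, stay separate without any extra work.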

qaghwI' is still the second most frequent letter, 
so a good Scrabble set should have as many 
qaghwI'mey as 'atmey, or at least as many as 'etmey.

Freq.   Letter
72543
39022   a
38237   '
21465   e
20702   o
19952   I
19760   H
18060   u
15987   j
13557   D
12305   m
11565   l
9979    v
9892    q
9647    ch - this was the frequency for c, with h 
at 9633, the discrepancy explained by a few non-Klingon words with c or h
9530    b
9359    S
9271    Z --> gh
9145    p
8695    t
8105    n
7648    y
7422    X --> tlh
6611    w
5793    Q
4978    r
3744    F --> ng
-------
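
One possible way to turn the counts above into tile numbers is plain proportional allocation. The following Python sketch is an illustration only: the 100-tile set size, the rule that every letter gets at least one tile, the largest-remainder rounding and the function name tile_counts are all assumptions, not anything agreed on the list. The frequencies are copied from the table above.

```python
# Letter counts copied from the table above (space character excluded).
FREQ = {
    "a": 39022, "'": 38237, "e": 21465, "o": 20702, "I": 19952,
    "H": 19760, "u": 18060, "j": 15987, "D": 13557, "m": 12305,
    "l": 11565, "v": 9979, "q": 9892, "ch": 9647, "b": 9530,
    "S": 9359, "gh": 9271, "p": 9145, "t": 8695, "n": 8105,
    "y": 7648, "tlh": 7422, "w": 6611, "Q": 5793, "r": 4978,
    "ng": 3744,
}


def tile_counts(freq, total_tiles=100):
    """Split total_tiles across letters in proportion to frequency.

    Every letter gets one tile up front (an assumption); the remaining
    tiles are shared out by the largest-remainder method, so the counts
    always add up to total_tiles exactly.
    """
    spare = total_tiles - len(freq)
    if spare < 0:
        raise ValueError("fewer tiles than letters")
    grand_total = sum(freq.values())
    quotas = {k: spare * v / grand_total for k, v in freq.items()}
    counts = {k: 1 + int(q) for k, q in quotas.items()}
    leftover = total_tiles - sum(counts.values())
    by_remainder = sorted(
        quotas, key=lambda k: quotas[k] - int(quotas[k]), reverse=True
    )
    for k in by_remainder[:leftover]:
        counts[k] += 1
    return counts


for letter, n in sorted(tile_counts(FREQ).items(), key=lambda kv: -kv[1]):
    print(n, letter)
```

This only covers how many tiles of each letter to make. Point values are a separate decision and are not attempted here.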
Here are the other symbols and non-Klingon 
letters. (I moved the mevwI' and yevwI' down to 
here). They are uninteresting, but just give you 
an idea of what other junk is in the file. I 
honestly think most of it is from xifan (the way 
I have to write in order to use my pIqaD font). 
There is a more interesting list following this one.

14365   .
5067    ,
1122    ?
691     !
171     -
127     O
126     @
121     A
94      U
80      E
69      &
66      M
63      L
62      T
62      :
59      G
57      #
57      $
57      %
55      ^
54      N
51      (
51      )
47      s
47      *
42      J
41      P
40      C
39      K
39      B
38      g
35      k
30      Y
30      R
29      W
27      i
26      V
18      d
6       "
6       ;
5       1
5       2
4       5
3       9
3       0
3       3
3       /
3       6
2       8
2       7
1       4
1       [
1       ]
1
1       ©
3152    «
4       ­
3129    »

And now I have a theory why j is the third most 
common consonant after ' and H.  There's a lot of 
dialogue in the story, so a lot of jatlh, but 
that didn't do it, because tlh is one of the 
least common letters. It's all the first person 
statements: jIlegh and jIjaH and jIyaj.  Here are the top words used:

Unique words:22058  Total words:71912
Freq.   Word
2152    jatlh
1751    .
1736    'ej
1265    'e'
1118    ,
1089    'ach
513     je
429     HoD.  - three of my major characters are captains.
410     jatlh
394     tlhIngan - interesting. I really wasn't 
aware I used the word that much. I guess the aliens do.
341     HoD
295     'a
277     je.
273     Duj
265     ?
263     ghIq - pretty funny considering the word didn't exist a few years ago.
258     qImyal - a character name I didn't take out
254     DaH
210     chaq
204     yaS
201     wa'
185     latlh
185     Hoch
183     law'
180     vay'
177     neH
169     cha'
158     qaStaHvIS
156     ghaH
150     jatlh.
147     HoD,
147     loQ
143     wej
139     Hol
137     pa'
135     SIbI'
135     lojmIt
133     pagh
133     ghaytan
132     De'
127     neH.
126     Hung
122     vaD - from vajarvaD and 'eSSImvaD, but I 
deleted the character names stranding the suffix.
115     vaj
110     meH
105     qama'
103     QumwI'
101     net - interesting!  Never thought I used that that much.
100     be'
95      'avwI'
94      Sa'
94      nov
93      pay'
93      wo'
93      quv - although the character counting is 
case sensitive, the word-counting isn't, so this is Quv and quv
93      tIr - ha ha, if you've read the story you 
know why this one, love ya QeS.
93      ghop
91      mIch
90      qatlh
89      Qel/qel
87      not
85      'ungya - oops, another character name
85      qImyal.
83      ravDaq - okay that's just funny.  A lot 
of things take place on the floor in this novel.
81      Qa'bar - another character, didn't think he was that important
80      nom
80      loD
80      bIQ - that does make sense
79      beq
78      HoS
78      chu'
78      'oH
78      ghu'
77      -
77      wa'DIch
77      mu'mey
75      tlhoS
75      nuq/nuQ
75      ghaH.
74      reH
74      naDev
72      Qa'bar.
70      Qu'/qu'
70      nuH
69      'oy' - ha! Did I really hurt people that much?
69      Sa'.
68      Dugh - the name of one of the ships.
68      HIq - pretty hard drinking Klingons, 
considering that the aliens who take up half the story hardly touch the stuff
67      vay'
67      loS
67      muD
67      bom
67      qab/Qab - yep I shoehorned Qab in somewhere
67      nISwI' - lots of those
66      retlhDaq - wow, describing locations = more common than I thought
66      DeghwI'
66      DIr
66      ...
64      jang
64      jagh
63      Hagh
63      taj
62      QeDpIn.
62      qImyal,
62      DeS
62      lo'
61      SaH
61      «
60      meHDaq
60      vagh - these numbers might be from chapter headings
60      tugh
59      veS - another ship name
58      chay'
58      'op
58      puS
57      Qel.
57      'Iw  - heh, it turned up later than Hagh
56      motlh
56      DIvI'
55      «DaH
54      ghogh
54      nach
54      may'
54      DaHjaj
54      pe'vIl
54      Sov
53      be'tlhar - another name
53      ben
53      Soj
52      batlh
52      pIj
52      Daq/DaQ
52      potlh
52      jonwI'
52      'oH.
52      Hegh  - this is such a Klingon list
51      nID
51      nuv
51      QeDpIn
51      ghaHvaD
50      naQ
50       HungpIn
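
The list above counts jatlh, jatlh. and HoD, as separate words, and folds Quv into quv because the word count ignores case. A minimal Python sketch of a count that avoids both is given here for illustration. It assumes whitespace-separated text; the function name count_words, the set of punctuation trimmed and the sample string are assumptions, not the tool actually used for the list.

```python
from collections import Counter

# Punctuation trimmed from the ends of each token. The apostrophe is
# left out on purpose: in Klingon it is the letter qaghwI'.
EDGE_PUNCTUATION = ".,?!:;«»\"()-"


def count_words(text):
    """Case-sensitive word counts with edge punctuation trimmed."""
    counts = Counter()
    for token in text.split():
        word = token.strip(EDGE_PUNCTUATION)
        if word:  # tokens that were pure punctuation, like "...", vanish
            counts[word] += 1
    return counts


# Made-up token string for demonstration, not a real sentence.
sample = "«DaH jatlh HoD. jatlh Quv 'e' quv, jatlh.» ..."
for word, freq in count_words(sample).most_common():
    print(freq, word)
```

Keeping the count case sensitive matters more for Klingon than for English, since q and Q are different letters.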

Most of the words are adverbs or conjunctions 
because they are the only words that don't take 
affixes. Nouns and verbs so often have affixes on 
them that they don't have high frequency in any 
one configuration. Here's a chunk of lower 
frequency, all the same verb, and of course not 
including the versions starting with ma- vI- mu- et cetera.

5       FAZ
1       FAZBE'.
1       FAZBE'PU'
1       FAZCHUZ
1       FAZDI'
1       FAZLAH
2       FAZMEH
1       FAZPU'WI'PU'
1       FAZPU'›
1       FAZQA'LI'.
1       FAZQAF.
1       FAZQAFBE'WI',
1       FAZQO'.
1       FAZRUP
1       FAZRUPDI'
1       FAZRUPQU'CHOH
1       FAZRUPWI'PU'.
1       FAZTA'BOZ
1       FAZTA'MO'
1       FAZTAH
1       FAZTAHVIS

If someone wants more data I can work more on this.

- Qov

At 17:36 1/4/2013, you wrote:
>Kind of a rookie solution, but what you could do 
>to check the frequencies of ng/gh/tlh is to make 
>a copy of your document and then do a Find & Replace thusly:
>
>tlh -> X
>gh -> Z
>ng -> F
>
>
>or some other letters that aren't much used 
>already (do a search first so you can subtract 
>pre-existing ones from the total).
>It's important to do gh before ng, because 
>otherwise ngh will become Fh, rather than nZ.
>
>If you'd like, you can also do something similar 
>with alien words/names like Mahoun that you don't want altering the results.
>________________________________________
>From: Robyn Stewart [robyn at flyingstart.ca]
>Sent: Saturday, January 05, 2013 02:18
>To: tlhingan-hol at kli.org
>Subject: Re: [Tlhingan-hol] Fwd: RE: Klingon Scrabble
>
>I can, but I don't have a platform in which I can write a clever
>script, so this counts each character for itself not as part of its
>Klingon letter. Here's what I get, with my comments.
>
>72543       - That's the space character, what you'd expect for a 75k
>word novel.
>44671   a  - Our existing distribution gets that right. I wonder if
>this is biased by character names. The main character is named vajar.
>I'll do a version stripped of character and ship names once I have a
>better system.
>40793   '    - I told you there weren't enough qaghwI'mey in the
>game. It beats out all but one vowel!
>28488   h   - This combines the letter's presence in tlh and gh, but
>excludes H.
>23699   o
>22652   e
>21140   I
>20469   H - I expected this to be more common in text than in the
>dictionary, because it's in -taH and -Ha' and -laH and -moH and -meH ...
>20213   u  - last of the vowels
>19024   l   - biased because this includes l and tlh
>17380   t   - biased by t + tlh
>17291   j - interesting. One of the ship names has a j and so does
>the main character's name. That might be a factor. But the main
>character also has a v and an r, so I don't think so.
>14365   .  - Heh. Short sentences, eh?
>13634   g - A combination of gh + ng
>13627   m
>13557   D
>13455   n - includes n and ng
>11737   S
>11226   v
>9892    q
>9647    c
>9530    b
>9145    p
>7685    y
>7653    r
>6611    w
>5793    Q
>
>So it looks like yay ray way and Qay should be the high-scoring
>letters.  Whoda thunk there were over three times as many Haymey as Qaymey.
>
>As an indication of the cleanliness of the data, here's the rest.
>
>5067    ,
>1702    M - two alien characters, one of whom is a main character,
>have names starting in M. The names of alien ships and persons are
>also the explanation for most of the non-Klingon alphabetic characters below.
>1122    ?
>691     !
>374     s
>172     T
>171     -
>141     i
>132     O
>126     A
>126     @ - The pIqaD 'ay' titles are typed in xifan hol, which
>renders the numbers as cartoon swear words.
>120     x
>94      U
>81      E
>76      F
>70      R
>69      &
>63      L
>62      :
>59      G
>57      #
>57      $
>57      %
>55      ^
>54      N
>51      )
>51      (
>47      *
>43      J
>41      P
>40      C
>39      K
>39      B
>35      k
>30      Y
>29      W
>27      V
>24      X
>18      d
>16      f
>
>At 22:52 1/2/2013, you wrote:
> >Robyn,
> >     Could you analyze your own writings? I bet that would give a good
> >letter frequency representation.
> >
> >Tim Stoffel
> >
> >--
> >
> >On Tue, 2013-01-01 at 14:09 -0800, Robyn Stewart wrote:
> > > That's an interesting question. Is the letter frequency distribution
> > > of a large piece of text different than the frequency distribution in
> > > a complete wordlist of that language?  I think a list compiled just
> > > from TKD affix and vocabulary lists might competitively
> > > under-represent qaghwI', as it's in so many affixes.
> > >
> > > I found a shortage of qaghwI'mey during game play, but the
> > > artificiality of the arbitrarily high scores for tlh and ng didn't
> > > bother me much. It was just a luck thing.
> > >
> > > - Qov
> > >
> > > At 13:21 1/1/2013, Felix Malmenbeck wrote:
> > > > At the risk of showcasing my ignorance with regards to Scrabble:
> > > >
> > > > Does one actually need a corpus to decide character values for
> > > > Scrabble? I imagine that a lexicon along with the rules for
> > > > appending affixes would suffice, as the deciding factor is what
> > > > words can be formed, rather than what words are most commonly used
> > > > (or do rare/difficult words weigh more heavily in that
> > > > calculation?).
> > > >
> > > >
> > > > ____________________________________________________________________
> > > > From: David Holt [kenjutsuka at live.com]
> > > > Sent: Tuesday, January 01, 2013 22:14
> > > > To: tlhIngan Hol mailing list
> > > > Subject: Re: [Tlhingan-hol] Fwd: RE: Klingon Scrabble
> > > >
> > > > > On Mon, Apr 14, 2008 at 11:57 PM, Alan Anderson
> > > > <aranders at insightbb.com> wrote:
> > > > > > I got it from DloraH, who got it from janSIy, who I believe
> > > > originated it.
> > > >
> > > > I didn't originate it, but I may have been the first one to bring a
> > > > converted set to the qep'a'.  I got the frequencies and values off
> > > > this very list and I no longer remember who did the calculations or
> > > > came up with the values.  It was probably 15 years ago.  The game is
> > > > fun, but the scores are somewhat artificial since the point values
> > > > were based on rarity of English letters and so it's weird to have
> > > > common letters like <tlh> be worth so many points.  I think any new
> > > > calculations should be based on Qov's <nuq bop bom>, since that is a
> > > > large piece of original tlhIngan Hol writing.
> > > >
> > > > janSIy
> > > > _______________________________________________
> > > > Tlhingan-hol mailing list
> > > > Tlhingan-hol at stodi.digitalkingdom.org
> > > > http://stodi.digitalkingdom.org/mailman/listinfo/tlhingan-hol