Fixed:

$ count.py kjv.xml
Code point  Character  Name                     Count
000020                 SPACE                    1669596
000022      "          QUOTATION MARK           1661832
00006F      o          LATIN SMALL LETTER O     1330866
000072      r          LATIN SMALL LETTER R     1307266
000073      s          LATIN SMALL LETTER S     1172801
000065      e          LATIN SMALL LETTER E     1156121
00006E      n          LATIN SMALL LETTER N     1092384
00006D      m          LATIN SMALL LETTER M     1029125
000074      t          LATIN SMALL LETTER T     901465
00003C      <          LESS-THAN SIGN           864037
00003E      >          GREATER-THAN SIGN        864037
00003D      =          EQUALS SIGN              830916
000061      a          LATIN SMALL LETTER A     776214
000077      w          LATIN SMALL LETTER W     772641
000068      h          LATIN SMALL LETTER H     625029
00003A      :          COLON                    609087
000067      g          LATIN SMALL LETTER G     560652
00006C      l          LATIN SMALL LETTER L     497519
00002F      /          SOLIDUS                  469056
000069      i          LATIN SMALL LETTER I     406801
000030      0          DIGIT ZERO               393184
000070      p          LATIN SMALL LETTER P     370919
000031      1          DIGIT ONE                350731
000048      H          LATIN CAPITAL LETTER H   312386
000032      2          DIGIT TWO                290358
000038      8          DIGIT EIGHT              283469
000033      3          DIGIT THREE              263960
000064      d          LATIN SMALL LETTER D     257239
00002E      .          FULL STOP                220707
000035      5          DIGIT FIVE               209066
000062      b          LATIN SMALL LETTER B     204056
000034      4          DIGIT FOUR               197713
000063      c          LATIN SMALL LETTER C     197400
000037      7          DIGIT SEVEN              193701
000036      6          DIGIT SIX                183464
000047      G          LATIN CAPITAL LETTER G   175932
000039      9          DIGIT NINE               172006
00002D      -          HYPHEN-MINUS             152074
000049      I          LATIN CAPITAL LETTER I   133127
00004D      M          LATIN CAPITAL LETTER M   126782
000044      D          LATIN CAPITAL LETTER D   121721
00004E      N          LATIN CAPITAL LETTER N   115182
000076      v          LATIN SMALL LETTER V     114636
000054      T          LATIN CAPITAL LETTER T   113384
000075      u          LATIN SMALL LETTER U     111775
000079      y          LATIN SMALL LETTER Y     109108
000050      P          LATIN CAPITAL LETTER P   107290
000041      A          LATIN CAPITAL LETTER A   94242
000053      S          LATIN CAPITAL LETTER S   85226
000066      f          LATIN SMALL LETTER F     84923
00002C      ,          COMMA                    74768
000043      C          LATIN CAPITAL LETTER C   73229
00004A      J          LATIN CAPITAL LETTER J   39531
000056      V          LATIN CAPITAL LETTER V   36203
00006B      k          LATIN SMALL LETTER K     35707
00000A                 not found                34899
000045      E          LATIN CAPITAL LETTER E   25991
000052      R          LATIN CAPITAL LETTER R   24737
000046      F          LATIN CAPITAL LETTER F   23948
00004F      O          LATIN CAPITAL LETTER O   20676
000078      x          LATIN SMALL LETTER X     18179
00004C      L          LATIN CAPITAL LETTER L   16367
00003B      ;          SEMICOLON                10159
00007A      z          LATIN SMALL LETTER Z     6930
00004B      K          LATIN CAPITAL LETTER K   5389
000042      B          LATIN CAPITAL LETTER B   5047
00003F      ?          QUESTION MARK            3421
000058      X          LATIN CAPITAL LETTER X   3283
002026      …          HORIZONTAL ELLIPSIS      3115
0000B6      ¶          PILCROW SIGN             2970
00006A      j          LATIN SMALL LETTER J     2596
000057      W          LATIN CAPITAL LETTER W   2489
000071      q          LATIN SMALL LETTER Q     2334
000027      '          APOSTROPHE               2040
00005A      Z          LATIN CAPITAL LETTER Z   1776
002013      –          EN DASH                  920
000055      U          LATIN CAPITAL LETTER U   797
000059      Y          LATIN CAPITAL LETTER Y   551
000021      !          EXCLAMATION MARK         313
000028      (          LEFT PARENTHESIS         240
000029      )          RIGHT PARENTHESIS        240
000051      Q          LATIN CAPITAL LETTER Q   199
0000E6      æ          LATIN SMALL LETTER AE    93
00007B      {          LEFT CURLY BRACKET       5
00007D      }          RIGHT CURLY BRACKET      5
0000C6      Æ          LATIN CAPITAL LETTER AE  3
0005D1      ב          HEBREW LETTER BET        1
0005D5      ו          HEBREW LETTER VAV        1
0005D9      י          HEBREW LETTER YOD        1
0005E1      ס          HEBREW LETTER SAMEKH     1
0005E9      ש          HEBREW LETTER SHIN       1
0005D2      ג          HEBREW LETTER GIMEL      1
0005D6      ז          HEBREW LETTER ZAYIN      1
0005DE      מ          HEBREW LETTER MEM        1
0005E2      ע          HEBREW LETTER AYIN       1
0005E6      צ          HEBREW LETTER TSADI      1
0005EA      ת          HEBREW LETTER TAV        1
0005D3      ד          HEBREW LETTER DALET      1
0005D7      ח          HEBREW LETTER HET        1
0005DB      כ          HEBREW LETTER KAF        1
0005E7      ק          HEBREW LETTER QOF        1
002015      ―          HORIZONTAL BAR           1
0005D0      א          HEBREW LETTER ALEF       1
0005D4      ה          HEBREW LETTER HE         1
0005D8      ט          HEBREW LETTER TET        1
0005DC      ל          HEBREW LETTER LAMED      1
0005E0      נ          HEBREW LETTER NUN        1
0005E4      פ          HEBREW LETTER PE         1
0005E8      ר          HEBREW LETTER RESH       1
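
For anyone following along without the script, here is a minimal sketch of how a report like the one above could be produced. It is not the actual count.py; the function names (count_characters, format_row, report) are made up for illustration, and the tab-separated layout and the "not found" fallback for unnamed characters are assumptions based only on the output shown.

```python
import sys
import unicodedata
from collections import Counter


def count_characters(text):
    """Return a Counter mapping each character in text to its frequency."""
    return Counter(text)


def format_row(char, count):
    """Format one row: 6-digit hex code point, character, Unicode name, count."""
    # Characters without a Unicode name (e.g. U+000A) fall back to "not found".
    name = unicodedata.name(char, "not found")
    # Leave the character column empty for non-printable characters so a raw
    # newline or other control character does not break the row.
    shown = char if char.isprintable() else ""
    return f"{ord(char):06X}\t{shown}\t{name}\t{count}"


def report(text):
    """Build the full tab-separated report, most frequent characters first."""
    rows = ["Code point\tCharacter\tName\tCount"]
    for char, count in count_characters(text).most_common():
        rows.append(format_row(char, count))
    return "\n".join(rows)


# Hypothetical command-line use: python count_sketch.py kjv.xml
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8") as f:
        print(report(f.read()))
```

Note that a sketch like this counts every character in the file, markup included, so for an XML input the tag and attribute characters are part of the totals.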

--Greg

On Mon, Jul 4, 2011 at 10:41 AM, David Haslam <dfh...@googlemail.com> wrote:
> Output is a tad less descriptive than that from BabelPad.
>
> Here's the first 25 lines from a file I was working on.
>
> /For files with long character names, best to use a wider tab setting in
> one's editor./
>
> Code point  Character  Character Name     Count
> 000020                 SPACE              609,105
> 000021      !          EXCLAMATION MARK   2,009
> 000022      "          QUOTATION MARK     2,245
> 000027      '          APOSTROPHE         199
> 000028      (          LEFT PARENTHESIS   93
> 000029      )          RIGHT PARENTHESIS  93
> 00002A      *          ASTERISK           3,500
> 00002B      +          PLUS SIGN          66
> 00002C      ,          COMMA              73,327
> 00002D      -          HYPHEN-MINUS       901
> 00002E      .          FULL STOP          22,991
> 000030      0          DIGIT ZERO         2,822
> 000031      1          DIGIT ONE          14,709
> 000032      2          DIGIT TWO          10,486
> 000033      3          DIGIT THREE        6,626
> 000034      4          DIGIT FOUR         4,786
> 000035      5          DIGIT FIVE         3,897
> 000036      6          DIGIT SIX          3,478
> 000037      7          DIGIT SEVEN        3,230
> 000038      8          DIGIT EIGHT        3,062
> 000039      9          DIGIT NINE         2,920
> 00003A      :          COLON              10,445
> 00003B      ;          SEMICOLON          11,513
> 00003F      ?          QUESTION MARK      3,010
>
> --
> View this message in context:
> http://sword-dev.350566.n4.nabble.com/Character-Frequency-tp3642222p3643921.html
> Sent from the SWORD Dev mailing list archive at Nabble.com.
>
> _______________________________________________
> sword-devel mailing list: sword-devel@crosswire.org
> http://www.crosswire.org/mailman/listinfo/sword-devel
> Instructions to unsubscribe/change your settings at above page