Re: [Chicago] Getting ASCII encoding where unicode wanted under Py3k

2013-05-13 Thread Carl Karsten
On Mon, May 13, 2013 at 10:59 AM, Jonathan Hayward wrote:

That is way too much code for me to try and dig into.

Remove everything not needed to demo it.  Replace big strings with
little strings.

My guess is it should be 1-3 lines, like

>>> print('123%(a)s' % {'a': u'\u0161' } )
123š

But that works.  It may need a few other lines, or something.
It is also possible that there is a setting in your OS that has an effect.

What OS?
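
One way to check whether an OS or terminal setting is involved is to look
at the encodings Python picked up, and to try the same print against a
stream with a known encoding. This is only a sketch: prints_cleanly is a
made-up helper name, and the two encodings tried are just examples.

```python
import io
import locale
import sys


def prints_cleanly(text, encoding):
    """Return True if `text` can be printed to a stream using `encoding`."""
    stream = io.TextIOWrapper(io.BytesIO(), encoding=encoding)
    try:
        print(text, file=stream)
        stream.flush()
    except UnicodeEncodeError:
        return False
    return True


message = '123%(a)s' % {'a': '\u0161'}

# Settings that commonly differ between operating systems and terminals.
print("platform:", sys.platform)
print("stdout encoding:", getattr(sys.stdout, "encoding", None))
print("preferred encoding:", locale.getpreferredencoding(False))

# Simulate stdout under two encodings instead of relying on the real one.
for encoding in ("utf-8", "ascii"):
    print(encoding, "->", prints_cleanly(message, encoding))
```

If the reported stdout encoding is an ASCII variant, the one-liner above
would be expected to fail there even though it works on a UTF-8 terminal.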

--
Carl K
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: [DB-SIG] dbf files and compact indices

2010-09-18 Thread Carl Karsten
On Sat, Sep 18, 2010 at 11:16 AM, Ethan Furman  wrote:
> Carl Karsten wrote:
>>
>> On Sat, Sep 18, 2010 at 1:11 AM, Ethan Furman  wrote:
>>
>>> Does anybody have any pointers, tips, web pages, already-written routines,
>>> etc., on parsing *.cdx files?  I have found the pages on MS's site for
>>> FoxPro, but they neglect to describe the compaction algorithm used, and my
>>> Google-fu has failed to find any sites with that information.
>>>
>>> Any and all help greatly appreciated!
>>>
>>
>>
>> "Compound Index File Structure (.cdx)"
>>
>> http://msdn.microsoft.com/en-us/library/k35b9hs2%28v=VS.80%29.aspx
>>
>> which basically links to:
>> http://msdn.microsoft.com/en-us/library/s8tb8f47%28v=VS.80%29.aspx
>>
>> Is that what you need?
>
> Thanks for the link; unfortunately, I am already familiar with the page.
> What I need help with is the first sentence of the note at the bottom:
>
> Each entry consists of the record number, duplicate byte count and
> trailing byte count, all compacted. The key text is placed at the
> logical end of the node, working backwards, allowing for previous key
> entries.
>
> Here's a dump of the last interior node:
>
> -
> node type: 2
> number of keys: 57
> free space: 1 (or 256) (and is this bits, bytes, keys, what?)
> --
> record number mask: c8 0e 40 b0
> duplicate byte count mask: 28
> trailing byte count mask: 00
> --
> bits used for record number: 178
> bits used for duplicate count: 29
> bits used for trail count: 64
> bytes used for rec num, dup count, trail count: 192
> -
> 12 00 ff 3f 00 00 1f 1f 0e 05 05 03 01 00 c8 0e 40 b0 28 00
> b2 1d 40 c0 29 00 d0 42 40 d0 54 80 c0 43 40 a8 14 40 b8 40
> 40 c8 02 40 d0 08 00 b0 4c 80 b0 3a 40 a0 50 80 d0 3b 40 a8
> 09 40 b8 0a 80 88 3c 80 c0 2a 00 d8 21 c0 c0 3d 40 c0 4a 80
> b0 26 40 b8 2b 40 c0 2c 00 c0 41 40 b8 4d 80 c8 37 00 c0 04
> 40 c8 44 80 c0 1b 40 c8 15 80 c8 27 40 c8 16 00 a8 2d c0 c8
> 51 80 b8 2e 40 c0 1e 00 b0 17 40 b8 46 40 b0 2f 80 c8 4f 80
> a8 13 00 c8 59 00 c8 31 00 c8 1f 00 a8 3e 40 c0 22 40 a8 07
> 00 c8 23 80 d0 32 80 b0 52 80 c0 34 80 b0 20 40 b0 24 40 c0
> 47 80 c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00 4e 44 45 4e 49 44 53 4f 4e 43 43 41 4d 4d 4f 4e 54 54 48
> 45 57 53 53 4c 45 4e 52 54 49 4e 45 5a 4e 4e 4d 41 47 45 45
> 49 45 42 45 52 4d 41 4e 45 57 49 4e 53 4c 41 56 45 4e 42 45
> 52 47 4b 41 56 41 4e 4a 4f 4e 45 53 49 52 49 53 48 53 54 45
> 54 4c 45 52 52 41 4e 4f 4c 53 54 45 49 4e 45 41 44 4c 45 59
> 48 41 54 48 41 57 41 59 52 49 4d 45 53 45 41 53 4f 4e 53 53
> 47 4c 41 44 53 54 4f 4e 45 55 52 52 59 4f 53 54 52 49 4e 4b
> 52 42 45 53 4f 4c 45 59 46 49 4c 45 4e 45 4e 49 53 4e 47 4c
> 55 4e 44 45 42 45 52 4c 45 4f 44 53 4f 4e 49 4e 47 4c 45 52
> 4d 41 52 45 53 54 45 43 4b 45 52 54 4f 4e 44 41 59 57 47 45
> 52 52 4e 45 49 4c 2d 53 55 4e 44 54 4f 4f 4b 53 45 59 4c 45
> 4e 44 45 4e 49 4e 55 4e 48 49 41 50 50 45 54 54 41 52 4e 41
> 48 41 4e 43 41 4c 44 57 45 4c 4c 55 54 54 52 55 43 45 4f 43
> 41 52 44 45 4c 4f 4f 4d 42 45 52 47 4e 53 45 4c 45 45 52 42
> 41 43 48 55 47 55 53 54 4e 44 45 52 53 4f 4e 41 4c 4c 41 4e
> -
>
> The last half (roughly) consists of last names compressed together,
> while the first half consists of 57 (in this case) entries of the record
> number, duplicate byte count and trailing byte count, all compacted --
> how do I uncompact them?
>
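
A possible way to unpack the entries, offered as a sketch and not a
confirmed answer. It assumes the dump above starts at the node's free-space
field (byte offset 12 of the exterior-node layout on the Microsoft page), so
the masks and bit counts are read from different bytes than the labels in
the dump suggest. It also assumes each entry is a little-endian integer with
the record number in the low bits, then the duplicate count, then the
trailing count. KEY_LENGTH = 30 is a guess; the real value lives in the
index header, which is not shown here.

```python
import struct

# First 24 bytes of the dump above: 12 header bytes, then four 3-byte entries.
HEAD = bytes.fromhex(
    "12 00 ff 3f 00 00 1f 1f 0e 05 05 03 "
    "01 00 c8 0e 40 b0 28 00 b2 1d 40 c0"
)
# Last 17 bytes of the dump: key text, stored backwards from the node's end.
TAIL = bytes.fromhex("55 47 55 53 54 4e 44 45 52 53 4f 4e 41 4c 4c 41 4e")

KEY_LENGTH = 30  # assumption, not taken from the dump

# free space, record number mask, dup mask, trail mask,
# record number bits, dup bits, trail bits, bytes per entry
header = struct.unpack("<HIBBBBBB", HEAD[:12])
(free_space, recno_mask, dup_mask, trail_mask,
 recno_bits, dup_bits, trail_bits, entry_size) = header


def unpack_entry(chunk):
    """Split one compacted entry into (record number, dup count, trail count)."""
    value = int.from_bytes(chunk, "little")
    recno = value & recno_mask
    dup = (value >> recno_bits) & dup_mask
    trail = (value >> (recno_bits + dup_bits)) & trail_mask
    return recno, dup, trail


entries = [unpack_entry(HEAD[i:i + entry_size])
           for i in range(12, len(HEAD), entry_size)]

# Rebuild keys: reuse `dup` leading bytes of the previous key, then read the
# new bytes from the tail, walking backwards; trailing blanks are dropped.
keys = []
pos = len(TAIL)
previous = b""
for recno, dup, trail in entries:
    new_len = KEY_LENGTH - dup - trail
    pos -= new_len
    key = previous[:dup] + TAIL[pos:pos + new_len]
    keys.append(key.decode("ascii"))
    previous = key

print(header)
print(entries)
print(keys)
```

Under these assumptions the first four entries come out as record numbers
1, 14, 40 and 29 with keys ALLAN, ANDERSON, ANDERSON and AUGUST, which at
least matches the name fragments at the end of the dump.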

Huh, I see what you mean.

What are you working on?

I know a few people who may have the answer, but it would help if you
explained why you are working on it.


-- 
Carl K


Re: [DB-SIG] dbf files and compact indices

2010-09-18 Thread Carl Karsten
On Sat, Sep 18, 2010 at 11:23 PM, Ethan Furman  wrote:
> Vernon Cole wrote:
>>
>> Ethan:
>> I cannot see where you mentioned your operating system; I am assuming
>> Windows.
>>
>> Perhaps you have already investigated this ... I have no way to test it
>> ... but you might try ADO. It can access almost any data source, and a
>> quick look suggests that .dbf is supported using the JET driver or a
>> FoxPro driver.
>>
>> 1) download and install pywin32
>> 2) import adodbapi
>> 3) find an appropriate connection string for your data source.
>> http://connectionstrings.com suggests that perhaps:
>>   Driver={Microsoft Visual FoxPro Driver};SourceType=DBF;
>>   SourceDB=c:\myvfpdbfolder;Exclusive=No;Collate=Machine;
>>   NULL=NO;DELETED=NO;BACKGROUNDFETCH=NO;
>> may be a good sample to start with. There are other variations; check
>> their site.
>> 4) do your data input/output using standard Python db-api calls.
>>
>> see python\lib\site-packages\adodbapi\test\ for usage examples
>>
>> You can get pywin32 from http://sourceforge.net/projects/pywin32
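
The four steps above might look roughly like this. It is an untested
sketch: build_connection_string and fetch_rows are made-up helper names,
the table name is whatever .dbf you point it at, and it assumes Windows
with pywin32 (which bundles adodbapi) and the Visual FoxPro driver
installed.

```python
def build_connection_string(folder):
    """Assemble the connection string suggested above for a folder of .dbf files."""
    parts = [
        ("Driver", "{Microsoft Visual FoxPro Driver}"),
        ("SourceType", "DBF"),
        ("SourceDB", folder),
        ("Exclusive", "No"),
        ("Collate", "Machine"),
        ("NULL", "NO"),
        ("DELETED", "NO"),
        ("BACKGROUNDFETCH", "NO"),
    ]
    return "".join("%s=%s;" % (name, value) for name, value in parts)


def fetch_rows(folder, table):
    """Read every row of `table` through ADO using standard db-api calls."""
    import adodbapi  # imported here because it only exists on Windows

    connection = adodbapi.connect(build_connection_string(folder))
    try:
        cursor = connection.cursor()
        cursor.execute("SELECT * FROM %s" % table)
        return cursor.fetchall()
    finally:
        connection.close()


print(build_connection_string(r"c:\myvfpdbfolder"))
```

Only the connection string is built when this runs; fetch_rows has to be
called explicitly on a machine that has the driver.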
>
> Thanks for the suggestion, but I don't want to be tied to FoxPro, which
> means I need to be able to parse these files directly.  I have the dbf
> files; now I need the idx and cdx files.


What do you mean by "tied"?

-- 
Carl K