parse text files in a directory?
hi everybody, i'm a newbie in python and i have a question: how do you parse a bunch of text files in a directory?

directory: /dir
files: H20080101.txt, H20080102.txt, H20080103.txt, H20080104.txt, H20080105.txt, etc.

i already have a python script that reads a single text file and inserts it into a postgres db. is there any way i can do it in a batch? i've got about 2000 txt files.

thanks in advance
joe
-- 
http://mail.python.org/mailman/listinfo/python-list
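One possible way to batch this, sketched under assumptions: `glob` collects the matching files and the existing single-file logic is called once per file. The names `load_file` and `load_directory` and the pattern `H*.txt` are placeholders, not taken from the original script, and the database insert is only stubbed out.

```python
import glob
import os


def load_file(path):
    """Hypothetical stand-in for the existing single-file script.

    It only counts lines here; the real version would parse the file
    and insert its contents into the database.
    """
    with open(path) as f:
        return sum(1 for _ in f)


def load_directory(directory, pattern="H*.txt"):
    """Call load_file() once per matching file and return the names handled."""
    processed = []
    # sorted() gives a stable, date-like order for names such as H20080101.txt
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        load_file(path)
        processed.append(os.path.basename(path))
    return processed
```

With the layout from the question this would be called as `load_directory('/dir')`.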
linecache and glob
hi everyone, happy new year!

i'm a newbie to python and i have a question: using linecache and glob, how do i read a specific line from each file in a batch and then insert it into a database? i can't get the glob wildcard to work with linecache:

>>> import glob
>>> import linecache
>>> linecache.getline(glob.glob('/etc/*'), 4)

doesn't work. is there a better method?

thank you very much in advance
jo3c
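`glob.glob()` returns a list of filenames, while `linecache.getline()` expects a single filename, so one option is to loop over the list. This is a minimal sketch, with `line_from_each` as a made-up helper name; `linecache.clearcache()` is called after each file so cached contents do not pile up across the batch. Note that linecache still reads each whole file to fetch one line.

```python
import glob
import linecache


def line_from_each(pattern, lineno):
    """Return {filename: line} for line `lineno` (1-based) of each matching file."""
    lines = {}
    for filename in glob.glob(pattern):
        # getline() returns '' when the file has fewer than `lineno` lines
        lines[filename] = linecache.getline(filename, lineno)
        linecache.clearcache()  # drop the cached contents of this file
    return lines
```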
Re: linecache and glob
i have 2000 files, each with a header and data. i need to get the date information from the header and then insert it into my database.

i am doing it in a batch, so i use glob.glob('/mydata/*/*/*.txt'). to get the date on line 4 of a txt file i use

linecache.getline('/mydata/myfile.txt', 4)

but if i use

linecache.getline(glob.glob('/mydata/*/*/*.txt'), 4)

it won't work. i am running out of ideas.

thanks in advance for any help
jo3c
Re: Python's great, in a word
On Jan 7, 9:09 pm, [EMAIL PROTECTED] wrote:
> I'm a Java guy who's been doing Python for a month now and I'm
> convinced that
>
> 1) a multi-paradigm language is inherently better than a mono-paradigm
> language
>
> 2) Python writes like a talented figure skater skates.
>
> Would you Python old-timers try to agree on a word or two that
> completes:
>
> The best thing about Python is ___.
>
> Please, no laundry lists, just a word or two. I'm thinking "fluid" or
> "grace" but I'm not sure I've done enough to choose.

skimpythong!!
use fileinput to read a specific line
hi everybody, i'm a newbie in python.

i need to read line 4 from the header of each file. using linecache will crash my computer because it loads the files into memory, and i am working on 2000 files of 8 MB each. fileinput doesn't load the whole file into memory first.

how do i use the fileinput module to read a specific line from a file?

for line in fileinput.FileInput('sample.txt'):
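A rough sketch of how fileinput could do this: `filelineno()` reports the line number within the current file, and `nextfile()` skips the rest of that file once the wanted line has been seen. The helper name `nth_lines` is invented for illustration. The guard for an empty list is there because fileinput falls back to the command-line arguments or standard input when it is given no files.

```python
import fileinput


def nth_lines(filenames, lineno=4):
    """Yield (filename, line) for line `lineno` (1-based) of each file.

    Files with fewer than `lineno` lines are skipped silently.
    """
    filenames = list(filenames)
    if not filenames:
        return  # avoid fileinput's fallback to sys.argv / stdin
    reader = fileinput.input(files=filenames)
    try:
        for line in reader:
            if reader.filelineno() == lineno:
                yield reader.filename(), line
                reader.nextfile()  # do not read the rest of this file
    finally:
        reader.close()
```

It could be combined with glob as `for name, line in nth_lines(glob.glob('/mydata/*/*/*.txt')): ...`.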
Re: linecache and glob
On Jan 4, 5:25 pm, Fredrik Lundh <[EMAIL PROTECTED]> wrote:
> jo3c wrote:
> > i have a 2000 files with header and data
> > i need to get the date information from the header
> > then insert it into my database
> > i am doing it in batch so i use glob.glob('/mydata/*/*/*.txt')
> > to get the date on line 4 in the txt file i use
> > linecache.getline('/mydata/myfile.txt', 4)
> >
> > but if i use
> > linecache.getline(glob.glob('/mydata/*/*/*.txt'), 4) won't work
>
> glob.glob returns a list of filenames, so you need to call getline once
> for each file in the list.
>
> but using linecache is absolutely the wrong tool for this; it's designed
> for *repeated* access to arbitrary lines in a file, so it keeps all the
> data in memory. that is, all the lines, for all 2000 files.
>
> if the files are small, and you want to keep the code short, it's easier
> to just grab the file's content and use indexing on the resulting list:
>
>     for filename in glob.glob('/mydata/*/*/*.txt'):
>         line = list(open(filename))[4-1]
>         ... do something with line ...
>
> (note that line numbers usually start with 1, but Python's list indexing
> starts at 0).
>
> if the files might be large, use something like this instead:
>
>     for filename in glob.glob('/mydata/*/*/*.txt'):
>         f = open(filename)
>         # skip first three lines
>         f.readline(); f.readline(); f.readline()
>         # grab the line we want
>         line = f.readline()
>         ... do something with line ...
>

thank you guys, i did hit a wall using linecache, due to large files loading into memory. i think this last solution works well for me

thanks
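The large-file approach can also be written with `itertools.islice` and a `with` block, which stops reading at the wanted line and closes each file promptly. This is only a sketch; `read_line` is an illustrative name and not part of the reply quoted above.

```python
from itertools import islice


def read_line(filename, lineno):
    """Return line `lineno` (1-based) of `filename`, or None if the file is shorter."""
    with open(filename) as f:
        # islice reads lazily and stops right after the requested line
        return next(islice(f, lineno - 1, lineno), None)
```

Used per file, for example `line = read_line(filename, 4)` inside the `glob.glob` loop, it keeps at most one line of each file in memory at a time.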
Re: use fileinput to read a specific line
On Jan 8, 2:08 pm, "Russ P." <[EMAIL PROTECTED]> wrote:
> > Given that the OP is talking 2000 files to be processed, I think I'd
> > recommend explicit open() and close() calls to avoid having lots of I/O
> > structures floating around...
>
> Good point. I didn't think of that. It could also be done as follows:
>
> for fileN in files:
>
>     lnum = 0 # line number
>     input = file(fileN)
>
>     for line in input:
>         lnum += 1
>         if lnum >= 4: break
>
>     input.close()
>
>     # do something with "line"
>
> Six of one or half a dozen of the other, I suppose.

this is what i did using glob

import glob
for files in glob.glob('/*.txt'):
    x = open(files)
    x.readline()
    x.readline()
    x.readline()
    y = x.readline()
    # do something with y
    x.close()
windows active directory ldap output encoding
Hi,

i'm trying to get some information out of a windows server 2003 chinese active directory system, so the encoding is probably big5 or utf-8. what i'm doing is similar to ldapsearch in the shell, but from my python script using the python ldap module. the result is not in the correct encoding. i've looked in many places and tried many different encodings at the top of the script (#coding=big5 etc.). below is the output with the wrong encoding. any help will be much appreciated.

*** ldap://2134.localhost.com:389 - SimpleLDAPObject.set_option ((17, 3),{})
CN=江,OU=2134,DC=localhost,DC=com
{'accountExpires': ['9223372036854775807'], 'badPasswordTime': ['128566014672343750'], 'badPwdCount': ['0'], 'cn': ['\xe6\xb1\x9f\xe6\x9f\x8f\xe5\xa3\x95'], 'codePage': ['0'], 'company': ['\xe8\x8f\xaf\xe8\x81\xaf\xe7\x94\x9f \xe7\x89\xa9\xe7\xa7\x91\xe6\x8a\x80'], 'countryCode': ['0'], 'department': ['\xe7\x94\x9f\xe7\x89\xa9\xe7\xa7\x91\xe6\x8a \x80\xe8\x99\x95'], 'displayName': ['\xe6\xb1\x9f\xe6\x9f\x8f\xe5\xa3\x95'], 'distinguishedName': ['CN=\xe6\xb1\x9f\xe6\x9f\x8f \xe5\xa3\x95,OU=300\xe7\xa7\x91\xe6\x8a \x80\xe8\x99\x95,DC=localhost,DC=com'], 'givenName': ['\xe6\x9f\x8f\xe5\xa3\x95'], 'homeMDB': ['CN=\xe4\xbf\xa1\xe7\xae\xb1\xe5\x84\xb2\xe5\xad \x98\xe5\x8d\x80 (MAIL),CN=\xe9\xa0\x90\xe8\xa8\xad\xe5\x84\xb2\xe5\xad \x98\xe7\xbe\xa4\xe7\xb5\x84,CN=InformationStore,CN=MAIL,CN=Servers,CN= \xe9\xa0\x90\xe8\xa8\xad\xe7\xb3\xbb\xe7\xb5\xb1\xe7\xae \xa1\xe7\x90\x86\xe7\xbe\xa4\xe7\xb5\x84,CN=Administrative Groups,CN=localhost,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=localhost,DC=com'], 'homeMTA': ['CN=Microsoft MTA,CN=MAIL,CN=Servers,CN= \xe9\xa0\x90\xe8\xa8\xad\xe7\xb3\xbb\xe7\xb5\xb1\xe7\xae \xa1\xe7\x90\x86\xe7\xbe\xa4\xe7\xb5\x84,CN=Administrative Groups,CN=localhost,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=localhost,DC=com'], 'instanceType': ['4'], 'lastLogoff': ['0'], 'lastLogon': ['128598965066718750'], 'legacyExchangeDN': ['/o=localhost/ou=ExchangAdmin/cn=Recipients/ 
cn=joechiang'], 'logonCount': ['33'], 'mDBUseDefaults': ['TRUE'], 'mail': ['[EMAIL PROTECTED]'], 'mailNickname': ['joechiang'], 'memberOf': ['CN=AllHQStaff,CN=Users,DC=localhost,DC=com'], 'msExchALObjectVersion': ['60'], 'msExchHomeServerName': ['/o=localhost/ou=ExchangAdmin/ cn=Configuration/cn=Servers/cn=MAIL'], 'msExchMailboxGuid': ['2\x04\x116^\xfc%J\x87yi\xbdj^\x1bl'], 'msExchMailboxSecurityDescriptor': ['\x01\x00\x04\x80x \x00\x00\x00\x94\x00\x00\x00\x00\x00\x00\x00\x14\x00\x00\x00\x04\x00d \x00\x01\x00\x00\x00\x00\x02\x14\x00\x03\x00\x02\x00\x01\x01\x00\x00\x00\x00\x00\x05\n \x00\x00\x00a\x00n\x00x\x00/\x00C\x00N\x00=\x00C\x00o\x00n\x00f\x00i \x00g\x00u\x00r\x00a\x00t\x00i\x00o\x00n \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x05\x00\x00\x00\x00\x00\x05\x15\x00\x00\x00\xd6\xb2i \x1a^\xa7P\xcb\xday \x88\xa9\xf4\x01\x00\x00\x01\x05\x00\x00\x00\x00\x00\x05\x15\x00\x00\x00\xd6\xb2i \x1a^\xa7P\xcb\xday\x88\xa9\xf4\x01\x00\x00'], 'msExchPoliciesIncluded': ['{C96E41C5-C5D5-411B-8672-1A3B6602437F}, {3B6813EC-CE89-42BA-9442-D87D4AA30DBC}', '{C96E41C5-C5D5-411B-8672-1A3B6602437F}, {26491CFC-9E50-4857-861B-0CB8DF22B5D7}'], 'msExchUserAccountControl': ['0'], 'name': ['\xe6\xb1\x9f\xe6\x9f\x8f\xe5\xa3\x95'], 'objectCategory': ['CN=Person,CN=Schema,CN=Configuration,DC=localhost,DC=com'], 'objectClass': ['top', 'person', 'organizationalPerson', 'user'], 'objectGUID': ['\x13\xfa\xc2\xbb\x9e\xee|C\x9d\xa8_\xea]\xef \xc6\x90'], 'objectSid': ['\x01\x05\x00\x00\x00\x00\x00\x05\x15\x00\x00\x00\xd6\xb2i\x1a^\xa7P \xcb\xday\x88\xa9u\x0c\x00\x00'], 'primaryGroupID': ['513'], 'proxyAddresses': ['X400:c=TW;a= ;p=localhost;o=Exchange;s=joechiang;', 'SMTP:[EMAIL PROTECTED]'], 'pwdLastSet': ['128587670396562500'], 'sAMAccountName': ['joechiang'], 'sAMAccountType': ['805306368'], 'showInAddressBook': ['CN=\xe5\x85\xa8\xe5\x9f\x9f\xe9\x80\x9a \xe8\xa8\x8a\xe6\xb8\x85\xe5\x96\xae,CN=All Global 
Address Lists,CN=Address Lists Container,CN=localhost,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=localhost,DC=com', 'CN=\xe7\x87\x9f\xe9\x81\x8b\xe8\x99\x95,CN=All Address Lists,CN=Address Lists Container,CN=localhost,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=localhost,DC=com', 'CN=\xe7\x94\x9f\xe7\x89\xa9\xe7\xa7\x91\xe6\x8a \x80\xe8\x99\x95,CN=All Address Lists,CN=Address Lists Container,CN=localhost,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=localhost,DC=com', 'CN=\xe6\x89\x80\xe6\x9c\x89\xe4\xbd\xbf \xe7\x94\xa8\xe8\x80\x85,CN=All Address Lists,CN=Address Lists Container,CN=localhost,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=localhost,DC=com'], 'sn': ['\xe6\xb1\x9f'], 'textEncodedORAddress': ['c=TW;a= ;p=localhost;o=Exchange;s=joechiang;'], 'uSNChanged': ['22943844'], 'uSNCreated': ['2