streaming a file object through re.finditer
Hello,

I've been looking for an answer for a while, but so far I haven't been able to turn anything up. Basically, I'd like to use re.finditer to search a large file (or a file stream), but I haven't figured out how to get finditer to work without loading the entire file into memory, reading one line at a time, or doing more complicated buffering. For example, say I do this:

    cat a b c > blah

Then run this Python script:

    >>> import re
    >>> for m in re.finditer('\w+', buffer(file('blah'))):
    ...     print m.group()
    ...
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: buffer object expected

Of course, this works fine, but it loads the file completely into memory (right?):

    >>> for m in re.finditer('\w+', buffer(file('blah').read())):
    ...     print m.group()
    ...
    a
    b
    c

So, is there any way to do this?

Thanks,
-e
--
http://mail.python.org/mailman/listinfo/python-list
Re: Easy Q: dealing with object type
Ah, you're running into "old-style classes vs. new-style classes". Try subclassing from "object". For example:

    >>> class A(object):
    ...     pass
    ...
    >>> a=A()
    >>> type(a)
    <class '__main__.A'>
    >>> type(a) == A
    True
    >>> type(a) is A
    True
    >>> b=A()
    >>> type(a) == type(b)
    True
    >>> type(a) is type(b)
    True

Check out the following article; it should answer your questions:

http://www.python.org/doc/2.2.3/whatsnew/sect-rellinks.html#SECTION00031

-e
Re: streaming a file object through re.finditer
Ack, typo. What I meant was this:

    cat a b c > blah

    >>> import re
    >>> for m in re.finditer('\w+', file('blah')):
    ...     print m.group()
    ...
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: buffer object expected

Of course, this works fine, but it loads the file completely into memory (right?):

    >>> for m in re.finditer('\w+', file('blah').read()):
    ...     print m.group()
    ...
    a
    b
    c
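One way to get what the question asks for, without building one large string, is to memory-map the file and hand the map to re.finditer. The following is only a sketch, assuming Python 3 (where the pattern must be a bytes pattern when searching a bytes-like object); the function name `iter_file_matches` and the default pattern are illustrative, not part of the original post.

```python
import mmap
import re


def iter_file_matches(path, pattern=rb"\w+"):
    """Yield each regex match (as bytes) found in the file at `path`.

    The file is memory-mapped, so the operating system pages it in on
    demand instead of Python reading it into a single string.
    """
    regex = re.compile(pattern)
    with open(path, "rb") as f:
        # Length 0 maps the whole file. Note: mapping an empty file
        # raises ValueError, so callers may want to check for that.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for m in regex.finditer(mm):
                yield m.group()
```

Because the whole file is visible to the regex engine as one buffer, patterns that span line boundaries still match. This only works for real files with a file descriptor, not for pipes or sockets.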
Re: streaming a file object through re.finditer
True, but it doesn't work with multiline regular expressions :(

-e
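To make the limitation concrete, here is a small illustration, assuming Python 3 and using an in-memory `io.StringIO` as a stand-in for a real file. The sample text and pattern are made up for the example.

```python
import io
import re

text = "start\nend\nstart\nend\n"
pattern = re.compile(r"start\nend")

# Line at a time: each line is searched in isolation, so a match that
# spans a newline is never seen.
per_line = [
    m.group()
    for line in io.StringIO(text)
    for m in pattern.finditer(line)
]

# Whole buffer: the regex engine sees the newline between the two lines.
whole = [m.group() for m in pattern.finditer(text)]

print(per_line)
print(whole)
```

The first list comes back empty while the second contains both matches, which is why line-by-line iteration is not a general substitute for searching the whole input.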
Re: streaming a file object through re.finditer
I did try to see if I could get that to work, but I couldn't figure it out. I'll see if I can play around more with that API.

Say I investigated a little further to see how much work it would take to adapt the re module to accept an iterator (while leaving the current string API as another code path). Depending on how complicated a change this would be, how much interest would there be from other people in using this feature? From what I understand about regular expressions, they're essentially stream processing and don't need backtracking, so reading from an iterator should work too (right?).

Thanks,
-e
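Short of changing the re module, the "more complicated buffering" mentioned earlier in the thread can be sketched in a few lines. This is a rough illustration, assuming Python 3 and assuming a token-like pattern such as `\w+`, where a match that ends before the end of the buffer cannot be extended by later input. It is not correct for arbitrary patterns (lookaheads, optional trailing groups, and so on). The name `iter_stream_matches` is hypothetical.

```python
import re


def iter_stream_matches(stream, pattern, chunk_size=4096):
    """Yield match strings from a text stream read in fixed-size chunks.

    ASSUMPTION: a match that ends before the end of the current buffer
    is final. A match that touches the end of the buffer is held back
    and re-scanned together with the next chunk, because more input
    could extend it.
    """
    regex = re.compile(pattern)
    tail = ""  # text held back from the previous chunk
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf = tail + chunk
        tail = ""
        for m in regex.finditer(buf):
            if m.end() == len(buf):
                # Possibly incomplete: keep it for the next round.
                tail = buf[m.start():]
                break
            yield m.group()
    # End of input: whatever was held back is now complete.
    for m in regex.finditer(tail):
        yield m.group()
```

Memory use is bounded by the chunk size plus the longest single match, so one very long token would still be buffered in full.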
Fwd: bug in download
Begin forwarded message:

> From: Erick Willum
> Date: 16 December 2020 at 15:53:40 GMT
> To: python-list@python.org
> Subject: bug in download
>
> Hallo and good afternoon,
>
> Having installed python (big thank you) and sublime text, i get the next
> message when trying to download numpy or matplotlib in sublime text:
>
> fails to pass a sanity check due to a bug in the windows runtime. See this
> issue for more information:
> https://tinyurl.com/y3dm3h86I
>
> I got the next message from tiny.url, the long url is:
>
> https://developercommunity.visualstudio.com/content/problem/1207405/fmod-after-an-update-to-windows-2004-is-causing-a.html
>
> These next lines appear in the sublime window as well:
>
>   File "C:\Users\Bill\AppData\Local\Programs\Python\Python39\lib\site-packages\numpy\__init__.py", line 305, in <module>
>     _win_os_check()
>   File "C:\Users\Bill\AppData\Local\Programs\Python\Python39\lib\site-packages\numpy\__init__.py", line 302, in _win_os_check
>     raise RuntimeError(msg.format(__file__)) from None
>
> I would appreciate your help.
> Thanks again.
> Bill.
python-ldap reading an OU with more than 1000 objects
Hi,

I have a MS Windows AD domain with one OU that holds more than 1000 user objects. When I try to read it, I hit AD's 1000-object limit on returned results, so I'm asking for advice on how to read them all.

Here is my actual code. It is not the cleanest, as I am learning Python. Suggestions are welcome :)

Running this script on RedHat 5.x with "python zimbra2.py" returns:

    {'info': '', 'desc': 'Size limit exceeded'}

The script:

    #!/usr/bin/python
    #--- ---
    # Variables can be changed here:
    import ldap, string, os, time, sys

    base = 'ou=usuarios con papel tapiz,dc=organojudicial,dc=gob,dc=pa'
    scope = ldap.SCOPE_SUBTREE
    ZimbraEmail = "CN=ZimbraEmail,CN=Users,DC=organojudicial,DC=gob,DC=pa"
    domain = "organojudicial.gob.pa"  # "example.com"
    ldapserver="ancon"
    port="389"
    emaildomain="organojudicial.gob.pa"
    ldapbinddomain="organojudicial"
    ldapbind="zimbrasync"
    ldappassword=""
    pathtozmprov="/opt/zimbra/bin/zmprov"
    #--- ---

    #--- ---
    # output the list of all accounts from zmprov gaa (get all accounts)
    # this is related to the Zimbra Mail System
    f = os.popen(pathtozmprov +' gaa')
    zmprovgaa= []
    zmprovgaa = f.readlines()
    #--- ---

    #--- ---
    # Let's connect to the Windows AD Domain
    l=ldap.initialize("ldap://"+ldapserver+"."+domain+":"+port)
    try:
        l.simple_bind_s(ldapbinddomain+"\\"+ldapbind,ldappassword)
    except ldap.INVALID_CREDENTIALS:
        print "Your username or password to bind to AD is incorrect."
        sys.exit()
    except ldap.LDAPError, e:
        if type(e.message) == dict and e.message.has_key('desc'):
            print e.message['desc']
        else:
            print e
        sys.exit()
    # end of connection procedure to AD
    #--- ---

    #--- ---
    # If connection to AD is ok
    # Lets find only enabled users in a specific OU controlled by the variable named base
    # and get the login username the first name, the last name and what groups this
    # user belongs to as well as the email field.
    # userAccountControl 512 = normal , 514 = disabled account. We only want enabled accounts
    try:
        res = l.search_s(base,scope, "(&(ObjectCategory=user) (userAccountControl=512))", ['sAMAccountName','givenName','sn','memberOf', 'mail'])
        for (dn, vals) in res:
            samaccount = vals['sAMAccountName'][0].lower()
            accountname = vals['sAMAccountName'][0].lower()
            try:
                alias1 = vals['mail'][0].lower()
            except:
                alias1 = 'none'
            try:
                sirname = vals['sn'][0]
            except:
                sirname = vals['sAMAccountName'][0]
            try:
                givenname = vals['givenName'][0]
            except:
                givenname = vals['sAMAccountName'][0]
            try:
                groups = vals['memberOf']
            except:
                groups = 'none'
            # this code is not working. Python chokes.
            #initial = givenname[:1].upper()
            #sirname = sirname.replace(' ', )
            #sirname = sirname.replace('\\', )
            #sirname = sirname.replace('-', )
            #sirname = sirname.capitalize()
            name = givenname + " " + sirname
            accountname = accountname + "@" + emaildomain
            password = " \'\' "
            sys.stdout.flush()
            # If the Active Directory user is a member of the AD group called ZimbraMail, we begin processing this user.
            if ZimbraEmail in groups:
                print "SAM ACCOUNT: " + samaccount
                print "accountname: " + accountname
                print "name: " + name
                print "Alias de zimbra " + alias1
                if accountname +"\n" not in zmprovgaa:
                    print accountname," exists in active directory but not in zimbra, the account is being created\n"
                    time.sleep(1)
                    os.system(pathtozmprov +' ca %s %s displayName "%s"' % (accountname,password,name))
                    print "Creando Alias"
                    os.system(pathtozmprov +' aaa %s %s' % (accountname,alias1))
                    time.sleep(1)
            else:
                print accountname, alias1, " user is not a member of the ZimbraMail AD Group. Will not be processed\n"
        #--- ---
    except ldap.LDAPError, error_message:
        print error_message
    l.unbind_s()

Thanks all for your comments.

Erick.
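The usual way around the "Size limit exceeded" error is to ask the server for results in pages, using the LDAP simple paged results control (RFC 2696), which Active Directory supports. The following is a sketch only, assuming python-ldap 2.4 or later on Python 3; the much older python-ldap shipped with RedHat 5.x uses a different constructor signature for `SimplePagedResultsControl`. The function name `paged_search` and its `control` parameter are hypothetical helpers, not part of python-ldap.

```python
def paged_search(conn, base, scope, filterstr, attrlist,
                 page_size=500, control=None):
    """Yield (dn, attrs) pairs from an LDAP search, one page at a time.

    `conn` is a bound python-ldap connection. `control` is normally left
    as None; it exists so a stand-in object can be supplied for testing.
    """
    if control is None:
        # Imported lazily so this module loads without python-ldap.
        from ldap.controls import SimplePagedResultsControl
        control = SimplePagedResultsControl(True, size=page_size, cookie="")

    while True:
        msgid = conn.search_ext(base, scope, filterstr, attrlist,
                                serverctrls=[control])
        _rtype, rdata, _rmsgid, serverctrls = conn.result3(msgid)

        for dn, attrs in rdata:
            if dn is not None:  # entries with dn None are referrals
                yield dn, attrs

        # The server returns a cookie in its paged-results control;
        # an empty cookie means there are no more pages.
        cookie = next(
            (c.cookie for c in serverctrls
             if c.controlType == control.controlType),
            None,
        )
        if not cookie:
            break
        control.cookie = cookie
```

With a bound connection `l` as in the script, the `l.search_s(...)` call could then be replaced by looping over `paged_search(l, base, scope, filterstr, attrlist)`, keeping the page size at or below the server's limit.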