EuroPython 2017: Talk voting results
Thank you all for participating in last week’s talk voting:

https://ep2017.europython.eu/en/speakers/talk-voting/

We have again broken a record, with more than 13,353 cast votes, a 22% increase compared to last year. We had almost 400 submissions to vote on. Users voted on 47 sessions on average, with more than 50% of the users casting 24.5 or more votes (median).

A total of 284 users participated in the talk voting, compared to 254 users last year, so more than 20% of our attendees do like to actively participate in the selection of the talks. A pretty good indicator of how vibrant our community is.

The program work group will now evaluate the voting results and select the first set of highest rated talks before going into the review phase to work on the remaining talk submissions. We plan to announce this first batch in the coming days.

Enjoy,
--
EuroPython 2017 Team
http://ep2017.europython.eu/
http://www.europython-society.org/

PS: Please forward or retweet to help us reach all interested parties:
https://twitter.com/europython/status/862268301992448001
Thanks.
--
https://mail.python.org/mailman/listinfo/python-list
Re: Practice Python
Hello,

1.) A short example for Python 3, but not exactly what they want:

def square(numbers):
    yield from sorted(n**2 for n in numbers)

numberlist = [99, 4, 3, 5, 6, 7, 0]
result = list(square(numberlist))

To solve this tutorial, you need a different way. I'm just showing how sexy Python 3 is ;-) Python 2.7 feels old... it is old. Please learn Python 3.

Greetings
Andre

On 08.05.2017 at 08:52, gyrhgyrh...@gmail.com wrote:
> Python - Exercise 5
> 1. Write a function that gets a list (list) of numbers. The function returns
> a new list of ordered square numbers, from the smallest to the largest.
> For example, for the list [2, 4, 5, 3, 1] the function returns
> [25, 16, 9, 4, 1].
>
> 2. Write a function that receives a list (list) and a number. The function
> returns the number of times the number appears in the list.
> For example, for list [2, 4, 2, 3, 2] and number 2 it will return 3 because
> number 2 is listed 3 times.
>
> The answers here:
>
> https://www.youtube.com/watch?v=nwHPM9WNyw8&t=36s
--
https://mail.python.org/mailman/listinfo/python-list
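For the second exercise above (counting how many times a number appears in a list), a minimal sketch using the built-in list.count method — the function name here is made up — might look like:

def count_occurrences(numbers, target):
    # list.count walks the list once and returns how many elements equal target
    return numbers.count(target)

print(count_occurrences([2, 4, 2, 3, 2], 2))   # prints 3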
Re: Practice Python
On Wed, May 10, 2017 at 10:11 PM, Andre Müller wrote:
> 1.) a short example for Python 3, but not exactly what they want.
>
> def square(numbers):
>     yield from sorted(n**2 for n in numbers)
>
> numberlist = [99, 4, 3, 5, 6, 7, 0]
> result = list(square(numberlist))

If you're going to use sorted(), why not simply return the list directly? This unnecessarily iterates through the list and builds a new one.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
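A version along the lines Chris suggests, returning the sorted list directly rather than wrapping it in a generator, might look like this (just a sketch, not necessarily the tutorial's intended answer):

def square(numbers):
    # square each number, then return a new sorted list in one step
    return sorted(n**2 for n in numbers)

print(square([2, 4, 5, 3, 1]))   # [1, 4, 9, 16, 25]

# the exercise's sample output is in descending order; pass reverse=True for that:
print(sorted((n**2 for n in [2, 4, 5, 3, 1]), reverse=True))   # [25, 16, 9, 4, 1]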
ANN: Wing Python IDE 6.0.5 released
Hi,

We've just released Wing 6.0.5, which simplifies remote externally launched debugging, improves remote development documentation, adds a How-To for remote web development, documents how to debug extension scripts written for the IDE, adds syntax highlighting for Markdown, solves some problems saving or debugging remote files, fixes several auto-editing issues, fixes git pull branch, speeds up Mercurial status updates, corrects documentation links for Python 2, and makes about 40 other minor improvements.

For details, see http://wingware.com/pub/wingide/6.0.5/CHANGELOG.txt

Wing 6 is the latest major release in Wingware's family of Python IDEs, including Wing Pro, Wing Personal, and Wing 101. Wing 6 adds many new features, introduces a new annual license option for Wing Pro, and makes Wing Personal free.

New Features

* Improved Multiple Selections: Quickly add selections and edit them all at once
* Easy Remote Development: Work seamlessly on remote Linux, OS X, and Raspberry Pi systems
* Debugging in the Python Shell: Reach breakpoints and exceptions in (and from) the Python Shell
* Recursive Debugging: Debug code invoked in the context of stack frames that are already being debugged
* PEP 484 and PEP 526 Type Hinting: Inform Wing's static analysis engine of types it cannot infer
* Support for Python 3.6 and Stackless 3.4: Use async and other new language features
* Optimized debugger: Run faster, particularly in multi-process and multi-threaded code
* Support for OS X full screen mode: Zoom to a virtual screen, with auto-hiding menu bar
* Added a new One Dark color palette: Enjoy the best dark display style yet
* Updated French and German localizations: Thanks to Jean Sanchez, Laurent Fasnacht, and Christoph Heitkamp

For a more detailed overview of new features see the release notice at http://wingware.com/news/2017-05-08

Annual Use License Option

Wing 6 adds the option of purchasing a lower-cost expiring annual license for Wing Pro. An annual license includes access to all available Wing Pro versions while it is valid, and then ceases to function until it is renewed. Pricing for annual licenses is US$ 179/user for Commercial Use and US$ 69/user for Non-Commercial Use.

Perpetual licenses for Wing Pro will continue to be available at the same pricing. The cost of extending Support+Upgrades subscriptions on Non-Commercial Use perpetual licenses for Wing Pro has also been dropped from US$ 89 to US$ 39 per user.

For details, see https://wingware.com/store/

Wing Personal is Free

Wing Personal is now free and no longer requires a license to run. It now also includes the Source Browser, PyLint, and OS Commands tools, and supports the scripting API and Perspectives.

However, Wing Personal does not include Wing Pro's advanced editing, debugging, testing and code management features, such as remote development, refactoring, find uses, version control, unit testing, interactive debug probe, multi-process and child process debugging, move program counter, conditional breakpoints, debug watch, framework-specific support (for Jupyter, Django, and others), find symbol in project, and other features.

Links

Release notice: http://wingware.com/news/2017-05-08
Downloads and Free Trial: http://wingware.com/downloads
Buy: http://wingware.com/store/purchase
Upgrade: https://wingware.com/store/upgrade

Questions? Don't hesitate to email us at supp...@wingware.com.
Thanks,

--
Stephan Deibel
Wingware | Python IDE
The Intelligent Development Environment for Python Programmers
wingware.com
--
https://mail.python.org/mailman/listinfo/python-list
Out of memory while reading excel file
Hello,

The following code, which uses openpyxl and numpy, fails to read a large Excel (xlsx) file. The file is 20 MB and contains 100K rows and 50 columns.

W = load_workbook(fname, read_only = True)
p = W.worksheets[0]
a = []
m = p.max_row
n = p.max_column
np.array([[i.value for i in j] for j in p.rows])

How can I fix that? I am stuck at this problem. For medium-sized files (16K rows and 50 columns) it is fine.

Regards,
Mahmood
--
https://mail.python.org/mailman/listinfo/python-list
Re: Out of memory while reading excel file
Mahmood Naderan via Python-list wrote:

> Hello,
>
> The following code, which uses openpyxl and numpy, fails to read a large
> Excel (xlsx) file. The file is 20 MB and contains 100K rows and 50 columns.
>
> W = load_workbook(fname, read_only = True)
> p = W.worksheets[0]
> a = []
> m = p.max_row
> n = p.max_column
> np.array([[i.value for i in j] for j in p.rows])
>
> How can I fix that? I am stuck at this problem. For medium-sized files
> (16K rows and 50 columns) it is fine.

The docs at

https://openpyxl.readthedocs.io/en/default/optimized.html#read-only-mode

promise "(near) constant memory consumption" for the sample script below:

from openpyxl import load_workbook
wb = load_workbook(filename='large_file.xlsx', read_only=True)
ws = wb['big_data']
for row in ws.rows:
    for cell in row:
        print(cell.value)

If you change only the file and worksheet name to your needs -- does the script run to completion in reasonable time (redirect stdout to /dev/null) and with reasonable memory usage? If it does you may be wasting memory elsewhere; otherwise you might need to convert the xlsx file to csv using your spreadsheet application before processing the data in Python.
--
https://mail.python.org/mailman/listinfo/python-list
Re: Out of memory while reading excel file
Thanks for your reply. The openpyxl part (reading the workbook) works fine. I printed some debug information and found that when it reaches the np.array, after some 10 seconds, the memory usage goes high.

So, I think numpy is unable to manage the memory.

Regards,
Mahmood

On Wednesday, May 10, 2017 7:25 PM, Peter Otten <__pete...@web.de> wrote:
> [full quote of the previous message trimmed]
--
https://mail.python.org/mailman/listinfo/python-list
Re: Out of memory while reading excel file
Mahmood Naderan via Python-list wrote:

> Thanks for your reply. The openpyxl part (reading the workbook) works
> fine. I printed some debug information and found that when it reaches the
> np.array, after some 10 seconds, the memory usage goes high.
>
> So, I think numpy is unable to manage the memory.

Hm, I think numpy is designed to manage huge arrays if you have enough RAM.

Anyway: are all values of the same type? Then the numpy array may be kept much smaller than in the general case (I think). You can also avoid the intermediate list of lists:

wb = load_workbook(filename='beta.xlsx', read_only=True)
ws = wb['alpha']
a = numpy.zeros((ws.max_row, ws.max_column), dtype=float)
for y, row in enumerate(ws.rows):
    a[y] = [cell.value for cell in row]
--
https://mail.python.org/mailman/listinfo/python-list
Re: Practice Python
On 10.05.2017 at 14:18, Chris Angelico wrote:
> On Wed, May 10, 2017 at 10:11 PM, Andre Müller wrote:
>> 1.) a short example for Python 3, but not exactly what they want.
>>
>> def square(numbers):
>>     yield from sorted(n**2 for n in numbers)
>>
>> numberlist = [99, 4, 3, 5, 6, 7, 0]
>> result = list(square(numberlist))
>
> If you're going to use sorted(), why not simply return the list
> directly? This unnecessarily iterates through the list and builds a
> new one.
>
> ChrisA

You're right. This version can handle infinite sequences:

def square(numbers):
    yield from (n**2 for n in numbers)

My earlier example can't do this: sorted() consumes the whole iterable, so there is no benefit from the generator.

*On topic*

Make a new empty list and iterate over your input list; inside the loop do the math operation and append the result to the new list. Return the new list.

Hint 1: The text above is longer than the resulting code.
Hint 2: Write this as a function (reusable code).
Hint 3: Write this as a list comprehension.

Greetings
Andre
--
https://mail.python.org/mailman/listinfo/python-list
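Following the hints above, a sketch of both the explicit-loop version and the list-comprehension version (shown here as an illustration of the hints, not as the definitive answer):

def squares_loop(numbers):
    result = []                 # new empty list
    for n in numbers:           # iterate over the input list
        result.append(n**2)     # do the math and append the result
    return result

def squares_comprehension(numbers):
    return [n**2 for n in numbers]   # the same thing as a list comprehension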
import docx error
Hi,

I am very new to Python and have only done simple things like >>> print("hello world"). I've really been looking forward to using Python. I bought two books, downloaded Python 3.6.1 (32 & 64 bit), and each time I try this:

>>> import docx

I get errors:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named docx

I read a thread somewhere saying it wasn't needed to do import docx anymore, but if I try doc = docx.document, I again get an error. I'm using Windows 7 (but it also happens on 10). I've searched for online help and seen nothing that I can follow; references to PIP are over my head. It's very frustrating.

Can somebody help? Really appreciate it.

Thanks
-John
--
https://mail.python.org/mailman/listinfo/python-list
Re: Practice Python
On Mon, May 8, 2017 at 3:50 AM, Lutz Horn wrote: > A strange way to publish code. Not if your goal is to drive traffic toward your YouTube channel. -- https://mail.python.org/mailman/listinfo/python-list
Re: import docx error
On 2017-05-10, RRS1 via Python-list wrote:
> I am very new to Python and have only done simple things like
> >>> print("hello world"). I've really been looking forward to using Python.
> I bought two books, downloaded Python 3.6.1 (32 & 64 bit), and each time I try this:
>
> >>> import docx
>
> I get errors:
>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ModuleNotFoundError: No module named docx

You need to install the docx module:

https://pypi.python.org/pypi/docx
https://pypi.python.org/pypi

--
Grant Edwards               grant.b.edwards at gmail.com
Yow! PARDON me, am I speaking ENGLISH?
--
https://mail.python.org/mailman/listinfo/python-list
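For anyone following along, a minimal sketch of the install-and-use steps, assuming the python-docx distribution (which is installed with pip and then imported as docx; the file name below is just a placeholder):

# From a Windows command prompt (not the Python shell):
#     py -m pip install python-docx

import docx

doc = docx.Document("example.docx")   # open an existing .docx file
for paragraph in doc.paragraphs:
    print(paragraph.text)             # print the text of each paragraph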
Re: Out of memory while reading excel file
Well, actually the cells are treated as strings and not integer or float numbers.

One way to overcome this is to get the number of rows, split the data into 4 or 5 arrays, and then process them. However, I was looking for a better solution. I read in some pages that large Excel files are on the order of a million rows; mine is about 100K. Currently, the task manager shows about 4GB of RAM usage while working with numpy.

Regards,
Mahmood

On Wed, 5/10/17, Peter Otten <__pete...@web.de> wrote:
> [full quote of the previous message trimmed]
--
https://mail.python.org/mailman/listinfo/python-list
Re: Out of memory while reading excel file
On 10-5-2017 17:12, Mahmood Naderan wrote:
> So, I think numpy is unable to manage the memory.

That assumption is very likely to be incorrect.

>> np.array([[i.value for i in j] for j in p.rows])

I think the problem is in the way you feed your excel data into the numpy array constructor. The code above builds many redundant python lists from the data you already have in memory, before even calling the numpy array function. I strongly suggest finding a proven piece of code to read large excel files, like the example from Peter Otten's reply.

Irmen
--
https://mail.python.org/mailman/listinfo/python-list
Re: Out of memory while reading excel file
Mahmood Naderan via Python-list wrote:

> Well, actually the cells are treated as strings and not integer or float
> numbers.

May I ask why you are using numpy when you are dealing with strings? If you provide a few details about what you are trying to achieve someone may be able to suggest a workable approach.

Back-of-the-envelope considerations: 4GB / 5E6 cells amounts to

>>> 2**32 / (10**5 * 50)
858.9934592

about 850 bytes per cell, with an overhead of

>>> sys.getsizeof("")
49

that would be 800 ascii chars, down to 200 chars in the worst case. If your strings are much smaller the problem lies elsewhere.
--
https://mail.python.org/mailman/listinfo/python-list
Iterating through a list. Upon condition I want to move on to next item in list
Good afternoon,

I have a list that I'm iterating through in Python. Each item in the list will have between 1-200 urls associated with it. My goal is to move on to the next item once all of the urls for the current item have been processed. I have a variable "associationsCount" that is counting the number of urls, and once it gets to 0 I want to move on to the next item in the list and do the same thing.

As my code stands right now, it counts through stateList[0] and then, when associationsCount gets to 0, it just stops. I know I need to add another couple lines of code to get it going again -- but I'm not no good at no Python. Please advise.

Here's an example of the code:

stateList = ['AL','AK','AR','AZ',etc]

associationsCount = 1

for state in stateList:
    while associationsCount > 0:
        print(counter)
        url = 'url?dp={0}&n=&s={1}&c=&z=&t1=&g='.format(counter, state)
        print(url)
        page = requests.get(url)
        tree = html.fromstring(page.text)
        s = s + counter
        counter += 1
        associations = tree.xpath('//td//strong/text()')
        associationsCount = len(associations)
        print(associationsCount)
        for x in associations:
            print(x)
            xUrl = 'https://bing.com/search?q={0}'.format(x)
            xPage = requests.get(xUrl)
            xTree = html.fromstring(xPage.text)
            try:
                link = xTree.xpath('//li[@class="b_algo"]//a/@href')[0]
                print(link)
                associationInfo = state, x, link
                associationInfoList.append(associationInfo)
            except:
                pass
--
https://mail.python.org/mailman/listinfo/python-list
Re: Out of memory while reading excel file
Hi,

I will try your code... meanwhile I have to say, as you pointed out earlier and as stated in the documents, numpy is designed to handle large arrays, and that is the reason I chose it. If there is a better option, please let me know.

Regards,
Mahmood

On Wed, 5/10/17, Peter Otten <__pete...@web.de> wrote:
> [full quote of the previous message trimmed]
--
https://mail.python.org/mailman/listinfo/python-list
Re: Iterating through a list. Upon condition I want to move on to next item in list
On 10/05/17 20:25, aaron.m.weisb...@gmail.com wrote:
> As my code stands right now, it counts through stateList[0] and then,
> when associationsCount gets to 0, it just stops. I know I need to add
> another couple lines of code to get it going again

It's very difficult to see what you're trying to do without any information on what the input format is etc, but from your description my *complete guess* at what might help is to look at putting this line:

    associationsCount = 1

*AFTER* this line:

    for state in stateList:

Regards, E.
--
https://mail.python.org/mailman/listinfo/python-list
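In other words, something along these lines — a sketch of the suggested restructuring, with the requests/lxml work from the original post replaced by a hypothetical fetch_associations() stub:

def fetch_associations(state, page):
    # hypothetical stand-in for the requests/lxml code in the original post;
    # returning an empty list here just lets the sketch terminate
    return []

stateList = ['AL', 'AK', 'AR', 'AZ']    # shortened example list

for state in stateList:
    associationsCount = 1    # reset for every state so the inner while loop runs again
    counter = 1              # assuming the per-state page counter should restart as well
    while associationsCount > 0:
        associations = fetch_associations(state, counter)
        associationsCount = len(associations)   # 0 ends this state and moves on to the next
        counter += 1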
Re: Iterating through a list. Upon condition I want to move on to next item in list
On 2017-05-10 20:25, aaron.m.weisb...@gmail.com wrote:
> Good afternoon,
>
> I have a list that I'm iterating through in Python. Each item in the list
> will have between 1-200 urls associated with it. My goal is to move on to
> the next item once all of its urls have been processed. I have a variable
> "associationsCount" that is counting the number of urls, and once it gets
> to 0 I want to move on to the next item in the list and do the same thing.
>
> As my code stands right now, it counts through stateList[0] and then, when
> associationsCount gets to 0, it just stops. I know I need to add another
> couple lines of code to get it going again -- but I'm not no good at no
> Python. Please advise.
>
> Here's an example of the code:
>
> stateList = ['AL','AK','AR','AZ',etc]
>
> associationsCount = 1
>
> for state in stateList:
>     while associationsCount > 0:

What is 'counter'? It only ever increases.

>         print(counter)

I wonder whether it's supposed to increase from, say, 1, for each state, to get each of a number of pages for each state. (Pages 1, 2, etc. for 'AL', then pages 1, 2, etc. for 'AK', and so on? If that's the case, then how do you know when you have seen the last page?)

>         url = 'url?dp={0}&n=&s={1}&c=&z=&t1=&g='.format(counter, state)
>         print(url)
>         page = requests.get(url)
>         tree = html.fromstring(page.text)
>         s = s + counter
>         counter += 1
>         associations = tree.xpath('//td//strong/text()')

You're not doing much with 'associationsCount'. Once set, you never change it, so the 'while' loop will repeat forever - unless it happens to be set to 0, in which case the 'while' loop will never run again!

>         associationsCount = len(associations)
>         print(associationsCount)
>         for x in associations:
>             print(x)
>             xUrl = 'https://bing.com/search?q={0}'.format(x)
>             xPage = requests.get(xUrl)
>             xTree = html.fromstring(xPage.text)
>             try:
>                 link = xTree.xpath('//li[@class="b_algo"]//a/@href')[0]
>                 print(link)
>                 associationInfo = state, x, link
>                 associationInfoList.append(associationInfo)

NEVER use a 'bare except' to suppress exceptions! It'll catch _all_ exceptions, even NameError (if you've misspelled a name, it'll catch that too). Catch only those exceptions that you're prepared to deal with.

>             except:
>                 pass
--
https://mail.python.org/mailman/listinfo/python-list
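Applied to the original post, the search step could be wrapped so that only the expected failures are caught — a sketch, where IndexError covers an empty xpath result and requests.RequestException covers network errors:

import requests
from lxml import html

def first_bing_link(query):
    # Return the first result link for a query, or None if an expected failure occurs.
    try:
        page = requests.get('https://bing.com/search?q={0}'.format(query))
        tree = html.fromstring(page.text)
        return tree.xpath('//li[@class="b_algo"]//a/@href')[0]
    except IndexError:
        return None    # the xpath matched nothing on this page
    except requests.RequestException:
        return None    # network problem; anything else still propagates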
Low level I/O: because I could
Sorry, but I'm just too proud of this.

Given that you have:

class RegisterLayout(ctypes.Structure):
    ...yadayadayada...

You can then:

fh = os.open('/dev/devicethingy', os.O_RDWR)
mm = mmap.mmap(fh, ctypes.sizeof(RegisterLayout))
registers = RegisterLayout.from_buffer(mm)

And it just works. Behaves exactly the same way memory-mapping that struct in C would. Sure the accesses take dict lookups, and that definitely slows you down a bit. If you REALLY really needed that speed you'd be writing C. But it works.

--
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order. See above to fix.
--
https://mail.python.org/mailman/listinfo/python-list
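For anyone trying this at home, a purely hypothetical RegisterLayout might be declared like this (the field names, widths, and packing below are invented for illustration; the real layout depends entirely on the hardware):

import ctypes

class RegisterLayout(ctypes.Structure):
    _pack_ = 1    # no padding between fields; match the device's register map exactly
    _fields_ = [
        ('control', ctypes.c_uint32),        # hypothetical control register
        ('status',  ctypes.c_uint32),        # hypothetical status register
        ('fifo',    ctypes.c_uint32 * 16),   # hypothetical 16-word FIFO window
    ]

With that in place, reading or assigning registers.control goes straight through the mmap'd buffer.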
Re: Iterating through a list. Upon condition I want to move on to next item in list
On Wed, May 10, 2017 at 1:46 PM, MRAB wrote:
> NEVER use a 'bare except' to suppress exceptions! It'll catch _all_
> exceptions, even NameError (if you've misspelled a name, it'll catch that
> too). Catch only those exceptions that you're prepared to deal with.

When writing many kinds of applications, this is great advice.

But is it good when writing REST API's? You don't want one buggy API call to bring down the whole service.

Or am I missing something?
--
https://mail.python.org/mailman/listinfo/python-list
Re: Iterating through a list. Upon condition I want to move on to next item in list
On 10/05/17 23:41, Dan Stromberg wrote:
> On Wed, May 10, 2017 at 1:46 PM, MRAB wrote:
>> NEVER use a 'bare except' to suppress exceptions! It'll catch _all_
>> exceptions, even NameError (if you've misspelled a name, it'll catch that
>> too). Catch only those exceptions that you're prepared to deal with.
>
> When writing many kinds of applications, this is great advice.
>
> But is it good when writing REST API's? You don't want one buggy API
> call to bring down the whole service.
>
> Or am I missing something?

Yes. You are missing that you now have no idea what the code will go on to do if it ignores the error. You now have a program that is running in an entirely unexpected and unaccounted-for state.

Perhaps it will be benign. But perhaps it will wipe an important part of the disk of your server, because a string that is expected to contain the name of a subdirectory to be deleted actually still contains the root directory of your application. That's a *much* harder way of "bringing down the whole service" than having to just re-start the server process (which should be an automatic thing anyway) ...

E.
--
https://mail.python.org/mailman/listinfo/python-list
Re: Iterating through a list. Upon condition I want to move on to next item in list
On Thu, May 11, 2017 at 8:41 AM, Dan Stromberg wrote:
> On Wed, May 10, 2017 at 1:46 PM, MRAB wrote:
>> NEVER use a 'bare except' to suppress exceptions! It'll catch _all_
>> exceptions, even NameError (if you've misspelled a name, it'll catch that
>> too). Catch only those exceptions that you're prepared to deal with.
>
> When writing many kinds of applications, this is great advice.
>
> But is it good when writing REST API's? You don't want one buggy API
> call to bring down the whole service.
>
> Or am I missing something?

1) You should usually use "except Exception:" rather than a bare except, even there.

2) When you create a boundary, *you log the exception*. That's what you're missing. Simply *suppressing* all exceptions is a terrible thing to do. It's common to have a web service boundary that says "except Exception as e: log_exception(e); return HTTP(500)", but that's not suppressing it, it's handling it.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
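A sketch of that boundary pattern, independent of any particular web framework (the handler and logger here are illustrative names, not a real API):

import logging

logger = logging.getLogger(__name__)

def call_api_handler(handler, request):
    # Boundary around one API call: the exception is handled by logging it
    # and translating it into a 500 response, not silently suppressed.
    try:
        return handler(request)
    except Exception:
        logger.exception('unhandled error in API handler')
        return 500, 'Internal Server Error'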
Re: Iterating through a list. Upon condition I want to move on to next item in list
On Wednesday, May 10, 2017 at 8:25:25 PM UTC+1, aaron.m@gmail.com wrote:

You've already had some answers, but are you after something like this?

for elem in mylist:
    if someThing(elem) is True:
        continue

Kindest regards.

Mark Lawrence.
--
https://mail.python.org/mailman/listinfo/python-list
Repeatedly crawl website every 1 min
Hi Everyone,

Thanks for stopping by. I am working on a feature to crawl website content every 1 min. I am curious to know if there is any good open source project for this specific scenario.

Specifically, I have many urls, and I want to maintain a thread pool so that each thread will repeatedly crawl content from a given url. It could be hundreds of threads at the same time.

Your help is greatly appreciated. ;)
--
https://mail.python.org/mailman/listinfo/python-list
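Scrapy is one widely used open-source crawling framework worth a look; failing that, a rough sketch with just the standard library plus requests might be (the URL list and 60-second interval are placeholders, and a real crawler would also want error handling, politeness delays, and robots.txt checks):

import time
import requests
from concurrent.futures import ThreadPoolExecutor

URLS = ['https://example.com/a', 'https://example.com/b']   # placeholder URLs

def fetch(url):
    response = requests.get(url, timeout=10)
    return url, len(response.text)    # do something useful with the content here

def crawl_forever(urls, interval=60):
    with ThreadPoolExecutor(max_workers=20) as pool:
        while True:
            started = time.monotonic()
            for url, size in pool.map(fetch, urls):   # fetch all urls in parallel
                print(url, size)
            # sleep whatever is left of the interval before the next round
            time.sleep(max(0, interval - (time.monotonic() - started)))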
Re: Low level I/O: because I could
On Wed, May 10, 2017 at 10:30 PM, Rob Gaddi wrote:
> Sorry, but I'm just too proud of this.
>
> Given that you have:
>
> class RegisterLayout(ctypes.Structure):
>     ...yadayadayada...
>
> You can then:
>
> fh = os.open('/dev/devicethingy', os.O_RDWR)
> mm = mmap.mmap(fh, ctypes.sizeof(RegisterLayout))
> registers = RegisterLayout.from_buffer(mm)
>
> And it just works. Behaves exactly the same way memory-mapping that struct
> in C would. Sure the accesses take dict lookups, and that definitely slows
> you down a bit. If you REALLY really needed that speed you'd be writing C.
> But it works.

To clarify, the dict lookup here is to bind the CField data descriptor from the class dict. It isn't using the instance dict.
--
https://mail.python.org/mailman/listinfo/python-list
Re: Out of memory while reading excel file
Hi, I am confused with that. If you say that numpy is not suitable for my case and may have large overhead, what is the alternative then? Do you mean that numpy is a good choice here while we can reduce its overhead? Regards, Mahmood -- https://mail.python.org/mailman/listinfo/python-list
Re: Out of memory while reading excel file
> a = numpy.zeros((ws.max_row, ws.max_column), dtype=float)
> for y, row in enumerate(ws.rows):
>     a[y] = [cell.value for cell in row]

Peter,

As I used this code, it gave me an error that it cannot convert string to float for the first cell. All cells are strings.

Regards,
Mahmood
--
https://mail.python.org/mailman/listinfo/python-list
Re: Out of memory while reading excel file
Hi,

I used the old-fashioned coding style to create a matrix and read/add the cells:

W = load_workbook(fname, read_only = True)
p = W.worksheets[0]
m = p.max_row
n = p.max_column
arr = np.empty((m, n), dtype=object)
for r in range(1, m):
    for c in range(1, n):
        d = p.cell(row=r, column=c)
        arr[r, c] = d.value

However, the operation is very slow. I printed the row number to see how things are going. It took 2 minutes to add 200 rows and about 10 minutes to add the next 200 rows.

Regards,
Mahmood
--
https://mail.python.org/mailman/listinfo/python-list
Re: Out of memory while reading excel file
Mahmood Naderan via Python-list wrote:

>> a = numpy.zeros((ws.max_row, ws.max_column), dtype=float)
>> for y, row in enumerate(ws.rows):
>>     a[y] = [cell.value for cell in row]
>
> Peter,
>
> As I used this code, it gave me an error that it cannot convert string to
> float for the first cell. All cells are strings.

For string values you have to adapt the dtype:

a = numpy.empty((ws.max_row, ws.max_column), dtype=object)
for y, row in enumerate(ws.rows):
    a[y] = [cell.value for cell in row]

If that completes and is fast enough, fine. But again, for non-numeric data numpy doesn't make much sense IMHO -- if you tell us what you're up to we may be able to suggest a better approach.
--
https://mail.python.org/mailman/listinfo/python-list