The difference, from my modest knowledge of the scheduler, is the following.
The scenario:

- Users have to be able to import a CSV into one of the tables. CSVs may be big: 8 MB, 40k rows in the worst case.
- Users may do this whenever they want, so concurrency will occur.

Implementation problem: I cannot process the rows in my controller, because responses longer than 5 minutes are timed out by the hosting service (this is PythonAnywhere). OK, that makes sense, so I launch a background process.

Scheduler problems:

- Enabling the scheduler adds overhead to the database; a lot, I might say.
- I have to run the scheduler on the server manually. This is bad because after a few days it becomes unresponsive and I have to kill it and restart it, again manually. PythonAnywhere also stops their servers for 30 minutes once a month for maintenance, so I have to watch for that as well.
- The scheduler writes to the database every few seconds for the workers' heartbeats, in order to know how many workers are available. The more workers, the more overhead.
- Since many users should be able to import at the same time, I have to declare multiple workers beforehand. Even if no one is importing anything, the db is continuously doing I/O operations. I don't know how many users will be importing at the same time, so I declare 3 or 4 workers just in case. So only 4 users can import data at the same time, and the db is written to 4 times every few seconds, all the time.

On top of all this, I have to show a progress bar for each process. So I write the task's run output to indicate the percentage done every 5%; this way I am not writing the percentage every time a row is inserted. Still, a lot more overhead on the db. Not only that: the client's browser has to ask the server for the percentage, so the server has to query the db as well, say every 5 seconds (the progress bar update interval), multiplied by the number of clients importing at the same time.

To sum up, while I am performing an intensive db operation:

- The scheduler is writing heartbeats every few seconds for each available worker.
- Running tasks are writing percentages every 5%.
- The browser is asking the db every 5 seconds for each task's progress.

No wonder it is slower in comparison.

With threads, all the statistics are in memory only while the import is running, and I am not limited to a fixed number of users importing at the same time: threads are launched on demand instead of running all the time.

The times were more or less these, with the same importing function of course:

- Thread with DAL: ~4 min
- Thread with mysql.connector: 2-3 min
- Scheduler: 20-30+ min

Of course I'll stick with the DAL. The scheduler might be good for mailing or maintenance operations, but for importing bulk data, not so much.
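For reference, here is a minimal sketch of what I mean by the thread approach. The names (PROGRESS, start_import, import_csv) are my own illustrations, not web2py APIs, and insert_row stands in for whatever does the actual insert (e.g. db.mytable.insert with the DAL, or a mysql.connector cursor). Progress lives in a plain dict, so the browser's polling never touches the database:

```python
import csv
import io
import threading
import uuid

# In-memory progress table: task_id -> percent complete.
# It exists only while the process is running, so there is no
# heartbeat or bookkeeping traffic hitting the database.
PROGRESS = {}

def import_csv(task_id, csv_file, insert_row):
    """Insert every row, but record progress only every 5%
    instead of on each insert."""
    rows = list(csv.reader(csv_file))
    total = len(rows) or 1          # avoid division by zero on empty files
    last_reported = 0
    for i, row in enumerate(rows, start=1):
        insert_row(row)
        percent = i * 100 // total
        if percent >= last_reported + 5:
            PROGRESS[task_id] = percent
            last_reported = percent
    PROGRESS[task_id] = 100

def start_import(csv_file, insert_row):
    """Launch the import on a daemon thread and return immediately,
    so the controller responds long before the 5-minute timeout."""
    task_id = str(uuid.uuid4())
    PROGRESS[task_id] = 0
    t = threading.Thread(target=import_csv,
                         args=(task_id, csv_file, insert_row),
                         daemon=True)
    t.start()
    return task_id, t

def progress(task_id):
    """What the browser polls every ~5 seconds: a dict lookup,
    no database query involved."""
    return PROGRESS.get(task_id, 0)
```

In a real controller you would return task_id to the client, have an Ajax action call progress(task_id) on the polling interval, and commit the db connection appropriately inside the thread; the sketch leaves those web2py-specific details out.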