hard disk activity

2006-02-13 Thread VSmirk
I have a task that involves knowing when a file has changed.  But while
for small files this is an easy enough task, checking the modification
dates, or doing a compare on the contents, I need to be able to do this
for very large files.
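
For reference, the cheap check described above (modification date, plus file size) can be sketched as follows. This is only an illustration: the names `file_signature` and `has_changed` are made up here, and the check reads no file contents, so it says that a file changed but not where.

```python
import os

def file_signature(path):
    """Return (size, mtime in nanoseconds) for path without reading its contents."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)

def has_changed(path, previous_signature):
    """True if the size or modification time differs from an earlier signature."""
    return file_signature(path) != previous_signature
```

The cost is the same for a 1 KB file and a 1 GB file, which is why it works as a first filter before any content comparison.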

Is there anything already available in Python that will allow me to
check the hard-disk itself, or that can make my routines aware when a
disk write has occurred?

Thanks for any help,

V

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: hard disk activity

2006-02-13 Thread VSmirk
I'm working primarily on Windows XP, but my solution needs to be cross
platform.

The problem is that I need more than the fact that a file has been
modified.  I need to know what has been modified in that file.

I need to synchronize the file to a remote folder, and my current
solution, which simply copies the file if a date comparison or a
content comparison shows a difference, becomes a bit unmanageable for
very large files.
Some of the files I'm working with are hundreds of MB in size, or
larger.

So I need to skip copying a hundred MB file that has had only a few
bytes changed and instead identify which few bytes have changed and
where those changes are.  I was thinking having a module that worked
below the file system level, at the device level, might be a place to
look for a solution.



Re: hard disk activity

2006-02-13 Thread VSmirk
I agree with you wholeheartedly, but the large files are part of the
business requirements.

Thanks for the link.  I'll look into it.

V



Re: hard disk activity

2006-02-13 Thread VSmirk
Pretty much, yeah.  Except I need to diff a pair of files that exist on
opposite ends of a network, without causing the entire contents of the
file to be transferred over that network.

Now, I have the option of doing this:  If I am able to determine that
(for instance) bytes 10468 to 1473 in a 849308 byte file are the only
segment that has changed, I can send that range over the network and
insert it into the right place; and then, with a downtime overnight, I
can do a file-copy synchronization to ensure there were no errors
during the day.  (I'm reading this and wondering if it even makes
sense, sorry if it doesn't.)
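
A minimal sketch of the "insert it into the right place" step on the receiving side, assuming the changed range has the same length as the bytes it replaces (true inserts or deletions would shift everything after them and need more than this). The name `apply_patch` is hypothetical.

```python
def apply_patch(path, offset, data):
    """Overwrite len(data) bytes starting at `offset` in an existing file, in place."""
    # "r+b" opens for reading and writing without truncating the file.
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)
```

Only `offset` and `data` need to cross the network; the rest of the remote file is left untouched.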

But the trick in my mind is figuring out which specific bytes have been
written to disk.  That's why I was thinking device level.  Am I going
to have to work in C++ or Assembler for something like this?

Sorry if this sounds like a newbie question.  I've been working with
Python long enough to know that someone out there has usually already
solved even a really obscure problem.  So I thought I'd take a stab at
it.

Thanks everyone for the great links. 

V



Re: hard disk activity

2006-02-13 Thread VSmirk
Awesome!!!  I got as far as segmenting the large file on my own, and I
ran out of ideas.  I kind of thought about checksum, but I never put
the two together.
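
Putting the two together (segments plus checksums) might look like the sketch below. It assumes fixed-size blocks and SHA-256 from the standard library; the 64 KiB block size and the names `block_digests` and `changed_blocks` are illustrative choices, not anything specified in this thread.

```python
import hashlib

BLOCK_SIZE = 64 * 1024  # assumed block size; tune for the files involved

def block_digests(path, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of the file."""
    digests = []
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            digests.append(hashlib.sha256(block).digest())
    return digests

def changed_blocks(local, remote):
    """Indices of blocks whose digests differ, including blocks
    that exist on only one side."""
    longest = max(len(local), len(remote))
    return [i for i in range(longest)
            if i >= len(local) or i >= len(remote) or local[i] != remote[i]]
```

Each side computes its own digest list, only the digests are exchanged, and only the blocks listed by `changed_blocks` have to be sent. Note that a single inserted or deleted byte shifts every later block, so this fixed-offset version only stays cheap for in-place overwrites.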

Thanks.  You've helped a lot.

V



Re: hard disk activity

2006-02-13 Thread VSmirk
Thanks for the heads-up.  I was so giddy with the simplicity of the
solution, I stopped trying to poke holes in it.

I agree with your philosophy of not "reinventing the wheel", but I did
notice two things:  First, the link you provided claims in the features
section that rsync is for *nix systems, so I am assuming I'll need a
port of it for Windows systems; however, looking at a Python rsync
module I found, it looks like it's just doing file-copy (which I have
already solved).

So I'm wondering if you know off-hand which Windows port does this
checksum validation you outlined.



Re: hard disk activity

2006-02-13 Thread VSmirk
Of course that was the first thing I tried.

But what I meant to say was that at least one port, the python one,
didn't have the checksum validation that Paul was talking about, so I
was wondering if he knew of one that was faithful to the unix port of
it.

Thanks much for the links, though, and all the help.



Re: hard disk activity

2006-02-14 Thread VSmirk
Terry,

Yeah, I was sketching out a scenario much like that.  It does break
things down pretty well, and that gets my file sync scenario up to much
larger files.  Even if many changes are made to a file, you can shift
the checksum window one byte at a time (that is, [abcd]ef, a[bcde]f,
ab[cdef]), keeping track of how many bytes you have shifted, until a
checksum matches again.  From that point you can continue up (or down)
comparing only whole-block checksums, without all the overhead.
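
The window-shifting idea can be sketched with a weak checksum that is updated in constant time per one-byte shift, confirmed by a strong hash on each candidate hit. This is the general rolling-checksum technique that rsync is known for, not a copy of rsync's actual code; the modulus and the names `weak_checksum`, `roll` and `find_block_matches` are assumptions made for the example, and the data is held in memory for simplicity.

```python
import hashlib

MOD = 1 << 16  # assumed modulus for the weak checksum

def weak_checksum(block):
    """Adler-style pair (a, b) computed over a bytes block."""
    n = len(block)
    a = sum(block) % MOD
    b = sum((n - i) * x for i, x in enumerate(block)) % MOD
    return a, b

def roll(a, b, out_byte, in_byte, n):
    """Slide an n-byte window forward by one byte in O(1)."""
    a = (a - out_byte + in_byte) % MOD
    b = (b - n * out_byte + a) % MOD
    return a, b

def find_block_matches(data, known_blocks, n):
    """Yield (offset, block_index) wherever an n-byte window of `data`
    equals one of `known_blocks`, at any byte offset."""
    table = {}
    for index, block in enumerate(known_blocks):
        strong = hashlib.sha256(block).digest()
        table.setdefault(weak_checksum(block), []).append((strong, index))
    if len(data) < n:
        return
    a, b = weak_checksum(data[:n])
    offset = 0
    while True:
        candidates = table.get((a, b))
        if candidates:
            # Weak checksums can collide, so confirm with the strong hash.
            strong = hashlib.sha256(data[offset:offset + n]).digest()
            for digest, index in candidates:
                if digest == strong:
                    yield offset, index
                    break
        if offset + n >= len(data):
            break
        a, b = roll(a, b, data[offset], data[offset + n], n)
        offset += 1
```

For example, `list(find_block_matches(b"Xabcdefgh", [b"abcd", b"efgh"], 4))` finds both old blocks again, one byte later than before, so only the inserted byte would need to be sent. The overhead question is mostly the per-byte `roll` call; the strong hash runs only when the weak checksum already matches.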

The question in my mind that I will have to test is how much overhead
this causes.

One of the business rules underlying this task is to work with files
that are being continuously written to, say by logging systems or
database servers.  This brings with it some obvious problems of file
access, but even in cases where you don't have file access issues, I am
very concerned about race conditions where one of the already-handled
blocks of data is written to, so that the synced copy on the remote
system no longer represents a true image of the local file.

This is one of the reasons I was looking into a device-level solution
that would let me know when a hard disk write had occurred.  One
colleague suggested I was going to have to write assembler to do this,
and I may have to ultimately just use the solutions described here for
files that don't have locking and race-condition issues.

Regardless, it's a fun project, and I have to say this list is one of
the more polite lists I've been involved with.  Thanks!

V



Re: Python compilation ??

2007-07-03 Thread VSmirk
On Jul 3, 10:42 pm, Frank Swarbrick <[EMAIL PROTECTED]> wrote:
> John Nagle wrote:
> > Evan Klitzke wrote:
> >> On 7/2/07, Cathy Murphy <[EMAIL PROTECTED]> wrote:
>
> >>> Is python a compiler language or interpreted language. If it is
> >>> interpreter, then why do we have to compile it?
>
> >   Iron Python compiles to Microsoft's byte code as used by their
> > ".NET" common language runtime.  This is then compiled to machine
> > code by a just-in-time compiler.
>
> Does Iron Python compile to free-standing executables, or is there an
> Iron Python interpreter that is always necessary?
>
> Frank

Here's what I understand:

As with Java, which compiles to ByteCode which only runs with the Java
Virtual Machine, IronPython compiles to MSIL code, which will run on
any machine which has the .Net framework installed.

However, MS set up .Net so that a compiled program appears in the file
system as a .exe, so in practice an IronPython program should be no
different from any other application.  The trick is to handle whether
or not the user has the correct (or any) .Net framework, which is
easily solved by adding an installer project which will install the
required files.

This answer is not based on my personal experience with IronPython,
which I haven't personally played with much yet, but is based on my
day-job experience with C# and VB.Net, both of which compile to MSIL
code and work as I describe, and with Python after hours and my
knowledge of how Jython and IronPython work in theory.

I would be very interested in learning if IronPython is not
implemented as I describe.

VSmirk



Re: Looking for a good Python environment

2007-11-07 Thread VSmirk
On Nov 7, 9:16 am, Gerhard Häring <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
> > Hey, I'm looking for a good Python environment. That is, at least an
> > editor and a debugger, and it should run on Windows. Does anyone have
> > any idea?
>
> I like ERIC. You can get it at
> http://www.die-offenbachs.de/eric/eric4-download.html
>
> Or just download and install PyQt4, which includes it:
>
> http://www.riverbankcomputing.com/Downloads/PyQt4/GPL/PyQt-Py2.5-gpl-...
>
> There's also a list of Python IDEs on the Python wiki:
> http://wiki.python.org/moin/IntegratedDevelopmentEnvironments
>
> -- Gerhard

WingIDE all the way.  After trying a number of development
environments, Wing was the first I used that actually allowed me to be
productive.

They offer a free version, but it's worth getting the professional
version, too.

http://www.wingide.com/

VSmirk


Re: Looking for a good Python environment

2007-11-12 Thread VSmirk
On Nov 11, 4:39 pm, Paul Rubin wrote:
> Russell Warren <[EMAIL PROTECTED]> writes:
> > Wing now has multi-threaded debugging.
>
> Cool, is it windows-only?  I'm using Linux.
>
> > A quick look at the current state of SPE shows that it now has multi-
> > threaded debugging via WinPDB (what I used to use for debugging thread
> > issues).  Interesting.  Worth a look to see if it is integrated well.
>
> Same issue: this also sounds windows-specific.  Thanks though.

Wing is actually not Windows-specific.  It runs on Linux as well, and I
believe a number of users are also on Mac OS X.

The multi-threaded debugging is a new feature in its latest release,
but I have heard of no platform-specific issues related to it.
