On Sun, Jun 20, 2010 at 09:53:29PM +0200, Daniel Näslund wrote: > Hi stsp! > > Attaching a patch showing a suggestion on how to split up > patch_target_t. Maybe you have some insights on this, before I rush out > and start grooking around and making changes to your design. > > What 'svn patch' does > ---------------------- > 1) Match hunks > 2) Apply hunks to a stream backed up by a tmp file > 3) Install. copy the tmp file to the target location > 4) Notify the user on what has happened, e.g. fuzz, offset and rejects > > The problem > ------------- > Each property is essentially like it's own text file. 'svn patch' must > be generic enough to be able to perform the above actions on both > properties and texts. > > Alternatives > -------------- > * Simplify. Don't use the unidiff format but just copy the content > straight off and apply it.
I don't think I understand the above. What do you mean? > * Duplicate. Create special code for handling props, e.g. have both > match_hunk() and match_property_hunk() and so on. > > Proposed solution > ------------------- > Use patch_target_t as a container for sub-patch_targets that can be either > file-text or a property. Store tree change info needed for step 3) > installing and step 4) notifying. Your description is quite low-level (detailing new fields in existing structs etc.) which makes it a bit hard to follow. Let's take a step back. What is the general idea of your approach? I'll try making some high-levelI suggestions about how to approach the problem. Please comment on them (I am most likely not seeing the whole picture) and we'll work something out. parse_next_hunk() currently puts both normal and property hunks in the same list (patch->hunks). Have you considered splitting this into two separate lists, e.g. patch->hunks and patch->property_hunks? property_hunks can be a hashtable with prop_name as keys and arrays of hunk_t as value: {prop_name, [hunk1, hunk2, hunk3, ...]}. Or it can be a list of structs like { prop_name; array of hunks; } if that is easier to work with in the caller. This should provide much cleaner separation, because now you can leave the original hunk handling code alone, and add extra steps that handle the property cases. Regarding parse_next_hunk(), consider splitting this up into two functions: parse_hunk() and parse_property_hunk(). The first one attempts to find a normal "@@" hunk at the current file offset, and returns a NULL *hunk on parsing failure. If it fails (the hunk returned is NULL), rewind the file to the seek position where parsing a normal hunk failed, and parse_property_hunk() to attempt parsing a "##" property hunk. These functions could probably share some code, but you should keep them independent for now so we can see how much they really differ. In the patch code, you can then add property patching as a separate step. Currently, we do something like the following in apply_one_patch(): for hunk in hunks: match(hunk) # gather some additional meta data on hunk such as actual line # offset we'll be using hunk_info = get_info(hunk) target.hunks.append(hunk_info) for hunk_info in target.hunks: if hunk.rejected: reject(hunk) else apply(hunk) Rejection and application happens on temporary files, we haven't modified the working copy yet. It's clear that you'll need similar steps for properties. Something like: for (prop, hunks) in property_hunks: target.properties.append(prop) for hunk in hunks: match(hunk) # gather some additional meta data on hunk such as actual line # offset we'll be using hunk_info = get_info(hunk) target.properties.prop.hunks.append(hunk_info) for prop in target.properties: for hunk in prop.hunks: if hunk.rejected: reject(hunk) else apply(hunk) This would also operate entirely on temporary files. Once we're here, we can worry about applying the changes to the WC. Again, I think it would help if you kept all property steps separate from the existing code in the beginning. Just add new steps to apply_one_patch() as needed, but don't modify existing code. We can worry about optimizing the integration of the new code with the existing code later. It's more important to focus on the problem itself rather than worrying how to best integrate it with the existing code. Just keep in mind to release all resources allocated for a target before processing the next target. This is really important because otherwise we use too much memory and/or file descriptors with large patches. Stefan