> On 18 Jun 2019, at 18:59, sebb <seb...@gmail.com> wrote:
>
> On Tue, 18 Jun 2019 at 16:01, Alex Herbert <alex.d.herb...@gmail.com> wrote:
>>
>>
>> On 18/06/2019 15:38, sebb wrote:
>>> On Tue, 18 Jun 2019 at 12:58, Alex Herbert <alex.d.herb...@gmail.com> wrote:
>>>>
>>>> On 18/06/2019 11:00, sebb wrote:
>>>>> On Tue, 18 Jun 2019 at 10:40, Alex Herbert <alex.d.herb...@gmail.com>
>>>>> wrote:
>>>>>> On 18/06/2019 09:55, sebb wrote:
>>>>>>> On Tue, 18 Jun 2019 at 08:15, Julian Reschke <julian.resc...@gmx.de>
>>>>>>> wrote:
>>>>>>>> On 17.06.2019 23:26, sebb wrote:
>>>>>>>>> Most of the files in my clone of codec have LF endings, however a few
>>>>>>>>> are CRLF:
>>>>>>>>>
>>>>>>>>> ./README.md
>>>>>>>>> ./src/assembly/bin.xml
>>>>>>>>> ./src/assembly/src.xml
>>>>>>>>> ./src/changes/changes.xml
>>>>>>>>> ./src/main/java/org/apache/commons/codec/cli/Digest.java
>>>>>>>>> ./src/main/java/org/apache/commons/codec/language/DaitchMokotoffSoundex.java
>>>>>>>>> ./src/main/resources/org/apache/commons/codec/language/bm/lang.txt
>>>>>>>>> ./src/test/java/org/apache/commons/codec/digest/HmacAlgorithmsTest.java
>>>>>>>>> ./src/test/java/org/apache/commons/codec/digest/MessageDigestAlgorithmsTest.java
>>>>>>>>> ./src/test/java/org/apache/commons/codec/digest/PureJavaCrc32Test.java
>>>>>>>>> ./src/test/java/org/apache/commons/codec/language/ColognePhoneticTest.java
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This causes spurious differences when the files are updated.
>>>>>>>>>
>>>>>>>>> Can these files be easily fixed without causing huge diffs to be
>>>>>>>>> generated?
>>>>>>>>>
>>>>>>>>> Also, is there any way to prevent such files being committed to the
>>>>>>>>> repo?
>>>>>>>>>
>>>>>>>>> S.
>>>>>>>> If svn:eol-style is set to "native", it shouldn't matter. I think this
>>>>>>>> can be defaulted for newly added files.
>>>>>>> Thanks, but this is Git, not SVN.
>>>>>>>
>>>>>>>> In Jackrabbit, I regularly run a script to spot new files missing the
>>>>>>>> property.
>>>>>>> Are you willing to share the script?
>>>>>> This was recently a problem in [statistics]. It was fixed using a
>>>>>> .gitattributes file [1] containing:
>>>>>>
>>>>>> * text=auto
>>>>>>
>>>>>> You can fix all the existing files following the steps detailed on the
>>>>>> git documentation:
>>>>>>
>>>>>> $ echo "* text=auto" >.gitattributes
>>>>>>
>>>>>> $ git add --renormalize .
>>>>>>
>>>>>> $ git status # Show files that will be normalized
>>>>>>
>>>>>> $ git commit -m "Introduce end-of-line normalization"
>>>>> Thanks, though that did not pick up two of the files.
>>>> Oh dear.
>>>>
>>>> When I tried this locally it misses from your list:
>>>>
>>>> ./src/changes/changes.xml
>>>> ./src/test/java/org/apache/commons/codec/language/ColognePhoneticTest.java
>>>>
>>>> Those files are also ignored on my machine (linux) by dos2unix. They are
>>>> not found by any of the following [1]:
>>>>
>>>> $ grep -IUr --color "^M" src
>>>> $ find src -type f | xargs file | grep CRLF
>>>> $ grep -IUlr $'\r' src
>>>>
>>>> So are they a problem?
>>> I don't know if this causes an issue.
>>>
>>> I used file on macOS to detect the problem files.
>>> Also my editor (BBEdit) shows the EOL as CRLF for them.
>
> I've since recloned the repo, and those 2 files don't have CRLF endings.
> Something must have been confused in my workspace.
>
>> I am currently on linux. I don't have any settings for line endings
>> configured for git [1], i.e. the core.autocrlf property. So if I am
>> correct what I pulled from the master repo is unchanged on checkout. And
>> the two spurious files seem OK for me and 9 require updating.
>>
>> I can try it again on MacOS later. Maybe something is different there
>> and this is very platform specific.
>>
>>>
>>>>> However it looks like the commit message will show huge diffs for each
>>>>> file.
>>>>>
>>>>> Is that unavoidable?
>>>> The diff is done line-by-line. So if each line changes then it is a big
>>>> diff. I don't know a way around that.
>>>>
>>>> The alternative would be to leave the .gitattributes file and not commit
>>>> the normalised files. The next time someone commits each of the
>>>> offending files the normalisation will occur as git sends it back to the
>>>> repo. So this just delays the big diff. At least if it is all done at once
>>>> then it makes more sense and avoids the issue of a big diff occurring
>>>> some time in the future and someone has to figure it out all over again.
>>> Agreed it's best done all at once.
>>>
>>> I remember fixing EOLs on SVN but as I recall it did not create the
>>> huge diffs so long as it was done on the appropriate OS.
>>> Maybe doing it on Windows won't cause the diffs to be created? I may
>>> be able to try that later.
>>
>> Since windows is the culprit for the CRLF endings it makes sense to try.
>
> Using Windows does not seem to help; git show shows all lines as different.

It was worth a try.

I saw the EOL commit. Are you going to commit the .gitattributes file as
well?

I’m indifferent on this. It is recommended for any project which expects
contributions from multiple platforms, and it was done on [statistics].
On the one hand it will stop anyone committing new files with CRLF. On
the other hand, Windows users of git should set their core.autocrlf
property globally to prevent this anyway.

>
>> In this case if you create the .gitattributes file (or configure
>> core.autocrlf) git will know to send the file back to the repo
>> normalised. So you may have to edit each of the offending files with a
>> trivial change to force a commit. The diff should then be the trivial
>> change you made and not the big diff with all the lines.
>>
>> I don't know what happens on the server side. If you do it in a branch
>> in Github you could compare the two side by side. Either it will show
>> the trivial change or the big diff because on the server side the CRLF
>> was changed and locally (on windows) it was not.
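
For anyone wanting to try the normalization steps quoted above without
touching a real checkout, here is a minimal sketch that runs them in a
throwaway repository. The file names, identity and commit messages are
made up for the demo, and it assumes a reasonably recent Git (2.16 or
later, I believe) for `git add --renormalize` and `--ignore-cr-at-eol`.
The last part is an assumption on my side rather than something tried in
this thread: it compares how many changed lines the normalization commit
shows with and without ignoring a carriage return at end of line.

```shell
#!/bin/sh
# Sketch only: runs in a throwaway repository, not in commons-codec.
# File names, identity and commit messages are made up for the demo.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example Dev"
git config user.email "dev@example.invalid"
git config core.autocrlf false   # ignore any global setting for this demo

# One file with CRLF endings, one with LF endings.
printf 'line one\r\nline two\r\n' > crlf.txt
printf 'line one\nline two\n' > lf.txt
git add crlf.txt lf.txt
git commit -q -m "Add files with mixed line endings"

# "i/crlf" in the --eol listing marks files stored with CRLF in the index.
before=$(git ls-files --eol | grep -c '^i/crlf' || true)

echo "* text=auto" > .gitattributes
git add --renormalize .          # re-stage tracked files with LF endings
git add .gitattributes           # --renormalize only touches tracked files
git commit -q -m "Introduce end-of-line normalization"

after=$(git ls-files --eol | grep -c '^i/crlf' || true)

# Count added/removed lines in the normalization commit, first as a plain
# diff, then ignoring a carriage return at end of line.
plain=$(git diff HEAD~1 HEAD -- crlf.txt | grep -c '^[-+][^-+]' || true)
ignoring=$(git diff --ignore-cr-at-eol HEAD~1 HEAD -- crlf.txt \
    | grep -c '^[-+][^-+]' || true)

echo "CRLF files in index: before=$before after=$after"
echo "changed lines: plain=$plain ignoring-cr=$ignoring"
```

The commit itself still records every line as changed, so this does not
avoid the big diff. It only suggests that a reviewer could pass
`--ignore-cr-at-eol` (or `-w`) to `git diff` or `git show` to check that
nothing other than the line endings moved.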
>>
>>>
>> [1] https://help.github.com/en/articles/dealing-with-line-endings
>>>> [1]
>>>> https://stackoverflow.com/questions/73833/how-do-you-search-for-files-containing-dos-line-endings-crlf-with-grep-on-linu
>>>>
>>>>>> [1] https://git-scm.com/docs/gitattributes
>>>>>>
>>>>>>
>>>>>>>> Best regards, Julian
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
>>>>>>> For additional commands, e-mail: dev-h...@commons.apache.org