Hi,
For the MOOC I'm working on an SRT to VTT converter. It turns
1
00:00:07,040 --> 00:00:10,440
Hello. This week,
we'll get to the heart of the matter,

2
00:00:10,600 --> 00:00:12,160
about syntax especially.
into
WEBVTT

00:00:07.040 --> 00:00:10.440 align:middle
Hello. This week,
we'll get to the heart of the matter,

00:00:10.600 --> 00:00:12.160 align:middle
about syntax especially.
It works more or less. Now I face the problem that the files people
provided me have different encodings (I guess), because when I do not
normalize the input (for example with withLinuxLineEndings) I get some
CRs after the conversion, even though I only copy the file content and
all the line endings I output are LF (or can be customized).
I cannot apply "garbage in, garbage out" because the files should work.
So I thought that I should first convert the string I read using
withLinuxLineEndings so that every CR and CRLF is converted into LF. But
since the files have different encodings, I then end up with the
opposite issue: too many LFs.
Does any of you have an idea how to handle this?
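To make the intended normalization concrete, here is a minimal sketch in Python (not the Pharo code of the converter; the names normalize_line_endings and decode_and_normalize are hypothetical). It assumes the input is UTF-8, with or without a BOM, and it replaces CRLF, lone CR and lone LF in a single pass. Handling CR and LF in separate passes is one possible way to end up with doubled line breaks, although that may not be the cause here.

```python
import re

# CRLF is listed first so that it is matched as ONE line break, not as CR then LF.
_LINE_BREAK = re.compile(r"\r\n|\r|\n")


def normalize_line_endings(text: str, newline: str = "\n") -> str:
    """Replace every CRLF, lone CR, and lone LF with `newline` in one pass."""
    # A function is used as the replacement so `newline` is inserted literally.
    return _LINE_BREAK.sub(lambda _match: newline, text)


def decode_and_normalize(data: bytes, encoding: str = "utf-8-sig") -> str:
    """Decode raw bytes, then normalize line endings to LF.

    Assumption: the file is UTF-8. The "utf-8-sig" codec accepts input
    with or without a leading BOM and drops the BOM if it is present.
    """
    return normalize_line_endings(data.decode(encoding))
```

Working on the raw bytes (or on a string read without any newline translation) matters here: if the reading layer already converts some line endings, a second conversion on top of it can change the number of line breaks.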
I did not find a way to know the encoding of a file (I do not mean the
BOM, just the line-ending convention).
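One way to find out which line-ending convention a file uses is simply to count the candidates in its raw bytes. The sketch below is in Python and only illustrates the idea; count_line_endings is a hypothetical name, and it assumes an ASCII-compatible encoding such as UTF-8 or Latin-1 (it would miscount UTF-16 files).

```python
import re
from collections import Counter

# CRLF is listed first so that it is counted once, not as CR plus LF.
_LINE_BREAK = re.compile(rb"\r\n|\r|\n")
_NAMES = {b"\r\n": "CRLF", b"\r": "CR", b"\n": "LF"}


def count_line_endings(data: bytes) -> Counter:
    """Count CRLF, lone CR, and lone LF occurrences in raw file bytes."""
    return Counter(_NAMES[match] for match in _LINE_BREAK.findall(data))
```

For example, count_line_endings(open(path, "rb").read()) on a suspicious file (path being whatever file is under inspection) shows whether it is purely CRLF, purely LF, or a mix. A mix usually means the file went through several tools, which is the case where a single-pass normalization is most useful.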
Stef