Thank you, yes, it would be great if this could be extended to use an
index.
In our case, we're reading files from Amazon S3. S3 does offer the option
to request only a chunk out of a file, and any efficient solution would
need to use this rather than downloading the file multiple times.
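As a rough illustration of the chunked read described above, here is a minimal sketch. It assumes a boto3-style client whose `get_object` call accepts a `Range` argument. The helper names `range_header` and `read_chunk` are made up for this example.

```python
def range_header(start, length):
    """Build an HTTP Range header value covering `length` bytes from `start`."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive on both ends.
    return f"bytes={start}-{start + length - 1}"


def read_chunk(s3_client, bucket, key, start, length):
    """Fetch only bytes [start, start + length) of an S3 object.

    `s3_client` is assumed to behave like boto3.client("s3").
    """
    response = s3_client.get_object(
        Bucket=bucket, Key=key, Range=range_header(start, length)
    )
    return response["Body"].read()
```

With boto3 installed, a call might look like `read_chunk(boto3.client("s3"), "my-bucket", "data/file.vcf.gz", 65536, 4096)`, where the bucket and key are placeholders. Each call transfers only the requested bytes, so the file is never downloaded in full.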
Take a look at https://github.com/nielsbasjes/splittablegzip :D
On Tue, Dec 6, 2022 at 7:46 AM Oliver Ruebenacker <
oliv...@broadinstitute.org> wrote:
Hello Holden,
Thank you for the response, but what is "splittable gzip"?
Best, Oliver
On Tue, Dec 6, 2022 at 9:22 AM Holden Karau wrote:
There is the splittable gzip Hadoop input format, maybe someone could
extend that to support bgzip?
On Tue, Dec 6, 2022 at 1:43 PM Oliver Ruebenacker <
oliv...@broadinstitute.org> wrote:
Hello Chris,
Yes, you can use gunzip/gzip to uncompress a file created by bgzip, but
to start reading from somewhere other than the beginning of the file, you
would need to use an index to tell you where the blocks start. Originally,
a Tabix index was used and is still the popular choice, a
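
To make the point about block starts concrete, here is a small sketch using only the Python standard library. It is a stand-in, not real bgzip: each chunk is written as an independent gzip member and the start offsets are kept in a plain list, whereas actual BGZF blocks carry their size in a gzip extra field and are located through an index such as Tabix. The function names are hypothetical.

```python
import gzip
import zlib


def write_blocks(chunks):
    """Compress each chunk as its own gzip member; return (data, offsets)."""
    data = bytearray()
    offsets = []
    for chunk in chunks:
        offsets.append(len(data))  # byte offset where this block starts
        data += gzip.compress(chunk)
    return bytes(data), offsets


def read_block(data, offset):
    """Decompress exactly one gzip member starting at `offset`."""
    # wbits=31 tells zlib to expect a gzip header and trailer.
    decoder = zlib.decompressobj(wbits=31)
    # The decoder stops at the end of the member; any later bytes
    # are left in decoder.unused_data.
    return decoder.decompress(data[offset:])


data, offsets = write_blocks([b"alpha\n", b"beta\n", b"gamma\n"])
second = read_block(data, offsets[1])  # decodes only the second block
```

The concatenated output is still a valid gzip stream, so plain gunzip reads it from the beginning. Starting anywhere else only works at an offset recorded in the index.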