Hi Albretch,

This seems better suited to discussion in JIRA:
https://issues.apache.org/jira/projects/COMPRESS/issues 
Also, I could not download the torrent from
http://torrentz.pl/search?f=articles%20enwiki&safe=0 
I am not sure why, but you could attach the torrent to the JIRA issue.
Since this is an 11 GB dump file, downloading it would take a long time.
It would be great if you could provide a smaller file that reproduces
the problem.
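If it helps, here is a rough way to build such a small file. This is only a sketch and rests on a guess of mine: the file names say "multistream", and as far as I know those dumps consist of several bzip2 streams concatenated together. The names workdir and small are made up for the example.

```shell
#!/bin/sh
# Hypothetical sketch: build a tiny two-stream .bz2 file for a bug report.
# Assumption: the failing dumps are concatenated bzip2 streams.
set -eu

workdir=$(mktemp -d)                       # scratch directory (arbitrary name)
small="$workdir/small-multistream.bz2"     # output file (arbitrary name)

# Each bzip2 invocation writes one complete stream;
# appending the second one concatenates the two streams.
printf 'first stream\n'  | bzip2 -c  > "$small"
printf 'second stream\n' | bzip2 -c >> "$small"

# The reference tool accepts concatenated streams and decompresses both.
bzip2 --test "$small"
bzip2 -dc "$small"
```

If your test program stops after the first line of such a file, that would point at the concatenated streams. In that case the BZip2CompressorInputStream constructor that takes a decompressConcatenated flag may be worth trying.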
Lee
On 10 14 2020, at 6:21, Albretch Mueller <lbrt...@gmail.com> wrote:
> I don't know what exactly there could be at byte offset 2848 in some
> buffer, but files that bzip2 --test reports as fine can't be
> processed by BZip2CompressorInputStream:
> ~
> $ _IFL="/home/lbrtchx/cmllpz/LklWb/org/wikimedia/dumps/enwiki/20200920/enwiki-20200920-pages-articles-multistream1.xml-p1p41242.bz2"
> $ ls -l "${_IFL}"
> -r--r--r-- 1 lbrtchx lbrtchx 242624781 Sep 22 05:40 /home/lbrtchx/cmllpz/LklWb/org/wikimedia/dumps/enwiki/20200920/enwiki-20200920-pages-articles-multistream1.xml-p1p41242.bz2
> $ file --brief "${_IFL}"
> bzip2 compressed data, block size = 900k
> $ time bzip2 --test --verbose "${_IFL}"
> /home/lbrtchx/cmllpz/LklWb/org/wikimedia/dumps/enwiki/20200920/enwiki-20200920-pages-articles-multistream1.xml-p1p41242.bz2: ok
>
> real 2m0.650s
> user 2m0.076s
> sys 0m0.256s
>
> $ _IFL="/home/lbrtchx/cmllpz/LklWb/org/wikimedia/dumps/enwiki/20200920/enwiki-20200920-pages-articles-multistream4.xml-p311330p558391.bz2"
> $ ls -l "${_IFL}"
> -r--r--r-- 1 lbrtchx lbrtchx 394001572 Sep 22 05:49 /home/lbrtchx/cmllpz/LklWb/org/wikimedia/dumps/enwiki/20200920/enwiki-20200920-pages-articles-multistream4.xml-p311330p558391.bz2
> $ file --brief "${_IFL}"
> bzip2 compressed data, block size = 900k
> $ time bzip2 --test --verbose "${_IFL}"
> /home/lbrtchx/cmllpz/LklWb/org/wikimedia/dumps/enwiki/20200920/enwiki-20200920-pages-articles-multistream4.xml-p311330p558391.bz2: ok
>
> real 3m6.249s
> user 3m5.192s
> sys 0m0.628s
>
> $ _IFL="/home/lbrtchx/cmllpz/LklWb/org/wikimedia/dumps/enwiki/20200920/enwiki-20200920-pages-articles-multistream5.xml-p558392p958045.bz2"
> $ ls -l "${_IFL}"
> -r--r--r-- 1 lbrtchx lbrtchx 427323881 Sep 22 05:51 /home/lbrtchx/cmllpz/LklWb/org/wikimedia/dumps/enwiki/20200920/enwiki-20200920-pages-articles-multistream5.xml-p558392p958045.bz2
> $ file --brief "${_IFL}"
> bzip2 compressed data, block size = 900k
> $ time bzip2 --test --verbose "${_IFL}"
> /home/lbrtchx/cmllpz/LklWb/org/wikimedia/dumps/enwiki/20200920/enwiki-20200920-pages-articles-multistream5.xml-p558392p958045.bz2: ok
>
> real 3m20.861s
> user 3m19.296s
> sys 0m0.988s
>
> $ _IFL="/home/lbrtchx/cmllpz/LklWb/org/wikimedia/dumps/enwiki/20200920/enwiki-20200920-pages-articles-multistream6.xml-p958046p1483661.bz2"
> $ ls -l "${_IFL}"
> -r--r--r-- 1 lbrtchx lbrtchx 458830618 Sep 22 05:52 /home/lbrtchx/cmllpz/LklWb/org/wikimedia/dumps/enwiki/20200920/enwiki-20200920-pages-articles-multistream6.xml-p958046p1483661.bz2
> $ file --brief "${_IFL}"
> bzip2 compressed data, block size = 900k
> $ time bzip2 --test --verbose "${_IFL}"
> /home/lbrtchx/cmllpz/LklWb/org/wikimedia/dumps/enwiki/20200920/enwiki-20200920-pages-articles-multistream6.xml-p958046p1483661.bz2: ok
>
> real 3m34.213s
> user 3m32.636s
> sys 0m1.056s
> $
>
>
> $ _IFL="/home/lbrtchx/cmllpz/prjx/kd/java/IO/compress/logs/UnKmprssBZ2_02Test_20201013234903.log"
> $ tail -n 10 "${_IFL}"
> // __ Files Context of |4| files containing a total of |1522780852| bytes!
> // __ [0/4): ...(30.131%) |/home/lbrtchx/cmllpz/LklWb/org/wikimedia/dumps/enwiki/20200920/enwiki-20200920-pages-articles-multistream6.xml-p958046p1483661.bz2|
> // __ aOFlNm: |/home/lbrtchx/cmllpz/prjx/kd/java/IO/compress/REF/enwiki-20200920-pages-articles-multistream6-p958046p1483661.xml|
> // __ |2848|2848|java.io.IOException:
> // __ Read bytes and file lenght not the same! lTtlRdByts: |2848| (lTtlRdByts != lFlL), lFlL: |458830618|
> at UnKmprssBZ2_02Test.main(UnKmprssBZ2_02Test.java:254)
>
> real 0m1.759s
> user 0m2.920s
> sys 0m0.196s
>
> $ _IFL="/home/lbrtchx/cmllpz/prjx/kd/java/IO/compress/logs/UnKmprssBZ2_02Test_20201013234826.log"
> $ tail -n 10 "${_IFL}"
> // __ Files Context of |4| files containing a total of |1522780852| bytes!
> // __ [0/4): ...(28.062%) |/home/lbrtchx/cmllpz/LklWb/org/wikimedia/dumps/enwiki/20200920/enwiki-20200920-pages-articles-multistream5.xml-p558392p958045.bz2|
> // __ aOFlNm: |/home/lbrtchx/cmllpz/prjx/kd/java/IO/compress/REF/enwiki-20200920-pages-articles-multistream5-p558392p958045.xml|
> // __ |2848|2848|java.io.IOException:
> // __ Read bytes and file lenght not the same! lTtlRdByts: |2848| (lTtlRdByts != lFlL), lFlL: |427323881|
> at UnKmprssBZ2_02Test.main(UnKmprssBZ2_02Test.java:254)
>
> real 0m1.669s
> user 0m2.720s
> sys 0m0.220s
>
> $ _IFL="/home/lbrtchx/cmllpz/prjx/kd/java/IO/compress/logs/UnKmprssBZ2_02Test_20201013234708.log"
> $ tail -n 10 "${_IFL}"
> // __ Files Context of |4| files containing a total of |1522780852| bytes!
> // __ [0/4): ...(25.874%) |/home/lbrtchx/cmllpz/LklWb/org/wikimedia/dumps/enwiki/20200920/enwiki-20200920-pages-articles-multistream4.xml-p311330p558391.bz2|
> // __ aOFlNm: |/home/lbrtchx/cmllpz/prjx/kd/java/IO/compress/REF/enwiki-20200920-pages-articles-multistream4-p311330p558391.xml|
> // __ |2848|2848|java.io.IOException:
> // __ Read bytes and file lenght not the same! lTtlRdByts: |2848| (lTtlRdByts != lFlL), lFlL: |394001572|
> at UnKmprssBZ2_02Test.main(UnKmprssBZ2_02Test.java:254)
>
> real 0m1.665s
> user 0m2.752s
> sys 0m0.172s
>
> $ _IFL="/home/lbrtchx/cmllpz/prjx/kd/java/IO/compress/logs/UnKmprssBZ2_02Test_20201013234602.log"
> $ tail -n 10 "${_IFL}"
> // __ Files Context of |4| files containing a total of |1522780852| bytes!
> // __ [0/4): ...(15.933%) |/home/lbrtchx/cmllpz/LklWb/org/wikimedia/dumps/enwiki/20200920/enwiki-20200920-pages-articles-multistream1.xml-p1p41242.bz2|
> // __ aOFlNm: |/home/lbrtchx/cmllpz/prjx/kd/java/IO/compress/REF/enwiki-20200920-pages-articles-multistream1-p1p41242.xml|
> // __ |2848|2848|java.io.IOException:
> // __ Read bytes and file lenght not the same! lTtlRdByts: |2848| (lTtlRdByts != lFlL), lFlL: |242624781|
> at UnKmprssBZ2_02Test.main(UnKmprssBZ2_02Test.java:254)
>
> real 0m1.691s
> user 0m2.756s
> sys 0m0.216s
> $
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
