+1
confirmed digests
built locally with Gradle 8.13 and Java 17
Integrated and built successfully with Tika 3.x -- did not run full
regression tests
Thank you, PJ and team!
On Tue, Apr 1, 2025 at 5:25 PM PJ Fanning wrote:
> Hello POI Community,
>
> This is a call for a vote to release Apache P
> > poms published to Maven Central.
> >
> >
> https://maven.apache.org/enforcer/enforcer-rules/dependencyConvergence.html
> >
> > On Wed, 8 Jan 2025 at 21:16, Dave Fisher wrote:
> >>
> >>
> >>
> >>> On Jan 8, 2025, at 11:13 AM, Tim Allison wrot
21:16, Dave Fisher wrote:
> >
> >
> >
> > > On Jan 8, 2025, at 11:13 AM, Tim Allison wrote:
> > >
> > > Thank you, all. I'm sorry for the noise.
> > >
> > > As you all point out, these are not a POI or even XMLBeans issue, and
>
-api:jar:2.24.3:compile
> > > Not sure if you’d like to address this before release, but this would
> > make our build with the dependencyConvergence rule enabled in the Maven
> > enforcer plugin unhappy. For now I have fixed it by excluding the
> log4j-api
> > d
Sorry. I'm looking at these more closely, and the problem is with the maven
dependencies brought in by xmlbeans...not something that we should fix on
POI or xmlbeans.
WDYT?
P.S. I did notice some convergence issues. I don't think these are a
> showstopper...not clear if we should fix these in XM
+1
Apologies for my delay. Looks good.
Confirmed src.tgz digest
Built locally and ran tests
Integrated with Tika's main branch.
Thank you PJ, Dominik and team!
P.S. I did notice some convergence issues. I don't think these are a
showstopper...not clear if we should fix these in XMLBeans or let
Sorry for my delay.
+1
* built from source
* confirmed sha512 of source
* built Tika successfully with expected modifications
Thank you, PJ, Dominik and team!
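
For anyone who wants to script the "confirmed sha512 of source" step, here is a minimal sketch using only the JDK. The class and method names (Sha512Check, sha512HexOfFile, matches) are made up for illustration and are not part of POI or any release tooling; it also assumes the published value is passed as a bare hex string, so a "hash  filename" style .sha512 file would need the filename stripped first.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for checking a release archive against its published digest.
class Sha512Check {

    // Wrap the checked exception so callers do not have to declare it.
    static MessageDigest newSha512() {
        try {
            return MessageDigest.getInstance("SHA-512");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-512 not available in this JRE", e);
        }
    }

    // Lowercase hex encoding, two characters per byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Digest of an in-memory byte array.
    static String sha512Hex(byte[] data) {
        return toHex(newSha512().digest(data));
    }

    // Digest of a file, streamed in 8 KiB chunks so large archives are not loaded whole.
    static String sha512HexOfFile(Path file) throws IOException {
        MessageDigest md = newSha512();
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        return toHex(md.digest());
    }

    // Compare ignoring case and surrounding whitespace in the published value.
    static boolean matches(String actualHex, String publishedHex) {
        return actualHex.equalsIgnoreCase(publishedHex.trim());
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the downloaded archive; args[1]: published hex digest
        String actual = sha512HexOfFile(Paths.get(args[0]));
        System.out.println(matches(actual, args[1]) ? "digest OK" : "digest MISMATCH");
    }
}
```

This only checks integrity against the published digest; it says nothing about the signature, which would still need to be verified separately.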
On Thu, Dec 12, 2024 at 6:43 PM PJ Fanning wrote:
>
> Would any of the POI PMC members have time to review this RC? We just need
> one
Sounds great to me. Thank you, PJ!
On Mon, Nov 25, 2024 at 8:56 AM PJ Fanning
wrote:
> With the log4j 2.24.2 release, I think we should have an XMLBeans 5.3.0
> release.
> The main changes are to upgrade the log4j version away from log4j 2.24.1
> which caused problems.
> I've also added a wrappe
lf4j-api is not immune to making breaking changes.
> I still argue that we could wrap the logger init so that we can avoid having
> logger init issues fail POI startup.
>
>
>
>
>
>
> On Saturday 16 November 2024 at 15:12:14 GMT+1, Tim Allison
> wrote:
>
>
Thank you, PJ, for leading this effort. I completely agree that we
can't let log4j cause problems for us, and I like your proposal to
wrap log4j. Is going back to slf4j off the table?
On Thu, Nov 14, 2024 at 5:06 PM PJ Fanning wrote:
>
> We've already migrated from our own POILogger that was disa
Checked source checksum and built with Tika's main branch* and POI's main
branch.
I had some convergence issues with the main branch of POI
(5.4.0-SNAPSHOT)'s versions of plexus-utils and plexus-classworlds, but
those are likely user error and trivial to fix?
I also noticed that rat didn't like t
Sounds great. Thank you, PJ!
On Sat, Oct 26, 2024 at 11:36 AM PJ Fanning
wrote:
> I'm wondering whether we should do new releases.
> The XMLBeans changes don't affect POI much but it would be nice for POI to
> depend on the latest XMLBeans release.
>
>
> https://issues.apache.org/jira/browse/XML
+1
Confirmed digests, built locally and integrated into a local build of
Apache Tika's main branch. Ran regression tests earlier and found
improvements on items identified in 5.2.4.
Thank you, PJ, Dominik and team!
On Sun, Nov 19, 2023 at 3:30 PM Dominik Stadler
wrote:
> Hi,
>
> Verified conte
om> wrote:
>
>
>
>
>
> The build is not stable at the moment. Looks like there are some build
> fixes needed before we can get an RC ready.
>
>
>
>
>
>
> On Thursday 16 November 2023 at 22:41:28 GMT+1, Tim Allison <
> talli...@apache.org> wrote:
update POI to use the new XMLBeans version.
>
> I think we can then create an RC1 for POI 5.2.5. I can do this. Maybe
> tomorrow.
>
> According to Tim Allison, Apache Tika are waiting for this release [1].
>
> The changes are listed here [2].
>
> [1] https://lists.apache.or
Thank you, PJ, for running the XMLBeans 5.2.0 release! We are holding the
Tika 3.0.0-BETA release for POI 5.2.5. I agree there's not a major rush,
but it would be great to get that out.
Let me know if/when I should run our regression tests with 5.2.5.
Thank you, again!
Best,
Tim
On Tue,
+1
Thank you, PJ!
I verified the checksums.
I did get two rat failures that don't concern me (user error?) when I ran
`gradle build test`:
...xmlbeans-5.2.0/javadocs/package-list
...xmlbeans-5.2.0/javadocs/script.js
On Wed, Nov 8, 2023 at 4:48 PM Dominik Stadler
wrote:
> Hi,
>
> did a check
This just bit us on Tika:
https://bz.apache.org/bugzilla/show_bug.cgi?id=67767
The fix is easy. I can patch it today. It would be great to get it into
5.2.5. I'm sorry that I didn't catch it during the earlier regression
tests...my fault.
On Sun, Oct 15, 2023 at 4:34 PM Dominik Stadler
wrote:
+1
Reports are here: There's surprisingly little difference:
https://corpora.tika.apache.org/base/reports/poi-reports.tgz
I only had time to glance briefly.
Thank you PJ and team!
On Fri, Sep 22, 2023 at 4:09 AM PJ Fanning
wrote:
> Thanks Alex. The pdfbox issue is tracked at
> https://bz.apa
All,
First, I'm not proposing any changes for 5.2.4 (many thanks PJ for
running the release!).
In looking at DirectoryNode's getEntry, I see this:
@Override
public Entry getEntry(final String name) throws FileNotFoundException {
    Entry rval = null;
    if (name != null) {
        rval = _
Sounds great. I’ll try to make a run against our corpus as well.
Thank you!
On Thu, Sep 21, 2023 at 2:58 AM Dominik Stadler
wrote:
> Hi,
>
> Yes, I agree, a release soon would be good to get the many many
> improvements out to users.
>
> P.J., could you run the process once more and maybe updat
?)?
And of course, in the email below the characters have been modified
back to the underlying text, but they should be "alpha" "beta" "chi",
etc... see the screenshot on the issue
Thank you!
Best,
Tim
-- Forwarded message -
From: Tim All
Similar to Nick and others. I have time to pay attention, but not as
much as I'd like to contribute. Always hopeful... So, y, I'm still
interested. Thank you for calling roll and all of your work on POI
and beyond!
On Fri, Mar 3, 2023 at 12:06 PM Nick Burch wrote:
>
> On Fri, 3 Mar 2023, PJ Fa
Fellow Devs,
I recently came across this issue:
https://issues.apache.org/jira/browse/TIKA-3968. Has anyone else seen
this? Am I missing an easy way to associate embedded file names with
the actual embedded file?
I'm sure there's a reason to do this, but it feels to me like docx
is giving PD
+1
There's one new pptx exception, and a small number of fixed emf/wmf exceptions.
Reports are here:
https://corpora.tika.apache.org/base/reports/tika-2.5.0-poi-reports.tgz
Let me know if you have any questions!
Cheers,
Tim
On Fri, Sep 9, 2022 at 4:51 PM PJ Fanning wrote:
>
> Hi e
+1
I'll have time next week to run against our regression corpus, too. If
there's interest.
On Wed, Sep 7, 2022 at 4:35 PM PJ Fanning
wrote:
> Hi everyone,
>
> Is it time for a new POI release? It's about 6 months since the last one and
> the change list is fairly big - https://poi.apache.org/cha
+1
I didn't have time to run any regression tests, but Tika builds with
these artifacts.
Thank you, PJ and team!
On Sat, Feb 26, 2022 at 5:15 PM Andreas Beeker wrote:
>
> Hi,
>
> thank you for preparing the release, PJ!
>
> I've done some rudimentary checks - here is my +1.
>
> Andi
>
> On 26.0
Apologies for being absent... The xsb issue is why we haven't upgraded
to 5.x on Tika yet. I _think_ we'd like to avoid the ooxml-full jar,
but if that's the most robust option, we'll have to go with that.
I'm also happy to grab new files, or run against our corpus if that'd
be of any use.
Many
Autocorrect!!! Tika
On Tue, Oct 12, 2021 at 4:42 PM Tim Allison wrote:
>
> https://www.wired.co.uk/article/pandora-papers-leak
>
> Repo:
> https://github.com/ICIJ/datashare/
-
To unsubscribe, e-mai
https://www.wired.co.uk/article/pandora-papers-leak
Repo:
https://github.com/ICIJ/datashare/
difference in build-setup is when they are created
> differently.
>
> Thanks... Dominik.
>
> On Fri, May 7, 2021 at 6:13 PM Tim Allison wrote:
>>
>> Hi All,
>> I recently tried to build with Java 11 because of [1], I found that
>> the build was modifying module
Hi All,
I recently tried to build with Java 11 because of [1], I found that
the build was modifying module-info.java and module-info.class. Is
this expected? Is the combination of the Java issue and this item a
sign I should put down the keyboard for the weekend a bit early?
Cheers,
All seems to work if I uncomment this line in build.xml:
Any objections?
On Tue, Mar 23, 2021 at 10:24 AM Tim Allison wrote:
>
> Going back to Andi's point [1]...trying this now.
>
> [1]
> https://lists.apache.org/x/thread.html/ra9ff58e6af046a51ba459915fe536a2ea1fe
Going back to Andi's point [1]...trying this now.
[1]
https://lists.apache.org/x/thread.html/ra9ff58e6af046a51ba459915fe536a2ea1fe71e85329abc4e513711e@%3Cuser.poi.apache.org%3E
On Tue, Mar 23, 2021 at 10:17 AM Tim Allison wrote:
>
> All,
> Over on Tika [1], I'm gettin
All,
Over on Tika [1], I'm getting an exception that oleobjectelement.xsb
can't be found. When I look in the ooxml-lite.jar, I see there's an
oleobjelement.xsb, but no oleobjectelement.xsb.
I tried adding the triggering document (EmbeddedDocument.docx) to a
poi unit test[2] and rebuilding 5.0.
org/apache/poi/xddf/usermodel/XDDFSolidFillProperties.java:38:
error: recursive constructor invocation
public XDDFSolidFillProperties(XDDFColor color) {
^
On Tue, Feb 23, 2021 at 7:53 AM Tim Allison wrote:
>
> ant test seems to be working (waiting for completion, but it at leas
23, 2021 at 7:43 AM Tim Allison wrote:
>
> All,
> Many apologies...it has been too long since I've worked with our
> codebase. I recently did a fresh pull and can't get a clean
> build...ant compile works, but I get a failure with ant test. See link
> below for sys
All,
Many apologies...it has been too long since I've worked with our
codebase. I recently did a fresh pull and can't get a clean
build...ant compile works, but I get a failure with ant test. See link
below for system, versions and stacktrace [1].
User error?
Thank you!
Ch
-- Forwarded message -
From: Sergey Beryozkin
Date: Tue, Oct 20, 2020 at 7:54 AM
Subject: [OT] Looking for Apache POI help
To:
Hi All,
sorry for this off-topic post, it is a little bit relevant to Tika dev, but
only a little bit :-),
We are having some good interest in making
Does this meet the needs?
https://github.com/apache/tika/blob/main/tika-parser-modules/tika-parser-microsoft-module/src/test/resources/test-documents/testPPT_oleWorkbook.ppt
On Sun, Oct 11, 2020 at 5:09 PM Andreas Beeker wrote:
> Hi Nick,
>
> > Should we have WorkbookFactory spot this case, gr
o modules?
>
> Best wishes,
> Andi
>
>
> [1]
> https://builds.apache.org/view/P/view/POI/job/POI-XMLBeans-DSL-1.8/lastSuccessfulBuild/artifact/build/
>
> On 13.08.20 20:06, Tim Allison wrote:
> > All,
> >
> > I've been away from POI for a bit, and And
All,
I've been away from POI for a bit, and Andi has done some amazing work.
THANK YOU!
The build works as it should on the commandline, but what's the
recommendation for adding ooxml-schemas as a dependency in the IDE?
Should I run a full build and then create my own lib/poi-ooxml-schemas
fix this too.
> I guess this will take another few weeks to be completed.
>
> Best wishes,
> Andi
>
>
> On 22.06.20 22:28, Tim Allison wrote:
> > All,
> > From a Tika perspective, I'm happy with 5.0.0 as well...any idea when
> > the next release will
All,
From a Tika perspective, I'm happy with 5.0.0 as well...any idea when
the next release will be? Last release was in February.
Now that we have the regression testing vm back up and running, I can
kick off tests whenever...
Thank you!
Cheers,
Tim
On
All,
If you have an interest in guiding the ongoing development of the
regression corpus vm, please join the new mailing list:
corpora-...@tika.apache.org via the usual means:
corpora-dev-subscr...@tika.apache.org
Unless there are objections, we can continue to use the regular Tika JIRA
to tr
Should have cc'd you all...this should be up and running in the next 24
hours. Please subscribe if you'd like to discuss/collaborate on the vm and
regression corpora.
-- Forwarded message -----
From: Tim Allison
Date: Thu, Jun 4, 2020 at 8:56 AM
Subject: Fwd: New ma
All,
I started #tika-vm on the ASF’s Slack for informal
discussion/coordination of the regression corpus and vm.
Cheers,
Tim
>
>
> On Fri, Feb 14, 2020 at 10:48 PM Tim Allison wrote:
>
>> All,
>>
>> I recently downloaded attachments from the following bug trackers:
>> COMPRESS, TIKA, PDFBox, POI, Open Office, Libre Office and ghostscript:
>> http://162.242.228.174/docs/bugtrack
All,
I recently downloaded attachments from the following bug trackers:
COMPRESS, TIKA, PDFBox, POI, Open Office, Libre Office and ghostscript:
http://162.242.228.174/docs/bugtrackers/
I then unpackaged/uncompressed all of the package/compressed files so:
COMPRESS-115-1.zip is the second fil
+1
Thank you, Andi (and team)!
http://162.242.228.174/reports/reports_poi_4.1.2-rc3.tgz
On Mon, Feb 10, 2020 at 3:38 PM Andreas Beeker wrote:
> Hi *,
>
> I've prepared artifacts for the release of Apache POI 4.1.2 (RC3).
>
> The most notable changes in this release are:
>
> - XDDF - some work
Sorry for the late reply. See Bug 64130 for a regression in parsing old
excel spreadsheets that have worksheets without names. There were about
550 new exceptions caused by this in our regression corpus.
On Sat, Feb 8, 2020 at 5:30 PM Tim Allison wrote:
> I’m afk, but it looked like th
afk.
On Sat, Feb 8, 2020 at 1:21 PM Andreas Beeker wrote:
> Hi *,
>
> just to be sure ... I'm waiting for Tims second +1 or should I release the
> artifacts?
> I.e. as far as I understand the reports we only have marginal differences.
>
> Andi
>
> On 07.02.20 13:0
nsion PixelAspectRatio": "1.0",
"Dimension VerticalPhysicalPixelSpacing": "0.26462027",
"X-Parsed-By": [
"org.apache.tika.parser.CompositeParser",
"org.apache.tika.parser.DefaultParser",
"org.apach
wildly
even with the same versions on different runs. The key for me is the
rollup by parse time suggests _overall_ for ppt,
the time is nearly identical.
> On 07.02.20 13:05, Tim Allison wrote:
> > Hi All,
> > I haven't had the chance to look, but will
to have ASF infrastructure
> provision
> > a VM to be managed by POI PMC.
> >
> > Regards,
> > Dave
> >
> > Sent from my iPhone
> >
> > > On Feb 5, 2020, at 3:38 PM, Andreas Beeker
> wrote:
> > >
> > > Hi Tim,
> >
Hi All,
I haven't had the chance to look, but will do so later today:
http://162.242.228.174/reports/poi_4.1.2_reports.tgz
On Wed, Feb 5, 2020 at 7:47 PM Tim Allison wrote:
> Might be faster than I thought...results tomorrow...perhaps.
>
> On Wed, Feb 5, 2020 at 5:51 PM Tim
Might be faster than I thought...results tomorrow...perhaps.
On Wed, Feb 5, 2020 at 5:51 PM Tim Allison wrote:
> I did not. I can kick it off now, but with travel and other stuff,
> wouldn't have results until Monday. Happy to do so if desired.
>
> On Wed, Feb 5, 2020 at
nt is unavailable.
>
> Andi
>
> On 05.02.20 01:05, Tim Allison wrote:
> > +1
> >
> > built without surprises, digests check out and Tika builds. Thank you,
> > Andi and team!
> >
> > On Tue, Feb 4, 2020 at 2:20 PM Andreas Beeker
> wrote:
>
+1
built without surprises, digests check out and Tika builds. Thank you,
Andi and team!
On Tue, Feb 4, 2020 at 2:20 PM Andreas Beeker wrote:
> +1 ... the NOTICE file was still on 2019, but I don't think this matters.
> Apart from that, my sample application works.
>
> On 03.02.20 22:55, PJ Fannin
up till then ...
>
> Andi
>
>
> On 23.01.20 15:41, Tim Allison wrote:
> > Hi All,
> > We're getting pinged over on Tika for when the next release of POI will
> > be available. Any plans?
> >
> > https://issues.apache.org/jira/browse/TIKA-3017
> >
> > Thank you!
> >
>
>
>
Hi All,
We're getting pinged over on Tika for when the next release of POI will
be available. Any plans?
https://issues.apache.org/jira/browse/TIKA-3017
Thank you!
All,
Thank you for this release! I'm sorry that I was mostly AWOL.
Andi,
Thank you for running this release!
Cheers,
Tim
On Sun, Oct 20, 2019 at 3:52 PM Andreas Beeker wrote:
> The Apache POI project is pleased to announce the release of POI 4.1.1.
> Featured are a
heers,
Tim
On Sat, Oct 5, 2019 at 8:38 AM Tim Allison wrote:
>
> Andi,
> I’m sorry for my delay. I’ve booked a chunk of time on Monday to look at
> this...data is prepped...just need to run latest code and compare. I don’t
> want to hold up the release tho...please move fo
wrote:
> Hi Tim,
>
> On 20.09.19 13:55, Tim Allison wrote:
> > I think I remember a regression in emf/wmf...could be spurious or my
> fault
> > at the Tika level.
>
> I've just checked my mails for the original emf/wmf issue, which you've
> (partly) fixed v
Hi All,
Do we have any sense of when the next release will be? IIRC I have a bit
of work to do w emf[1], what else do we want to include?
Thank you!
Cheers,
Tim
[1] I have a vague memory of slight regressions in text extraction, but I
have to test w latest.
All,
For some recent work on Apache Tika, I used commons-compress to
extract entry names and metadata via a streaming read from roughly
500k zip-based files we have in Tika's regression corpus.
I was happy to see we have some POI-generated files in there. :)
I noticed some areas for improveme
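
For readers unfamiliar with a streaming read of entry names, here is a minimal sketch of the idea. Note the swap: the message above used commons-compress, while this sketch uses the JDK's java.util.zip.ZipInputStream so that it stays self-contained. The class name ZipEntryLister, the helper sampleZip, and the two entry names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: lists entry names by reading the zip front to back,
// without seeking to the central directory at the end of the file.
class ZipEntryLister {

    static List<String> entryNames(InputStream in) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zis = new ZipInputStream(in)) {
            ZipEntry entry;
            // getNextEntry() skips any unread data of the previous entry.
            while ((entry = zis.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    // Builds a small in-memory zip so the sketch runs without external files.
    static byte[] sampleZip(String... names) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bos)) {
            for (String name : names) {
                zos.putNextEntry(new ZipEntry(name));
                zos.write(name.getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        // Entry names chosen to look like an OOXML package; purely illustrative.
        byte[] zip = sampleZip("[Content_Types].xml", "xl/workbook.xml");
        System.out.println(entryNames(new ByteArrayInputStream(zip)));
    }
}
```

A streaming read sees entries in the order they were written, which is part of what makes it useful for spotting how a producer lays out its files.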
All,
Again, my apologies for being late, but the results might still be
useful for work towards 4.1.1.
http://162.242.228.174/reports/poi-4.1.0-reports.zip
Some tentative observations:
1) there was the new and non-replicable set of problems with the XSSFBParser.
2) The emf/wmf regressions are
Hi Andi,
Y, to be clear, I really like what you’ve done and it is all a bunch
cleaner than my earlier stuff; I wasn’t at all questioning the design. The
question was more about back compat. There was quite a bit of red when I made
the upgrade and before I modernized our code on Tika.
As long as we’r
On Mon, Apr 8, 2019 at 4:55 PM Andreas Beeker wrote:
> Hi Tim,
>
> I've made those changes on purpose, as I wanted to make the EMF API similar
> to the WMF one.
>
> > oap.hemf.extractor.HemfExtractor -> oap.hemf.usermodel.HemfPicture
> All (?) our user models are called by their content and being
itial 4.0.2 to 4.1.0, but that's not an area of code
> I'm familiar with.
>
> On Mon, Apr 8, 2019 at 6:07 AM Tim Allison wrote:
>
> > Are we ok with the backward incompatibilities in EMF...These are just
> > a few. I realize these class
Are we ok with the backward incompatibilities in EMF...These are just
a few. I realize these classes are @Internal, and the updates look
great.
HwmfRecord.getRecordType() -> getWmfRecordType()
oap.hemf.record.AbstractHemfComment -> oap.hemf.record.hemf.Comment
oap.hemf.record.HemfRecord -> oap.h
Sorry for being late to the game. I won’t have time to run regression tests
until Monday or so... thank you Dominik and Greg!
On Sat, Apr 6, 2019 at 4:27 AM Dominik Stadler
wrote:
> Hi Greg,
>
> thanks for running the release and removing all the obstacles on the way,
> always good if as many pe
I've added SAX parsers for pptx and docx over on Apache Tika. These
rely on POI for OPCPackage, a bunch of other classes and overall
design.
I've thought about moving that code into POI, but I haven't found the
time or need, and the code is my typical kludgy-mess...and I don't
want to pollute POI
+1
Reports are available here:
http://162.242.228.174/reports/reports_poi_4_0_1-rc2.tgz
Thank you, Andi!
On Mon, Nov 26, 2018 at 6:01 PM Andreas Beeker wrote:
>
> Hi,
>
> I've prepared artifacts for the release of Apache POI 4.0.1 (RC2).
>
> The most notable changes in this release are:
>
> - de
Sorry, now that I've figured out what the problem was, I'm -1. Y,
let's respin.
On Thu, Nov 22, 2018 at 4:34 PM Andreas Beeker wrote:
>
> Hi Tim,
>
> On 21.11.18 19:26, Tim Allison wrote:
> > This looks like a regression.
>
>
> Please make your mind up
ike a regression.
On Wed, Nov 21, 2018 at 12:56 PM Tim Allison wrote:
>
> >These were in the header...I have to step away from the keyboard for
> now...any ideas?
>
> I confirmed this by flipping btwn 4.0.0 and 4.0.1 in our dependencies
> and using our Tika's SNAPSHOT f
>These were in the header...I have to step away from the keyboard for
now...any ideas?
I confirmed this by flipping btwn 4.0.0 and 4.0.1 in our dependencies
and using our Tika's SNAPSHOT for both. This is not caused by a
different version of Tika.
On Wed, Nov 21, 2018 at 12:53 PM Tim
: de: 2 | la: 2 | 03: 1 | 06: 1 | 1: 1 | 16: 1 | 2009: 1
| 3: 1 | conciencia: 1 | despertar: 1
These were in the header...I have to step away from the keyboard for
now...any ideas?
On Wed, Nov 21, 2018 at 12:37 PM Tim Allison wrote:
>
> Reports are available here:
> http://162.242.228.174/r
Reports are available here:
http://162.242.228.174/reports/reports_poi_4_0_1-rc1.tgz
We have a bunch less content in ppt, but I _think_ this is because at
the Tika level we used to duplicate notes content, and we've fixed
that bug. So, I think this is an improvement, but I need to check.
On Wed,
the release process was too smooth.
> Only my local version of the commons-openpgp needed to be used. [1]
>
> Andi
>
> [1] https://issues.apache.org/jira/browse/SANDBOX-508
>
>
> On 20.11.18 22:33, Tim Allison wrote:
> > Andi,
> > Thank you! I've built thi
Andi,
Thank you! I've built this locally and integrated it into Tika, and
I've kicked off the regression tests. The one small glitch I noticed
so far is that poi-ooxml-schemas jar has an extra ".jar" in it:
build/dist/maven/poi-ooxml-schemas/poi-ooxml-schemas-4.0.1.jar.jar
I'll let you all k
W00t!!!
Here's Dave's talk on POI at COSCON in Shenzhen, China on October 20, 2018:
https://www.youtube.com/watch?v=N7_Y3zNb_-w
Autoboxing?!
On Fri, Nov 2, 2018 at 7:27 AM Tim Allison wrote:
>
> Colleagues, any idea what might be going on? How can -1 != -1?!
>
> Error: Test with 2/3: Should not find 3 but found it at -1 in 0 1 2
> at org.apache.poi.hwpf.usermodel.TestBug47563.test(TestBug47563.java:80)
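
The actual cause of the TestBug47563 failure is not stated in this thread. As one illustration of how the "Autoboxing?!" guess could play out, here is a minimal sketch in which two values that both print as -1 compare unequal once they are boxed into different wrapper types. The class and method names are made up for illustration.

```java
import java.util.Objects;

// Hypothetical demo: the same numeric value can be "equal" as primitives
// and "not equal" as boxed objects.
class BoxedMinusOne {

    // Primitive comparison: the int is widened to long, so only values are compared.
    static boolean primitiveEquals(int a, long b) {
        return a == b;
    }

    // Boxed comparison: equals() on the wrappers also checks the wrapper type,
    // which is how an assertEquals(Object, Object) style check would compare them.
    static boolean boxedEquals(Object a, Object b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        int i = -1;
        long l = -1L;
        System.out.println(primitiveEquals(i, l)); // true
        System.out.println(boxedEquals(i, l));     // false: Integer(-1) vs Long(-1)
        System.out.println(boxedEquals(i, -1));    // true: both box to Integer
    }
}
```

If something like this were the cause, the failure message would show identical-looking numbers on both sides, which matches the confusion in the message above, but that is an assumption rather than a diagnosis.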
Colleagues, any idea what might be going on? How can -1 != -1?!
Error: Test with 2/3: Should not find 3 but found it at -1 in 0 1 2
at org.apache.poi.hwpf.usermodel.TestBug47563.test(TestBug47563.java:80)
assertTrue("Test with " + rows + "/" + columns + ": Should not find "
+ i + " but found it a
+1 "end of this week" that'll work well for my issues, too. I want to
confirm I didn't break anything in my recent commits via large scale
regression testing.
On Tue, Oct 30, 2018 at 8:31 AM Yegor Kozlov wrote:
>
> +1
>
> Bug 62836 is pending. I'm going to check in the code anyway, just waiting
>
Dejan,
Thank you for letting us know about this problem. I was able to
reproduce it, and I've opened a ticket:
https://bz.apache.org/bugzilla/show_bug.cgi?id=62815
On Wed, Sep 12, 2018 at 5:58 AM dejan ikodinovic
wrote:
>
> Hi guys,
>
> I'm working on parsing Excel xlsb files using Apache POI 3
Turns out that's a subset. It looks like there should be ~200k emfs.
I'll try to dig up the extraction code and re-run.
On Tue, Oct 9, 2018 at 8:55 AM Tim Allison wrote:
>
> Y. Turns out I extracted a bunch a while ago. See the 'emfs'
> directory in this tar.bz2 f
Y. Turns out I extracted a bunch a while ago. See the 'emfs'
directory in this tar.bz2 file:
http://162.242.228.174/embedded_files/xmfs.tar.bz2
Let me know if you have any questions and/or if I can make that any
more useful for you.
Cheers,
Tim
On Mon, Oct 8, 2018 at 7
At some point I extracted all emfs from our corpus. I’ll see if that data
is still around and/or re-extract...prob have time tomorrow/Wednesday
On Sun, Oct 7, 2018 at 5:01 PM Dominik Stadler
wrote:
> Hi Andi
>
> It is easy to change CommonCrawlDocumentDownload to fetch other mime-types,
> see
>
All,
I opened https://issues.apache.org/jira/browse/TIKA-2750 to track
updating data on the regression corpus. Please track/join the
conversation there if you'd like to participate.
Cheers,
Tim
Tobias,
I just gave you access to the vm and sent login stuff to you
personally. I have to update some groups and permissions, but I'll
let you know when that is ready.
Let me know if you have problems getting on.
Best,
Tim
> 1. Is it OK that 100% CPU is used wh
Tobias,
I'm sorry for my delay. We welcome you to use our regression vm
hosted by Rackspace for fuzzing work to identify vulnerabilities. Our
one request: we ask that you pause/stop your processes when we need to
run regression tests before a release.
Email me privately with your desired user
All, I broke our mp3 parser w changes in Tika 1.19. We're about to
roll a 1.19.1. Is there anything catastrophic in 4.0.0 that would
lead us to wait for 4.0.1? I noticed the 62692 (wildfly xml
parser)...is there anything else?
Thank you!
Cheers,
Tim
On Wed, Sep 19, 2018 at 5
Let me know if these are of any use...
https://github.com/centic9/CommonCrawlDocumentDownload
http://openpreservation.org/blog/2016/10/04/apache-tikas-regression-corpus-tika-1302/
https://events.static.linuxfound.org/sites/events/files/slides/ApacheConMiami2017_tallison_v2.pdf
https://wiki.apac
Can you open an issue on our bugzilla and post a test file w a unit test?
Thank you for sharing this w us!
On Wed, Sep 12, 2018 at 5:58 AM dejan ikodinovic
wrote:
> Hi guys,
>
> I'm working on parsing Excel xlsb files using Apache POI 3.17 version and
> have problem for some numbers.
> The probl
Looks great! If at all possible, I’d appreciate a bullet or two on
Dominik’s and my large scale regression tests... More input on test files
for the corpus would be useful. Completely understand if this is off topic.
Thank you!
On Fri, Sep 14, 2018 at 5:27 PM Dave Fisher wrote:
> Hi Team,
>
> I’ve
+1
Reports are here:
http://162.242.228.174/reports/poi-4.0.0-reports-e.tgz
These reports compare 3.17 with 4.0.0-RC1.
There are numerous fixed exceptions. The new exceptions appear to be
caused by better exception reporting for truncated files.
Two small issues that I'm ok with for now:
1) We
Sorry for my delay. I'm kicking off our regression tests now.
On Sat, Sep 1, 2018 at 11:46 AM Dominik Stadler wrote:
>
> Hi,
>
> Content of release-archives look good compared to 3.17.
>
> Only found a very minor glitch: osgi/build.xml and sonar/**/pom.xml still
> contain "4.0.0-SNAPSHOT", but I
+1. Thank you, Andi!
On Mon, Aug 27, 2018 at 5:52 AM Alain FAGOT BÉAREZ wrote:
>
> +1 for full refactoring to POIFS*
>
> Sent with BlueMail
>
>
> Original Message
> From: Andreas Beeker
> Sent: Sun Aug 26 19:06:02 GMT-03:00 2018
> To: dev@poi.apache.org
> Subject: Re:
Despite that gaffe -- thank you, again, Andi -- I compared the output
after some recent modifications, and there are no differences:
http://162.242.228.174/reports/poi-4.0.0-reports-d.tgz
On Fri, Aug 17, 2018 at 11:22 AM Tim Allison wrote:
>
> Ugh, and thank you!
> On Fri, Aug 17, 201