+1

Thank you, Yaniv, for leading this effort.  

I have a small preference for getting rid of File entirely eventually (2.0?) as 
Lucene and Hadoop seem to have done (?).

-----Original Message-----
From: Yaniv Kunda [mailto:yaniv.ku...@answers.com] 
Sent: Wednesday, August 26, 2015 5:31 PM
To: dev@tika.apache.org
Subject: RE: Adding API support for Java 7's java.nio.file.Path

I can point out several benefits of supporting the new API, in no particular
order:
- Exception handling: operations like File.delete() return a boolean, which 
says much less about why the operation failed than the exception thrown by 
Files.delete() (or a Minion...)
- Performance: The new API delegates more parts of I/O operations to the OS, 
resulting in better usage of resources.
In independent testing I've done (considering big files, cache warmup and 
randomized order) I've achieved 30% faster reads when using Files.copy() or
FileChannel.transferTo()
- Adoption: Java 7, in which the new API appeared, is already EOL.
Supporting this API, given that java.io is considered legacy, is good for 
keeping up with the times, and even better for our users as it offers them an 
incentive to move forward as well.
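
As a rough illustration of the exception-handling point, here is a minimal sketch comparing the two delete calls on a file that does not exist. The class and method names (DeleteComparison, legacyDelete, nioDelete) are made up for this example and are not part of Tika.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Hypothetical demo class, not part of Tika.
class DeleteComparison {

    // Legacy API: failure is reported only as "false", with no reason attached.
    static boolean legacyDelete(File file) {
        return file.delete();
    }

    // NIO.2 API: failure surfaces as an IOException, often a specific subclass.
    static String nioDelete(Path path) {
        try {
            Files.delete(path);
            return "deleted";
        } catch (NoSuchFileException e) {
            return "no such file: " + e.getFile();
        } catch (IOException e) {
            return "I/O error: " + e;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("delete-demo");
        Path missing = dir.resolve("missing.txt");

        // The boolean cannot tell "missing" apart from "permission denied".
        System.out.println("File.delete():  " + legacyDelete(missing.toFile()));
        // The exception names the cause and the offending path.
        System.out.println("Files.delete(): " + nioDelete(missing));

        Files.delete(dir); // clean up the empty temp directory
    }
}
```

Callers who only want a boolean can still use Files.deleteIfExists(), which returns false for a missing file but throws for other failures.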

More can be found here:
http://docs.oracle.com/javase/tutorial/essential/io/legacy.html

I believe that the library <-> user relationship must strike a balance between 
compatibility and progress: if libraries are stuck at compatibility, the 
users are sometimes stuck without progress...
If we can have progress without breaking compatibility - we have a winner.

I propose to add support for and make the most of the new (4 y/o) API without 
breaking compatibility, which means:
- Public methods accepting a File will not be changed; overloaded versions will 
be added.
- Public methods returning a File will not be changed; methods with different 
names will be added.
- Non-public methods accepting or returning a File will be changed.
- Internal uses of the legacy I/O will be updated to use the new API where easy.
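
The overloading part of the list above could look roughly like the following sketch. TextLoader and its methods are hypothetical and only stand in for a public method that currently accepts a File; they are not taken from the Tika code base.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical class, not part of Tika; it only illustrates the pattern.
class TextLoader {

    // Existing public method accepting a File: the signature is unchanged,
    // so existing callers keep compiling and running. Internally it now
    // delegates to the Path-based overload.
    public static String load(File file) throws IOException {
        return load(file.toPath());
    }

    // New overload accepting a Path.
    public static String load(Path path) throws IOException {
        return readUtf8(path);
    }

    // Non-public helper: free to switch from File to Path outright,
    // because no external code depends on its signature.
    private static String readUtf8(Path path) throws IOException {
        return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
    }
}
```

Both load(new File("doc.txt")) and load(Paths.get("doc.txt")) resolve without ambiguity, since File and Path are unrelated types.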

Regarding deprecation, I suggest that:
1) Methods accepting a File will not be deprecated - they will probably be used 
as long as File itself is not deprecated (forever?)
2) Methods returning a File will be deprecated - progressive users can use the 
new methods easily, less progressive users can use the new methods and add 
.toFile() to the result, and the rest can still use the deprecated methods 
(which will most likely call the new methods internally anyway).
To summarize: overloading = convenience, methods with the same operation but 
different name and return value = confusing.
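
For methods returning a File, the deprecation scheme might be sketched as below. TempStore is a hypothetical stand-in, not Tika's actual TemporaryResources class; the method names merely echo the ones discussed in this thread.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in class, not Tika's TemporaryResources.
class TempStore {

    // New method: a different name is required, because Java overloads
    // cannot differ only in their return type.
    public Path createTemporaryPath() throws IOException {
        return Files.createTempFile("tempstore-", ".tmp");
    }

    /**
     * Old method, kept for compatibility.
     *
     * @deprecated use {@link #createTemporaryPath()} instead, adding
     *             {@code .toFile()} if a File is still needed.
     */
    @Deprecated
    public File createTemporaryFile() throws IOException {
        // Delegates to the new method, so there is one implementation to maintain.
        return createTemporaryPath().toFile();
    }
}
```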

If this seems like a decent proposal, I can separate this work into several 
JIRA issues and patches, so that reviewing the changes is easier.

-----Original Message-----
From: Nick Burch [mailto:apa...@gagravarr.org]
Sent: Wednesday, August 26, 2015 13:27
To: dev@tika.apache.org
Subject: Re: Adding API support for Java 7's java.nio.file.Path

On Wed, 26 Aug 2015, Yaniv Kunda wrote:
> I would like to propose adding support for Java 7’s java.nio.file.Path 
> as an alternative to those methods in the API that deal with a 
> java.io.File.

Any chance you could briefly summarise what advantages this would give to us 
and/or our users?

> 1)      What can we do with methods returning a File? e.g.
> TemporaryResources.createTemporaryFile, TikaInputStream.getFile.
> Should we break compatibility and encourage (=force) users to change 
> their code (Note that since they all use Java 7 now, the change is 
> minimal by adding .toFile() to the result), or create new methods with 
> different names (confusing)?

Breaking compatibility outside of a 2.0 release is a big no-no, sorry.

TemporaryResources.createTemporaryPath and TikaInputStream.getPath could work 
as names.

> 2)  Should we deprecate the old methods accepting a File, or delete 
> them?

Deleting would break compatibility, so shouldn't be done. Deprecating could be 
done, if there's a strong reason to encourage people off them


https://wiki.apache.org/tika/Tika2_0RoadMap is where we're tracking proposed 
API-breaking changes for 2.0

Nick
