2013/7/31 Gary Gregory <garydgreg...@gmail.com>

> On Wed, Jul 31, 2013 at 10:42 AM, Benedikt Ritter <brit...@apache.org> wrote:
>
> > <snip>
> >
> > >> A use case I have now is a CSV file with a lot of columns (~90) but I
> > >> only care about a small subset of the columns (~10). I'd like to be
> > >> able to say withHeader(Set) where the Set may be a subset of the actual
> > >> column names in the header line. This is different from
> > >> withHeader(String[]) because the names in the Set must match the names
> > >> in the header record.
> > >
> > > > What you are talking about sounds more like a view or a projection of
> > > > the actual content being parsed.
> > > > Do we really need this for 1.0 or can it be postponed?
> > >
> > > This is a real scenario and a real need, not some imaginary
> > > complication ;)
> > >
> > > Even if it is not implemented for 1.0, we should talk about how it
> > > should be done such that it fits in and does not cause API problems
> > > later. And if I can get it done by then, so much the better.
> >
> > Okay, then let's discuss this on a new thread :-)
> >
> > As I've said, I think we should not push too much into
> > withHeader(String...). Maybe this is some sort of view, where you can
> > pass a parser and the headers you are interested in and it will return an
> > Iterable<CSVRecord> (or CSVParser) that just gives access to the
> > specified headers you are interested in?
> >
> > Would it be possible to give a code example of what you have to do with
> > the current API in your use case and what you want?
>
> I am switching to withHeader() with no arg (same as a new String[]{}) and
> letting the parser guess the headers, and then praying that the names match
> between the app and the files. That is just as unsafe as forcing the headers
> in a fixed order on the parser, because the column order might have changed.
> Ideally, the column order should not matter, which it does not when you do
> a record.get(String), which is nice.
>
> Calling withHeader() with no args is less brittle than calling it with 90
> args. The benefit is that the column order in the file can change without
> affecting the app, which is good. I could use a little more bullet-proofing
> by making the column names optionally case-insensitive, but that's a
> different feature.
>
> Ideally, I want to define the column names in the app as a simple Java
> enum, then use an enum as a record key. That does not work for column names
> that have spaces in them as mine do, so it's back to classic static final
> Strings as keys. I could create a fancier custom enum but it's not worth it
> for now.
>
Hey Gary,

I still don't understand what you are suggesting. At first I thought this was
about accessing a subset of the actual columns (you said your file has 90
columns but you are only interested in ~10). Your last message sounds more
like you're looking for a better way to make sure the headers parsed from the
file match what you are expecting. I guess this is why getHeaderMap is now
public (?!)

What am I missing?

Benedikt

>
> Gary
>
> > Benedikt
> >
> > --
> > http://people.apache.org/~britter/
> > http://www.systemoutprintln.de/
> > http://twitter.com/BenediktRitter
> > http://github.com/britter
>
> --
> E-Mail: garydgreg...@gmail.com | ggreg...@apache.org
> Java Persistence with Hibernate, Second Edition <http://www.manning.com/bauer3/>
> JUnit in Action, Second Edition <http://www.manning.com/tahchiev/>
> Spring Batch in Action <http://www.manning.com/templier/>
> Blog: http://garygregory.wordpress.com
> Home: http://garygregory.com/
> Tweet! http://twitter.com/GaryGregory

--
http://people.apache.org/~britter/
http://www.systemoutprintln.de/
http://twitter.com/BenediktRitter
http://github.com/britter
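
For reference, the two wishes raised in this thread (checking that an expected subset of column names is present regardless of column order, and using an enum as the record key even when the header text contains spaces) could be sketched roughly as follows. This is only an illustration in plain Java, not Commons CSV code and not a proposal for its API: the class HeaderSubsetSketch, the Column enum, its header names, and the helper methods are all hypothetical, and the comma split in main() stands in for real CSV parsing.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class HeaderSubsetSketch {

    /** Hypothetical record keys; each constant carries the header text, spaces included. */
    public enum Column {
        CUSTOMER_ID("Customer ID"),
        ORDER_DATE("Order Date"),
        TOTAL("Total");

        final String headerName;

        Column(String headerName) {
            this.headerName = headerName;
        }
    }

    /** Maps each header name found in the file to its column position. */
    public static Map<String, Integer> indexHeaders(String[] headerRecord) {
        Map<String, Integer> index = new LinkedHashMap<>();
        for (int i = 0; i < headerRecord.length; i++) {
            index.put(headerRecord[i].trim(), i);
        }
        return index;
    }

    /** Returns the wanted header names that the file does not contain. */
    public static Set<String> missingHeaders(Map<String, Integer> index, Column[] wanted) {
        Set<String> missing = new LinkedHashSet<>();
        for (Column column : wanted) {
            if (!index.containsKey(column.headerName)) {
                missing.add(column.headerName);
            }
        }
        return missing;
    }

    /** Looks a value up by enum key, independent of the column order in the file. */
    public static String get(String[] record, Map<String, Integer> index, Column column) {
        Integer position = index.get(column.headerName);
        if (position == null) {
            throw new IllegalArgumentException("No such column: " + column.headerName);
        }
        return record[position];
    }

    public static void main(String[] args) {
        // Assumption: a naive split is enough for this sample; real CSV needs a parser.
        String[] header = "Total,Region,Customer ID,Order Date".split(",");
        String[] row = "19.99,EMEA,C-42,2013-07-31".split(",");

        Map<String, Integer> index = indexHeaders(header);

        // Fail fast if the file lacks a column the app depends on.
        Set<String> missing = missingHeaders(index, Column.values());
        if (!missing.isEmpty()) {
            throw new IllegalStateException("Missing columns: " + missing);
        }

        System.out.println(get(row, index, Column.CUSTOMER_ID)); // prints C-42
    }
}
```

The app only names the columns it cares about; extra columns in the file (Region here) are ignored, and a renamed or dropped column is reported up front instead of surfacing later as a failed lookup.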