Hi All,

This thread is a follow-up to https://github.com/apache/commons-csv/pull/309#issuecomment-1441456258

Bruno says: "With Pandas it automatically deduplicates the column names. Maybe that's a feature that we could have in Commons CSV too?"

What does that mean, and what does it actually do? Say I have a column A with a row 1 value of "X" and a second column A with a row 1 value of 2. What do I get when I ask for column A, row 1?

Seth says: "HeaderStrategy Interface Contains two functions: #normalizeHeaders(headings) - With given heading, output a list that fits with whatever the strategy is going for. #get(record, header) - Fetch value(s) based on given column name."

I would perhaps see two interfaces, so that lambdas might be used more simply. Maybe; this needs an example.

"I'm also wary that this may screw up existing projects that depend on allowing/disallowing duplicates. i.e. want to allow duplicates and handle things through indexes / iteration, so this didn't cause a problem for them and want to preserve header names, and so don't need the headers deduplicated."

As a first cut, whatever we do could/should maintain the existing behavior. We can change the default later by popular demand.

Gary
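
P.S. To make the "two interfaces" idea concrete, here is a rough sketch, not a proposal for the actual API. Everything in it is hypothetical: the names HeaderNormalizer, HeaderLookup, KEEP, SUFFIX_DUPLICATES and ALL_MATCHES are made up for illustration, and a record is modelled as a plain List<String> rather than a Commons CSV type. The renaming strategy assumes the Pandas-style result where a second column "A" becomes "A.1".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HeaderStrategySketch {

    /** Hypothetical: turns the raw header row into the names used for lookup. */
    @FunctionalInterface
    interface HeaderNormalizer {
        List<String> normalize(List<String> headers);
    }

    /** Hypothetical: fetches the value(s) of one record for a given header name. */
    @FunctionalInterface
    interface HeaderLookup {
        List<String> get(List<String> headers, List<String> record, String name);
    }

    /** Leaves the headers untouched, so duplicates stay duplicates. */
    static final HeaderNormalizer KEEP = headers -> headers;

    /**
     * Renames repeats with a numeric suffix: A, A becomes A, A.1.
     * Simplification: does not check whether "A.1" already exists as a header.
     */
    static final HeaderNormalizer SUFFIX_DUPLICATES = headers -> {
        Map<String, Integer> seen = new HashMap<>();
        List<String> out = new ArrayList<>();
        for (String h : headers) {
            int repeat = seen.merge(h, 1, Integer::sum) - 1; // 0 for the first occurrence
            out.add(repeat == 0 ? h : h + "." + repeat);
        }
        return out;
    };

    /** Returns every value whose header equals the name, in column order. */
    static final HeaderLookup ALL_MATCHES = (headers, record, name) -> {
        List<String> values = new ArrayList<>();
        for (int i = 0; i < headers.size() && i < record.size(); i++) {
            if (headers.get(i).equals(name)) {
                values.add(record.get(i));
            }
        }
        return values;
    };

    public static void main(String[] args) {
        List<String> headers = List.of("A", "A");
        List<String> row1 = List.of("X", "2");

        // Duplicates kept: asking for "A" yields both values.
        System.out.println(ALL_MATCHES.get(KEEP.normalize(headers), row1, "A")); // [X, 2]

        // Duplicates renamed: each column gets its own name.
        List<String> renamed = SUFFIX_DUPLICATES.normalize(headers);
        System.out.println(renamed);                                // [A, A.1]
        System.out.println(ALL_MATCHES.get(renamed, row1, "A.1"));  // [2]
    }
}
```

With two single-method interfaces, each strategy can be passed as a lambda on its own, and a caller who only wants today's behavior never has to supply a normalizer at all.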