Thank you for the response, Paul. I will integrate and try these suggestions in the Groovy code that runs in my NiFi ExecuteScript processor. I'll be working on this again tonight.

The map topValuesMap is intended to capture this: for each key identified in the json, cross-tabulate how many times each value associated with that key occurs. After the json is fully processed for a key, sort the resulting map by count and retain only the top ten values found in the json. If a set has a lastName key, the topValuesMap that results might look something like this after all keys have been cross-tabulated:

["lastName": ["Smith" : 1023, "Jones" : 976, "Chang": 899, "Doe": 511, ...],
 "address.street": [.....],
 . . .
 "a final key": [.....] ]

Each key would have ten values in its value map, unless it cross-tabulates to fewer than ten values in total, in which case the map is sorted by count and all values are kept.

Again, many thanks.
Jim

On Fri, May 12, 2023 at 2:19 AM Paul King <pa...@asert.com.au> wrote:

> I am not 100% sure what you are trying to capture in topValuesMap but for
> tallyMap you probably want something like:
>
> def tallyMap = [:].withDefault{ 0 }
> def tally
> tally = { Map json, String prefix ->
>     json.each { k, v ->
>         if (v instanceof List) {
>             tallyMap[prefix + k] += 1
>             v.each{ tally(it, "$prefix${k}.") }
>         } else if (v?.toString().trim()) {
>             tallyMap[prefix + k] += 1
>             if (v instanceof Map) tally(v, "$prefix${k}.")
>         }
>     }
> }
>
> def root = new JsonSlurper().parse(inputStream)
> def initialPrefix = ''
> tally(root, initialPrefix)
> println tallyMap
>
> Output:
> [name:1, age:1, address:1, address.street:1, address.city:1,
> address.state:1, address.zip:1, phoneNumbers:1, phoneNumbers.type:2,
> phoneNumbers.number:2]
>
> On Fri, May 12, 2023 at 10:38 AM James McMahon <jsmcmah...@gmail.com>
> wrote:
>
>> I have this incoming json:
>>
>> {
>>   "name": "John Doe",
>>   "age": 42,
>>   "address": {
>>     "street": "123 Main St",
>>     "city": "Anytown",
>>     "state": "CA",
>>     "zip": "12345"
>>   },
>>   "phoneNumbers": [
>>     { "type": "home", "number": "555-1234" },
>>     { "type": "work", "number": "555-5678" }
>>   ]
>> }
>>
>> I wish to tally all the keys in this json in a map that gives me the key
>> name as its key, and a count of the number of times the key occurs in the
>> json as its value. For this example, the keys I expect in my output should
>> include name, age, address, address.street, address.city, address.state,
>> address.zip, phoneNumbers, phoneNumbers.type, and phoneNumbers.number.
>> But I do not get that. Instead, I get this for the list of fields:
>>
>> triage.json.fields
>> name,age,address,phoneNumbers
>>
>> And I get this for my tally count by key:
>>
>> triage.json.tallyMap
>> [name:1, age:1, address:1, phoneNumbers:1]
>>
>> I am close, but not quite there. I don't capture all the keys. Here is my
>> code. How must I modify this to get the result I require?
>>
>> import groovy.json.JsonSlurper
>> import org.apache.commons.io.IOUtils
>> import java.nio.charset.StandardCharsets
>>
>> def keys = []
>> def tallyMap = [:]
>> def topValuesMap = [:]
>>
>> def ff = session.get()
>> if (!ff) return
>>
>> try {
>>     session.read(ff, { inputStream ->
>>         def json = new JsonSlurper().parseText(IOUtils.toString(inputStream, StandardCharsets.UTF_8))
>>         json.each { k, v ->
>>             if (v != null && !v.toString().trim().isEmpty()) {
>>                 tallyMap[k] = tallyMap.containsKey(k) ? tallyMap[k] + 1 : 1
>>                 if (topValuesMap.containsKey(k)) {
>>                     def valuesMap = topValuesMap[k]
>>                     valuesMap[v] = valuesMap.containsKey(v) ? valuesMap[v] + 1 : 1
>>                     topValuesMap[k] = valuesMap
>>                 } else {
>>                     topValuesMap[k] = [:].withDefault{ 0 }.plus([v: 1])
>>                 }
>>             }
>>         }
>>     } as InputStreamCallback)
>>     keys = tallyMap.keySet().toList()
>>     def tallyMapString = tallyMap.collectEntries { k, v -> [(k): v] }.toString()
>>     def topValuesMapString = topValuesMap.collectEntries { k, v -> [(k): v.sort{ -it.value }.take(10)] }.toString()
>>     ff = session.putAttribute(ff, 'triage.json.fields', keys.join(","))
>>     ff = session.putAttribute(ff, 'triage.json.tallyMap', tallyMapString)
>>     ff = session.putAttribute(ff, 'triage.json.topValuesMap', topValuesMapString)
>>     session.transfer(ff, REL_SUCCESS)
>> } catch (Exception e) {
>>     log.error('Error processing json fields', e)
>>     session.transfer(ff, REL_FAILURE)
>> }
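
For the topValuesMap described at the top of this message, the following is a minimal sketch of one possible approach, not a tested NiFi script. It reuses the recursive-closure idea from the quoted tally code and runs standalone on the sample json from this thread. The names valueCounts and crossTab are hypothetical, and it assumes that only scalar (leaf) values are cross-tabulated, so container keys such as address and phoneNumbers get no entry of their own, and scalars sitting directly inside a list are ignored.

```groovy
import groovy.json.JsonSlurper

// Sample input copied from the json quoted in this thread.
def text = '''{
  "name": "John Doe", "age": 42,
  "address": { "street": "123 Main St", "city": "Anytown", "state": "CA", "zip": "12345" },
  "phoneNumbers": [ { "type": "home", "number": "555-1234" },
                    { "type": "work", "number": "555-5678" } ]
}'''

// Hypothetical helper map: key path -> (value -> count).
// Inner maps and zero counts are created on first access.
def valueCounts = [:].withDefault { [:].withDefault { 0 } }

def crossTab
crossTab = { node, String prefix ->
    if (node instanceof Map) {
        node.each { k, v ->
            if (v instanceof Map || v instanceof List) {
                // Descend into containers, extending the dotted key path.
                crossTab(v, "${prefix}${k}.")
            } else if (v?.toString()?.trim()) {
                // Count each non-empty scalar value under its key path.
                valueCounts["${prefix}${k}".toString()][v.toString()] += 1
            }
        }
    } else if (node instanceof List) {
        // List elements share the key path of the list itself.
        node.each { crossTab(it, prefix) }
    }
    // Assumption: scalars directly inside a list are skipped.
}

crossTab(new JsonSlurper().parseText(text), '')

// Sort each value map by descending count and keep at most ten entries.
def topValuesMap = valueCounts.collectEntries { key, counts ->
    [(key): counts.sort { -it.value }.take(10)]
}
println topValuesMap
```

Inside ExecuteScript, the parseText call would presumably be replaced by parsing the flowfile's input stream, as in the quoted code. Keys with fewer than ten distinct values simply keep all of them, since take(10) returns whatever is available.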