I think the word “payload” is confusing me.

The client is sending a JSON document. That JSON document has a “content” field 
which is string-valued and is escaped (stringified) JSON. Correct?

You want to parse that JSON and treat it as additional fields to index?

So this content (fragment):

"content": 
"{\"Page\":{\"Id\":\"2ff99d1a-a21b-4391-9c47-af2865acb753\",\"Name\":\"Ronald 
McDonald House Idaho 
meals\",\"Url\":\"/blogs/st-lukes/news-and-community/2021/jan/ronald-mcdonald-house-idaho-meals\",\"Date\":\"2022-10-03T12:30:17.3388537\",\"ContentType\":\"Blog\",\"Body\":{\"Fields\":[{\"Name\":\"Heading
 Background Image\",\"Type\":\"Image\",\"Value\":\"\"},...

would add fields like Page, with Id and Name under it?
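
To make the question concrete, here is a minimal sketch of what I mean by parsing the stringified "content" value and turning it into extra fields. It is plain JavaScript and only an illustration: flattenContent and the dotted field names are my own assumptions, not anything Solr provides, and the request is a shortened version of the one in this thread.

```javascript
// Hypothetical helper, not part of Solr: parse the stringified "content"
// value and flatten the nested object into dotted field names.
function flattenContent(contentString) {
  var parsed = JSON.parse(contentString); // throws SyntaxError if not valid JSON
  var fields = {};
  function walk(value, prefix) {
    if (value !== null && typeof value === "object") {
      // Objects and arrays: recurse, extending the dotted path.
      Object.keys(value).forEach(function (key) {
        walk(value[key], prefix ? prefix + "." + key : key);
      });
    } else {
      fields[prefix] = value; // leaf: string, number, boolean, or null
    }
  }
  walk(parsed, "");
  return fields;
}

// A shortened version of the request in this thread. In JavaScript source,
// \\" produces the \" that appears on the wire.
var request = JSON.parse(
  '{"doc_id":"2ff99d1a-a21b-4391-9c47-af2865acb753",' +
  '"content":"{\\"Page\\":{\\"Id\\":\\"2ff99d1a-a21b-4391-9c47-af2865acb753\\",' +
  '\\"ContentType\\":\\"Blog\\"}}"}'
);
var fields = flattenContent(request.content);
console.log(Object.keys(fields)); // the dotted names, e.g. Page.Id
```

If that is the goal, the same idea might work inside a script update processor by reading the content value from the SolrInputDocument and setting one field per flattened name, but that only helps once Solr has accepted the outer document as valid JSON.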

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Nov 28, 2022, at 1:31 PM, Matthew Castrigno <castr...@slhs.org> wrote:
> 
> Hello Walter,
> 
> Thank you for your reply. Yes, it is invalid JSON. However, it is "my" 
> problem unfortunately. 
> 
> I am looking for a way to filter the payload as a character string.
> 
> The charFilter would be great; however, for that to work the content would 
> first have to be recognized as a valid field. 
> 
> Is there a way in SOLR to process the entire payload in this way so I can 
> turn it into proper JSON by filtering out the "\" characters?
> 
> Thank you.
> From: Walter Underwood <wun...@wunderwood.org>
> Sent: Monday, November 28, 2022 2:23 PM
> To: users@solr.apache.org
> Subject: Re: Is there a way to run the entire payload of a request through a 
> charFilter and not just the fields?
>  
> This Message Is From an External Sender 
> This message came from outside the St. Luke's email system. 
> That is invalid JSON. The client needs to fix it. I’m surprised it indexes at 
> all. This should not be your problem.
> 
> Paste that string into this: 
> https://jsonlint.com
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
> 
> > On Nov 28, 2022, at 12:57 PM, Matthew Castrigno <castr...@slhs.org> wrote:
> > 
> > Hello Mikhail,
> > 
> > I have to work with the payload as is; I cannot modify it. My entire 
> > solution has a lot of other things going on which would just confuse the 
> > discussion.
> > 
> > The issue I am having can be recreated using the update handler with the 
> > script enabled (as shown in the documentation example)  and json.command 
> > set to false.
> > 
> > Solr does not recognize a field with escape characters "\".
> > 
> > Here is a much smaller payload that demonstrates the issue:
> > 
> > from the script:
> >  doc = cmd.solrDoc;  // org.apache.solr.common.SolrInputDocument
> >  logger.warn(doc.toString());
> > 
> > sending this payload:
> > 
> > {"partner":"88027688-62c4-459a-b4d5-a8ecf9edd1bf","command":"add","doc_id":"2ff99d1a-a21b-4391-9c47-af2865acb753","content":"Page\"}
> > 
> > results in this output in the console. Notice that the "content" field is 
> > not listed: Solr cannot parse this part of the payload and simply ignores it.
> > 
> > SolrInputDocument(fields: 
> > [partner=88027688-62c4-459a-b4d5-a8ecf9edd1bf, command=add, 
> > doc_id=2ff99d1a-a21b-4391-9c47-af2865acb753])
> > 
> > I am trying to find a way to filter out these escape characters so Solr, 
> > specifically org.apache.solr.common.SolrInputDocument, will recognize the 
> > fields that have them.
> > 
> > Thank you.
> > ________________________________
> > From: Mikhail Khludnev <m...@apache.org>
> > Sent: Monday, November 28, 2022 1:07 PM
> > To: users@solr.apache.org
> > Subject: Re: Is there a way to run the entire payload of a request through 
> > a charFilter and not just the fields?
> > 
> > Hello,
> > It's still not clear. Which update request params (or curl command) do you
> > use? What if you put content as a tiny string, and then complicate it step
> > by step?
> > 
> > On Mon, Nov 28, 2022 at 7:27 PM Matthew Castrigno <castr...@slhs.org> wrote:
> > 
> >> Hi Mikhail,
> >> 
> >> Thank you for your response. I am currently using the script update
> >> processor, but I have not been able to access the entire payload for
> >> processing. cmd.solrDoc is not correctly reading the payload. I have a
> >> payload where it is not recognizing a field value. This payload has four
> >> fields, the last of which is "content", but if I do this:
> >>  doc = cmd.solrDoc;
> >>  logger.warn(doc.toString());
> >> The content field is not shown.
> >> 
> >> I want to filter that field and remove the quotes, so it is recognized as
> >> additional JSON for me to process.
> >> 
> >> logger output:
> >> SolrInputDocument(fields:
> >> [partner=88027688-62c4-459a-b4d5-a8ecf9edd1bf, command=add,
> >> doc_id=2ff99d1a-a21b-4391-9c47-af2865acb753])
> >> Here is the payload:
> >> {
> >>  "partner": "88027688-62c4-459a-b4d5-a8ecf9edd1bf",
> >>  "command": "add",
> >>  "doc_id": "2ff99d1a-a21b-4391-9c47-af2865acb753",
> >>  "content":
> >> "{\"Page\":{\"Id\":\"2ff99d1a-a21b-4391-9c47-af2865acb753\",\"Name\":\"Ronald
> >> McDonald House Idaho
> >> meals\",\"Url\":\"/blogs/st-lukes/news-and-community/2021/jan/ronald-mcdonald-house-idaho-meals\",\"Date\":\"2022-10-03T12:30:17.3388537\",\"ContentType\":\"Blog\",\"Body\":{\"Fields\":[{\"Name\":\"Heading
> >> Background Image\",\"Type\":\"Image\",\"Value\":\"\"},{\"Name\":\"Tile Wide
> >> Image\",\"Type\":\"Image\",\"Value\":\"\"},{\"Name\":\"Specialties\",\"Type\":\"Treelist\",\"Value\":\"\"},{\"Name\":\"Blog
> >> Post Name\",\"Type\":\"Single-Line Text\",\"Value\":\"Ronald McDonald
> >> House, St. Luke’s Children’s find new ways to help
> >> families\"},{\"Name\":\"Blog Summary\",\"Type\":\"Rich
> >> Text\",\"Value\":\"\"},{\"Name\":\"Share Summary\",\"Type\":\"Multi-Line
> >> Text\",\"Value\":\"\"},{\"Name\":\"ShareTitle\",\"Type\":\"Single-Line
> >> Text\",\"Value\":\"Ronald McDonald House, St. Luke’s Children’s find new
> >> ways to help families\"},{\"Name\":\"Blog Post
> >> Date\",\"Type\":\"Datetime\",\"Value\":\"2021-01-18T10:10:00Z\"},{\"Name\":\"Heading\",\"Type\":\"Single-Line
> >> Text\",\"Value\":\"Better Together\"},{\"Name\":\"Rss
> >> Link\",\"Type\":\"General
> >> Link\",\"Value\":\"\"},{\"Name\":\"Providers\",\"Type\":\"Treelist\",\"Value\":\"\"},{\"Name\":\"Main
> >> Blog Image Caption\",\"Type\":\"Single-Line
> >> Text\",\"Value\":\"\"},{\"Name\":\"Procedures and
> >> Treatments\",\"Type\":\"Multilist\",\"Value\":\"\"},{\"Name\":\"Special
> >> Services\",\"Type\":\"Treelist\",\"Value\":\"\"},{\"Name\":\"Page
> >> Title\",\"Type\":\"Single-Line Text\",\"Value\":\"Ronald McDonald House,
> >> St. Luke’s Children’s finding new ways to help
> >> families\"},{\"Name\":\"ShareImage\",\"Type\":\"Image\",\"Value\":\"\"},{\"Name\":\"Share
> >> Image\",\"Type\":\"Image\",\"Value\":\"\"},{\"Name\":\"Channels\",\"Type\":\"Multilist\",\"Value\":\"Better
> >> Together\"},{\"Name\":\"Blog
> >> Tags\",\"Type\":\"Treelist\",\"Value\":\"\"},{\"Name\":\"TileHeadline\",\"Type\":\"Single-Line
> >> Text\",\"Value\":\"Ronald McDonald House, St. Luke’s Children’s find new
> >> ways to help families\"},{\"Name\":\"Include in
> >> Sitemap\",\"Type\":\"Checkbox\",\"Value\":\"1\"},{\"Name\":\"Facilities\",\"Type\":\"Treelist\",\"Value\":\"\"},{\"Name\":\"TileImage\",\"Type\":\"Image\",\"Value\":\"\"},{\"Name\":\"Tile
> >> Category\",\"Type\":\"Droptree\",\"Value\":\"Blog
> >> Post\"},{\"Name\":\"BreadcrumbTitle\",\"Type\":\"Single-Line
> >> Text\",\"Value\":\"Ronald McDonald House, St. Luke’s Children’s find new
> >> ways to help
> >> families\"},{\"Name\":\"Author\",\"Type\":\"Droplink\",\"Value\":\"{E9CF1FC9-EF41-4B6F-9D78-F206A5997A84}\"},{\"Name\":\"Restricted
> >> To Pages\",\"Type\":\"Treelist\",\"Value\":\"\"},{\"Name\":\"Meta
> >> Keywords\",\"Type\":\"Single-Line
> >> Text\",\"Value\":\"\"},{\"Name\":\"Associated Content
> >> Type\",\"Type\":\"Droptree\",\"Value\":\"Blog Post\"},{\"Name\":\"Main Blog
> >> Image\",\"Type\":\"Image\",\"Value\":\"\"},{\"Name\":\"TileSummary\",\"Type\":\"Rich
> >> Text\",\"Value\":\"\"},{\"Name\":\"Meta Description\",\"Type\":\"Multi-Line
> >> Text\",\"Value\":\"\"},{\"Name\":\"Health
> >> Topics\",\"Type\":\"Multilist\",\"Value\":\"\"},{\"Name\":\"Icon\",\"Type\":\"Image\",\"Value\":\"\"},{\"Name\":\"NavigationTitle\",\"Type\":\"Single-Line
> >> Text\",\"Value\":\"St. Luke’s Blogs\"},{\"Name\":\"Include in Search
> >> Index\",\"Type\":\"Checkbox\",\"Value\":\"1\"},{\"Name\":\"Conditions\",\"Type\":\"Treelist\",\"Value\":\"\"},{\"Name\":\"Heading
> >> Sub Text\",\"Type\":\"Single-Line Text\",\"Value\":\"Highlights from St.
> >> Luke’s and our community partners to improve
> >> health.\"},{\"Name\":\"typeaheadRollupCat\",\"Type\":\"Single-Line
> >> Text\",\"Value\":\"\"},{\"Name\":\"BlogPostYear\",\"Type\":\"Single-Line
> >> Text\",\"Value\":\"2021\"},{\"Name\":\"AuthorName\",\"Type\":\"Single-Line
> >> Text\",\"Value\":\"Anna
> >> Fritz\"},{\"Name\":\"BlogCategory\",\"Type\":\"Single-Line
> >> Text\",\"Value\":\"News and
> >> Community\"}],\"Modules\":{\"Fields\":[{\"Name\":\"Content\",\"Type\":\"Rich
> >> Text\",\"Value\":\"\"},{\"Name\":\"Image
> >> Position\",\"Type\":\"Droptree\",\"Value\":\"Right\"},{\"Name\":\"Image
> >> Source\",\"Type\":\"Image\",\"Value\":\"\"},{\"Name\":\"Image
> >> Content\",\"Type\":\"Rich Text\",\"Value\":\"<p>For more than three
> >> decades, Ronald McDonald House Charities of Idaho has provided housing to
> >> families with children seeking medical care.</p>\\n<p>It also has found new
> >> ways to help during the COVID-19 era. </p>\\n<p>When the novel coronavirus
> >> gained a foothold in Idaho in March 2020, the organization had to put
> >> safety first and made the tough decision to temporarily stop accepting new
> >> families not already staying at its new Boise house. Instead, the
> >> organization paid for hotel rooms for families it could not
> >> accommodate.&nbsp;</p>\\n<p>“When everything happened, because we had to
> >> pull back services, we were trying to look for other ways we could help
> >> families,” said Taylor Munson, communications manager at Ronald McDonald
> >> House Charities of Idaho. </p>\\n<p>“They are obviously already in a
> >> stressful situation with a sick child, but the pandemic amplified that
> >> because there is even more unknown now.”</p>\\n<p>So, how could their staff
> >> keep serving families with kids in need? </p>\\n<p>The team at RMHCI
> >> decided to start assembling lunch boxes filled with meals for families with
> >> kids at St. Luke’s Children’s Hospital. </p>\\n<p>Since March, the staff
> >> has provided 4,770 meals to families at St. Luke’s.</p>\\n<p>“The lunches
> >> provided by the Ronald McDonald House have been a true blessing for our
> >> families in pediatrics, the pediatric ICU and the newborn ICU,” said Sherry
> >> Iverson, director of patient and family services at St. Luke’s
> >> Children’s.</p>\\n<p>“Being at the bedside of their children of all ages is
> >> top priority for parents and remembering to take care of themselves is
> >> easily forgotten. These lunches carefully assembled by the Ronald McDonald
> >> team and then delivered to their room provide a break, healthy food and a
> >> chance to reenergize during a very stressful time.” </p>\\n<p>An additional
> >> 920 meals have been provided by RMHCI to families with children receiving
> >> care at Saint Alphonsus Health System. </p>\\n<p>The Ronald McDonald staff
> >> provides the lunch boxes four days a week, typically including sandwiches,
> >> fruit and chips, as well as snack bags. The total cost of the lunches so
> >> far has been about $24,000.</p>\\n<p>“Without these wonderful care
> >> packages, many parents would go all day without food,” Iverson said. “This
> >> partnership has been so important during this COVID
> >> pandemic.”</p>\\n<p>Some of the food items are donated from local
> >> organizations, while others are purchased by staff and assembled in the
> >> kitchen at the new Ronald McDonald House facility, near the St. Luke’s
> >> Boise Medical Center. </p>\\n<p>St. Luke’s employees pick up the meals and
> >> take them across the street to the children’s hospital.</p>\\n<p>“The
> >> feedback that we’ve gotten from families and nurses and people over at St.
> >> Luke’s is that it’s so helpful because families either may not have money
> >> to get food or they don’t want to leave their child’s bedside,” Munson
> >> said.</p>\\n<p>The Ronald McDonald House started accepting new families
> >> again at its facility in May 2020. St. Luke’s Children’s Hospital is the
> >> only children’s hospital in Idaho, which has led to a strong partnership
> >> between the medical center and nearby Ronald McDonald House. </p>\\n<p>“It
> >> has been very collaborative with St. Luke’s. We wanted to make sure what we
> >> were going to be doing was beneficial,” Munson said. </p>\\n<p>The program
> >> will continue through the end of March, marking one full year of providing
> >> meals, and then the Ronald McDonald House staff will reevaluate for short-
> >> and long-term plans, Munson said. </p>\\n<p>“The pandemic obviously isn’t
> >> ideal, but it did allow us find new ways of helping families,” Munson said.
> >> “A lot of our focus is family centered care—that’s really our goal, and
> >> feeding families is a big part of
> >> that.”</p>\\n<p><br>\\n</p>\"},{\"Name\":\"Channel\",\"Type\":\"Droptree\",\"Value\":\"Better
> >> Together\"},{\"Name\":\"Heading\",\"Type\":\"Single-Line
> >> Text\",\"Value\":\"Better Together\"},{\"Name\":\"Number to
> >> Display\",\"Type\":\"Integer\",\"Value\":\"4\"},{\"Name\":\"Related
> >> Item\",\"Type\":\"Droptree\",\"Value\":\"St Lukes Childrens
> >> Hospital\"},{\"Name\":\"Heading\",\"Type\":\"Single-Line
> >> Text\",\"Value\":\"Related
> >> Hospital\"}]}},\"Facets\":[\"Blogs\",\"Article\"],\"Title\":\"Ronald
> >> McDonald House, St. Luke’s Children’s find new ways to help
> >> families\",\"Summary\":\"\"}}"
> >> }
> >> 
> >> ________________________________
> >> From: Mikhail Khludnev <m...@apache.org>
> >> Sent: Saturday, November 26, 2022 1:28 PM
> >> To: users@solr.apache.org
> >> Subject: Re: Is there a way to run the entire payload of a request through
> >> a charFilter and not just the fields?
> >> 
> >> Hi Matthew.
> >> Can it be
> >> https://solr.apache.org/guide/solr/latest/configuration-guide/script-update-processor.html ?
> >> 
> >> On Sat, Nov 26, 2022 at 1:15 AM Matthew Castrigno <castr...@slhs.org> wrote:
> >> 
> >>> I need to filter out some characters in a payload so that SOLR will
> >>> recognize the payload as a JSON document.
> >>> 
> >>> The solr.MappingCharFilterFactory functionality is what I need, but I need
> >>> to run it over the entire payload and not just the fields.
> >>> 
> >>> I cannot change the payload prior to submitting to SOLR.
> >>> 
> >>> Is there any way to accomplish this?
> >>> 
> >>> Any insights are most appreciated.
> >>> 
> >>> Thank you.
> >>> 
> >>> ----------------------------------------------------------------------
> >>> "This message is intended for the use of the person or entity to which it
> >>> is addressed and may contain information that is confidential or
> >>> privileged, the disclosure of which is governed by applicable law. If the
> >>> reader of this message is not the intended recipient, you are hereby
> >>> notified that any dissemination, distribution, or copying of this
> >>> information is strictly prohibited. If you have received this message by
> >>> error, please notify us immediately and destroy the related message."
> >>> 
> >> 
> >> 
> >> --
> >> Sincerely yours
> >> Mikhail Khludnev
> >> 
> >> 
> >> 
> > 
> > 
> > --
> > Sincerely yours
> > Mikhail Khludnev
> > 
> > 
