[ https://issues.apache.org/jira/browse/SOLR-16812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17745047#comment-17745047 ]
Mark Robert Miller commented on SOLR-16812:
-------------------------------------------

If you took these benchmarks and data points for this case into a serious room, you’d be ignored or kicked out; it’s worth noting, since I don’t see any comment acknowledging awareness of how misleading a cherry-pick they are. Not that JSON would complain; it comes off looking relatively fabulous.

It’s never really the work itself that scares you, whether terrible or excellent; it’s the lack of communication indicating that one has some grasp of what there is to be done and what they have done. You lay out those cards, and even the most rushed code or worst implementation becomes palatable.

It’s when you just say: I’ve benchmarked this thing, here is a little data point, here is a big one (two synthetic benchmarks that I’m happy to accept as a given, for the sake of argument, as being of the type the JMH team themselves would gold-star in review), conclusion therefore ABC. You can find the little code here and the big code there and try it yourself if you need. Case closed, ship. Now that’s scary.

Wait, what? No mention at all of a realistic gauge of what should be done here and what was? Even just the mention, and it’s like all the work saved but still the ghosts die: “OK, at least they know what they are doing. I may not agree with it, but they know what they are doing.”

You could just go look at how someone involved in a real binary protocol project would approach even a minimal comparison around performance, and this comparison would look like 99% of those Elastic vs. Solr shootouts where a superfan pits a Formula One car against a NASCAR and does some superfan mechanical tweaking to make a “fair” comparison. “We loaded each one up, we pulled back hard, let her rip, and you won’t believe which one defaults to a straight-up query cache.
The winner, of course.”

If you took this as a PR to anyone involved in any of these protocols, you would get back a Hossman level of bullet points and a professor’s level of projected content.

I don’t need benchmarks reasonable to the change to be on board with Solr getting out of the binary protocol business, though. You could tell me the move is to something you properly benchmarked as slower and I’d still be +1 on CBOR, CaptainJackProto Hack, or anything you named.

> Support CBOR format for update/query
> ------------------------------------
>
>                 Key: SOLR-16812
>                 URL: https://issues.apache.org/jira/browse/SOLR-16812
>             Project: Solr
>          Issue Type: Task
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>            Priority: Major
>             Fix For: 9.3
>
>          Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Javabin is quite efficient and fast. But non-Java users have to use JSON
> exclusively.
>
> [CBOR |http://example.com/] is a widely used format that is supported by most
> languages.
>
> Here is a benchmark of updating using CBOR vs. JSON with our films.json
> {code:java}
> Payload Size (bytes)
> ====================
> json   : 633600
> cbor   : 210439
> javabin: 234520
>
> time taken to index
> ====================
> json   : 330ms
> javabin: 216ms
> cbor   : 200ms
>
> time taken to query *:* 1100 docs
> ==================================
> json   : 85ms
> javabin: 64ms
> cbor   : 53ms
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)