Re: log4j zero day exploit

2021-12-11 Thread Tim Casey
The vulnerability is quite nasty. If there is a user string logged in a log4j line, then you are vulnerable. I would suspect everyone would need to at least worry about it or risk becoming a bitcoin harvester. tim On Sat, Dec 11, 2021 at 2:19 PM Shawn Heisey wrote: > On 12/11/21 2:05 PM, Scot

Re: Question regarding the MoreLikeThis features

2022-03-10 Thread Tim Casey
Marco, Finding 'similar' documents will end up being weighted by document length. I would recommend, at the point of indexing, also indexing an ordered token set of the first 256, 1024 up to around 5k tokens (depending on document lengths). What this does is allow a vector to vector normalized co
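A minimal sketch of the prefix idea described in the preview above, which is cut off, so this is only one reading of it: compare documents on their first N tokens so that a very long document does not swamp the similarity score. The function names, the regex tokenizer, and the use of cosine similarity over term counts are illustrative assumptions. None of this is Solr's MoreLikeThis implementation.

```python
import math
import re
from collections import Counter


def prefix_tokens(text, limit):
    """Return the first `limit` lowercase word tokens, in document order."""
    return re.findall(r"\w+", text.lower())[:limit]


def cosine(tokens_a, tokens_b):
    """Cosine similarity between the term-count vectors of two token lists."""
    va, vb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def prefix_similarity(doc_a, doc_b, limit=256):
    """Compare two documents using only their first `limit` tokens each."""
    return cosine(prefix_tokens(doc_a, limit), prefix_tokens(doc_b, limit))
```

With a small limit, a short document and a long document that opens with the same text score as near-identical. With a large limit, the extra length of the second document pulls the score down.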

Re: Question regarding the MoreLikeThis features

2022-03-14 Thread Tim Casey
existence of a specific parameter to > restrict the corpus of documents that are analyzed for the return of > similar contents, I must admit that I have not yet figured out how to > proceed. > > Thank you very much and have a nice day, > > Marco > > -Original Message- >

Re: Solr as a dedicated data store?

2022-04-04 Thread Tim Casey
Srijan, Comments off the top of my head, so buyer beware. Almost always you want to be able to reindex your data from a 'source'. This makes things like indexes not good as a data store, or a source of truth. The reasons for this vary. Indexes age out data because there is frequently a weight t

Re: Why OR in query does not work sometimes?

2022-05-16 Thread Tim Casey
You need to look at the documents and see what is being hit and returned. If you have documents with no field, that would be the 8, maybe. If you have 3 documents with 'Phi', then that would be the 3. Once you have a positive clause, you can only remove documents with a '-', so that would be the th
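The clause behaviour described above can be sketched as a toy set model: positive clauses build the candidate set, and a '-' clause can only remove documents from it. The document layout and the `run_query` helper are hypothetical and stand in for a real index. How Solr treats a query with no positive clause at all is not modeled here.

```python
def run_query(docs, field, should=(), must_not=()):
    """Toy boolean query over `docs`, a dict of id -> {field: set of terms}.

    `should` terms are OR'ed together to build the hit set;
    `must_not` terms remove documents from that hit set.
    """
    should, must_not = set(should), set(must_not)
    # Documents that lack the field have no terms, so they match nothing.
    hits = {i for i, d in docs.items() if d.get(field, set()) & should}
    excluded = {i for i, d in docs.items() if d.get(field, set()) & must_not}
    return hits - excluded


docs = {
    1: {"name": {"phi"}},
    2: {"name": {"phi", "beta"}},
    3: {"name": {"gamma"}},
    4: {},  # no "name" field at all
}
print(run_query(docs, "name", should=["phi", "gamma"]))
print(run_query(docs, "name", should=["phi", "gamma"], must_not=["beta"]))
```

Comparing the hit sets clause by clause in this way is a quick check on which documents each part of the query contributes or removes.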

Re: DNS Lookups during requests?

2023-03-22 Thread Tim Casey
There are some settings which are common to apply to the JVM. You might try those first. If you do a search for JVM java DNS settings it will come up. On Wed, Mar 22, 2023 at 7:01 AM Tim Funk wrote: > I ran into an interesting situation today with respect to latency due to > failed DNS lookups.

Re: DNS Lookups during requests?

2023-03-22 Thread Tim Casey
and they didn't seem to have any effect > during the triage. > > -Tim > > On Wed, Mar 22, 2023 at 12:44 PM Tim Casey wrote: > > > There are some settings which are common to apply to the JVM. You might > > try those first. > > If you do a search for JVM

Re: Compound words in English

2023-08-15 Thread Tim Casey
Index all bigrams. If you use a dictionary, there is a lot of work to maintain it. Also, this does not translate well to other languages. The downside to this is having partial token hits which decrease precision. But, usually people who are looking for "well being" or "wellbeing" will not e
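One reading of the advice above, assuming it refers to token bigrams (adjacent word pairs): index each word plus each adjacent pair joined without a space, so that "well being" and "wellbeing" produce a shared token without any dictionary. The helper names and the regex tokenizer are illustrative assumptions, not a Solr analyzer.

```python
import re


def index_tokens(text):
    """Return single words plus adjacent word pairs joined without a space."""
    words = re.findall(r"\w+", text.lower())
    joined = [a + b for a, b in zip(words, words[1:])]
    return set(words) | set(joined)


def matches(query, document):
    """True if the query and the document share at least one indexed token."""
    return bool(index_tokens(query) & index_tokens(document))


print(matches("wellbeing", "a sense of well being"))
print(matches("well being", "employee wellbeing program"))
print(matches("well being", "the well is dry"))
```

The last call also matches, on the single token "well". That is the partial-hit cost to precision mentioned in the message.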