On Tue, 2019-10-15 at 20:34 -0700, raylu wrote: > On Tue, Oct 15, 2019 at 3:35 PM Alan Hodgson <ahodg...@lists.simkin.ca > > wrote: > > My company has found the pg_trm extension to be more useful for > > partial text searches than the full text functions. I don't know > > specifically how it might help with your hyphens but it would be > > worth testing. The docs actually suggest using them in conjunction > > in some cases. > > We actually do use pg_trgm already for the names/titles of > things.Indexing the content with a trigram index and then > doingLOWER(content) LIKE '%789-xyz%' would certainly work, but1. we'd > have to do a little bit of finagling if we wanted to match onword > boundaries (don't match '6789-xyza' in the above example)2. trigram > indexes are pretty huge for long documents, which is why wecurrently > only use them for names/titles > We may give up and just use pg_trgm for contents if nothing else > worksout but it feels like the text search lexer is _so_ close to what > wewant.
Maybe you could have a trigger pull out those specific hypenated references into a separate column when the document is added or updated, and store/index those separately?