You could look into NLP (natural language processing) modules on CPAN: https://metacpan.org/search?q=nlp
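
Before reaching for a full NLP toolkit, a cheap heuristic may already separate most lines. Below is a minimal sketch in Python. It is not from this thread: the function name `classify_line`, the four labels, and the 8-word heading cutoff are all assumptions chosen for illustration. It only looks at capitalisation and end punctuation, so it is not syntax parsing and will misclassify plenty of real text.

```python
# Hypothetical heuristic, not a parser: labels and thresholds are assumptions.
TERMINATORS = (".", "!", "?")        # punctuation that usually ends a sentence
CLOSERS = "\"')]\u201d\u2019"        # quotes/brackets that may follow the terminator
MAX_HEADING_WORDS = 8                # assumed cutoff for "short enough to be a heading"


def classify_line(line: str) -> str:
    """Return 'empty', 'sentence', 'heading', or 'fragment' for one line of text."""
    text = line.strip()
    if not text:
        return "empty"

    # Ignore trailing quotes/brackets when checking the final punctuation.
    core = text.rstrip(CLOSERS)
    ends_sentence = core.endswith(TERMINATORS)

    # First alphabetic character decides whether the line "starts like a sentence".
    first_letter = next((c for c in text if c.isalpha()), "")
    starts_upper = first_letter.isupper()

    if starts_upper and ends_sentence:
        return "sentence"

    # Short, capitalised, and not trailing off on a comma/semicolon/colon:
    # guess that it is a heading or title rather than a broken sentence.
    if (
        starts_upper
        and len(text.split()) <= MAX_HEADING_WORDS
        and not core.endswith((",", ";", ":"))
    ):
        return "heading"

    # Everything else is treated as a piece of a sentence.
    return "fragment"


if __name__ == "__main__":
    for sample in (
        "This is a complete sentence.",
        "Introduction to Parsing",
        "which does not have a period at the end and is not",
    ):
        print(f"{classify_line(sample):9} | {sample}")
```

If this turns out to be too crude, the same function could stay as a first pass, with a sentence splitter or part-of-speech tagger from one of the NLP modules used only for the lines it labels "fragment".
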

On Mon, 13 Sept 2021 at 21:04, Julius Hamilton <juliushamilton...@gmail.com> wrote:

> Hey,
>
> I'm not sure if this is possible, and if it's not, I'll explore a better
> way to do this.
>
> I would like to write a script which analyzes if a line of text is
> (likely) a broken natural language sentence, i.e., it is probably part of a
> sentence, even if the start or end is not present, rather than it being a
> fully "complete" linguistic entity, for example, a header of a section,
> which does not have a period at the end and is not really a sentence, yet
> is in a complete and unbroken form.
>
> I'm pretty sure in principle this will require some kind of syntax
> parsing. I think I read somewhere regular expressions for some mathematical
> reason cannot parse tree / nested structures, for example HTML.
>
> Does anyone know what some next most ubiquitous, standard tool is for
> analyzing nested linguistic structures? Is that an XML parser?
>
> Thanks very much,
> Julius