[PATCH] get rid of Paragraph::isWord (was: Re: [PATCH] make insetlatexaccents real letters)

Jean-Marc Lasgouttes Wed, 17 Nov 2004 03:10:07 -0800

>>>>> "Jean-Marc" == Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes:


Jean-Marc> OK, here is the difference between the inclusion and the
Jean-Marc> exclusion method: isWord considers the following characters
Jean-Marc> as part of a word, while isLetter does not:
Jean-Marc> "$'*<=>_`|0123456789

Jean-Marc> I think that all the non-numeric signs are just an error
Jean-Marc> and should be in IsKommaChar.

Jean-Marc> Concerning the digits, both OpenOffice and Word consider
Jean-Marc> that they are part of words as far as navigation is
Jean-Marc> concerned. However, the spellchecker skips over words
Jean-Marc> containing digits. Is this what we should do?

The following patch gets rid of Paragraph::isWord and of the various
things that are no longer needed as a result. Note that, as a
consequence, digits are now considered part of words. This is what
everybody else does, but it breaks spellchecking of words like Hello12,
which should actually be skipped.

This means that we now have a single notion of what a word is.

Comments?

JMarc

Index: src/ChangeLog
===================================================================
RCS file: /usr/local/lyx/cvsroot/lyx-devel/src/ChangeLog,v
retrieving revision 1.2039
diff -u -p -r1.2039 ChangeLog
--- src/ChangeLog	17 Nov 2004 00:54:17 -0000	1.2039
+++ src/ChangeLog	17 Nov 2004 10:57:51 -0000
@@ -1,9 +1,18 @@
+2004-11-16  Jean-Marc Lasgouttes  <[EMAIL PROTECTED]>
+
+	* paragraph.C (isLetter): remove special spellchecker-related
+	code; return true also for digits
+	(isWord, isKomma): remove
+
+	* text.C (cursorRightOneWord, cursorLeftOneWord, getWord): 
+	* lyxfind.C (MatchString()): use isLetter instead of isWord
+
 2004-11-17  Lars Gullik Bjonnes  <[EMAIL PROTECTED]>
 
 	* pariterator.h (operatir=): comment out un-implemented member
 	function. 
 
-	* paragraph.h: resolv ambiguity found by gcc 4.0 with the use of a
+	* paragraph.h: resolve ambiguity found by gcc 4.0 with the use of a
 	static cast.
 
 2004-11-17  Lars Gullik Bjonnes  <[EMAIL PROTECTED]>
Index: src/lyxfind.C
===================================================================
RCS file: /usr/local/lyx/cvsroot/lyx-devel/src/lyxfind.C,v
retrieving revision 1.86
diff -u -p -r1.86 lyxfind.C
--- src/lyxfind.C	5 Oct 2004 10:11:27 -0000	1.86
+++ src/lyxfind.C	17 Nov 2004 10:57:51 -0000
@@ -85,10 +85,10 @@ public:
 
 		// if necessary, check whether string matches word
 		if (mw) {
-			if (pos > 0 && par.isWord(pos - 1))
+			if (pos > 0 && par.isLetter(pos - 1))
 				return false;
 			if (pos + lyx::pos_type(size) < parsize
-			    && par.isWord(pos + size));
+			    && par.isLetter(pos + size))
 				return false;
 		}
 
Index: src/paragraph.C
===================================================================
RCS file: /usr/local/lyx/cvsroot/lyx-devel/src/paragraph.C,v
retrieving revision 1.390
diff -u -p -r1.390 paragraph.C
--- src/paragraph.C	15 Nov 2004 13:39:06 -0000	1.390
+++ src/paragraph.C	17 Nov 2004 10:57:51 -0000
@@ -53,7 +53,6 @@
 
 using lyx::pos_type;
 
-using lyx::support::contains;
 using lyx::support::subst;
 
 using std::distance;
@@ -1510,34 +1509,15 @@ bool Paragraph::isLineSeparator(pos_type
 }
 
 
-bool Paragraph::isKomma(pos_type pos) const
-{
-	return IsKommaChar(getChar(pos));
-}
-
-
 /// Used by the spellchecker
 bool Paragraph::isLetter(pos_type pos) const
 {
-	value_type const c = getChar(pos);
-	if (IsLetterChar(c))
-		return true;
 	if (isInset(pos))
 		return getInset(pos)->isLetter();
-	// We want to pass the ' and escape chars to ispell
-	string const extra = lyxrc.isp_esc_chars + '\'';
-	return contains(extra, c);
-}
-
-
-bool Paragraph::isWord(pos_type pos) const
-{
-	if (isInset(pos))
-		return getInset(pos)->isLetter();
-	value_type const c = getChar(pos);
-	return !(IsSeparatorChar(c)
-		  || IsKommaChar(c)
-		  || IsInsetChar(c));
+	else {
+		value_type const c = getChar(pos);
+		return IsLetterChar(c) || IsDigit(c);
+	}
 }
 
 
Index: src/paragraph.h
===================================================================
RCS file: /usr/local/lyx/cvsroot/lyx-devel/src/paragraph.h,v
retrieving revision 1.141
diff -u -p -r1.141 paragraph.h
--- src/paragraph.h	17 Nov 2004 00:54:18 -0000	1.141
+++ src/paragraph.h	17 Nov 2004 10:57:51 -0000
@@ -335,12 +335,8 @@ public:
 	bool isSeparator(lyx::pos_type pos) const;
 	///
 	bool isLineSeparator(lyx::pos_type pos) const;
-	///
-	bool isKomma(lyx::pos_type pos) const;
-	/// Used by the spellchecker
+	/// True if the character/inset at this point can be part of a word
 	bool isLetter(lyx::pos_type pos) const;
-	///
-	bool isWord(lyx::pos_type pos) const;
 
 	/// returns -1 if inset not found
 	int getPositionOfInset(InsetBase const * inset) const;
Index: src/text.C
===================================================================
RCS file: /usr/local/lyx/cvsroot/lyx-devel/src/text.C,v
retrieving revision 1.585
diff -u -p -r1.585 text.C
--- src/text.C	11 Nov 2004 08:12:10 -0000	1.585
+++ src/text.C	17 Nov 2004 10:57:52 -0000
@@ -1355,10 +1355,10 @@ void LyXText::cursorRightOneWord(LCursor
 	} else {
 		// Skip through initial nonword stuff.
 		// Treat floats and insets as words.
-		while (cur.pos() != cur.lastpos() && !cur.paragraph().isWord(cur.pos()))
+		while (cur.pos() != cur.lastpos() && !cur.paragraph().isLetter(cur.pos()))
 			++cur.pos();
 		// Advance through word.
-		while (cur.pos() != cur.lastpos() && cur.paragraph().isWord(cur.pos()))
+		while (cur.pos() != cur.lastpos() && cur.paragraph().isLetter(cur.pos()))
 			++cur.pos();
 	}
 	setCursor(cur, cur.par(), cur.pos());
@@ -1374,10 +1374,10 @@ void LyXText::cursorLeftOneWord(LCursor 
 	} else {
 		// Skip through initial nonword stuff.
 		// Treat floats and insets as words.
-		while (cur.pos() != 0 && !cur.paragraph().isWord(cur.pos() - 1))
+		while (cur.pos() != 0 && !cur.paragraph().isLetter(cur.pos() - 1))
 			--cur.pos();
 		// Advance through word.
-		while (cur.pos() != 0 && cur.paragraph().isWord(cur.pos() - 1))
+		while (cur.pos() != 0 && cur.paragraph().isLetter(cur.pos() - 1))
 			--cur.pos();
 	}
 	setCursor(cur, cur.par(), cur.pos());
@@ -1797,8 +1797,8 @@ void LyXText::getWord(CursorSlice & from
 	switch (loc) {
 	case lyx::WHOLE_WORD_STRICT:
 		if (from.pos() == 0 || from.pos() == from_par.size()
-		    || !from_par.isWord(from.pos())
-		    || !from_par.isWord(from.pos() - 1)) {
+		    || !from_par.isLetter(from.pos())
+		    || !from_par.isLetter(from.pos() - 1)) {
 			to = from;
 			return;
 		}
@@ -1806,13 +1806,13 @@ void LyXText::getWord(CursorSlice & from
 
 	case lyx::WHOLE_WORD:
 		// If we are already at the beginning of a word, do nothing
-		if (!from.pos() || !from_par.isWord(from.pos() - 1))
+		if (!from.pos() || !from_par.isLetter(from.pos() - 1))
 			break;
 		// no break here, we go to the next
 
 	case lyx::PREVIOUS_WORD:
 		// always move the cursor to the beginning of previous word
-		while (from.pos() && from_par.isWord(from.pos() - 1))
+		while (from.pos() && from_par.isLetter(from.pos() - 1))
 			--from.pos();
 		break;
 	case lyx::NEXT_WORD:
@@ -1825,7 +1825,7 @@ void LyXText::getWord(CursorSlice & from
 	}
 	to = from;
 	Paragraph & to_par = pars_[to.par()];
-	while (to.pos() < to_par.size() && to_par.isWord(to.pos()))
+	while (to.pos() < to_par.size() && to_par.isLetter(to.pos()))
 		++to.pos();
 }
 
Index: src/frontends/controllers/ControlSpellchecker.C
===================================================================
RCS file: /usr/local/lyx/cvsroot/lyx-devel/src/frontends/controllers/ControlSpellchecker.C,v
retrieving revision 1.72
diff -u -p -r1.72 ControlSpellchecker.C
--- src/frontends/controllers/ControlSpellchecker.C	17 Sep 2004 16:28:47 -0000	1.72
+++ src/frontends/controllers/ControlSpellchecker.C	17 Nov 2004 10:57:52 -0000
@@ -44,6 +44,7 @@ using std::string;
 namespace lyx {
 
 using support::bformat;
+using support::contains;
 
 namespace frontend {
 
@@ -121,7 +122,10 @@ bool isLetter(DocIterator const & cur)
 	return cur.inTexted()
 		&& cur.inset().allowSpellCheck()
 		&& cur.pos() != cur.lastpos()
-		&& cur.paragraph().isLetter(cur.pos())
+		&& (cur.paragraph().isLetter(cur.pos()) 
+		    // We want to pass the ' and escape chars to ispell
+		    || contains(lyxrc.isp_esc_chars + '\'', 
+				cur.paragraph().getChar(cur.pos())))
 		&& !isDeletedText(cur.paragraph(), cur.pos());
 }
 
Index: src/insets/ChangeLog
===================================================================
RCS file: /usr/local/lyx/cvsroot/lyx-devel/src/insets/ChangeLog,v
retrieving revision 1.1075
diff -u -p -r1.1075 ChangeLog
--- src/insets/ChangeLog	15 Nov 2004 13:35:49 -0000	1.1075
+++ src/insets/ChangeLog	17 Nov 2004 10:57:52 -0000
@@ -1,3 +1,7 @@
+2004-11-16  Jean-Marc Lasgouttes  <[EMAIL PROTECTED]>
+
+	* insetspace.C (isLetter): remove (same as default)
+
 2004-11-10  Jean-Marc Lasgouttes  <[EMAIL PROTECTED]>
 
 	* insetlatexaccent.h (isLetter): implement, so that word selection
Index: src/insets/insetspace.C
===================================================================
RCS file: /usr/local/lyx/cvsroot/lyx-devel/src/insets/insetspace.C,v
retrieving revision 1.24
diff -u -p -r1.24 insetspace.C
--- src/insets/insetspace.C	5 Oct 2004 12:56:22 -0000	1.24
+++ src/insets/insetspace.C	17 Nov 2004 10:57:52 -0000
@@ -266,11 +266,6 @@ bool InsetSpace::isChar() const
 	return true;
 }
 
-bool InsetSpace::isLetter() const
-{
-	return false;
-}
-
 bool InsetSpace::isSpace() const
 {
 	return true;
Index: src/insets/insetspace.h
===================================================================
RCS file: /usr/local/lyx/cvsroot/lyx-devel/src/insets/insetspace.h,v
retrieving revision 1.22
diff -u -p -r1.22 insetspace.h
--- src/insets/insetspace.h	21 Nov 2003 16:35:46 -0000	1.22
+++ src/insets/insetspace.h	17 Nov 2004 10:57:52 -0000
@@ -81,8 +81,6 @@ public:
 
 	// should this inset be handled like a normal charater
 	bool isChar() const;
-	/// is this equivalent to a letter?
-	bool isLetter() const;
 	/// is this equivalent to a space (which is BTW different from
 	// a line separator)?
 	bool isSpace() const;
Index: src/support/ChangeLog
===================================================================
RCS file: /usr/local/lyx/cvsroot/lyx-devel/src/support/ChangeLog,v
retrieving revision 1.275
diff -u -p -r1.275 ChangeLog
--- src/support/ChangeLog	16 Nov 2004 23:18:46 -0000	1.275
+++ src/support/ChangeLog	17 Nov 2004 10:57:52 -0000
@@ -1,3 +1,7 @@
+2004-11-16  Jean-Marc Lasgouttes  <[EMAIL PROTECTED]>
+
+	* textutils.h (IsKommaChar): remove
+
 2004-11-16  Lars Gullik Bjonnes  <[EMAIL PROTECTED]>
 
 	* forkedcontr.C (find_pid): simplify and also make pass concept
@@ -13,7 +17,7 @@
 
 2004-11-07  Lars Gullik Bjonnes  <[EMAIL PROTECTED]>
 
-	* Make it clearer where include files are comming from.
+	* Make it clearer where include files are coming from.
 
 2004-11-06  Lars Gullik Bjonnes  <[EMAIL PROTECTED]>
 
Index: src/support/textutils.h
===================================================================
RCS file: /usr/local/lyx/cvsroot/lyx-devel/src/support/textutils.h,v
retrieving revision 1.27
diff -u -p -r1.27 textutils.h
--- src/support/textutils.h	17 Sep 2004 16:28:47 -0000	1.27
+++ src/support/textutils.h	17 Nov 2004 10:57:52 -0000
@@ -31,36 +31,6 @@ bool IsLineSeparatorChar(char c)
 }
 
 
-/// return true if the char is "punctuation"
-inline
-bool IsKommaChar(char c)
-{
-	return c == ','
-		|| c == '('
-		|| c == ')'
-		|| c == '['
-		|| c == ']'
-		|| c == '{'
-		|| c == '}'
-		|| c == ';'
-		|| c == '.'
-		|| c == ':'
-		|| c == '-'
-		|| c == '?'
-		|| c == '!'
-		|| c == '&'
-		|| c == '@'
-		|| c == '+'
-		|| c == '-'
-		|| c == '~'
-		|| c == '#'
-		|| c == '%'
-		|| c == '^'
-		|| c == '/'
-		|| c == '\\';
-}
-
-
 /// return true if a char is alphabetical (including accented chars)
 inline
 bool IsLetterChar(unsigned char c)
