On Mon, 26 Jul 1999 14:03:37 +0200 (MET DST), Jean-Marc Lasgouttes
wrote:

>Arnd>  I'd like a safe and *efficient* method for conversion to
>Arnd> printable ASCII. Steal such a thing from somewhere to use it
>Arnd> with class LString?
>
>Use =XX (where XX is the hex value) like MIME does. This is already
>used in InsetRef::escape(). This method should be moved somewhere else
>(LString?) and used in both cases.
>
>Would you like that? It creates strange names, but not worse than what
>mime does.
>
>JMarc
>

This one is fast and simple, tested and safe on my box. (After masking
to 7 bits, every char that is not alphanumeric is replaced by 'x'; only
'.' is kept.) The only remaining problem is a trailing period without
an extension; this may change when you simplify the extension handling
code in filetools.C.
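
To make the behaviour concrete, here is a minimal stand-alone sketch of
the same idea on std::string instead of LString. The function name
to_ascii_alnum is made up for illustration; the three constants mirror
MASK_RANGE, REPLACER and EXTENSION_MARKER from the patch in this mail.

```cpp
#include <cctype>
#include <string>

// Hypothetical stand-alone counterpart of LString::toAsciiAlnum().
// Assumes the default "C" locale, so isalnum() only accepts [0-9A-Za-z].
std::string to_ascii_alnum(std::string s)
{
    const unsigned char mask_range = 0x7f;   // keep the low 7 bits only
    const char replacer = 'x';               // stand-in for unsafe chars
    const char extension_marker = '.';       // preserved as-is

    for (std::string::size_type i = 0; i < s.size(); ++i) {
        // Mask first, so the isalnum() test never sees a value above 127.
        unsigned char c = static_cast<unsigned char>(s[i]) & mask_range;
        if (!std::isalnum(c) && c != extension_marker)
            c = static_cast<unsigned char>(replacer);
        s[i] = static_cast<char>(c);
    }
    return s;
}
```

With this sketch "my file$.tex" becomes "myxfilex.tex", and a name like
"draft." keeps its trailing period, which is the leftover problem
mentioned above.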

I did not understand how Andre's code would actually pick the best
replacement (by testing each char individually against a replacement
table for UNICODE?), though the idea of a best ASCII match was nice.

Jean's would be fast and simple, but very ugly ('Keep it simple!').
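
For comparison, a rough sketch of what =XX escaping in the MIME style
produces. This is not the code from InsetRef::escape(); the name
escape_hex and the choice to pass alphanumerics and '.' through
unchanged are assumptions made only for illustration.

```cpp
#include <cctype>
#include <cstdio>
#include <string>

// Hypothetical =XX escaper: every byte that is not alphanumeric or '.'
// is written as '=' followed by its two-digit uppercase hex value.
std::string escape_hex(const std::string& s)
{
    std::string out;
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        if (std::isalnum(c) || c == '.') {
            out += static_cast<char>(c);
        } else {
            char buf[4];  // '=' + two hex digits + terminating NUL
            std::snprintf(buf, sizeof buf, "=%02X",
                          static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}
```

Here "my file.tex" turns into "my=20file.tex". Unlike the 'x'
replacement, the result can be decoded again, at the price of the
strange-looking names.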

I found LString nice and filetools interesting. Is it possible to build
a shared library with a simple and consistent interface out of such tool
classes? (The LyTK base classes library?) At least on OS/2, shared
libraries with automatically created ordinal entry points have a simpler
structure, are faster, and normally use less memory than named entry
points within an executable. They also enforce stricter modularity and
greater abstraction. The kernel interface uses this feature.

The patch contains a few more DOS path handling fixes, too. They are
unfortunately necessary: because EMX could not decide whether to prefer
UNIX or DOS style path handling, we find a mixture of everything.

In case of future changes of path handling, please consider:

         X:/, x:/, /, \, X:\, x:\ 
(even this one 'X:\\foo\\bar\\' I've seen working in a file selector)

are all valid descriptions of one and the same root of device X (with X
as the current drive); even a chaotic mixture of / and \ is still valid
and identical. Upper and lower case are silently disregarded; all those
paths are treated as equal.
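
A minimal sketch of how such spellings can be folded to one form before
they are compared, using std::string. The helper name fold_path is made
up, and collapsing repeated separators is an extra assumption that the
patch itself does not make (it would, for instance, flatten the leading
double backslash of a UNC name). Resolving a bare / or \ to the current
drive is not attempted.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: fold a DOS or OS/2 style path so that equivalent
// spellings compare equal. Backslashes become slashes, letters are
// lowercased, and runs of separators are collapsed into one.
std::string fold_path(const std::string& path)
{
    std::string out;
    for (std::string::size_type i = 0; i < path.size(); ++i) {
        char c = path[i];
        if (c == '\\')
            c = '/';
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        // Skip a separator that directly follows another separator.
        if (c == '/' && !out.empty() && out[out.size() - 1] == '/')
            continue;
        out += c;
    }
    return out;
}
```

With this, fold_path("X:\\") and fold_path("x:/") both give "x:/", so a
plain string comparison of the folded forms is enough.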

It would be a great relief, Jean-Marc or Lars, if you kept an eye
('paths police') on this weird problem. Chasing through the whole of
the code later to find every path identifier that causes a hanging
system or a crash in LaTeX is much less fun! So please have a heart for
poor porters!

---------------------snip-----------------------------------

diff -p -N -r -U 4 -X excl.tmp src/original/filetools.C src/modified/filetools.C
--- src/original/filetools.C    Wed May 12 06:51:18 1999
+++ src/modified/filetools.C    Mon Jul 26 15:31:40 1999
@@ -72,34 +72,19 @@ bool IsSGMLFilename(LString const & file
        return filename.contains(".sgml");
 }
 
 
-// Substitutes spaces with underscores in filename (and path)
+// Substitute chars that LaTeX or shell can't handle with safe ones
 LString SpaceLess(LString const & file)
 {
-       LString name = OnlyFilename(file);
-       LString path = OnlyPath(file);
-       // Substitute chars that LaTeX can't handle with safe ones
-       name.subst('~','-');
-       name.subst('$','S');
-       name.subst('%','_');
-       name.subst('\'','_');
-
-       // Substitute some more chars that LaTeX or /bin/sh can't handle   -ck-
-       name.subst('`', '_');
-       name.subst('"', '_');
-       name.subst('&', 'G');
-       name.subst('|', 'I');
-       name.subst(';', ':');
-       name.subst('(', 'c');
-       name.subst(')', 'C');
-       name.subst('<', 'k');
-       name.subst('>', 'K');
-       
-       LString temp = AddName(path, name);
+       LString temp = OnlyFilename(file);
+       temp.toAsciiAlnum();    /* AHanses:  if >127, subtract
+                               ** 128, 256, etc., put REPLACER */
+
+       temp = AddName(OnlyPath(file), temp);
+
        // Replace spaces with underscores, also in directory
        temp.subst(' ','_');
-
        return temp;
 }
 
 
@@ -208,8 +193,13 @@ LString FileOpenSearch (LString const & 
 {
        LString real_file, path_element;
        LString tmppath = path;
        bool notfound = true;
+       
+#ifdef __EMX__   /* SMiyata: This should fix searchpath bug. */
+       tmppath.subst('\\', '/');
+       tmppath.lowercase();
+#endif
 
        tmppath.split(path_element, ';');
        
        while (notfound && !path_element.empty()) {
@@ -461,17 +451,22 @@ LString OnlyPath(LString const &Filename
 {
        // If empty filename, return empty
        if (Filename.empty()) return Filename;
 
+       LString temp = Filename;
+
+#ifdef __EMX__   /* SMiyata: This should fix searchpath bug. */
+       temp.subst('\\', '/');
+       temp.lowercase();
+#endif
        // Find last / or start of filename
-       int j = Filename.length() - 1;
-       for (; j > 0 && Filename[j] != '/'; j--);
+       int j = temp.length() - 1;
+       for (; j > 0 && temp[j] != '/'; j--);
 
-       if (Filename[j] != '/')
+       if (temp[j] != '/')
                return "./";
        else {
                // Strip to pathname
-               LString temp = Filename;
                return temp.substring(0, j);
        }
 }
 
@@ -575,17 +570,24 @@ LString OnlyFilename(LString const &File
 {
        // If empty filename, return empty
        if (Filename.empty()) return Filename;
 
+       LString temp = Filename;
+
+#ifdef __EMX__   /* SMiyata: This should fix searchpath bug. */
+       temp.subst('\\', '/');
+       temp.lowercase();
+#endif
+
        int j;
        // Find last / or start of filename
-       for (j=Filename.length()-1; Filename[j] != '/' && j>0; j--);
+       for (j=temp.length()-1; temp[j] != '/' && j>0; j--);
 
        // Skip potential /
        if (j!=0) j++;
 
        // Strip to basename
-       LString temp = Filename;
+//     LString temp = Filename;
        return temp.substring(j, temp.length()-1);
 }
 
 
@@ -593,10 +595,15 @@ LString OnlyFilename(LString const &File
 bool AbsolutePath(LString const &path)
 {
 #ifndef __EMX__
        return (!path.empty() && path[0]=='/');
-#else
-       return (!path.empty() && (path[0]=='/' || (isalpha((unsigned char) path[0]) && path[1]==':')));
+#else          /* AHanses: Handle DOS-style paths ('\') */
+       return (!path.empty() && (
+                       (path[0]=='/' || path[0]=='\\' ||
+                       (isalpha((unsigned char) path[0]) &&
+                               path[1]==':'))
+                       )
+               );
 #endif
 }
 
 
@@ -609,8 +616,9 @@ LString ExpandPath(LString const &path)
        if (AbsolutePath(RTemp))
                return RTemp;
 
        LString Temp;
+
        LString copy(RTemp);
 
        // Split by next /
        RTemp.split(Temp, '/');
@@ -635,15 +643,22 @@ LString NormalizePath(LString const &pat
        LString TempBase;
        LString RTemp;
        LString Temp;
 
-       if (AbsolutePath(path))
+       if (AbsolutePath(path)) {
                RTemp = path;
-       else
+       }
+       else {
                // Make implicit current directory explicit
                RTemp = "./" +path;
-
+       }
        while (!RTemp.empty()) {
+       
+#ifdef __EMX__   /* SMiyata: This should fix searchpath bug. */
+               RTemp.subst('\\', '/');
+               RTemp.lowercase();
+#endif
+       
                // Split by next /
                RTemp.split(Temp, '/');
                
                if (Temp==".") {
@@ -694,10 +709,17 @@ LString ReplaceEnvironmentPath(LString c
        const LString RegExp("*}*"); // Exist EndChar inside a String?
 
        if (path.empty()) return path; // nothing to do.
 
+       LString temppath = path;
+       
+#ifdef __EMX__   /* SMiyata: This should fix searchpath bug. */
+       temppath.subst('\\', '/');
+       temppath.lowercase();
+#endif
+
 // first: Search for a '$' - Sign.
-       LString copy(path);
+       LString copy(temppath);
     LString result1(copy);    // for split-calls
        LString result0 = copy.split(result1, CompareChar);
        while (!result0.empty()) {
                LString copy1(result0); // contains String after $
@@ -771,9 +793,57 @@ LString MakeRelPath(LString const & absp
 {
        // This is a hack. It should probaly be done in another way. Lgb.
        if (abspath.empty())
                return "<unknown_path>";
+               
+       /* AHanses: Duplicating instead of cluttering with #ifdef */
+#ifdef __EMX__   /* SMiyata: This should fix searchpath bug. */
+
+       const int abslen = abspath.length();    /* AHanses: I use the 'const'     */
+       const int baselen = basepath.length();  /* as far as possible, for speed */
+       
+       int i = 0;
+       LString tempabspath=abspath;
+       LString tempbasepath=basepath;
+       tempabspath.subst('\\', '/');
+       tempabspath.lowercase();
+       tempbasepath.subst('\\', '/');
+       tempbasepath.lowercase();
+
+       // Find first different character
+       while (i < abslen && i < baselen && tempabspath[i] == tempbasepath[i]) ++i;
+
+       // Go back to last /
+       if (i < abslen && i < baselen
+           || (i<abslen && tempabspath[i] != '/' && i==baselen)
+           || (i<baselen && tempbasepath[i] != '/' && i==abslen))
+       {
+               if (i) --i;     // here was the last match
+               while (i && tempabspath[i] != '/') --i;
+       }
 
+       if (i == 0) {
+               // actually no match - cannot make it relative
+               return tempabspath;
+       }
+
+       // Count how many dirs there are in basepath above match
+       // and append as many '..''s into relpath
+       LString buf;
+       int j = i;
+       while (j < baselen) {
+               if (tempbasepath[j] == '/') {
+                       if (j+1 == baselen) break;
+                       buf += "../";
+               }
+               ++j;
+       }
+
+       // Append relative stuff from common directory to abspath
+       if (tempabspath[i] == '/') ++i;
+       for (; i<abslen; ++i)
+               buf += tempabspath[i];
+#else
        const int abslen = abspath.length();
        const int baselen = basepath.length();
        
        // Find first different character
@@ -809,8 +879,9 @@ LString MakeRelPath(LString const & absp
        // Append relative stuff from common directory to abspath
        if (abspath[i] == '/') ++i;
        for (; i<abslen; ++i)
                buf += abspath[i];
+#endif
        // Remove trailing /
        if (buf.suffixIs('/'))
                buf.substring(0,buf.length()-2);
        // Substitute empty with .
@@ -818,30 +889,57 @@ LString MakeRelPath(LString const & absp
                buf = '.';
        return buf;
 }
 
-
 // Append sub-directory(ies) to a path in an intelligent way
 LString AddPath(LString const & path, LString const & path2)
 {
        LString buf;
+       
+#ifdef __EMX__         /* AHanses: Handle DOS-style paths ('\') */
+       if ( !path.empty() && path != "." && path != "./" &&
+               path != ".\\" )
+       {
+               buf = path;
+               /* SMiyata: This should fix searchpath bug. */
+               buf.subst('\\', '/');
+               buf.lowercase();
+               if (!buf.suffixIs('/'))
+                       buf += '/';
+       }
+
+       if (!path2.empty())
+       {
+               /* SMiyata: This should fix searchpath bug. */
+               LString tmppath2 = path2;
+               tmppath2.subst('\\', '/');
+               tmppath2.lowercase();
+               int p2start = 0;
+               while (tmppath2[p2start] == '/') p2start++;
+
+               int p2end = tmppath2.length()-1;
+               while (tmppath2[p2end] == '/') p2end--;
 
+               tmppath2.substring(p2start,p2end);
+               buf += tmppath2 + '/';
+#else
        if (!path.empty() && path != "." && path != "./") {
                buf = path;
                if (!path.suffixIs('/'))
                        buf += '/';
        }
 
        if (!path2.empty()){
-               int p2start = 0;
+               int p2start = 0;
                while (path2[p2start] == '/') p2start++;
 
                int p2end = path2.length()-1;
                while (path2[p2end] == '/') p2end--;
 
-               LString tmp = path2;
-               tmp.substring(p2start,p2end);
-               buf += tmp + '/';
+               LString tmppath2 = path2;
+               tmppath2.substring(p2start,p2end);
+               buf += tmppath2 + '/';
+#endif
        }
        return buf;
 }
 
diff -p -N -r -U 4 -X excl.tmp src/original/filetools.h src/modified/filetools.h
--- src/original/filetools.h    Wed May 12 06:51:22 1999
+++ src/modified/filetools.h    Sun Jul 25 16:52:00 1999
@@ -214,13 +214,13 @@ LString AddName(LString const &Path, LSt
 
 /// Append sub-directory(ies) to path in an intelligent way
 LString AddPath(LString const & path, LString const & path2);
 
-/** Change extension of oldname to extension.
- If no_path is true, the path is stripped from the filename.
- If oldname does not have an extension, it is appended.
- If the extension is empty, any extension is removed from the name.
- */
+/* Change extension of oldname to extension.
+** If no_path is true, the path is stripped from the filename.
+** If oldname does not have an extension, it is appended.
+** If the extension is empty, any extension is removed from the name.
+*/
 LString ChangeExtension(LString const & oldname, LString const & extension, 
                        bool no_path);
 
 /// Create absolute path. If impossible, don't do anything
diff -p -N -r -U 4 -X excl.tmp src/original/LString.C src/modified/LString.C
--- src/original/LString.C      Mon Oct 26 15:17:18 1998
+++ src/modified/LString.C      Sun Jul 25 18:05:28 1999
@@ -162,9 +162,10 @@ LString &LString::clean()
 char& LString::operator[](int i)
 {
 #ifdef DEVEL_VERSION
        if (i < 0 || i >= length()) {
-               fprintf(stderr,"LString::operator[]: Index out of range: '%s' %d\n", p->s, i);
+               fprintf(stderr,"LString::operator[]: Index out of range: '%s' %d\n",
+                        p->s, i);
                abort();
        }
 #endif
 
@@ -493,8 +494,25 @@ bool LString::suffixIs(char const * suf)
                return strncmp(p->s + (length()-suflen), suf, suflen)==0;
 }
 
 
+/* SMiyata: To bitwise AND 0x7f a string is more efficient than subst();
+** ISO-8859-x and EUC (Extended Unix Code) works just fine with this.
+** The only remaining problem is that Microsoft/IBM codepages, 
+** Shift-JIS and Big5 utilizes the region 0x80-0x9f which will be 
+** converted to non-printable control characters. 
+*/
+LString& LString::toAsciiAlnum()
+{
+       for (int i=0; i<length(); i++) {
+               p->s[i] &= MASK_RANGE;  /* make test result foreseeable */
+               if (!(isalnum(p->s[i])) && p->s[i] != EXTENSION_MARKER)
+                       p->s[i] = REPLACER;
+       }
+        return *this;
+}
+
+
 LString& LString::subst(char oldchar, char newchar)
 {
        for (int i=0; i<length() ; i++)
                if (p->s[i]==oldchar)
@@ -524,9 +542,13 @@ LString& LString::subst(char const * old
 
 LString& LString::lowercase()
 {
        for (int i=0; i<length() ; i++)
+#ifdef __EMX__ /* AHanses: This macro handles non ASCII characters according to locale */
+               p->s[i] = _nls_tolower((unsigned char) p->s[i]); /* AHanses: DBCS unsafe */
+#else
                p->s[i] = tolower((unsigned char) p->s[i]);
+#endif
        return *this;
 }
 
 
diff -p -N -r -U 4 -X excl.tmp src/original/LString.h src/modified/LString.h
--- src/original/LString.h      Mon Oct 26 15:17:20 1998
+++ src/modified/LString.h      Mon Jul 26 15:36:54 1999
@@ -26,8 +26,12 @@
 // should go (JMarc)
 #include <strings.h>
 #else
 #include <string.h>
+# ifdef __EMX__ 
+/* AHanses: Macros handling non ASCII chars according to locale */
+#  include <sys/nls.h>
+# endif
 #endif
 
 /** A string class for LyX
   
@@ -208,8 +212,20 @@ public:
 
        /// Does the string end with this suffix?
        bool suffixIs(char const *) const;
        
+       /// Convert whole string bit-wise to LaTeXable ASCII 
+       /* SMiyata: Bitwise AND 0x7f the string, then replace
+       ** AHanses: non-alphanumeric characters with REPLACER.
+       ** Exception: Preserve non alnum EXTENSION_MARKER.
+       ** More efficient than subst() per char.
+       */
+       #define MASK_RANGE      0x7f    /* ASCII range  */
+       #define REPLACER        120     /* ASCII 'x'    */
+       #define EXTENSION_MARKER 46     /* ASCII '.'    */
+
+       LString& toAsciiAlnum();
+
        /// Substitute all "oldchar"s with "newchar"
        LString& subst(char oldchar, char newchar);
 
        /// Substitutes all instances of oldstr with newstr
-----------------snap---------------------------------------
