[PATCH-updated] Bug 3676--Citation Bug

Richard Heck Tue, 31 Jul 2007 21:20:43 -0700

Andre Poenitz wrote:

On Tue, Jul 31, 2007 at 03:00:34PM -0400, Richard Heck wrote:

Index: src/Buffer.h
===================================================================
--- src/Buffer.h        (revision 19264)
+++ src/Buffer.h        (working copy)
@@ -13,7 +13,11 @@
 #define BUFFER_H

 #include "DocIterator.h"
+#include "ErrorList.h"
+#include "InsetList.h"

+#include "frontends/controllers/frontend_helpers.h"

Why that? Including stuff from frontends/ in src/ is wrong in theory.

Yes, I agree. This is needed, IIRC (I did this a while ago), for certain constants, e.g.:

static const docstring TheBibliographyRef(from_ascii("TheBibliographyRef"));

which was in frontends/controllers/frontend_helpers.cpp and which I've moved (in effect) to the header.

There is a lot of BibTeX-related stuff in frontends/controllers/. I think most of that should be elsewhere. I think the whole biblio namespace should be moved, probably into src/Biblio.{h,cpp}, or something similar. But I guess I thought I should address this first within the existing structure and save the clean-up for later. If you think I should just do that now, I can do that.

[...]
@@ -1359,11 +1359,12 @@
                                static_cast<InsetBibitem const &>(*it);
                        // FIXME UNICODE
                        string const key = to_utf8(inset.getParam("key"));
-                       docstring const label = inset.getParam("label");
+                       biblio::KeyValMap keyvalmap;
+                       keyvalmap[from_ascii("label")] = inset.getParam("label");
                        DocIterator doc_it(it); doc_it.forwardPos();
-                       docstring const ref = doc_it.paragraph().asString(*this, false);
-                       docstring const info = label + "TheBibliographyRef" + ref;
-                       keys.push_back(pair<string, docstring>(key, info));
+                       keyvalmap [from_ascii("ref")] = doc_it.paragraph().asString(*this, false);

No space before [

Thanks.

[...]
-docstring const parseBibTeX(docstring data, std::string const & findkey);
+docstring const getValueForKey(KeyValMap data, std::string const & findkey);

Passing a  std::map<docstring, docstring>  by value seems wrong.

Yes. Made it const &. This required a const_cast, but that seems ok.
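
For what it's worth, the const_cast can probably be avoided: find() works on a const map, and the value can be returned through the iterator it hands back. A minimal sketch, assuming std::string as a stand-in for docstring so that it compiles on its own (the real function would use the biblio typedefs and from_ascii):

```cpp
#include <map>
#include <string>

// Stand-in for biblio::KeyValMap; std::string replaces docstring here
// (an assumption made only to keep the sketch self-contained).
typedef std::map<std::string, std::string> KeyValMap;

// Look up a field on a const map without operator[] and without const_cast.
std::string getValueForKey(KeyValMap const & data, std::string const & findkey)
{
	KeyValMap::const_iterator it = data.find(findkey);
	if (it == data.end())
		// Field not present: return an empty value, as the patch does.
		return std::string();
	return it->second;
}
```

Unlike operator[], this never inserts an empty entry for a missing key, so the map really is left untouched.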

        void fillWithBibKeys(Buffer const & buffer,
-               std::vector<std::pair<std::string, docstring> > & keys) const;
+               std::map<std::string, std::map<docstring, docstring> > & keys) const;


Maybe a typedef would make this clearer, too.

Done.
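
A rough sketch of what the typedefs buy at a call site, with std::string standing in for docstring so that it compiles on its own. The helper countEntriesWithField is made up for illustration and is not part of the patch:

```cpp
#include <cstddef>
#include <map>
#include <string>

// Stand-ins for the biblio typedefs; std::string replaces docstring here.
/// First entry is field, second is value
typedef std::map<std::string, std::string> KeyValMap;
/// First entry is the bibliography key, second the data
typedef std::map<std::string, KeyValMap> BibKeyList;

// Hypothetical helper: count the entries that define a given field.
std::size_t countEntriesWithField(BibKeyList const & keys,
				  std::string const & field)
{
	std::size_t n = 0;
	BibKeyList::const_iterator it = keys.begin();
	BibKeyList::const_iterator const end = keys.end();
	for (; it != end; ++it)
		if (it->second.find(field) != it->second.end())
			++n;
	return n;
}
```

The signatures and iterator declarations stay short, and the nested map type is spelled out in exactly one place.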

Thanks again. Glad someone reads these patches....

Updated patch attached.

Richard

--
==================================================================
Richard G Heck, Jr
Professor of Philosophy
Brown University
http://frege.brown.edu/heck/
==================================================================
Get my public key from http://sks.keyserver.penguin.de
Hash: 0x1DE91F1E66FFBDEC
Learn how to sign your email using Thunderbird and GnuPG at:
http://dudu.dyn.2-h.org/nist/gpg-enigmail-howto

Index: src/Buffer.h
===================================================================
--- src/Buffer.h	(revision 19264)
+++ src/Buffer.h	(working copy)
@@ -13,7 +13,11 @@
 #define BUFFER_H
 
 #include "DocIterator.h"
+#include "ErrorList.h"
+#include "InsetList.h"
 
+#include "frontends/controllers/frontend_helpers.h"
+
 #include "support/FileName.h"
 #include "support/limited_stack.h"
 #include "support/types.h"
@@ -283,7 +287,7 @@
 	void validate(LaTeXFeatures &) const;
 
 	/// return all bibkeys from buffer and its childs
-	void fillWithBibKeys(std::vector<std::pair<std::string, docstring> > & keys) const;
+	void fillWithBibKeys(biblio::BibKeyList & keys) const;
 	/// Update the cache with all bibfiles in use (including bibfiles
 	/// of loaded child documents).
 	void updateBibfilesCache();
Index: src/Buffer.cpp
===================================================================
--- src/Buffer.cpp	(revision 19264)
+++ src/Buffer.cpp	(working copy)
@@ -1333,7 +1333,7 @@
 
 
 // This is also a buffer property (ale)
-void Buffer::fillWithBibKeys(vector<pair<string, docstring> > & keys)
+void Buffer::fillWithBibKeys(biblio::BibKeyList & keys)
 	const
 {
 	/// if this is a child document and the parent is already loaded
@@ -1359,11 +1359,12 @@
 				static_cast<InsetBibitem const &>(*it);
 			// FIXME UNICODE
 			string const key = to_utf8(inset.getParam("key"));
-			docstring const label = inset.getParam("label");
+			biblio::KeyValMap keyvalmap;
+			keyvalmap[from_ascii("label")] = inset.getParam("label");
 			DocIterator doc_it(it); doc_it.forwardPos();
-			docstring const ref = doc_it.paragraph().asString(*this, false);
-			docstring const info = label + "TheBibliographyRef" + ref;
-			keys.push_back(pair<string, docstring>(key, info));
+			keyvalmap[from_ascii("ref")] = doc_it.paragraph().asString(*this, false);
+			keyvalmap[biblio::TheBibliographyRef] = biblio::TheBibliographyRef;
+			keys[key] = keyvalmap;
 		}
 	}
 }
@@ -1725,10 +1726,10 @@
 	vector<docstring> labels;
 
 	if (code == Inset::CITE_CODE) {
-		vector<pair<string, docstring> > keys;
+		biblio::BibKeyList keys;
 		fillWithBibKeys(keys);
-		vector<pair<string, docstring> >::const_iterator bit  = keys.begin();
-		vector<pair<string, docstring> >::const_iterator bend = keys.end();
+		biblio::BibKeyList::const_iterator bit  = keys.begin();
+		biblio::BibKeyList::const_iterator bend = keys.end();
 
 		for (; bit != bend; ++bit)
 			// FIXME UNICODE
Index: src/frontends/controllers/frontend_helpers.h
===================================================================
--- src/frontends/controllers/frontend_helpers.h	(revision 19264)
+++ src/frontends/controllers/frontend_helpers.h	(working copy)
@@ -28,8 +28,18 @@
 
 /** Functions of use to citation and bibtex GUI controllers and views */
 namespace lyx {
+	
 namespace biblio {
+	
+/// First entry is field, second is value
+typedef std::map<docstring, docstring> KeyValMap;
+/// First entry is the bibliography key, second the data
+typedef std::map<std::string, std::map<docstring, docstring> > BibKeyList;
 
+static const docstring TheBibliographyRef(from_ascii("@LyXInfo"));
+static const docstring TheDataString(from_ascii("@BibTeXData"));
+static const docstring TheEntryType(from_ascii("@BibTeXEntryType"));
+
 enum CiteEngine {
 	ENGINE_BASIC,
 	ENGINE_NATBIB_AUTHORYEAR,
@@ -69,34 +79,31 @@
 std::string const asValidLatexCommand(std::string const & input,
 				      CiteEngine const engine);
 
-/// First entry is the bibliography key, second the data
-typedef std::map<std::string, docstring> InfoMap;
-
 /// Returns a vector of bibliography keys
-std::vector<std::string> const getKeys(InfoMap const &);
+std::vector<std::string> const getKeys(BibKeyList const &);
 
 /** Returns the BibTeX data associated with a given key.
-    Empty if no info exists. */
-docstring const getInfo(InfoMap const &, std::string const & key);
+    Empty if the key was not defined in the BibTeX record. */
+docstring const getInfo(BibKeyList const &, std::string const & key);
 
 /// return the year from the bibtex data record
-docstring const getYear(InfoMap const & map, std::string const & key);
+docstring const getYear(BibKeyList const & map, std::string const & key);
 
 /// return the short form of an authorlist
-docstring const getAbbreviatedAuthor(InfoMap const & map, std::string const & key);
+docstring const getAbbreviatedAuthor(BibKeyList const & map, std::string const & key);
 
 // return only the family name
 docstring const familyName(docstring const & name);
 
 /** Search a BibTeX info field for the given key and return the
     associated field. */
-docstring const parseBibTeX(docstring data, std::string const & findkey);
+docstring const getValueForKey(KeyValMap const & data, std::string const & findkey);
 
 /** Returns an iterator to the first key that meets the search
     criterion, or end() if unsuccessful.
 
     User supplies :
-    the InfoMap of bibkeys info,
+    the BibKeyList of bibkeys info,
     the vector of keys to be searched,
     the search criterion,
     an iterator defining the starting point of the search,
@@ -105,7 +112,7 @@
 */
 
 std::vector<std::string>::const_iterator
-searchKeys(InfoMap const & map,
+searchKeys(BibKeyList const & map,
 	   std::vector<std::string> const & keys_to_search,
 	   docstring const & search_expression,
 	   std::vector<std::string>::const_iterator start,
@@ -145,12 +152,12 @@
 
    User supplies :
    the key,
-   the InfoMap of bibkeys info,
+   the BibKeyList of bibkeys info,
    the available citation styles
 */
 std::vector<docstring> const
 getNumericalStrings(std::string const & key,
-		    InfoMap const & map,
+		    BibKeyList const & map,
 		    std::vector<CiteStyle> const & styles);
 
 /**
@@ -162,12 +169,12 @@
 
    User supplies :
    the key,
-   the InfoMap of bibkeys info,
+   the BibKeyList of bibkeys info,
    the available citation styles
 */
 std::vector<docstring> const
 getAuthorYearStrings(std::string const & key,
-		     InfoMap const & map,
+		     BibKeyList const & map,
 		     std::vector<CiteStyle> const & styles);
 
 } // namespace biblio
Index: src/frontends/controllers/frontend_helpers.cpp
===================================================================
--- src/frontends/controllers/frontend_helpers.cpp	(revision 19264)
+++ src/frontends/controllers/frontend_helpers.cpp	(working copy)
@@ -5,6 +5,7 @@
  *
  * \author Angus Leeming
  * \author Herbert Voß
+ * \author Richard Heck (improvements of BibTeX stuff)
  *
  * Full author contact details are available in file CREDITS.
  */
@@ -26,9 +27,6 @@
 
 #include "support/filetools.h"
 #include "support/lstrings.h"
-#include "support/Package.h"
-#include "support/filetools.h"
-#include "support/lstrings.h"
 #include "support/lyxalgo.h"
 #include "support/os.h"
 #include "support/Package.h"
@@ -123,9 +121,6 @@
 	return str;
 }
 
-
-static const docstring TheBibliographyRef(from_ascii("TheBibliographyRef"));
-
 } // namespace anon
 
 
@@ -200,38 +195,29 @@
 }
 
 
-docstring const getAbbreviatedAuthor(InfoMap const & map, string const & key)
+docstring const getAbbreviatedAuthor(BibKeyList const & map, string const & key)
 {
 	BOOST_ASSERT(!map.empty());
 
-	InfoMap::const_iterator it = map.find(key);
+	BibKeyList::const_iterator it = map.find(key);
 	if (it == map.end())
 		return docstring();
-	docstring const & data = it->second;
+	KeyValMap const & data = it->second;
 
 	// Is the entry a BibTeX one or one from lyx-layout "bibliography"?
-	docstring::size_type const pos = data.find(TheBibliographyRef);
-	if (pos != docstring::npos) {
-		if (pos <= 2) {
-			return docstring();
-		}
+	KeyValMap::const_iterator it2 = data.find(TheBibliographyRef);
+	if (it2 != data.end()) 
+		// We don't have any way to tell how the author names might have
+		// been formatted.
+		return docstring();
 
-		docstring const opt = trim(data.substr(0, pos - 1));
-		if (opt.empty())
-			return docstring();
+	docstring author = getValueForKey(data, "author");
 
-		docstring authors;
-		split(opt, authors, '(');
-		return authors;
-	}
-
-	docstring author = parseBibTeX(data, "author");
-
 	if (author.empty())
-		author = parseBibTeX(data, "editor");
+		author = getValueForKey(data, "editor");
 
 	if (author.empty()) {
-		author = parseBibTeX(data, "key");
+		author = getValueForKey(data, "key");
 		if (author.empty())
 			// FIXME UNICODE
 			return from_utf8(key);
@@ -253,36 +239,23 @@
 }
 
 
-docstring const getYear(InfoMap const & map, string const & key)
+docstring const getYear(BibKeyList const & map, string const & key)
 {
 	BOOST_ASSERT(!map.empty());
 
-	InfoMap::const_iterator it = map.find(key);
+	BibKeyList::const_iterator it = map.find(key);
 	if (it == map.end())
 		return docstring();
-	docstring const & data = it->second;
+	KeyValMap const & data = it->second;
 
 	// Is the entry a BibTeX one or one from lyx-layout "bibliography"?
-	docstring::size_type const pos = data.find(TheBibliographyRef);
-	if (pos != docstring::npos) {
-		if (pos <= 2) {
-			return docstring();
-		}
+	KeyValMap::const_iterator it2 = data.find(TheBibliographyRef);
+	if (it2 != data.end()) 
+		// We don't have any way to tell how the entry might have
+		// been formatted.
+		return docstring();
 
-		docstring const opt =
-			trim(data.substr(0, pos - 1));
-		if (opt.empty())
-			return docstring();
-
-		docstring authors;
-		docstring const tmp = split(opt, authors, '(');
-		docstring year;
-		split(tmp, year, ')');
-		return year;
-
-	}
-
-	docstring year = parseBibTeX(data, "year");
+	docstring year = getValueForKey(data, "year");
 	if (year.empty())
 		year = _("No year");
 
@@ -304,11 +277,11 @@
 } // namespace anon
 
 
-vector<string> const getKeys(InfoMap const & map)
+vector<string> const getKeys(BibKeyList const & map)
 {
 	vector<string> bibkeys;
-	InfoMap::const_iterator it  = map.begin();
-	InfoMap::const_iterator end = map.end();
+	BibKeyList::const_iterator it  = map.begin();
+	BibKeyList::const_iterator end = map.end();
 	for (; it != end; ++it) {
 		bibkeys.push_back(it->first);
 	}
@@ -318,72 +291,69 @@
 }
 
 
-docstring const getInfo(InfoMap const & map, string const & key)
+docstring const getInfo(BibKeyList const & map, string const & key)
 {
 	BOOST_ASSERT(!map.empty());
 
-	InfoMap::const_iterator it = map.find(key);
+	BibKeyList::const_iterator it = map.find(key);
 	if (it == map.end())
 		return docstring();
-	docstring const & data = it->second;
+	KeyValMap const & data = it->second;
 
-	// is the entry a BibTeX one or one from lyx-layout "bibliography"?
-	docstring::size_type const pos = data.find(TheBibliographyRef);
-	if (pos != docstring::npos) {
-		docstring::size_type const pos2 = pos + TheBibliographyRef.size();
-		docstring const info = trim(data.substr(pos2));
-		return info;
+	// Is the entry a BibTeX one or one from lyx-layout "bibliography"?
+	KeyValMap::const_iterator it2 = data.find(TheBibliographyRef);
+	if (it2 != data.end()) {
+		KeyValMap::const_iterator it3 = data.find(from_ascii("ref"));
+		return it3->second;
 	}
 
-	// Search for all possible "required" keys
-	docstring author = parseBibTeX(data, "author");
+	//FIXME FIXME FIXME
+	//This could be made a lot better using the biblio::TheEntryType
+	//field to customize the output based upon entry type.
+	
+	//Search for all possible "required" fields
+	docstring author = getValueForKey(data, "author");
 	if (author.empty())
-		author = parseBibTeX(data, "editor");
+		author = getValueForKey(data, "editor");
 
-	docstring year      = parseBibTeX(data, "year");
-	docstring title     = parseBibTeX(data, "title");
-	docstring booktitle = parseBibTeX(data, "booktitle");
-	docstring chapter   = parseBibTeX(data, "chapter");
-	docstring number    = parseBibTeX(data, "number");
-	docstring volume    = parseBibTeX(data, "volume");
-	docstring pages     = parseBibTeX(data, "pages");
-	docstring annote    = parseBibTeX(data, "annote");
-	docstring media     = parseBibTeX(data, "journal");
-	if (media.empty())
-		media = parseBibTeX(data, "publisher");
-	if (media.empty())
-		media = parseBibTeX(data, "school");
-	if (media.empty())
-		media = parseBibTeX(data, "institution");
+	docstring year      = getValueForKey(data, "year");
+	docstring title     = getValueForKey(data, "title");
+	docstring docLoc    = getValueForKey(data, "pages");
+	if (docLoc.empty()) {
+		docLoc = getValueForKey(data, "chapter");
+		if (!docLoc.empty())
+			docLoc = from_ascii("Ch. ") + docLoc;
+	}	else 
+		docLoc = from_ascii("pp. ") + docLoc;
+	docstring media     = getValueForKey(data, "journal");
+	if (media.empty()) {
+		media = getValueForKey(data, "publisher");
+		if (media.empty()) {
+			media = getValueForKey(data, "school");
+			if (media.empty())
+				media = getValueForKey(data, "institution");
+		}
+	}
+	docstring volume = getValueForKey(data, "volume");
 
 	odocstringstream result;
 	if (!author.empty())
 		result << author << ", ";
 	if (!title.empty())
 		result << title;
-	if (!booktitle.empty())
-		result << ", in " << booktitle;
-	if (!chapter.empty())
-		result << ", Ch. " << chapter;
 	if (!media.empty())
 		result << ", " << media;
-	if (!volume.empty())
-		result << ", vol. " << volume;
-	if (!number.empty())
-		result << ", no. " << number;
-	if (!pages.empty())
-		result << ", pp. " << pages;
 	if (!year.empty())
 		result << ", " << year;
-	if (!annote.empty())
-		result << "\n\n" << annote;
+	if (!docLoc.empty())
+		result << ", " << docLoc;
 
 	docstring const result_str = rtrim(result.str());
 	if (!result_str.empty())
 		return result_str;
 
 	// This should never happen (or at least be very unusual!)
-	return data;
+	return docstring();
 }
 
 
@@ -420,24 +390,28 @@
 public:
 	// re and icase are used to construct an instance of boost::RegEx.
 	// if icase is true, then matching is insensitive to case
-	RegexMatch(InfoMap const & m, string const & re, bool icase)
+	RegexMatch(BibKeyList const & m, string const & re, bool icase)
 		: map_(m), regex_(re, icase) {}
 
+	//FIXME This should probably be restored to its earlier behavior
 	bool operator()(string const & key) const {
-		// the data searched is the key + its associated BibTeX/biblio
-		// fields
-		string data = key;
-		InfoMap::const_iterator info = map_.find(key);
-		if (info != map_.end())
+		BibKeyList::const_iterator info = map_.find(key);
+ 		if (info == map_.end())
+ 			return false;
+ 
+ 		string data = key;
+		//The machinations here are required because map::operator[] 
+		//has no const version.
+		KeyValMap const kvm = info->second;
+		KeyValMap::const_iterator it = kvm.find(TheDataString);
+		if (it != kvm.end())
 			// FIXME UNICODE
-			data += ' ' + to_utf8(info->second);
-
-		// Attempts to find a match for the current RE
-		// somewhere in data.
+			data += ' ' + to_utf8(it->second);
+ 
 		return boost::regex_search(data, regex_);
 	}
 private:
-	InfoMap const map_;
+	BibKeyList const map_;
 	mutable boost::regex regex_;
 };
 
@@ -445,7 +419,7 @@
 
 
 vector<string>::const_iterator
-searchKeys(InfoMap const & theMap,
+searchKeys(BibKeyList const & theMap,
 	   vector<string> const & keys,
 	   string const & search_expr,
 	   vector<string>::const_iterator start,
@@ -492,142 +466,14 @@
 }
 
 
-docstring const parseBibTeX(docstring data, string const & findkey)
+docstring const getValueForKey(KeyValMap const & data, string const & findkey)
 {
-	// at first we delete all characters right of '%' and
-	// replace tabs through a space and remove leading spaces
-	// we read the data line by line so that the \n are
-	// ignored, too.
-	docstring data_;
-	int Entries = 0;
-	docstring dummy = token(data,'\n', Entries);
-	while (!dummy.empty()) {
-		// no tabs
-		dummy = subst(dummy, '\t', ' ');
-		// no leading spaces
-		dummy = ltrim(dummy);
-		// ignore lines with a beginning '%' or ignore all right of %
-		docstring::size_type const idx =
-			dummy.empty() ? docstring::npos : dummy.find('%');
-		if (idx != docstring::npos)
-			// Check if this is really a comment or just "\%"
-			if (idx == 0 || dummy[idx - 1] != '\\')
-				dummy.erase(idx, docstring::npos);
-			else
-				//  This is "\%", so just erase the '\'
-				dummy.erase(idx - 1, 1);
-		// do we have a new token or a new line of
-		// the same one? In the first case we ignore
-		// the \n and in the second we replace it
-		// with a space
-		if (!dummy.empty()) {
-			if (!contains(dummy, '='))
-				data_ += ' ' + dummy;
-			else
-				data_ += dummy;
-		}
-		dummy = token(data, '\n', ++Entries);
-	}
-
-	// replace double commas with "" for easy scanning
-	data = subst(data_, from_ascii(",,"), from_ascii("\"\""));
-
-	// unlikely!
-	if (data.empty())
+	docstring key = from_ascii(findkey);
+	KeyValMap::const_iterator it = data.find(key);
+	if (it == data.end())
 		return docstring();
-
-	// now get only the important line of the bibtex entry.
-	// all entries are devided by ',' except the last one.
-	data += ',';
-	// now we have same behaviour for all entries because the last one
-	// is "blah ... }"
-	Entries = 0;
-	bool found = false;
-	// parsing of title and booktitle is different from the
-	// others, because booktitle contains title
-	do {
-		dummy = token(data, ',', Entries++);
-		if (!dummy.empty()) {
-			found = contains(ascii_lowercase(dummy), from_ascii(findkey));
-			if (findkey == "title" &&
-			    contains(ascii_lowercase(dummy), from_ascii("booktitle")))
-				found = false;
-		}
-	} while (!found && !dummy.empty());
-	if (dummy.empty())
-		// no such keyword
-		return docstring();
-
-	// we are not sure, if we get all, because "key= "blah, blah" is
-	// allowed.
-	// Therefore we read all until the next "=" character, which follows a
-	// new keyword
-	docstring keyvalue = dummy;
-	dummy = token(data, ',', Entries++);
-	while (!contains(dummy, '=') && !dummy.empty()) {
-		keyvalue += ',' + dummy;
-		dummy = token(data, ',', Entries++);
-	}
-
-	// replace double "" with originals ,, (two commas)
-	// leaving us with the all-important line
-	data = subst(keyvalue, from_ascii("\"\""), from_ascii(",,"));
-
-	// Clean-up.
-	// 1. Spaces
-	data = rtrim(data);
-	// 2. if there is no opening '{' then a closing '{' is probably cruft.
-	if (!contains(data, '{'))
-		data = rtrim(data, "}");
-	// happens, when last keyword
-	docstring::size_type const idx =
-		!data.empty() ? data.find('=') : docstring::npos;
-
-	if (idx == docstring::npos)
-		return docstring();
-
-	data = trim(data.substr(idx));
-
-	// a valid entry?
-	if (data.length() < 2 || data[0] != '=')
-		return docstring();
-	else {
-		// delete '=' and the following spaces
-		data = ltrim(data, " =");
-		if (data.length() < 2) {
-			// not long enough to find delimiters
-			return data;
-		} else {
-			docstring::size_type keypos = 1;
-			char_type enclosing;
-			if (data[0] == '{') {
-				enclosing = '}';
-			} else if (data[0] == '"') {
-				enclosing = '"';
-			} else {
-				// no {} and no "", pure data but with a
-				// possible ',' at the end
-				return rtrim(data, ",");
-			}
-			docstring tmp = data.substr(keypos);
-			while (tmp.find('{') != docstring::npos &&
-			       tmp.find('}') != docstring::npos &&
-			       tmp.find('{') < tmp.find('}') &&
-			       tmp.find('{') < tmp.find(enclosing)) {
-
-				keypos += tmp.find('{') + 1;
-				tmp = data.substr(keypos);
-				keypos += tmp.find('}') + 1;
-				tmp = data.substr(keypos);
-			}
-			if (tmp.find(enclosing) == docstring::npos)
-				return data;
-			else {
-				keypos += tmp.find(enclosing);
-				return data.substr(1, keypos - 1);
-			}
-		}
-	}
+	KeyValMap & data2 = const_cast<KeyValMap &>(data);
+	return data2[key];
 }
 
 
@@ -745,7 +591,7 @@
 
 vector<docstring> const
 getNumericalStrings(string const & key,
-		    InfoMap const & map, vector<CiteStyle> const & styles)
+		    BibKeyList const & map, vector<CiteStyle> const & styles)
 {
 	if (map.empty())
 		return vector<docstring>();
@@ -799,7 +645,7 @@
 
 vector<docstring> const
 getAuthorYearStrings(string const & key,
-		    InfoMap const & map, vector<CiteStyle> const & styles)
+		    BibKeyList const & map, vector<CiteStyle> const & styles)
 {
 	if (map.empty())
 		return vector<docstring>();
Index: src/frontends/controllers/ControlCitation.h
===================================================================
--- src/frontends/controllers/ControlCitation.h	(revision 19264)
+++ src/frontends/controllers/ControlCitation.h	(working copy)
@@ -63,7 +63,7 @@
 	}
 private:
 	/// The info associated with each key
-	biblio::InfoMap bibkeysInfo_;
+	biblio::BibKeyList bibkeysInfo_;
 
 	///
 	static std::vector<biblio::CiteStyle> citeStyles_;
Index: src/frontends/controllers/ControlCitation.cpp
===================================================================
--- src/frontends/controllers/ControlCitation.cpp	(revision 19264)
+++ src/frontends/controllers/ControlCitation.cpp	(working copy)
@@ -12,6 +12,7 @@
 #include <config.h>
 
 #include "ControlCitation.h"
+#include "frontend_helpers.h"
 
 #include "Buffer.h"
 #include "BufferParams.h"
@@ -48,12 +49,8 @@
 
 	bool use_styles = engine != biblio::ENGINE_BASIC;
 
-	vector<pair<string, docstring> > blist;
-	kernel().buffer().fillWithBibKeys(blist);
-	bibkeysInfo_.clear();
-	for (size_t i = 0; i < blist.size(); ++i)
-		bibkeysInfo_[blist[i].first] = blist[i].second;
-
+	kernel().buffer().fillWithBibKeys(bibkeysInfo_);
+	
 	if (citeStyles_.empty())
 		citeStyles_ = biblio::getCiteStyles(engine);
 	else {
@@ -137,23 +134,21 @@
 		// it is treated as a simple string by boost::regex.
 		expr = escape_special_chars(expr);
 
-	boost::regex reg_exp(to_utf8(expr), case_sensitive?
+	boost::regex reg_exp(to_utf8(expr), case_sensitive ?
 		boost::regex_constants::normal : boost::regex_constants::icase);
 
 	vector<string>::const_iterator it = keys_to_search.begin();
 	vector<string>::const_iterator end = keys_to_search.end();
 	for (; it != end; ++it ) {
-		biblio::InfoMap::const_iterator info = bibkeysInfo_.find(*it);
+		biblio::BibKeyList::iterator info = bibkeysInfo_.find(*it);
 		if (info == bibkeysInfo_.end())
 			continue;
 
 		string data = *it;
 		// FIXME UNICODE
-		data += ' ' + to_utf8(info->second);
+		data += ' ' + to_utf8((info->second)[biblio::TheDataString]);
 
 		try {
-			// Attempts to find a match for the current RE
-			// somewhere in data.
 			if (boost::regex_search(data, reg_exp))
 				foundKeys.push_back(*it);
 		}
Index: src/insets/InsetBibtex.h
===================================================================
--- src/insets/InsetBibtex.h	(revision 19264)
+++ src/insets/InsetBibtex.h	(working copy)
@@ -13,8 +13,9 @@
 #define INSET_BIBTEX_H
 
 
-#include <vector>
+#include <map>
 #include "InsetCommand.h"
+#include "frontends/controllers/frontend_helpers.h"
 
 #include "support/FileName.h"
 
@@ -38,8 +39,7 @@
 	///
 	int latex(Buffer const &, odocstream &, OutputParams const &) const;
 	///
-	void fillWithBibKeys(Buffer const & buffer,
-		std::vector<std::pair<std::string, docstring> > & keys) const;
+	void fillWithBibKeys(Buffer const & buffer, biblio::BibKeyList & keys) const;
 	///
 	std::vector<support::FileName> const getFiles(Buffer const &) const;
 	///
Index: src/insets/InsetBibtex.cpp
===================================================================
--- src/insets/InsetBibtex.cpp	(revision 19264)
+++ src/insets/InsetBibtex.cpp	(working copy)
@@ -4,6 +4,7 @@
  * Licence details can be found in the file COPYING.
  *
  * \author Alejandro Aguilar Sierra
+ * \author Richard Heck (BibTeX parser improvements)
  *
  * Full author contact details are available in file CREDITS.
  */
@@ -34,7 +35,6 @@
 
 #include <boost/tokenizer.hpp>
 
-
 namespace lyx {
 
 using support::absolutePath;
@@ -415,13 +415,13 @@
 		bool legalChar = true;
 		while (ifs && !isSpace(ch) && 
 			   delimChars.find(ch) == docstring::npos &&
-			   (legalChar = illegalChars.find(ch) == docstring::npos)
-			   ) {
-			if (chCase == makeLowerCase) {
+			   (legalChar = (illegalChars.find(ch) == docstring::npos))
+			   ) 
+		{
+			if (chCase == makeLowerCase)
 				val += lowercase(ch);
-			} else {
+			else
 				val += ch;
-			}
 			ifs.get(ch);
 		}
 		
@@ -478,19 +478,41 @@
 					return false;
 
 			} else if (ch == '"' || ch == '{') {
+				// set end delimiter
+				char_type delim = ch == '"' ? '"': '}';
 
-				// read delimited text - set end delimiter
-				char_type delim = ch == '"'? '"': '}';
-
+				//Skip whitespace
+				do {
+					ifs.get(ch);
+				} while (ifs && isSpace(ch));
+				
+				if (!ifs)
+					return false;
+				
+				//We now have the first non-whitespace character
+				//We'll collapse adjacent whitespace.
+				bool lastWasWhiteSpace = false;
+				
 				// inside this delimited text braces must match.
 				// Thus we can have a closing delimiter only
 				// when nestLevel == 0
 				int nestLevel = 0;
 
-				ifs.get(ch);
 				while (ifs && (nestLevel > 0 || ch != delim)) {
+					if (isSpace(ch)) {
+						lastWasWhiteSpace = true;
+						ifs.get(ch);
+						continue;
+					} 
+					//We output the space only after we stop getting 
+					//whitespace so as not to output any whitespace
+					//at the end of the value.
+					if (lastWasWhiteSpace) {
+						lastWasWhiteSpace = false;
+						val += ' ';
+					}
+					
 					val += ch;
-
 					// update nesting level
 					switch (ch) {
 						case '{':
@@ -503,7 +525,7 @@
 					}
 
 					ifs.get(ch);
-				}
+				} //end while loop
 
 				if (!ifs)
 					return false;
@@ -554,9 +576,9 @@
 }
 
 
-// This method returns a comma separated list of Bibtex entries
+// This method returns a map of Bibtex entries
 void InsetBibtex::fillWithBibKeys(Buffer const & buffer,
-		std::vector<std::pair<string, docstring> > & keys) const
+		biblio::BibKeyList & keys) const
 {
 	vector<FileName> const files = getFiles(buffer);
 	for (vector<FileName>::const_iterator it = files.begin();
@@ -571,15 +593,6 @@
 		//   field values.
 		// - it accepts more characters in keys or value names than
 		//   bibtex does.
-		//
-		// TODOS:
-		// - the entries are split into name = value pairs by the
-		//   parser. These have to be merged again because of the
-		//   way lyx treats the entries ( pair<...>(...) ). The citation
-		//   mechanism in lyx should be changed such that it can use
-		//   the split entries.
-		// - messages on parsing errors can be generated.
-		//
 
 		// Officially bibtex does only support ASCII, but in practice
 		// you can use the encoding of the main document as long as
@@ -588,6 +601,7 @@
 		// We don't restrict keys to ASCII in LyX, since our own
 		// InsetBibitem can generate non-ASCII keys, and nonstandard
 		// 8bit clean bibtex forks exist.
+		
 		idocfstream ifs(it->toFilesystemEncoding().c_str(),
 				std::ios_base::in,
 				buffer.params().encoding().iconvName());
@@ -658,24 +672,29 @@
 					continue;
 
 			} else {
-
-				// Citation entry. Read the key and all name = value pairs
+				// Citation entry. Try to read the key.
 				docstring key;
-				docstring fields;
-				docstring name;
-				docstring value;
-				docstring commaNewline;
 
 				if (!readTypeOrKey(key, ifs, from_ascii(","), 
 				                   from_ascii("}"), keepCase) || !ifs)
 					continue;
 
-				// now we have a key, so we will add an entry
+				/////////////////////////////////////////////
+				// now we have a key, so we will add an entry 
 				// (even if it's empty, as bibtex does)
 				//
+				// we now read the field = value pairs.
 				// all items must be separated by a comma. If
 				// it is missing the scanning of this entry is
 				// stopped and the next is searched.
+				docstring fields;
+				docstring name;
+				docstring value;
+				docstring commaNewline;
+				docstring data;
+				biblio::KeyValMap keyvalmap;
+				keyvalmap[biblio::TheEntryType] = entryType;
+				
 				bool readNext = removeWSAndComma(ifs);
 
 				while (ifs && readNext) {
@@ -698,23 +717,15 @@
 					if (!readValue(value, ifs, strings))
 						break;
 
-					// append field to the total entry string.
-					//
-					// TODO: Here is where the fields can be put in
-					//       a more intelligent structure that preserves
-					//	     the already known parts.
-					fields += commaNewline;
-					fields += name + from_ascii(" = {") + value + '}';
-
-					if (!commaNewline.length())
-						commaNewline = from_ascii(",\n");
-
+					keyvalmap[name] = value;
+					data += "\n\n" + value;
+					
 					readNext = removeWSAndComma(ifs);
 				}
 
 				// add the new entry
-				keys.push_back(pair<string, docstring>(
-				to_utf8(key), fields));
+				keyvalmap[biblio::TheDataString] = data;
+				keys[to_utf8(key)] = keyvalmap;
 			}
 
 		} //< searching '@'
Index: src/insets/InsetCitation.cpp
===================================================================
--- src/insets/InsetCitation.cpp	(revision 19264)
+++ src/insets/InsetCitation.cpp	(working copy)
@@ -65,13 +65,13 @@
 		return docstring();
 
 	// Cache the labels
-	typedef std::map<Buffer const *, biblio::InfoMap> CachedMap;
+	typedef std::map<Buffer const *, biblio::BibKeyList> CachedMap;
 	static CachedMap cached_keys;
 
 	// and cache the timestamp of the bibliography files.
 	static std::map<FileName, time_t> bibfileStatus;
 
-	biblio::InfoMap infomap;
+	biblio::BibKeyList infomap;
 
 	vector<FileName> const & bibfilesCache = buffer.getBibfilesCache();
 	// compare the cached timestamps with the actual ones.
@@ -97,16 +97,7 @@
 
 	// build the keylist only if the bibfiles have been changed
 	if (cached_keys[&buffer].empty() || bibfileStatus.empty() || changed) {
-		typedef vector<std::pair<string, docstring> > InfoType;
-		InfoType bibkeys;
-		buffer.fillWithBibKeys(bibkeys);
-
-		InfoType::const_iterator bit  = bibkeys.begin();
-		InfoType::const_iterator bend = bibkeys.end();
-
-		for (; bit != bend; ++bit)
-			infomap[bit->first] = bit->second;
-
+		buffer.fillWithBibKeys(infomap);
 		cached_keys[&buffer] = infomap;
 	} else
 		// use the cached keys
Index: src/insets/InsetInclude.h
===================================================================
--- src/insets/InsetInclude.h	(revision 19264)
+++ src/insets/InsetInclude.h	(working copy)
@@ -18,6 +18,8 @@
 #include "MailInset.h"
 #include "Counters.h"
 
+#include "frontends/controllers/frontend_helpers.h"
+
 #include "support/FileName.h"
 
 #include <boost/scoped_ptr.hpp>
@@ -57,10 +59,9 @@
 			  std::vector<docstring> & list) const;
 	/** Fills \c keys
 	 *  \param buffer the Buffer containing this inset.
-	 *  \param keys the list of bibkeys in the child buffer.
-	 */
-	void fillWithBibKeys(Buffer const & buffer,
-		std::vector<std::pair<std::string, docstring> > & keys) const;
+	 *  \param keys the map of bibkeys in the child buffer.
+	 */	
+	void fillWithBibKeys(Buffer const & buffer, biblio::BibKeyList & keys) const;
 	/** Update the cache with all bibfiles in use of the child buffer
 	 *  (including bibfiles of grandchild documents).
 	 *  Does nothing if the child document is not loaded to prevent
Index: src/insets/InsetInclude.cpp
===================================================================
--- src/insets/InsetInclude.cpp	(revision 19264)
+++ src/insets/InsetInclude.cpp	(working copy)
@@ -736,7 +736,7 @@
 
 
 void InsetInclude::fillWithBibKeys(Buffer const & buffer,
-		std::vector<std::pair<string, docstring> > & keys) const
+		std::map<std::string, std::map<docstring, docstring> > & keys) const
 {
 	if (loadIfNeeded(buffer, params_)) {
 		string const included_file = includedFilename(buffer, params_).absFilename();
