Sorry to add yet another voice to the discussion. I agree with Apostolos Syropoulos that adding primitives to XeTeX should be limited, but I disagree on other points.
On 7/2/15, Apostolos Syropoulos <asyropou...@yahoo.com> wrote:
> So someone will step in and implement this primitive but then we
> will realize we need another primitive to handle the more advanced
> sha256. Programming languages have libraries for this and they do
> not modify the language to handle every new feature. So the best
> solution is to introduce some library mechanism that would make
> it possible to introduce new commands without affecting the kernel.

The difference is that programming languages provide access to the filesystem and everything else the programmer might need. Then libraries can put these raw features together in nice abstractions. XeTeX currently provides no safe* way to do various operations that pdfTeX/LuaTeX allow. The obstruction to implementing the md5 hash in a package is that XeTeX provides no way to access bytes in a file: it can only read files encoded in utf8.

*By "safe" I am excluding shell-escape, which would allow for arbitrary code execution.

If it were possible to read a file's bytes, then implementing md5, sha1, sha256 would be straightforward. For this, I suggest the pdfTeX primitive \pdffiledump, which expands to a hexadecimal representation of some bytes in a file. An identical primitive could safely be added to XeTeX. It would make it possible to compute the md5 hash of a file while being sure that this is indeed the same file as what XeTeX would \read or \input: the PerlTeX approach cannot ensure this, as the path searched by (Xe)TeX is different from that searched by Perl.

For definiteness, here is the description of \pdffiledump from the pdfTeX manual.

    \pdffiledump [ offset ⟨number⟩ ] [ length ⟨number⟩ ] ⟨general text⟩ (expandable)

    Expands to the dump of the file ⟨general text⟩ in uppercase hexadecimal format (same as \pdfescapehex), starting at offset ⟨number⟩ or 0 with length ⟨number⟩, if given. The first ten bytes of the source of this manual are 2520696E746572666163. The primitive was introduced in pdfTeX 1.30.0.
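To make the semantics concrete, here is a rough Python emulation of what such a primitive gives a package author. It is not TeX code and not how pdfTeX implements anything: file_dump and md5_via_dump are made-up names, os.path.getsize merely stands in for \pdffilesize, and hashlib stands in for the md5 arithmetic that a package would have to do with TeX macros. The only point is that an offset/length hex dump plus the file size are enough to hash a file chunk by chunk.

```python
import hashlib
import os
import tempfile


def file_dump(path, offset, length):
    # Hypothetical stand-in for \pdffiledump: uppercase hexadecimal
    # representation of `length` bytes of the file, starting at `offset`.
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length).hex().upper()


def md5_via_dump(path, chunk=64):
    # Hypothetical stand-in for \pdffilesize: tells us where the file ends.
    size = os.path.getsize(path)
    digest = hashlib.md5()
    # Walk the file in fixed-size chunks, using only the hex dumps.
    for offset in range(0, size, chunk):
        digest.update(bytes.fromhex(file_dump(path, offset, chunk)))
    return digest.hexdigest().upper()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.tex")
        with open(path, "wb") as f:
            f.write(b"% interface to pdfTeX\n")
        print(file_dump(path, 0, 10))  # 2520696E746572666163
        print(md5_via_dump(path))
```

The sample file starts with the bytes "% interfac", so the first dump reproduces the ten-byte example quoted from the manual. Swapping hashlib.md5 for hashlib.sha1 or hashlib.sha256 changes nothing else in the loop, which is why a single byte-access primitive would cover all of these hashes.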
Adding this primitive settles the question of md5, sha1, sha256 hashes, of reading back in _exactly_ a file that has been written by XeTeX, and also IIRC of finding the bounding box in some eps images.

For other "missing" primitives one should evaluate whether they are implementable as library code, and how useful they are.

\pdfcreationdate: not sure how useful it is, perhaps for compliance with some standards.

\pdfescapestring, \pdfescapename, \pdfescapehex, \pdfunescapehex: implementable in TeX, and anyway it is unclear how chars >127 should be treated.

\pdfuniformdeviate, \pdfnormaldeviate, \pdfrandomseed, \pdfsetrandomseed: pseudo-randomness is implementable in TeX, but perhaps better random numbers are needed. It seems very specific.

\pdffilemoddate (not strictly necessary), \pdffilesize (might be necessary for \pdffiledump), and \pdfmdfivesum (see this whole discussion thread).

So all in all, I'd be in favor of adding \pdffilesize and \pdffiledump to XeTeX, and leaving the other primitives out, including the mdfive one.

Regards,
Bruno