Hi Dmitry and all,

First of all, please accept my apologies for failing to find the time to
participate in the licensing issues discussion in December. It is a topic
that I would like to discuss and reach a conclusion on, as I often happen
to write code that I'd like to release to the public under the most
relaxed terms possible.

I thought that not claiming copyright (or even disclaiming copyright) and
placing the code into the public domain would be it, but apparently in
many (most? all?) jurisdictions there's no explicitly specified way for
someone to place their works into the public domain (although the concept
of public domain does exist - and stuff "falls" there as old copyrights
expire). Now it has also been mentioned that some jurisdictions don't
recognize public domain at all (I have not yet seen/heard a lawyer state
that, though).

A possible solution could be to simultaneously try to place stuff in the
public domain with a statement to that effect and license it to the
public under very liberal terms. One issue with it is that I have to not
claim or disclaim copyright in order to place a work of mine into the
public domain, yet I have to be the copyright holder in order to license
that work. Maybe this can be taken care of with a severability clause,
making either the public domain or the license work in any given
jurisdiction. But I'd rather see/hear a lawyer comment on that before I
possibly go that route.

That said, a lot of software that we use has been placed in the public
domain by its authors. This includes some software by D. J. Bernstein,
perhaps best known as the author of qmail, who is also known for the
Bernstein vs. United States litigation:

	http://cr.yp.to/export.html

so perhaps he should know the law.

Then, public domain is officially recognized as being compatible with
GNU GPL by the FSF:

	http://www.fsf.org/licensing/licenses/

and is apparently recognized by the OSI:

	http://opensource.org/node/239

On Tue, Feb 05, 2008 at 02:34:40PM +0300, Dmitry Stogov wrote:
> We are going to include your md5() implementation into php-5.3.0.

Great!

> I confirm at least 25% md5() speedup on my Core2 3GHz, however license
> issues are not clear.
> We are going to distribute files under standard PHP license including
> your original copyright notes.
> The files which are going to be committed are attached.
>
> Please confirm your agreement.

Confirmed.

Please note, however, that there were no "copyright notes" on my original
files; instead, there was an authorship note and a public domain
statement.

I also have some comments on the modified files:

> | Copyright (c) 1997-2008 The PHP Group |
...
> | Author: Solar Designer <solar at openwall.com> |

So you claim copyright to a modified version of my code, that I had
placed in the public domain. This is fine by me.
I do not formally require it (in fact, I can't), but maybe the "Author"
line could be changed to either:

| Original author: Solar Designer <solar at openwall.com> |

or:

| Authors: Solar Designer <solar at openwall.com> with further |
| modifications by others. |

(or you can make it more explicit, e.g. "... by The PHP Group" if that is
appropriate - or whatever).

> /* MD5 context. */
> typedef struct {
> 	php_uint32 lo, hi;
> 	php_uint32 a, b, c, d;
> 	unsigned char buffer[64];
> 	php_uint32 block[16];
> } PHP_MD5_CTX;

Maybe it would be better to do:

	typedef php_uint32 MD5_u32plus;

and use the latter type. This would reduce the number of changes between
my version of the code and yours, making it easier for you to sync to any
newer versions of the code that I might make.

> | Author: Solar Designer <solar at openwall.com> |

If you do choose to change this in the .h file, then do the same in the
.c, obviously.

> #if (defined(__APPLE__) || defined(__APPLE_CC__)) && (defined(__BIG_ENDIAN__)
> || defined(__LITTLE_ENDIAN__))
> # if defined(__LITTLE_ENDIAN__)
> # undef WORDS_BIGENDIAN
> # else
> # if defined(__BIG_ENDIAN__)
> # define WORDS_BIGENDIAN
> # endif
> # endif
> #endif

This looks wrong to me. One of the specific properties of my
implementation is that it does not strictly depend on the endianness
being correctly specified at compile-time (and at all, for that matter).
However, if you do happen to use the (little-endian and unaligned-OK)
optimized code on a system that is not in fact little-endian or does not
in fact tolerate unaligned accesses, then problems will arise! So any
#if's you use must assume (might-be-big-endian and
might-disallow-unaligned) by default. In fact, I am only aware of three
widespread and general-purpose architectures that satisfy the criteria
for the optimized code:

#if defined(__i386__) || defined(__x86_64__) || defined(__vax__)

Thus, I suggest that you leave the above #if intact, the way it was in
the patch that I submitted.
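
To make the "safe by default" argument concrete, here is a minimal sketch of the pattern, not the code from the submitted patch. The names load_le32_portable, load_le32, load_le32_selfcheck and MD5_FAST_LOAD_OK are hypothetical and exist only for this illustration. The fast path also uses memcpy() rather than the pointer cast seen in the quoted SET() macro, which is a substitution made here to keep the example well-defined C.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical names throughout; this is an illustration of the
 * "whitelist known-safe architectures, default to portable" pattern,
 * not the actual md5.c.
 */

/* Portable path: assemble the little-endian word byte by byte.
 * Correct on any host, whatever its endianness or alignment rules. */
static uint32_t load_le32_portable(const unsigned char *p)
{
	return (uint32_t)p[0] |
	    ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16) |
	    ((uint32_t)p[3] << 24);
}

/* Enable the fast path only on architectures explicitly known to be
 * little-endian and tolerant of unaligned loads.  Anything not listed
 * here, including a big-endian host with no endianness macro defined,
 * falls back to the portable path. */
#if defined(__i386__) || defined(__x86_64__) || defined(__vax__)
# define MD5_FAST_LOAD_OK 1
#else
# define MD5_FAST_LOAD_OK 0
#endif

static uint32_t load_le32(const unsigned char *p)
{
#if MD5_FAST_LOAD_OK
	uint32_t w;
	memcpy(&w, p, sizeof(w)); /* host order equals little-endian here */
	return w;
#else
	return load_le32_portable(p);
#endif
}

/* Both paths must agree, including at an odd (unaligned) offset. */
static void load_le32_selfcheck(void)
{
	static const unsigned char buf[5] = { 0xff, 0x01, 0x02, 0x03, 0x04 };

	assert(load_le32(buf + 1) == 0x04030201u);
	assert(load_le32_portable(buf + 1) == 0x04030201u);
}
```

The design choice is that a wrong or missing configuration macro can only cost speed, never correctness: the only way to reach the fast path is to be on a listed architecture.
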
Do not explicitly check for any endianness macros - this is bound to
cause problems.

> /*
> * * SET reads 4 input bytes in little-endian byte order and stores them
> * * in a properly aligned word in host byte order.
> * *
> * * The check for little-endian architectures that tolerate unaligned
> * * memory accesses is just an optimization. Nothing will break if it
> * * doesn't work.
> * */
> #ifndef WORDS_BIGENDIAN
> # define SET(n) \
> 	(*(php_uint32 *)&ptr[(n) * 4])
> # define GET(n) \
> 	SET(n)
> #else
...

As explained above, I strongly recommend that you revert your
"#ifndef WORDS_BIGENDIAN" to my "#if ..."

What if an architecture is big-endian, but WORDS_BIGENDIAN just happens
to not be specified? You'll have incorrect results (not MD5), whereas
with my version of the code, everything will be just fine.

Similarly, regardless of endianness, if WORDS_BIGENDIAN is not specified
(maybe because the architecture is in fact little-endian), but the
architecture does not tolerate unaligned accesses (at all or supports
them with kernel emulation), things will go wrong (SIGBUS or very poor
performance and a flood of kernel messages). This issue can't occur with
my original #if that only lists specific known-safe architectures.

> 	data = body(ctx, data, size & ~(unsigned long)0x3f);

If you change all of my unsigned long's to size_t, you should change this
one as well. When on a 64-bit system (userland pointer size), your size_t
better be 64-bit as well (I have not checked whether this is necessarily
the case; I hope so).

> PHPAPI void PHP_MD5Final(unsigned char *result, PHP_MD5_CTX *ctx)
> {
> 	unsigned long used, free;

Here's another one.

Thanks,

Alexander Peslyak <solar at openwall.com>
GPG key ID: 5B341F15 fp: B3FB 63F4 D7A3 BCCC 6F6E FC55 A2FC 027C 5B34 1F15
http://www.openwall.com - bringing security into open computing environments

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php