[ 
https://issues.apache.org/jira/browse/SOLR-13238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Walraven updated SOLR-13238:
---------------------------------
    Description: 
Introduced in SOLR-6787

The blob handler currently uses the following logic for generating/storing the 
md5 for uploads:
{code:java}
MessageDigest m = MessageDigest.getInstance("MD5");
m.update(payload.array(), payload.position(), payload.limit());
String md5 = new BigInteger(1, m.digest()).toString(16);
{code}

Unfortunately, BigInteger.toString(16) drops leading zeros, so the result is
not padded whenever the most significant byte of the digest is less than 0x10.
For uniformly distributed digests that is roughly one upload in sixteen, and
the stored md5 hash then has 31 characters (or fewer) instead of 32.
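
To make the problem concrete, here is a minimal standalone sketch (not the BlobHandler code itself; the class and method names are made up for illustration). It applies the same BigInteger conversion to the input "a", whose MD5 is listed in the RFC 1321 test suite as 0cc175b9c0f1b6a831c399e269772661, so the leading zero is lost:

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class UnpaddedMd5Demo {
    // Same conversion as the snippet above: BigInteger keeps no leading zeros.
    static String unpaddedMd5(byte[] data) {
        try {
            MessageDigest m = MessageDigest.getInstance("MD5");
            return new BigInteger(1, m.digest(data)).toString(16);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a standard JDK algorithm, so this is not expected to happen.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        // RFC 1321 lists MD5("a") as 0cc175b9c0f1b6a831c399e269772661.
        String md5 = unpaddedMd5("a".getBytes(StandardCharsets.US_ASCII));
        System.out.println(md5 + " has " + md5.length() + " characters");
    }
}
```

Any client that compares the stored value against a conventional 32-character md5 string would see a mismatch for such a payload.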

I have opened a PR with the following recommended change:
{code:java}
String md5 = new String(Hex.encodeHex(m.digest()));
{code}
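
Hex.encodeHex comes from Apache Commons Codec and emits two hex characters for every byte, which is why it keeps the leading zeros. For readers without that dependency at hand, the same idea can be sketched with the JDK alone. This is only an illustration under that assumption, not the code in the PR, and the class and method names are hypothetical:

```java
public class PaddedHexDemo {
    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    // Emits exactly two lowercase hex characters per byte, so zero nibbles
    // at the front of the digest are preserved.
    static String toHex(byte[] bytes) {
        char[] out = new char[bytes.length * 2];
        for (int i = 0; i < bytes.length; i++) {
            int v = bytes[i] & 0xff;            // treat the byte as unsigned
            out[2 * i] = DIGITS[v >>> 4];       // high nibble
            out[2 * i + 1] = DIGITS[v & 0x0f];  // low nibble
        }
        return new String(out);
    }

    public static void main(String[] args) {
        // A 16-byte array stands in for an MD5 digest with a small first byte.
        byte[] digest = new byte[16];
        digest[0] = 0x0c;
        String hex = toHex(digest);
        System.out.println(hex + " has " + hex.length() + " characters");
    }
}
```

With a per-byte encoding like this, a 16-byte digest always yields 32 characters regardless of its leading bytes.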

  was:
Introduced in SOLR-6787

The blob handler currently uses the following logic for generating/storing the 
md5 for uploads:
{code:java}
MessageDigest m = MessageDigest.getInstance("MD5");
m.update(payload.array(), payload.position(), payload.limit());
String md5 = new BigInteger(1, m.digest()).toString(16);
{code}

Unfortunately, this method does not provide padding for any md5 with less than 
0x10 for its most significant byte. This means that on many occasions it could 
end up with a md5 hash of 31 characters instead of 32. 

I will open a PR shortly with the following recommended change:
{code:java}
String md5 = new String(Hex.encodeHex(m.digest()));
{code}


> BlobHandler generates non-padded md5
> ------------------------------------
>
>                 Key: SOLR-13238
>                 URL: https://issues.apache.org/jira/browse/SOLR-13238
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: blobstore
>    Affects Versions: 6.0, 6.6.5, 7.0, 7.6
>            Reporter: Jeff Walraven
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
