[ https://issues.apache.org/jira/browse/HIVE-3906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jothy Babu updated HIVE-3906:
-----------------------------

Description:

Current releases of Hive lack a function that percent-encodes (escapes) a URI or its form parameters.
The proposed function URI_ESCAPE (uri) returns the encoded form of the URI for use in HiveQL. It is always advisable to encode URLs and form parameters: an unencoded form parameter is vulnerable to cross-site attacks and SQL injection, and may drive a web application into unpredictable output.

Functionality :-
Function Name: URI_ESCAPE (uri)
Returns the encoded form of the uri.
Example:
hive> SELECT URI_ESCAPE('http://www.example.com?a=l&t');
-> 'http%3A%2F%2Fwww.example.com%3Fa%3Dl%26t'

Usage :-
Case 1 : To get the encoded uri corresponding to a particular uri
hive> SELECT URI_ESCAPE('http://google.com/resource?key=value1 & value2');
-> 'http%3A%2F%2Fgoogle.com%2Fresource%3Fkey%3Dvalue1%20%26%20value2'

Case 2 : To query a table to get the encoded form of the urls corresponding to users
Table :- USER_URLS
userid |url
USR00001|http://www.example.com?a=l&t
USR00010|http://search.barnesandnoble.com/booksearch/first book.pdf
USR00100|http://abc.dev.domain.com/0007AC/ads/800x480 15sec h.264.mp4
USR01000|http://google.com/resource?key=value
USR10000|http://google.com/resource?key=value1 & value2
USR10001|ftp://eau.ww.eesd.gov.calgary/home/smith/budget.wk1
USR10010|gopher://gopher.voa.gov
USR10100|http://www.apple.com/index.html
USR11000|file:/data/letters/to_mom.txt
USR11001|http://www.cuug.ab.ca:8001/~branderr/csce.html

Query : select userid,url,uri_escape(url) from USER_URLS;
Result :-
USR00001|http://www.example.com?a=l&t|http%3A%2F%2Fwww.example.com%3Fa%3Dl%26t
USR00010|http://search.barnesandnoble.com/booksearch/first book.pdf|http%3A%2F%2Fsearch.barnesandnoble.com%2Fbooksearch%2Ffirst%20book.pdf
USR00100|http://abc.dev.domain.com/0007AC/ads/800x480 15sec h.264.mp4|http%3A%2F%2Fabc.dev.domain.com%2F0007AC%2Fads%2F800x480%2015sec%20h.264.mp4
USR01000|http://google.com/resource?key=value|http%3A%2F%2Fgoogle.com%2Fresource%3Fkey%3Dvalue
USR10000|http://google.com/resource?key=value1 & value2|http%3A%2F%2Fgoogle.com%2Fresource%3Fkey%3Dvalue1%20%26%20value2
USR10001|ftp://eau.ww.eesd.gov.calgary/home/smith/budget.wk1|ftp%3A%2F%2Feau.ww.eesd.gov.calgary%2Fhome%2Fsmith%2Fbudget.wk1
USR10010|gopher://gopher.voa.gov|gopher%3A%2F%2Fgopher.voa.gov
USR10100|http://www.apple.com/index.html|http%3A%2F%2Fwww.apple.com%2Findex.html
USR11000|file:/data/letters/to_mom.txt|file%3A%2Fdata%2Fletters%2Fto_mom.txt
USR11001|http://www.cuug.ab.ca:8001/~branderr/csce.html|http%3A%2F%2Fwww.cuug.ab.ca%3A8001%2F%7Ebranderr%2Fcsce.html

Current releases of Hive also lack a function that decodes an encoded uri.
The proposed function URI_UNESCAPE (uri) returns the decoded form of the encoded URI for use in HiveQL. It converts the specified string by replacing each escape sequence with its unescaped representation.

Functionality :-
Function Name: URI_UNESCAPE (uri)
Returns the decoded form of the encoded uri.
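
As a rough illustration of the behaviour described for the two functions (this is not the patch for this issue, whose implementation is not shown in this message), the encoding and decoding could be approximated in plain Java with java.net.URLEncoder and java.net.URLDecoder. The class name UriEscapeSketch and its method names are hypothetical, and the wiring into a Hive UDF is left out. One known difference from a strict percent-encoder: URLEncoder leaves '*' unescaped.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper, for illustration only; not part of Hive.
final class UriEscapeSketch {

    private UriEscapeSketch() {}

    // Percent-encodes everything except letters, digits, '.', '-', '_' and '*'.
    static String escape(String uri) {
        if (uri == null) {
            return null;
        }
        try {
            // URLEncoder writes a space as '+'. A literal '+' in the input has
            // already become %2B, so every remaining '+' stands for a space.
            return URLEncoder.encode(uri, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Replaces each %XX escape sequence with the character it represents.
    // Throws IllegalArgumentException on a malformed escape such as "%zz".
    static String unescape(String encoded) {
        if (encoded == null) {
            return null;
        }
        try {
            // URLDecoder would turn a literal '+' into a space; protect it first.
            return URLDecoder.decode(encoded.replace("+", "%2B"), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        String escaped = escape("http://www.example.com?a=l&t");
        System.out.println(escaped);            // http%3A%2F%2Fwww.example.com%3Fa%3Dl%26t
        System.out.println(unescape(escaped));  // http://www.example.com?a=l&t
    }
}
```

With these assumptions, unescape(escape(s)) returns s for any input string, including inputs that contain spaces or a literal '+'.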
Example:
hive> SELECT URI_UNESCAPE('http%3A%2F%2Fwww.example.com%3Fa%3Dl%26t');
-> 'http://www.example.com?a=l&t'

Usage :-
Case 1 : To get the decoded uri corresponding to a particular encoded uri
hive> SELECT URI_UNESCAPE('http%3A%2F%2Fgoogle.com%2Fresource%3Fkey%3Dvalue1%20%26%20value2');
-> 'http://google.com/resource?key=value1 & value2'

Case 2 : To query a table to get the decoded form of the encoded urls corresponding to users
Table :- USER_URLS
userid |encodedurl
USR00001|http%3A%2F%2Fwww.example.com%3Fa%3Dl%26t
USR00010|http://search.barnesandnoble.com/booksearch/first%20book.pdf
USR00100|http%3A%2F%2Fabc.dev.domain.com%2F0007AC%2Fads%2F800x480%2015sec%20h.264.mp4
USR01000|http%3A%2F%2Fgoogle.com%2Fresource%3Fkey%3Dvalue
USR10000|http%3A%2F%2Fgoogle.com%2Fresource%3Fkey%3Dvalue1%20%26%20value2
USR10001|ftp%3A%2F%2Feau.ww.eesd.gov.calgary%2Fhome%2Fsmith%2Fbudget.wk1
USR10010|gopher%3A%2F%2Fgopher.voa.gov
USR10100|http%3A%2F%2Fwww.apple.com%2Findex.html
USR11000|file%3A%2Fdata%2Fletters%2Fto_mom.txt
USR11001|http%3A%2F%2Fwww.cuug.ab.ca%3A8001%2F%7Ebranderr%2Fcsce.html

Query : select userid,encodedurl,uri_unescape(encodedurl) from USER_URLS;
Result :-
USR00001|http%3A%2F%2Fwww.example.com%3Fa%3Dl%26t|http://www.example.com?a=l&t
USR00010|http://search.barnesandnoble.com/booksearch/first%20book.pdf|http://search.barnesandnoble.com/booksearch/first book.pdf
USR00100|http%3A%2F%2Fabc.dev.domain.com%2F0007AC%2Fads%2F800x480%2015sec%20h.264.mp4|http://abc.dev.domain.com/0007AC/ads/800x480 15sec h.264.mp4
USR01000|http%3A%2F%2Fgoogle.com%2Fresource%3Fkey%3Dvalue|http://google.com/resource?key=value
USR10000|http%3A%2F%2Fgoogle.com%2Fresource%3Fkey%3Dvalue1%20%26%20value2|http://google.com/resource?key=value1 & value2
USR10001|ftp%3A%2F%2Feau.ww.eesd.gov.calgary%2Fhome%2Fsmith%2Fbudget.wk1|ftp://eau.ww.eesd.gov.calgary/home/smith/budget.wk1
USR10010|gopher%3A%2F%2Fgopher.voa.gov|gopher://gopher.voa.gov
USR10100|http%3A%2F%2Fwww.apple.com%2Findex.html|http://www.apple.com/index.html
USR11000|file%3A%2Fdata%2Fletters%2Fto_mom.txt|file:/data/letters/to_mom.txt
USR11001|http%3A%2F%2Fwww.cuug.ab.ca%3A8001%2F%7Ebranderr%2Fcsce.html|http://www.cuug.ab.ca:8001/~branderr/csce.html

> URI_Escape and URI_UnEscape UDF
> -------------------------------
>
>                 Key: HIVE-3906
>                 URL: https://issues.apache.org/jira/browse/HIVE-3906
>             Project: Hive
>          Issue Type: New Feature
>          Components: UDF
>    Affects Versions: 0.8.1
>         Environment: Hadoop 0.20.1
>                      Java 1.6.0
>            Reporter: Liu Zongquan
>              Labels: patch
>             Fix For: 0.8.1
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h