Shaik Meeran created SQOOP-3452:
-----------------------------------

             Summary: SQOOP Upper bound value issue when import from MSSQLServer using TOP n
                 Key: SQOOP-3452
                 URL: https://issues.apache.org/jira/browse/SQOOP-3452
             Project: Sqoop
          Issue Type: Bug
          Components: connectors/sqlserver, op
    Affects Versions: 1.4.6
         Environment: CENTOS
SQOOP - 1.4.6
Hadoop - 2.6.0
            Reporter: Shaik Meeran
         Attachments: SqoopImportLog.txt

I am trying to import data from MSSQL Server into Hadoop using Sqoop, with a fixed number of rows in each batch, by using TOP 'N' in the query.

Below is the command:

sqoop import --connect "jdbc:sqlserver://myserverIp;database=organization" \
 --connection-manager org.apache.sqoop.manager.SQLServerManager \
 --driver com.microsoft.sqlserver.jdbc.SQLServerDriver \
 --username hadoop --password hadoop123 \
 --target-dir /user/hive/warehouse/organization.db/employees \
 --query "SELECT TOP 5 [employee_id] \
 ,[first_name] \
 ,[last_name] \
 ,[email] \
 ,[phone_number] \
 ,[hire_date] \
 ,[job_id] \
 ,[salary] \
 ,[commission_pct] \
 ,[manager_id] \
 ,[department_id] \
 FROM [dbo].[employees] where \$CONDITIONS" \
 --m 1 --incremental append --check-column employee_id --last-value 99;

When I execute it, Sqoop fetches the upper bound value with the query below:

SELECT MAX([employee_id]) FROM (SELECT TOP 5 [employee_id] ,[first_name] ,[last_name] ,[email] ,[phone_number] ,[hire_date] ,[job_id] ,[salary] ,[commission_pct] ,[manager_id] ,[department_id] FROM [dbo].[employees] where (1 = 1)) sqoop_import_query_alias

This returns 206, which is the highest value in the employees table.

But I am reading only five rows, so the returned value should be the maximum of those five rows, which is 104 in this case. This will therefore be an issue when I run the job again, since the next run would use 206 as the last value and skip the rows that were not imported.

I have posted this issue with the MAX function on the MSDN forum and got some responses suggesting an ORDER BY clause in the subquery.

[MSDN Issue|https://social.msdn.microsoft.com/Forums/en-US/f1b3ad8f-7936-419b-b10d-8e2c19e4d266/aggregate-max-on-sub-query-with-quottop-nquot-is-not-working-correctly?forum=transactsql]

I believe this should be fixed in Sqoop when it is used with MSSQL.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
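
For reference, the ORDER BY suggestion from the MSDN thread can be sketched against the upper-bound query quoted in the report. This is only an illustration: it assumes that "top 5" is meant as the five lowest [employee_id] values, it lists only the check column for brevity, and it does not show that Sqoop can be made to generate this form.

```sql
-- Sketch only, not Sqoop's generated query.
-- Without ORDER BY, TOP 5 does not define which five rows are taken.
-- With ORDER BY, the subquery is pinned to the five lowest employee_id
-- values, so MAX() is computed over exactly those rows.
SELECT MAX([employee_id])
FROM (
    SELECT TOP 5 [employee_id]
    FROM [dbo].[employees]
    WHERE (1 = 1)
    ORDER BY [employee_id]
) sqoop_import_query_alias;
```

T-SQL permits ORDER BY inside a derived table only when TOP (or OFFSET / FOR XML) is also specified, which is the case here.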