Incremental import using Sqoop2

I would like to import data from a MySQL table into HDFS. I have everything configured and I am able to create a simple job in sqoop-shell that copies data. However, I would like to copy only new records each time, and I am not sure how to achieve this. When I create a job there is a parameter named "check column", and columns like ID or eventTimestamp seem suitable there. However, in that case I should also enter a "last value". Do I have to manage this last value myself and create a new job with a new "last value" each time? W...
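In classic Sqoop (1.x), a saved job handles this bookkeeping for you: the job metastore records the highest check-column value seen after each run, so you do not have to manage "last value" by hand. A minimal sketch, assuming an `orders` table with a monotonically increasing `id` column (the connection string, table, and paths are illustrative):

```shell
# Create a saved job once; --last-value only seeds the first run.
sqoop job --create orders_incremental -- import \
  --connect jdbc:mysql://dbhost/shop \
  --username loader -P \
  --table orders \
  --target-dir /data/orders \
  --incremental append \
  --check-column id \
  --last-value 0

# Each execution imports only rows with id greater than the stored
# last value, then updates that value in the job metastore.
sqoop job --exec orders_incremental
```

For a timestamp column such as eventTimestamp, `--incremental lastmodified` is the analogous mode. Note this is the Sqoop 1 workflow; whether your Sqoop2 server automates the last-value update in the same way depends on the version you run.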

Sqoop vs Sqoop2 commands

One of the advantages of migrating to Sqoop2 is that we are not sharing database credentials with clients. Now, when we execute Sqoop commands, they look like this: sqoop import --connect ... --username ... --table ... When we upgrade to Sqoop2 we still execute the same command, except that the connection string points to the Sqoop2 server rather than the actual RDBMS involved in the transfer, and the credentials are those of the Sqoop2 server. Here we are still sharing the credentials of the Sqoop2 server with all the clients. Is it not violating the basic principle for wh...
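For context, the pattern Sqoop2 intends is that an administrator creates link objects on the server, where the RDBMS credentials are stored; clients then reference links by name when building jobs, so the database password itself never reaches them. A sketch in sqoop-shell (link and job names are illustrative, and the `create` commands prompt interactively for connection details):

```shell
# Run once by an admin inside sqoop2-shell: the JDBC URL, username,
# and password are stored server-side in the link object.
create link --connector generic-jdbc-connector --name mysql-link
create link --connector hdfs-connector --name hdfs-link

# Clients build and run jobs that refer to existing links by name,
# without ever seeing the RDBMS credentials.
create job --from mysql-link --to hdfs-link --name orders-job
start job --name orders-job
```

Whether clients also need credentials for the Sqoop2 server itself is a separate concern, handled by the server's own authentication layer rather than by the link objects.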

sqoop2 - Loading a table from Oracle to Hive converts date and timestamp to string

I have noticed this in the Sqoop manual: 24.5. Schema Definition in Hive: "Hive users will note that there is not a one-to-one mapping between SQL types and Hive types. In general, SQL types that do not have a direct mapping (for example, DATE, TIME, and TIMESTAMP) will be coerced to STRING in Hive. The NUMERIC and DECIMAL SQL types will be coerced to DOUBLE. In these cases, Sqoop will emit a warning in its log messages informing you of the loss of precision." And I know Sqoop can specify the column-type mapping using --map-column-java --map...
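If the goal is to keep DATE/TIMESTAMP columns as Hive TIMESTAMP rather than STRING, classic Sqoop offers `--map-column-hive` alongside `--map-column-java`. A sketch, assuming an Oracle table `EVENTS` with a `CREATED_AT` timestamp column (all names are illustrative):

```shell
# Override both the intermediate Java type and the Hive column type
# for the timestamp column, instead of the default STRING coercion.
sqoop import \
  --connect jdbc:oracle:thin:@dbhost:1521/ORCL \
  --username loader -P \
  --table EVENTS \
  --hive-import \
  --map-column-java CREATED_AT=java.sql.Timestamp \
  --map-column-hive CREATED_AT=TIMESTAMP
```

Note that the values written to HDFS must still be in a format Hive can parse as TIMESTAMP; if they are not, the column will read back as NULL even though the schema says TIMESTAMP.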