Hi,

Thanks for the info. I understand that ELT (Extract, Load, Transform) is more appropriate for big data than traditional ETL. What are its major advantages in the Big Data space?

For example, if I started using Sqoop to pull data from traditional transactional and Data Warehouse databases and created the same tables in Hive, what would the next step be to arrive at a consolidated data model in Hive on HDFS? The entry tables would mirror the source tables' structure, correct? How many ELT steps does one generally need to apply to reach the final model, and will ELT speed up this process?

I understand this is a very broad question, but any comments will be welcome.

Regards
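P.S. To make the question concrete, this is the kind of two-hop flow I have in mind; a minimal sketch only, assuming a MySQL source and made-up database/table names (staging.orders, consolidated.orders_fact):

    # hop 1 (E + L): land the source table in Hive as-is via Sqoop
    sqoop import \
      --connect jdbc:mysql://dbserver/sales \
      --username etl_user -P \
      --table orders \
      --hive-import \
      --hive-table staging.orders \
      -m 4

    -- hop 2 (T): transform inside Hive, writing a columnar copy
    CREATE TABLE consolidated.orders_fact STORED AS ORC AS
    SELECT order_id,
           customer_id,
           to_date(order_ts) AS order_date,
           amount
    FROM   staging.orders;

Is that roughly the shape of it, with further T steps stacked on top as needed?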
On Friday, 18 December 2015, 22:27, Jörn Franke <jornfra...@gmail.com> wrote:

I think you should draw more attention to the fact that Hive is just one component in the ecosystem. You can have many more components, such as ELT, integration of unstructured data, machine learning, streaming data, etc. However, analysts are usually not aware of the technologies, and IT staff are not very aware of how they can bring benefits to a specific business domain. You could explore the potential together in workshops, design thinking sessions, etc. Once you know more details and both sides have decided on potential ways forward, you can start doing PoCs and see what works and what does not. It is important that you break the old ties created by more traditional data warehouse approaches in the past and go beyond the comfort zone.

On 18 Dec 2015, at 22:01, Ashok Kumar <ashok34...@yahoo.com> wrote:

Gurus,

Some analysts keep asking me about the advantages of having Hive tables when the star schema in a Data Warehouse (DW) does the same. For example, if you have fact and dimension tables in the DW and just import them into Hive via, say, Sqoop, what are we going to gain? I keep telling them: storage economy and cheap disks, further de-normalisation can be done, etc. However, they are not convinced :(

Any additional comments will help my case.

Thanks a lot
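P.S. By further de-normalisation I mean something like collapsing the star-schema join into one wide columnar table; a rough sketch, assuming an illustrative fact/dimension schema (fact_sales, dim_customer):

    -- flatten fact + dimension into one wide ORC table;
    -- cheap HDFS storage makes the duplicated columns affordable
    CREATE TABLE sales_flat STORED AS ORC AS
    SELECT f.sale_id,
           f.sale_date,
           f.amount,
           d.customer_name,
           d.customer_region
    FROM   fact_sales   f
    JOIN   dim_customer d
      ON   f.customer_key = d.customer_key;

Reporting scans then hit one table instead of repeating the join.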