Hello all, I need the following information about Apache Kafka as a tool for data integration and ETL:
Development effort: Are the development effort, time, and complexity generally higher?
Maintainability: Is it less maintainable?
Error handling: Does it have only a single log file, or a log and error port in every transform? What kinds of errors can be handled?
Teams needed: Does it need a separate administration team, or will a Unix or NT admin suffice, so that no dedicated administrator is required?
File structure: Can it only read records with a single type of delimiter?
Data integration capability: ODI has a comparatively narrower range of data integration products and capabilities, including related functions such as profiling and data quality. Is the same true here? If it does offer these capabilities, are they mainstream?
Market segments: Does it serve medium to large companies?
Debugging: Does it offer easy debugging (for example, placing watchers at the required points so that intermediate data is saved to temporary files for easy viewing), or does it require a complex process through a debugger?
Company strategy: Can you download a scaled-down free version of the software, and is plenty of free documentation available on the internet?
Go-live rate: Is the go-live success rate high? Are there any known issues during deployment?
Scalability: Is there any issue with stability? If yes, what causes it and what is the impact? Which kind of scalability is supported: horizontal, vertical, or both?
Performance: Can it support high-volume data movement, transformation, and integration (ETL operations)? What about parallelism: mapping-level parallelism, session-level parallelism, and multiple parallel source and target data loads?
Heterogeneous systems: Does it integrate data from heterogeneous systems such as different databases (SQL Server, Oracle, DB2, etc.) and files (XML, XLS, CSV, text, etc.)? Can targets be any type of database, file, etc.?
Big Data support: Can it be integrated with and used for Big Data?
Cloud solution: Is it available both in the cloud and on premises?
Pricing: Is it freeware or open source? Does it come in basic, standard, and enterprise editions? If yes, are all editions free?
Repository: Does it offer repositories? Are those repositories for metadata? Does the repository host have to be a relational database?
Pushdown mechanism: Does it have pushdown optimization, where it generates SQL statements from the workflow/mapping that can be executed directly on the database? Is it an ETL or an ELT tool?
Job scheduling: Does it come with a built-in scheduler?
Version control: Does it offer version control? If yes, is it tightly or moderately controlled?
Tool bugs: Are there any known tool bugs? Any issues caused by those bugs?
Is there anything else you want to highlight?
Thanks, Rajneesh