This lecture will be given today at Tel-Aviv University, School of Computer Science, Schreiber building.
Time: 17:00-20:00
Room: 006
Map: http://www2.tau.ac.il/map/unimapl1.asp

Hadoop at Yahoo! State of technology today and future development.

Talk Abstract

As the most visited site on the Internet, Yahoo! has to build products that scale to thousands of servers. The adoption of Behavioral Targeting and WebAnalytics into the production cycle forced web companies to build expensive data pipelines with a data warehouse and a myriad of compute clusters. Grid computing, which had long struggled to find its place in the industry, then turned out to be the right tool for the job.

In 2006 Yahoo! started working with an open-source project called Hadoop, with the goal of building a stable and scalable grid solution that includes a distributed file system (HDFS) and a MapReduce framework. Today the Y! Grid Technology Team is completing the migration of all Y! data-driven businesses to Hadoop. The project quickly gained momentum in the open-source community, attracting dozens of contributors. Every month we host the Hadoop User Group at Yahoo!, which has become a meeting place for Hadoop developers and users from Facebook, IBM, Google, Ebay and many Silicon Valley startups.

In my presentation I will talk about how Web Giants use data, the technologies that have been used so far, and how Hadoop helps to streamline product development and R&D. We will also cover the current state of Hadoop technology and next year's roadmap.

BIO

Michael Pilip is a Sr. Product Manager in the Y! Cloud Computing & Data Infrastructure division, responsible for Hadoop development and the migration of Y! products to the grid. Before joining Hadoop, Michael worked on the Y! analytical data pipeline that brings data from tens of thousands of web servers around the globe to Data Marts and Behavioral Targeting applications. Prior to turning to the data business, Michael was a lead developer in Y! Games, building the biggest on-line games portal in the world.

Agenda:
- How web giants use user-generated data (WebAnalytics, Behavioral Targeting, Reporting, R&D)
- Data Pipelines architecture (Data Collection, ETL, Data Warehousing, Aggregations, DataMarts, AdServing Systems)
- Hadoop to the rescue
- Hadoop at Yahoo! (development and cluster operations)
- Hadoop open-source projects and major trends
- Q&A