This session takes a look at Pig, a platform for analysing large datasets on Hadoop using a high-level language, Pig Latin. It is easy to use and intuitive, and its scripts are compiled into jobs that run in parallel across the cluster. It was developed to spare users the complexity of writing MapReduce jobs by hand. It is useful for ad hoc analysis and, of course, ETL. During this session we will see it being used to manipulate both structured and unstructured data.
Allan Mitchell is a SQL Server MVP and runs elastio, a small consultancy helping customers to make informed decisions about their data storage and integration. His focus is on enterprise search as well as real-time data integration.