Speaker: Mike Ferguson
This 2-day course looks at the challenges faced by companies dealing with an exploding number of data sources, data collected in multiple data stores (cloud and on-premises) and multiple analytical systems. It also looks at the requirements for defining, governing, managing and sharing trusted, high-quality information in a distributed and hybrid computing environment. It explores a new approach in which IT data architects, business users and IT developers collaborate in building and managing an enterprise data lake to get control of your data. This includes data ingestion, data discovery, data profiling and tagging, and publishing data in an information catalog. It also involves refining raw data to produce enterprise data services that can be published in a catalog and made available for consumption across your company. The course also introduces multiple data lake configurations, including a centralised data lake and a ‘logical’ distributed data lake, as well as execution and governance across multiple data stores. It emphasises the need for a common collaborative process and a common approach to governing and managing data of all types.