Systems Engineer at Cloudera
Daniel Tydecks works in Systems Engineering at Cloudera and is responsible for the DACH and CE regions. He helps clients match Hadoop-focused big data technology to their needs and use cases, and advises on best practices. Prior to joining Cloudera, Daniel worked on complex event processing (CEP) and streaming event processing, originally focused on financial markets topics such as algorithmic trading, FX market making, fraud prevention, risk management and market surveillance.
- Big Data 90%
- Hadoop Ecosystem 94%
- Data Science 92%
- Cloudera Products 100%
Talk topic: "Information Architecture for Hadoop"
The Hadoop Ecosystem makes it possible to build an Enterprise Data Hub capable of storing and analysing a wide variety of data. However, a platform with such broad capability triggers a question: how to organise the myriad data sets in a way that allows users to explore and access the data they need? This session will propose an Information Architecture for Hadoop that enables this.
The Hadoop Ecosystem includes a range of tools which together make it possible to build an Enterprise Data Hub capable of storing, processing and analysing a wide variety of data. However, a platform with such broad capability triggers a question: how to organise the myriad data sets in a way that allows users to explore all the data, discover new data sets and perform the necessary processing and analysis on the data they need?
This session will answer that question by outlining an Information Architecture for an Enterprise Data Hub based on Hadoop. The architecture is composed of a number of layers, or zones, designed to allow an organisation to:
- Ingest data in its full fidelity, in as close to its original, raw form as possible
- Provide a data discovery and exploration facility for analysts and/or data scientists
- Bring together and link multiple data sets to provide a business-wide data model
- Create views of the data that are optimised for the access patterns generated by particular use cases
The session will describe the layers required in an Information Architecture that can provide these functions, with reference to the particular technologies within the Hadoop Ecosystem that enable them.
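As a rough illustration of the layered idea, the four goals listed above could be modelled as named zones, each with its own directory prefix. This is only a minimal sketch: the zone names, the `/data/...` paths and the `dataset_path` helper are hypothetical and do not come from the session itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    name: str       # hypothetical zone label, not taken from the talk
    purpose: str    # the goal from the abstract that this zone serves
    base_path: str  # hypothetical directory prefix for the zone


# One zone per goal listed in the abstract (names and paths are assumptions).
ZONES = [
    Zone("raw", "ingest data in full fidelity", "/data/raw"),
    Zone("discovery", "data discovery and exploration", "/data/discovery"),
    Zone("shared", "linked, business-wide data model", "/data/shared"),
    Zone("optimised", "views optimised for specific access patterns", "/data/optimised"),
]


def dataset_path(zone_name: str, dataset: str) -> str:
    """Return the directory for a data set inside the named zone."""
    for zone in ZONES:
        if zone.name == zone_name:
            return f"{zone.base_path}/{dataset}"
    raise KeyError(f"unknown zone: {zone_name}")
```

With a layout like this, the same data set can appear in several zones at different stages, for example `dataset_path("raw", "trades")` for the ingested copy and `dataset_path("optimised", "trades")` for a view built for a particular use case.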