A Unicredit Case Study of Data Protection with DataStage and Protegrity
Session abstract
Data privacy is not only a hot topic but a critical requirement when implementing a multinational enterprise data warehouse (EDWH) in the financial services industry. Designing a watertight, flexible and scalable data protection solution is a daunting task. We showcase how we managed this task for Unicredit's EDWH and take a deep dive into the technical solution, which combines DataStage with the data protection software Protegrity and the Teradata DBMS. Along the way we explain different approaches to data protection, including tokenization. We present lessons learned and best practices from the Unicredit implementation. Finally, we point out how the implementation could be adapted to other big data environments.

Why you should attend this session
Attendees will learn why data protection is important and how it can be implemented efficiently using DataStage and Protegrity. They will gain an understanding of different data protection methods, of what vaultless tokenization is, and of why Unicredit decided to use tokenization rather than encryption or data masking. They will get a comprehensive insight into Unicredit's approach, architecture and best practices for protecting data in a multinational enterprise data warehouse.
A Deep Dive Into DataStage Performance Tuning
Session abstract
Since the advent of massively scalable ETL engines like the DataStage Parallel Engine, it has been a common misconception that poorly performing ETL processes are mainly a hardware sizing problem. In this session we explain why this is not the case, and we present a set of DataStage tips and techniques, design patterns and configuration options that proved successful in avoiding performance bottlenecks and dramatically speeding up the production batch runs of Unicredit's EDWH implementation. We show how to customize a DataStage Grid deployment to improve IO throughput and overall resource allocation. We present DataStage Parallel job design patterns and the Balanced Optimization feature, which help to reduce resource usage and wait cycles.

Why you should attend this session
Attendees will learn how to distinguish and resolve different classes of performance bottlenecks. They will get a better understanding of how to set up a performance-optimized DataStage infrastructure supporting multiple projects with different workloads, and learn how Grid deployments can be enhanced to resolve IO and other resource bottlenecks. Developers will learn job design patterns to reduce resource usage and how to leverage Balanced Optimization to improve existing jobs.