
DataStage - An introduction

In this course, participants work through a number of instructor-led, hands-on exercises. Participants also complete a set of integration exercises in which business-oriented, real-world problems are solved by building DataStage jobs.

After working with DataStage in a development environment, users may want more in-depth training in several areas, so this course also covers a variety of advanced topics. We can work with you to tailor this course to meet your needs.

Course Content:

Module I: Introduction to DataStage

Brief history of DataStage
Where does DataStage fit within the data warehousing context?
What are IBM Information Server 8.0 and WebSphere DataStage?
Where WebSphere DataStage packs fit within the IBM Information Server architecture
Various versions available
Introduction to DataStage server components
Repository
DataStage server
DataStage package installer
Introduction to DataStage client components
DataStage Administrator
DataStage Designer
DataStage Director
DataStage Manager (removed in latest version)
IBM WebSphere DataStage and QualityStage Designer overview

How to connect to a project
The Designer: a quick tour
IBM Information Server repository
Developing a job
Introduction to job properties
Introduction to job parameters
Introduction to table definitions
Importing and exporting from the repository
Report assistance and documentation tools
Configuration file editor

Module III: WebSphere DataStage Server Jobs:

Introduction to server jobs
Handling databases in server jobs
Handling special characters (# and $)
Loading tables
Data type conversion: writing to Oracle
Data type conversion: reading from Oracle
Looking up an Oracle table
Updating an Oracle table
ODBC Stage
UniVerse stage
Handling files in server jobs
Sequential file stage
How to use sequential file stage
Defining sequential file input data
Defining sequential file output data
How the sequential stage behaves
Folder stages
Handling processing stages in server jobs
Transformer stage
How to use transformer stage
Transformer editor components
The DataStage expression editor
Transformer stage properties
Overview of transformer functions
Using transformer as a lookup stage
Aggregator stage
How to use aggregator stage
Defining the input column sort order
Aggregating data
Merge stage
Sort stage
Parallel processing in DataStage

Infrastructure as a foundation for data warehousing
Various hardware and operating systems available
What are the various platform options?
Client-server architecture for data warehouse
Various server hardware available
SMP (symmetric multiprocessing)
Clusters
MPP (massively parallel processing)
ccNUMA or NUMA (cache-coherent non-uniform memory access)
Types of parallel processing in DataStage

Pipeline parallelism
Partition parallelism
Combining pipeline and partition parallelism
Repartitioning data
Parallel processing environments
The configuration file
Types of partitioning in DataStage
Round robin
Random
Same
Entire
Hash by field
Modulus
Range
DB2
Auto
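
As a rough illustration of three of the partitioning methods listed above (round robin, modulus and hash by field), here is a minimal Python sketch. It is not DataStage code: the function names, the row layout and the use of CRC32 as the hash are assumptions chosen only to show how rows end up in partitions.

```python
import zlib


def round_robin(rows, n):
    """Deal rows to partitions in turn: row 0 -> part 0, row 1 -> part 1, ..."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def modulus(rows, n, key):
    """Send each row to partition (integer key value mod n)."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[row[key] % n].append(row)
    return parts


def hash_by_field(rows, n, key):
    """Send each row to partition hash(key value) mod n.

    CRC32 is an illustrative choice of hash (an assumption), used so that
    equal key values always land in the same partition.
    """
    parts = [[] for _ in range(n)]
    for row in rows:
        h = zlib.crc32(str(row[key]).encode("utf-8"))
        parts[h % n].append(row)
    return parts


# Hypothetical sample data: six rows with an integer id and a region code.
rows = [{"id": i, "region": r} for i, r in enumerate(["N", "S", "N", "E", "S", "N"])]

if __name__ == "__main__":
    for name, parts in [
        ("round robin", round_robin(rows, 2)),
        ("modulus on id", modulus(rows, 3, "id")),
        ("hash on region", hash_by_field(rows, 3, "region")),
    ]:
        print(name, [[r["id"] for r in p] for p in parts])
```

Round robin gives evenly sized partitions but ignores the data, while hash and modulus keep rows with the same key together, which is what key-based stages such as joins and aggregations rely on.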

Types of collecting in DataStage

  • round robin
  • ordered
  • sorted merge
  • auto
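
The collecting methods above can be pictured with a short Python sketch. This is an illustration only, not DataStage code: the function names are assumptions, and each partition is modelled as a plain list.

```python
import heapq
from itertools import chain, zip_longest

_MISSING = object()  # marker for partitions that have run out of rows


def ordered_collect(parts):
    """Read all of partition 0, then all of partition 1, and so on."""
    return list(chain.from_iterable(parts))


def round_robin_collect(parts):
    """Read one row from each partition in turn, skipping exhausted ones."""
    out = []
    for group in zip_longest(*parts, fillvalue=_MISSING):
        out.extend(x for x in group if x is not _MISSING)
    return out


def sorted_merge_collect(parts, key=None):
    """Merge partitions that are each already sorted into one sorted stream."""
    return list(heapq.merge(*parts, key=key))


# Hypothetical input: three partitions, each sorted on its own.
parts = [[1, 4, 7], [2, 5], [0, 6, 8, 9]]

if __name__ == "__main__":
    print(ordered_collect(parts))
    print(round_robin_collect(parts))
    print(sorted_merge_collect(parts))
```

Sorted merge only produces a globally sorted result when every input partition is already sorted on the same key, which is why it is usually paired with a sort ahead of the collector.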

The mechanics of partitioning and collecting
WebSphere DataStage parallel jobs
Introduction to DataStage parallel jobs
Difference between a passive stage and an active stage
Handling metadata in DataStage
Runtime column propagation (RCP)
Table definitions
Schema files and partial schemas
Data types
Date and time formats
Complex data types
Handling Oracle Enterprise stage in parallel jobs
Handling special characters (# and $)
Loading tables
Type conversions: writing to Oracle
Updating an Oracle database
Deleting rows from an Oracle database
Loading an Oracle database
Reading an Oracle database
Performing a direct lookup on an Oracle database table
Using SQL Builder
Handling transformer stage in parallel jobs

How it is different from the server Transformer stage
Creating and deleting columns
Handling null values
Defining constraints and handling otherwise links
Specifying link order
Defining local stage variables
What is a BASIC Transformer stage?
Transformer functions
Combining data in DataStage parallel jobs

Horizontal and vertical combining
Join stage
Inner
Left outer
Right outer
Full outer
Lookup stage
Merge stage
Comparison between Join, Merge and Lookup stages
Partitioning in reference links
Aggregator stage
Funnel stage
Funnel mode
Sort funnel mode
Sequence
Some more useful stages in DataStage parallel jobs
Sort stage
Sequential sort
Parallel sort
Total sort
Partitioning requirement
Remove duplicates stage
Modify stage
Dropping and keeping columns
Changing data type
Null handling
Pivot stage
Limitations in pivot stage
Copy stage
Filter stage
External filter stage
Switch stage
Compress stage
Expand stage
Encode stage
Decode stage
FTP Enterprise stage
Generic stage
Surrogate key generator stage
SAS stage
Capturing changes in DataStage parallel jobs
Change capture stage
Change apply stage
Difference stage
Compare stage
Slowly changing dimension stage
Handling develop/debug stages in DataStage parallel jobs

Head stage
Head stage default behavior
Skipping data
Tail stage
Sample stage
Peek stage
Row generator stage
How to specify data to be generated
Generating data in parallel
Column generator stage
Write range map stage
How to perform range lookup in DataStage
Handling restructure stages in DataStage parallel jobs
Column import stage
Column export stage
Make subrecord stage
Split subrecord stage
Combine records stage
Promote subrecord stage
Make vector stage
Split vector stage
Handling XML files in DataStage parallel jobs

Introduction to XML files
Using the XML metadata importer
Using XML input stage
Validating documents and schemas
Processing namespaces
Supported XPath expressions
Using XML output stage
Processing namespaces
Supported XPath expressions
Aggregating input rows on output
Writing output to your file system
Processing NULLS and empty values
How repetition paths work
Using XML transformer stage
Optimizing performance in server and parallel jobs
WebSphere DataStage jobs and processes
Interpreting performance statistics in server jobs
Improving performance in server jobs
CPU-limited jobs: single-processor systems
CPU-limited jobs: multiprocessor systems
I/O-limited jobs
Hashed file stages
Hashed file design
Inter-process stages in server jobs
Link Collector stages in server jobs
Link Partitioner stages in server jobs
Job design tips in parallel jobs
Processing large volumes of data
Modular development
Designing for good performance
Database sparse lookup vs. join
Improving performance in parallel jobs
Understanding a flow
Performance monitoring
Resolving bottlenecks
Ensuring data is evenly partitioned
Programming in DataStage
Introduction to programming components
Routines
Transform functions
Before/after subroutines
Custom UniVerse functions
ActiveX (OLE) functions
Subroutines
Creating a routine
Defining custom transforms
Transforms
Macros

Precedence rules
BASIC programming
Built in transforms and routines
Handling web services in DataStage
Introduction to web services technologies
Encoding requests and responses
Using the SOAP framework
Publishing web service operations
Accessing web services
What is the web service pack
Using the web service metadata importer
Using the web services transformer stage
Using the web services client stage
Creating web service routines
How to expose a DataStage job as a web service
Using IBM information console
Job scheduling using job sequences in DataStage
Creating a job sequence
Overview of activity stages
Triggers
Expressions
Job activity properties
Routine activity properties
Email notification activity properties
Wait for file activity properties
Exception activity properties
Nested condition activity properties
Start loop activity properties
End loop activity properties
User variables activity properties
Compiling and restarting the job sequence
Some advanced concepts in DataStage

Achieving reusability in DataStage using containers
Types of containers
Local containers
Server shared containers
Parallel shared containers
Creating a shared container
Using shared containers in DataStage jobs
Converting shared containers to local containers
Deconstruction of shared containers
Specifying our own parallel stage
Defining custom stage
Defining build stage
Build stage macros
Defining wrapper stage
Usage of Administrator client in DataStage
Adding environment variables
Setting job parameter default values
Changing license details
Handling projects
Buffer settings in DataStage
Multiple instances of jobs in DataStage
DataStage job control utility
Jobs: compilation, execution and checking of logs using the DataStage tool
Handling multilingual data in DataStage
How to enable NLS in DataStage
Orchestrate architecture and commands
Orchestrate parallel processing framework in DataStage
Orchestrate utility in DataStage
Surrogate key generation using DataStage
Version control in DataStage

Contact us
For more information about this course, write to queries@teknowledge.in or visit our IT training centers in Bangalore.

CONTACT US
Corporate Office - JP Nagar
Saisadhan, 1st Floor, 100 Feet Ring Road, 15th Cross, J.P. Nagar
6th Phase, Near Sarakki Gate, Next to Vodafone store,
Diagonally opposite ICICI Bank, Bangalore-560 078
Phone: +91 80 41310812 / 40981613

Regional Office - Indira Nagar
No. 2A, Robby Arcade (Second Floor), Above Coffee Day, No. 537, CMH Road, Indira Nagar, Bangalore-560 038
Phone: +91 80 41264581/86