Sunday, June 19, 2011

Informatica Partitioning



Performance tuning at the session level is used to remove bottlenecks in the ETL data load. Session partitioning means splitting the ETL data load into multiple parallel pipeline threads. It is helpful on an RDBMS such as Oracle, but it is less effective for Teradata or Netezza, whose architectures are already parallel-aware and can conflict with partitioning done by Informatica.

Informatica supports the following partition types:
1. Pass-Through (Default)
2. Round-robin
3. Database partitioning
4. Hash auto-keys
5. Hash user keys
6. Key range

Open Workflow Manager, go to the session properties, open the Mapping tab, and select the Partitions link. Here we can add, delete, or view partitions:
set the partition point, add the number of partitions, then choose the partition type.

Pass-Through (Default) : Rows stay in the partition they are already in; no data is redistributed. Use it to add an extra pipeline stage for better performance without changing how the data is distributed.
Round-Robin : Data is distributed evenly among all partitions using a round-robin algorithm, so each partition holds almost the same number of rows.
Hash auto-keys : The partition key is system generated, based on the grouped ports at the transformation level. For each set of logical key values, the Integration Service generates a hash value and puts the row into the matching partition. Commonly used at Rank, Sorter, and unsorted Aggregator transformations.
Hash user keys : The user defines the group of ports that forms the partition key. The system generates a hash value from the key using a hashing algorithm, and the row is put into a certain partition based on that hash value.
Key range : Each key port used for key range partitioning must be assigned a range of values for every partition. The key value and the ranges decide which partition holds the current row. Commonly used at the source and target level.
For hash auto-keys, round-robin, and pass-through partitioning, the partitioning key is handled at the system level; the user does not define one.
Session partitioning enables parallel processing of the ETL load. It improves performance by running the load with multiprocessing or grid processing.
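
To make the difference between the partition types concrete, here is a minimal Python sketch of how rows could be routed under round-robin, hash key, and key range partitioning. It is only an illustration of the ideas described above, not Informatica's internal implementation: the function names, the CRC32 hash, and the sample rows are all assumptions made for this example.

```python
from bisect import bisect_right
from zlib import crc32


def round_robin(rows, n):
    """Deal rows to n partitions in turn, so sizes differ by at most one."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def hash_keys(rows, n, key_ports):
    """Route each row by a hash of its key ports (CRC32 is an assumed stand-in).

    Rows with the same key values always land in the same partition.
    """
    parts = [[] for _ in range(n)]
    for row in rows:
        key = "|".join(str(row[p]) for p in key_ports)
        parts[crc32(key.encode("utf-8")) % n].append(row)
    return parts


def key_range(rows, port, boundaries):
    """Route each row by the range its key falls into.

    boundaries is a sorted list of the start values of partitions 2..n,
    so len(boundaries) + 1 partitions are produced.
    """
    parts = [[] for _ in range(len(boundaries) + 1)]
    for row in rows:
        parts[bisect_right(boundaries, row[port])].append(row)
    return parts


# Hypothetical sample data: six rows with an id and a department.
depts = ["HR", "IT", "HR", "OPS", "IT", "HR"]
rows = [{"id": i, "dept": d} for i, d in enumerate(depts, start=1)]

rr_parts = round_robin(rows, 3)
hash_parts = hash_keys(rows, 3, ["dept"])
range_parts = key_range(rows, "id", [3, 5])  # ids 1-2, 3-4, 5 and above

for name, parts in [("round-robin", rr_parts), ("hash", hash_parts), ("key range", range_parts)]:
    print(name, [len(p) for p in parts])
```

Round-robin balances row counts but ignores the data, while the hash and key range variants keep related rows together, which is why grouping transformations need key-based partitioning.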

Sunday, June 12, 2011

PushDown Optimization

Informatica version 8 and later supports the pushdown technique for executing ETL. When we use the pushdown feature, the transformation logic is executed directly on the RDBMS used for the source/target. The Informatica server does not keep any intermediate data storage for the session execution. Starting with a very basic pushdown mapping,






Source and target tables must use the same database connection at the Workflow Manager level.
Changes at the Workflow Manager level:
Go to the session -> Properties tab -> set Pushdown Optimization to Full, To Source, or To Target.
Source, target, and lookup transformations must all use the same relational connection.
Enable the checkboxes for temporary views and sequences if required.
Set a datetime format that is compatible with the RDBMS used. It should be in sync at the mapping and session level.
At execution time, the Informatica engine creates views/nested views to buffer data from the source. The mapping's transformation functions (e.g. TO_CHAR, TO_DATE, SUBSTR) are translated into compatible SQL functions and applied against those views.
To check whether the current settings enable the session for pushdown correctly, go to the Mapping tab and click the pushdown link in the left tree-view pane.
It shows the internal execution of the pushdown and the view creation, along with the load plan.
Errors are displayed as messages in red.
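
The idea behind pushdown can be sketched in a few lines of Python, using the built-in sqlite3 module as a stand-in for the source/target RDBMS. This is only an assumed illustration: the table names, the view name pd_view, and the SQL itself are made up for the example, and the SQL that Informatica actually generates will look different. The point is that the transformation runs inside the database through a temporary view, and no rows are pulled into the ETL process.

```python
import sqlite3

# An in-memory SQLite database stands in for the shared source/target RDBMS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_customer (id INTEGER, name TEXT, joined TEXT);
    CREATE TABLE tgt_customer (id INTEGER, short_name TEXT, join_year TEXT);
    INSERT INTO src_customer VALUES
        (1, 'alice', '2011-06-12'),
        (2, 'bob',   '2010-01-30');
""")

# The expression logic of the mapping is written as SQL functions inside a
# temporary view, similar in spirit to the views created during pushdown.
conn.execute("""
    CREATE TEMP VIEW pd_view AS
    SELECT id,
           UPPER(SUBSTR(name, 1, 3)) AS short_name,
           SUBSTR(joined, 1, 4)      AS join_year
    FROM src_customer
""")

# A single INSERT ... SELECT loads the target; the data never leaves the database.
conn.execute("""
    INSERT INTO tgt_customer (id, short_name, join_year)
    SELECT id, short_name, join_year FROM pd_view
""")

# The temporary view is dropped once the load is done.
conn.execute("DROP VIEW pd_view")
conn.commit()

loaded = conn.execute("SELECT id, short_name, join_year FROM tgt_customer ORDER BY id").fetchall()
print(loaded)
```

This also shows why source and target must share one connection: the INSERT ... SELECT can only run as a single statement when both tables are reachable from the same database session.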
Best of luck. We will cover more in the next post.