Performance Tuning at session level is applicable to remove Bottleneck at ETL data load. Session Partitioning means "Splitting ETL dataload in multiple parallel pipelines threads". It will be helpful on RDBMS like Oracle but not so effective for Teradata or Netezza (auto parallel aware architectural conflict ). Different Type of Partitioning supported by Informatica
1. Pass-Through (Default)
2. Round-robin
3. Database partitioning
4. Hash auto-keys
5. Hash user keys
6. Key range
Open Workflow Manager, Goto session properties, Mapping Tab, select Partition Hyperlink. Here we can add/delete/view partition,
Set Partition Point, Add Number of Partition then Partition type.
Pass-Through (Default) : All rows in a single partition: No data Distribution. Additional Stage area for better performance
Round-Robin : Equally data distribution among all partition using round robin algorithm. Each partition almost has same number of rows
Hash auto-keys : System generated partition key based on grouped ports at transformation level. When a new set of logical keys exists, Integration service generates a Hash key using Hash map and putted row to appropriate partition. Popularly used as Ramk, Sorter and Unsorted Aggregator
Hash user keys : User Defined group of ports for partition. For key value, System generated a Hash value using Hashing algorithm. Row is puted to ceratin partition based on Hash value.
Key range : Each port(s) for key range partition need to be assigned a range of value. Key value and range decide partition to held current value. Popularly used for Source and Target level.
System Level partitioning key generated for hash auto-keys, round-robin, or pass-through partitioning.
Session partitioning enables parallel processing logic of ETL load implementation. It enhance the performance using Multiprocessing/Grid processing ETL load.
Sunday, June 19, 2011
Informatica Partitioning
Sunday, June 12, 2011
PushDown Optimization
Source and target tables must be using same database connection on workflow manager level.
Workflow manager level changes,
goto -> session ->properties tab -> pushdown = full/source/target
Source/target/lookup all transformation must use same relational connection
Enable checkbox for temporary Views and Sequence if required
Set Datetime formate compatible to RDBMS used. It should be synch on mapping and session level
At the time of execution, Informatica engine creates views/nested views to buffer data from source. Using compatible SQL functions, Mapping transformation function (e.g. TO_CHAR, TO_DATE, SUBSTR etc) will be applied refering reference views.
To check if current settings enable the session for pushdown correctly or Not, Goto->mapping tab->click pushdown link in left treeview pan
It will show internal execution of pushdown and view creation with load plan
Errors will displayed in red color message
best of luck. We will know more in next Post
Sunday, May 24, 2009
pmcmd : Unix command to control Informatica Workflow
Session specific,
pmcmd stoptask -s $server_name -u $user_name -p $pass -f $folder -w $wkf $sess
Keys :
stoptask : To stop a task
gettaskdetails : To retrieve task details
starttask : To Execute a task
aborttask : To abort a task
waittask : To Run a task in wait mode
workflow specific,
pmcmd startworkflow-s $server_name -u $user_name -p $pass -f $folder $wkf
Keys :
stopworkflow : To stop a workflow
getworkflowdetails : To retrieve workflow details
startworkflow : To Execute a workflow
abortworkflow : To abort a workflow
waitworkflow : To Run a workflow in wait mode
pmcmd command with parameter file
pmcmd startworkflow-s $server_name -u $user_name -p $pass -f $folder -paramfile paramfile_name -wait $wkf
Apart from these pmcmd command options , various utility options available
(check informatica help file for details)
Sunday, May 17, 2009
Mapping Variables Stores Values in Informatica Repository
- Variable must be initialized
- We need to link SETVARIABLE() function column to same datatype column.
- It should executed at least once for insert/update/delete.
Saturday, May 16, 2009
Use of Update Override Properties In Target
Update Override
Informatica supports Update stretegy transformation to update/insert/delete operations on Target. If we need to process insert timestamp for insert scenario and Update timestamp for Update scenario.Generaly, We will use 2 UPDATE STRETEGY transformations to achieve this functionality. Think If we have a complex mapping with more then 10 Targets, Then we need to use 20 UPDATE STRETEGY transformations.
Lets do it in decent way, We can make this mapping simple and fast.
Take SYSDATE as a temperory column in a transformation. Link it with insert timestamp and update timestamp columns. Now open TARGET property in mapping. Select UPDATE OVERRIDE property.Generate the default update statement. Take update timestamp and remove insert timestamp column from default update statement. Now when update scenario will come, only update timestamp will be updated.
This DB type is basically used in data warehousing, to implement increamental logic. When Table stores insert timestamp and update timestamp.