Thursday 24 April 2014

Aggregation without using Aggregator Transformation in Informatica



Here the scenario is to find the department-wise sum of salaries.
My source is
Deptno   Sal
20       800
30       1600
30       1250
20       2975
30       1250
30       2850
10       2450
20       3000
10       5000
30       1500
20       1100
30       950
20       3000
10       1300

And my target will be

Deptno   Sal
10       8750
20       10875
30       9400
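
(Check: for department 10, 2450 + 5000 + 1300 = 8750.)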

Solution:
  Algorithm
1. Import the source.
2. If the source is a relational database table, apply sorting in the source qualifier (set the Number of Sorted Ports property, which adds an ORDER BY to the generated query). If the source is a flat file, take a sorter transformation, pass the ports from the source into it, check Deptno as the key, and select Ascending in the Direction option. (Even for a relational source you can use a sorter transformation, but sorting in the source qualifier gives better performance.)
3. Take an expression transformation and drag the Deptno and Sal ports into it from the sorter or the source qualifier, whichever you are using.
4. Create the variable and output ports for a running total of Sal per Deptno, as shown in the sketch after this list.
5. Take another sorter and sort by Deptno ascending and SAL_O descending (see the sketch after this list).
6. Take another expression transformation and flag the first row of each department (see the sketch after this list).
7. Take a filter transformation and set the filter condition to FLAG_O = 1.
8. Pass the ports from the filter into the target.
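
The port logic for steps 4 to 6 is sketched below. The port names (v_sal, v_prev_deptno, SAL_O, v_flag, FLAG_O) are illustrative; any names will do.

The expression transformation in step 4 builds a running total of Sal per Deptno. Variable ports are evaluated top to bottom and keep their values across rows, so v_prev_deptno still holds the previous row's Deptno when v_sal is computed:

    v_sal (variable)         = IIF(Deptno = v_prev_deptno, v_sal + Sal, Sal)
    v_prev_deptno (variable) = Deptno
    SAL_O (output)           = v_sal

The sorter in step 5 sorts by Deptno ascending and SAL_O descending, so the completed department total becomes the first row of each department group.

The expression transformation in step 6 flags that first row:

    v_flag (variable)        = IIF(Deptno = v_prev_deptno, 0, 1)
    v_prev_deptno (variable) = Deptno
    FLAG_O (output)          = v_flag

The filter in step 7 then keeps only the FLAG_O = 1 rows, which carry the department-wise totals.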

That’s it. The final mapping will look like this.

Tracing Level

The tracing level is the level of detail captured in the session log file. It is very important for tracking down the cause of an error, but increasing the amount of detail reduces performance, so we have to take care while choosing the tracing level.

Normal: The Integration Service logs initialization and status information, errors encountered, and rows skipped due to transformation row errors. It summarizes session results, but not at the level of individual rows.

Terse: The Integration Service logs initialization information, error messages, and notification of rejected data.

Verbose Initialization: Normal, plus the Integration Service logs additional initialization details, the names of index and data files used, and detailed transformation statistics.

Verbose Data: Verbose Initialization, plus the Integration Service logs each row that passes into the mapping. It also notes where the Integration Service truncates string data to fit the precision of a column, and provides detailed transformation statistics.
When we configure the tracing level to verbose data, the Integration Service writes row data for all rows in a block when it processes a transformation.

Reject File in Informatica


The Integration Service may create a reject file when we run a session, so that we can identify the reason for the rejection, correct the code or the data, and run the session again. Thanks to the people who developed Informatica PowerCenter: they knew that some of us would come up with the wrong code or the wrong data.

So when and why is a reject file created?

A reject file is created when
        1.       A field contains a NULL value and the target does not accept nulls
        2.       Numeric data exceeds the specified precision or scale for the column
        3.       String data exceeds the specified precision for the column, so the Integration Service truncates it

The Integration Service creates a reject file for each target instance in the mapping. By default the reject file is created in the directory given by the $PMBadFileDir process variable, but we can change the path in the Mapping tab of the session.
One important point: if we run a session with multiple partitions, the Integration Service creates multiple reject files, one for each partition.

What does a reject file look like?

When we open a reject file (we can use any text editor) we will see the rows of data rejected by the writer or the target database. Even if the error occurred in only one column of a row, the Integration Service writes the entire row to the reject file.
But how do we quickly identify which column is responsible for the rejection?
The answer is that the Integration Service adds a row indicator and column indicators, so that we can find the reason for the rejection.
Row Indicator: The first column in each row of the reject file is the row indicator. It indicates whether the row was marked for insert, update, delete, or reject.

Row Indicator   Meaning              Rejected By
0               Insert               Writer or target
1               Update               Writer or target
2               Delete               Writer or target
3               Reject               Writer
4               Rolled-back insert   Writer
5               Rolled-back update   Writer
6               Rolled-back delete   Writer
7               Committed insert     Writer
8               Committed update     Writer
9               Committed delete     Writer

Column Indicator: A column indicator appears after every column of data. It indicates whether the column contains valid, overflow, null, or truncated data.

D (valid data): Good data. The writer passes it to the target database; the target accepts it unless a database error occurs, such as finding a duplicate key.

O (overflow): Numeric data exceeded the specified precision or scale for the column. Bad data, if you configured the mapping target to reject overflow or truncated data.

N (null): The column contains a null value. Good data. The writer passes it to the target, which rejects it if the target database does not accept null values.

T (truncated): String data exceeded the specified precision for the column, so the Integration Service truncated it. Bad data, if you configured the mapping target to reject overflow or truncated data.
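
For example, a hypothetical reject-file line for a two-column Deptno/Sal target, where the row was marked for insert but Sal arrived as NULL, might look like:

    0,10,D,,N

The leading 0 is the row indicator (insert, rejected by the writer or target); 10 followed by D is a valid Deptno; the empty value followed by N is the null Sal that caused the rejection.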


Tuesday 22 April 2014

Difference between Joiner and Source Qualifier

Joiner: Join is possible with heterogeneous sources.
Source Qualifier: Join is possible only with relational database tables within the same schema.

Joiner: Requires input pipelines.
Source Qualifier: Does not require input pipelines.

Joiner: At least one pair of matching ports is required to define the join condition.
Source Qualifier: Does not require a join condition of matching ports; by default it joins on the primary key-foreign key relationship between the tables.
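
As a quick illustration, assuming the classic EMP and DEPT tables: a joiner condition is defined between ports of the two input pipelines, for example DEPTNO1 = DEPTNO, while a source qualifier join is applied in the generated SQL, for example EMP.DEPTNO = DEPT.DEPTNO as a user-defined join.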

Padding using Java Transformation

I have a source
Dname
RESEARCH
SERVICE
HR
IT
SALES
PRODUTION
MARKETING
FINANCE

And the target should be

Dname       Length   Max Length   Right Padding
RESEARCH    8        9            RESEARCH*
SERVICE     7        9            SERVICE**
HR          2        9            HR*******
IT          2        9            IT*******
SALES       5        9            SALES****
PRODUTION   9        9            PRODUTION
MARKETING   9        9            MARKETING
FINANCE     7        9            FINANCE**

Solution
The logic is to find the maximum length of the Dname field and then pad each value with '*' on the right until the Right Padding field equals that maximum length.
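
For example, SERVICE has length 7 and the maximum Dname length is 9, so two '*' characters are appended to give SERVICE**.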


Algorithm
1. Import the source.
2. Pass the ports into an expression transformation.
3. Create an output port dlength with integer data type in the expression transformation.
4. In the expression of the dlength port enter LENGTH(DNAME); it returns the length of the incoming Dname value.
5. Create another output port dummy with integer data type.
6. In the expression of the dummy port give the value 1.
7. Take an aggregator transformation.
8. Pass the ports dummy and dlength into the aggregator transformation.
9. Select the Group By check box for dummy.
10. Create an output port max_length with integer data type.
11. In the expression of the max_length port enter MAX(dlength); it returns the maximum dlength value.
12. Check Sorted Input in the aggregator transformation properties.
13. Take a joiner transformation.
14. Pass the ports dummy, Dname, and dlength from the expression transformation into the joiner transformation, and check Sorted Input in the joiner properties.
15. Pass the ports dummy1 and max_length from the aggregator transformation into the joiner transformation.
16. In the joiner transformation set the join condition to dummy = dummy1.
17. Take a Java transformation.
18. Pass the ports Dname, dlength, and max_length from the joiner transformation into the Java transformation.
19. In the Java transformation create one output port new_dname as string.
20. Go to the Java Code tab.
21. Select the Helper Code tab and put in the helper method (see the sketch after this list).
22. Then go to the On Input Row tab, pad the value, and emit the row (see the sketch after this list).
23. Click Compile.
24. Pass the ports Dname, dlength, max_length, and new_dname into the target.
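
Here is a minimal sketch of the Java code for steps 21 and 22, assuming the port names created above (Dname, max_length, new_dname); the helper method name rightPad is hypothetical.

Helper Code tab (step 21):

    // Hypothetical helper: right-pad a string with '*' up to maxLen characters.
    private String rightPad(String value, int maxLen)
    {
        StringBuilder sb = new StringBuilder(value);
        while (sb.length() < maxLen)
            sb.append('*');
        return sb.toString();
    }

On Input Row tab (step 22):

    // Pad the incoming Dname to the maximum length computed by the aggregator.
    if (!isNull("Dname"))
        new_dname = rightPad(Dname, max_length);
    // generateRow() emits the output row; it is needed if the Java
    // transformation was created as an active transformation.
    generateRow();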

The final mapping will look like this.

Then create a workflow and run it. That's all.