Denormalize Transactional Data using Informatica

Let’s de-normalize transactional data.
Here is the input:

CUSTOMER_IDPRODUCT_NAMEPRODUCT_PRICETRANSACTION_DATE
1COOKING BOOKS12.506/07/2013 10:57:10
1MEDICAL BOOKS120.506/07/2013 10:57:10
1TRAVEL BOOKS10.506/07/2013 10:57:10
2FOOD PROCESSOR12.506/07/2013 10:57:10
2INTELLIGENT OVEN120.506/07/2013 10:57:10
2SAFE10.506/07/2013 10:57:10
3CLOTHING12555.2506/07/2013 10:57:10
3CAR VACUUM15520.4506/07/2013 10:57:10
3HOTEL PASS1550.5506/07/2013 10:57:10
3MAGAZINE1110.5606/07/2013 10:57:10
4ZOO PASS110.406/07/2013 10:57:10

Here is the output:
The table is quite wide so you can download it as csv.

Here is the mapping overview:
SQ ==> EXP ==> AGGR ==> TGT

mappingOverview

Step 1:
In SQ_ORDERS sort by CUSTOMER_ID

Step 2:
In EXP_SetSequenceNumbers create three variables as follow:

  1. currCustomer_v ==> to_char(CUSTOMER_ID)
  2. Seq_v ==> IIF(currCustomer_v != prevtCustomer_v , 1 ,  Seq_v + 1  ). Each time we get a new customer_id the Seq get reset to 1 otherwise it will increment by 1.
  3. prevtCustomer_v ==>to_char(CUSTOMER_ID)

Make sure that prevtCustomer_v is below currCustomer_v and Seq_v as in the following pic:

setSequenceNumberTrans

Step 3:
In AGG_DenormalizeOrders make sure that the following is set:

  1. Sorted input is checked.
  2. Group by is checked for Customer_ID
  3. Create outputs to house ProductName, ProductPrice and TransactionDate
  4. For each sequence select the value of seq. For instance, ProductName3 select FIRST( PRODUCT_NAME, seq = 3) etc…

For better understanding please see pic below:

AggregatorDenormalizeOrders

 

Step 5:
Connect the aggregator to the target.
Done!

How to denormalize data in Informatica

Most of you know that Informatica does not provide a function for denormalization. However, denormalizing a table is easy and could be done in Informatica by using two functions: an Expression and an Aggregator.

Step 1:
Let us look at the data:

CUSTOMER_BRAND

CUSTOMER_IDBRAND
1BMW
1CHEVY
1FORD
2BMW
2FORD
3CHEVY

 

Step 2:
Let us look at the desired output:

CUSTOMER_IDBMWFORDCHEVY
1YYY
3NNY
2YYN

 

Step 3:
Let us look at the general architecture of the mapping:

The mapping consists of the following a source table, a SQ, an Expression, an Aggregator, an Expression and a Target.

 

Step 4:
Let us look at the SQ:

Make sure that we sort the on CUSTOMER_ID asc. This is needed so we can send sorted input to the aggregator.

 

Step 5:
Let us look at the Expression function

The expression function will set the Brand to “Y” if it exists else it will set it to “N”.


Step 6:

Let us look the Aggregator function:

Since “Y” > “N” then if a customer has the brand it will output “Y” else it will output “N”.


Step 7:
The final step is easy just connect the Aggregator to the Target:

If you are interested in a more complicated example please check: Denormalize Transactional Data using Informatica

Oracle Pivot function

I was amazed at how handy the Pivot function can be. It empowers the user to provide aggregate and denormalized view of a data set which could be a query or a table. In this article I am going to provide a brief example of the pivot function.

First of all let us set some data: 


CREATE TABLE SAMPLE_TRANSACTIONS
(
TRANSACTION_ID NUMBER(8),
COMPANY_NAME VARCHAR2(30),
STATE VARCHAR2(2),
TRANSACTION_AMOUNT NUMBER(10,2)
);
INSERT INTO SAMPLE_TRANSACTIONS VALUES
( 1 , 'PALERMO' , 'TX' , 300.22
);
INSERT INTO SAMPLE_TRANSACTIONS VALUES
( 2 , 'PALERMO' , 'CT' , 600.22
);
INSERT INTO SAMPLE_TRANSACTIONS VALUES
( 3 , 'PALERMO' , 'PA' , 900.669
);
INSERT INTO SAMPLE_TRANSACTIONS VALUES
( 4 , 'BOLOGNA' , 'TX' , 500.22
);
INSERT INTO SAMPLE_TRANSACTIONS VALUES
( 5 , 'BOLOGNA' , 'CT' , 1100.22
);
INSERT INTO SAMPLE_TRANSACTIONS VALUES
( 6 , 'BOLOGNA' , 'PA' , 100.669
);
INSERT INTO SAMPLE_TRANSACTIONS VALUES
( 7 , 'PARMA' , 'NY' , 1300.22
);
INSERT INTO SAMPLE_TRANSACTIONS VALUES
( 8 , 'PARMA' , 'NJ' , 1600.22
);
INSERT INTO SAMPLE_TRANSACTIONS VALUES
( 9 , 'PARMA' , 'CA' , 1900.669
);

Now let us view the data:

SELECT * FROM SAMPLE_TRANSACTIONS;

 

TRANSACTION_IDCOMPANY_NAMESTATETRANSACTION_AMOUNT
1PALERMOTX300.22
2PALERMOCT600.22
3PALERMOPA900.67
4BOLOGNATX500.22
5BOLOGNACT1100.22
6BOLOGNAPA100.67
7PARMANY1300.22
8PARMANJ1600.22
9PARMACA1900.67

 

Now let us create a report that provides the sum of the transactions by company at a state level:
Let us first execute this query:


SELECT *
FROM
( SELECT COMPANY_NAME , STATE , TRANSACTION_AMOUNT
FROM SAMPLE_TRANSACTIONS
) PIVOT
( SUM(TRANSACTION_AMOUNT)
FOR
COMPANY_NAME IN
('BOLOGNA' AS BOLOGNA , 'PARMA' AS PARMA, 'PALERMO' AS PALERMO) )
ORDER BY STATE;

Results:

STATEBOLOGNAPARMAPALERMO
CA{null}1900.67{null}
CT1100.22{null}600.22
NJ{null}1600.22{null}
NY{null}1300.22{null}
PA100.67{null}900.67
TX500.22{null}300.22

 

Now let us create a report that provides the sum of the transactions by state at a company level:

Let us first execute this query:


SELECT *
FROM
( SELECT COMPANY_NAME , STATE , TRANSACTION_AMOUNT FROM SAMPLE_TRANSACTIONS
) PIVOT
( SUM(TRANSACTION_AMOUNT)
FOR STATE IN ('CT' CT , 'NY' NY, 'TX' TX) )
ORDER BY COMPANY_NAME;

Results

COMPANY_NAMECTNYTX
BOLOGNA1100.22{null}500.22
PALERMO600.22{null}300.22
PARMA{null}1300.22{null}

 

As you can see above there are a lot of possibilities on how to leverage the pivot function. For further reading I recommend this link Oracle Pivot.

 

How to denormalize a table in oracle.

Let us first begin by creating a table and setting up some data.

Topic 1:
Customer and Brand are a one to one relationship.

 CREATE TABLE CUSTOMER_BRAND
(
CUSTOMER_ID NUMBER(10),
BRAND VARCHAR2(20)
);
INSERT INTO CUSTOMER_BRAND VALUES( 1 , 'BMW' );
INSERT INTO CUSTOMER_BRAND VALUES( 1, 'CHEVY' );
INSERT INTO CUSTOMER_BRAND VALUES( 1, 'FORD' );
INSERT INTO CUSTOMER_BRAND VALUES( 2 , 'BMW' );
INSERT INTO CUSTOMER_BRAND VALUES( 2 , 'FORD');
INSERT INTO CUSTOMER_BRAND VALUES( 3 , 'CHEVY');

Please note that a customer can only have one brand.

Now let’s try to view the data in the table.

SELECT * FROM CUSTOMER_BRAND;

CUSTOMER_BRAND

CUSTOMER_IDBRAND
1BMW
1CHEVY
1FORD
2BMW
2FORD
3CHEVY

Now we would like to flatten the data (aka pivot).

SELECT DISTINCT CUSTOMER_ID ,
(
CASE
WHEN (SELECT 'Y'
FROM CUSTOMER_BRAND A
WHERE BRAND       = 'BMW'
AND A.CUSTOMER_ID = B.CUSTOMER_ID ) = 'Y'
THEN 'Y'
ELSE 'N'
END) AS BMW ,
(
CASE
WHEN (SELECT 'Y'
FROM CUSTOMER_BRAND A
WHERE BRAND       = 'FORD'
AND A.CUSTOMER_ID = B.CUSTOMER_ID) = 'Y'
THEN 'Y'
ELSE 'N'
END) AS FORD,
(
CASE
WHEN (SELECT 'Y'
FROM CUSTOMER_BRAND A
WHERE BRAND       = 'CHEVY'
AND A.CUSTOMER_ID = B.CUSTOMER_ID ) = 'Y'
THEN 'Y'
ELSE 'N'
END) AS CHEVY
FROM CUSTOMER_BRAND B;

Results:

CUSTOMER_IDBMWFORDCHEVY
1YYY
3NNY
2YYN