Update else Insert in Informatica

In order to successfully implement upsert using Informatica we need to follow these steps:

  1. In the mapping we need an update strategy with DD_UPDATE before the Target.
  2. In the session the we need to have the following settings:
    1. Under Properties:Treat Source rows as Data driven
    2. Under Mapping, the Target should have both Insert and Update else insert checked.

I do not recommend update else insert for large sets of data if you need a process that is efficient and fast. It is better to stick with having two separates paths in the mapping. One for updates and the other one for Inserts.

How to efficiently de-dupe a table in Oracle

A bad approach to de-dupe a table is to delete the dupes. Unfortunately, when dealing with tables that have millions of rows the delete method is going to take a long time and consume a lot of server resources. Below is a step by step example that shows how we can de-dupe a table using “Partition By” Oracle command.

Step 1:
Let us create a table and load data into it.

CREATE TABLE TEST_DEDUPE
(
CUSTOMER_ID NUMBER(10),
BRAND       VARCHAR2(20),
TRANSACTION_DATE DATE
);
INSERT
INTO TEST_DEDUPE VALUES
(
1 ,
'GREAT VALUE' ,
TO_DATE('01/20/2011' , 'MM/DD/YYYY')
);
INSERT
INTO TEST_DEDUPE VALUES
(
1 ,
'GREAT VALUE' ,
TO_DATE('01/20/2011' , 'MM/DD/YYYY')
);
INSERT
INTO TEST_DEDUPE VALUES
(
1 ,
'GREAT VALUE' ,
TO_DATE('01/20/2011' , 'MM/DD/YYYY')
);
INSERT
INTO TEST_DEDUPE VALUES
(
1 ,
'GREAT VALUE' ,
TO_DATE('01/22/2011' , 'MM/DD/YYYY')
);
INSERT
INTO TEST_DEDUPE VALUES
(
1 ,
'GREAT VALUE' ,
TO_DATE('01/22/2011' , 'MM/DD/YYYY')
);
INSERT
INTO TEST_DEDUPE VALUES
(
2 ,
'MEMBERS MARK' ,
TO_DATE('01/22/2011' , 'MM/DD/YYYY')
);
INSERT
INTO TEST_DEDUPE VALUES
(
2 ,
'MEMBERS MARK' ,
TO_DATE('01/22/2011' , 'MM/DD/YYYY')
);
INSERT
INTO TEST_DEDUPE VALUES
(
2 ,
'MEMBERS MARK' ,
TO_DATE('01/22/2011' , 'MM/DD/YYYY')
);
INSERT
INTO TEST_DEDUPE VALUES
(
2 ,
'MEMBERS MARK' ,
TO_DATE('01/15/2011' , 'MM/DD/YYYY')
);
INSERT
INTO TEST_DEDUPE VALUES
(
2 ,
'MEMBERS MARK' ,
TO_DATE('01/15/2011' , 'MM/DD/YYYY')
);
INSERT
INTO TEST_DEDUPE VALUES
(
3 ,
'MEMBERS MARK' ,
TO_DATE('01/15/2011' , 'MM/DD/YYYY')
);
INSERT
INTO TEST_DEDUPE VALUES
(
3 ,
'MEMBERS MARK' ,
TO_DATE('01/15/2011' , 'MM/DD/YYYY')
);
INSERT
INTO TEST_DEDUPE VALUES
(
4 ,
'MEMBERS MARK' ,
TO_DATE('01/15/2011' , 'MM/DD/YYYY')
);

Now let us look at the data:

SELECT * FROM TEST_DEDUPE
CUSTOMER_IDBRANDTRANSACTION_DATE
1GREAT VALUE20-JAN-11
1GREAT VALUE20-JAN-11
1GREAT VALUE20-JAN-11
1GREAT VALUE22-JAN-11
1GREAT VALUE22-JAN-11
2MEMBERS MARK22-JAN-11
2MEMBERS MARK22-JAN-11
2MEMBERS MARK22-JAN-11
2MEMBERS MARK15-JAN-11
2MEMBERS MARK15-JAN-11
3MEMBERS MARK15-JAN-11
3MEMBERS MARK15-JAN-11
4MEMBERS MARK15-JAN-11

 

We noticed that Customer_IDs {1, 2, 3} have duplicates. For instance, Customer_ID 1 have three rows with Brand = “Great Value” and Purchase date = “20-JAN-11”. After de-duping we should only have one row.


Step 2:

Now let us rename the table that we need to de-dupe.

ALTER TABLE TEST_DEDUPE
RENAME TO TEST_DEDUPE_OLD;

Step 3:
Now let’s create a new table that has the same name as the original table.

CREATE TABLE TEST_DEDUPE AS
(SELECT CUSTOMER_ID ,
BRAND ,
TRANSACTION_DATE
FROM
(SELECT CUSTOMER_ID ,
BRAND ,
TRANSACTION_DATE ,
ROW_NUMBER()
OVER
(PARTITION BY CUSTOMER_ID , BRAND , TRANSACTION_DATE
ORDER BY TRANSACTION_DATE ) AS RN
FROM TEST_DEDUPE_OLD
)
WHERE RN = 1
);

Final Step:
Let us view the data without dupes:

SELECT * FROM TEST_DEDUPE;
CUSTOMER_IDBRANDTRANSACTION_DATE
1GREAT VALUE20-Jan-11
1GREAT VALUE22-Jan-11
2MEMBERS MARK15-Jan-11
2MEMBERS MARK22-Jan-11
3MEMBERS MARK15-Jan-11
4MEMBERS MARK15-Jan-11

 

 

How to extract the domain name from an email address

Method 1:

SELECT SUBSTR(
'EXAMPLE@YAHOO.COM' ,
INSTR('EXAMPLE@YAHOO.COM' , '@' ) + 1 )
AS EMAIL_DOMAIN
FROM DUAL;

Method 2:

SELECT
REPLACE(
REGEXP_SUBSTR( 'EXAMPLE@YAHOO.COM' , '@.*' ) , '@' )
AS EMAIL_DOMAIN
FROM DUAL;

All you need to do now is to replace ‘EXAMPLE@YAHOO.COM’ with your email address column name or value.

 

How to denormalize a table in oracle.

Let us first begin by creating a table and setting up some data.

Topic 1:
Customer and Brand are a one to one relationship.

 CREATE TABLE CUSTOMER_BRAND
(
CUSTOMER_ID NUMBER(10),
BRAND VARCHAR2(20)
);
INSERT INTO CUSTOMER_BRAND VALUES( 1 , 'BMW' );
INSERT INTO CUSTOMER_BRAND VALUES( 1, 'CHEVY' );
INSERT INTO CUSTOMER_BRAND VALUES( 1, 'FORD' );
INSERT INTO CUSTOMER_BRAND VALUES( 2 , 'BMW' );
INSERT INTO CUSTOMER_BRAND VALUES( 2 , 'FORD');
INSERT INTO CUSTOMER_BRAND VALUES( 3 , 'CHEVY');

Please note that a customer can only have one brand.

Now let’s try to view the data in the table.

SELECT * FROM CUSTOMER_BRAND;

CUSTOMER_BRAND

CUSTOMER_IDBRAND
1BMW
1CHEVY
1FORD
2BMW
2FORD
3CHEVY

Now we would like to flatten the data (aka pivot).

SELECT DISTINCT CUSTOMER_ID ,
(
CASE
WHEN (SELECT 'Y'
FROM CUSTOMER_BRAND A
WHERE BRAND       = 'BMW'
AND A.CUSTOMER_ID = B.CUSTOMER_ID ) = 'Y'
THEN 'Y'
ELSE 'N'
END) AS BMW ,
(
CASE
WHEN (SELECT 'Y'
FROM CUSTOMER_BRAND A
WHERE BRAND       = 'FORD'
AND A.CUSTOMER_ID = B.CUSTOMER_ID) = 'Y'
THEN 'Y'
ELSE 'N'
END) AS FORD,
(
CASE
WHEN (SELECT 'Y'
FROM CUSTOMER_BRAND A
WHERE BRAND       = 'CHEVY'
AND A.CUSTOMER_ID = B.CUSTOMER_ID ) = 'Y'
THEN 'Y'
ELSE 'N'
END) AS CHEVY
FROM CUSTOMER_BRAND B;

Results:

CUSTOMER_IDBMWFORDCHEVY
1YYY
3NNY
2YYN