Sharing Knowledge With The World…

Category: Uncategorized

Informatica Best Practices for Cleaner Development

Don’t you just hate it when you can’t find that one mapping among the thousand-odd mappings in your repository?

A best practice is a method or technique that has consistently shown results superior to those achieved by other means, and that is used as a benchmark. A “best” practice can also evolve as improvements are discovered. Following these Informatica best-practice guidelines allows better repository management, which makes your life easier. Incorporate these practices when you create Informatica objects:

Mapping Designer

  • There should be a placeholder transformation (Expression) immediately after the source and one before the target.
  • Active transformations that reduce the number of records should be used as early as possible.
  • Connect only the ports that are required in targets to subsequent transformations.
  • If a join must be used in the mapping, explicitly select the driving/master table for the join.
  • For generic logic to be used across mappings, create a mapplet and reuse it.

 

Transformation Developer

  • Replace complex filter expressions with (Y/N) flags. The filter expression takes less time to process a flag than the full logic.
  • Persistent caches should be used in lookups if the lookup data is not expected to change often.

Naming conventions: name Informatica transformations with a short lowercase prefix (typically a three-letter abbreviation) that indicates the transformation type, e.g. lkp_<name of the lookup> for a Lookup, rtr_<name of the router> for a Router transformation, etc.
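A convention like this is easy to check mechanically. The following is only a rough sketch in Java, not anything Informatica provides: the class name NamingConventionCheck, the method typeOf and the sample object names are made up for illustration, and only the two prefixes mentioned above are included.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

class NamingConventionCheck {
    // Prefix -> transformation type. Only the two prefixes named in the text;
    // extend the map with your own project's conventions.
    static final Map<String, String> PREFIXES = new LinkedHashMap<>();
    static {
        PREFIXES.put("lkp_", "Lookup");
        PREFIXES.put("rtr_", "Router");
    }

    // Returns the transformation type implied by the name's prefix,
    // or an empty Optional if the name follows none of the known prefixes.
    static Optional<String> typeOf(String objectName) {
        for (Map.Entry<String, String> entry : PREFIXES.entrySet()) {
            String prefix = entry.getKey();
            // Require at least one character after the prefix.
            if (objectName.startsWith(prefix) && objectName.length() > prefix.length()) {
                return Optional.of(entry.getValue());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical object names, used only for the demonstration.
        System.out.println(typeOf("lkp_customer_dim").orElse("does not follow the convention"));
        System.out.println(typeOf("CustomerLookup").orElse("does not follow the convention"));
    }
}
```

A check like this could be run over an exported list of object names to flag the ones that would be hard to find later.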

 

Workflow Manager

  • Naming convention for sessions, worklets and workflows: s_<name of the session>, wlt_<name of the worklet>, wkf_<name of the workflow>.
  • Sessions should be created as reusable so that they can be used in multiple workflows.
  • When loading tables for full loads, the truncate target table option should be checked.
  • The “Commit interval” property (default value: 10,000) should be increased for volumes of more than 1 million records.
  • Pre-session command scripts should be used for disabling constraints, building temporary tables, moving files, etc. Post-session scripts should be used for rebuilding indexes and dropping temporary tables.

 

Performance Optimization Best Practices

We often come across situations where the Data Transformation Manager (DTM) takes a long time to read from a source or to write to a target. The following guidelines can improve overall performance.

  • Use the Source Qualifier to join source tables when they reside in the same schema
  • Make use of the Source Qualifier “Filter” property if the source type is relational
  • Use integer flags, as integer comparison is faster than string comparison
  • Use the table with the smaller number of records as the master table for joins
  • When reading from flat files, define the appropriate data types instead of reading everything as String and converting
  • Connect all required ports to subsequent transformations; for the remaining ports, check whether they can be removed
  • Suppress ORDER BY by placing ‘--’ at the end of the query in Lookup transformations
  • Minimize the number of Update Strategy transformations
  • Group by simple columns in transformations such as Aggregator and Source Qualifier
  • Use a Router transformation in place of multiple Filter transformations
  • Turn off verbose logging when moving mappings to UAT/Production environments
  • For large volumes of data, drop indexes before loading and recreate them after the load
  • For large volumes of records, use bulk load and increase the commit interval to a higher value
  • Set ‘Commit on Target’ in the sessions

 

These are a few things a beginner should know when starting to code in Informatica. These Informatica best-practice guidelines are a must for efficient repository and overall project management and tracking.

Clover ETL

We are living in exciting times in the field of data integration: tools like Talend and Pentaho are giving heavyweight tools such as Informatica, Ab Initio and DataStage a run for their money. The features offered by these tools, though not as mature as those of the heavyweight tools, are exhaustive nonetheless (especially the features offered by Talend). Today we are discussing another such tool, Clover ETL.

Clover Overview

Clover Data Integration is relatively new compared to its competitors Talend and Pentaho. Like Talend, Clover uses the Eclipse framework for the visual editor and uses the JRE to run transformations. The difference is that Clover is a metadata-driven tool and does not need code generation to run jobs.

CloverETL has a smaller palette, but its components offer more complex functionality. With well-defined functionality, they are also easier to choose from. Features like parallelism, however, are supported only in the enterprise version, which seems to be a trend in all these “open source data integration” tools.

Features:

  • Pass parameters to graphs through a file.
  • Enables visual debugging and data monitoring at any point in time.
  • Easy switching between graphs.
  • Share connections between data structures.

 

Input and Output Components:

[Image: CloverETL input and output components]

Transformations Provided:

[Image: CloverETL transformations]

Types of Joins Provided:

[Image: CloverETL joiners]

If you would like to learn more about this awesome tool, see the following link: CloverETL

Conclusion

Clover ETL shows a lot of promise and is a very powerful tool in its own right. Since this article provides only an overview of the tool, you can expect more articles in the future with a detailed Talend vs. CloverETL analysis. Given the economic slowdown, where bootstrapping is a necessity, CloverETL along with other tools like Talend and Pentaho is a reliable, powerful option to consider!

Inheritance in Object-oriented Programming

Inheritance is an important concept in object-oriented programming. It defines the relationship between different objects.

Inheritance is the process wherein characteristics are inherited from ancestors. Similarly, in Java, a subclass inherits the characteristics (properties and methods) of its superclass (ancestor).

An object is able to inherit characteristics from another object. In other words, an object is able to pass on its state and behavior to its children. The objects need to have characteristics in common with each other for inheritance to work.

For example, mountain bikes, road bikes and tandem bikes all share common characteristics of bikes, such as speed and gear. Each of these bikes also has some additional functionality.

Inheritance allows the classes for mountain bike, road bike and tandem bike to have the properties of the class Bike, and Bike becomes the superclass of all these types of bikes. In the Java programming language, each class can have only one direct superclass, while a superclass can have an unlimited number of subclasses.

One Liner: Inheritance is a mechanism wherein a new class is derived from an existing class.

In Java, classes can inherit or we can say acquire the properties and methods of other classes. A class that inherits from another class is called a subclass, whereas the class from which a subclass is derived is called a superclass. The keyword “extends” is used to derive a subclass from the superclass.

There is a simple syntax for creating a subclass. Use the extends keyword in your class declaration, after the class name and followed by the name of the class to inherit from (the superclass):

class MountainBike extends Bike {
    // new fields and methods defining
    // a mountain bike would go here
}

This gives MountainBike all the same fields and methods as Bike, yet allows its code to focus exclusively on the features that make it unique.
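To see the inherited members in action, here is a minimal, self-contained sketch. The fields speed and gear come from the bike example above; the methods changeGear and speedUp, the field seatHeight and the class InheritanceDemo are assumptions added purely for illustration.

```java
class Bike {
    int speed = 0;
    int gear = 1;

    void changeGear(int newGear) {
        gear = newGear;
    }

    void speedUp(int increment) {
        speed += increment;
    }
}

class MountainBike extends Bike {
    // Extra state that only a mountain bike has (hypothetical field).
    int seatHeight = 20;

    void setSeatHeight(int newHeight) {
        seatHeight = newHeight;
    }
}

class InheritanceDemo {
    public static void main(String[] args) {
        MountainBike bike = new MountainBike();
        bike.changeGear(3);     // inherited from Bike
        bike.speedUp(10);       // inherited from Bike
        bike.setSeatHeight(25); // defined in MountainBike itself
        // prints "gear=3 speed=10 seatHeight=25"
        System.out.println("gear=" + bike.gear + " speed=" + bike.speed
                + " seatHeight=" + bike.seatHeight);
    }
}
```

MountainBike never declares speed, gear, changeGear or speedUp, yet all of them are available on a MountainBike object because they are inherited from Bike.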

Interface in Object-oriented Programming

Objects interact with the outside world through the methods that they expose.

An interface can therefore be defined as a contract that lists all the methods an object must support.

One Liner: An Interface in Object-oriented Programming is a blueprint of a class. It has static constants and abstract methods.

The interface is a mechanism to achieve full abstraction in Java. Before Java 8, an interface could contain only abstract methods (later versions also allow default and static methods). Interfaces are also used to achieve multiple inheritance in Java.

Consider the ‘receive’ and ‘reject’ buttons on your phone. You know that these buttons will receive and reject a call, and the phone must support both actions; that is why the buttons are on the phone. The buttons act as an interface between you and the phone.

interface Phone {
    void receive();
    void reject();
}

Interfaces cannot be instantiated; rather, they are implemented. Implementing an interface allows a class to become more formal about the behavior it promises to provide. Interfaces form a contract between the class and the outside world, and this contract is enforced at build time by the compiler. If your class claims to implement an interface, all methods defined by that interface must appear in its source code before the class will successfully compile.

In Java, implementing an interface establishes an IS-A relationship, just as extending a class does.
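The phone example can be sketched as a small runnable program. The method names receive and reject follow the two buttons described above; the class BasicPhone, its lastAction field and the class InterfaceDemo are hypothetical names used only for this illustration.

```java
interface Phone {
    void receive();
    void reject();
}

// A class that signs the Phone contract must implement both methods,
// otherwise the compiler rejects it.
class BasicPhone implements Phone {
    String lastAction = "none";

    @Override
    public void receive() {
        lastAction = "received";
    }

    @Override
    public void reject() {
        lastAction = "rejected";
    }
}

class InterfaceDemo {
    public static void main(String[] args) {
        BasicPhone basicPhone = new BasicPhone();
        // A BasicPhone IS-A Phone, so it can be used through the interface type.
        Phone phone = basicPhone;
        phone.receive();
        System.out.println(basicPhone.lastAction); // prints "received"
        phone.reject();
        System.out.println(basicPhone.lastAction); // prints "rejected"
    }
}
```

The caller only depends on the Phone type, so any other class that implements Phone could be swapped in without changing the calling code.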

Class in Object-oriented Programming

In object-oriented programming, a class is an extensible program-code template for creating objects, providing initial values for state (member variables) and implementations of behavior (member functions, methods).

 

One Liner: A Class in Object-oriented Programming is an expanded concept of a data structure: instead of holding only data, it can hold both data and functions.

A class is a blueprint or template or set of instructions to build a specific type of object. Every object is built from a class.

Let’s try to understand classes with the example of making cookies.

A class is like a recipe for making a cookie. The recipe itself is not a cookie; you can’t eat the recipe. If you follow the recipe, you can make a cookie. The cookie made from the recipe can be referred to as an object.

You can make as many cookies as you would like using the same recipe. Similarly, you can create as many instances of a class as you would like.

Now assume that you have to make cookies for different people. How will you identify which cookie is for which person? A simple solution is to write the name of the person on the cookie. Reference variables work in a similar fashion. A reference variable provides a unique name for each instance of a class. To work with a particular instance, you use the reference variable it is assigned to.

Creating class Cookie:

class Cookie {
    String flavor;
    String taste;
}

 

Creating cookies for different persons:

Cookie cookieForPerson1 = new Cookie();
Cookie cookieForPerson2 = new Cookie();
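Put together, the two snippets above can be run as follows. This is just a small sketch: the wrapper class CookieDemo and the flavor values are made up for the example.

```java
class Cookie {
    String flavor;
    String taste;
}

class CookieDemo {
    public static void main(String[] args) {
        // Two separate objects built from the same "recipe".
        Cookie cookieForPerson1 = new Cookie();
        Cookie cookieForPerson2 = new Cookie();

        // Each reference variable points at its own instance,
        // so changing one cookie does not affect the other.
        cookieForPerson1.flavor = "chocolate";
        cookieForPerson2.flavor = "vanilla";

        System.out.println(cookieForPerson1.flavor);                // prints "chocolate"
        System.out.println(cookieForPerson2.flavor);                // prints "vanilla"
        System.out.println(cookieForPerson1 == cookieForPerson2);   // prints "false"
    }
}
```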

 

 

Package in JAVA

Suppose you and your friend are freelancers and work as a team. You get a project that has to be completed in one week, so you decide to divide the project into two modules, and each of you works on a different module. This way you can complete the project in time. Say the module you are working on is named ‘module-A’ and the module your friend is working on is named ‘module-B’.

Now both modules have a class named ‘Utility’.

When you combine the two modules at the end, you have two ‘Utility’ classes. How would you identify which ‘Utility’ class belongs to whom? Packages are the solution to this type of problem.

Each of you writes your code in your own package, so the two ‘Utility’ classes belong to different packages and there is no ambiguity.
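The Java standard library itself contains this situation: there are two classes named Date, one in the package java.util and one in the package java.sql. The short sketch below (the class PackageDemo and the method fullName are names chosen only for this example) shows how the package-qualified name keeps them apart, just as it would for the two ‘Utility’ classes.

```java
class PackageDemo {
    // Returns the fully qualified name (package + class name) of an object's class.
    static String fullName(Object object) {
        return object.getClass().getName();
    }

    public static void main(String[] args) {
        // Same simple class name, different packages, so the
        // fully qualified names are used to tell them apart.
        java.util.Date utilDate = new java.util.Date(0L);
        java.sql.Date sqlDate = new java.sql.Date(0L);

        System.out.println(utilDate.getClass().getSimpleName()); // prints "Date"
        System.out.println(sqlDate.getClass().getSimpleName());  // prints "Date"
        System.out.println(fullName(utilDate));                  // prints "java.util.Date"
        System.out.println(fullName(sqlDate));                   // prints "java.sql.Date"
    }
}
```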

One Liner: A Package in JAVA is a namespace that organizes a set of related classes and interfaces.

You can think of packages as being similar to the folders on your computer. You can put images in one folder, applications in another and movies in another. Software written in the Java programming language can be composed of thousands of classes, so it is good practice to organize classes and interfaces by putting them in different packages.

When creating a package, you should choose a name for the package and put a package statement with that name at the top of every source file that contains the classes, interfaces, enumerations, and annotation types that you want to include in the package.

The package statement should be the first statement in the source file (only comments may come before it). There can be only one package statement in each source file, and it applies to all types in the file.

If a package statement is not used, then the classes, interfaces, enumerations, and annotation types will be put into an unnamed package.

 

Example of Package in JAVA:

 

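// File: car/Car.java
package car;

interface Car {
    public void run();
    // "break" is a reserved word in Java, so the method is named brake
    public void brake();
}

// File: car/Honda.java
package car;

public class Honda implements Car {
    public void run() {}
    public void brake() {}
}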

 

Get Latest File from FTP using shell scripting in Unix

Hello readers, I couldn’t find anything very helpful on this topic, so I decided to do what we do best: blog about it! Our FTP server receives files on a daily basis in the format FILENAME_YYYYMMDD.txt, so we needed a shell script to get the file with the latest timestamp from the server.

Here’s the shell script we developed:

************code_begins********************************

#!/bin/ksh
# File name format: FILENAME_YYYYMMDD.txt

# Step 1: write the remote directory listing to the local file temp.txt
ftp -inv sa2-sftp01.zs.local << END_FTP_SCRIPT
quote USER USERNAME
quote PASS PASSWORD
bin
cd DIRECTORY_PATH
ls . temp.txt
bye
END_FTP_SCRIPT

# Step 2: keep the last field of each listing line (the file name), sort on
# the YYYYMMDD part after the underscore, and take the newest entry
fil_name=`awk '{print $NF}' temp.txt | sort -t_ -k2,2n | tail -1`
echo $fil_name

# Step 3: download that file
ftp -inv sa2-sftp01.zs.local << END_FTP_SCRIPT
quote USER USERNAME
quote PASS PASSWORD
bin
cd DIRECTORY_PATH
get $fil_name
bye
END_FTP_SCRIPT

************code_ends********************************
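The selection step (pick the name with the newest YYYYMMDD stamp) does not depend on FTP at all, so it can also be sketched on its own. The Java version below is only an illustration of that one step, under the assumption that names look like FILENAME_YYYYMMDD.txt; the class LatestFilePicker, its methods and the sample file names are made up for the example.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class LatestFilePicker {
    // Extracts the YYYYMMDD part: the text between the last underscore and the extension.
    static String dateStamp(String fileName) {
        int underscore = fileName.lastIndexOf('_');
        int dot = fileName.lastIndexOf('.');
        return fileName.substring(underscore + 1, dot);
    }

    // Returns the name carrying the newest date stamp, ignoring names that
    // do not match the expected pattern. Empty if nothing matches.
    static Optional<String> latest(List<String> fileNames) {
        return fileNames.stream()
                .filter(name -> name.matches(".+_\\d{8}\\.txt"))
                // YYYYMMDD stamps have a fixed width, so plain string
                // comparison orders them chronologically.
                .max(Comparator.comparing(LatestFilePicker::dateStamp));
    }

    public static void main(String[] args) {
        // Sample listing; in the script this would be the content of temp.txt.
        List<String> listing = Arrays.asList(
                "FILENAME_20240131.txt",
                "FILENAME_20240201.txt",
                "FILENAME_20231215.txt");
        // prints "FILENAME_20240201.txt"
        System.out.println(latest(listing).orElse("no matching file"));
    }
}
```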
