Updated March 15, 2023
Introduction to SSIS Merge Join
The SSIS merge join is an important function in the toolbox used to execute SQL join operations like right outer join, left outer join, full outer join, and inner join in the integration services in the SQL server. Here the merge join can be executed only on the sorted data where the sort transformation is compulsory before applying any join functions in the merge join transformation. The transformation has dual input and a single output but doesn’t support any output error functions. The merge join transformation in SSIS is discussed briefly in this article.
What is SSIS Merge Join?
In the relational database, various normalization methods are implemented to split the information across multiple tables. In addition, multiple Join operators are available in the SSIS toolbox if the data needs to be retrieved together later. The tables combine the join key or shared key to merge the tables related to the user choice. The SQL Server setup also offers the same integration services to build and implement SSIS packages and perform data manipulations and transformations. With this, the user can perform functions like right and left outer join and inner join.
How to Use SSIS Merge Join?
The merge join transformation of SSIS is combined with two different sorted inputs into single output by using the join operation. Three join operations such as inner, full, and left are implemented. Using the sorting key in every data source, the merge join combines two inputs into a single output. The output columns are chosen from both sources.
As the right join is not given in the join type, the user can choose this option to swap inputs; the inputs are swapped from the right end to the left, the same as the right outer join transformation.
The output rows are different based on the type of join used. Inner join is used to return the rows that have the same values in both the data sources. The left outer join returns every row from the left end source with the same values found in the right side source. Then in full join transformation, it returns all the matching values in rows from the right and left sides of the data sources.
Create SSIS Merge Join
The SSIS merge join can be created when the data sources are ready. The data flow transformation can be added to perform merge join from the tab. It is available in the toolbox on the surface of the data flow design. Then link the data flow path from the dummy source component and demo source component to execute merge join transformation. The data path should be dragged from the Demo source component, then attach the headed arrow to the input and output dialog box from the drop-down list. The OLE DB source output has the output drop-down list, and the user can choose the required part. The merge joins left, and right input from the input list are chosen from the Dummy connection window.
Connection should be given to execute merge join transformation from the data path to the dummy data sources. At this time, the selection of the input-output dialog box will not appear. Instead of such a dialog box, the default list in the drop-down has the input values, the only remaining option. Then right input merge join is chosen, and the data flow path is structured.
The user can note some red circles with white marks on the merge join transformation, which denotes the error symbol. If the user executes any package of the current version, then at times, there may be an error message. The data should be sorted first before executing the merge join transformation. It can be done in two methods: the OLE DB source component and the other by adding sorting transformation to the data flow.
Example of SSIS Merge Join Transformation
An example of merge join transformation in SSIS is given. Create two files, file A and file B, and provide input to the files. The components of flat-file sources should be dragged and dropped and provide the name as source file W. Then, configure the properties used to link and extract the data from the text file. Then connection manager’s name should be provided and start to surf the file. First, the column name in the first data row should be check marked. Then, the column should be checked out, advanced, and then the preview to ensure the correct data and submit or click ok.
Then follow the same steps like the components of the flat file source should be dragged and dropped, and the name should be given as source file V, then configure accordingly.
The transformation in SSIS is a two-step sort that the user has to drag and drop in the data flow region and then double click it on the sort to make preferred settings. Then the column should be sorted in ascending order. Then the previous column also needs to be sorted in ascending order. Finally, the merge join transformation should be dragged and dropped in the data flow region to make an inner join on the given tables.
Open the merge join editor to execute the merge join transformation and configure it accordingly.
Then in the drop-down, choose a join type such as inner join. Then click mapping to ensure the input column is similar to the output column. Finally, complete the design to execute the inner join in the SSIS package. Then the user can opt for left outer join and now apply on the files to make the full outer join and execute it.
Conclusion
Hence, it is the basic operation of merge join transformation in SSIS. It is readily available as a package for the user to make preferred operations by merging two different data files to get a single desired output.
Recommended Articles
This is a guide to SSIS Merge Join. Here we discuss the definition, how to use and create merge join in SSIS, and its transformation example. You may also look at the following articles to learn more –