Ralf Eisenreich

SQLBlog.DE | ..things to remember

April 17th, 2009

SSIS: Split Excel cell values into rows

.Net, SQL Server, Uncategorized, by Ralf.

Here I would like to show you a solution for following problem.
One source row coming from Excel contains several values within one column. Each one of these values should be reflected in a separate row in the data destination.
So the basic problem is that there we have a “list column” whose contents should be result in separate rows.

To make the problem more complex: The number of values in the “list column” is variable and there is a second column which contains corresponding values.

Excel Source

Excel Sheet

Excel Sheet

solution: SSIS Script Component

To implement a solution for the problem we use the most adequate item – the Script Component.

  1. We start with a data flow task in our package and add an OLE DB Excel Source component (that returns the source data above).
    We can use following SQL statement:

    SELECT
    F1,F2,F3
    FROM
    [sheet1$B1:D5]

    The imported columns can be named as KEY, PERSON, DEPARTMENT.
    Excel Source

    Excel Source

  2. Then we add a Derived Column Component to the data flow as well.
    Derived Column

    Derived Column

    In order to replace the CRs with commas we replace the imported columns with following formula:

    REPLACE(REPLACE(REPLACE(REPLACE(PERSON,"\n",",")," ",""),",,",","),",,",",")
    REPLACE(REPLACE(REPLACE(REPLACE(DEPARTMENT,"\n",",")," ",""),",,",","),",,",",")

    Derived Column Settings

    Derived Column Settings

  3. As result we receive following list of elements:

    • – -
    • row 1
    • PERSON: Mr. Wayne,Mr. Kent,Mr. Parker
    • DEPARTMENT: Heros/Batman,Heros/Superman,Heros/Spiderman
    • – -
    • row 2
    • PERSON: Mr. Kirk,Ms. Janeway
    • DEPARTMENT: StarTrek/Classic,StartTrek/Voyager
    • – -
    • row 3
    • PERSON: “Mr. Wrong”
    • DEPARTMENT: Wrong entry 1,Wrong entry 2
    • – -
  4. Now we add a Script Component to the data flow.
    When we are prompted to select the type of the component we choose Transformation.

    Script Component

    Script Component

  5. Then we right-click on the Script Component and choose Edit. In the Script Transformation Editor dialog box we select all three columns to make them available as input columns. Now these column are available in the Script code.
    Script Component Columns

    Script Component Columns

  6. Since the tranformation is going to produce more output rows than it receives input rows we have to mark the transformation as asynchronous. This is done by setting the SynchronousInputID property of the output to None.
    Script Component Settings

    Script Component Settings

  7. Next, we have to specify the output columns that the output buffer will contain. So we select the transformation’s output and click Add Column and set the properties (column name and data type).
    Script Component Input / Output rows

    Script Component Input / Output rows

  8. Now we can add the code by clicking on the Design Script button.
    Script Component - Design Script

    Script Component - Design Script

  9. Following Code is splitting our list of items into rows:

    ' Variables
    Dim i As Integer = 0
    Dim itemList_PERSON As String = Row.PERSON
    Dim itemList_DEPARTMENT As String = Row.DEPARTMENT
    Dim delimiter As String = ','
    If Not (String.IsNullOrEmpty(itemList_PERSON) Or String.IsNullOrEmpty(itemList_DEPARTMENT)) Then
    Dim inputListArray_Wer() As String = itemList_PERSON.Split(New String() {delimiter}, StringSplitOptions.RemoveEmptyEntries)
    Dim inputListArray_Abt() As String = itemList_DEPARTMENT.Split(New String() {delimiter}, StringSplitOptions.RemoveEmptyEntries)
    If (inputListArray_Wer.Length = inputListArray_Abt.Length) Then
    ' new rows
    For Each item As String In inputListArray_Wer
    With Output0Buffer
    .AddRow()
    .KEY = CInt(Row.KEY)
    .PERSON = item
    .DEPARTMENT = inputListArray_Abt(i).ToString()
    End With
    ' increase counter
    i = i + 1
    Next
    Else
    ' Filtered Rows
    With Output1Buffer
    .AddRow()
    .KEY = CInt(Row.KEY)
    .PERSON = Row.PERSON
    .DEPARTMENT = Row.DEPARTMENT
    End With
    End If
    End If

    The code shows that there are two outputs. Whenever a record set cannot be split or the columns PERSON and DEPARTMENT have a different number of itemss the rows is redirected to the second output (here Output1).
  10. In order to receive the output from the Script Component we add a Union All Component. Additionally we add two Data Viewer Components, since we want to see the resulting outputs from the Script Component.
    Union ALL

    Union ALL

  11. Running the SSIS package will bring the following result.
    Data Flow completed

    Data Flow completed

  12. As we can see, the first DataViewer Component show us the valid rows produced by our Script Component.
    Data Viewer valid rows

    Data Viewer valid rows

  13. The second DataViewer Component show us the invalid rows produced by our Script Component.
    Data Viewer invalid rows

    Data Viewer invalid rows

download
Of course I provide you the solution as download.

Back Top

Responses to “SSIS: Split Excel cell values into rows”

Comments (5) Trackbacks (1) Leave a comment Trackback url
  1. thanks alot. it is very helpful info. regards

  2. Hey…..nice post!!

    Awesome, No more words to explain :) :) :D just….cool blog.

  3. Thank U. This was very useful.

  4. Thankyou – this has really helped me.

  1. small business it service (,October 30, 2011)

    small business it service…

    [...]SSIS: Split Excel cell values into rows | Ralf Eisenreich[...]…

Leave a Reply

Your email address will not be published. Required fields are marked *

*

Security Code: