Tuesday, March 27, 2012

combining multiple tables into a single flat file destination

Hi,

I want to combine a series of outputs from tsql queries into a single flat file destination using SSIS.

Does anyone have any inkling into how I would do this.

I know that I can configure a flat file connection manager to accept the output from the first oledb source, but am having difficulty with subsequent queries.

e.g. output

personID, personForename, personSurname,
1, pf, langan

***Roles
roleID, roleName
1, developer
2, architect
3, business analyst
4, project manager
5, general manager
6, ceo

***joinPersonRoles
personID,roleID
1,1
1,2
1,3
1,4
1,5
1,6

Use Merge Joins to join your data sources. You'll need to use more than one because the Merge Join uses two inputs only.

http://msdn2.microsoft.com/en-us/library/ms141775.aspx

|||

You can do the merge join transformations as phil stated, or use a series of lookup transformations, or you could simply do a join on your initial data source query...

select out.personID, out.personForename, out.personSurname, role.roleID, role.roleName

from output as out

INNER JOIN

joinPersonRoles as pr

ON

pr.personID = out.personID

INNER JOIN

roles as role

ON

role.roleID = pr.roleID

Do you want your output to have a single row per person or are multiple rows ok? If you need it all in a single row you will also need to use a pivot transformation (or use pivot in your t-sql statement).

|||Performing the join in the source query via T-SQL as EWisdahl stated is the best route because it lets the database engine do the work.|||Ah, but you're missing my question.

I know how to do joins in order to produce a single recordset to output to a flat file.

I don't want to output a single set of records. I want to produce 3 sets, and output them all to the same file.

the example output I provided is exactly as I want the flat file.

Basically I want to run 3 data flow tasks in sequential order that appends their own output to the flat file destination.|||

So make three data flow tasks and three flat file connection managers, each referencing the same file.

Hook the data flow tasks up together in the control flow to enforce precedence.

What's the problem, I guess? It sounds like you've got it figured out: "Basically I want to run 3 data flow tasks in sequential order that appends their own output to the flat file destination."

|||

This is an odd request, however, you could potentially build up your own csv record.

Have each record be a single column text field and use a derived column transformation to string all of your current columns together for each of the sources.

After you pull these together do a union all.

If needed, you can do a select to grab the metadata information (i.e. table and column names) to push into the output as well. (i.e. select 'tablename'; select 'column1, column2, columnN'; etc...)

:edit - phil once again provided a working answer above while I was typing ... :

|||

great.. thank you so much!

No comments:

Post a Comment