Smart questions
Smart answers
Smart people
INTELLIGENT WORK FORUMS
FOR COMPUTER PROFESSIONALS

Member Login

Come Join Us!

Are you a
Computer / IT professional?
Join Tek-Tips now!
  • Talk With Other Members
  • Be Notified Of Responses
    To Your Posts
  • Keyword Search
  • One-Click Access To Your
    Favorite Forums
  • Automated Signatures
    On Your Posts
  • Best Of All, It's Free!

Join Tek-Tips
*Tek-Tips's functionality depends on members receiving e-mail. By joining you are opting in to receive e-mail.

LINK TO THIS FORUM!

Add Stickiness To Your Site By Linking To This Professionally Managed Technical Forum.
Just copy and paste the
code below into your site.

Partner With Us!

"Best Of Breed" Forums Add Stickiness To Your Site
Partner Button
(Download This Button Today!)

Feedback

"...it was ingeniously designed and all those clicks were for my own good... and that was even before I got my speedy and useful answer to my tekkie question that I eventually posted..."

Geography

Where in the world do Tek-Tips members come from?
rotaxmax12 (TechnicalUser)
24 May 12 15:06
I am trying to build an SSIS package that will be used to seed a new system. I have a flat file that we have to concatinate five fields to create the primary key.

I tried to use the Aggregate tool, but as I need to load a total of 30 columns, in order to have all of the data pass through the Aggregate tool, I need to select all the columns I need. When I do the Group By in the Aggreate, I still have duplicate rows (PK). I would like to avoid creating temp tables or working tables if possible.

Essentially, how can I build (SELECT PK, Field1, Field2, Field3 FROM table WHERE PK IN (SELECT PK FROM table GROUP BY Field1, Field2, Field3 HAVING COUNT(*) > 1) out of a flat file source? But then still have the remaining data continue through the processing?

SQL: 2008 R2
Input: csv Flat File
Output: SQL Server 2008 R2

Thanks all
rotaxmax12



xlbo (MIS)
24 May 12 20:26
not meaning to be too pedantic here but if you still have duplicate rows after you have done your concatenation, it is not a primary key

you could simply pass the data flow through a sort
If you choose PK as your sort item you can choose the option to drop duplicate rows...

Rgds, Geoff

We could learn a lot from crayons. Some are sharp, some are pretty and some are dull. Some have weird names and all are different colours but they all live in the same box.

Please read FAQ222-2244: How to get the best answers before you ask a question

jjefferson (TechnicalUser)
31 May 12 18:07
My only thought is you may want to re-think your preference to not use working tables. With no exception (to my knowledge) we always import a foreign data source into tables in a STAGING database, where we can then use stored procedures to check out and transform the data. Just makes it so much easier on the SSIS design.

I understand your application may have reasons this is not appropriate, but I would not otherwise rule this option out.

Reply To This Thread

Posting in the Tek-Tips forums is a member-only feature.

Click Here to join Tek-Tips and talk with other members!

Close Box

Join Tek-Tips® Today!

Join your peers on the Internet's largest technical computer professional community.
It's easy to join and it's free.

Here's Why Members Love Tek-Tips Forums:

Register now while it's still free!

Already a member? Close this window and log in.

Join Us             Close