I'm having memory issues when trying to do some "divide and conquer" lookups.
I have a reference table - call it Person. Each record contains a unique ID, LastName, FirstName, BirthDate, Gender.
I have a second table - call it Respondent. Each record also has an ID, LastName, FirstName, BirthDate, Gender. The kicker here is that any of the fields in this table can be NULL - including ID.
I have a single data flow, in which I've split my Respondent data into records that have ID's and records that don't.
I want to match the records that have ID's on ID, LastName, FirstName. These, historically, provide most of my successful matches.
The one's that don't have ID's, I want to match on LastName, FirstName, BirthDate.
My Person table is quite large. Separate copies for each lookup are too much for my available memory. I thought of using a cache connection manager, but, according to Microsoft documentation, I have to use all the index fields I define in a cache connection manager. And I have different index fields for each of my lookups. Which I take to mean that I'd need two cache connection managers. Which I don't think will buy me anything.
I'm relatively new to SSIS, so, I'm hoping for some suggestions on structuring my tasks with a minimal amount of storing intermediate results in files or back in my data base.
Thanks.
I have a reference table - call it Person. Each record contains a unique ID, LastName, FirstName, BirthDate, Gender.
I have a second table - call it Respondent. Each record also has an ID, LastName, FirstName, BirthDate, Gender. The kicker here is that any of the fields in this table can be NULL - including ID.
I have a single data flow, in which I've split my Respondent data into records that have ID's and records that don't.
I want to match the records that have ID's on ID, LastName, FirstName. These, historically, provide most of my successful matches.
The one's that don't have ID's, I want to match on LastName, FirstName, BirthDate.
My Person table is quite large. Separate copies for each lookup are too much for my available memory. I thought of using a cache connection manager, but, according to Microsoft documentation, I have to use all the index fields I define in a cache connection manager. And I have different index fields for each of my lookups. Which I take to mean that I'd need two cache connection managers. Which I don't think will buy me anything.
I'm relatively new to SSIS, so, I'm hoping for some suggestions on structuring my tasks with a minimal amount of storing intermediate results in files or back in my data base.
Thanks.