Tuesday, March 27, 2012

Combining Row Data

I am adding data to a table from a table of identified duplicates.
These duplicates are not Key Violations, they are actual double entries
where one was deactivated prior to the new record being added (not very
clean but necessary given the system the data originates from). What I
would like to do is to take the data and combine some fields of the
duplicate rows.
I have created my mapping table (the table I want to add data to) which
looks like:
UniversalID (Identity)(PK)
Original_ID1 <-key from 1st row
Original_ID2 <-key from 2nd row
2ndSystemID
My data would look like this:
OriginalID Name 2ndSystemID
123 Smith, John KHGK39
124 Smith, John KHGK39
..
What I want is
UniversalID Original_ID1 Original_ID2 2ndSystemID
1 123 124 KHGK39
...
I know that this is doable, and I also know that I should know how, but
just can't seem to see the forest because all of the trees are in my
way.
Thanks for any help you can provide.
MTAnd if there are triplicates?
Don't try fixing old flaws by introducing new ones - such as breaking normal
form. I'd sugesst either deleting unwanted rows or adding a table to store
duplicate values of OriginalID referencing a proper primary key (e.g
2ndSystemID). Normalize now and prevent further difficulties.
ML
http://milambda.blogspot.com/|||Not really trying to break normal form. I need all of these values to
reference back to the original systems where the data lives. This is
why I need all of them. The UniversalID will be the new Key, the old
IDs, will become simple references.
I truly wish I could use only one, but unfortunately due to the nature
of the data I am dealing with, I am unable to and must find a work
around.
MT|||MT wrote:
> Not really trying to break normal form. I need all of these values to
> reference back to the original systems where the data lives. This is
> why I need all of them. The UniversalID will be the new Key, the old
> IDs, will become simple references.
> I truly wish I could use only one, but unfortunately due to the nature
> of the data I am dealing with, I am unable to and must find a work
> around.
> MT
Won't your proposal fail if there are more than 2 original ids? How
many do you expect to have to cope with?
Unfortunately you haven't given much information about keys. Here's a
guess at what I'd do:
CREATE TABLE users (universalid INTEGER PRIMARY KEY, name VARCHAR(50)
NOT NULL UNIQUE, systemid INTEGER NOT NULL);
CREATE TABLE original_ids (original_id INTEGER NOT NULL PRIMARY KEY,
universalid INTEGER NOT NULL REFERENCES users (universalid));
INSERT INTO users VALUES (1, 'Smith, John', 'KHGK39');
INSERT INTO orginal_ids VALUES (123,1);
INSERT INTO orginal_ids VALUES (124,1);
A join will map any number of original ids to the universal one.
David Portas, SQL Server MVP
Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.
SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
--

No comments:

Post a Comment