Thursday, March 29, 2012

combining two tables

hi,

does anyone have any good insight to this problem? I will have two tables which contains the same number of columns for same data types, they are related together by a key book_id. I need to combine them together and create some extra totalling data in the new datatable for a report. Here is an example

table1:
book_id new_words cost_of_change
1 3000 2
1 4000 4
2 500 4

table2
book_id old_words cost_of_change
1 1500 1
3 2500 5

I need to combine them into a table like this:

book_id new_words cost_of_change old_words cost_of_change total_cost
1 7000 6 1500 1 7
2 500 4 0 0 4
3 2500 5 0 0 5

whats the best way to do this?

I have been trying to use full outer joins to do this but I find this a difficult way to create new rows in the new combined table, like what will be an easy way for me to say in SQL that only one row should be used for book_id 1, as it's is present in the two source tables 3 times? I think i will be able to find out from using left and right inner joins, before i make the new combined table but this seems like a very ineligant way of doing this, as it seems to require lots of temp tables.

thxIs table1 the only place where there can be duplicate book IDs? I'm going to assume so, but if table2 can have duplicates you'll need to modify this a bit. But the basic idea should work.

You can use an aggregate subquery for table1 that you then join on table2. The subquery looks something like this (all of this is untested code; you may need to tweak):

SELECT SUM(new_words), SUM(cost_of_change) FROM table1 GROUP BY book_id

That sums the two fields for each id and eliminates the dupes. That subquery becomes one of the derived tables in the outer select. Something like this:

SELECT B.book_id, A.new_words, A.new_cost, B.old_words, B.cost_of_change AS old_cost, total_cost FROM table2 AS B
INNER JOIN (SELECT SUM(new_words) AS new_words, SUM(cost_of_change) AS new_cost
FROM table1 GROUP BY book_id) AS A
ON A.book_id = B.book_id

This query doesn't yet aggregate the totals from the two tables, so that will be another outer query, but the idea is the same. And there are almost certainly ways to simplify this query.

One way is to use table variables in SS2K. Then you can do three more straightforward joins.

Is this helpful? Or have I confused things more?

Don|||Something like this should work:


Select
IsNull(A.book_id,B.book_id) as book_id,
IsNull(A.new_words,0.0) as New_Words,
IsNull(A.Cost_of_change,0.0) ACost_of_Change,
IsNull(B.old_words,0.0) as Old_Words,
IsNull(B.Cost_of_change,0.0) BCost_of_Change,
IsNull(A.Cost_of_change,0.0)+IsNull(B.Cost_of_change,0.0) as Cost_of_change
From
(Select book_id, Sum(new_words) New_Words,Sum(Cost_of_change) Cost_of_change FROM Table1 Group By book_id) A
FULL OUTER JOIN
(Select book_id, Sum(old_words) Old_Words,Sum(Cost_of_change) Cost_of_change FROM Table2 Group By book_id) B
ON A.book_id=B.book_id
|||Thanks Guys, that solved my problem. The second method is lot more readable, but which would be the most efficient method?|||Both methods are basically the same thing. The second method could be made clearer by using Table variables as mentioned in the first method. But I don't think that would affect efficiency. You could test this using the Sql Query Analyzer and compare the execution plans and execution times for each.

No comments:

Post a Comment