Click to See Complete Forum and Search --> : Help for query to clean up my data?


jacobly
04-23-2007, 02:01 PM
I have a table with the a structure similar to the following

id | section | subsection | division | group | subgroup | officialName | Name
--------------------------------------------------------------------------------------
01 | 33 | 001 | 00 | 00 | 00 | Book1 | Book
02 | 33 | 001 | 01 | 00 | 00 | NULL | dsf
03 | 33 | 001 | 01 | 01 | 00 | Book3 | Report
04 | 33 | 001 | 01 | 01 | 99 | NULL | NULL
05 | 33 | 001 | 01 | 02 | 00 | Document1 | document
06 | 33 | 001 | 01 | 02 | 01 | NULL | NULL
07 | 33 | 001 | 01 | 03 | 00 | Document3 | unknown
08 | 33 | 001 | 01 | 03 | 99 | NULL | NULL
09 | 33 | 001 | 01 | 04 | 00 | Document4 |
10 | 33 | 001 | 01 | 04 | 99 | NULL |


What I need to do is clean up my data by performing inserts on data which shares similar charaterisitics.

For example, in the above table, I need to copy the official name from row 03 to the same position in row 04. Essentially using section, subsection, division, and group as a combined key to select:

id | section | subsection | division | group | subgroup | officialName | Name
--------------------------------------------------------------------------------------
03 | 33 | 001 | 01 | 01 | 00 | Book3 | Report
04 | 33 | 001 | 01 | 01 | 99 | NULL | NULL

and end up with:

id | section | subsection | division | group | subgroup | officialName | Name
--------------------------------------------------------------------------------------
03 | 33 | 001 | 01 | 01 | 00 | Book3 | Report
04 | 33 | 001 | 01 | 01 | 99 | Book3 | Report

or in the final resulting data after running the query on the whole table:

id | section | subsection | division | group | subgroup | officialName | Name
--------------------------------------------------------------------------------------
01 | 33 | 001 | 00 | 00 | 00 | Book1 | Book
02 | 33 | 001 | 01 | 00 | 00 | NULL | dsf
03 | 33 | 001 | 01 | 01 | 00 | Book3 | Report
04 | 33 | 001 | 01 | 01 | 99 | Book3 | Report
05 | 33 | 001 | 01 | 02 | 00 | Document1 | document
06 | 33 | 001 | 01 | 02 | 01 | Document1 | document
07 | 33 | 001 | 01 | 03 | 00 | Document3 | unknown
08 | 33 | 001 | 01 | 03 | 99 | Document3 | unknown
09 | 33 | 001 | 01 | 04 | 00 | Document4 | blank
10 | 33 | 001 | 01 | 04 | 99 | Document4 | blank

Thanks in advance for any help!

Let me know if I need to further clarify my question.

Jacob

Phil Weber
04-24-2007, 03:22 PM
What database are you using?

jacobly
04-24-2007, 03:55 PM
I'm using MySQL

joewmaki
04-26-2007, 11:47 AM
I'm not familiar with MySql, but in SQL server you can use a selfjoin similar to this:

UPDATE T
SET T.OfficialName = t2.officialname,
t.name = t2.name
from testdata t, testdata t2
where (t.[section] = t2.[section]
and t.subsection = t2.subsection
and t.division = t2.division
and t.[group] = t2.[group])
and t2.officialname is not null

jacobly
05-14-2007, 12:08 PM
I'm not familiar with MySql, but in SQL server you can use a selfjoin similar to this:

UPDATE T
SET T.OfficialName = t2.officialname,
t.name = t2.name
from testdata t, testdata t2
where (t.[section] = t2.[section]
and t.subsection = t2.subsection
and t.division = t2.division
and t.[group] = t2.[group])
and t2.officialname is not null

Thank you Joe, I had to exclude some more columns from the join for it to work. You put me on the right path and now that particular problem that I had has been solved (Plus I learned in the process).

Thanks again,

Jacob