Splitting column1 values into new column2 based on COUNT(column3)

daredavil · Jun 15, 2013

Hello,
could somebody help me to create query which should count number of values in column3 filtered by the values in column1 and if:
1) count exceeds e.g., 4000, create new column and split the values from column1 into 5 new values i.e., first 1000 into the same value like in column1 + character A, next 1000 into column1 + B, etc...and the remaining part into ????? + E?
2) count exceeds 3000, from A-D
3) count exceeds 2000, from A-C
4) count exceeds 1000, from A-B
Sorting should be ASC based on LastName, FirstName and FathersName?

There are approximately 10 000 distinct values in column1 and of course, if some count does not fall into one of groups, the values should be just "copy/paste" as they were found (i.e., if count does not exceed 1000). Column1 is of NVARCHAR type.
For example, if value in column1 is 001A001, and count of it exceeds 4000, in new column I need to have this:
001A001A, for the first 1000 occurencies of 001A001
001A001B for the second 1000 occurencies of 001A001
001A001C for the third 1000 occurencies of 001A001
001A001D for the forth 1000 occurencies of 001A001
001A001E for the remaining part of occurencies of 001A001

and so on for all 10000 distinct values in column1!

etc...Before splitting, there should be done sorting based on above mentioned LastName, FirstName and FathersName...

How to accomplish this like query in sql server 2008 R2?
Thanks for any code snippet that could lead me towards an solution and further fine tuning with, may be, involvement of some threshold etc..e.g., trying to keep all the LastName in the same group if the first character remains the same regardless of fact that it could lead towards exceeding of number of occurencies in some of groups (e.g., to have 1020 in the first group should be allowed).
Thanks for any assistance!

gmmastros · Jun 17, 2013

Does this work for you?

Code:

; With Data As
(
  Select Column1,
         LastName,
         FirstName,
         FathersName,
         Row_Number() Over (Partition By LastName, FirstName, FathersName Order By Column1) As RowId
  From   YourTableNameHere
)
Select   Column1 + Char(65 + RowId / 1000) As OutputColumn,
         Column1,
         LastName,
         FirstName,
         FathersName,
From     Data

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
twitter
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom

daredavil · Jun 18, 2013

Hello, as of now I get this error prompt

sql server 2008 R2)

Msg 156, Level 15, State 1, Line 15
Incorrect syntax near the keyword 'From'.

Could it be possible that code which I can see is the complete code? This is what I can see from your reply:

; With Data As
(
Select Column1,
LastName,
FirstName,
FathersName,
Row_Number() Over (Partition By LastName, FirstName, FathersName Order By Column1) As RowId
From YourTableNameHere
)
Select Column1 + Char(65 + RowId / 1000) As OutputColumn,
Column1,
LastName,
FirstName,
FathersName,
From Data

drlex · Jun 18, 2013

I think that there's a stray comma after the second "FathersName" reference.
(Also wondering whether the partition by and order by need to be reversed)

Code:

; With [Data] As
(
  Select Column1,
         LastName,
         FirstName,
         FathersName,
         Row_Number() Over (Partition By Column1 Order By LastName, FirstName, FathersName) As RowId
  From   YourTableNameHere
)

Select   Column1 + Char(65 + RowId / 1000) As OutputColumn,
         Column1,
         LastName,
         FirstName,
         FathersName
From     [Data]

soi là, soi carré

daredavil · Jun 18, 2013

Hello,
the code "works", only comma by the last FathersName has to be deleted before but, outcome of query is just e.g., if column1 value was 001A001 and total number of occurencies (count) of it was approx. 4500 (count of column1 = count column3, it is negligible which column it counts, result is the same), what I get is this: 001A001A. The column was just updated by added a character "A" at the end i.e., I just got renamed value 001A001 into 001A001A throughout all the rows! I should get 001A001A for the first 1000, upon that 001A001B for the second 1000 occurencies of it and the source table should be updated with new column named e.g., OutputColumn?

drlex · Jun 18, 2013

Did you try reversing the Order By / Partition By?

I don't presume to speak for George, but I'd reckon that the code that he supplied corresponded with your request for "[...]any code snippet that could lead me towards an solution". I'd be reluctant to post any UPDATE statements as part of an answer here!

soi là, soi carré

daredavil · Jun 18, 2013

Thanks a lot to both of you. The code provided by Mr. GMMASTROS works with remark that Partition By and Order By needs to be reversed i.e., Partition by Column1, Order by LastName etc....as MR.DRLEX proposed.

Thanks!

daredavil · Jul 1, 2013

Hello,
could somebody help to involve IF THEN loop into above code. Namely, I am trying to fine tune it by watching RowId number on such way that IF, RowId is less then 1000, the code remains the same. But, if RowId is equal to 1000 or > 1000 AND first letter of last name does not change (REMAINS THE SAME), to keep these records within the same group (TO KEEP THE RECORDS THAT BEGIN WITH THE SAME LETTER TOGETHER, IN THE SAME GROUP)? (REGARDLESS OF FACT THAT IT WILL CROSS 1000 THRESHOLD)!

THANKS FOR HELP

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Splitting column1 values into new column2 based on COUNT(column3)

daredavil

Programmer

gmmastros

Programmer

daredavil

Programmer

drlex

Technical User

daredavil

Programmer

drlex

Technical User

daredavil

Programmer

daredavil

Programmer

Similar threads

Part and Inventory Search

Sponsor