TopN from single data field

kt1065 · Sep 25, 2007

I have a database field ({tablename.Designation}) that can have multiple values separated by a variety of separators (e.g. "CPA, CFA, CFP"; "CPA"; "CPA,QQ"). I need a TopN (up to 20) count of the number of times each unique designation occurs.

I have been able to parse the data using formulas and now have the following in my report: @desig1, @desig2, @desig3, @desig4, @desig5. I am not interested in any designations beyond these 5 for any one database record so that helps to limit the data. The trouble is that the most frequent designations can change over time (i.e. the next time the report is generated).

Essentially, I have five different fields with the same/similar data. i've tried to group the data by TopN designations but I cannot get a count for each of the five new "fields".

It seems that it would be easier to count without parsing, but I don't know what the topN sort is if I don't parse. (The designations can be in any order within the database field {tablename.desgination}.

I need help! Thanks.
KT

CoSpringsGuy · Sep 25, 2007

are there an unlimited number of possible designations?

kt1065 · Sep 25, 2007

Theoretically yes. This is a data entry field on our website over which we have no control.

CoSpringsGuy · Sep 25, 2007

well this might help but only if I am tracking with your issue:

are you using a crosstab with this? that would seem to be your answer ....

rows:
{tablename.Designation}
summarized fields:
count of {tablename.Designation}
columns:
leave empty

then right click the crosstab and select group sort and you can TopN from there

let me know if that helps

lbass · Sep 25, 2007

Are you trying to count across records or only within each record? What is the content of your formulas? What is the purpose of the topN? Please show how your data looks for a few records, and also show how your report would then look based on the sample.

-LB

kt1065 · Sep 25, 2007

Thanks.

But {tablename.designation} gets parsed to five different fields. I have determined a unique separator for record and created @desig1; @desig2; etc. Where "CPA" for ex. can exist in any of the "fields" @desig1, @desig2...depending on the order that the individual entered his/her designation on the website.

I am considering a subreport to generate the topN values like so:
Whileprintingrecords;
shared stringvar Top1 := NthMostFrequent (1, {@desig1});
shared stringvar Top2 := NthMostFrequent (2, {@desig1});
... thru Top20

Can I then do something like this for each of the top 20?
if uppercase(shared stringvar Top1) in uppercase ({tablename.designation})
then numbervar count1 := count1 + 1
else
numbervar count1

Seems like a lot of formulas, but it appears to work.

lbass · Sep 25, 2007

You didn't respond to my questions and you didn't show the content of your formulas. It might be that creating separate formulas isn't the best approach.

-LB

kt1065 · Sep 25, 2007

LB,
Just saw your post.
I am counting across all records in the db. the field {tablename.designation} would have the following data :
record1: "CPA, CFA, QQ"
record2: "CFA"
record3: "CFA; QQ"
So you see that it is quite random. I have created the follwing from this data with formulas:
record1: @desig1 = "CPA"; @desig2 = "CFA"; @desig3 = "QQ"

the desired output would look like this:
CPA 3
CFA 2
QQ 1

KT

lbass · Sep 25, 2007

And the content of at least one of your formulas?

-LB

kt1065 · Sep 25, 2007

LB,
It's a complicated data set. I don't have a problem with the individual designations. I just need to count them. Nevertheless, here are the formulas as requested.

{@desig1}:
if {@one desig} = "" then
trim((Split ({Name.DESIGNATION}, {@field_sep}) [1]) ) //trim off any excess characters such as spaces
else
{@one desig}

{@onedesig}:
if ({@field separator} = "unknown" or ({@field separator} = "space" and len({Name.DESIGNATION}) < 6 ) )
//assume that designation less than 6 char is one designation - proved by data existing at 28Aug2007
then {Name.DESIGNATION}
else
""

{@field separator}:
if {Name.DESIGNATION} like ["*,*"] then "CSV"
else
if {Name.DESIGNATION} like ["*;*"] then "semicolon"
else
if {Name.DESIGNATION} like ["*/*"] then "slash"
else
if {Name.DESIGNATION} like ["*-*"] then "hyphen"
else
if {Name.DESIGNATION} like ["* *"] then "space"
else
"unknown"

KT

kt1065 · Sep 26, 2007

Here is what I did:

To identify the Top 20 of THE FIRST DESIGNATION listed, I have placed this formula in the detail section:
//@TopN
Whileprintingrecords;
stringvar Top1 := NthMostFrequent (1, {@desig1});
stringvar Top2 := NthMostFrequent (2, {@desig1});
stringvar Top3 := NthMostFrequent (3, {@desig1});
stringvar Top4 := NthMostFrequent (4, {@desig1});
…
stringvar Top20 := NthMostFrequent (20, {@desig1});

To count the number of times that each of these Top 20 occur in the entire data set, 20 formulas different formulas are stacked in detail section b AND in the report footer. Here is an example:

//@CountVar1
whileprintingrecords;
if uppercase(stringvar Top1) = uppercase ({@desig1})
then numbervar CountTop1 := CountTop1 + 1
else
if uppercase(stringvar Top1) = uppercase ({@desig2})
then numbervar CountTop1 := CountTop1 + 1
else
if uppercase(stringvar Top1) = uppercase ({@desig3})
then numbervar CountTop1 := CountTop1 + 1
else
if uppercase(stringvar Top1) = uppercase ({@desig4})
then numbervar CountTop1 := CountTop1 + 1
else
if uppercase(stringvar Top1) = uppercase ({@desig5})
then numbervar CountTop1 := CountTop1 + 1
else
numbervar CountTop1
(i used '=' rather than 'in' here due to the nature of the desinations such as "RE")

Labels are placed in the footer next to @CountVar[x]:

//@Top1
whileprintingrecords;
UpperCase (stringvar Top1)

Results were tested against an Excel export. I'm not happy with with it, but given the nature of the database, this works well enough. The only issue is that due to the manual placement of the twenty formulas, the results are sorted by the @TopN formula above which look only to the first displayed designation on each record. A simple explanation that records are sorted by most popular first designation will have to suffice.

BTW, I find the forums very helpful in solving my reporting dilemmas. I am not a programmer but the explanations are often quite easy to follow...except for this one perhaps. Thanks to all!
KT

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

TopN from single data field

kt1065

Technical User

CoSpringsGuy

IS-IT--Management

kt1065

Technical User

CoSpringsGuy

IS-IT--Management

lbass

Technical User

kt1065

Technical User

lbass

Technical User

kt1065

Technical User

lbass

Technical User

kt1065

Technical User

kt1065

Technical User

Similar threads

Part and Inventory Search

Sponsor