Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Please help! 2

Status
Not open for further replies.

cosmid

Programmer
Feb 14, 2008
73
US
I am stuck here. I have a dataset with a column of mixed city and states. Something like Charleston, West Virginia; NYC, NY; or Los Angeles, California.

What I need to do: I need to extract the cities from this column and generate a histogram for the cities. Like, if 5 of the record are from Charleston, 4 from LA, 0 from NYC, then the histogram would be a graph of these 3 cities with the numbers. This will give an overview of the number cases for each city.

Another problem: Some cities, like NYC, are in 2 different format: New York City, or NYC. How do I get around this?

Thanks a lot!
 
First, you need to take care of the city vs states. This you can do by using the scan function and have the delimiter be a comma. Once you have your city names in one column you need to take care of redundant naming conventions. Such as NYC, New York City, Manhatten etc... There really is no fast way to do this, you need to use the IF statements and rename all cities with identical geographic locations into one name. Here is an example.

Code:
if trim(upcase(new_city_name)) in ('NYC','NEW YORK','MANHATTEN')
then new_city_name = 'NEW YORK CITY';

Repeat this statement for as many cities that you want to consolidate into one.
Then run your graphing routine.

Klaz
 
Thanks for the help! Another thing with the dataset is that some of them are comma delimited and some of them are space delimited. I guess I have to break the dataset into 2 different dataset then use your method?

thanks again!
 
I got all the cities from the dataset now. Now I need to generate a histogram for certain cities, NYC, LA, Las Vegas, All other cities. How do I do that?
 
Ok...I think what I did to solve the problem is not the best way. I wrote different SQL statement for each city so that I can have the total number of count for each city that I wanted. Then store each city in a variable for that city. Then all I need to do is use a proc chart to get the graph that I wanted. I can get the cities and count using the SQL, but storing each city and the count in a variable..I am not sure how to do that.

Is there a better way to solve this?

thanks!
 
Ok...I just realized what I did was an extra and unnecessary step. The cities are already in one column. I just need to generate a histogram for it. The graph should include just the cities that I want and group the rest of the cities into just one bar. For example, if I just want to have Seattle, New York, and then other cities. Where other cities include Las Vegas, Los Angeles...etc. How do I do that?
 
This is some example code I whipped up in Enterprise Guide (if you've got that, it's the best way to whip up graphs etc really quickly).
Code:
PROC GCHART DATA=sashelp.class;
	VBAR3D	 Sex /
	SHAPE=BLOCK
        FRAME	TYPE=FREQ
	COUTLINE=BLACK
	RAXIS=AXIS1
	MAXIS=AXIS2
	GAXIS=AXIS3
	GAXIS=AXIS3
;
RUN; QUIT;

To group your cities, I would use a format.
Code:
proc format;
  value $city
    'Seattle' = 'Seattle'
    'New York' = 'New York'
    other = 'Other Cities'
    ;
quit;
Note "other" is not in quotes, it's a key word.
Then add the line
Code:
  format city $city.;
to the graph step.
There's a bunch of other graph steps around, you might need to search around a bit for something that fits your requirements if you don't like the above.
Do a code search on the SAS Support site, they've got loads of code examples you can use there.

Chris
Business Analyst, Code Monkey, Data Wrangler.
SAS Guru.
 
that format was very smart! I would never thought of that.

Thanks again!
 
No worries cosmid.
There's alot of tools in SAS which seem very simplistic, but which are actually very very powerful once you start to use them in less obvious ways.
Proc Format can also be used to build numeric bands for doing similar jobs with numeric data.

Chris
Business Analyst, Code Monkey, Data Wrangler.
SAS Guru.
 
Yes. After working for a few months, I realized that just reading the book is useless. I need to apply what I have read. I am still not very familiar with SAS and don't have a habit of thinking to use these simple functions. I am just glad to find this website and people like you! Thanks again! I hope I can start make some contributions later to this forum in the near future.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top