
XML field vs several columns

Status
Not open for further replies.

iwease (IS-IT--Management) · Sep 4, 2006 · 104 posts · CA
While this may not sound like good design practice, I am curious how it would affect performance.

Suppose I have 10-30 columns in a data table, and in almost every row some of the columns will be NULL. Alternatively, I could have just one XML column and fill it only with the data that pertains to a record (i.e., I don't need to store NULLs). Which of the two will perform faster for selects? Are there other things I should consider? (I haven't experimented much with XML columns, but I think I can still do aggregations and other common functions.)
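For concreteness, the two schemas being compared might look like this (a sketch; table and column names are hypothetical, using the stock-quote example that comes up later in the thread):
Code:
-- Option 1: one column per attribute, most of them NULL for most rows
CREATE TABLE QuotesWide (
    Quote   varchar(10)   NOT NULL PRIMARY KEY,
    [Open]  decimal(18,4) NULL,
    High    decimal(18,4) NULL,
    Low     decimal(18,4) NULL,
    [Close] decimal(18,4) NULL
    -- ... up to 30 such columns
)

-- Option 2: a single XML column holding only the attributes that exist
CREATE TABLE QuotesXml (
    Quote     varchar(10) NOT NULL PRIMARY KEY,
    QuoteData xml         NOT NULL
)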
 
OK, I have one question back:
What query would you write just to SUM() a value from one field in that XML? With a usual table it would be:
SELECT SUM(FieldHere)
...

But with XML?
Which query do you think will be faster?
I won't even mention JOINs, WHEREs, etc.
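For comparison, a sketch of what that SUM would look like against an XML column in SQL Server 2005, assuming a hypothetical table QuotesXml whose xml column QuoteData holds documents like <quote><Volume>1200</Volume></quote> — every value has to be shredded out with XQuery before it can be aggregated:
Code:
-- Plain column version: SELECT SUM(Volume) FROM QuotesWide
-- XML version: extract the value from each document, then aggregate
SELECT SUM(q.QuoteData.value('(/quote/Volume)[1]', 'decimal(18,4)'))
FROM QuotesXml AS q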

Borislav Borissov
VFP9 SP1, SQL Server 2000/2005.
MVP VFP
 
Yup. But this isn't a usual table. As I stated, it is a way of handling the situation where there is a lot of NULL data. I may not have phrased it well in the first posting, and the aggregation was just an example. What I'm wondering about is the difference between using a single XML field to represent several different pieces of data vs. several columns to represent the data. The reason I am considering XML is that, often, many of those columns will be NULL.

Ah, here is a good example. Suppose I have a database full of stock quotes. For each quote, there could be as many as 30 pieces of data: open, high, low, close, volume, P/E ratio, earnings estimates, debt/equity ratio, analyst, etc. For many stocks, much of this data won't be available from the site we are getting the data from (debt/equity ratio, analysts, etc.).
So, rather than having a bunch of NULL columns for each quote, does it make sense to use XML?

I wouldn't imagine that an XML-based join would be faster, but would it be that much worse? Especially if it made sense in terms of structural simplicity. It is annoying to write selects against a 30+ column table.
 
Instead of a table with 30 fields you could have a table with 3 fields:
Quote
TypeOfRecord
Value

and when a value is NULL you just don't insert that record, so a quote with only one or two known types will have only one or two records in the table.
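A sketch of that three-field table (names are illustrative, not from the thread):
Code:
CREATE TABLE QuoteValues (
    Quote        varchar(10)   NOT NULL,
    TypeOfRecord int           NOT NULL,  -- 1 = Open, 2 = Close, 3 = High, ...
    Value        decimal(18,4) NOT NULL,
    PRIMARY KEY (Quote, TypeOfRecord)
)

-- A quote with only two known values stores only two rows:
INSERT INTO QuoteValues VALUES ('MSFT', 1, 27.10)
INSERT INTO QuoteValues VALUES ('MSFT', 3, 27.55)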

Borislav Borissov
VFP9 SP1, SQL Server 2000/2005.
MVP VFP
 
I know; I thought about that and I think it's a good idea. That method works well in most situations, makes for a nicely normalized DB, and makes it easy to add new types.

But I find it troublesome when I want to retrieve the quote data in one big row (for viewing in a report or table): I have to do 30 joins (one for each TypeOfRecord).

 
Don't put the cart before the horse. Just for one report you want to make your life harder? You could always generate a 30-column query from the table I suggested:
Code:
SELECT Quote,
       MAX(CASE WHEN TypeOfRecord = 1
                THEN Value END) AS [Open],   -- stays NULL when no row of this type exists
       MAX(CASE WHEN TypeOfRecord = 2
                THEN Value END) AS Closed,
       MAX(CASE WHEN TypeOfRecord = 3
                THEN Value END) AS High,
.....
FROM ThatTable
GROUP BY Quote
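On SQL Server 2005, the same cross-tab can also be written with the PIVOT operator (a sketch, assuming the three-field table suggested above; the type numbers and aliases are illustrative):
Code:
SELECT Quote, [1] AS [Open], [2] AS Closed, [3] AS High
FROM (SELECT Quote, TypeOfRecord, Value FROM ThatTable) AS src
PIVOT (MAX(Value) FOR TypeOfRecord IN ([1], [2], [3])) AS p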

Borislav Borissov
VFP9 SP1, SQL Server 2000/2005.
MVP VFP
 
Maybe you should consider the Entity-Attribute-Value (EAV) model. If that's too big for you (with all its drawbacks and complexity), then go with the column solution. Don't store stuff in XML just to "save space" or reduce the number of NULL columns.

bborrisov is exactly right in his recommendations to you. You can even write a stored procedure which would automatically generate the view that converts your "tall" table to the "wide" one. Every time you add or remove a TypeOfRecord value, rerun the SP and you're done.
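A minimal sketch of such a procedure, using dynamic SQL to rebuild the wide view from whatever TypeOfRecord values currently exist (the table, view, and generated column names are all hypothetical):
Code:
CREATE PROCEDURE RebuildWideView
AS
BEGIN
    DECLARE @cols nvarchar(max), @sql nvarchar(max)

    -- Build one MAX(CASE ...) column per distinct type
    SELECT @cols = ISNULL(@cols + ', ', '') +
           'MAX(CASE WHEN TypeOfRecord = ' + CAST(TypeOfRecord AS varchar(10)) +
           ' THEN Value END) AS [Type' + CAST(TypeOfRecord AS varchar(10)) + ']'
    FROM (SELECT DISTINCT TypeOfRecord FROM QuoteValues) AS t

    SET @sql = 'CREATE VIEW QuotesWideView AS SELECT Quote, ' + @cols +
               ' FROM QuoteValues GROUP BY Quote'

    -- Drop and re-create so the view always matches the current types
    IF OBJECT_ID('QuotesWideView', 'V') IS NOT NULL
        DROP VIEW QuotesWideView
    EXEC (@sql)
END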

Cum catapultae proscriptae erunt tum soli proscripti catapultas habebunt.
 
OK, I've used the EAV model before, but I never realized it was a standard pattern (though it makes sense that it would be). In this case, 30 separate pieces of data doesn't seem sparse enough to justify its use.

I feel like I am always battling this issue. Perhaps 30 is sparse enough... I will have to look into it more. That was a great overview from Wikipedia; they gave some really good examples. Thanks.

I guess the new SQL XML datatype is only good for specific cases.
 