So you want to name your objects. Your tables, your views, your functions. And there are a lot of people out there who name them with what the object type is. Maybe even with more information. For example:
tblOrders
Table_Orders
sp_OrderSend (NEVER prefix SPs with "sp_" because the server will look for a system SP first!)
Proc_OrderSend
OrderStatusFunc
fn_OrderStatus
fn_t_OrderStatus
even worse:
joe_sp_OrderSend
mike_sp_OrderCancel
david_fn_OrderStatus
Then there are column names:
ID
ProductID
keyProductID
pkeyProductID
pkProductID
pkProductIDCode
frnkeyProductID
fkProductID
So does this stuff help at all? Is it useful to put the type or relationship in the name of objects? Can you leave the name off the ID column in your tables? Should you put more information in there? What does it matter, anyway, as these are mere naming conventions, right? Just pick something, and stick with it!
I agree that consistency is important. But consistently helpful is a far improvement over consistently something-less-than-that. Consistently unnecessary is extra work and frustration. Consistently interfering impairs your ability and speed. Consistently mediocre is not the greatest.
No, overall, I think it is best to take a minimalistic approach. Yes, this is just my opinion. I am just one person, and some people have different approaches. But I have had experience in several shops and I have designed databases myself. I have a side database project outside my regular work where I am developing the entire thing myself, front end and back end, the the whole works. I'm sharing with you the benefit of my experience. I hope that these make sense to you, but whatever you do pick, you should have clear and compelling reasons for what you do. Notice the compelling part. Your default should be to avoid clutter and only add things if there is a materially demonstrable reason to violate that rule.
Now, first of all, much of my advice is meaningless if you're using a GUI where one performs the less taxing task of recognition instead of the recall required to write queries in text. If you are writing queries in text, more power to you, and if you keep doing it I predict you'll eventually see the sense of my recommendations.
But if you're using a GUI to write queries you'll only reach a certain level of ability, never stepping into the next level of professional skill and seasoned experience in the SQL realm. I haven't used a GUI to write a serious query in a very long time (I might have started a couple in an Access database but even then switched to SQL view). I recently helped someone who posted his query that was obviously built in a GUI and it had about 10 left joins. I seriously doubt they all needed to be left joins, which means he was killing performance for no reason.
So if you are writing queries in a text editor, here are some thoughts for you:
• Lose the underscores. They clutter up the screen. They take the shift key to type and also your pinky finger which has to move two rows up from the home row. And using CamelCase is just as clear and saves space and that pinky traversing. And if you're really feeling lazy you don't even have to hit the shift key.
order_id
customer_name
device_code_reason
OrderID
CustomerName
DeviceCodeReason
Look at it this way: if you can train your eyes to scan either kind of naming convention just as efficiently, why not use the one that involves less typing and takes less space on the screen?
• The prefix/suffix naming convention is not always understood or followed. In one shop I worked in they named tables with prefixes: tbl for base tables, tlk for lookup tables, and tin for "intersect" tables. But no one really knew what they were for sure. "tin" tables were supposed to be transactional tables that had constantly changing data such as orders and transactions. "tbl" tables were supposed to be things like customers and products. But the problem was that many tables didn't fit a clear definition, and people were careless, or came along years after the creator of the database (who yes used a GUI database creation program), and they got named wrong. So now after all that silliness and extra typing, there was a "tin" table that was really a "tbl" table and a "tlk" table that was really a "tin" table after some changes, and ... so on and so forth. Just leave off that junk.
• You'll eventually become completely sick of typing the extra characters over and over again for NO GOOD REASON. And you'll have these prefixes so wrapped up in your code and database that ripping them out becomes impossible. All those extra letters will come to be so much useless froofroo to you. At the same shop as tbl/tin/tlk they really named columns keyOrderID and frnkeyOrderID. They added nine characters of typing to EVERY SINGLE JOIN I ever had to write. After a year of working there, the pain was no less. It took me about 2 weeks to learn the primary and foreign keys of that database. The other fifty weeks I wished the database designer had left well enough alone.
• Naming is something that is so much more important than most people seem to think. What does it matter what we call things, as long as we're "consistent," right? I disagree. I need to come up with some good examples, but my experience is that the naming in a database the single most important thing that structures how programmers relate to the data and that determines their final speed and facility working with it in the long run.
Mislabeling things is like putting up signs at the cliff edge that say "keep going" and "this is the right way" when it's not. Label things appropriately.
If that sounds like a lot of bugaboo to you, then I can only offer that I think experience will teach the same lesson.
• Scanning and comprehension speed suffer when extra things are tacked on. When you look at a stored procedure or query, if you yourself weren't just working on it yesterday, then you have to digest it to understand it. The more prefix_blob_description_gorp you have in there, the harder it is to see what's going on. The assumption here is, you know your database like the back of your hand. Will that extra stuff speed you up now or slow you down? Who should you code for, the least common denominator or a reasonable level of facility? In fact, this brings up another important point:
• You do not want to give that new developer, who is unfamiliar with your database, a prematurely quick sense of competence and ability in your database. By doing this, you increase the chances that he or she will make some serious mistake. This is the opposite mistake of mislabeling. Now the path over the edge of the cliff is littered with so many friendly and distracting signals that the poor new person gets the signal to "keep going" when the correct response ought to be, "wait, I'm not sure I am doing the right thing." I'm all for helping new developers get going as fast as possible. But they ought to do this by studying the schema and practicing with it. Don't make it so apparently easy that even an ignorant person will believe he knows all about your database.
• The information is unnecessary and ultimately makes developers remember more information, not less. It sounds at first like a great way to expose information about the tables and objects. But the fact is that the period of time where a new developer isn't familiar with the database is actually very short. So for the benefit of two months of instant answers to the beginner, everyone else in the shop is suffering through unnecessary baggage.
• The implementation can change. What if you decide to change the structure of one of your tables, and to support your older versions of software, you build the new table with a new name and convert your old table to a view? Now you have tbl_Whatever that's a view. Whoops. And what if you need to use that really important sp, only you can't remember who wrote it? Now you have to look it up. What a time waster. What if you can't remember if it's an inline table function or a table-variable function? More wheels spinning.
If these seem farfetched, they are not. I have worked with databases that had "tbl" objects that were views and believe me it was a nightmare remembering which was which. Who cares what it is! Views and tables are used in exactly the same way so why call them something different?
• Don't label things with each developer's name. Why would you do that? Do all the developers write different versions of the same SPs? Are they not production SPs but simply a sort of learning environment where each can write whatever he wants so it's important to keep the names straight? In that case, why do the objects have the same owners? Just use a different owner name and don't put your name in the object. But wait, why are they even in the same database? Why does it possibly matter who wrote something in the function of the database as a whole? Things should be named according to what they are or do, not who worked on it first or last.
The effort of naming is not a small one. In fact, I often use a thesaurus when designing a database. I have at times spent (cumulatively over the project) hours working out the correct naming of things. But when I was done, WOW how things flowed. I've worked with databases where the detail table wasn't, and the parent table wasn't, and the order table wasn't, and the Determination column wasn't (it was DeterminationMethod). Every single one of those is a roadblock to proper usage, even to the developer who's been working with that database for years.
The way I do it is to never add anything to table names, and to use carefully chosen words for my functions and SPs that make it clear what they are. For example, all my SPs are either SubjectVerb or VerbSubject, consistently throughout the database (still experimenting with which I like better). Functions have words that make it clear what they do. NumberToHex is a function. OrderNumberToOrderID is a function. ProductDetailCount is a function, ProductDetail is a table, and ProductCreate is a stored procedure. Develop your own pattern. I can go through my own code so fast when I've been careful like this. The meaning nearly leaps out at me.
If you must use a prefix or a suffix (I lean toward suffixes myself because I like things to group by subject instead of function) why not a single character? Tables can still do without any fluff at all. Then your other objects are still distinguishable:
OrderNumber_OrderIDF
• Think of it this way: you're trying to build a race car--a sleek, elegant, efficient thing that can go as fast as possible and be interfaced with well. What good is a racecar that can theoretically go 200mph if the steering wheel has sharp spikes all over it and the driver has to spend extra time avoiding them? He won't be spending as much time at top speed as he ought to be.
If I was offered a consulting job on a database that had every stored procedure and object using different names based on who had written them, I would refuse the work for fear of going absolutely nuts. If I was forced by my situation to take the work, I would curse every day the people who decided that putting their names on their stuff would somehow have any kind of long-term value and I would quit the moment I found something else. Long after Shannon and Victor and Felix are gone from the company, poor developers are toiling away trying to remember if that was a felix function or a victor view or perhaps it was a shannon function and a felix view! GAHHH!
I prefer to standardize things in this way: If there is an ID column, it shares the same name as the table, + ID. Now there's no confusion about what the name is. Now if you have a query with many joins and you need to use ID you're not getting confused.
• You will eventually put the wrong alias on some join. You may not even notice because the query parses fine and you save your SP and off you go, but whoops, you got an effective cartesian product because your join condition referenced the wrong ID. Or just the wrong resultset. What a confusing mess. Much better, when you accidentally put D.ProductID instead of P.ProductID, for the compiler to complain that the table aliased by D has no such column. D.ID though... that would have compiled just fine and given the wrong results farther down the path, maybe even after release. It's much better to catch errors early.
• Selecting is not where the problem is. Yes, it's easier to say SELECT ID FROM Table. Fewer characters. But the real problem is in JOINs. The problem is in old resultsets you saved in Excel that you don't know what table they came from because they so unhelpfully say ID. You don't know what some error message means that a user reported to you because all he remembered was the main column that the constraint violated and he says "there was a foreign key constraint in column ID!" Great. That's every table in your database.
I don't know how to make this point as strongly as it needs to be made. Perhaps some of the other experienced SQL people here can chip in their opinions, if they even make it in to read this thread.
Maybe I'll come up with a compelling example later. For now, all I can tell you is that naming the same thing differently in different places is, long-term, problematic.
tblOrders
Table_Orders
sp_OrderSend (NEVER prefix SPs with "sp_" because the server will look for a system SP first!)
Proc_OrderSend
OrderStatusFunc
fn_OrderStatus
fn_t_OrderStatus
even worse:
joe_sp_OrderSend
mike_sp_OrderCancel
david_fn_OrderStatus
Then there are column names:
ID
ProductID
keyProductID
pkeyProductID
pkProductID
pkProductIDCode
frnkeyProductID
fkProductID
So does this stuff help at all? Is it useful to put the type or relationship in the name of objects? Can you leave the name off the ID column in your tables? Should you put more information in there? What does it matter, anyway, as these are mere naming conventions, right? Just pick something, and stick with it!
I agree that consistency is important. But consistently helpful is a far improvement over consistently something-less-than-that. Consistently unnecessary is extra work and frustration. Consistently interfering impairs your ability and speed. Consistently mediocre is not the greatest.
No, overall, I think it is best to take a minimalistic approach. Yes, this is just my opinion. I am just one person, and some people have different approaches. But I have had experience in several shops and I have designed databases myself. I have a side database project outside my regular work where I am developing the entire thing myself, front end and back end, the the whole works. I'm sharing with you the benefit of my experience. I hope that these make sense to you, but whatever you do pick, you should have clear and compelling reasons for what you do. Notice the compelling part. Your default should be to avoid clutter and only add things if there is a materially demonstrable reason to violate that rule.
Now, first of all, much of my advice is meaningless if you're using a GUI where one performs the less taxing task of recognition instead of the recall required to write queries in text. If you are writing queries in text, more power to you, and if you keep doing it I predict you'll eventually see the sense of my recommendations.
But if you're using a GUI to write queries you'll only reach a certain level of ability, never stepping into the next level of professional skill and seasoned experience in the SQL realm. I haven't used a GUI to write a serious query in a very long time (I might have started a couple in an Access database but even then switched to SQL view). I recently helped someone who posted his query that was obviously built in a GUI and it had about 10 left joins. I seriously doubt they all needed to be left joins, which means he was killing performance for no reason.
So if you are writing queries in a text editor, here are some thoughts for you:
• Lose the underscores. They clutter up the screen. They take the shift key to type and also your pinky finger which has to move two rows up from the home row. And using CamelCase is just as clear and saves space and that pinky traversing. And if you're really feeling lazy you don't even have to hit the shift key.
order_id
customer_name
device_code_reason
OrderID
CustomerName
DeviceCodeReason
Look at it this way: if you can train your eyes to scan either kind of naming convention just as efficiently, why not use the one that involves less typing and takes less space on the screen?
• The prefix/suffix naming convention is not always understood or followed. In one shop I worked in they named tables with prefixes: tbl for base tables, tlk for lookup tables, and tin for "intersect" tables. But no one really knew what they were for sure. "tin" tables were supposed to be transactional tables that had constantly changing data such as orders and transactions. "tbl" tables were supposed to be things like customers and products. But the problem was that many tables didn't fit a clear definition, and people were careless, or came along years after the creator of the database (who yes used a GUI database creation program), and they got named wrong. So now after all that silliness and extra typing, there was a "tin" table that was really a "tbl" table and a "tlk" table that was really a "tin" table after some changes, and ... so on and so forth. Just leave off that junk.
• You'll eventually become completely sick of typing the extra characters over and over again for NO GOOD REASON. And you'll have these prefixes so wrapped up in your code and database that ripping them out becomes impossible. All those extra letters will come to be so much useless froofroo to you. At the same shop as tbl/tin/tlk they really named columns keyOrderID and frnkeyOrderID. They added nine characters of typing to EVERY SINGLE JOIN I ever had to write. After a year of working there, the pain was no less. It took me about 2 weeks to learn the primary and foreign keys of that database. The other fifty weeks I wished the database designer had left well enough alone.
• Naming is something that is so much more important than most people seem to think. What does it matter what we call things, as long as we're "consistent," right? I disagree. I need to come up with some good examples, but my experience is that the naming in a database the single most important thing that structures how programmers relate to the data and that determines their final speed and facility working with it in the long run.
Mislabeling things is like putting up signs at the cliff edge that say "keep going" and "this is the right way" when it's not. Label things appropriately.
If that sounds like a lot of bugaboo to you, then I can only offer that I think experience will teach the same lesson.
• Scanning and comprehension speed suffer when extra things are tacked on. When you look at a stored procedure or query, if you yourself weren't just working on it yesterday, then you have to digest it to understand it. The more prefix_blob_description_gorp you have in there, the harder it is to see what's going on. The assumption here is, you know your database like the back of your hand. Will that extra stuff speed you up now or slow you down? Who should you code for, the least common denominator or a reasonable level of facility? In fact, this brings up another important point:
• You do not want to give that new developer, who is unfamiliar with your database, a prematurely quick sense of competence and ability in your database. By doing this, you increase the chances that he or she will make some serious mistake. This is the opposite mistake of mislabeling. Now the path over the edge of the cliff is littered with so many friendly and distracting signals that the poor new person gets the signal to "keep going" when the correct response ought to be, "wait, I'm not sure I am doing the right thing." I'm all for helping new developers get going as fast as possible. But they ought to do this by studying the schema and practicing with it. Don't make it so apparently easy that even an ignorant person will believe he knows all about your database.
• The information is unnecessary and ultimately makes developers remember more information, not less. It sounds at first like a great way to expose information about the tables and objects. But the fact is that the period of time where a new developer isn't familiar with the database is actually very short. So for the benefit of two months of instant answers to the beginner, everyone else in the shop is suffering through unnecessary baggage.
• The implementation can change. What if you decide to change the structure of one of your tables, and to support your older versions of software, you build the new table with a new name and convert your old table to a view? Now you have tbl_Whatever that's a view. Whoops. And what if you need to use that really important sp, only you can't remember who wrote it? Now you have to look it up. What a time waster. What if you can't remember if it's an inline table function or a table-variable function? More wheels spinning.
If these seem farfetched, they are not. I have worked with databases that had "tbl" objects that were views and believe me it was a nightmare remembering which was which. Who cares what it is! Views and tables are used in exactly the same way so why call them something different?
• Don't label things with each developer's name. Why would you do that? Do all the developers write different versions of the same SPs? Are they not production SPs but simply a sort of learning environment where each can write whatever he wants so it's important to keep the names straight? In that case, why do the objects have the same owners? Just use a different owner name and don't put your name in the object. But wait, why are they even in the same database? Why does it possibly matter who wrote something in the function of the database as a whole? Things should be named according to what they are or do, not who worked on it first or last.
The effort of naming is not a small one. In fact, I often use a thesaurus when designing a database. I have at times spent (cumulatively over the project) hours working out the correct naming of things. But when I was done, WOW how things flowed. I've worked with databases where the detail table wasn't, and the parent table wasn't, and the order table wasn't, and the Determination column wasn't (it was DeterminationMethod). Every single one of those is a roadblock to proper usage, even to the developer who's been working with that database for years.
The way I do it is to never add anything to table names, and to use carefully chosen words for my functions and SPs that make it clear what they are. For example, all my SPs are either SubjectVerb or VerbSubject, consistently throughout the database (still experimenting with which I like better). Functions have words that make it clear what they do. NumberToHex is a function. OrderNumberToOrderID is a function. ProductDetailCount is a function, ProductDetail is a table, and ProductCreate is a stored procedure. Develop your own pattern. I can go through my own code so fast when I've been careful like this. The meaning nearly leaps out at me.
If you must use a prefix or a suffix (I lean toward suffixes myself because I like things to group by subject instead of function) why not a single character? Tables can still do without any fluff at all. Then your other objects are still distinguishable:
OrderNumber_OrderIDF
• Think of it this way: you're trying to build a race car--a sleek, elegant, efficient thing that can go as fast as possible and be interfaced with well. What good is a racecar that can theoretically go 200mph if the steering wheel has sharp spikes all over it and the driver has to spend extra time avoiding them? He won't be spending as much time at top speed as he ought to be.
If I was offered a consulting job on a database that had every stored procedure and object using different names based on who had written them, I would refuse the work for fear of going absolutely nuts. If I was forced by my situation to take the work, I would curse every day the people who decided that putting their names on their stuff would somehow have any kind of long-term value and I would quit the moment I found something else. Long after Shannon and Victor and Felix are gone from the company, poor developers are toiling away trying to remember if that was a felix function or a victor view or perhaps it was a shannon function and a felix view! GAHHH!
• You have certainly chosen a kind of standardization. But you've chosen a kind that makes all the columns in different tables the same name. For a shop that wants to differentiate each thing by putting the type of it in the name, now you're departing radically from that philosophy by calling all the IDs the same thing. From the beginner/new developer perspective, it seems to make things easier in some ways--no looking up column names when selecting, right? But it's not easier, it's harder. If you ever get the experience of doing it both ways in a large and busy production database (that is, ID vs. TableNameID), I suspect you'll eventually agree (unless you've become dogmatic about it by then and don't really care about the actual function or efficency any more).>>Call the same column the same thing in each table. Don't call it ID in one table and BlahID in another. Call it BlahID everywhere.
Again, some might disagree... Using a prefix might be of help sometimes - when writing SQL you get rid of table names if you have odtID as order detail PK and prdID as product PK and you need both of them in a view/sp/function.
To standardize the usage of ID argument, there is no difficulty in naming it @ID in all procs/functions.
I prefer to standardize things in this way: If there is an ID column, it shares the same name as the table, + ID. Now there's no confusion about what the name is. Now if you have a query with many joins and you need to use ID you're not getting confused.
• You will eventually put the wrong alias on some join. You may not even notice because the query parses fine and you save your SP and off you go, but whoops, you got an effective cartesian product because your join condition referenced the wrong ID. Or just the wrong resultset. What a confusing mess. Much better, when you accidentally put D.ProductID instead of P.ProductID, for the compiler to complain that the table aliased by D has no such column. D.ID though... that would have compiled just fine and given the wrong results farther down the path, maybe even after release. It's much better to catch errors early.
• Selecting is not where the problem is. Yes, it's easier to say SELECT ID FROM Table. Fewer characters. But the real problem is in JOINs. The problem is in old resultsets you saved in Excel that you don't know what table they came from because they so unhelpfully say ID. You don't know what some error message means that a user reported to you because all he remembered was the main column that the constraint violated and he says "there was a foreign key constraint in column ID!" Great. That's every table in your database.
I don't know how to make this point as strongly as it needs to be made. Perhaps some of the other experienced SQL people here can chip in their opinions, if they even make it in to read this thread.
Maybe I'll come up with a compelling example later. For now, all I can tell you is that naming the same thing differently in different places is, long-term, problematic.