
Do we Need Databases Any More?


BNPMike

Databases came in to store data in an accessible way for corporate processes to share. To a degree, the database also kept the data consistent through triggers, assertions, referential integrity, typing and so on. At that point applications were separate from the data and made calls using a simple logical language.

Later came the current predominant paradigm of object orientation. Now the main feature of OO is that you encapsulate data. If you want to record or retrieve data you need to call an object which makes sure you can only do the correct things with the data involved. Early systems like Smalltalk would hold all their data as an ever-growing run-time population of objects. When OO took off with Java, however, people reverted to old habits and stuck all their data into Oracle etc. Databases, though, encourage people to access objects' data without going through the objects that were lovingly designed to protect it.
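To make that concrete, here's a minimal sketch in Java (the Account class is made up for illustration, not taken from any real system) of what the object is supposed to guarantee, and what direct access to a shared table lets you bypass:

// Hypothetical Account class: the balance can only change through methods
// that keep it valid.
public class Account {

    private long balancePence;   // the data the object is designed to protect

    public Account(long openingBalancePence) {
        if (openingBalancePence < 0) {
            throw new IllegalArgumentException("Opening balance cannot be negative");
        }
        this.balancePence = openingBalancePence;
    }

    public void deposit(long pence) {
        if (pence <= 0) throw new IllegalArgumentException("Deposit must be positive");
        balancePence += pence;
    }

    public void withdraw(long pence) {
        if (pence <= 0 || pence > balancePence) {
            throw new IllegalArgumentException("Invalid withdrawal");
        }
        balancePence -= pence;
    }

    public long balance() {
        return balancePence;
    }

    public static void main(String[] args) {
        Account a = new Account(10_000);
        a.deposit(2_500);
        a.withdraw(500);
        System.out.println(a.balance());   // 12000
        // a.balancePence = -1;            // won't compile: the field is private
        // Once the same balance sits in a shared table, though, any client can run
        // "UPDATE accounts SET balance = -1" and bypass all of the above.
    }
}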

One big factor of databases is that they use disc to persist the data. Nowadays you have huge real memory space and you can buy flash memory for peanuts. The gigabytes you need for a typical corporate 'database' can come out of your pocket money. With flash you don't need to worry: power down, power up and your data is still there. Crashes - well, flash must be hundreds of times more reliable than disc.

Currently, of course, it's not straightforward to store your data in main memory. Programming systems tend to assume there's just a stack and a heap; they're not built to deal with an additional "flash heap" (I'm assuming you still need to keep a lot of your application in volatile RAM, as flash can only be written a limited number of times). However, it's not a big step to move towards keeping all your data as directly accessible objects rather than serializing them back and forth to disc or other filestores.
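Roughly what I mean by "serializing back and forth" versus keeping objects directly accessible - a toy sketch in Java, with the Customer class invented for illustration:

import java.io.*;
import java.util.HashMap;
import java.util.Map;

public class PersistenceStyles {

    static class Customer implements Serializable {
        String name;
        Customer(String name) { this.name = name; }
    }

    public static void main(String[] args) throws Exception {
        Customer c = new Customer("Smith");

        // Style 1: serialise the object out to a filestore and read it back later.
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream("customer.bin"))) {
            out.writeObject(c);
        }
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream("customer.bin"))) {
            Customer restored = (Customer) in.readObject();
            System.out.println(restored.name);
        }

        // Style 2: keep the live object in an in-memory store and just use it.
        // If that store sat in non-volatile memory (the speculative "flash heap"
        // above), there would be nothing to serialise at all.
        Map<String, Customer> objectStore = new HashMap<>();
        objectStore.put("Smith", c);
        System.out.println(objectStore.get("Smith").name);
    }
}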

So why are we still storing data in databases? Surely we've given enough money to Larry Ellison?

 
I doubt the MAIN feature of OO is to encapsulate data. Even with the idea of normalization - storing each piece of data only once in your database to prevent redundancy - I don't see that this means each piece of data will belong to exactly one object which encapsulates it. Perhaps I'm wrong here and this would be the case if I "normalized" my classes.

Nevertheless, there actually are OO databases that work as you lay out here. Perhaps take a look at
A paradigm change like that always takes longer to happen. Keeping your data separate from the objects is still what's done there; it's just much easier to persist objects and retrieve them back with OO databases. I also think you can't do statistical things and data warehousing with objects as easily as with the pure data. No doubt storage will change to faster memory, like it always has in the past.

Bye, Olaf.
 
Apparently I'm not the first person to question the role of databases. This is from
"In fact there is an intrinsic tension between the notion of encapsulation, which hides data and makes it available only through a published set of interface methods, and the assumption underlying much database technology, which is that data should be accessible to queries based on data content rather than predefined access paths. Database-centric thinking tends to view the world through a declarative and attribute-driven viewpoint, while OOP tends to view the world through a behavioral viewpoint, maintaining entity-identity independently of changing attributes. This is one of the many impedance mismatch issues surrounding OOP and databases."

The new twist on this is the availability of very large non-volatile memory spaces.

Microsoft LINQ also seems to be a solution to providing objects with database-like capabilities.
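For anyone who hasn't seen it, the idea is roughly "SQL-ish queries written directly against in-memory objects". A loose analogue in Java (using the Streams API rather than LINQ itself, with a made-up Employee record and sample data):

import java.util.List;

public class ObjectQuery {

    record Employee(String name, String department, int salary) {}

    public static void main(String[] args) {
        List<Employee> employees = List.of(
            new Employee("Adams", "IT", 42_000),
            new Employee("Baker", "HR", 35_000),
            new Employee("Clark", "IT", 51_000));

        // Roughly: SELECT name FROM employees WHERE department = 'IT' ORDER BY salary DESC
        List<String> itStaff = employees.stream()
            .filter(e -> e.department().equals("IT"))
            .sorted((a, b) -> Integer.compare(b.salary(), a.salary()))
            .map(Employee::name)
            .toList();

        System.out.println(itStaff);   // [Clark, Adams]
    }
}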
 
Hi Mike,

I see that same impedance mismatch, that's true. And O/R mapping, common practice today, is not really good at solving that problem.

There's nothing wrong with the idea of storing objects in a database instead of only their attributes. But that would not mean databases vanish, would it? It may change what a database will be. For multi-user applications you'd still need a central server, and instead of serving just the data it would serve objects from, let's call them, "object stores".

That would perhaps shift the problems of application enhancements to altering such an "object store". Instead of updating data to reflect those enhancements, you'd be updating stored objects to reflect class changes.
And I think that would boil down to migrating attributes into new object structures, wouldn't it? Perhaps you'd have a transition between object generations with some new kind of inheritance.

Well, as you mention LINQ, Microsoft addresses that object-oriented approach with their Entity Framework:
Bye, Olaf.
 
Mike,

I think this is two separate questions, rolled into one.

Q1 Why do we need databases?
Q2 With memory so cheap, why do we need disks?

Dealing with q2 first, we clearly don't need disks, and never have. It would be extremely reliable to use a hammer and chisel to engrave data on to stone tablets. This method has managed to successfully store data, without corruption, for literally thousands of years, but the write time is rather poor.

Nonetheless, disks are a reliable, well-proven, cost-effective and reasonably fast way of storing and retrieving data. When combined with RAM buffering they offer exceptional performance for their cost. Once memory gets cheap enough, the rotating hard disk will cease to be manufactured. Note that some high-specification laptops no longer have hard disk drives; they just come with lots of non-volatile memory. The question is not if, but when, the HDD ceases to be cost effective.

With regard to question 1, this is a classic reinvention of the wheel. A database with a stored procedure interface does all that you need. Only legitimate business functions can be performed, and you only get out meaningful consistent data. Moving all of this out into J2EE is a good way for J2EE programmers to ensure that they stay in business.

Nothing can enforce data integrity like an RDBMS. They are designed from the ground up to do so. J2EE is not. Any application which doesn't rely on the RDBMS for integrity is doomed to succeed in keeping lots of programmers on overtime, and many DBAs on callout.

Objects are not lovingly designed to protect data, as they regularly screw it up. What protects data is constraints, be it check, not null, referential integrity, default, range validation or whatever. Writing procedural code to achieve the same thing as one line of declarative text is bonkers - don't even try to claim that a general-purpose programming language is better than a purpose-built database at handling and protecting data - sigh.
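To show what I mean by one line of declarative text doing the work, here is a small sketch in Java (assuming the H2 in-memory database is on the classpath; the table is made up for illustration):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

// The constraints are declared once; the engine enforces them on every write,
// with no procedural validation code written by the application.
public class ConstraintDemo {
    public static void main(String[] args) throws SQLException {
        try (Connection con = DriverManager.getConnection("jdbc:h2:mem:demo");
             Statement st = con.createStatement()) {

            st.execute("CREATE TABLE account (" +
                       " id      INT PRIMARY KEY," +
                       " balance DECIMAL(12,2) NOT NULL CHECK (balance >= 0))");

            st.execute("INSERT INTO account VALUES (1, 100.00)");   // fine

            try {
                st.execute("UPDATE account SET balance = -50 WHERE id = 1");
            } catch (SQLException e) {
                // The RDBMS rejects the update; nothing in the application checked anything.
                System.out.println("Rejected by check constraint: " + e.getMessage());
            }
        }
    }
}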

Regards

T
 
Tharg

Disks are being replaced by chips, as you say, but so far in the form of SSDs, i.e. the flash memory is being dressed up to look like something else. People are using flash for physical reasons (power consumption, durability, weight, size, speed) rather than the logical reason of changing serial access to random access. So far this trend makes no difference to database theory.

Almost everyone has grown up with filestores for their data, so it's not easy to stand back and ask "why?". Databases were so much cleverer than simple filestores, and nowadays work however programmers choose to program, so they impress people (me included) - but that doesn't mean they are the only solution to keeping your data when the electricity goes off or some hardware fails.

What you are very specifically saying is that a database has more capability to maintain the correctness of data than a programming system. Bearing in mind that the database system was itself written in a programming system in the first place, I would be interested to know where that additional power came from. Is there some special gas in the room when they compile SQL Server?

Just as many people never cottoned on to SQL, and used a relational database to pull out data with cursors and run through it procedurally just like they used to, I suspect the majority of corporate OO programmers just use OO languages to create procedural programmes, and never create corporate objects. Well, they can't really - once they expose their data into endless tables, their classes are really just old-fashioned code modules.

 
Databases are targeted towards non-tech businesses. Most companies I've known certainly don't have OO ingrained into their IP. Tech companies whose core product is software and systems might be a different story.

Furthermore, many companies would need years to migrate their relational databases from one vendor to another, let alone shift to a completely different paradigm.

I could see OO gaining ground in phases for new projects, but relational will still be here for a long time.
 
I totally agree that even if I am 100% right, nothing would happen for decades, except in specialist areas - in the past that might have been investment banks, but we haven't got any of those anymore.

One practical problem at the moment is that operating systems and programming systems assume memory is volatile, so there would be no point in safely keeping all your data in non-volatile RAM if nothing realised it was there when the electricity came back on.

 
Ha, I like your style, and no, there was no special gas in the room.

However, I think my point is still valid, because integrity constraints of whatever nature are declarative and not procedural, therefore no code of mine "runs" when they are used. Wait 'til I finish here, Mike - I am aware that any declared constraint is enforced by means of some calculated go/no-go value which requires code, but patience please.

The difference in power between the declarative and the procedural is immense. For example, in C one may make an ASSERTION, i.e. in one's source code one may assert that under any and all circumstances one value will be bigger than another. This will always be checked by the language itself, without the programmer writing any further checking code. If this constraint is violated (to use RDBMS speak) an assertion failure will be generated.
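Java offers much the same facility, for what it's worth - a rough sketch (run with assertions enabled, i.e. java -ea AssertDemo; the values are made up):

public class AssertDemo {
    public static void main(String[] args) {
        int creditLimit = 5_000;
        int balance = 7_500;

        // Declare the invariant; the runtime throws AssertionError if it is violated.
        assert balance <= creditLimit : "balance exceeds credit limit";

        System.out.println("Only reached when the assertion holds");
    }
}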

C programmers trust assertions because they are declarative and applied by the system itself. They are relying on the fact that the language has been around for a while now, and that such fundamental elements as the compiler work correctly. They rely on their system for checking, not their code.

Likewise, as a DBA, when I put a check constraint on a field, I declare the constraint to the system and then quit worrying about it. The RDBMS has been around for a while now, and such fundamental elements as the validation and checking routines work correctly. I rely on the system for checking and not my code.

If you browse Ask Tom (or its SQL Server equivalent) you will see scores of posts from those who chose to use code to enforce integrity instead of the RDBMS engine. All their developers get lots of overtime! Now, you might consider that a good thing if you're on a good rate, but me, I'd rather go home at the end of the day.

Also, if you think about it, the purpose of a constraint is very simple. Once the code is written to enforce, say, a NOT NULL constraint, that code is run over and over again, in every RDBMS, wherever that technology is employed.

Therefore, the constraint checking engine has been subject to several trillion unit tests, each one verifying that it is correct. Although this by no means guarantees that the code is perfect, it does lend it an awful lot of credibility and gives me confidence in it.

Regards

T
 
Tharg

You don't need to be so defensive. If I'm right, DBAs simply won't exist in the future. People will talk about them like people now talk about carrying disc packs around the computer hall. However that won't be for a very long time.

Let's use your word 'assertion'. What you are saying is that if you put an assertion (trigger or whatever) in your database, you can sleep easily at night because Oracle's (or whoever's) assertion mechanism has been tested more than some programmer's code. Well, remember that if the programming system is Cobol, the compiler has been around a lot longer than any database code.

What you're actually saying is "I trust myself to protect data more than some programmers". The whole idea in OO however is that only one class can access that data. There is no reason why you're better at protecting your data than the programmer is at protecting his class. Not from where I'm sitting.

Actually I'm probably wrong. The DBA will morph into an object architect. I guess someone will have to make sure that data is 'normalised', accessible and backed up at appropriate frequencies, and all that kind of thing. And then you'll get paid even more than you do now - if that's possible.

 
Mike,

I think terminology may be getting in the way here. I know little or nothing about OO, apart from chats in the pub with developer colleagues, but you obviously do. So, I'll try to clarify what I mean a bit, by reference to your response.
BNPMIKE said:
if you put an assertion (trigger or whatever) in your database you can sleep easily at night...

You said more than you know. A trigger in Oracle (which is what I battle with) is some code that is executed (fired) whenever a specified event occurs. Sadly, they are often used to enforce data integrity in Oracle, and the code that the developers write is known to 'gang aft agley' (to quote the estimable Mr Rabbie Burns). Developers seem to love writing code, instead of using declarative constraints. Again, Tom Kyte repeatedly answers questions with "don't do this, just use a constraint".

I think maybe the problem is more deep-seated than any technology. I think it's that developers write code that they're comfortable with, based on their current knowledge. Even if this is just plain wrong, they'll naturally stick to what they're comfortable with, and fight their corner.

Time and again we see developers writing code, only to be told that "Oracle already has a standard feature which does that".

To give you a flavour of the Pandora-like nature of this 'ere thread, I'll give you the following pointers. Would you mind having a look, and seeing if they make better sense than I do?


Don't get bogged down in the Oracle-ishness of it all, just look at the English-language parts of the to and fro. You've just gotta love it, or run screaming in terror.

Regards

T
 
I'll have to read the rest later, but this is just flat-out classic! [ROFL2]
Dealing with q2 first, we clearly don't need disks, and never have. It would be extremely reliable to use a hammer and chisel to engrave data on to stone tablets. This method has managed to successfully store data, without corruption, for literally thousands of years, but the write time is rather poor.


--

"If to err is human, then I must be some kind of human!" -Me
 
Besides the specialisation in data storage and integrity, replication, security and statistical computations like OLAP cubes, data mining etc., simply making a program and all its objects reside in memory by eliminating volatile memory will not help in multi-user environments. You're thinking too short-term if you think you can eliminate the central storage, because it's also the place where data is exchanged with other users.

I'm still saying today's database technology could change from record storage to something you'd perhaps call an object store, where instead of the atomic INSERT, UPDATE and DELETE plus constraints and cascades you'd have some standard and some individual custom methods on those objects. Just as you now have stored procedures which do an individual thing like a bank transfer, those would be bound to the data objects instead of being called by the application. In fact stored procedures are a quite old-style, function-oriented kind of programming. Being able to use .NET languages for SQL Server stored procedures at least lets you handle data in a more object-oriented way.
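Very roughly the kind of thing I imagine - a purely hypothetical sketch in Java, with all names invented:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical "object store": behaviour such as a bank transfer lives with the
// stored objects instead of in a separate stored procedure. All names invented.
public class ObjectStoreSketch {

    static class StoredAccount {
        private long balance;
        StoredAccount(long balance) { this.balance = balance; }

        // The "stored procedure" becomes a method bound to the stored object.
        void transferTo(StoredAccount target, long amount) {
            if (amount <= 0 || amount > balance) {
                throw new IllegalArgumentException("Invalid transfer");
            }
            balance -= amount;
            target.balance += amount;
        }

        long balance() { return balance; }
    }

    // The central server's store: clients would fetch objects, not rows.
    // (Concurrency, transactions and persistence are ignored in this sketch.)
    static final Map<String, StoredAccount> store = new ConcurrentHashMap<>();

    public static void main(String[] args) {
        store.put("A", new StoredAccount(1_000));
        store.put("B", new StoredAccount(200));

        store.get("A").transferTo(store.get("B"), 300);

        System.out.println(store.get("A").balance());  // 700
        System.out.println(store.get("B").balance());  // 500
    }
}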

But I think they won't be the same objects you use in your client-side application to work on the data and let users interact with it. And it's not simply persisting to non-volatile memory that will change the database world.

Bye, Olaf.
 
Another wrong impression you have is about objects protecting data. You can create protected properties and encapsulate them so they can only be accessed by the object itself, but that's not the kind of data protection you think it is. It's a language-internal agreement; it's not the OS or even the hardware limiting access to the memory the object is stored in. The encapsulation that protected properties and methods provide in OOP is not technical protection, it only forces the programmer to work on the object through the official interface of public properties and methods it exposes. In the end, only the data that is object-specific is really hidden/private/protected and never goes outside. But something important like the balance of an account object must be accessible from outside in some way, otherwise you could not make transactions at all, could you? So even if objects protect a property like the account balance, there has to be a way for external objects to influence it.
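A small sketch in Java of what I mean by "language-internal agreement" - the field is private, yet within the same process nothing physically stops access to it (the Account class is made up, and newer Java versions may restrict this kind of reflective access across modules):

import java.lang.reflect.Field;

class Account {
    private long balance = 100;
    public long getBalance() { return balance; }
}

public class EncapsulationIsAnAgreement {
    public static void main(String[] args) throws Exception {
        Account acc = new Account();

        // acc.balance = 0;                      // won't compile: the field is private
        Field f = Account.class.getDeclaredField("balance");
        f.setAccessible(true);                   // ask the runtime to waive the agreement
        f.setLong(acc, -999);                    // no OS or hardware protection steps in

        System.out.println(acc.getBalance());    // prints -999
    }
}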

Bye, Olaf.
 