Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

ORM Article

Status
Not open for further replies.

ca8msm

Programmer
May 9, 2002
11,327
GB
Whilst maybe not directly related to ASP.NET, it's probably something we all have to use as part of our development process and Jeremiah Peschka has just written an interesting article that I found useful and that you may want to read:


I think he sums up my take on the whole process quite well.

Mark,

Darlington Web Design
Experts, Information, Ideas & Knowledge
ASP.NET Tips & Tricks
 
stirring the pot :) this should make for an interesting thread. He seems to focus on data access and how procs/sql is faster and more secure. there are trade offs though.

ORM is one solution to data access. in actuality there are 3 approaches to data access. each is suitable for a particular scenario and (believe it or not) all three could be used within a single system, if the reasons can be justified.

a common example is using an ORM to manage the transactional processes and transactional scripts to pull data for reports.

there are transactional scripts. which as the names implies is just that a string/text file/etc of a complete script.

next is the data table gateway. this is a nice marriage between stored procs (or tables) and OO. in fact this is the concept used by DataSets. for simple domain models this works well. if all your "domain" consists of is dumb containers for database records this works well.

as you progress into complex domain models a simple 1:1 coloration between proc and data set becomes tedious, this is where data mapping (ORMs) come to play. mapping the complexity of the domain model to the database. this complexity exists, so whether ORM or procs are choosen, either way that complexity must be handled.

there are other reasons to use ORM besides not wanting to write sql. there are concerns for: unit of work, identity maps, translating relational data to behavioral objects, database transparency.

if a system is designed around stored procs the system is tied to sql server. Same would be true for Oracle "procs". with an ORM you can attach to just about any relational database. This is a very real concern for 3rd party apps, I do not want the vendor/application dictating which database I will use.

ORMs are not a cure-all, they are a tool designed to solve a specific problem. translating behavioral objects into relational data and vice versa.

with any tool there is a learning curve. NH arguably has the most difficult. NH is very powerful and complex. many steps have been taken to simplify this, but it is a complex tool. the same is true with heavy construction equipment, MRI/xray machines, and rocket ships. The difference is other professions require training and certifications to ensure the operator knows how to use the tool. in our profession this level of stringency doesn't exist. so when the average .net developer looks at NH for the first time, they wonder how can I understand this and why would I use it?

I find LLBL is a good tool for entering the world of domain models/ORM. I can't stand the non-poco nature of domain objects, which is why I stopped using it, but it opened me to a new way of thinking about data access and for that it is a great tool, especially if you like code gen and UI wizards.
...

I also do not believe he is entirely accurate saying DBAs operate differently than developers. there are DBA's who understand the value of Data Mapping (ORM frameworks) and work with the developers to bridge the gaps between data and behavior.
Then you have DBAs who believe they are out of job if procs are replaced with ORM or dynamic sql. That's like developers thinking they are out of job if the language X is replaced with Y. It's not DBA specific, it's a mindset.
....

he mentions that developers think in objects and DBAs think in sets. OK, but the amount of data is also going to play into the preformance of objects vs sets. a large amount of UI interaction deal are things like shopping carts, order processing, account balances, etc. short operations requiring small sets of data.

Then you have bulk operations. loading a new catalog of products. compiling year end summaries, migrating from database X to database Y. Usually there is little to no business logic involved. maybe some number crunching, but nothing complex. for this ETL (extract transform load) is appropriate and most/all ORMs are not designed to solve this problem.

Jason Meckley
Programmer
Specialty Bakers, Inc.

faq855-7190
faq732-7259
 
stirring the pot :) this should make for an interesting thread.

Actually, that's kind of why I posted the link. After your thread855-1546098 thread, I convinced our team that for our next project that we should try to use an O/RM to see if it did bring us any benefits and the system we built has now been live for just over a month.

I just thought I'd share my thoughts now that I have first hand experience of using one and what pros/cons I think it brought us.

To give a brief overview, the system was for a local council (around 2500 users) to record information relating to the type of contact that they receive from the public (i.e. they log each phone call, letter, email etc) for each department. Anyone can log an entry in the system and after a month of running the system we can see that on average there are around 16 entries per second across the system so there is a reasonable level of input into the system. There is also a second "management system" that has built in reports to allow department managers to run reports for their own department. There is also a corporate department which has access to all reports. Both systems run on the existing SQL Server 2005 database server that the council currently uses and it will be accessed using windows authentication.

Initial Development

There was always going to be a bit of a learning curve for our team here (and therefore this stage took a lot longer than normal, but that was to be expected) so it's quite difficult to gauge whether it would have sped up our database and data access layer creation. If I had to guess though, I'd say it was negligible either way between using SSMS Tools Pack to auto generate our stored procedures and then tune them as it was to create the mappings from the ORM so I'd be inclined to say this section was a 50/50 split in term of the benefits. However, I also understand that this is purely an opinion as opposed to hard facts backed up by statistics.

Performance

I don't have two sets of figures to directly compare, so these are just my observations.

There appears to be absolutely no difference between what the O/RM produced for the system that the users use to insert the contacts and what I would have written in a stored procedure. The inserts work quickly and there are no issues at all with performance and speed.

The management system does have some problems though. The reports that they run are usually just "grouping" type of reports so that they can see how many of each type have been entered, and split across departments for certain timescales. Looking at the cached execution plans, the generated sql quite often uses the primary key index on the reports which depending on what options the manager selects isn't necessarily the best one. Using a stored procedure I would simply add a hint to tell it what index to use which would help speed things up, but I'm not sure if this is possible in the ORM.

Security

This was a bit of a concern for us. As we were using SQL Server with Windows Authentication, users and managers connect to the database using there own security context. As the ORM obviously uses dynamic sql we had to allow direct table access so that any SELECT,UPDATE,INSERT/DELETE statements could be executed for the users and managers (essentially, people became data readers and writers). However, this presents two problems:

1. It means anyone with Microsoft Access/Excel (i.e. everyone in the council as it's part of the corporate desktop) can connect to the database and view and modify data.

2. Due to the previous problem, it means managers from one department can see data from another department using the Office method. This has caused a bit of a stir in the council and they weren't happy that this could be done.

We are currently looking into whether we can change the design to put each departments data into different schemas, or somehow partition the data to limit access based on the users windows authentication credentials.

Maintainability

This has been a bit of a mixed bag really. A couple of my developers have liked the OO approach that they have when accessing the data layer and say that it will help maintain the project, especially when they come to view it months after having written it, or when a new developer looks at is as it should be more obvious what is going on.

However, it has also caused us problems in terms of modifications to the data retrieval and the re-deployment of the application. We had to make a change to the user system to exclude some contact types that they now have deemed "inactive" (it wasn't an initial requirement to have a active/inactive flag for the types, so we just appended this field onto the table, updated the values for existing records and set the default for new entries). As this change meant we had to add the where filter to the code (as opposed to in the stored procedure) this meant that we had to recompile and redeploy the application. As we are not allowed to take any systems down during working hours this meant that we had to schedule someone to do this out of hours which also meant that there was an extra cost implication to the business that wouldn't normally be there.

Summary

I realise that not everyone will have the same experiences we have had but overall I think there were a few benefits the ORM brought us that we don't have now (such as the OO approach both in terms of initial development and maintenance going forward). However, and I don't know whether this would be applicable to every project and/or other people in their working environment, I have to say that we found that it caused more problems than it actually solved. The performance problems I'm sure can probably be ironed out to be negligible (or at least I hope they can), but the security and deployment issues have been a real pain for us (and these are issues that we wouldn't have usually had).

So, for us, our analysis proves that it probably isn't in our best interests to use an ORM and that a stored procedure approach would be a more secure, cheaper and less problematic development path.

Mark,

Darlington Web Design
Experts, Information, Ideas & Knowledge
ASP.NET Tips & Tricks
 
Mark, excellent insight! Thanks for sharing.

In response to your pain points...
----------------------------------------------------
deploying code updates can be tedious, even for small changes. I find for web projects this is minimal since there is usually only 1 web server. you can get into webfarm scenarios which is similar to desktop deployments, in that there are multiple target machines. Using an automated deployment process can ease the pain here, so "off-hour" updates only takes minutes, instead of hours.

I wanted to utilize automated builds at my office last year, but that didn't happen. 2010 it will, as our projects become more complex with windows services, messaging and websites deployment configuration is becoming tedious and I need a want reduce the chance of error.

----------------------------------------------------
ORM in terms of reporting is a mixed bag. I think there is where using the best tool for the job really comes into play. I have seen a number of approaches:

1. a single domain with a single database accessed by an ORM. this works with simple reports and when reporting latency isn't an issue. like waiting ten seconds or one minute.

2. a single domain with a single database access by an ORM for data modification. this access is handling the logic and validation of the workflow. reports use raw sql against the same database. since reports are read-only data queries can be fine-tuned at the source. This works well when reporting preformance is critical or the complexity of the reported information is high. there isn't a need for the domain model logic/validation, just get the data and present.

3. 2 domain models and 2 databases All accessed via ORM. this is definitely more complex. one domain & database is optimized for data input. changing a contract, logging a call, removing a customer, etc. the second domain and database are optimized for reads (reporting). I have read about this approach on Udi Dahan's blog. usually it involves high amounts of through put that is maintained asynchronously. they usually work like this:
1. user sends a request to modify entity X
2.1. request received
2.2. database updated
2.3. another request is generated to update the reporting data
2.4. user notified the update was successful
3.1. reporting request received
3.2. data transferred from input database to reporting database.
I have only read about this scenario when working with a service bus as the middle tier between the UI and the database.

----------------------------------------------------
security. I cannot speak much to this, other than repeating what I have read/seen others do. one approach, which is common for ISPs, is to limit who can connect to the database server. for example, only the webserver can connect to the database server. this prevents clients from connecting directly to the database server and accessing the database via Access/Excel.

when there is a single UI (web) the data access control can be integrated in this layer. however if the user needs to access the data in a variety of UI's (web, excel, access, 3rd party software integration) a separate access layer is required. instead of the application directly accessing the database, they are required to call the access layer which will proxy their request to the database.

as for securing data per user by table/row/column on the database. I haven't worked on a project that required this
level of control. If I did, my first approach would be to incorporate Rhino.Security into my application. I would still use the reader/writer sql roles at the db level. I would use Rhino to define user/group security settings within the application. Rhino.Security requires Nhibernate which makes this approach tool specific. However the concepts of Rhino.Security could be ported to work with any data access mechanism.

----------------------------------------------------
I was hoping for more input from the community on this topic. We would love to hear your experience with data access.



Jason Meckley
Programmer
Specialty Bakers, Inc.

faq855-7190
faq732-7259
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top