Determining table relationships from client access code 2

jrbarnett · Sep 26, 2006

I've recently inherited a number of databases (Access, SQL Server and MySQL 5) all set up in a very poor manner by different groups of people with different ways of working.

The common traits are having no referential integrity whatsoever, no primary/foreign keys on tables and nonsensical field names so I can guess that none of them were designed by database professionals.

As a first step in order to understand how the data are stored, I've decided the most beneficial first task would be to to produce entity relationship diagrams for the tables in each database, then I'll see about setting up primary keys for each table and referential integrity between tables, fitting in any data fixes if necessary.

Sensible fieldnames can come last, because it will involve changing client access code as well.

Questions:
1. Examining the source code for the applications that use this, I have some database access code (part in VBA, part in SQL scripts and stored procedures). Is there a way I can use this information to help generate the table relationships without doing it manually (because of the number of tables, to do it manually would take a long time)?

2. Any more advice or hints on how to tackle this?
I have no database diagramming software available (other than what comes out of the box or is available for free download)? Getting my boss to spend money on this is unlikely, but any recommendations for useful tools would be appreciated.
I've got phpMyAdmin up now connecting to the MySQL server, but that's it.

All the systems involved (clients and servers) are Windows based. I should also say that I have no experience with PHP at all before yesterday, but am willing to get my hands dirty to understand the code.

Its a daunting task, but will hopefully result in me being able to understand the applications (all home grown in a similar vein to the databases, a mix of VB6, Access VBA and PHP) a lot better, and so provide a better support service.

John

ceh4702 · Sep 26, 2006

Well you cant just rename field and table names without causing a lot of trouble for Queries and other things that may be accessing the information. You may be better off learning the applications they are using by interviewing users and just see how they are managing the database currently. Make them show you what they are doing.

You should probably just learn what they call the fields and learn to deal with them. It might be possible to build views and just rename the fields in a view so you can understand how things work. It is hard to manage tables with no primary keys or whatever. It is possible to manage all that with applications.

If you do not like my post feel free to point out your opinion or my errors.

harebrain · Sep 26, 2006

The client access code can give you a clue to the relationships between tables: wherever you find a JOIN, the attributes compared are likely primary (or alternate) key-foreign key candidates.

Cmap, while not strictly for ER diagrams (thus without any specific database features) is versatile enough to use for the purpose. Find it at

http://cmap.ihmc.us/

jrbarnett · Sep 27, 2006

I'm aware that join information can be used - but can this be done automatically - ie I feed it the SQL code and it can define the relationships, rather than me doing it manually (there are many tables and the links would take a long time).

I'll certainly take a look at CMAP and see what it can do. Thanks harebrain.

John

harebrain · Sep 27, 2006

I'm not aware of anything that would perform this task automatically, and I'm not sure I'd trust it anyway: some programmers write bad SQL. And you'd still be on your own examining the non-SQL code to determine the cardinality/optionality of any relationships.

With the lack of embedded information you cite, it looks like a labor-intensive job to me--no way around it.

jrbarnett · Sep 27, 2006

Having seen examples of the work done by one of the developers, I think I'm going to have to do this manually anyway, as I've already performed optimisations to code he has written with quite spectacular speed increases as a result.
Its just going to take a lot longer than I would have liked.

Never mind, it was a good idea when it lasted.

John

Olaf Doschke · Sep 27, 2006

Hi John,

even very expensive case tools offer no perfect automatic ERM of a database that has no relations defined so far.

One way may be looking for fields with the same name, and often the primary key is stored in the first field of a table. I assume by what you said about the poor coding and naming you cant identify key fields by some ID suffix, they may often even have no key field or just a compound key.

Bye, Olaf.

DanJR · Sep 27, 2006

just to add a couple of open source tools to the list:
- DBDesigner4
- DBDesigner Fork

cheers,
dan.

PS Good luck!

jrbarnett · Oct 3, 2006

DanJR,

All I can say about those two is wow. Not quite as much functionality as the commercial rival packages.

Within a couple of hours of installing DBDesigner 4, I had ER diagrams of four previously undocumented SQL Server databases generated.

I've now started in some of the systems setting up primary keys on some tables, then will sort out proper referential integrity at a later date, because knowing this system, there are lots of data mismatches between them.

I've still got several others that I've not touched, but its a huge step in the right direction.

John

ceh4702 · Oct 3, 2006

We did a conversion between our legacy mainfram and a new database system and we were using the SSN as a key. You would be surprised how many duplicate and incorrect SSN's we found in our system. You are probably going to have to constantly use queries designed to look for the mistakes and then it takes forever to get user to fix anything they have corrupted.

Whatever a user can do wrong they will try to do, even if it makes no sense to any logical person. You will probably have to implement application changes that limit what they can do at every turn.

If you do not like my post feel free to point out your opinion or my errors.

DanJR · Oct 4, 2006

Hi John,

Glade to hear the software was useful. what would also be really cool is, if the software added some functionality (e.g. as a plug-in) for infering functional dependencies in the presence of messy data. I did a google search and found this application/code. I haven't tried it though, but thought it would be a useful tool to help guide the reverse engineering process.

cheers,
dan.

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Determining table relationships from client access code 2

jrbarnett

Programmer

ceh4702

Programmer

harebrain

MIS

jrbarnett

Programmer

harebrain

MIS

jrbarnett

Programmer

Olaf Doschke

Programmer

DanJR

Technical User

jrbarnett

Programmer

ceh4702

Programmer

DanJR

Technical User

Similar threads

Part and Inventory Search

Sponsor