
Replication of Progress to SQL Server using triggers??


Natalie

Programmer
Jun 6, 2000
To date, we have been importing the entire Progress DB into SQL Server every night. This process is now taking over 12 hours. We are looking into replicating our Progress DB into our SQL Server data warehouse in real time.

We are researching the following:
SonicMQ (messaging)
PeerDirect
A warm standby Progress DB
Schema triggers

Does anyone have experience with any of these, and can you give me the pros & cons?

Does anyone replicate their database to a data warehouse on a daily basis? If so, how?

Thanks for any ideas!

Natalie
 
Hi Natalie,

We do a near real-time replication of the most significant tables of our Progress database to a SQL Server database using schema triggers. Basically, this involves two steps that amount to a bare-bones, home-spun message queue:

1. Triggers on Progress tables that write control information to a queue table in the Progress database. To avoid duplication during the actual data transfer, we collapse multiple similar actions into one entry in the queue table.

2. Scheduled DTS tasks that use SQL Server OPENQUERY in stored procs to read the Progress control table, retrieve the changed data from the Progress tables, and perform the necessary INSERT, UPDATE, and DELETE statements in the SQL Server tables. (A sketch of such a proc follows.)
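To make that concrete, here's a minimal sketch of what one of those scheduled procs might look like. Every name in it is invented for illustration (the PROGRESSDB linked server, the PUB."repl-queue" queue table, dbo.Customer); it shows the read-queue/apply-changes shape, not our actual code:

Code:
   -- Hypothetical sketch: PROGRESSDB, PUB."repl-queue", and dbo.Customer
   -- are invented names. DTS runs this proc on a schedule.
   CREATE PROCEDURE dbo.usp_TransferCustomerChanges AS
   BEGIN
       -- Pull unexported queue entries for one table from Progress.
       -- The inner query is executed by the Progress side, hence the
       -- double-quoted identifiers.
       SELECT *
       INTO   #q
       FROM   OPENQUERY(PROGRESSDB,
              'SELECT "record-id", "action", "old-keys"
                 FROM PUB."repl-queue"
                WHERE "table-name" = ''customer''
                  AND "export-date" IS NULL')

       -- Apply deletes, locating target rows by the key they had
       -- when the trigger fired.
       DELETE c
       FROM   dbo.Customer c
       JOIN   #q q ON c.CustNum = q.[old-keys]
       WHERE  q.[action] = 'delete'

       -- For writes, fetch the current record images from Progress by
       -- record-id (same OPENQUERY pattern) and upsert them: UPDATE
       -- the rows that exist, INSERT the rest.
   END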

PROS:
- It's reliable. After we worked out the implementation, the deployment has been very stable.
- OPENQUERY is fast. AFAIK, you won't approach this performance with the other methods.
- You can tweak the DTS scheduling to create an acceptable balance between system performance and timeliness of the data.

CONS:
- A lot of languages/tools: Progress, Java, TSQL, DTS.
- A high volume of changes to a single table (heavy transactions) in the scheduled window overruns the ODBC driver or Progress. (We can't get a straight answer to this problem.) Therefore, we can't use this method for our inventory table. BUT any of the options you're considering is likely to run into the same problem for the same reason. So before you pay for a vendor-supplied solution, make them prove it works in your real-world situation.

We implemented this solution in-house in just two months. That was after more than a year of screwing around with a vendor who was trying to implement a proprietary messaging system (not Sonic). That vendor effort was an abject failure.

HTH and good luck,
harebrain
(If you find this answer helpful, give me a star!)
 
Thank you for your response. It's very informative. I hope you can answer a few more questions.

Where does the Java come in? I understand the Progress, TSQL, and DTS (we use all of those currently).

Also, can you explain your Queue table a little more? What data is kept there? I'm thinking the info we'd need is:

Action (add/mod/delete)
Table
RecordID (some sort of value to ID the record in Progress AND in SQL Server) - What do you use?
Fieldname
Before
After
Date - for auditing
Time - for auditing
ID - for auditing

What sort of transaction volume does your Progress system do? We have about 10,000/day (adds/mods) by users (in a call center), plus nightly client imports (potentially up to 1,000,000).

Thank you so much!!
Natalie
 
Hi Natalie,

We have Progress stored procedures written in Java. Your mileage may vary.

Our queue table has the following attributes (pretty much the same as you considered):
Code:
   table-name
   record-id
   action
   old-keys
   trigger-date
   trigger-time
   trigger-userid
   export-date
   export-table

We don't track the field that was updated: we don't care. Because Progress only has delete and write triggers (insert/update aren't differentiated), when a write operation occurs in the queue we always copy the whole record. We use the record-id to find the Progress record to copy; the old-keys value finds the SQL Server record. ("Old" keys because we allow changes to primary key field(s).)
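Here's the old-keys idea in a hypothetical fragment (invented table and column names; assume #q holds the queue entries and #img holds the current record images already fetched from Progress by record-id). The target row is found by the key it had when the trigger fired, then overwritten wholesale, new key included:

Code:
   -- Locate the SQL Server row by its OLD key, then copy the whole
   -- current Progress record over it, possibly-changed key and all.
   UPDATE c
   SET    c.CustNum = i.CustNum,    -- primary key may have changed
          c.Name    = i.Name,
          c.Balance = i.Balance
   FROM   dbo.Customer c
   JOIN   #q   q ON c.CustNum = q.[old-keys]
   JOIN   #img i ON i.RecId   = q.[record-id]
   WHERE  q.[action] = 'write'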

Oops, gotta go. I'll get back with more later.

HTH and good luck,
harebrain
[Thanks for boosting me back into the "Top Experts" list!]
 
Hi again Natalie,

To continue where I left off, first let me clarify the comment about Java: our Java stored procedures have nothing to do with the data transfer, so don't worry about it. (It's all coming back to me now!)

We don't worry about before/after images of the records: if there's some hiccough and the transfer gets derailed, we just null the export-date stamp in the queue table and the transfer will be re-attempted. (This happened several times during initial deployment, but now that the kinks have been worked out we've had no problems of this sort.)
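The reset itself can be a one-liner, if your driver permits updates through an OPENQUERY rowset; otherwise you'd do the same thing from the Progress side. Names and date here are invented:

Code:
   -- Re-queue a derailed batch by clearing its export stamp.
   -- Hypothetical names/date; requires an updatable OPENQUERY rowset.
   UPDATE OPENQUERY(PROGRESSDB,
          'SELECT "export-date" FROM PUB."repl-queue"
            WHERE "export-date" = ''2000-06-06''')
   SET    [export-date] = NULL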

The real trick is to tune the timing of the DTS scheduled processes: run them often and don't try too many transfers at once. Our procs usually grab something like TOP 100 from the queue. If you're too aggressive with that number, you can overrun the query buffer in the ODBC driver (we have narrowed in on that problem since my original reply). I think 1000 backed-up queries is where the driver dies; we're being conservative with that 100 figure.
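The batch-limiting piece looks something like this (invented names again). Note that TOP is applied on the SQL Server side after the rowset comes back; if your driver lets you, capping the rows inside the pass-through query is gentler still:

Code:
   -- Take a small, fixed bite of the queue per scheduled run so the
   -- driver's buffers never back up; 100 is our conservative figure.
   SELECT TOP 100 *
   FROM   OPENQUERY(PROGRESSDB,
          'SELECT "table-name", "record-id", "action", "old-keys"
             FROM PUB."repl-queue"
            WHERE "export-date" IS NULL')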

Currently our volume is around 50,000 transactions per day. As I said earlier, we don't try this on our most active table, which would generate more than a million triggers per day: that's what first choked the ODBC driver. Handling your daily volume shouldn't be a problem.

That nightly import would test the boundaries of this approach, depending on how many tables you're updating: a million transfers distributed over several tables would be easier on the scheme than a million hits on a single table.

But you could also take a different approach with the nightly batch: disable the triggers for the import run and simultaneously do the same import on both databases. (I assume that there's an overnight maintenance window when your databases aren't being pounded by users.)
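On the SQL Server side, that dual import could be as simple as bulk-loading the same flat file the Progress import reads (path and staging table invented here):

Code:
   -- Load the nightly client file straight into SQL Server while the
   -- same file goes into Progress with its triggers disabled, so a
   -- million rows never touch the replication queue.
   BULK INSERT dbo.ClientStage
   FROM '\\fileserver\imports\clients.dat'   -- hypothetical path
   WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n')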

HTH,
harebrain
 
Hi harebrain,

Just to clarify an earlier point you made:
<harebrain>
Because Progress only has delete and write triggers (insert/update aren't differentiated), when a write operation occurs in the queue we always copy the whole record.
</harebrain>

Progress also has a separate CREATE trigger, and within your WRITE trigger you can use the NEW function to test whether a record has been freshly created. You could also BUFFER-COMPARE the NEW and OLD buffers of the current record, which opens up the opportunity to replicate just the updated fields rather than the whole record.

HTH
Mike.
 
Hi Mike,

That's valuable information that might influence Natalie's approach. Thanks for adding to the discussion, as my standpoint reflects an operation that might be atypical.

We have, perhaps, a different set of problems in that we're building onto a fragile fifteen-year-old legacy system. (Rumor has it that it was written in Progress version 3, which may or may not be true.) After we convinced our DBA to even let us create triggers (heretofore totally absent in this system), we agreed to keep them short, so as to minimally affect system performance. Hence our strategy of doing little more than recording the trigger type and record-identifying info.

The ability to synchronize individual fields isn't necessarily a worthy goal. For one thing, it substantially complicates the code required to update the target database. Also, because we always update all the corresponding fields in a target record, we can collapse multiple updates of one source record (multiple trigger firings) into a single entry in the queue table for any given transfer window. This turns out to be very significant because it helps us avoid the ODBC query buffer overrun problem, not to mention that it reduces the number of transfer queries, cutting processing and traffic as well.
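In our system the collapse happens in the Progress trigger itself, but the effect is easy to express in SQL terms (invented names): one queue row per record per transfer window, with a delete superseding any earlier writes:

Code:
   -- One entry per (table, record) per window; if any of the collapsed
   -- trigger firings was a delete, the delete wins.
   SELECT [table-name], [record-id],
          CASE WHEN MAX(CASE [action] WHEN 'delete' THEN 1 ELSE 0 END) = 1
               THEN 'delete' ELSE 'write'
          END AS [action]
   FROM   #q
   GROUP  BY [table-name], [record-id]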

Regards,
David
 
This is a follow-up to statements I made in this thread concerning ODBC problems. These statements include:

CONS:
...
- A high volume of changes to a single table (heavy transactions) in the scheduled window overruns the ODBC driver or Progress. (We can't get a straight answer to this problem.)
...
If you're too aggressive with that number, you can overrun the query buffer in the ODBC driver (we have narrowed in on that problem since my original reply).

I finally discovered that we did get a straight answer to this problem from Progress's tech support. They determined that our system was overflowing the prepared SQL statement buffer, not the SQL query queue itself.

This is a problem with the ODBC driver. That said, I have to lay some of the blame for our problem at Microsoft's doorstep: they promulgate the notion that it is always preferable to prepare SQL statements before execution, because ad-hoc statement compilation is purportedly a bottleneck. Clearly, though, we filled up the prepared-statement queue much faster than those queries could be submitted for execution, which caused the process to run away. The moral is that it is not always better to prepare SQL queries: had we sacrificed a little efficiency, we might have met with more success.

Kudos to Progress's tech support team for running this down, especially since the ODBC driver is not their product.
 