
Duplicating Disk-to-Disk

Status: Not open for further replies.

hogburn (IS-IT--Management) · Oct 18, 2004 · US
Hello Forum -
We are in the early planning stages of migrating to a disk-based backup solution. Using NetBackup (4.5 or 5.0), we intend to back up our clients to a Virtual Disk Library (VDL_1) and then duplicate the primary Virtual Disk Library (VDL_1) to a remotely located secondary Virtual Disk Library (VDL_2). VDL_1 and VDL_2 would be exact clones in terms of make, model, and configuration. We are strongly considering EMC's CLARiiON Disk Library. We want to use NetBackup native tools such as Vault or bpduplicate.
For auditing purposes, we would archive to removable tape from VDL_2 at regular intervals.

My question: is anyone using Vault or duplicating images from one Virtual Disk Library to another? Any thoughts are welcome. Thanks!
 
The real question is: why are you using VTLs? Are you using them to speed up backups or restores? Remember, your backups will only be saving the mount time and may not in fact be any faster. Why? Well, if you have a client that is slow to back up to tape, what makes you think it will be any faster backing up to a VTL?

When you duplicate to VTL-2, is this in the same building or in a completely remote location (i.e. for D/R purposes)?

Now, when you migrate to tape media, if this is for D/R, then you should consider that NBU 5.0 and up can create synthetic full backups. However, they are not multiplexed, so in a D/R situation you will be restoring your clients one at a time from the tape media. Is this what you want?

When you deploy VTLs, will you only be performing incrementals or differentials from that point on? (We would only do a full backup the first time on any new client, then synthetic fulls for everything else.) This is available in 5.0 and 5.1. Perhaps you should look into this and thus reduce the size of your VTLs.

We are exploring VTLs, and those questions are what we are asking the vendors. I have seen two possible VTLs so far: one from STK, which is not available yet, and one from a company called SEPATON.
I hope this answers some of your questions.
 
Thanks BSWIP! Excellent Post.
We are currently doing 5 daily incrementals with one full on the weekend. We're using LTO Gen 1 tapes and believe we wou
Our implementation goals are twofold.
1. We want to improve operational recovery while reducing costs at our primary site by deploying VTL-1. Physical tape has become unreliable (restorability), and we want both to mitigate the risks associated with faulty tape and to increase our recovery speeds by having the data online. We have assessed the cost of our current tape environment and believe we can save money by deploying a virtual tape system: consumable tapes are expensive, and tape handling in our large environment has become too labor intensive. We would rather have our personnel proactively managing the environment than handling tapes. We have also reviewed our restore trends and estimated downtime costs ($ per hour, tape vs. virtual tape) and find that we can justify implementing VTL.
2. We want to improve disaster recovery by replicating (hopefully E-VAULTING) to a remotely located VTL-2 about 25 miles away. We believe we can reduce risk in the event of a disaster and rely less on third-party resources to manage our offsite tapes. We estimate that we could improve both our recovery time (less than 24 hours to recover) and our recovery point (less than 24 hours' worth of change after recovery).

The archive to tape would fulfill regulatory requirements. We would establish appropriate retention policies according to these regulatory requirements. I believe our requirement is to be able

As we are already using NetBackup, I want to use Vault or bpduplicate to move the data from VTL-1 to VTL-2.
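For the bpduplicate route, a minimal sketch might look like the following. The storage-unit and pool names are hypothetical, and option names should be verified against `bpduplicate -help` on your 4.5/5.0 master server; the script only echoes the command so it can be reviewed before anything actually runs:

```shell
#!/bin/sh
# Sketch only: echoes the bpduplicate command instead of running it.
# Storage-unit and volume-pool names below are hypothetical examples.
NBU_BIN=/usr/openv/netbackup/bin/admincmd
SRC_HOURS=24            # duplicate images written in the last 24 hours
DST_STU=VDL_2-stu       # storage unit fronting the remote VTL (VDL_2)
DST_POOL=Offsite_VDL    # volume pool on VDL_2

echo "$NBU_BIN/bpduplicate -hoursago $SRC_HOURS -dstunit $DST_STU -dp $DST_POOL"
```

Vault drives the same duplication engine from its profiles and schedules, so the same destination storage unit and pool choices apply there as well.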
 
Interesting point of view, and if you have done the sums then I'm sure it works for you. However, I have 18 tapes sitting on my desk, which is the total number of failed tapes from 3 years and a pool of 8,000 tapes. Not bad stats, I think, when compared with the number of discs that have failed in the same time. Of course, I have no human costs because the tapes all sit in 2 big libraries. Ours is a mirrored site, so there is no need to move the tapes around, and because the tapes are ready to go, restores can be a matter of minutes. I would, however, look at a library at your offsite location, as you are going to have to back up to tape in the end anyway. Vault can be configured to kick out the tapes you need to save, but it is important to get the right tapes; DLT is not the one, in my opinion.
 
Nice post, lenski. Only 18 failed tapes in 3 years is impressive; sounds like you invest the same effort that I do. I analyze each failure on a mission to eradicate non-genuine backup failures. In the 2 years since we went LTO, we have had numerous failures, but thanks to stringent (OK, obsessive!) testing/logging of each and every failure, we seem to have eradicated the non-genuine ones.

For those that don't know, NetBackup appends to its own 'errors' file for all drive/tape failures, so I've appended comments to each failure in this file. Thus I have a history of the environment in one place.



11/10/02 23:44:01 5770L1 4 WRITE_ERROR
#rrb very first problem since this new equipment installed in Sept 2002
<snip> (quite a few WRITE errors up until early July 2003)
#rrb 10/07/03 turned freq-based cleaning OFF for the LTO drives
#rrb 10/07/03 upgraded firmware for L60 Robot & all 6 drives
#rrb 10/07/03 all as part of fault call
<snip> (few WRITE errors late July 2003)
#rrb 29/07/03 drives 0 & 5 swapped out for fault call
<snip> (quite a few WRITE errors Aug 2003 thru Oct 2003)
#rrb 06/10/03 manually cleaned ALL 6 drives due to drives not requesting cleaning & above write errors
10/25/03 07:40:03 5913L1 0 WRITE_ERROR
10/27/03 07:39:56 5734L1 2 WRITE_ERROR
#rrb 27/10/03 to stop overrunning backups failing at 07:30 due to scsi bus resets, stopped running daily 'sgscan' which used to run at 07:30 each morning (this scan causes scsi bus resets)
<snip> (quite a few WRITE/POSITION errors Nov thru Dec 2003)
#rrb 09/12/03 manually cleaned drives 0,1,2,4,5 due to above write errors
#rrb 12/12/03 manually cleaned drive 3 due to above write errors
<snip> (absolutely loads of WRITE/POSITION/OPEN errors Dec 2003 thru Feb 2004)
#rrb - the above 3 write errors approx midnight caused by Sun Explorer 3.6.2
<snip> (loads of OPEN/WRITE errors Feb thru Mar 2004)
#rrb 07/03/04 the above 4 write errors approx midnight caused by Sun Explorer 4.2. Have since amended crontab to run Explorer at quiet time of day
<snip> (loads of POSITION/WRITE/OPEN errors Mar thru Apr 2004)
#rrb 19/04/04 manually cleaned all drives due to above write errors
#rrb 19/04/04 - all drives but 1 had mount time of 900+hrs, the other had 1000+
<snip> (loads of POSITION/WRITE/OPEN errors May thru Jun 2004)
#rrb 18/06/04 - the above errors from 10/06/04 thru 17/06/04 began to be addressed in new fault call
06/18/04 19:35:45 5828L1 1 WRITE_ERROR
#rrb 21/06/04 manually cleaned all drives due to above write errors 10/06/04 thru 18/06/04. Fault call still outstanding
#rrb 21/06/04 - all drives had average mount time of 465hrs before cleaning
#rrb 22/06/04 - as part of fault call 201585, added HP Ultrium-specific info to /kernel/drv/st.conf, after adding patch 108725-16 (no revision of this was previously installed)
#rrb 22/06/04 - patch 108725 addresses position errors
07/01/04 04:39:38 5867L1 1 WRITE_ERROR
#rrb 02/07/04 - drive 1 swapped out for fault call; raised call as previous call didn't seem to cure the write errors
07/08/04 03:11:58 5864L1 3 WRITE_ERROR
#rrb 09/07/04 drive 3 swapped out for fault call
07/18/04 13:15:50 5930L1 5 WRITE_ERROR
#rrb 18/07/04 the above write error was caused by Sun Explorer..Have now removed explorer from crontab altogether.
07/29/04 09:18:26 5848L1 3 WRITE_ERROR
#rrb 29/07/04 the above write error was caused by a (manually run) Sun Explorer..Goddammit !
08/02/04 12:17:22 5863L1 4 POSITION_ERROR
08/12/04 18:28:18 5818L1 2 WRITE_ERROR
#rrb 13/08/04 think this was a genuine write error as 5818L1 has had write errors before.
09/11/04 00:21:33 5788L1 3 WRITE_ERROR
#rrb 13/09/04 this was a genuine write error as no further problems (manually 'froze' tape at time of failure)
10/13/04 18:27:21 5920L1 2 POSITION_ERROR
#rrb ok can't explain this one-off position error. Backup subsequently reran itself ok to same tape. weird.
#rrb as @ 20/10/04 no more non-genuine write failures since July 2004. Impressive.
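As a side note, an annotated file like the one above can be tallied per drive with standard tools. A sketch, assuming the layout shown in the excerpt (date, time, media ID, drive index, error type, with '#rrb' annotation lines):

```shell
#!/bin/sh
# Count errors per drive from a NetBackup-style 'errors' file.
# Sample data reproduces the format of the excerpt above.
cat > /tmp/errors.sample <<'EOF'
10/25/03 07:40:03 5913L1 0 WRITE_ERROR
10/27/03 07:39:56 5734L1 2 WRITE_ERROR
#rrb 27/10/03 annotation line - ignored by the tally
07/01/04 04:39:38 5867L1 1 WRITE_ERROR
08/02/04 12:17:22 5863L1 4 POSITION_ERROR
07/08/04 03:11:58 5864L1 3 WRITE_ERROR
EOF

# Skip the '#' annotations, then tally by drive number and error type.
grep -v '^#' /tmp/errors.sample | awk '{print $4, $5}' | sort | uniq -c
```

A drive that keeps floating to the top of that tally is the one to raise the fault call on.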

Sorry for the long post, but tape backups still have their place; they just need a bit of effort to keep 'em in check.

Rich :)
 
Great detailed logs. I have been running a 1500-slot library on LTOs (IBM Gen 1) and have had some errors due to a bad drive or a faulty HBA.

We are primarily looking at VTLs in order to reduce network overhead, since we will only perform a full backup the first time a client is backed up. After that it will only be daily differentials (cumulative incrementals), and on the weekends we would create a synthetic full.
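For anyone weighing the two incremental styles: in NetBackup terms, a cumulative incremental captures everything since the last full, while a differential incremental captures changes since the previous backup of any kind. This toy sketch (schedule numbers are hypothetical) shows why the restore chain differs:

```shell
#!/bin/sh
# Restore-chain illustration; the 5-day gap since the full is hypothetical.
# Cumulative incremental: restore needs last full + newest cumulative only.
# Differential incremental: restore needs last full + every differential since.
DAYS_SINCE_FULL=5

CUMULATIVE_CHAIN=2
DIFFERENTIAL_CHAIN=$((1 + DAYS_SINCE_FULL))

echo "cumulative: $CUMULATIVE_CHAIN images, differential: $DIFFERENTIAL_CHAIN images"
```

The trade-off is that each cumulative re-copies all changes since the full, so the images grow through the week, but the restore chain stays short.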

We still have an obligation to produce daily offsite tapes, so I don't see them going away anytime soon. However, I am trying to convince everyone that an L5500 would be a great HSM long-term storage device (Class C storage).

I am also looking into reducing Exchange and SQL backups by using Snapshot for Windows. Volume Shadow Copy does not allow you to create synthetic fulls.

My motto is "Let backup systems do what they do best"

Cheers.
 
bswip, just checked out the L5500. Erm, it's rather bigger than my (feet shuffle uncomfortably) erm, well, erm, 60-slot Sun L60. There, I said it. Ah well, if I had that much kit, at least my scripting/methodology could scale up to accommodate :)

Rich
 
I'm (puffing out chest) running two STK 9310 Powderhorns, 5500 slots each. Not sure what the floor stress is, but I'm glad I don't sit on the floor below.
 
lenski:

Do you have them connected with the pass-thru? I will be at STK next week for the unveiling of their replacement for the Powderhorns and the L5500. They are no longer a silo construction but more like ADIC et al., and they're looking at a much smaller footprint, so the floor may be happier :) ... I'll let you know more later (after the meeting my NDA is released)...
 
No, they are in different buildings. New libraries? Hmm, I must get a look at these.
 