Extracting specific files from a folder full of tar.gz files and writing them to a sub-folder

Status
Not open for further replies.

theniteowl

Programmer
May 24, 2005
1,975
US
Hi Everyone!

I have hundreds of xxx.tar.gz files. From each one I need to extract two files and either write them to a folder named after the original tar file or zip them into a zip file of the same name.

Each archive has a file named subcntlfile in its top-level folder and a file called report0000.data under data/.

I can use these commands to extract the files:
tar xvzf /mypath/999.tar.gz ./999/data/report0000.data --strip-components 3
tar xvzf /mypath/999.tar.gz ./999/subcntlfile --strip-components 2

This will extract the two files I need without creating the sub-folder from within the tar file.

I need to automate storing these two files in either a zip file or a folder named after the first part of the tar file's name. In this case the name would be 999.
I would also like a shell script that reads the names of the tar files one at a time and applies the above commands, and any later processing, to each.
I can then FTP the zip files or folders to another server where I have to work with them.

If there is an easy way to convert .tar.gz to zip and lose the internal path names, I could do that as well. I can handle reading the zip file to get the files on the other server, but I cannot read tar.gz files there.

I cannot install any applications to support this and have to work with what I have or what I can write.

Any help would be appreciated. Manually extracting these files would take many hours of tedious and error-prone work. Life would be wonderful if I could just get an application that works directly with tar files from the command line, but getting approval for new software is a long and arduous process.

Thanks.
Trent

At my age I still learn something new every day, but I forget two others.
 
You can write a script like this

unpack_tars.sh
Code:
#!/bin/sh

# source directory
src_dir=./tar_files
# destination directory (must already exist)
dst_dir=./tar_unpacked

# quote every expansion so paths with spaces do not get split
for file in "$src_dir"/*.tar.gz
do
  printf "* extracting data from %s :\n" "$file"
  fname=$(basename "$file")
  # ${fname%.tar.gz} is the archive name without its suffix, e.g. 999
  tar xvzf "$file" -C "$dst_dir" "${fname%.tar.gz}/data/report0000.data"
  tar xvzf "$file" -C "$dst_dir" "${fname%.tar.gz}/subcntlfile"
  printf "done.\n\n"
done

For example: I have a subdirectory ./tar_files which contains the tar files and an empty subdirectory ./tar_unpacked
Code:
$ cd tar_files
$ ls
001.tar.gz  100.tar.gz  300.tar.gz  999.tar.gz

$ cd ../tar_unpacked
$ ls

then I run the script
Code:
$ cd ..

$ unpack_tars.sh
* extracting data from ./tar_files/001.tar.gz :
001/data/report0000.data
001/subcntlfile
done.

* extracting data from ./tar_files/100.tar.gz :
100/data/report0000.data
100/subcntlfile
done.

* extracting data from ./tar_files/300.tar.gz :
300/data/report0000.data
300/subcntlfile
done.

* extracting data from ./tar_files/999.tar.gz :
999/data/report0000.data
999/subcntlfile
done.

and get the result in the directory ./tar_unpacked, i.e. the extracted subdirectories from the tar files
Code:
$ cd tar_unpacked
$ ls
001  100  300  999
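
A note on the member paths in the script: tar compares them literally with the names stored in the archive, so an archive created from ./001 stores ./001/subcntlfile and the path given on the command line needs the same ./ prefix. The sketch below is one way to check which form an archive uses before extracting. The helper name member_path is made up for illustration, and it assumes every entry in an archive uses the same form as the first one.

```shell
#!/bin/sh
# member_path ARCHIVE RELATIVE_PATH   (hypothetical helper)
# Prints RELATIVE_PATH in the form stored in ARCHIVE: with "./" in front
# if the archive's first entry carries that prefix, unchanged otherwise.
member_path() {
  # list the archive and look only at the first stored name
  first=$(tar -tzf "$1" 2>/dev/null | head -n 1)
  case $first in
    ./*) printf './%s\n' "$2" ;;
    *)   printf '%s\n' "$2" ;;
  esac
}
```

For example, member_path ./tar_files/999.tar.gz 999/subcntlfile prints the path in whichever form that archive expects, and the result can be passed to tar in place of the hard-coded path.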
 
Thanks for the help, mikrom.
I had to make one small change to get it to work: adding ./ in front of the path of the file to extract.
As written, the code created the data sub-folder and put report0000.data into it, so I modified it a bit to get rid of the sub-folder and keep both files in the same place.
I am not sure this is the most elegant way to suppress the sub-folder, but when I tried the --strip-components option the file ended up in a higher-level folder, never in the same folder as subcntlfile.
It works, though. I can now automate zipping up the new folders and FTP them to another machine where the files will be processed.

Code:
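#!/bin/sh

# source directory
src_dir=./tar_files
# destination directory
dst_dir=./tar_unpacked

for file in "$src_dir"/*.tar.gz
do
  printf "* extracting data from %s :\n" "$file"
  fname=$(basename "$file")
  # archive name without the .tar.gz suffix, e.g. 999
  name=${fname%.tar.gz}
  tar -C "$dst_dir" -zxf "$file" "./$name/data/report0000.data"
  # flatten: move the extracted file up into the archive's folder, then
  # remove the leftover sub-folders; the subshell leaves the loop's
  # working directory alone, so no "cd ../.." is needed
  (
    cd "$dst_dir/$name" || exit 1
    find . -type f -exec mv -i {} . \;
    find * -type d -prune -exec rm -rf {} \;
  )
  tar xvzf "$file" -C "$dst_dir" "./$name/subcntlfile"
  printf "done.\n\n"
done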
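
For comparison, here is a sketch of how --strip-components could put both files in the same folder without the find/mv cleanup. The option only removes leading path components from each member name; the file is still written relative to the -C directory, so pointing -C at a per-archive folder (created first) keeps the two files together. This assumes GNU tar and member names stored with a leading ./ as in the commands earlier in the thread. The function name extract_pair and the SRC_DIR/DST_DIR overrides are illustrative only.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar and members stored as ./NAME/...

src_dir=${SRC_DIR:-./tar_files}      # where the *.tar.gz files live
dst_dir=${DST_DIR:-./tar_unpacked}   # where the per-archive folders go

# extract_pair ARCHIVE DEST: put both files directly into DEST/NAME/
extract_pair() {
  name=$(basename "$1" .tar.gz)
  out="$2/$name"
  mkdir -p "$out" || return 1
  # ./NAME/data/report0000.data has 3 leading components to drop,
  # ./NAME/subcntlfile has 2 (the "." counts as one)
  tar -xzf "$1" -C "$out" --strip-components=3 "./$name/data/report0000.data" &&
  tar -xzf "$1" -C "$out" --strip-components=2 "./$name/subcntlfile"
}

for file in "$src_dir"/*.tar.gz
do
  [ -f "$file" ] || continue         # nothing matched the pattern
  printf '* extracting data from %s :\n' "$file"
  extract_pair "$file" "$dst_dir" && printf 'done.\n\n'
done
```

With 999.tar.gz in ./tar_files, this should leave subcntlfile and report0000.data side by side in ./tar_unpacked/999.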

Thanks again.
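
The zipping step mentioned above might look roughly like this. It is only a sketch: it assumes the Info-ZIP zip command is already available on the machine and that each folder holds the two files side by side. zip_folder and DST_DIR are hypothetical names.

```shell
#!/bin/sh
# Sketch only: assumes Info-ZIP "zip" is installed and each folder
# under the destination directory contains both extracted files.

dst_dir=${DST_DIR:-./tar_unpacked}

# zip_folder DIR: write DIR.zip next to DIR with the two files at top level
zip_folder() {
  dir=${1%/}                         # drop a trailing slash, if any
  # -j stores just the file names (no folder paths), -q keeps it quiet
  zip -j -q "$dir.zip" "$dir/subcntlfile" "$dir/report0000.data"
}

for dir in "$dst_dir"/*/
do
  [ -d "$dir" ] || continue          # nothing matched the pattern
  zip_folder "$dir" && printf 'wrote %s.zip\n' "${dir%/}"
done
```

Each resulting file, such as ./tar_unpacked/999.zip, then holds subcntlfile and report0000.data with no internal path names.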
