redapplesonly
Technical User
Hi everyone,
I'm a relative beginner to writing UNIX scripts. In the past, I've been able to hack together simple scripts. Now I need a script which is a little more complex than I'm used to, and I really need help. I'm up against a tight deadline and am growing desperate, as I can't seem to find a solution either on the web or in my UNIX programming book.
Here's the problem: I'm on a SunOS system. On the machine, I have a large number of files scattered across a vast directory structure. I have to copy all those target files into my home directory. Luckily, the directory structure is well-organized. It looks like this:
/root/projects/*ARCHIVE*/date/*SUBARCHIVE*/output/*SUBSUBARCHIVE*/The_Files_I_Need
The directories in lowercase have constant names - I don't have to worry about them ever changing. But the directories I've named with *ALLCAPS* do change names. Think of them as wild cards.
Put another way: If I wanted to exhaustively list every directory, the top tier would look like this:
/root/projects/PROJECT01/
/root/projects/PROJECT02/
/root/projects/PROJECT03/
/root/projects/PROJECT04/
...
/root/projects/PROJECT20/
The next tier would look like this:
/root/projects/PROJECT01/date/JAN2000/
/root/projects/PROJECT01/date/FEB2000/
/root/projects/PROJECT01/date/MAR2000/
...
/root/projects/PROJECT01/date/MAY2010/
/root/projects/PROJECT02/date/JAN1995/
...
/root/projects/PROJECT20/date/DEC2005/
And the next tier would look like this:
/root/projects/PROJECT01/date/JAN2000/output/SAMPLE0001/
/root/projects/PROJECT01/date/JAN2000/output/SAMPLE0002/
/root/projects/PROJECT01/date/JAN2000/output/SAMPLE0003/
...
/root/projects/PROJECT01/date/JAN2000/output/SAMPLE1328/
...
/root/projects/PROJECT20/date/DEC1995/output/SAMPLE483822/
And so on. The target files I ultimately need to read are in those final directories.
/root/projects/PROJECT01/date/JAN2000/output/SAMPLE0001/TARGET_102932
/root/projects/PROJECT01/date/JAN2000/output/SAMPLE0001/TARGET_32323
/root/projects/PROJECT01/date/JAN2000/output/SAMPLE0001/TARGET_32999293
...
There are literally thousands of these target files, all with dynamic names.
So the problem I'm having is that I can't just do a "cp /root/projects/*/date/*/output/*/*" because the pathnames become too long. I can't hardwire the directory names because I don't know them all, and there are obviously too many of them. I've been experimenting with code, but my results have been frankly pitiful. I'm sure there's some way of doing this as a loop-within-a-loop-within-a-loop... but I can't figure out how to do it.
Here's the quasi-code I've been trying to get to work:
==============================================================================
#!/bin/bash
# create tmp directory into which I'll copy the files
mkdir ${HOME}/TMP
# jump into first common directory, start to drill down
cd /root/projects
for i in PROJECT01 PROJECT02 PROJECT03 (...) PROJECT20
do
    cd $i/date
    ls > SUBARCHIVE_LIST #how to dynamically store the *SUBARCHIVE* values?
    for j in SUBARCHIVE_LIST
    do
        cd $j/output
        ls > SUBSUBARCHIVE_LIST #same problem here!
        for k in SUBSUBARCHIVE_LIST
        do
            cd $k
            cp * ${HOME}/TMP #here I copy the files
        done
    done
done
==============================================================================
Can anyone help? I hope so! I'm hoping this is a relatively easy problem for you experienced folks.
PS - sorry for the very long text; I try to be precise
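For reference, here is a minimal sketch of how the loop-within-a-loop idea above could be written so that the shell expands each wildcard tier itself, instead of storing `ls` output in list files. It assumes bash (as in the shebang above) and that target file names are unique across directories, since files with the same name would overwrite each other in the destination. The function name `copy_targets` and the variable names are made up for illustration.

```shell
#!/bin/bash
# copy_targets SRC_ROOT DEST   (hypothetical helper, not from the original post)
# Copies every regular file found in SRC_ROOT/*/date/*/output/*/ into DEST.
copy_targets() {
    src_root=$1
    dest=$2
    mkdir -p "$dest" || return 1

    # Each glob is expanded one tier at a time when its loop starts, and cp is
    # run once per file, so no single command line holds the whole file list.
    # The trailing slash in each pattern restricts matches to directories.
    for project in "$src_root"/*/; do
        for month in "$project"date/*/; do
            for sample in "$month"output/*/; do
                # An unmatched glob is left as a literal pattern; skip it.
                [ -d "$sample" ] || continue
                for f in "$sample"*; do
                    if [ -f "$f" ]; then
                        cp "$f" "$dest"/
                    fi
                done
            done
        done
    done
}
```

With the layout described above, it would be called as `copy_targets /root/projects "$HOME/TMP"`. Because the loops use full paths rather than `cd`, there is no need to climb back up the tree after each directory.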