threads and file copying

Status
Not open for further replies.

fenris

Programmer
May 20, 1999
824
CA
I am learning about threads, and currently I can't see how it would be advantageous to use threads when copying multiple files from one location to another. I have a class that takes two arguments, the source file and the destination. This class works by doing a byte copy from the original to the copy. It works fine, but I am wondering if there is a way to get the O/S to do this. Is there a facility in Java that allows the creation of O/S objects that have some of the basic properties supported by the O/S yet still remain platform independent? As far as I know, every O/S has facilities to move and copy files around. I know that one can use Java's shell feature, but this would be platform specific.

As far as threads go, how would I implement a simple class that uses, say, two threads to copy multiple files, say 100? The only thing I can think of is to pass the first 50 to one thread and the rest to the other one.

If I am reading things correctly about threads, in order for one thread to be active, all the rest must be inactive. How would this apply to file I/O?

puzzled,

fenris
fenris@hotmail.com
 
Well, at the precise moment of action, only one thread can be run by the processor; but if all threads had to wait for others to finish before they could begin, that would defeat the purpose of threads.

I think the only way that Java can perform platform-independent I/O is through the standard classes you've already seen. Are there any Swing I/O classes? I still have not gotten a chance to even browse the Swing documentation.

Create a class that extends Thread (or implements Runnable)... this will do the I/O. You should be able to setDaemon(true), I think. From your main method outside of that class, have 2 arrays of filenames (or something similar, depending on your program demands): one for the input sources and another for the output destinations. For each item in the array, start a new thread with the specified source/destination info. You should be able to have 100 threads all doing I/O at the same time. If you run into problems, let me know :o)

Liam Morley
lmorley@wpi.edu
"light the deep, and bring silence to the world.
light the world, and bring depth to the silence."
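
A minimal sketch of the one-thread-per-file idea described above, for readers who want to try it. It assumes a newer Java (7 or later) where `java.nio.file.Files.copy` is available, which also gives a platform-independent copy without a hand-written byte loop. The class name `ThreadPerFileCopy` and the method `copyAll` are made up for illustration. The threads are left as non-daemon here, since daemon threads do not keep the JVM alive and a copy could be cut short if the main method returned first.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: copies sources[i] to destinations[i], one thread per file.
final class ThreadPerFileCopy {

    static void copyAll(String[] sources, String[] destinations) throws InterruptedException {
        if (sources.length != destinations.length) {
            throw new IllegalArgumentException("source and destination arrays differ in length");
        }
        Thread[] workers = new Thread[sources.length];
        for (int i = 0; i < sources.length; i++) {
            final Path from = Paths.get(sources[i]);
            final Path to = Paths.get(destinations[i]);
            workers[i] = new Thread(() -> {
                try {
                    // Files.copy delegates the copy to the file system provider.
                    Files.copy(from, to, StandardCopyOption.REPLACE_EXISTING);
                } catch (IOException e) {
                    System.err.println("copy failed for " + from + ": " + e.getMessage());
                }
            });
            workers[i].start();
        }
        // Wait until every copy has finished before returning.
        for (Thread worker : workers) {
            worker.join();
        }
    }
}
```

Calling `ThreadPerFileCopy.copyAll(inputs, outputs)` with two arrays of equal length starts all copies at once and returns when the last one is done.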
 
Thanks Liam,

I just wanted to start with two threads, just to see how it worked. If I got that working, then I would try more and see what kind of demands that placed on the system. Thanks

fenris
 
That would be slightly more difficult, but not impossible. Pass the entire array to both threads. When a thread processes a certain index of the array, flag that index somehow, so the other thread knows it's already been taken (the flags would have to live in a variable shared by both threads). Then keep processing until there's nothing left unflagged.

That's better than splitting it up by number of files, since not every file will take the same amount of time.

That's not the easiest thing to implement, but it's possible, and it sounds like a fun project.

Liam Morley
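
The flag-the-index idea above could look roughly like this. It is only a sketch: the class `FlaggedWorkList` and its methods are invented for illustration, and the actual file copy is replaced by a placeholder that records which job was handled. The important part is that checking a flag and setting it happen inside one `synchronized` method, so two threads can never claim the same index.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: workers share one array of jobs and flag each index as they claim it.
final class FlaggedWorkList {
    private final String[] jobs;
    private final boolean[] taken; // taken[i] is true once some worker has claimed jobs[i]

    FlaggedWorkList(String[] jobs) {
        this.jobs = jobs;
        this.taken = new boolean[jobs.length];
    }

    // Returns the next unflagged index and flags it, or -1 when nothing is left.
    synchronized int claimNext() {
        for (int i = 0; i < taken.length; i++) {
            if (!taken[i]) {
                taken[i] = true;
                return i;
            }
        }
        return -1;
    }

    // Runs workerCount threads over the shared array and returns the jobs they handled.
    static List<String> processAll(String[] jobs, int workerCount) throws InterruptedException {
        FlaggedWorkList list = new FlaggedWorkList(jobs);
        List<String> done = Collections.synchronizedList(new ArrayList<>());
        Thread[] workers = new Thread[workerCount];
        for (int w = 0; w < workerCount; w++) {
            workers[w] = new Thread(() -> {
                int i;
                while ((i = list.claimNext()) != -1) {
                    // Placeholder for the real work, e.g. copying the file named jobs[i].
                    done.add(list.jobs[i]);
                }
            });
            workers[w].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return done;
    }
}
```

Because each thread simply asks for the next free index, a thread that draws small files ends up handling more of them, which is the advantage over a fixed 50/50 split.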
 
Thanks for the information. I have been doing some reading on threads, and from what I understand, if there are too many threads active, it can actually decrease performance. That is why I wanted to start with two threads and go from there. Hopefully the class I create will allow an input for the number of threads involved in copying.

fenris
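
One way to make the number of threads an input, sketched with the `java.util.concurrent` executor classes that were added to Java later (Java 5). The class `PooledRunner` and the one-hour timeout are assumptions chosen for illustration; each `Runnable` stands in for one file copy.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: run a list of tasks on a caller-chosen number of threads.
final class PooledRunner {

    // Runs every task on a pool of threadCount threads and waits for them to finish.
    static void runAll(List<Runnable> tasks, int threadCount) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        for (Runnable task : tasks) {
            pool.execute(task);
        }
        pool.shutdown(); // accept no new tasks; tasks already queued still run
        if (!pool.awaitTermination(1, TimeUnit.HOURS)) {
            pool.shutdownNow();
            throw new IllegalStateException("tasks did not finish in time");
        }
    }
}
```

Running the same task list with `threadCount` set to 1, 2, 4 and so on is an easy way to see where extra threads stop helping.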
 
fenris,

Sounds like you are researching thread development. That's cool. In practical application, performance will only be enhanced if symmetrical processors are available to share the work concurrently. Functionality enhancements can be attained using multiple threads even in single-processor environments or when the number of concurrent threads is higher than the number of processors.

When long processes are going to run in a UI environment, threads can be used to perform the work while allowing the user to interact with the user interface thread.

Another reason for using threads is to handle multiple requests in a non-linear fashion, as in the case of a web server, chat server, etc.

"But, that's just my opinion... I could be wrong".
-pete
 
I thought that in the case of file copying, threads might help speed things up, since disk I/O is rather slow and most hard drives can easily handle two or three copies without a significant decrease in performance. I have read four or five chapters from various books, numerous articles and such. It reminds me of when I first started to learn Java: I was completely lost :) I can do the copying with one thread :) So I just have to figure out how to get the other thread involved.

I really appreciate the comments and suggestions.

Thanks,

fenris
 
Wait... so you're not going to take a look at my idea of passing the array to both threads and flagging indexes as they are started? I think that was a great idea... and it pangs me to see it go to waste ;o) It wouldn't be hard to implement...

Or have you thought of something else that fits your needs better?

Liam Morley
 
I will look at your idea when I better understand how threads work. Right now it's as if I am learning Java all over again ;) Every other language that I have used had no facility for multiple threads, and it will take time to learn how to use them effectively. Thanks for the help though.

fenris
 
fenris,

I thought of a better solution. The old one would need one array of String for input files, another array of String for output files, and then maybe a Vector of Integer for keeping track of indexes.

Get rid of all that and just have one Hashtable. The key will be a String representing the input file, and the value will be the corresponding output file. If you create an enumeration loop, each thread will remove the key/value pair it's currently working on. Then the next thread won't go looking for that one, as it has already been removed. (You could also have it be a Hashtable with File types.) One of the bigger benefits of this is that you're not adding to memory, you're releasing it. With two arrays and a Vector, you're constantly adding to the Vector, using more memory. With one Hashtable, you are constantly removing items, releasing memory. Sounds better to me.

Have this in your run method:

    for (Enumeration e = fileContainer.keys(); e.hasMoreElements();) {
        String inputFile = e.nextElement().toString();
        String outputFile = fileContainer.remove(inputFile).toString();
        // perform file copying
    }

I'm not sure if the enumeration is automatically updated as keys are removed from the hashtable. You might need this instead (but I'm not sure):

    for (Enumeration e = fileContainer.keys(); e.hasMoreElements();) {
        String inputFile = e.nextElement().toString();
        String outputFile = null;
        try {
            // remove() returns null if another thread already took this key,
            // so toString() throws and we move on to the next key
            outputFile = fileContainer.remove(inputFile).toString();
        } catch (NullPointerException npe) {
            continue;
        }
        // perform file copying
    }

I think that would work. You'd need to create your Hashtable variable outside of everything; either that, or you'd have to pass it in to the constructor for your threads.

Also, you can create as many threads as you'd like; the number is completely arbitrary. You might also want to have a timer that starts right before the first thread starts and ends right after the last thread dies. This could tell you, to some degree, the relationship between the number of threads in action and the time it takes to complete the file copying.

Liam Morley
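
A hedged variation on the Hashtable idea above. The `Hashtable` documentation says the result of enumerating is undefined if the table is structurally modified after the enumeration is created, so this sketch does not hold one enumeration open while other threads remove entries. Instead each worker takes a fresh first key and removes it inside a `synchronized` block on the table. The names `HashtableWorkQueue`, `takeOne` and `drain` are invented for illustration, and the copy itself is a placeholder.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Hashtable;
import java.util.List;

// Hypothetical sketch: a Hashtable of source -> destination used as a shared work queue.
final class HashtableWorkQueue {

    // Removes one entry and returns it as {source, destination}, or null when the table is empty.
    static String[] takeOne(Hashtable<String, String> files) {
        // Hashtable's own methods lock the table object, so this block makes
        // "pick a key, then remove it" one atomic step for all workers.
        synchronized (files) {
            if (files.isEmpty()) {
                return null;
            }
            String source = files.keys().nextElement();
            String destination = files.remove(source);
            return new String[] { source, destination };
        }
    }

    // Starts workerCount threads that drain the table; returns a log of what was handled.
    static List<String> drain(Hashtable<String, String> files, int workerCount)
            throws InterruptedException {
        List<String> handled = Collections.synchronizedList(new ArrayList<>());
        Thread[] workers = new Thread[workerCount];
        for (int w = 0; w < workerCount; w++) {
            workers[w] = new Thread(() -> {
                String[] pair;
                while ((pair = takeOne(files)) != null) {
                    // Placeholder for the real copy of pair[0] to pair[1].
                    handled.add(pair[0] + " -> " + pair[1]);
                }
            });
            workers[w].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return handled;
    }
}
```

The table shrinks as work is claimed, as suggested above, and no NullPointerException handling is needed because a key is only ever seen by the thread that removes it.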
 
Liam, thank you for your in-depth response. Now I have to learn about hash tables as well as threads ;-) I have started on the first idea that you gave me. When I get it working and understand how it works, I will try the second method with the hashtable. Thanks for your help, I really appreciate it.

fenris
 
Well, it's a fun problem :o) I don't get to deal with threads that much... but I've been dealing with hashtables more than ever lately (usually hashtables of hashtables, just crazy), so I think in terms of them. They're pretty useful in certain situations... anyways, best of luck, and if you wouldn't mind sending me the code when you're all done, I'm interested to see how it turns out :o)

Liam Morley
 
I'll send you a copy of the code when I am done, Liam. Thanks for the insight. Might I ask what you are using hashtables for? In my wanderings through the Java literature I have come across them, but I didn't really get into them.

Thanks....

fenris
 
Well, I'm using them for a few different things right now. Here's one use. I'm trying to compare two different sets of data. These are in text files, but the data is sorted differently in the two documents, and sorting would be a pain... and as I don't want to see two identical documents (I just want to see the errors), the errors are all I'm returning. As I go through the data, I put it into two hashtables. Then, after I've gone through all the data, I go through one hashtable, removing each item from both and testing whether the two values are equal. Seeing as how myHash.remove(data) will remove the data at any point if it exists, the hashtables don't have to be sorted in the same manner.

If the second hashtable doesn't have a particular item, remove returns null, and using that result generates a NullPointerException. That null pointer is an error, and I print it out (along with incrementing an error counter).

After I've gone through the first hashtable, I go through the second one. Seeing as how I've been removing data all this time, I should theoretically have nothing in the second table if there are no errors. If there's anything in the second hashtable, I print it out and report the error. Hashtables are the ideal container for this because I'm working with name/value pairs.

Another use is that I'm instantiating objects from reading a table and returning the final product as a Vector. I read in the data and plug it into a hashtable for easy reading. Each key represents the name of the attribute to set, with a corresponding value.

Those are just two uses... hashtables can make life easy :o)

Liam Morley
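
The comparison described above might be sketched like this. It is a guess at the shape of the approach, not Liam's actual code: the class `MapDiff`, the message texts, and the use of the `Map` interface (which `Hashtable` implements) are all assumptions, and values are assumed to be non-null. A null check on the result of `remove` takes the place of catching the NullPointerException.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: report only the differences between two sets of name/value pairs.
final class MapDiff {

    // Returns one message per difference; an empty list means the data sets match.
    // Both maps are emptied as a side effect, as in the approach described above.
    static List<String> diff(Map<String, String> first, Map<String, String> second) {
        List<String> errors = new ArrayList<>();
        // Walk a copy of the keys so entries can be removed from the map while looping.
        for (String key : new ArrayList<>(first.keySet())) {
            String a = first.remove(key);
            String b = second.remove(key); // null when the second set lacks this key
            if (b == null) {
                errors.add("missing in second: " + key);
            } else if (!a.equals(b)) {
                errors.add("value differs for: " + key);
            }
        }
        // Anything still in the second map never appeared in the first.
        for (String key : new ArrayList<>(second.keySet())) {
            second.remove(key);
            errors.add("missing in first: " + key);
        }
        return errors;
    }
}
```

Since lookups are by key, the two inputs can arrive in any order, which is why no sorting is needed.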
 
Thanks for the explanation, Liam, I appreciate it.

fenris
 
I finally got around to writing the base class of the threaded file copying utility. I haven't implemented the above suggestions regarding copying an array of files from one location to another. I am more concerned right now with evaluating the performance of the threaded copy class versus the non-threaded one.

    //=================================================
    import java.io.IOException;

    public class ThreadedCopy extends Thread
    {
        // An array of two elements: the source string and the destination string,
        // e.g. "c:\this.txt" "c:\backup\this.txt"
        private String[] fileToCopy;

        public ThreadedCopy(String[] file) { fileToCopy = file; }

        public void run()
        {
            long startTime;   // Starting time of this thread's work, in milliseconds.
            long endTime;     // Time when the copy is done, in milliseconds.
            double time;      // Time difference, in seconds.

            startTime = System.currentTimeMillis();

            try {
                new CopyOneFile(fileToCopy); // Class that copies one file from source to destination
            }
            catch (IOException ioe) {
                // Report the failure instead of silently ignoring it
                System.out.println("copy failed for " + fileToCopy[0] + ": " + ioe);
            }
            try {
                sleep(10); // sleep for 10 milliseconds
            } catch (InterruptedException e) {}

            endTime = System.currentTimeMillis();
            time = (endTime - startTime) / 1000.0;
            System.out.println("the Run time in seconds was:  " + time + " for " + fileToCopy[0]);
        }
    }
    //============================

I am not sure whether that is the proper way to time this sort of thing.

If this is the proper way, then it works well! I tested it against a class that copies three files (8MB each) one after the other and obtained a time of 17.5s.
The threaded version doing the same three files achieved times of less than 12s (the lowest time of five runs). Any advice on how to improve the class would be appreciated...

Troy Williams B.Eng.
fenris@hotmail.com
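
On the timing question, one possible approach (along the lines of the timer suggested earlier in the thread) is to time the whole batch from outside the threads: start the clock before the first thread starts and stop it after the last one has been joined. The sketch below assumes nothing about `CopyOneFile`; the class `CopyTimer` and its methods are made up, and each `Runnable` stands in for one file copy. It uses `System.nanoTime()`, which is meant for measuring elapsed time.

```java
// Hypothetical sketch: wall-clock timing of a batch of tasks, threaded versus one after another.
final class CopyTimer {

    // Starts one thread per task, waits for all of them, and returns the elapsed seconds.
    static double timeConcurrently(Runnable[] tasks) throws InterruptedException {
        Thread[] threads = new Thread[tasks.length];
        long start = System.nanoTime();
        for (int i = 0; i < tasks.length; i++) {
            threads[i] = new Thread(tasks[i]);
            threads[i].start();
        }
        for (Thread thread : threads) {
            thread.join(); // the batch is done only when the slowest thread is done
        }
        return (System.nanoTime() - start) / 1e9;
    }

    // Runs the tasks one after the other on the calling thread and returns the elapsed seconds.
    static double timeSequentially(Runnable[] tasks) {
        long start = System.nanoTime();
        for (Runnable task : tasks) {
            task.run();
        }
        return (System.nanoTime() - start) / 1e9;
    }
}
```

Timing inside each `run` method, as in the class above, reports how long each individual copy took while competing with the others; timing around the whole batch gives the number that is directly comparable with the 17.5s sequential figure. Repeating each measurement several times, as was done with the five runs, helps even out disk caching effects.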
 