Java code for recursive directory zipping, especially useful for Condor

What is this about?

When you need a lot of computing power, it often pays to use distributed grid computing. Condor is very handy software for this, but unfortunately it only transfers back result files that sit in the job's base working directory. Since it is often much more convenient to put results in subdirectories, I ran into the problem of how to get those directories back together with my results. You could write a shell script to compress all directories, but that means losing Condor's direct Java support. Luckily Java offers the zip capabilities we need, so below you will find general-purpose Java source code that zips all subdirectories into one file.

The recursive Java directory zipping code

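The code should work with Java 1.4 and newer. Add import java.io.*; and import java.util.zip.*; to your .java file!
Note that this function works on both Windows and Linux machines, but always writes Linux-compatible archives, i.e. entry names with "/" as the separator. The ZIP format specification itself asks for forward slashes, so this default is normally what you want. If you really need backslash-separated entry names, change "replace('\\','/')" to "replace('/','\\')" (occurs two times)!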
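
The separator handling can be tried out in isolation. The following is only a sketch: EntryNameDemo and entryName are hypothetical names that do not appear in the zip code below, which does the same replacement inline.

```java
// Hypothetical helper for illustration only; the zip code on this page
// performs the same replacement inline.
class EntryNameDemo {
    // Builds the name stored in the archive: always "/" as separator.
    static String entryName(String dir, String fileName) {
        return (dir + fileName).replace('\\', '/');
    }

    public static void main(String[] args) {
        // A prefix as it would be built on a Windows machine
        System.out.println(entryName(".\\results1\\", "out.txt"));
        // A prefix as it would be built on a Linux machine
        System.out.println(entryName("./results1/", "out.txt"));
    }
}
```

Both calls print ./results1/out.txt, so archives written on either kind of machine end up with the same entry names.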

//recursive function to add a directory with all sub-directories to zip z
//output format (in the zip) is linux compatible, i.e. with "/" separating dirs
//while the input might be from windows machines, as might happen in mixed
//condor networks
public static void zip(File x,String Dir,ZipOutputStream z){
try{
if(!x.exists()){
  System.err.println("file not found: "+Dir+x.getName());
  return; //nothing to add
  }
if(!x.isDirectory()){
 z.putNextEntry(new ZipEntry((Dir+x.getName()).replace('\\','/')));
 FileInputStream y=new FileInputStream(x);
 try{
  //copy in chunks: a single read() is not guaranteed to return the
  //whole file, and large files need not fit into memory at once
  byte[] a=new byte[8192];
  int did;
  while((did=y.read(a))!=-1)
    z.write(a,0,did);
  }finally{
  y.close(); //close the input even if copying failed
  }
 z.closeEntry();
 }
else  //recurse
 {
 String nnn=Dir+x.getName()+File.separator;
 //a directory is stored as an empty entry whose name ends with "/"
 z.putNextEntry(new ZipEntry(nnn.replace('\\','/')));
 z.closeEntry();
 String[] dirlist=x.list();
 if(dirlist==null) return; //directory could not be read
 for(int i=0;i<dirlist.length;i++){
   zip(new File(x,dirlist[i]),nnn,z);
   }
 }
}catch(Exception e){System.err.println("Error in zip-Method!!"+e);}
}

//only creates the zip and initiates the recursive zipping
//here everything in the current directory whose name begins with
//dirsStartingWith is included
public static void zipAll(String dirsStartingWith, String name){
try{
File here=new File(".");
String[] dirlist=here.list();
ZipOutputStream z=new ZipOutputStream(new FileOutputStream(name));
try{
 for(int i=0;dirlist!=null&&i<dirlist.length;i++)
  if(dirlist[i].startsWith(dirsStartingWith))
   zip(new File("."+File.separator+dirlist[i]),"."+File.separator,z);
 }finally{
 z.close(); //also closes the underlying file, even after an error
 }
}catch(Exception e){System.err.println("Error in zipAll-Method!!"+e);}
}

Call this, for example, with zipAll("results","res"+args[0]+".zip"); to put all result directories (those whose names begin with the string "results") into an archive named resX.zip, where X could be the process number passed by Condor, using a Condor script like the following:

Example Condor script

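For a jar archive with your code called GRN_clocks.jar, split into 160 jobs that receive the process numbers 0 to 159 as their argument (the first argument, main, is the name of the class containing the main method):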

universe       = Java
executable     = GRN_clocks.jar
jar_files      = GRN_clocks.jar
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
output         = test.out.part$(Process)
error          = test.err.part$(Process)
log            = condor.log
notification = Never
image_size = 60000
Rank = JavaMFlops
# NiceUser = True
Hold = False
arguments = main $(Process)
queue 160
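
In the Java universe the first word of the arguments line names the class containing the main method, and the remaining words arrive in args. A minimal sketch of the receiving side follows; JobEntry and archiveName are hypothetical names chosen for illustration (with the script above, the class would have to be called main), and the fallback label "local" is likewise an assumption.

```java
// Sketch only: JobEntry, archiveName and the "local" fallback are
// hypothetical and not part of the original code.
class JobEntry {
    // Derives the per-job archive name from the arguments Condor passes.
    static String archiveName(String[] args) {
        // Fall back to a fixed label when run by hand without arguments
        String id = (args.length > 0) ? args[0] : "local";
        return "res" + id + ".zip";
    }

    public static void main(String[] args) {
        // ... run the computation, writing into directories named results* ...
        String name = archiveName(args);
        System.out.println("archive for this job: " + name);
        // zipAll("results", name);  // zipAll as defined earlier on this page
    }
}
```

With queue 160, $(Process) runs from 0 to 159, so the jobs would produce res0.zip up to res159.zip.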

Recreate directory structure

Finally, to reassemble your result directory structure I recommend a simple script like this:

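#!/bin/bash
# unpack res0.zip ... res159.zip; delete an archive only if it unpacked cleanly
for((i=0;i<160;i++)); do
  unzip "res$i.zip" && rm "res$i.zip"
done
# remove Condor's per-job stdout/stderr files
rm test.out.part* test.err.part*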
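
On a machine without unzip (for example a Windows box in a mixed network), the same reassembly could be done in Java. This is a minimal sketch, assuming Java 7 or newer; Unpack and its method names are hypothetical, and unlike the shell script it does not delete anything afterwards.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Sketch only: Unpack and its method names are hypothetical.
class Unpack {
    // Extracts one archive into targetDir, recreating subdirectories.
    static void unzip(File archive, File targetDir) throws IOException {
        String root = targetDir.getCanonicalPath() + File.separator;
        try (ZipInputStream in = new ZipInputStream(new FileInputStream(archive))) {
            ZipEntry e;
            byte[] buf = new byte[8192];
            while ((e = in.getNextEntry()) != null) {
                File out = new File(targetDir, e.getName());
                // Refuse entries that would land outside targetDir
                if (!(out.getCanonicalPath() + File.separator).startsWith(root)) {
                    throw new IOException("entry outside target: " + e.getName());
                }
                if (e.isDirectory()) {
                    out.mkdirs();
                    continue;
                }
                File parent = out.getParentFile();
                if (parent != null) parent.mkdirs();
                try (OutputStream os = new FileOutputStream(out)) {
                    int n;
                    while ((n = in.read(buf)) != -1) os.write(buf, 0, n);
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Mirrors the shell loop: res0.zip ... res159.zip in the current directory
        for (int i = 0; i < 160; i++) {
            File archive = new File("res" + i + ".zip");
            if (archive.isFile()) {
                unzip(archive, new File("."));
            } else {
                System.err.println("missing " + archive.getName());
            }
        }
    }
}
```

Reporting missing archives instead of stopping makes it easy to see which of the 160 jobs did not deliver a result.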


I do not guarantee the correctness or adequacy of the data and information given on this site and thus accept no responsibility for your use of it.
Johannes Knabe (jknabe@panmental.de)
v0.1 St Albans, United Kingdom, 16.01.2006 (2006/01/16)
My Homepage is http://panmental.de