Strange characters in directory names crash my script

Status
Not open for further replies.

kmcferrin

MIS
Jul 14, 2003
2,938
US
I have written the script attached at the bottom of this post. It crawls a directory tree on a specified filesystem and, based on criteria supplied on the command line, creates a text file listing the full path of every file that has not been accessed for a certain period of time. We then pass this text file as an argument to a backup application that makes archive copies of the files and purges them from the disk to free up space. The filesystem is currently about 750 GB in size.

At any rate, when I run the script everything appears to work fine from a logical perspective, but when it hits certain directories or files it crashes with an error. Those files and directories all have unusual characters in their names when viewed in Explorer, where the characters look like little boxes. When the script echoes the file or directory name to the screen it displays the characters as question marks, but it errors out when it tries to write the name to a text file. I tried posting the characters in this message, and they came out looking like "??????".

So now I'm thinking that they must be Japanese, and that someone from our Tokyo office is putting the files there. I suspect the issue is that my script can't handle foreign characters, or possibly Unicode. The Wscript.Echo command just substitutes question marks for the characters, but the write to the text file fails outright. Is it possible to change the format of the text file so that it accepts Unicode/foreign characters?

Any suggestions?

Code:
'  Generate_Archive_List.vbs
'
'  Script that iterates through a directory tree to locate all files that have
'  a 'last accessed' date older than a certain date (or age in days).  It then
'  logs the files and their full path to an output file of your choosing.
'
'  DIRECTIONS:  
'  Use with command line arguments.
'  /path: followed by the full directory path to the target directory
'  /age: followed by the age of the files in days
'  /date: followed by the cutoff date in the dd/mm/yyyy format
'  /output: followed by the path and name of the output file (i.e., c:\output.txt)
'         if /output: is not specified it will default to c:\output.txt

Option Explicit

Dim strTargetDir, strErrorFile, strOutputFile, intCutoffAge, intOldLogExists, objFSO, objFile, strNotificationMessage, colNamedArguments, _
    strTargetDate, intFileCount, intFileSizeCount, intFileSize
                
Const ForReading = 1
Const ForWriting = 2
Const ForAppending = 8
Const OverwriteExisting = True
intFileCount = 0
intFileSizeCount = 0
intFileSize = 0

Set colNamedArguments = WScript.Arguments.Named

If colNamedArguments.Exists("output") Then
    strOutputFile = colNamedArguments.Item("output")
Else
    strOutputFile = "c:\output.txt"
End If

If colNamedArguments.Exists("path") Then
    strTargetDir = colNamedArguments.Item("path")
Else
    Wscript.Echo "Please specify a target directory via the /path: command line argument."
    Wscript.Quit 1
End If

If colNamedArguments.Exists("date") and colNamedArguments.Exists("age") Then
    Wscript.Echo "You cannot use the /date: and /age: command line arguments at the same time.  Please use one or the other, but not both."
    Wscript.Quit 1
Else
End If

If colNamedArguments.Exists("date") Then
    strTargetDate = CDate(colNamedArguments.Item("date"))
    intCutoffAge = DateDiff("d", strTargetDate, Now)
Else
End If

If colNamedArguments.Exists("age") Then
    intCutoffAge = CInt(colNamedArguments.Item("age"))
Else
End If

If intCutoffAge = "" and strTargetDate = "" Then
    Wscript.Echo "Please specify either a cutoff date or age in days using the /date: or /age: command line argument."
    Wscript.Quit 1
Else
End If



Set objFSO = CreateObject("Scripting.FileSystemObject")

' Checks for old output file and deletes it, then creates a new one
If objFSO.FileExists(strOutputFile) Then
    objFSO.DeleteFile strOutputFile
Else
End If
Set objFile = objFSO.CreateTextFile(strOutputFile) 
objFile.Close

Wscript.Echo "Directory Tree Crawler started at " & Now
Wscript.Echo " Hit CTRL-C to exit."
Wscript.Echo "Target:   " & strTargetDir 
Wscript.Echo "Date:     " & strTargetDate
Wscript.Echo "Age:      " & intCutoffAge

CrawlTree strTargetDir
intFileSizeCount = FormatNumber(intFileSizeCount / 1024, 2)
Wscript.Echo intFilecount & " files at " & intFileSizeCount & " kilobytes meet the selected criteria."

Sub CrawlTree(strTargetDir)
    Dim objFolder, arrFolders, objFiles, Item, Item2
    Set objFolder=objFSO.GetFolder(strTargetDir)
    Set arrFolders=objFolder.SubFolders
    Set objFiles=objFolder.Files

    ' Get all sub-folders in this folder
    For Each Item In arrFolders
        If Right(Item, 3) = "-NP" Then
            Wscript.Echo Now & " -- " & item & " ends with -NP.  Skipping directory tree."
        Else
            If Len(Item) < 256 Then
                CrawlTree(item)
            Else
                Wscript.Echo Now & " -- " & item & " is deeper than 256 characters (" & Len(Item) & " characters).  Skipping directory tree."
            End If
        End If
    Next
    Item2=0

    ' Scan the files collection, find files older than the cutoff age, and add them to the output.
    For Each Item2 In objFiles
        Dim strAccessDate, strCreatedate, objFileName, intDaysOld
        Set objFileName = objFSO.GetFile(Item2)
        strAccessDate = objFileName.DateLastAccessed
        intDaysOld = DateDiff("d", strAccessDate, Now)
        If intDaysOld > intCutoffAge Then
            Wscript.Echo Now & " -- " & objFileName.Path & " is " & intDaysOld & " days old." & "  Depth=" & Len(objFileName.Path)
            Set objFile = objFSO.OpenTextFile(strOutputFile, ForAppending)
            objFile.Writeline objFileName.Path
            objFile.Close
            intFileCount = intFileCount + 1
            intFileSize = objFileName.Size
            intFileSizeCount = intFileSizeCount + intFileSize
        Else
        End If
    Next
End Sub
 
Well, I did find some documentation about changing the file format to Unicode and working with it in that format. Unfortunately, I still get the same error on the same line (121,6) which is "objFile.Writeline objFileName.Path".

The Unicode info I found was here:


I am creating the file as Unicode and opening it for appending as Unicode as well. I have checked the output file after it has been generated, and it's all filled with the little squares, which all apparently translate to Japanese characters. So now I'm not sure if I'm headed the right way with this or not.
 
OK, after installing Japanese language support on my PC, when I browse those directories I now see Kanji characters instead of just the boxes. Considering that some of the contents of the folders are documentation for an application that is sold in Japan I'm now 100% positive that is what the issue is.

Unfortunately, my script still doesn't work. It still crashes when it gets to directories or files with Japanese characters.

I know that Kanji characters require Unicode, so they are 2 bytes per character instead of 1. I can create the output text file in a Unicode format, but when I open it in Notepad (by just double-clicking on the .txt file) Notepad displays Kanji characters for everything, even the stuff that should be in English. If I go into Notepad and try to open the file from there and tell it to use ANSI instead of Unicode I get the correct English text, except that the first line begins with "ÿþ". If I try to open the text file in Word it assumes that it is Unicode (which makes the entire file appear in Kanji), but it gives me the option to use MS-DOS or Windows encoding, which results in the correct English text.
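
For reference, "ÿþ" is how the two bytes FF FE appear when read as Windows ANSI (code page 1252) text, and FF FE is the byte-order mark at the start of a UTF-16 little-endian file. Below is a minimal sketch for checking whether a file starts with that mark. It assumes ADODB.Stream is available on the machine. The function name HasUtf16Bom and the convention of passing the file path as the first unnamed argument are hypothetical, chosen only for illustration.

```vbscript
Option Explicit

Const adTypeBinary = 1   ' ADODB.Stream binary mode

' Hypothetical helper: returns True if the file begins with the bytes FF FE.
Function HasUtf16Bom(strPath)
    Dim objStream, arrBytes
    HasUtf16Bom = False
    Set objStream = CreateObject("ADODB.Stream")
    objStream.Type = adTypeBinary
    objStream.Open
    objStream.LoadFromFile strPath
    If objStream.Size >= 2 Then
        ' Read the first two raw bytes, with no text decoding involved
        arrBytes = objStream.Read(2)
        If AscB(MidB(arrBytes, 1, 1)) = 255 And AscB(MidB(arrBytes, 2, 1)) = 254 Then
            HasUtf16Bom = True
        End If
    End If
    objStream.Close
End Function

' Assumed usage: cscript checkbom.vbs <path to file>
If WScript.Arguments.Unnamed.Count > 0 Then
    If HasUtf16Bom(WScript.Arguments.Unnamed.Item(0)) Then
        WScript.Echo "File starts with FF FE (UTF-16 byte-order mark)."
    Else
        WScript.Echo "No UTF-16 byte-order mark found."
    End If
End If
```

If the mark is present but the rest of the file reads correctly as ANSI, the file was probably created as Unicode and then written to as ANSI.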

The only thing that I can gather from this is that I can create a Unicode file, I can open it as a Unicode file, but when I'm writing to it I'm still writing to it as ANSI text (using the objFile.WriteLine).
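
That reading fits how FileSystemObject works: OpenTextFile takes an optional fourth argument that selects the format, and when it is omitted the file is opened as ANSI even if it was created as Unicode. Below is a minimal sketch of creating and appending in Unicode, assuming the missing format argument is the cause here. The file name unicode_test.txt in the temp folder and the sample path are made up for illustration.

```vbscript
Option Explicit

Const ForAppending = 8
Const TemporaryFolder = 2
Const TristateTrue = -1   ' format argument: open as Unicode (UTF-16)

Dim objFSO, objOut, strOut, strSample
Set objFSO = CreateObject("Scripting.FileSystemObject")

' Hypothetical output file in the user's temp folder
strOut = objFSO.BuildPath(objFSO.GetSpecialFolder(TemporaryFolder).Path, "unicode_test.txt")

' Sample path containing two Japanese characters (U+65E5, U+672C)
strSample = "c:\data\" & ChrW(&H65E5) & ChrW(&H672C) & "\readme.txt"

' CreateTextFile(filename, overwrite, unicode): third argument True = Unicode
Set objOut = objFSO.CreateTextFile(strOut, True, True)
objOut.Close

' OpenTextFile(filename, iomode, create, format): the fourth argument must
' also request Unicode, otherwise WriteLine writes ANSI text.
Set objOut = objFSO.OpenTextFile(strOut, ForAppending, False, TristateTrue)
objOut.WriteLine strSample
objOut.Close

WScript.Echo "Wrote sample line to " & strOut
```

VBScript strings are already Unicode internally, so no per-string conversion should be needed; the format of the TextStream decides how they are encoded on disk. Whether the backup application accepts a UTF-16 list file is a separate question that this sketch does not answer.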

So I guess what I need is:

1. To be able to convert ANSI text of file and directory names to Unicode before writing them to the output file, and this probably includes the carriage return at the end of the line.

2. To be able to tell the difference between Unicode and ANSI text of file and directory names so that I don't try to convert Unicode to Unicode and result in junk.

Alternatively, if I could find a way to make the script just bypass directories that aren't ANSI I would be happy. The script wouldn't work as intended, but it would have enough functionality to get by for now.
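
A minimal sketch of that fallback, assuming it is acceptable to skip any name that contains a character outside 7-bit ASCII. This is stricter than "not ANSI", since it also skips accented Latin characters. The function name IsPlainAscii and the sample paths are hypothetical.

```vbscript
Option Explicit

' Hypothetical helper: returns True when every character is 7-bit ASCII.
Function IsPlainAscii(strText)
    Dim i, intCode
    IsPlainAscii = True
    For i = 1 To Len(strText)
        intCode = AscW(Mid(strText, i, 1))
        ' AscW returns a signed 16-bit value, so characters at or above
        ' U+8000 come back negative; treat those as non-ASCII too.
        If intCode < 0 Or intCode > 127 Then
            IsPlainAscii = False
            Exit Function
        End If
    Next
End Function

Dim strPlain, strJapanese
strPlain = "c:\data\report.txt"
strJapanese = "c:\data\" & ChrW(&H65E5) & ChrW(&H672C) & ".txt"

If IsPlainAscii(strPlain) Then WScript.Echo "Would log: " & strPlain
If Not IsPlainAscii(strJapanese) Then WScript.Echo "Would skip a non-ASCII name."
```

In CrawlTree the same test could guard both the recursive call and the WriteLine call, for example by checking IsPlainAscii(objFileName.Path) before appending to the output file.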
 
Anyone out there have anything on this?

I have verified that it is definitely Japanese characters causing the script to crash. It can output the directory path names to the screen, substituting "???????" for the Japanese characters, but it bombs on the next line, where it is supposed to write the path to a file. I'm sure that people in Japan use VBScript, so someone somewhere has to know of a way to make this work. Any ideas?
 