Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Problem storing strings into a Tree 1

Status
Not open for further replies.

Belcebu

Programmer
Aug 3, 2005
9
US
I am trying to read a text file that contains a paragraph and my goal is to put each of the words of the text file into a tree and keep count of how many times each word appears in the file.
All of my libraries have been tested and there seems to be no problem and I read each one of the words one by one from the file with no problem.
The problem arises when I try to send the word from the file(which is stored in a temporary char*) into the tree. instead of sending the string of words for example "joe" the tree somehow manages to store "" inside the tree.

I cannot understand why this is happening, if anyone has some type of advise, please let me know.

Here is my code:

--------------------------------------------------------

char myString[MAX];
char* pMyString;
TreeEntry tempEntry;
TreeNode *root, *temp;
WordType sword;
char ans;
FILE *iFile;

CreateTree(&root);
if (TreeEmpty(root))
puts("Tree created and empty");

iFile=fopen("Easycase1.txt", "r");
if(iFile==NULL)
printf("ERROR Opening Input File\n");

start=clock();
while (!feof(iFile))
{ pMyString = myString;
*pMyString = toupper(fgetc (iFile));
while(*pMyString != '\n' && !feof(iFile) && *pMyString != ' ' && *pMyString != '.' && *pMyString != ',' && *pMyString != ';' && *pMyString != '"' )
{ pMyString++;
*pMyString = toupper(fgetc(iFile));
}
*pMyString = '\0';
if (isalpha(myString[0]))
{ printf("String = %s\n", myString);
tempEntry.word=myString;
tempEntry.count=1;
printf(" Word = %s\n", tempEntry.word);
root=InsertTree(root,MakeTreeNode(tempEntry));
}
}
fclose(iFile);


printf("\n\n");
Inorder(root,Print);
end=clock();

---------------------------------------------------------

I know that the tree works because if I create a string of strings and initialize them the following way: {"blue","green","yellow"} then they get inputed into the tree just fine.

also I know that my input works because I print out the words from the file and they are stored into the string and the TreeEntries just fine.

Thank You in advance for your help
 
Can you show the code for InsertTree()?
I think that might be where the problem is.
 
The code for InsertTree:

TreeNodePtr InsertTree(TreeNodePtr root, TreeNodePtr newnode)
{
if (root==NULL)
root=newnode;
else if(LT(newnode->entry.word,root->entry.word))
root->left=InsertTree(root->left,newnode);
else if(EQ(newnode->entry.word,root->entry.word))
root->entry.count++;
else
root->right=InsertTree(root->right,newnode);
return root;
}

also, here are my definitions:


typedef struct treenode
{ TreeEntry entry;
struct treenode* right;
struct treenode* left;
} TreeNode;

typedef TreeNode* TreeNodePtr;

typedef char* WordType;

typedef struct treeentry
{ int count;
WordType word;
} TreeEntry;
 
Maybe my brain isn't working too good at 3am, but I don't quite understand these lines:
Code:
else if(LT(newnode->entry.word,root->entry.word))
    root->left=InsertTree(root->left,newnode);

else
    root->right=InsertTree(root->right,newnode);
If InsertTree() always returns root (which is the first parameter), you're basically saying: root->right = root->right; In which case, I don't see the point of the assignment?

BTW, in the following code, you should also be checking the number of characters you read in to make sure you don't overflow your array:
Code:
while(*pMyString != '\n' && !feof(iFile) && *pMyString != ' ' && *pMyString != '.' && *pMyString != ',' && *pMyString != ';' && *pMyString != '"' )
 
1. Please use the [tt][ignore]
Code:
[/ignore][/tt]
tags when posting code.

2. while (!feof(iFile))
See
Start with
Code:
char buff[BUFSIZ];
while ( fgets( buff, BUFSIZ, iFile ) != NULL ) {
  // extract word(s) from buff
  // 
}

> *pMyString != '\n' && *pMyString != ' '
Consider using isspace() in ctype.h
> *pMyString != '.' && *pMyString != ','
ispunct() in ctype.h


3. tempEntry.word=myString;
This does NOT make a copy of the string.
Given your declarations, all your tree entries end up pointing at the same local variable.
So everything always ends up being equal to the current word.

4. EQ / LT macros are not helpful.

--
 
Thank You for the inputs. Sorry for not using the code tags.
You are right I overlooked that fact that I am trying to set a string to something instead of actually copying the string. I now copy the string and I get the desired results, however, I cannot use char* WordType, I have to use char WordType[MAX] which makes me uncomfortable because this way I am unable to malloc any space for the strings.

Also I would use ispunct() but that way words that are connected by a hyphen like WAR-TORN and words with 's, like THERE'S will not be read right, instead they will be separated into 2 different words.

The EQ and LT functions work just fine for me, why are they not helpfull? I will post them here so you can see them:
Code:
int LT(WordType target, WordType word)
{ 	int retval=1;
	if (strcmp(target, word) >= 0) 
		retval=0;
	return retval;
}

int GT(WordType target, WordType word)
{ 	int retval=1;
	if (strcmp(target, word) <= 0) 
		retval=0;
	return retval;
}

int EQ(WordType target, WordType word)
{	int retval=0;
	if (strcmp(target, word) == 0)
		retval =1;
	return retval;
}

int NE(WordType target, WordType word)
{	int retval=1;
	if (EQ(target,word))
	    retval=0;
	return retval;
}

as for the code below:
Code:
TreeNodePtr InsertTree(TreeNodePtr root, TreeNodePtr newnode)
{
    if (root==NULL)
        root=newnode;
    else if(LT(newnode->entry.word,root->entry.word))
        root->left=InsertTree(root->left,newnode);
    else if(EQ(newnode->entry.word,root->entry.word))
        root->entry.count++;
    else
        root->right=InsertTree(root->right,newnode);
    return root;
}
What this does is that if the value is greater than or less than the root then it looks in the right child or left child of the root. It does this by converting the right side or left side of the root into the new root inside the recursive function and it does this repeatedly until the assigned root in the recursive function is NULL, and then it assigns it the value of the newnode. For example: I have the following Tree:
9
5 10
3 6 null null
null null null null

newnode is 8. So I call the function insertTree to insert 8 into the tree. 9 is the root. Since 8 is less than the root it calls InsertTree again this time making the new root 5. Since 8 is greater than 5 it goes to its right child making the new root 6. Since 8 is still greater than 6 it goes to its right child and since this is NULL the value of 8 will be stored there.
9
5 10
3 6 null null
null null null 8

At the end the function will still return initial root which in this case is 9 to assure than I can keep adding more treeEntries in the right order.

Once Again THANK YOU for the inputs to both of you SALEM and CPJUST
 
> Also I would use ispunct() but that way words that are connected by a hyphen
Ok, so how about
Code:
// true if A-Za-z0-9 or one of "-'"
int isWord ( int ch ) {
  const char *others = "-'";
  return isalnum(ch) || strchr(others,ch) != NULL;
}
You can generalise this into any set of characters you consider to be part of a word.

> The EQ and LT functions work just fine for me, why are they not helpfull?
Well you didn't post them for one thing, so we're left guessing what they were.
Plus the fact that now you have posted them, I see that they add very little value except to just basically rename strcmp(). Essentially, readers have to look up and remember yet another bit of information to figure out the code (or make a guess).

What happens if you also want an 'LT' for integers, floats, doubles, times, or anything else with an ordering relationship?

> I have to use char WordType[MAX]
You can still have char *WordType, just remember to call malloc to allocate the right amount of space before you strcpy the input into your struct.

--
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top