@rharsh:
I take the pathname from my XP-window so there should be no typos... Using your code I get empty lines when I run the program, which means that the problems come in:
[code]open IN, "< $file" or die "Cannot open $file for read\n$!";[/code]
when trying to open $file and the response is...
I cannot print out what is in @files because it is simply not reading the files... It claims it cannot open C:./Documents...
And I have tried " " instead of ' ', and forward slashes, backslashes and double slashes, in case Perl is interpreting what is inside the quotes.
When I...
Hi again!
Now I have pretty much tried every suggestion posted here.
In the particular version shown below, I get the following error:
Cannot open C:./Documents for read
No such file or directory at ... line 12
Most versions seem to have the same problem, when I want to open the current...
The files are two text files with the following:
computer science ( or computing science ) is the study and the science of the theoretical
and:
foundations of information .
The resulting file should be:
computer
1
science
1
information
1
people
1
Was this what you wanted to know?
-lillyth
Hi again.
Thanks! I changed to the proposed
my @files = glob "C:/... ".
Now it is complaining at the second while loop. I would like to do the following:
for each file i
    for each row j in i
        do something.
Am I reading the open file correctly with while(<$_>)?
Best,
-nina.
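
A minimal sketch of the file/row loop described above, under some loud assumptions: the real directory path is elided in the thread, so a temporary directory with a space in its name ("my docs", hypothetical) stands in for it, and the two sample files are invented for illustration. Perl's built-in glob splits its pattern on whitespace, so the sketch uses bsd_glob from the core File::Glob module, which does not. Each file name is opened explicitly before reading, since the angle-bracket operator reads from a filehandle rather than from a file name.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Glob qw(bsd_glob);

# Hypothetical setup: a temporary directory with a space in its name
# stands in for the real directory, whose path is not shown in the thread.
my $dir = tempdir(CLEANUP => 1) . '/my docs';
mkdir $dir or die "Cannot create $dir: $!";
my %sample = (
    'a.txt' => "computer science is the study\n",
    'b.txt' => "foundations of information science\n",
);
for my $name (keys %sample) {
    open my $out, '>', "$dir/$name" or die "Cannot write $dir/$name: $!";
    print {$out} $sample{$name};
    close $out;
}

# bsd_glob treats the whole string as one pattern, spaces included.
my @files = bsd_glob("$dir/*.txt");

my %count;
for my $file (@files) {                       # for each file i
    open my $in, '<', $file or die "Cannot open $file for read: $!";
    while (my $line = <$in>) {                # for each row j in i
        $count{$_}++ for split ' ', $line;    # do something: count words
    }
    close $in;
}

# Print each word followed by its count, one per line.
print "$_\n$count{$_}\n" for sort keys %count;
```

The counts accumulate in one hash across all files, which matches the goal of getting frequencies over the whole directory rather than per file.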
Hi!
I need to read text files from a directory and do some operations on all the files at once, in order to get frequency counts for words over all files. Any guesses as to why this code is not working?
The error message is: "Cannot open 'C:\Doc...\*.txt' Invalid argument at line 12"
While...
@ishnid
You are right, the second loop is not there, but I claim that the third loop, where you wrote ( do some operation ), will need t look-ups. That is, to check if a word j within w of the current position is a term, you need to scan through the terms and see if j is in this list.
Here is...
@ishnid,
No, the number of words in the data file is larger than the number of terms. Not every word (except stop words) becomes a term. We are only interested in nouns and noun phrases, and hence those are the only words in our terms list.
The algorithm proposed is:
for every word a...
Thank you both for your valuable insights. If I am not mistaken the algorithm proposed here will have a complexity of O(N*t*w) where N is the number of words in the datafile, t is the number of terms and w is the window size. The algorithm that I chose I believe only needs O(N*t) to run. Thank...
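
A side note on the look-up cost discussed in this exchange, as a hedged sketch rather than either poster's actual code: if the terms are loaded into a hash once, checking whether a word is a term is a single hash lookup instead of a scan through the term list. The word and term lists below are made up for illustration; in the thread they would come from data.txt and term.txt.

```perl
use strict;
use warnings;

# Hypothetical term list and text, invented for this example.
my @terms = qw(computer science information);
my @words = qw(the study of computer science and information);

# Build the lookup table once, one pass over the terms.
my %is_term = map { $_ => 1 } @terms;

# Each membership test is now one hash lookup, not a scan of @terms.
# Keeping the original positions preserves the distances between terms.
my @term_positions = grep { $is_term{ $words[$_] } } 0 .. $#words;

print "term at position $_: $words[$_]\n" for @term_positions;
```

Because the positions refer to the original text, two terms can still be compared by how many words apart they really are.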
@ishnid
It seems to me that you do not consider the distances between terms when you remove the rest and only leave the terms in the array. Then the w terms that occur after each other in the document co-occur, but they may not be within a distance of w from each other in the original text...
@steve
We should increment both termA/termB and termB/termA. This is because the matrix will later map to a graph where the edges have direction, and we want both directions to be valid. If termA co-occurs with termB then it also holds the other way around.
I am looking for semantic meaning of words, so occurring in the same document is too wide. I need to say that two words co-occur if they appear within w words from each other and I want to be able to set w as a parameter.
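
A minimal sketch of the windowed co-occurrence count described here, assuming the text is already split into words and the terms are single words. The word list, the term list and the choice to skip a term paired with itself are assumptions made for illustration, not something the thread specifies.

```perl
use strict;
use warnings;

# Hypothetical inputs, invented for this example.
my @words   = qw(computer science is the study of information and computer systems);
my %is_term = map { $_ => 1 } qw(computer science information systems);
my $w       = 3;    # window size, set as a parameter

# $cooc{$x}{$y} = number of times $y occurs within $w words of $x.
my %cooc;
for my $i (0 .. $#words) {
    my $left = $words[$i];
    next unless $is_term{$left};

    # Look ahead at most $w words, without running past the end.
    my $last = $i + $w;
    $last = $#words if $last > $#words;

    for my $j ($i + 1 .. $last) {
        my $right = $words[$j];
        next unless $is_term{$right};
        next if $right eq $left;    # assumption: ignore self pairs

        # Increment both directions so the matrix is symmetric.
        $cooc{$left}{$right}++;
        $cooc{$right}{$left}++;
    }
}

# Print the non-zero cells as "row<TAB>column<TAB>count".
for my $row (sort keys %cooc) {
    for my $col (sort keys %{ $cooc{$row} }) {
        print "$row\t$col\t$cooc{$row}{$col}\n";
    }
}
```

Looking only forward from each position and incrementing both cells counts every pair within the window once per direction. Distances are measured in the original word sequence, so non-term words between two terms still count toward the window.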
Well, you are a bit right... It is PhD work. The actual aim is to extract semantic information from the graphs built from the co-occurrence matrix, using mathematics. Because this is not what I specialize in, I was hoping someone would help, even though it is not very far from school work.
Hi!
I need to create a co-occurrence matrix from a text file. So far I have a term extractor that given the file ( data.txt ) returns a file with the relevant terms (term.txt). From these two I would now like to create a co-occurrence matrix using a window of size w. I am guessing that the...