Problem with coda 1

Status
Not open for further replies.

LovecraftHP

Programmer
Dec 3, 2002
15
CN
Hi,

I've been working on a program that, among other things, should split verbs into an onset-nucleus-coda structure (with the help of this forum, thanks guys). It's almost finished now, except for a couple of bugs.

Sample code:


BEGIN { FS="," ; fs="[" ; d="=" }
{
  printf $1 FS $2 FS $3
  n = split($4, a, fs)
  split($5, b, fs)
  for (x=(n-2); x<=n; x++) {
    f=z=0
    k=j=l=""
    for (y=1; y<length(a[x]); y++) {
      p=substr(a[x],y,1)
      q=substr(b[x],(y+z),1)
      r=substr(b[x],y,(z+2))
      s=substr(a[x],(y+1),1)
      if (p=="V") {
        k=k q
        f=1
      }
      if (p=="C") {
        if (r!="dZ"||r!="tS") {
          if (f==0) j=j q
          if (f==1) l=l q
        }
        if (r=="dZ"||r=="tS") {
          if (f==0) {
            j=r
            z++
          }
          else if (f==1) {
            l=r
          }
        }
      }
    }
    printf FS (j?j:d) FS (k?k:d) FS (l?l:d)
  }
  printf FS $8 "." RS
}


Used on this data sample:


acknowledge,acknowledged,611,[VC][CV][CVC],[@k][nO][lIdZ],[VC][CV][CVCC],[@k][nO][lIdZd],@+d,318
adjudge,adjudged,15,[V][CVC],[@][dZVdZ],[V][CVCC],[@][dZVdZd],@+d,483
challenge,challenged,545,[CV][CVCC],[tS&][lIndZ],[CV][CVCCC],[tS&][lIndZd],@+d,6955
change,changed,8239,[CVVCC],[tSeIndZ],[CVVCCC],[tSeIndZd],@+d,6997
judge,judged,718,[CVC],[dZVdZ],[CVCC],[dZVdZd],@+d,24555


Produces this output:


acknowledge,acknowledged,611,=,@,k,n,O,=,l,I,dZ,@+d.
adjudge,adjudged,15,=,=,=,=,@,=,dZ,V,d,@+d.
challenge,challenged,545,=,=,=,tS,&,=,l,I,dZ,@+d.
change,changed,8239,=,=,=,=,=,=,tS,eI,nd,@+d.
judge,judged,718,=,=,=,=,=,=,dZ,V,d,@+d.


When it should be:


adjudge,adjudged,15,=,=,=,=,@,=,dZ,V,dZ,@+d.
change,changed,8239,=,=,=,=,=,=,tS,eI,ndZ,@+d.
judge,judged,718,=,=,=,=,=,=,dZ,V,dZ,@+d.


So when both sounds are in the same syllable (not in different ones, as in "challenge"), the second tS or dZ sound gets shortened to a simple t or d. Would anyone be so kind as to help? Thanks.
 
This seems wrong:
Code:
if (r!="dZ"||r!="tS") {
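The condition in parentheses is always true: r cannot equal both "dZ" and "tS" at once, so at least one of the two inequalities holds. You probably meant && instead of ||.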

If further help is needed, explain exactly how your program processes the data.
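
The point about the always-true test can be checked with a small stand-alone sketch. The helper names `always_true` and `neither_pair` are invented for illustration and are not part of the posted program; the `&&` version is only a guess at what was intended.

```awk
# Hypothetical helpers, not part of the original program.

# The condition as posted: r differs from "dZ" OR r differs from "tS".
# No string equals both, so at least one side is always true.
function always_true(r) { return (r != "dZ" || r != "tS") }

# Presumed intent (an assumption): r is neither "dZ" nor "tS".
function neither_pair(r) { return (r != "dZ" && r != "tS") }

BEGIN {
  n = split("dZ tS d VdZ", samples, " ")
  for (i = 1; i <= n; i++)
    printf "%-3s  ||: %d  &&: %d\n", samples[i],
           always_true(samples[i]), neither_pair(samples[i])
}
```

The `||` column comes out as 1 for every sample, while the `&&` column is 0 only for "dZ" and "tS".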
 
The original code is too convoluted to be worth salvaging.
Code:
BEGIN { FS=OFS="," ; ORS = "" }

{ print $1, $2, $3 OFS

  numsyl = field_to_array( $5, syllables )
  for (i=numsyl-2; i<=numsyl; i++)
  {
    print "=,"
    if (i in syllables)
      print divide_sounds( syllables[i] )
    else
      print "=,"
  }
  print $8 "." RS
}

# s is like "[xxx][xxx][xxx]..."
function field_to_array( s, array )
{ gsub( /^\[|\]$/, "", s )
  return split( s, array, /\]\[/ )
}

function divide_sounds( s     ,pairs,out)
{ pairs = "dZ tS eI"
  while (s)
  { if (length(s)>1 && index(pairs,substr(s,1,2)))
    { out = out substr(s,1,2) ","
      sub(/../, "", s)
    }
    else
    { out = out substr(s,1,1) ","
      sub(/./, "", s)
    }
  }
  return out
}
The output is:

acknowledge,acknowledged,611,=,@,k,=,n,O,=,l,I,dZ,@+d.
adjudge,adjudged,15,=,=,=,@,=,dZ,V,dZ,@+d.
challenge,challenged,545,=,=,=,tS,&,=,l,I,n,dZ,@+d.
change,changed,8239,=,=,=,=,=,tS,eI,n,dZ,@+d.
judge,judged,718,=,=,=,=,=,dZ,V,dZ,@+d.
 
I'm not sure how to get the correct number of ='s in the output. Perhaps you could explain.

-----
Ph'nglui mglw'nafh Cthulhu R'lyeh wgah'nagl fhtagn.
-- H. P. Lovecraft
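
On the question of how many ='s to print: judging from the expected output in the original post, each of the three syllable slots seems to contribute exactly three fields (onset, nucleus, coda), with = standing in for an empty part. Below is a minimal sketch of that idea, assuming the CV skeleton from field 4 is walked alongside the transcription from field 5. The function names `split_syllable` and `slot` are hypothetical, and this is not the code the original poster ended up with.

```awk
# A sketch only: split_syllable and slot are hypothetical names.
# pattern is a CV skeleton such as "CVVCC"; sounds is the matching
# transcription such as "tSeIndZ". "dZ" and "tS" count as one consonant.
function split_syllable(pattern, sounds, parts,    i, p, seg, pos, seen_vowel) {
  parts["onset"] = parts["nucleus"] = parts["coda"] = ""
  pos = 1
  for (i = 1; i <= length(pattern); i++) {
    p = substr(pattern, i, 1)
    seg = substr(sounds, pos, 2)
    if (p != "C" || (seg != "dZ" && seg != "tS"))
      seg = substr(sounds, pos, 1)      # ordinary one-character sound
    pos += length(seg)
    if (p == "V") { parts["nucleus"] = parts["nucleus"] seg; seen_vowel = 1 }
    else if (!seen_vowel) parts["onset"] = parts["onset"] seg
    else parts["coda"] = parts["coda"] seg
  }
}

# Format one syllable slot as exactly three fields, "=" for an empty part.
function slot(pattern, sounds,    parts) {
  split_syllable(pattern, sounds, parts)
  return (parts["onset"] == "" ? "=" : parts["onset"]) "," \
         (parts["nucleus"] == "" ? "=" : parts["nucleus"]) "," \
         (parts["coda"] == "" ? "=" : parts["coda"])
}

BEGIN {
  print slot("CVVCC", "tSeIndZ")   # the single syllable of "change"
  print slot("CVC", "dZVdZ")       # the single syllable of "judge"
  print slot("", "")               # a missing syllable slot
}
```

For the "change" and "judge" syllables this gives tS,eI,ndZ and dZ,V,dZ, which matches the expected lines quoted in the question. A missing slot gives =,=,=, so three slots always add up to nine fields.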
 
Love the signature :) Hehe

Thanks, but you were completely right: the original code was indeed too mixed up. I cleaned it all up and figured it out myself. Apart from throwing out and rearranging a lot of code, the simple answer was:

Code:
r=substr(b[x],(y+z),2)

Been looking way too long for this. Thank you, futurelet, and thank everyone else who's been helping me with this!
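
To illustrate why that one line matters, here is a small sketch comparing the two substr() calls at the point where the loop reaches the final consonant of "judge". The variable names and the hard-coded values of y and z are chosen for the illustration. It shows only the difference in r, not the rest of the cleaned-up program, which was not posted.

```awk
# Illustration only: compares the two substr() calls on the "judge"
# syllable at the point where the loop reaches the final consonant.
BEGIN {
  sounds = "dZVdZ"   # transcription of the syllable (brackets dropped)
  y = 3              # third position of the CVC pattern
  z = 1              # one extra character already consumed by the onset "dZ"

  original = substr(sounds, y, z + 2)   # posted line: start not shifted, length grows with z
  fixed    = substr(sounds, y + z, 2)   # corrected line: shifted start, fixed length 2

  print "original:  " original
  print "corrected: " fixed
}
```

The original call yields "VdZ", which never equals "dZ", so the coda falls back to the single character in q. The corrected call yields "dZ".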
 