Problem with coda 1

LovecraftHP · Mar 3, 2005

Hi,

I've been working on a program that, among other things, should split verbs into an onset-nucleus-coda structure (with the help of this forum, thanks guys). It's almost finished now, except for a couple of bugs.

Sample code:

BEGIN { FS="," ; fs="[" ; d="=" }
{
  printf $1 FS $2 FS $3
  n = split($4, a, fs)
  split($5, b, fs)
  for (x=(n-2); x<=n; x++) {
    f=z=0
    k=j=l=""
    for (y=1; y<length(a[x]); y++) {
      p=substr(a[x],y,1)
      q=substr(b[x],(y+z),1)
      r=substr(b[x],y,(z+2))
      s=substr(a[x],(y+1),1)
      if (p=="V") {
        k=k q
        f=1
      }
      if (p=="C") {
        if (r!="dZ"||r!="tS") {
          if (f==0) j=j q
          if (f==1) l=l q
        }
        if (r=="dZ"||r=="tS") {
          if (f==0) {
            j=r
            z++
          }
          else if (f==1) {
            l=r
          }
        }
      }
    }
    printf FS (j?j:d) FS (k?k:d) FS (l?l:d)
  }
  printf FS $8 "." RS
}

Used on this data sample:

acknowledge,acknowledged,611,[VC][CV][CVC],[@k][nO][lIdZ],[VC][CV][CVCC],[@k][nO][lIdZd],@+d,318
adjudge,adjudged,15,[V][CVC],[@][dZVdZ],[V][CVCC],[@][dZVdZd],@+d,483
challenge,challenged,545,[CV][CVCC],[tS&][lIndZ],[CV][CVCCC],[tS&][lIndZd],@+d,6955
change,changed,8239,[CVVCC],[tSeIndZ],[CVVCCC],[tSeIndZd],@+d,6997
judge,judged,718,[CVC],[dZVdZ],[CVCC],[dZVdZd],@+d,24555

Produces this output:

acknowledge,acknowledged,611,=,@,k,n,O,=,l,I,dZ,@+d.
adjudge,adjudged,15,=,=,=,=,@,=,dZ,V,d,@+d.
challenge,challenged,545,=,=,=,tS,&,=,l,I,dZ,@+d.
change,changed,8239,=,=,=,=,=,=,tS,eI,nd,@+d.
judge,judged,718,=,=,=,=,=,=,dZ,V,d,@+d.

When it should be:

adjudge,adjudged,15,=,=,=,=,@,=,dZ,V,dZ,@+d.
change,changed,8239,=,=,=,=,=,=,tS,eI,ndZ,@+d.
judge,judged,718,=,=,=,=,=,=,dZ,V,dZ,@+d.

So when both sounds fall in the same syllable (rather than in different ones, e.g. "challenge"), the second tS or dZ sound gets shortened to a simple t or d. Would anyone be so kind as to help? Thanks.

futurelet · Mar 3, 2005

This seems wrong:

Code:

if (r!="dZ"||r!="tS") {

The statement in parentheses will always be true.

If further help is needed, explain exactly how your program processes the data.
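
To illustrate the point about the condition: every string differs from at least one of "dZ" and "tS", so the `||` test can never be false. The sketch below compares it with the `&&` form, which is presumably what was intended. The helper name `check_tests` is made up for this example and is not part of the original program.

```shell
# check_tests is a hypothetical helper: it feeds each argument to awk and
# prints the value followed by the result of both versions of the test.
check_tests() {
  printf '%s\n' "$@" | awk '{
    r = $0
    orig  = (r != "dZ" || r != "tS")   # true for every possible r
    fixed = (r != "dZ" && r != "tS")   # true only when r is neither affricate
    print r, orig, fixed
  }'
}

check_tests dZ tS nd
```

The middle column is 1 for all three inputs, while the last column is 1 only for "nd".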

futurelet · Mar 4, 2005

The original code is too convoluted to be worth salvaging.

Code:

BEGIN { FS=OFS="," ; ORS = "" }

{ print $1, $2, $3 OFS

  numsyl = field_to_array( $5, syllables )
  for (i=numsyl-2; i<=numsyl; i++)
  {
    print "=,"
    if (i in syllables)
      print divide_sounds( syllables[i] )
    else
      print "=,"
  }
  print $8 "." RS
}

# s is like "[xxx][xxx][xxx]..."
function field_to_array( s, array )
{ gsub( /^\[|\]$/, "", s )
  return split( s, array, /\]\[/ )
}

function divide_sounds( s     ,pairs,out)
{ pairs = "dZ tS eI"
  while (s)
  { if (length(s)>1 && index(pairs,substr(s,1,2)))
    { out = out substr(s,1,2) ","
      sub(/../, "", s)
    }
    else
    { out = out substr(s,1,1) ","
      sub(/./, "", s)
    }
  }
  return out
}

The output is:

acknowledge,acknowledged,611,=,@,k,=,n,O,=,l,I,dZ,@+d.
adjudge,adjudged,15,=,=,=,@,=,dZ,V,dZ,@+d.
challenge,challenged,545,=,=,=,tS,&,=,l,I,n,dZ,@+d.
change,changed,8239,=,=,=,=,=,tS,eI,n,dZ,@+d.
judge,judged,718,=,=,=,=,=,dZ,V,dZ,@+d.

futurelet · Mar 4, 2005

I'm not sure how to get the correct number of ='s in the output. Perhaps you could explain.

-----
Ph'nglui mglw'nafh Cthulhu R'lyeh wgah'nagl fhtagn.
-- H. P. Lovecraft
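
On the number of ='s: judging from the expected output earlier in the thread, each line seems to carry nine slots, namely onset, nucleus and coda for each of the last three syllables, with "=" standing in for an empty slot or a missing syllable. Under that assumption, one possible sketch walks the CV template from field 4 alongside the sounds from field 5. The names `split_word` and `syllable` are invented for this example, and treating only dZ and tS as two-character consonants is inferred from the sample data.

```shell
# split_word TEMPLATES SOUNDS   (hypothetical helper)
#   TEMPLATES is field 4, e.g. "[V][CVC]"; SOUNDS is field 5, e.g. "[@][dZVdZ]".
split_word() {
  awk -v templates="$1" -v sounds="$2" '
    # Split one syllable into onset, nucleus and coda, using "=" for empty parts.
    function syllable(t, s,    i, pos, slot, pair, width, sound, seen, on, nu, co) {
      pos = 1
      for (i = 1; i <= length(t); i++) {
        slot = substr(t, i, 1)
        pair = substr(s, pos, 2)
        # A C slot holding dZ or tS consumes two characters, anything else one.
        width = (slot == "C" && (pair == "dZ" || pair == "tS")) ? 2 : 1
        sound = substr(s, pos, width)
        pos += width
        if (slot == "V")  { nu = nu sound; seen = 1 }
        else if (seen)    { co = co sound }
        else              { on = on sound }
      }
      if (on == "") on = "="
      if (nu == "") nu = "="
      if (co == "") co = "="
      return on "," nu "," co
    }
    BEGIN {
      gsub(/^\[|\]$/, "", templates)
      gsub(/^\[|\]$/, "", sounds)
      n = split(templates, tmpl, /\]\[/)
      split(sounds, snd, /\]\[/)
      sep = ""
      # Always emit three syllable slots; pad on the left when n < 3.
      for (i = n - 2; i <= n; i++) {
        part = (i >= 1) ? syllable(tmpl[i], snd[i]) : "=,=,="
        out = out sep part
        sep = ","
      }
      print out
    }'
}

split_word '[V][CVC]' '[@][dZVdZ]'
split_word '[CVVCC]' '[tSeIndZ]'
```

For the adjudge and change rows this prints `=,=,=,=,@,=,dZ,V,dZ` and `=,=,=,=,=,=,tS,eI,ndZ`, matching the middle nine fields of the "should be" lines.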

LovecraftHP · Mar 11, 2005

Love the signature

Hehe

Thanks, but you were completely right: the original code was indeed too mixed up. I cleaned it all up and figured it out myself. Apart from throwing out and rearranging a lot of code, the simple answer was:

Code:

r=substr(b[x],(y+z),2)

Been looking way too long for this. Thank you, futurelet, and thank everyone else who's been helping me with this!
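
For anyone wondering why that one line matters: the original `substr(b[x],y,(z+2))` starts at the template position `y` rather than the sound position `y+z`, and its length grows with `z`. A small sketch of the difference, using the sounds of "judge" at the point where the coda is read (`y=3`, `z=1` after the two-character onset). The helper name `show_window` is made up for this illustration.

```shell
# show_window SOUNDS Y Z   (hypothetical helper)
# Prints what the original and the corrected substr() calls would return.
show_window() {
  awk -v s="$1" -v y="$2" -v z="$3" 'BEGIN {
    orig_r  = substr(s, y, z + 2)   # original: wrong start, length depends on z
    fixed_r = substr(s, y + z, 2)   # corrected: two characters at the sound position
    print orig_r, fixed_r
  }'
}

show_window dZVdZ 3 1
```

The original call yields "VdZ", which never equals "dZ", so the coda fell through to the single-character branch; the corrected call yields "dZ".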
