Binary Wav File

Status
Not open for further replies.

GerritGroot

Nov 3, 2006
Hi,

I want to read a binary wav file in Fortran. The only thing I know is that it's 16-bit (I don't know whether it has a header either, but that's not a question for this forum).

How do I find out the number of bits, and how do I then read the binary file (either in F90 or F77)?

Many thanks,

Gerrit
 
There is a very good description of the format in
To open, try
Code:
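          open (
     &        unit = 20,
     &        file = iFilename,
     &        status = 'OLD',
     &        access = 'DIRECT', ! Otherwise it assumes a record header
     &        form = 'UNFORMATTED',
     &        recl = 4,          ! Units of recl depend on the compiler
     &        err = 100)

C     Each direct-access READ transfers one record, so loop over them
      do i = 1, whatever
         read (20, rec=i) wav(i)
      end do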
 
Thank you very much! For the link as well, because that's beyond the scope of the forum after all.
 
Hi GerritGroot

In cases like this I usually use the "BINARY" form description:

      open(unit=20,
     &     file = 'Filename',
     &     status = 'old',
     &     form = 'binary')

This gives you the flexibility to read any byte length. But on the other hand there are no "record" numbers:

read(20) ChunkID ! 4 bytes (character*4)
read(20) ChunkSize ! 4 bytes (integer*4)
read(20) Format ! 4 bytes (character*4)
read(20) Subchunk1ID ! 4 bytes (character*4)
read(20) Subchunk1Size ! 4 bytes (integer*4)
read(20) Audioformat ! 2 bytes (integer*2)
read(20) NumChannels ! 2 bytes (integer*2)
read(20) SampleRate ! 4 bytes (integer*4)
read(20) ByteRate ! 4 bytes (integer*4)
read(20) BlockAlign ! 2 bytes (integer*2)
read(20) BitsPerSample ! 2 bytes (integer*2)
read(20) Subchunk2ID ! 4 bytes (character*4)
read(20) Subchunk2Size ! 4 bytes (integer*4)
read(20) data ! byte(Subchunk2Size)

Best wishes
gullipe
 
Thanks gullipe,

This is completely new to me, but as I understand it, the record length is the number of bytes in each number, isn't it?

So in the first example it is fixed at 4 bytes.

What I don't understand is the difference between direct and non-direct access.

Anyway, I'm sure it will work with your example. I'll let you know as soon as...

Many thanks,

Gerrit
 
Hi, again

If you open files with access="sequential" (which is actually the default) you can only read in one direction, from start to finish. This is good for list files, where the lines have different lengths and end in a carriage return.

If a file is opened with access="direct", you also have to specify the record length, as xwb did, and in each READ statement you have to specify the record number. In this way you can read from (or write into) any location in the file that starts at a multiple of "recl". Each READ reads the number of bytes that "recl" defines. This is particularly good for files with fixed record lengths. The file length must be a multiple of "recl" (which WAV files probably are not).

If the file is opened with form='binary', you can actually only read sequentially (from start to finish), because there is no record number in the READ statement and there is no fixed record length either. The number of bytes read depends on the size of the variable(s) in the READ statement. Binary files like WAV files have headers with fields of different lengths (either 4 or 2 bytes, see the web page in xwb's post), so it is easier to use this kind of opening than "direct" access.
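
For readers on a newer compiler: Fortran 2003 added access='stream', which reads "as many bytes as the variable holds" in the same way as described here, but is part of the standard. A minimal sketch, assuming a Fortran 2003/2008 compiler; the file name demo.bin and all variable names are made up for the illustration, and the program writes its own small test file first so that it can be run as is:

```fortran
program stream_demo
  use iso_fortran_env, only: int16, int32
  implicit none
  character(len=4) :: tag
  integer(int32)   :: size4
  integer(int16)   :: size2
  integer          :: u

  ! Write a small test file (hypothetical name) in host byte order
  open(newunit=u, file='demo.bin', access='stream', form='unformatted', &
       status='replace')
  write(u) 'RIFF', 882_int32, 1_int16
  close(u)

  ! Read it back: each READ consumes exactly the bytes of its variable
  open(newunit=u, file='demo.bin', access='stream', form='unformatted', &
       status='old')
  read(u) tag      ! 4 bytes
  read(u) size4    ! 4 bytes
  read(u) size2    ! 2 bytes
  print *, tag, size4, size2

  ! Stream access can also jump to an absolute byte position (1-based)
  read(u, pos=5) size4
  print *, size4
  close(u, status='delete')
end program stream_demo
```

With stream access there are no record numbers either, but the pos= specifier gives random access by byte offset when it is needed.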

I hope this makes things more clear.

Best wishes
gullipe
 
Thanks a lot, I'm learning more than I even asked for.

Now I understand, you are very clear.

Direct access may be faster when modifying files or so.

xwb doesn't specify whether it's binary or ASCII.

It seems that a direct-access file is always binary, because of the way records are counted etc. (I can't imagine doing that in ASCII either.)

Gerrit
 
The version I used to use about 10-15 years ago didn't have binary as a format option. Maybe the later compilers have it. It probably has been in the standard all this time but you're just at the mercy of the compiler writers. If they do not provide it you have to find some way around it.

If you use unformatted, it implies binary, so it is 2 bytes for an INTEGER*1, 4 bytes for an INTEGER*2 and 8 bytes for an INTEGER*4 or, if your compiler is the other sort, 2 bytes for INTEGER*2, 4 for INTEGER*4 and 8 for INTEGER*8.
 
You are right xwb, "form='binary'" is probably an extension to the standard, but it has been around for a long time. It was already in Microsoft Fortran F77, Version 5.1 from pre 1987, but marked "blue" in the manual, which meant that it was an extension to the ANSI Fortran 77 full language. It is "green" in the Intel Fortran 95 manual, so it must be an extension to the F95 standard as well.
 
My god! So it depends on your compiler whether INTEGER*2 is really two or four bytes!! That doesn't sound very portable.
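
The INTEGER*n notation is itself a widespread extension rather than part of the standard, which is one reason its meaning could differ between compilers. A hedged sketch, assuming a Fortran 2008 compiler: the named kinds int8, int16 and int32 from the intrinsic module iso_fortran_env are defined by their bit width, and the storage_size intrinsic reports the size of a variable in bits. The program and variable names are made up for the example.

```fortran
program kind_sizes
  use iso_fortran_env, only: int8, int16, int32
  implicit none
  integer(int8)  :: b
  integer(int16) :: s
  integer(int32) :: l

  ! storage_size returns the size in bits; the variables need no value
  print *, 'int8  bits:', storage_size(b)
  print *, 'int16 bits:', storage_size(s)
  print *, 'int32 bits:', storage_size(l)
end program kind_sizes
```

Declaring the 2-byte header fields as integer(int16) and the 4-byte ones as integer(int32) removes the guesswork about what INTEGER*2 means on a given compiler.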

I looked in Clive Page's Guide to F77 and FORM='BINARY' is not specified there indeed.

Another strange thing is that the wav format has a mixture of little-endian and big-endian fields in its header (I had to look up what that means). Apart from the fact that I wonder why the hell someone would mix the two in one file, I ask myself the question:

How do I know whether my Fortran reads little or big endian stuff??

If I do as gullipe says:

read(20) ChunkSize ! 4 bytes (integer*4)
read(20) Format ! 4 bytes (character*4)

The first READ reads a little-endian value and the second READ a big-endian one (see xwb's link). If I understood the definition of those "endians" correctly, I'd read at least one of them wrong, wouldn't I?

Wouldn't it be necessary to read separate characters, for example:

read(20) ch1,ch2,ch3,ch4

and then convert them to a number by reordering them as "Ch4.Ch3.Ch2.Ch1"??

One of those two reads has the byte order, let's say, upside down for a given chip.

I'm lost here!! Should I change the code when running on another chip?
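
One way to answer the "how do I know" question, offered as a sketch and assuming a Fortran 2008 compiler (the program name is made up): copy the bit pattern of the 4-byte integer 1 into four separate bytes with the transfer intrinsic and look at which byte holds the 1.

```fortran
program host_endian
  use iso_fortran_env, only: int8, int32
  implicit none
  integer(int8) :: bytes(4)

  ! View the 4-byte integer 1 as four separate bytes, in memory order
  bytes = transfer(1_int32, bytes)

  if (bytes(1) == 1_int8) then
     print *, 'little-endian host: least significant byte is stored first'
  else
     print *, 'big-endian host: most significant byte is stored first'
  end if
end program host_endian
```

On a little-endian host a plain unformatted read of a 4-byte integer matches the byte order of the WAV size fields; on a big-endian host the bytes would have to be swapped after reading.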

I'm sorry if this question is beyond the scope, please tell me if so.

Many thanks,

Gerrit

 
Hi, again

Don't worry about "big-endian" or "little-endian". The "big-endian" fields seem to be confined to the locations that contain only pure text (ChunkID, Format, Subchunk1ID and Subchunk2ID). These "endians" seem not to be any endings at all (and I do not know what they mean). The first four bytes (ChunkID) contain only the four letters "RIFF" (hex: 52,49,46,46), see the dump of a WAV file below. The next four contain the 4-byte integer (ChunkSize) (hex: 72,03,00,00) = dec: 882. The next 4 bytes (Format) are (hex: 57,41,56,45) = "WAVE" and the next four (Subchunk1ID) are (hex: 66,6d,74,20) = "fmt ". Then comes a 4-byte integer (Subchunk1Size) (hex: 10,00,00,00) = dec: 16, then a 2-byte integer (AudioFormat) (hex: 01,00) = dec: 1 and then again a 2-byte integer (NumChannels) (hex: 01,00) = dec: 1 ... etc., see below.


Dumping File: C:SOUND111.WAV

0000: 52 49 46 46 72 03 00 00 57 41 56 45 66 6d 74 20 "RIFFr...WAVEfmt "
0010: 10 00 00 00 01 00 01 00 11 2b 00 00 11 2b 00 00 ".........+...+.."
0020: 01 00 08 00 64 61 74 61 c1 02 00 00 7f 7f 7f 7f "....data...."
0030: 7f 7f 7f 7f 7f 7f 7e 7f 7e 7f 7f 7f 7f 7f 7f 7f "~~"
0040: 7f 7f 80 80 7e 7f 7f 7d 7f 80 7f 7e 7e 80 80 7d "..~}.~~..}"
0050: 7e 81 7f 7c 7d 80 82 7b 7f 81 80 7c 80 81 81 7d "~.|}..{..|...}"
0060: 79 89 7d 6c 9c 79 75 6f 96 82 76 8f 80 7b 65 86 "y.}l.yuo..v..{e."
0070: 91 6f 8d 8f 79 75 93 7d 77 6c 98 85 6e 72 7a 82 ".o..yu.}wl..nrz."
0080: 71 84 91 7f 70 7b 8a 99 73 6e 82 88 7c 77 7c 7d "q..p{..sn..|w|}"
0090: 85 7e 88 7b 7c 7a 83 89 85 7b 79 7d 80 81 7a 80 ".~.{|z...{y}..z."
00a0: 75 86 7f 86 7f 7a 82 88 81 7f 7b 83 82 7a 7e 7e "u..z...{..z~~"
00b0: 82 7b 7b 7c 7b 84 88 79 7c 81 85 84 78 76 83 7f ".{{|{..y|...xv."
00c0: 87 7b 7a 7e 84 80 81 83 7c 7d 7f 7b 81 81 81 7a ".{z~....|}{...z"
00d0: 7c 7e 7e 81 79 7f 87 85 7f 77 80 83 7a 85 7f 7c "|~~.y..w..z.|"
00e0: 7c 82 85 81 7d 7e 81 7a 7d 80 81 85 7d 7c 7a 81 "|...}~.z}...}|z."
00f0: 81 7f 7d 7b 82 84 7f 7a 7d 80 82 80 7a 80 80 82 ".}{..z}...z..."
0100: 81 7c 7a 82 80 7d 80 7e 7b 7e 82 7f 80 83 7f 7f ".|z..}.~{~..."
0110: 7d 80 81 7e 7e 7c 7f 7f 7e 7d 7f 81 81 81 7e 7e "}..~~|~}...~~"
0120: 7e 82 81 7f 81 7d 82 79 7d 80 7e 80 81 7c 81 7e "~...}.y}.~..|.~"
0130: 81 80 7f 76 73 8c 97 7f 7b 82 87 7c 7d 84 6c 6b "..vs..{..|}.lk"
0140: 9d 6f 75 79 7c 85 88 81 85 86 7e 76 7c 7c 86 71 ".ouy|.....~v||.q"
0150: 82 85 7f 7a 84 8a 80 7f 76 78 7c 8a 83 7d 7d 7c "..z...vx|..}}|"
0160: 7b 82 80 81 7e 80 7c 7b 83 7f 80 7b 7d 80 80 7f "{...~.|{..{}.."
0170: 7e 82 84 79 78 81 87 82 7e 81 81 7f 80 7e 85 88 "~..yx...~...~.."
0180: 74 6c 85 8a 86 77 79 76 87 80 79 83 7b 8e 7e 7d "tl...wyv..y.{.~}"
0190: 76 85 88 7d 89 79 7f 81 86 76 7c 81 7c 7e 7c 7e "v..}.y..v|.|~|~"
01a0: 82 7e 83 80 7d 7e 84 7f 7f 7f 7d 7e 7f 7d 7e 7e ".~..}~.}~}~~"
01b0: 7f 83 82 7d 7c 81 82 7e 7f 80 7e 7d 7c 80 81 81 "..}|..~.~}|..."
01c0: 7e 7f 7f 7e 7f 82 7c 7c 80 7d 7c 7c 7f 7e 82 82 "~~.||.}||~.."
01d0: 81 7f 81 7e 7d 7f 7f 7f 7e 7d 7e 7f 80 81 7e 80 "..~}~}~..~."
01e0: 7f 7e 82 80 7e 80 82 7d 7e 7e 7e 7d 81 7f 7c 7c "~..~..}~~~}.||"
01f0: 80 7f 7e 82 80 81 80 7e 82 81 7f 7c 7f 81 7e 7c ".~....~..|.~|"
0200: 7e 7f 7c 81 82 7c 81 81 81 80 80 80 7f 81 7d 7b "~|..|.......}{"
0210: 7d 81 7e 7f 7e 7e 82 80 80 81 7e 7f 80 7d 7f 7d "}.~~~....~.}}"
0220: 7f 7f 7e 7f 81 7f 7f 81 80 80 7d 7e 7f 80 80 7e "~....}~..~"
0230: 7f 7e 7f 81 7f 7e 7f 7e 7f 7f 7e 7f 7f 80 7e 80 "~.~~~.~."
0240: 80 7f 80 7f 7f 81 80 7c 7f 81 7d 7f 80 7d 7f 81 "....|.}.}."
0250: 7f 7f 7f 7f 80 7f 7e 81 7e 7d 80 7f 7d 80 81 7f ".~.~}.}.."
0260: 80 83 80 7e 80 80 7d 80 7f 7e 7f 7e 7f 81 7e 7f "...~..}.~~.~"
0270: 7f 7c 80 81 7e 7f 7e 80 7f 7e 7e 7f 7f 7f 7e 82 "|..~~.~~~."
0280: 7e 7e 7f 7d 7e 81 7d 7e 80 7d 7f 80 7d 7f 7f 7e "~~}~.}~.}.}~"
0290: 7f 7e 80 7e 7f 7f 80 7f 7f 7e 80 7f 7f 7e 7e 7e "~.~.~.~~~"
02a0: 80 80 7e 80 80 80 7e 81 81 7e 7f 7e 80 7f 7e 7e "..~...~..~~.~~"
02b0: 80 7f 7f 7f 7f 7e 80 7f 81 80 80 7e 7e 7e 80 81 ".~....~~~.."
02c0: 80 80 7d 7f 80 7e 80 7e 7f 7c 80 7f 81 7e 80 7e "..}.~.~|..~.~"
02d0: 7f 7f 7f 81 7d 7f 7f 7d 7e 7f 7f 80 81 7e 7d 80 ".}}~..~}."
02e0: 7e 80 7f 7f 7f 7f 7f 7e 7f 80 7d 7e 7f 00 4c 49 "~.~.}~.LI"
02f0: 53 54 4e 00 00 00 49 4e 46 4f 49 43 4f 50 2e 00 "STN...INFOICOP.."
0300: 00 00 44 6f 6e 61 6c 64 20 53 2e 20 47 72 69 66 "..Donald S. Grif"
0310: 66 69 6e 20 2d 20 43 6f 6d 70 75 74 65 72 20 4d "fin - Computer M"
0320: 75 73 69 63 20 43 6f 6e 73 75 6c 74 69 6e 67 00 "usic Consulting."
0330: 49 43 52 44 0c 00 00 00 31 39 39 35 2d 30 35 2d "ICRD....1995-05-"
0340: 30 32 00 5c 44 49 53 50 22 00 00 00 01 00 00 00 "02.\DISP"......."
0350: 70 31 31 31 6e 32 32 38 20 2d 20 44 6f 6e 61 6c "p111n228 - Donal"
0360: 64 20 53 2e 20 47 72 69 66 66 69 6e 00 ff 66 61 "d S. Griffin..fa"
0370: 63 74 04 00 00 00 c1 02 00 00 "ct........
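
The decimal values can be checked from the hex bytes with a few lines of code. The sketch below assumes a Fortran 2003 or later compiler with a default integer of at least 32 bits; le32 is a hypothetical helper name, not something from the WAV format or from any library. It combines four bytes, least significant first, and is fed with bytes copied from the dump.

```fortran
program decode_le
  implicit none

  ! Bytes copied from the dump, in file order
  print *, 'ChunkSize     =', le32(int(z'72'), int(z'03'), int(z'00'), int(z'00'))  ! 882
  print *, 'SampleRate    =', le32(int(z'11'), int(z'2b'), int(z'00'), int(z'00'))  ! 11025
  print *, 'Subchunk2Size =', le32(int(z'c1'), int(z'02'), int(z'00'), int(z'00'))  ! 705

contains

  ! Combine four byte values (each 0..255), least significant byte first
  pure function le32(b1, b2, b3, b4) result(v)
    integer, intent(in) :: b1, b2, b3, b4
    integer :: v
    v = ior(ior(b1, ishft(b2, 8)), ior(ishft(b3, 16), ishft(b4, 24)))
  end function le32

end program decode_le
```

Because the value is built with shifts rather than by reinterpreting memory, the result does not depend on the byte order of the machine running the program.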
 
Thanks a lot,

It makes human sense to do text in big-endian, otherwise your hexdump would show "EVAW" instead of "WAVE".

See here:
In the above link it says that Mac uses big-endian, so...

I wonder whether a Fortran compiler on the Mac corrects for that??? In the case of xwb's first example in pure ANSI F77 I can imagine that it may not.

Otherwise

read(20) ChunkSize ! 4 bytes (integer*4)

wouldn't work on a Macintosh.

The question is whether a Mac Fortran compiler corrects for that. I don't use a Mac, but it would be interesting to know.

If Fortran 95 is said to be fully portable with "BINARY" in the standard, a compiler should correct this, shouldn't it?

Regards,

Gerrit

BTW: I regularly try to click on "...for this valuable post" but the link doesn't seem to work and my statistic of having clicked doesn't change either. Anyway, thanks for the help, with or without link!
 
Hi, again

Thank you for explaining the "endians" to me. I quickly made the following program to read a WAV file as a test. The compiler that I use interprets the "endians" correctly. At least the output seems to be correct: the text printed out (by ChunkID and Format) is "RIFF" and "WAVE", not "FFIR" and "EVAW", and the numbers read are also correct.

If you cannot use "form='binary'", you must solve this by using "access='direct'" with "recl=4" and reading 4 bytes at a time, but only for the header. Note that the data size is not necessarily a multiple of 4, so you must open the file again with "recl=1" and read one byte at a time (the data size of the file that I tested was, for instance, 705 bytes).
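
The byte-at-a-time direct-access fallback could look roughly like the sketch below. It assumes a Fortran 2008 compiler and, more importantly, that one RECL unit is one byte; some compilers count RECL in 4-byte words unless told otherwise, in which case this will not address single bytes. The file name bytes.bin and the variable names are made up, and the program creates its own 8-byte test file so that it can be run as is.

```fortran
program direct_bytes
  use iso_fortran_env, only: int8
  implicit none
  integer(int8) :: b, header(8)
  integer       :: u, rl, i

  ! Create an 8-byte test file holding the values 1..8
  ! (stream access is used only to write it)
  open(newunit=u, file='bytes.bin', access='stream', form='unformatted', &
       status='replace')
  write(u) [(int(i, int8), i = 1, 8)]
  close(u)

  ! Ask the compiler how many RECL units a one-byte item needs
  inquire(iolength=rl) b

  ! Assumption: rl corresponds to exactly one byte on this compiler
  open(newunit=u, file='bytes.bin', access='direct', form='unformatted', &
       recl=rl, status='old')
  do i = 1, 8
     read(u, rec=i) header(i)   ! record i is byte i of the file
  end do
  close(u, status='delete')

  print *, header
end program direct_bytes
```

Using inquire(iolength=...) instead of a hard-coded recl=1 at least keeps the record length in whatever units the compiler uses.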

      program WAVread

      implicit none
      character*4 ChunkID
      integer*4 ChunkSize
      character*4 Format
      character*4 Subchunk1ID
      integer*4 Subchunk1Size
      integer*2 AudioFormat
      integer*2 NumChannels
      integer*4 SampleRate
      integer*4 ByteRate
      integer*2 BlockAlign
      integer*2 BitsPerSample
      character*4 Subchunk2ID
      integer*4 Subchunk2Size
      integer maxdata, ndata, i
      parameter (maxdata = 10000)
      byte data(maxdata)

!     form='binary' is a compiler extension, not standard Fortran
      open(unit=1, file='sound111.wav', status='old', form='binary')
      open(unit=2, file='Output.txt', status='unknown')

!     Header fields, in the order they appear in the file
      read(1) ChunkID
      read(1) ChunkSize
      read(1) Format
      read(1) Subchunk1ID
      read(1) Subchunk1Size
      read(1) AudioFormat
      read(1) NumChannels
      read(1) SampleRate
      read(1) ByteRate
      read(1) BlockAlign
      read(1) BitsPerSample
      read(1) Subchunk2ID
      read(1) Subchunk2Size

!     Do not read more bytes than the buffer can hold
      ndata = min(Subchunk2Size, maxdata)
      do i=1,ndata
         read(1) data(i)
      enddo

      write(2,*) ChunkID
      write(2,*) ChunkSize
      write(2,*) Format
      write(2,*) Subchunk1ID
      write(2,*) Subchunk1Size
      write(2,*) AudioFormat
      write(2,*) NumChannels
      write(2,*) SampleRate
      write(2,*) ByteRate
      write(2,*) BlockAlign
      write(2,*) BitsPerSample
      write(2,*) Subchunk2ID
      write(2,*) Subchunk2Size
      do i=1,ndata
         write(2,*) data(i)
      enddo

      close(1)
      close(2)
      end
 
Works perfectly, thanks!

Let's do some FFT....

Best Regards,

Gerrit
 
Thinking about it, it's amazing that a file with mixed big- and little-endian data in the same file is read okay.

The only conclusion I can draw, is that the compiler assumes little endian for all data except for text, where it assumes big-endian. This also explains why the *.wav file has this mixture of endians.
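
A possibly simpler explanation, offered as a sketch and not as a compiler developer's confirmation: a character string is just a sequence of single bytes stored in reading order, so there is nothing for the compiler to swap, and byte order only matters for numbers that span several bytes. The example assumes a Fortran 2008 compiler and one-byte ASCII characters; the program name is made up.

```fortran
program text_order
  use iso_fortran_env, only: int8
  implicit none
  integer(int8) :: bytes(4)

  ! View the four characters as four separate bytes, in memory order
  bytes = transfer('WAVE', bytes)

  ! Prints 57 41 56 45 (W, A, V, E) on little- and big-endian hosts alike
  print '(4(z2.2,1x))', bytes
end program text_order
```

Under that reading, the little-endian size fields come out right simply because the test machine is little-endian too, and the text fields come out right on any machine.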

Maybe a compiler developer could confirm that.

Regards,

Gerrit
 