---------------------------------------------------------------------------
<-=-=-=-=- Matthew Mclin to All -=-=-=-=>

MM> Does anybody know the format of MOD/SAM/WAV/VOC file? Info on any
MM> of those formats (how to read/write/play them using a PC Speaker or
MM> LPT 1 with a mono DAC) would be greatly appreciated.

You know, you are quite lucky that I just decided to pick up the Pascal
echo even though I'm not a Pascal programmer. I have ALL of these file
formats! Lucky you! I have had to search high and low all over the place
for this junk and you're getting it all in one shot.

Not only do I have those file formats, but I also understand how to play
them back on the PC's Internal Speaker, LPT DACs, and Sound Blaster. I'll
be posting that too. I have been interested in this field for quite a
while; that's how I gathered up all this information. If I had enough
ambition, time, and patience, I'd probably write a book on it all, because
there is not ONE SINGLE book that explains how to play digital sound
directly (ie, without special drivers), how to play it with such drivers,
what the file formats are, and includes code to do all that stuff. Gee, I
bet that would make a lot of money, perhaps I should do that after all....

Those guys on the 80XXX Assembler echo would probably be able to do a
better job as they are more knowledgeable on this, but most of them are
into writing demos and creating faster/better MOD players..

Ok, since this will take up a lot of room, I'll be splitting it up into
separate messages. The simplest stuff goes in this message.

MM> I would also like info on raw sound data and how to edit/play it.

Newbie to Digital Sound, eh? Well, you've come to the right place for
information, or rather, the right person has come to you.

Ok, the basics. A digital sound file is basically just a bunch of volume
settings. On the PC, a volume setting of 128 is normally silence. Values
farther away from 128 in either direction are louder; the greater the
distance from 128, the louder the sound.
0 and 255 are the loudest volumes. One thing I should make clear: 128 is
not necessarily silence. When making a recording, there is always
background noise. So, what may sound like silence to you is actually
126-130 or so.

Now, you have probably seen those neat little graphs that some programs
make when displaying a digital sound file. VEdit (which comes with the
Sound Blaster) shows the waveform in the modify part of it. If you wanted
to display a graph yourself, you could just load in a byte from the file,
then use that byte for the Y location. The X location is where in the
file you are (which byte). You just keep loading in bytes until you reach
the end of the screen.

I could go on and on, but this is just a message, not a book!

Hmm, you said you wanted to play a digital sound file on the PC's
Internal Speaker and on a printer port DAC. Well, here comes that part.
I'll explain usage of printer port DACs first because they are easier to
understand.

To play a VOC, WAV, SND, etc file on the DAC, you just read in one byte
of sample data from the file, output it to the printer port, wait long
enough to match the file's sampling rate, and do it again with the next
byte. To get the I/O address of the printer port, read the word at memory
location 40h:8h for LPT1, 40h:0Ah for LPT2, 40h:0Ch for LPT3, and, if on
a non-PS/2, 40h:0Eh for LPT4.

The internal speaker is a bit more tricky: you have to do certain things
to set it up correctly before outputting sound. Before you do ANY sound
output, you must do the following (sorry, I'm not a Pascal programmer, so
this is in Assembler):

        mov  al, 0B6h   ;Please make note: This code was
        out  43h, al    ;written by a friend of mine in
        mov  al, 0FFh   ;Australia named Phil Inch. He
        out  42h, al    ;posted code in the 80x86 Assembler
        mov  al, 0      ;echo (GTPN, not Fido) for the
        out  42h, al    ;public domain. Thanks Phil!!
        mov  al, 90h    ;(timer 2: low byte only, mode 0)
        out  43h, al
        in   al, 61h    ;(bits 0 and 1 of port 61h connect
        or   al, 3      ; timer 2 to the speaker)
        out  61h, al

Ok, the above sets the timer chip up correctly. From there it is pretty
simple. Get a byte from the sound file. Divide the byte by a 'shift'
number (I'll explain about this later).
Then, output this new byte to port 42h. Repeat this for the whole file.

Ok, now, about that shift value. The PC's Internal Speaker wasn't
designed for playing digital sound; it's just that brainy guys like Phil
have figured out how to do with software what should have been done with
hardware.. Anyway, the PC's Internal Speaker isn't very loud, so the range
of volumes is much smaller than on a Sound Blaster or printer port DAC.
This shift value varies from computer to computer; it depends on the size
of your speaker and other stuff. Generally, a shift value of 4 works on
all computers. On my computer, I can get away with 3 on most files. The
smaller the shift value, the louder the file will be played, but too
small a shift value will cause distortion. Experiment!

After you are finished playing the sound file, you must put the timer
chip back the way it was, or otherwise the next program that tries to
make a noise on the internal speaker will make the noise but will not
stop! Here is the code for that (again, sorry about the Assembler, it's
just that I'm not a Pascal programmer):

        mov  al, 0B6h
        out  43h, al
        in   al, 61h
        and  al, 0FCh
        out  61h, al

There, that should do it. I hope I haven't totally confused you. Please
write back if you have ANY questions what-so-ever. Gee, I'm already on
line 107, time to go to a new message!

MM> Note that these .MOD
MM> and .SAM files are in the Amiga Module format (just incase there are
MM> any others). Oh, there's also the .SND files. Or even .MID/.MDI files
MM> if you can play them thru a DAC on an LPT port or the PC Speaker. Note
MM> that I don't have a Sound Blaster (or any other sound card). Thanks.

SAM Files:

As far as I know, these do not contain any header or specific structure.
They are just raw sound files. The only trick you have to remember about
these files is that they are signed, which means that when bit 7 (the
high bit) is set, the number is negative. When bit 7 is clear, the number
is positive.
This is completely different from digital sound files that originated on
the PC. Remember, MOD and SAM files originated on the Amiga, so they have
this weird encoding.

To convert a signed file to an unsigned file, just read in one byte from
the original file. Add 128 to that byte, keeping only the low 8 bits of
the result. Output the answer to a new file. In the Amiga world, a byte
of 0 is equivalent to silence. A byte of -128 (or +127) is as loud as it
gets on the Amiga. On the PC, however, 0 (or 255) is as loud as it gets,
and a byte of 128 is equivalent to silence. So, when we add 128 to -128,
we get zero, which on the PC is the same volume as -128 on the Amiga.

WAV Files:

The following text was written by Edward Schlunder and was based on
information provided by Tony Cook on the GT Power Network's 80x86
Assembler echo.

                        WAV File Format
                 By: Edward Schlunder. 5-17-93

BYTE(S)    NORMAL CONTENTS    PURPOSE/DESCRIPTION
---------------------------------------------------------------------------
00 - 03    "RIFF"             Just an identification block. The quotes
                              are not included.
04 - 07    ???                This is a long integer. It tells the number
                              of bytes in the file that follow this
                              field (the file length minus 8).
08 - 11    "WAVE"             Just another I.D. thing.
12 - 15    "fmt "             Just another I.D. thing.
16 - 19    16, 0, 0, 0        Size of the format information that
                              follows (bytes 20 - 35).
20 - 21    1, 0               Format tag (1 means uncompressed PCM).
22 - 23    1, 0               Channels (1 = mono, 2 = stereo).
24 - 27    ???                Sample rate, or (in other words), samples
                              per second.
28 - 31    ???                Average bytes per second.
32 - 33    1, 0               Block align (bytes per sample across all
                              channels).
34 - 35    8, 0               Bits per sample. Ex: Sound Blaster can only
                              do 8, Sound Blaster 16 can make 16.
                              Normally, the only valid values are 8, 12,
                              and 16.
36 - 39    "data"             Marker that comes just before the actual
                              sample data.
40 - 43    ???                The number of bytes in the sample.

Information from Tony Cook, Australia. GT Power 80x86 Assembler echo.
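The WAV header table above can be read in a few lines. What follows is only a minimal sketch in Python, not anything from Edward's or Tony's text: the function name `parse_wav_header` and the dictionary keys are made up for illustration, and it assumes the simplest possible file, where the "fmt " block comes first and "data" follows it immediately, exactly as in the table.

```python
import struct

# The 44 bytes from the table above, all little-endian:
# "RIFF", size, "WAVE", "fmt ", 16, tag, channels, rate,
# bytes/sec, block align, bits, "data", data size.
WAV_HEADER = struct.Struct("<4sI4s4sIHHIIHH4sI")


def parse_wav_header(data):
    """Decode the fixed 44-byte header (hypothetical helper).

    Assumption: the file uses exactly the layout in the table. Files with
    extra blocks between "fmt " and "data" are rejected, not handled.
    """
    if len(data) < WAV_HEADER.size:
        raise ValueError("too short to hold a 44-byte WAV header")
    (riff, riff_size, wave, fmt_id, fmt_size, format_tag, channels,
     sample_rate, bytes_per_sec, block_align, bits_per_sample,
     data_id, data_size) = WAV_HEADER.unpack_from(data, 0)
    if riff != b"RIFF" or wave != b"WAVE" or fmt_id != b"fmt ":
        raise ValueError("not a RIFF/WAVE file")
    if fmt_size != 16 or data_id != b"data":
        raise ValueError("header does not match the simple table layout")
    return {
        "format_tag": format_tag,
        "channels": channels,
        "sample_rate": sample_rate,
        "bytes_per_sec": bytes_per_sec,
        "block_align": block_align,
        "bits_per_sample": bits_per_sample,
        "data_offset": WAV_HEADER.size,   # sample data starts at byte 44
        "data_size": data_size,
    }
```

With 8-bit samples the bytes starting at `data_offset` are already unsigned (128 is silence), so they can go to a DAC as they are.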
VOC File Format:

This file format description was written by Phil Inch on the 80x86
Assembler echo on the GTPN. Thanks Phil!!

BYTE(S)    NORMAL CONTENTS    PURPOSE/DESCRIPTION
---------------------------------------------------------------------------
00 - 19    "Creative Voice File", 26
                              Just an identification block. The quotes
                              are not included, and the 26 is byte 26
                              (1Ah) which is an end-of-file marker.
                              Therefore, if you TYPE a VOC file, you will
                              just see Creative Voice File.
20 - 21    26, 00             This is a low byte, high byte sequence
                              which gives the offset of the first block
                              of sound data in the file. Currently this
                              is 26 ( 00 x 256 + 26 ) which is the length
                              of the header, but it's probably good
                              programming practice to read and use this
                              value anyway in case the format changes
                              later.
22 - 23    10, 1              These bytes give the version number of the
                              VOC file, subnumber first, then main
                              number. The default, as you can see, is
                              1.10.
24 - 25    41, 17             These bytes are "check digits". These allow
                              you to be absolutely SURE that you are
                              working with a VOC file.

To use the check digits, convert the version number (bytes 22 - 23) and
the check number (bytes 24 - 25) to integers. Do this with the formula
below, where for convention the bytes of each pair have been listed as
byte1, byte2.

        (byte2*256)+byte1

Therefore, for the default values we get the following integers:

        ( 1 x 256)+10 =  266
        (17 x 256)+41 = 4393

When you add the two results, you get 4659. If you do these calcs and get
4659, then you can be almost certain you're working with a VOC file.

OK, that takes care of the header information. I hope you realise that
I'll never get a registration for VOCHDR now! Oh well, perhaps people will
buy my games!

Having gotten to byte 26, we now start encountering data blocks. There
are eight types in all, conveniently numbered 0 - 7. For each block, the
first byte will always tell you the type. For notational convenience, bx
means byte x, eg b5 means byte 5.
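The header check described above comes out to only a few lines. This is just a sketch in Python under stated assumptions: the name `parse_voc_header` is invented for illustration, and masking the sum to 16 bits is my own precaution for version numbers large enough to make the sum wrap; Phil's text only gives the plain 4659 test.

```python
import struct

VOC_MAGIC = b"Creative Voice File\x1a"   # 19 characters plus the 1Ah marker


def parse_voc_header(data):
    """Check the 26-byte VOC header (hypothetical helper).

    Returns (offset of first block, main version, sub version).
    """
    if len(data) < 26 or data[:20] != VOC_MAGIC:
        raise ValueError("missing Creative Voice File signature")
    # Bytes 20-25: three low-byte-first words.
    data_offset, version, check = struct.unpack_from("<HHH", data, 20)
    # Version word plus check word should come to 4659.
    # Assumption: keep only 16 bits in case the sum overflows a word.
    if (version + check) & 0xFFFF != 4659:
        raise ValueError("check digits do not match the version number")
    return data_offset, version >> 8, version & 0xFF
```

For the default header bytes (26, 0, 10, 1, 41, 17) this gives an offset of 26 and version 1.10, and the returned offset is where the first block's type byte should be read from.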
BLOCK 0 - THE "END BLOCK"

   Structure:  Byte 1:  '0' to denote "end block" type

   This block is located at the END of a VOC file. When a VOC player
   encounters a block 0, it should stop playing the VOC file.

BLOCK 1 - THE "DATA BLOCK"

   Structure:  Byte 1:  '1' to denote "data block" type
                    2:  \
                    3:   | These bytes give the length:
                    4:  /  b2 + (b3*256) + (b4*65536)
                    5:  Sampling rate: Calculated as 1000000 / (256-b5)
                    6:  Pack type byte: 0 = data is not packed
                                        1 = data is packed to 4 bits
                                        2 = data is packed to 2.6 bits
                                        3 = data is packed to 2 bits
                    7:  Actual sample data starts here

BLOCK 2 - THE "MORE DATA BLOCK"

   Structure:  Byte 1:  '2' to denote "more data block" type
                    2:  \
                    3:   | These bytes give the length:
                    4:  /  b2 + (b3*256) + (b4*65536)
                    5:  Actual sample data starts here

   The point of this is simple: If you have a sample that you want to chop
   up into smaller portions (the maximum block length in a VOC file is
   16,777,215 bytes but who's counting?), then define a "more data" block.
   This "carries over" the previously found sampling rate and pack type
   byte, so a "data block" should have been encountered earlier somewhere
   along the line.

BLOCK 3 - THE "SILENCE" BLOCK

   Structure:  Byte 1:  '3' to denote "silence block" type
                    2:  \
                    3:   | These bytes give the length:
                    4:  /  b2 + (b3*256) + (b4*65536)
                           (Note that this value is usually 3 for a
                           silence block.)
                    5:  Duration ( b5+(b6*256) ). This gives the equivalent
                    6:  number of bytes to "play" during the silence.
                    7:  Sampling rate: Calculated as 1000000 / (256-b7)

   A silence block is used for long periods of silence. When long silences
   are required, it's more efficient in size terms to insert one of these
   blocks, as seven bytes can then represent up to 65,536.

BLOCK 4 - THE "MARKER BLOCK"

   Structure:  Byte 1:  '4' to denote "marker block" type
                    2:  \
                    3:   | The length of the block, as usual
                    4:  /
                    5:  Marker value, as low-high (ie b5 + (b6*256) )
                    6:

   The marker block is read by CT-VOICE.DRV.
   When a marker block is encountered, the value in the marker value bytes
   (5 and 6) is copied into the status word specified when CT-VOICE was
   initialized. This allows your program to judge where in the sample you
   currently are, thus allowing for progress counters and the like.

   It's also useful if you're trying to synchronize other processes to the
   playing of the sound. For example, by using appropriate marker blocks,
   you could send signals to your software to move the lips of a person
   on-screen in time with the speech in the VOC. However, this does take
   some doing and a VERY good VOC editor!

BLOCK 5 - THE "MESSAGE BLOCK"

   Structure:  Byte 1:  '5' to denote "message block" type
                    2:  \
                    3:   | The length of the block, as usual
                    4:  /
                5 - ?:  Message, as ASCII text.
                    ?:  0, to denote end of text

   The message block simply allows you to embed text into a VOC file.
   Presumably you could use this to detect when other people have pinched
   your VOC files for their own applications.

BLOCK 6 - THE "REPEAT BLOCK"

   Structure:  Byte 1:  '6' to denote "repeat block" type
                    2:  \
                    3:   | The length of the block, as usual
                    4:  /
                    5:  Number of times that data should be repeated
                    6:  Total = 1 + b5 + (b6*256)

   Every "playable" data block between a block 6 and a block 7 will be
   repeated the number of times specified in b5 and b6. Note that you add
   one to this value - the data blocks are ALWAYS played at least once.
   However, if b5 and b6 are zero, then you really don't need a repeat
   block, do you!

   I'm told that you cannot "nest" repeat blocks, but I've never tried it.
   This limitation would only apply to CT-VOICE.DRV I would have thought,
   but it depends how good other VOC players are.

BLOCK 7 - THE "END REPEAT BLOCK"

   Structure:  Byte 1:  '7' to denote "end repeat block" type
                    2:  \
                    3:   | The length of the block, as usual
                    4:  /

   This, as explained, marks the end of the block of blocks (!) that you
   wish to repeat. Note that the "length" is always zero, so I don't know
   why the length bytes are required at all.
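Stepping through the blocks described above might look like the following. It is a rough Python sketch, not code from CT-VOICE.DRV or from Phil: `iter_voc_blocks` and `describe_data_block` are hypothetical names, and it assumes the three length bytes count everything after byte 4 of the block, which is what the "usually 3" silence block and the zero-length block 7 suggest.

```python
def iter_voc_blocks(data, offset=26):
    """Yield (block_type, body) pairs until an end block (type 0).

    Assumption: the 3-byte length covers the bytes after byte 4, so in a
    data block it includes the sampling rate and pack type bytes.
    """
    while offset < len(data):
        block_type = data[offset]
        if block_type == 0:
            yield 0, b""                  # end block: stop playing
            return
        if offset + 4 > len(data):
            raise ValueError("truncated block header")
        # b2 + (b3*256) + (b4*65536)
        length = (data[offset + 1]
                  | (data[offset + 2] << 8)
                  | (data[offset + 3] << 16))
        body = data[offset + 4:offset + 4 + length]
        if len(body) != length:
            raise ValueError("block runs past the end of the file")
        yield block_type, body
        offset += 4 + length


def describe_data_block(body):
    """Split a type 1 body into (sampling rate, pack type, samples)."""
    rate = 1000000 // (256 - body[0])     # b5 of the block
    return rate, body[1], body[2:]        # b6, then data from b7 on
```

A player would call `iter_voc_blocks(data, offset)` with the offset read from header bytes 20 - 21 and act on each block type in turn; a rate byte of 166, for example, works out to 1000000 / 90, or about 11111 samples per second.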
---------------------------------------------------------------------

This was picked up off the 80XXX Assembler echo on FidoNet. There are
many other file formats for MODs, but I have found this one to be the
most complete.

Protracker 2.3A Song/Module Format:
-----------------------------------

Offset  Bytes  Description
------  -----  -----------
   0     20    Songname. Remember to put trailing null bytes at the end...
               When written by ProTracker this will be only uppercase;
               there are only historical reasons for this. (And the
               historical reason is that Karsten Obarski, who made the
               first SoundTracker, was stupid.)

Information for sample 1-31:

Offset  Bytes  Description
------  -----  -----------
  20     22    Samplename for sample 1. Pad with null bytes. Will only be
               uppercase. The samplenames are often used for storing
               messages from the author; in particular, samplenames
               starting with a '#' sign will generally be a message. This
               convention is a result of a player called IntuiTracker
               displaying all samples starting with # as a message to the
               person playing the module.
  42      2    A WORD with samplelength for sample 1. Stored as number of
               words. Multiply by two to get real sample length in bytes.
               This is a big-endian number; for all PC programmers out
               there, this means that to get your 8-bit-originated format,
               you have to swap the two bytes.
  44      1    Lower four bits are the finetune value, stored as a signed
               four bit number. The upper four bits are not used, and
               should be set to zero. They should also be masked out when
               reading; you can never be sure what some stupid program
               could have stored here...
  45      1    Volume for sample 1. Range is $00-$40, or 0-64 decimal.
  46      2    Repeat point for sample 1. Stored as number of words offset
               from start of sample. Multiply by two to get offset in
               bytes.
  48      2    Repeat Length for sample 1. Stored as number of words in
               loop. Multiply by two to get replen in bytes.

Information for the next 30 samples starts here. It's just like the info
for sample 1.
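Reading one of these sample records is mostly a matter of getting the byte order and the finetune nibble right. Here is a minimal Python sketch, assuming only what the table for sample 1 says: the record runs from offset 20 to 49, so each record is taken to be 30 bytes, and the name `parse_sample_info` and the dictionary keys are invented for illustration.

```python
import struct

# One sample record, big-endian: 22-byte name, length in words,
# finetune byte, volume byte, repeat point and repeat length in words.
SAMPLE_INFO = struct.Struct(">22sHBBHH")   # 30 bytes in total


def parse_sample_info(module, sample_number):
    """Decode the record for sample 1..31 (hypothetical helper)."""
    offset = 20 + SAMPLE_INFO.size * (sample_number - 1)
    (name, length_words, finetune, volume,
     repeat_words, replen_words) = SAMPLE_INFO.unpack_from(module, offset)
    finetune &= 0x0F                # mask out the unused upper four bits
    if finetune > 7:                # signed four bit number: 8..15 = -8..-1
        finetune -= 16
    return {
        "name": name.split(b"\0", 1)[0].decode("ascii", "replace"),
        "length": length_words * 2,          # words to bytes
        "finetune": finetune,
        "volume": volume,
        "repeat_offset": repeat_words * 2,
        "repeat_length": replen_words * 2,
    }
```

The `>` in the format string does the byte swap the text mentions, so no manual swapping is needed on a PC.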
Offset  Bytes  Description
------  -----  -----------
  50     30    Sample 2...
  80     30    Sample 3...
   .
   .
   .
 890     30    Sample 30...
 920     30    Sample 31...

Offset  Bytes  Description
------  -----  -----------
 950      1    Songlength. Range is 1-128.
 951      1    This byte is set to 127, so that old trackers will search
               through all patterns when loading. Noisetracker uses this
               byte for restart, ProTracker doesn't.
 952    128    Song positions 0-127. Each holds a number from 0-63 (or
               0-127) that tells the tracker what pattern to play at that
               position.
1080      4    The four letters "M.K." - This is something Mahoney & Kaktus
               inserted when they increased the number of samples from 15
               to 31. If it's not there, the module/song uses 15 samples or
               the text has been removed to make the module harder to rip.
               Startrekker puts "FLT4" or "FLT8" there instead. If there
               are more than 64 patterns, PT2.3 will insert M!K! here.
               (Hey - Noxious - why didn't you document the part here
               relating to YOUR OWN PROGRAM? -Vishnu)

Offset  Bytes  Description
------  -----  -----------
1084   1024    Data for pattern 00.
   .
   .
   .
xxxx           Number of patterns stored is equal to the highest
               patternnumber in the song position table (at offset
               952-1079) plus one, since patterns are numbered from 00.

Each note is stored as 4 bytes, and all four notes at each position in
the pattern are stored after each other.

        00 -  chan1  chan2  chan3  chan4
        01 -  chan1  chan2  chan3  chan4
        02 -  chan1  chan2  chan3  chan4
        etc.

Info for each note:

 _____byte 1_____   byte2_    _____byte 3_____   byte4_
/                \ /      \  /                \ /      \
0000          0000-00000000  0000          0000-00000000

Upper four    12 bits for    Lower four    Effect command.
bits of sam-  note period.   bits of sam-
ple number.                  ple number.

One thing you should keep in mind about MOD files is that they originated
on the Amiga, so the samples are signed; see the discussion about SAM
files for more information.

Note: Sounder and Sound Tool both use the same file extension, but have
different file formats.
To tell the difference, read the first 6 bytes of the file. If they match
the magic number for Sound Tool .SND files, it is a Sound Tool file.
Otherwise, it's a Sounder file or a raw file.

Sounder File Format:

BYTE(S)    NORMAL CONTENTS    PURPOSE/DESCRIPTION
---------------------------------------------------------------------------
00 - 01    0, 0               Bits per sample. Ex: Sound Blaster can only
                              do 8, Sound Blaster 16 can make 16.
                              Normally, the only valid value is 0, which
                              is the code for an 8 bit sample. Future
                              versions of Sounder and DSOUND.DLL may
                              allow 16 bit samples and such.
02 - 03    ???                Sampling rate. Currently, only 22 KHz,
                              11 KHz, 7.33 KHz, and 5.5 KHz are valid. If
                              given a value like 9 KHz, it will be played
                              at the closest valid rate (in this case,
                              11 KHz). The sampling rate is calculated as
                              follows:
                              SampRate = Byte1 + (256 * Byte2)
04 - 05    ???                Volume to play the sample back at. Note: On
                              the PC's Internal Speaker, there is a
                              definite upper limit as to the volume,
                              depending on the shift value (see below).
                              The Sound Blaster and the Disney Sound
                              Source aren't quite as restricted, but they
                              still have a limit at some high value.
06 - 07    4, 0               Shift value. This is the number that each
                              byte is divided by to "scale" the volume
                              down to a point where the PC's Internal
                              Speaker can handle it. See the discussion
                              on playing back digitized sound for more
                              details.

Information from Sounder text files and Sound Tool help (.HLP) files.
Rewritten by Edward Schlunder

Sound Tool File Format:

BYTE(S)    NORMAL CONTENTS    PURPOSE/DESCRIPTION
---------------------------------------------------------------------------
00 - 05    "SOUND", 26        Just an identification thing. Helps a lot
                              when you are trying to distinguish between
                              Sounder .SND files and Sound Tool .SND
                              files.
08 - 11    ???                This is the number of bytes in the sample.
                              It is calculated as follows:
                              ByteSam = Byte1 + (256 * Byte2) +
                                        (65536 * Byte3) +
                                        (16777216 * Byte4)
12 - 15    ???                This points to the first byte to play in
                              the file.
                              It is calculated the same way as the number
                              of bytes in the sample (see above).
16 - 19    ???                This points to the last byte in the sample
                              to play. Calculated the same as above.
20 - 21    ???                Sampling rate of the sample. Valid values
                              are 22 KHz, 11 KHz, 7.33 KHz, and 5.5 KHz,
                              but if given a number not listed above, it
                              will be played at the closest valid
                              sampling rate. So, 9 KHz would be played at
                              11 KHz. This is calculated as follows:
                              SamRate = Byte1 + (256 * Byte2)
22 - 23    ???                Bits per sample. Ex: Sound Blaster can only
                              do 8, Sound Blaster 16 can make 16.
                              Normally, the only valid value is 0, which
                              is the code for an 8 bit sample. Future
                              versions of Sounder and DSOUND.DLL may
                              allow 16 bit samples and such.
24 - 25    ???                Volume to play the sample back at. Note: On
                              the PC's Internal Speaker, there is a
                              definite upper limit as to the volume,
                              depending on the shift value (see below).
                              The Sound Blaster and the Disney Sound
                              Source aren't quite as restricted, but they
                              still have a limit at some high value.
26 - 27    4, 0               Shift value. This is the number that each
                              byte is divided by to "scale" the volume
                              down to a point where the PC's Internal
                              Speaker can handle it. See the discussion
                              on playing back digitized sound for more
                              details.
28 - 123   ???                This is the name of the sample. It is
                              followed by an ASCII 0.

Information from Sounder text files and Sound Tool help (.HLP) files.
Rewritten by Edward Schlunder
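Telling the two .SND flavours apart and pulling out the fields listed in the tables can be sketched as below. This is only an illustration in Python: `parse_snd_header` and its dictionary keys are made-up names, bytes 06 - 07 of the Sound Tool header are skipped because the table does not describe them, and anything without the "SOUND" signature is simply treated as a Sounder header, even though it could just as well be a raw file.

```python
import struct

SOUND_TOOL_MAGIC = b"SOUND\x1a"                      # "SOUND" plus byte 26
SOUND_TOOL = struct.Struct("<6s2xIIIHHHH96s")        # bytes 00 - 123
SOUNDER = struct.Struct("<HHHH")                     # bytes 00 - 07


def parse_snd_header(data):
    """Guess the .SND flavour and decode its header (hypothetical helper)."""
    if data[:6] == SOUND_TOOL_MAGIC:
        if len(data) < SOUND_TOOL.size:
            raise ValueError("truncated Sound Tool header")
        (_, length, first, last, rate, bits_code,
         volume, shift, name) = SOUND_TOOL.unpack_from(data, 0)
        return {
            "kind": "Sound Tool",
            "length": length,        # Byte1 + 256*Byte2 + 65536*Byte3 + ...
            "first": first,
            "last": last,
            "rate": rate,
            "bits_code": bits_code,  # 0 is the code for 8 bit samples
            "volume": volume,
            "shift": shift,
            "name": name.split(b"\0", 1)[0].decode("ascii", "replace"),
        }
    if len(data) < SOUNDER.size:
        raise ValueError("too short for a Sounder header")
    # Assumption: no signature means Sounder; a raw file looks the same here.
    bits_code, rate, volume, shift = SOUNDER.unpack_from(data, 0)
    return {"kind": "Sounder", "bits_code": bits_code, "rate": rate,
            "volume": volume, "shift": shift}
```

Since a raw file has no header at all, it is probably worth checking that the Sounder fields look sane (bits code 0, a rate near one of the valid values) before trusting them.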