Audiovisuals on the Net

Audio on the Web

Sound is an often-overlooked aspect of interactive design. The right sound can make or break a project.

Background music can set an emotional tone, dialogue and narration supplement written passages, and sound effects clue the user in to interactive functions (such as button clicks).

Children with learning difficulties often gain a better understanding of the content if a narrator vocalizes the text. A narrator can use inflection to emphasize words that might not otherwise resonate, or to convey emotion.

Poorly recorded or poorly chosen sound works against your message.


Transduction — the conversion of an analog signal into discrete digital units.
Analog-to-Digital (A-to-D) — The conversion equipment used to digitize (like your sound card).

AAD – Analog Record, Analog Produce, Digital Master

ADD – Analog Record, Digital Produce, Digital Master

DDD – Digital Record, Digital Produce, Digital Master

DDD is the cleanest

Bits — Like all digital info, ones or zeros. The more bits, the more accurate the sound reproduction.
Sample Rate & Bit-Depth — Fineness of the recording & discrete digital steps. The higher the sample rate and bit depth recorded, the better the quality, and the more of the sound’s natural details and range will be preserved. Sample rate is usually measured in thousands of samples per second (kHz).

8-bit Sound uses 256 amplitude levels.

16-bit sound uses 65,536 amplitude levels.
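The number of amplitude levels follows directly from the bit depth: each extra bit doubles the count, so n bits give 2^n levels. A minimal Python sketch (the function name is only illustrative):

```python
def amplitude_levels(bit_depth: int) -> int:
    """Number of distinct amplitude steps a sample of this bit depth can hold."""
    return 2 ** bit_depth

for bits in (8, 16, 24):
    print(f"{bits}-bit: {amplitude_levels(bits):,} levels")
```

A signed 16-bit sample ranges from -32,768 to 32,767, which makes 65,536 distinct levels in total.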

Stereo — Two channels take up more data space than one.
Monophonic — One channel.
Waveform — Graphical depiction of the amplitude of a sound stretched across time.
Sound Format         Sampling Rate  Sample Size   Channels  KB/min  MB/min
DVD-Audio & SACD     192 kHz        Up to 24 bit  6         576
CD Stereo            44.1 kHz       16 bit        2         10,560  >10
High-end Mac Stereo  22 kHz         16 bit        2         5,280   >5
Best Mac Stereo      22 kHz         8 bit         2         2,640   >2.5
Best Mac Mono        22 kHz         8 bit         1         1,320   >1.3
Other                11.127 kHz
Other                11.025 kHz
Lowest / Worst       8 kHz
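The per-minute figures for uncompressed audio can be estimated as sample rate × bytes per sample × channels × 60. The sketch below assumes plain uncompressed (PCM) audio; the function name is made up for illustration.

```python
def bytes_per_minute(sample_rate_hz: float, bit_depth: int, channels: int) -> float:
    """Uncompressed (PCM) audio size for 60 seconds of sound, in bytes."""
    return sample_rate_hz * (bit_depth / 8) * channels * 60

cd = bytes_per_minute(44_100, 16, 2)
mac_mono = bytes_per_minute(22_000, 8, 1)
print(f"CD stereo:      {cd / 1_000_000:.1f} MB/min")
print(f"22 kHz 8-bit mono: {mac_mono / 1_000_000:.2f} MB/min")
```

The table's numbers appear to round the rates to 44 kHz and 22 kHz, so an exact 44.1 kHz calculation comes out slightly higher than 10,560.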


MPEG (e.g., MP3) — Uses lossy compression that strips out sounds that are not discernible to the human ear to achieve very high compression ratios (from 4:1 to 12:1) while maintaining near-original sound quality.

Audio Tips & Tricks:

  • Keep everything DDD if you can. This will give you cleaner sound.
  • Make sure you normalize your files. Normalizing lets you trim off amplitudes above or below certain levels. This leaves you with a steady and stable signal that doesn’t peak or drop out dramatically, which is very important for digitized audio.
  • Use a good pair of headphones or speakers as your primary monitor while editing sounds.
  • Equalize your files. Equalizing will ensure your output contains the proper ratio of treble to bass. Be careful with the bass. Some smaller computer speakers can’t handle it very well.
  • Test your audio on different systems. Play your sound through multiple platforms and computer setups (with different speakers – including internal) just like you normally should do with Web pages.
  • Make a copy and play around with it. Have fun and go nuts. Run backwards, fool around with the EQ, play at half speed, whatever. This is good practice that helps you learn about audio.
  • Keep it simple. Resist the temptation to get carried away with all the special effects digital editors allow you to apply to your sound files.
  • Take advantage of system-wide integration (especially when it comes to working with Flash). Digital audio can be tracked with absolute precision, so remember to focus on the big picture.
  • Use the best equipment you can
  • Use the Internet to learn about your tools — Audio is a very complex area. Consider joining a relevant online discussion group for research purposes (Powell, 1998).
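As an illustration of the normalizing tip, one common form is peak normalization: every sample is multiplied by the same gain so that the loudest peak lands at a chosen target. This is a minimal sketch, assuming samples are floats between -1.0 and 1.0; the function name and the 0.9 target are illustrative choices, not settings from any particular editor.

```python
def normalize_peak(samples, target_peak=0.9):
    """Scale all samples by one factor so the loudest one reaches target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # pure silence: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]

quiet_clip = [0.1, -0.3, 0.2, -0.05]
louder_clip = normalize_peak(quiet_clip)
print(louder_clip)
```

Because one gain is applied to the whole clip, the relative loudness of its parts is unchanged; only the overall level moves.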

How to Keep Sound File Sizes Small for Quicker Downloads:

  • Keep the length of your audio clips short — the shorter, the better.
  • Limit the number of channels you use — mono is half the size of stereo.
  • Reduce your bit depth to only what you need — 8 bits is half the size of 16 bits.
  • Keep your sampling rate down — 22.05 kHz is half the size of 44.1 kHz. Voice-only audio files can be reduced down to 8 kHz without any discernible loss in quality. Sound effects will work at 8 kHz or 11.025 kHz. Music will sound acceptable at 22 kHz.
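The three reductions in the list (fewer channels, lower bit depth, lower sampling rate) can be sketched on raw sample data. This is a simplified illustration, assuming signed 16-bit stereo frames given as (left, right) tuples; the helper names are hypothetical, and a real resampler would low-pass filter before discarding samples.

```python
def to_mono(stereo_frames):
    """Average the left and right sample of each (left, right) frame."""
    return [(left + right) // 2 for left, right in stereo_frames]

def halve_sample_rate(samples):
    """Keep every second sample (naive: no anti-aliasing filter)."""
    return samples[::2]

def to_8_bit(samples_16):
    """Map signed 16-bit samples (-32768..32767) to unsigned 8-bit (0..255)."""
    return [(s + 32768) >> 8 for s in samples_16]

frames = [(0, 0), (1000, 3000), (-32768, -32768), (32767, 32767)]
small = to_8_bit(halve_sample_rate(to_mono(frames)))
print(small)  # [128, 0]
```

The four stereo 16-bit frames occupy 16 bytes; the result occupies 2 bytes, since each of the three steps halves the size.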

Sound Sources:

Commercial Recordings require copyright permission. This can be prohibitively expensive.

Hiring professionals is also expensive.

You can also download many clip sounds (often for free):

You can also buy clip sounds on CDs from companies like Creative Support Services:

If you decide to record sounds yourself, here are some software suggestions…


Editing may be required using waveforms to:

  • Replace one garbled word with another.
  • Clean up and condense a narrated passage (trimming out unwanted sounds).
  • Lengthen a music score by copying some of the notes and tacking them onto the end.
  • Create seamless music loops that play through periods of inactivity.
  • Combine several soundtracks to form a more complex soundtrack.
  • Synchronize sounds with other media elements.
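Two of these edits, trimming unwanted silence and lengthening a clip by repeating it, come down to simple operations on the sample list. The sketch below assumes samples are floats between -1.0 and 1.0; the function name and the 0.02 threshold are illustrative only.

```python
def trim_silence(samples, threshold=0.02):
    """Drop leading and trailing samples quieter than the threshold."""
    loud = [i for i, s in enumerate(samples) if abs(s) >= threshold]
    if not loud:
        return []  # the whole clip is below the threshold
    return samples[loud[0]:loud[-1] + 1]

narration = [0.0, 0.01, 0.5, -0.4, 0.0, 0.3, 0.0]
trimmed = trim_silence(narration)
looped = trimmed * 2  # lengthen by tacking a copy onto the end
print(trimmed)  # [0.5, -0.4, 0.0, 0.3]
print(len(looped))  # 8
```

Quiet samples in the middle of the clip are kept, so pauses inside a narrated passage survive; only the dead air at either end is removed.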

Using Sound in your Document:

Sound should be planned from the very start, at the storyboarding stage.

Let the user have the option of turning off or adjusting the sound volume.

Think of the way sound is used to create atmosphere in films and try to incorporate that level into your design.

Choose the right voice talent for any planned narration to establish a stylistic “feel”. The right voice, accent, or inflection can add a personal warm touch and character to the content. The wrong voice, accent, or inflections can seem inappropriate or even grating to the user.

Voices that are used separately from video will be much more susceptible to scrutiny than if a narrator appears in your project.

Although sound effects can add depth and richness to the user’s experience, if they are used too often or the volume is too loud, they can quickly annoy the user. Choose sound effects wisely, and only use those sounds that add meaning to your project. (Graham, 1999)

Online Delivery Strategies:

Nonstreaming vs. Streaming:

Nonstreaming audio must be downloaded completely before playback, which usually means larger files: one minute of CD-quality WAV music comes to more than 10 MB. Most common file formats can be played on a variety of external sound players.

File Formats:

WAVE (.wav) — Originally developed as the standard audio format for Windows, but now supported on Mac as well. Most commonly used at 8 kHz to 11.025 kHz at 8 or 16 bit. Similar performance to AIFF.

AIFF (.aif, .aiff) — Audio Interchange File Format, developed as Mac standard, but now supported on PCs as well. Most commonly used at 8 kHz to 11.127 kHz at 8, 16, or 32 bit. Similar performance to WAVE.

u-LAW (.au) (pronounced myoo-law) — Unix standard. Most commonly used at 8.013 kHz, 22.05 kHz, and 44.1 kHz. Waning in popularity.

MPEG (.mpa, .mp2, .mp3) — Maintains near-original sound quality at compression ratios as high as 10:1. The MPEG-1 standard was originally developed for video transfer at VHS quality; MPEG-2 was developed as a higher standard for television broadcast; later standards (MPEG-4 and MPEG-7) address other needs. MPEG audio itself comes in three layers (Layer 1, 2, and 3, the last being MP3). The higher the layer number, the more complex the coding, and the more powerful a processor you’ll need to compress and decompress. Visit for more info.

MIDI (.mid) — Musical Instrument Digital Interface. A different breed, originally developed as a standard way for electronic instruments to communicate with each other. MIDI files contain no actual audio information, but rather a set of mathematical commands that describe a series of notes.

MIDIs are to other audio formats what vector graphics are to bitmaps. As a result, MIDI files are incredibly compact. They are capable of packing a minute of music into just 10K, which is 1000 times smaller than a one minute WAVE file.

Unfortunately, MIDIs can only contain notes, not real sounds — so they are useful mainly for synthesizer-sounding music.
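To see why MIDI files are so small, it helps to build one by hand. The sketch below writes a minimal Standard MIDI File (format 0, one track) that plays a C major scale; each note costs only a few bytes because the file stores "note on" and "note off" commands rather than audio. The function names, the fixed velocity of 64, and the one-beat note length are illustrative assumptions.

```python
import struct

def var_len(value: int) -> bytes:
    """Encode a delta time as a MIDI variable-length quantity."""
    out = [value & 0x7F]
    value >>= 7
    while value:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    return bytes(reversed(out))

def simple_midi(notes, ticks_per_beat=480):
    """Build a format-0 MIDI file playing one-beat notes in sequence on channel 0."""
    track = bytearray()
    for note in notes:
        track += var_len(0) + bytes([0x90, note, 64])              # note on
        track += var_len(ticks_per_beat) + bytes([0x80, note, 0])  # note off, one beat later
    track += var_len(0) + b"\xFF\x2F\x00"                          # end-of-track marker
    header = b"MThd" + struct.pack(">IHHH", 6, 0, 1, ticks_per_beat)
    return header + b"MTrk" + struct.pack(">I", len(track)) + bytes(track)

scale = [60, 62, 64, 65, 67, 69, 71, 72]  # C major scale starting at middle C
data = simple_midi(scale)
print(len(data), "bytes")  # 98 bytes
```

Writing `data` to a file with a .mid extension should give something a MIDI player can open. The eight-note scale fits in under 100 bytes, whereas the same few seconds as uncompressed audio would run to hundreds of kilobytes.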

Non-streaming Pros & Cons:

Pros:

  • Doesn’t require special server software.
  • It’s simple to create audio files in standard formats.

Cons:

  • Large file size can result in unacceptably long waits for files to download and start playing.
  • Because the audio file is copied to the hard drive, it is more difficult for artists and publishers to limit distribution and protect copyrights.

Streaming Audio:

Begins playback almost immediately and continues playing as the audio data is transferred.

Some streaming solutions, such as RealNetworks and Xing Technologies Streamworks, use server software that holds open a special connection (UDP) through which it pushes data continuously. This method prevents the source audio file from being transferred to the user’s machine. Others, such as QuickTime and Shockwave, do require that data be loaded into the user’s cache, but can allow playback after only a fraction of the total data is loaded.
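The practical difference between the two delivery strategies is how long the user waits before hearing anything. The rough comparison below uses assumed figures throughout: a 56 kbps modem link, a one-minute 22 kHz 8-bit mono clip (1,320,000 bytes), and a stream encoded at 20 kbps with a 5-second start-up buffer. It ignores protocol overhead and connection setup.

```python
def download_wait_seconds(file_bytes: int, link_bits_per_sec: float) -> float:
    """Nonstreaming: the whole file must arrive before playback starts."""
    return file_bytes * 8 / link_bits_per_sec

def streaming_wait_seconds(stream_bits_per_sec: float, buffer_seconds: float,
                           link_bits_per_sec: float) -> float:
    """Streaming: only an initial buffer must arrive before playback starts."""
    return stream_bits_per_sec * buffer_seconds / link_bits_per_sec

link = 56_000  # assumed modem speed in bits per second
print(f"Download: {download_wait_seconds(1_320_000, link):.0f} s before playback")
print(f"Stream:   {streaming_wait_seconds(20_000, 5, link):.1f} s before playback")
```

Under these assumptions the download takes roughly three minutes before the first sound, while the stream starts after about two seconds.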

Streaming Audio Components:

  • The Encoder — used to create the streaming file format.
  • The Player — needed by users to hear the stream.
  • The Server Software — used to provide optimum streaming. (Try Shoutcast)


Pros:

  • Audio begins playing soon after the stream begins.
  • Sound quality doesn’t need to be as severely sacrificed.
  • Artists and publishers can control distribution and protect copyright because the user never gets a copy of the audio file.

Cons:

  • Potentially high cost of server software.
  • Requires a dedicated or preconfigured server, which may be problematic with some hosting services.
  • Sound quality and stream may be adversely affected by low speed or inconsistent Internet connections.

Your choice depends largely on the length of sound and quality you want.

Using Video on the Web

Why use Video?

Video has the ability to show powerful and attention-grabbing moving images.

Video integrated effectively into the project adds a sense of dynamic energy.

Video that plays slowly, runs too long, or has little purpose except to function as “eye candy” can detract from the interactive experience.

  • Show Information Quickly – Show information quickly, easily, and concisely in a way that text or still images might not.
    If it’s hard to explain in words or pictures, consider using video.
  • Add Realism – Add a sense of realism to your document by, for example, depicting historical or contemporary nonfiction events.
  • Artistic – Function as an artistic statement.
  • Dangerous Concepts – Teach concepts too dangerous or expensive for users to experience personally.
  • Exotic Locations & Things – Show places, persons, animals, or events that most individuals would never have the opportunity to view.
  • On-the-Spot Interviews – Add celebrity and on-the-spot interviews, bringing real-life experiences to the document.

Filming Tips:

Video Dos:

  • Use a tripod (with a fluid, movable head). The steadier the picture, the smaller your final file size (see compression below).
  • Use tight close-ups whenever possible.
  • Use quality materials. Brand-name video tape, good cables, a decent microphone, and headphones make a difference.
  • Pay close attention to your backgrounds. Make sure they aren’t too busy or too similar to your subject.
  • Frame your shots intelligently. Always leave a bit of space around your subject. But not too much!
  • Don’t forget to leave some time at the beginning and end of each shot. Couch your footage in at least 15 seconds of black at the beginning, and 30 seconds at the end.

Video Don’ts:

  • Don’t move the camera too fast.
  • Don’t fall into the bad habit of jump cuts (storyboard: plan ahead of time).
  • Don’t forget to disable the timecode in your camera.
  • Don’t shoot exceedingly dark or light backgrounds.
  • Don’t use dense, layered files. Pick representative stills instead of panning all over the place. Choose music that isn’t too wide-ranging (to help with compression). Keep it simple!

Other Tips & Tricks:

  • Choose how you will capture your video digitally. (for Web or CD?)
  • If you go with a slow frame rate, choose 8 to 15 frames per second. Go with the lowest possible frame rate that doesn’t completely trash the illusion of motion. The quality of each frame is more important than the number of frames per second.
  • Go no bigger than 320 x 240 pixels at 10 frames per second. This is the most the Web can really handle.
  • Use a good video capture board. Again, better equipment will give you better results.
  • Experiment with your capture settings. Configuring the hue, saturation, and brightness up front will save you a lot of time in editing.
  • Bone up on codecs (see following section).
  • Capturing is not a quick stepping stone to digital heaven. Take your time and do it right.
  • Avoid tweaking your capture settings to death (Powell, 1998)

Online Integration

While video and animations integrate easily into multimedia documents, they are more difficult to add to online documents. (Graham, 1999)

Downloading movies and animation can be very time consuming. You have two options:

  • Simply link the video to your Web page for download and playback, or
  • Choose from a variety of streaming solutions. (Niederst, 1999)

For videos that are simply linked, offering low-resolution clips or thumbnails or frames lets the user choose whether to play the full resolution clip or not.

Video Formats:

AVI (short for audio/video interleaved) was introduced by Microsoft in 1992 (Niederst, 1999). It is the file type used by Video for Windows, which is the multimedia architecture developed for Windows 95. AVI is supposed to play back faster and smoother than other formats by interleaving the audio data with every video frame (while QuickTime handles audio & video interleaving in larger half-second or second blocks). Cross-platform compatible (requires special software on Macs). AVI works best on PCs in Word and PowerPoint documents.

QuickTime was introduced by Apple in 1991 as a video file format and multimedia architecture to handle time-based media. It has become the industry standard for desktop video production. Cross-platform compatible (requires special software on PCs). QuickTime is considered by many to be better suited for the Web due to its superior compression algorithms. Netscape 3.0+ and Internet Explorer 3.0+ come with QuickTime plugins. More recent versions of QuickTime support streaming capabilities. Visit for more info.

MPEG is a set of multimedia standards created by the Moving Picture Experts Group (similar to JPEG’s Joint Photographic Experts Group). MPEG supports video, audio, and streaming. MPEG was initially popular online because it was the only format that could be produced on a Unix system. MPEG achieves an extremely high compression rate with little loss in quality using a lossy compression technique that strips out data that is not discernible to the human ear or eye. Visit for more information. (Niederst, 1999)

Compression Techniques

Obviously you’ll want to keep your file size as small as possible. There are two basic ways of doing this: reducing the frame rate (number of video frames displayed per second) and compressing the finished file.

Frame Rates: Television broadcasts play about 30 frames per second (Hollywood films play 24) to achieve fluid-looking motion. (Pearce, 1998) Unfortunately, ten seconds of such uncompressed NTSC video (the standard for U.S. television) will fill around 300 MB of disk space.
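The disk-space figure can be checked with simple arithmetic: width × height × bytes per pixel × frames per second × seconds. The sketch assumes a 640 x 480 digitized frame with 24-bit color (3 bytes per pixel), which is one common way of capturing NTSC video, and ignores the audio track.

```python
def raw_video_bytes(width: int, height: int, fps: int, seconds: int,
                    bytes_per_pixel: int = 3) -> int:
    """Size of uncompressed video: every pixel of every frame is stored."""
    return width * height * bytes_per_pixel * fps * seconds

full = raw_video_bytes(640, 480, fps=30, seconds=10)
web = raw_video_bytes(320, 240, fps=10, seconds=10)
print(f"640 x 480 at 30 fps: {full / 1_000_000:.0f} MB")
print(f"320 x 240 at 10 fps: {web / 1_000_000:.0f} MB")
```

Ten seconds at full size comes to about 276 MB, in line with the figure of around 300 MB. Quartering the frame area and cutting the rate to 10 fps brings that down to about 23 MB, still before any codec is applied.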

Fortunately, there have been big advancements in the development of codecs over the past few years. A codec (short for compression/decompression) allows video to be reduced to a reasonable file size. (Pearce, 1998)

For basic editing, the recommended commercial video packages are Adobe Premiere, Strata VideoShop, and Speed Razor Mach III. Recommended shareware tools include Movie Cleaner Lite, QuickEditor (Mac), or VidEdit (Windows) (Pearce, 1998).

Video Codecs

A codec can either be a software application or a piece of hardware that processes video through complex algorithms — which compress the file and then decompress it for playback. Unlike other kinds of file compression packages (like zip) that require you to decompress a file before viewing, video codecs decompress the video on the fly — allowing the client to view the file from its compressed original.

Codec Compression Schemes:

Codecs work in two ways, using temporal and spatial compression, both of which generally rely on “lossy” compression.

Temporal Compression looks for information that is not necessary for continuity to the human eye or ear. It looks at the video information on a frame-by-frame basis for changes between frames and only keeps the information that changes — deleting the rest. If there is a scene change in your video it marks it as a key-frame (saving everything in that frame) and compares future frames with it to determine what should be deleted.
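The key-frame idea can be sketched in a few lines. This toy encoder treats each frame as a flat list of pixel values, stores a full key frame when more than half the pixels change (a stand-in for a scene change), and otherwise stores only the pixels that differ. It is a simplification under stated assumptions: it compares each frame with the previous frame rather than with the last key frame, the 0.5 threshold and function names are hypothetical, and unlike a real codec it is lossless.

```python
def encode(frames, keyframe_threshold=0.5):
    """Store full key frames at big changes, else only the changed pixels."""
    encoded, previous = [], None
    for frame in frames:
        if previous is None:
            encoded.append(("key", list(frame)))  # first frame is always a key frame
        else:
            changes = {i: p for i, (p, q) in enumerate(zip(frame, previous)) if p != q}
            if len(changes) > keyframe_threshold * len(frame):
                encoded.append(("key", list(frame)))  # treat as a scene change
            else:
                encoded.append(("delta", changes))
        previous = frame
    return encoded

def decode(encoded):
    """Rebuild every frame by applying deltas on top of the running picture."""
    frames, current = [], None
    for kind, payload in encoded:
        if kind == "key":
            current = list(payload)
        else:
            current = list(current)
            for index, pixel in payload.items():
                current[index] = pixel
        frames.append(current)
    return frames

clip = [[1, 1, 1, 1], [1, 1, 2, 1], [9, 9, 9, 9], [9, 8, 9, 9]]
packed = encode(clip)
print([kind for kind, _ in packed])  # ['key', 'delta', 'key', 'delta']
```

The steadier the picture, the smaller the delta entries become, which is why a tripod shot compresses better than hand-held footage.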

Spatial Compression, on the other hand, uses a different method of deleting information that is common to the entire file or an entire sequence within a file. While it also looks for redundant information, it works with coordinates (like vector graphics) rather than by pixels (like bitmapped graphics).

Both these compression methods reduce the overall file size, but file sizes can be reduced further by limiting colors, frame rates, and audio quality (Fowler, 1998). A desktop video can get away with half the frame rate of a television broadcast (15 fps down to 8 fps) and still look good (Pearce, 1998).

A word of caution: A good audio track can go a long way toward distracting people from a low frame rate. Jerky sound is actually much more jarring than a discontinuous picture — so don’t skimp on the audio. A mono 8-bit sound at 22.05 kHz is decent for music, and much more than adequate for voice. (Pearce, 1998)

Hardware vs. Software:

Hardware codecs are the most efficient way to compress and decompress video files. They are faster and require fewer CPU resources than their software counterparts. Hardware codecs are expensive, but deliver high-quality video footage — as long as viewers have the same decompression device. Hardware codecs are used in video conferencing — where the equipment of the audience and the broadcaster are configured the same way.

Software codecs are less expensive, and freeware versions are readily available. Unfortunately, software codecs are CPU-intensive and take a long time to analyze and compress files.

Architectures: the software package that allows information to be traded in a standard format (Fowler, 1998).

Here are some sample architectures that support streaming video:

QuickTime —

VDOLive —

RealProducer Basic is available FREE from here (if you look hard enough – check the bottom of the page carefully):

