For the longest time, I've been baffled by the concept of sound in computing. How in the world is sound stored? How is it played back? In classic Coding4Fun style, we'll learn by doing in this article: by building a wave oscillator application.

Optional Reading

I cover the basics of this article in a multi-part blog series, which you should check out if you have trouble: Part 1 - How Audio Data is Represented

What's An Oscillator?

An oscillator is a device or application that generates a waveform. In electrical engineering terms, it's a device that outputs an electrical current with varying voltage. If you plot the voltage over time, you get a regular wave in a particular form, such as a sine, square, triangle or sawtooth.

An oscillator is the most basic type of synthesizer. Analog synths use electrical circuits to output a sound wave. Digital synthesizers do the same thing, but with software. You can create a pretty neat-sounding instrument by combining the outputs of multiple oscillators. For example, if you have three oscillators oscillating at a frequency of 440Hz (concert A pitch), but each of them has a different waveform (saw, square, sine), you get a very interesting, layered sound. But before we get too deep into this subject, let's briefly explore the physics of sound.

The Physics of Sound

Sound happens when air pressure changes on your eardrum. When you clap in an empty room, pressure waves bounce all over the place and dance on your eardrum. The changes in pressure are detected continuously by your ear. Digitally, "pressure" is referred to by a scalar value called amplitude. The amplitude (loudness) of the wave is measured thousands of times per second (44,100 times per second on CDs). Every measurement of pressure (aka amplitude) is called a sample; CDs are recorded with 44,100 samples per second, each with a value between the minimum and maximum amplitude for the bit depth. Think about 44,100 samples per second. That's a lot of stuff for your ear to detect.
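To get a feel for those numbers, here's a quick back-of-the-envelope calculation of how much raw data one second of CD-quality audio holds. It's written in plain Python purely for illustration; the figures are the ones from the article.

```python
# How much raw data is one second of CD-quality audio?
sample_rate = 44_100      # samples per second, per channel
channels = 2              # stereo
bytes_per_sample = 2      # 16 bits per sample

samples_per_second = sample_rate * channels
bytes_per_second = samples_per_second * bytes_per_sample
print(samples_per_second)  # 88200 individual measurements every second
print(bytes_per_second)    # 176400 bytes per second of raw audio
```

That 176,400 bytes/sec figure will show up again later, when we configure the DirectSound format.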
That's how we're able to hear so much stuff going on in the mix of a song, especially in stereo tracks, where you have 44,100 samples per second, per ear. It turns out that there is a horribly intense mathematical theorem (the Nyquist-Shannon sampling theorem) which basically tells us that 44,100 samples per second is enough to accurately represent a pitch as high as 22kHz. The human ear can really hear only up to about 20kHz, so a 44.1kHz sampling rate is more than high enough.

This whole section is expanded in detail on my blog: Part 1 - How Audio Data is Represented

Terminology

So now you have a rather glancing overview of how sound works, and perhaps some clues as to how we should go about representing it in computers. Let's go over all this new terminology (plus some even newer terms) in delicious, bulleted format:

- Sample: A measurement of a sound wave at a very small point in time. 44,100 of these measurements in a row form one second of a single channel of CD-quality audio.
- Amplitude: The value of a sample. Max and min values are dependent upon the bit depth.
- Bit depth: The number of bits used to represent a sample: 16-bit, 32-bit, etc. Max amplitude is (2^depth) / 2 - 1.
- Sample rate (aka sampling rate; not to be confused with bit rate): The number of samples per second of audio. 44,100 is standard for CD-quality audio.

How Sound is Represented

By now, you've probably surmised that a second of audio data is somehow represented by an array of some integer data type, which has a length of 44,100. You would be correct in that assumption. However, if you want sound to play from a computer's sound card, that data has to be accompanied by a bunch of format information. WAV is probably the easiest format to deal with. See more in the following article: Part 2 - Demystifying the WAV Format

You can also see how to build out a WAV file, old school and binary style, in the third part of that series: Part 3 - Synthesizing Simple WAV Audio Using C#

However, we are taking a slightly easier route, by using DirectSound.
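A tiny sketch (Python, purely illustrative) that checks the two facts above: the max-amplitude formula for a signed sample, and the Nyquist limit of a 44.1kHz sampling rate.

```python
# Max amplitude for a signed sample: (2^depth) / 2 - 1
bit_depth = 16
max_amplitude = (2 ** bit_depth) // 2 - 1
print(max_amplitude)       # 32767 for 16-bit audio

# The sampling theorem: a rate of N samples/sec captures pitches up to N/2.
sample_rate = 44_100
nyquist_hz = sample_rate / 2
print(nyquist_hz)          # 22050.0, comfortably above the ~20kHz limit of hearing
```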
DirectSound gives us a lot of nice classes for all the format information, abstracting all that stuff away and allowing us to pump a stream of data into a DirectSound object and play it. Perfect for a synthesizer app! So, let's get started!

Building the App

I learned some Blend while working with this app, since it's built on WPF. The image buttons are just radio buttons. I had to differentiate the group number per instance of the user control at runtime (in the constructor of the Oscillator class). I'm a terrible UI designer for the most part, so this is about as sexy as I'm willing to make this application. But feel free to make it look and act better!

Designing the UI

There's a dirty little secret in this application. It says it can oscillate 3 waves, but in truth, there's a constant (set to 3) that you can modify. You could have six if you wanted. How did I accomplish this? Each synth that you see is an instance of a WPF user control called Oscillator.xaml. I have a StackPanel called Oscs in the main window. In the Window_Loaded event handler of the main window, I use this bit of code to add instances of the user control:

C#

```csharp
// Add 3 oscillators
Oscillator tmp;
for (int i = 0; i < NUM_GENERATORS; i++)
{
    tmp = new Oscillator();
    Oscs.Children.Add(tmp);
    mixer.Oscillators.Add(tmp);
}
```

VB

```vb
' Add 3 oscillators
Dim tmp As Oscillator
For i As Integer = 0 To NUM_GENERATORS - 1
    tmp = New Oscillator()
    Oscs.Children.Add(tmp)
    mixer.Oscillators.Add(tmp)
Next
```

The long rectangular canvas is used to plot the values of the generated wave, so you can visualize the wave as it's played. It is scaled along the X axis so you can see the general shape of the wave, which would be impossible without scaling at 44,100 samples per second. Earlier in the article, I noted that a sound file is basically a really, really long array of numbers: 16-bit signed integers in this app, or 32-bit floating-point values between -1 and 1 in other formats.
We use this data to plot the graph as well. More on that later. Now that we have the UI figured out (dynamic addition of oscillators), let's take a look at exactly how the sound is produced. Bzzzzt!

Making Sounds and the Mixer

One of the many cool things about DirectSound is that it basically wraps the WAV format for you. You set the buffering/format options and then shove a bunch of data into it, and it will play. Magic.

The way I've architected the solution is a little more modular. None of the oscillators has the ability to play itself; rather, each uses its UI to control some values such as frequency, amplitude and wave type. These values are tied to public properties. The Oscillator component does virtually no audio work at all. The generation of audio data is handled by the custom Mixer class, which takes a collection of Oscillators and, based on their properties, creates a composite of all the generators. This is done by averaging the samples of every oscillator and putting the results into a new array of data.

One of the workhorses of the Mixer class is the method GenerateOscillatorSampleData. It takes an Oscillator as an argument, giving access to the public properties set in the UI. From there, the algorithm generates 1 second of sample data (specified by the member bufferDurationSeconds) based on the wave type that has been selected in the UI. This is where the mathy stuff comes into play. Check out this method and the different cases in the switch statement that determine what kind of wave to create below.
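Before we look at the generation code, the averaging step described above can be sketched in a few lines. This is illustrative Python, not code from the app; the `mix` function and the toy buffers are mine.

```python
def mix(buffers):
    """Average several equal-length sample buffers into one composite buffer."""
    n = len(buffers)
    return [sum(samples) // n for samples in zip(*buffers)]

# Three toy "oscillator" buffers, shortened to 4 samples each for illustration:
saw    = [0, 100, 200, 300]
square = [500, 500, -500, -500]
sine   = [0, 353, 500, 353]
print(mix([saw, square, sine]))  # [166, 317, 66, 51]
```

Averaging (rather than summing) keeps the composite inside the same amplitude range as the individual oscillators, so it can't clip.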
C#

```csharp
public short[] GenerateOscillatorSampleData(Oscillator osc)
{
    // Creates a looping buffer based on the params given
    // Fill the buffer with whatever waveform at the specified frequency
    int numSamples = Convert.ToInt32(bufferDurationSeconds * waveFormat.SamplesPerSecond);
    short[] sampleData = new short[numSamples];

    double frequency = osc.Frequency;
    int amplitude = osc.Amplitude;
    double angle = (Math.PI * 2 * frequency) /
        (waveFormat.SamplesPerSecond * waveFormat.Channels);

    switch (osc.WaveType)
    {
        case WaveType.Sine:
            for (int i = 0; i < numSamples; i++)
                // Generate a sine wave in both channels.
                sampleData[i] = Convert.ToInt16(amplitude * Math.Sin(angle * i));
            break;

        case WaveType.Square:
            for (int i = 0; i < numSamples; i++)
            {
                // Generate a square wave in both channels.
                if (Math.Sin(angle * i) > 0)
                    sampleData[i] = Convert.ToInt16(amplitude);
                else
                    sampleData[i] = Convert.ToInt16(-amplitude);
            }
            break;

        case WaveType.Sawtooth:
        {
            int samplesPerPeriod = Convert.ToInt32(
                waveFormat.SamplesPerSecond / (frequency / waveFormat.Channels));
            short sampleStep = Convert.ToInt16((amplitude * 2) / samplesPerPeriod);
            short tempSample = 0;
            int totalSamplesWritten = 0;

            while (totalSamplesWritten < numSamples)
            {
                tempSample = (short)-amplitude;
                for (int i = 0;
                     i < samplesPerPeriod && totalSamplesWritten < numSamples;
                     i++)
                {
                    tempSample += sampleStep;
                    sampleData[totalSamplesWritten] = tempSample;
                    totalSamplesWritten++;
                }
            }
            break;
        }

        case WaveType.Noise:
        {
            Random rnd = new Random();
            for (int i = 0; i < numSamples; i++)
            {
                sampleData[i] = Convert.ToInt16(rnd.Next(-amplitude, amplitude));
            }
            break;
        }
    }

    return sampleData;
}
```

VB

```vb
Public Function GenerateOscillatorSampleData(ByVal osc As Oscillator) As Short()
    ' Creates a looping buffer based on the params given
    ' Fill the buffer with whatever waveform at the specified frequency
    Dim numSamples As Integer = Convert.ToInt32(
        bufferDurationSeconds * waveFormat.SamplesPerSecond)
    Dim sampleData As Short() = New Short(numSamples - 1) {}

    Dim frequency As Double = osc.Frequency
    Dim amplitude As Integer = osc.Amplitude
    Dim angle As Double = (Math.PI * 2 * frequency) /
        (waveFormat.SamplesPerSecond * waveFormat.Channels)

    Select Case osc.WaveType
        Case WaveType.Sine
            For i As Integer = 0 To numSamples - 1
                ' Generate a sine wave in both channels.
                sampleData(i) = Convert.ToInt16(amplitude * Math.Sin(angle * i))
            Next

        Case WaveType.Square
            For i As Integer = 0 To numSamples - 1
                ' Generate a square wave in both channels.
                If Math.Sin(angle * i) > 0 Then
                    sampleData(i) = Convert.ToInt16(amplitude)
                Else
                    sampleData(i) = Convert.ToInt16(-amplitude)
                End If
            Next

        Case WaveType.Sawtooth
            Dim samplesPerPeriod As Integer = Convert.ToInt32(
                waveFormat.SamplesPerSecond / (frequency / waveFormat.Channels))
            Dim sampleStep As Short = Convert.ToInt16((amplitude * 2) / samplesPerPeriod)
            Dim tempSample As Short = 0
            Dim totalSamplesWritten As Integer = 0

            While totalSamplesWritten < numSamples
                tempSample = CShort(-amplitude)
                Dim i As Integer = 0
                While i < samplesPerPeriod AndAlso totalSamplesWritten < numSamples
                    tempSample += sampleStep
                    sampleData(totalSamplesWritten) = tempSample
                    totalSamplesWritten += 1
                    i += 1
                End While
            End While

        Case WaveType.Noise
            Dim rnd As New Random()
            For i As Integer = 0 To numSamples - 1
                sampleData(i) = Convert.ToInt16(rnd.Next(-amplitude, amplitude))
            Next
    End Select

    Return sampleData
End Function
```

The Mixer is the heart of the app, and it's a beautiful example of object orientation and cohesion. Give it three things (oscillators) and it spits out a new thing you can use (an array of sample data). Now that we have the sample data, all we have to do is play it back using DirectSound.

Sound Playback with DirectSound

As I mentioned, DirectSound provides a wrapper over the WAV format.
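The format fields we're about to fill in are not magic numbers; they follow from the figures earlier in the article. Here's the arithmetic as an illustrative Python sketch (variable names mirror the WAV field names, not the app's code):

```python
samples_per_second = 44_100
channels = 2
bits_per_sample = 16

# BlockAlign: bytes in one sample frame (one sample for every channel).
block_align = channels * bits_per_sample // 8
# AverageBytesPerSecond: one second's worth of frames.
average_bytes_per_second = samples_per_second * block_align

print(block_align)               # 4
print(average_bytes_per_second)  # 176400
```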
You set up your buffer and format information and then feed it a bunch of data in the form of an array of shorts (arrays of trousers are known to cause errors). First, we initialize the format information and buffer in the Window_Loaded event handler of the main form. The values below are not really arbitrary; there is an explanation of them in the Optional Reading section above (see Demystifying the WAV Format). This code also contains the code to add the oscillators, as shown earlier in the article.

C#

```csharp
private void Window_Loaded(object sender, System.Windows.RoutedEventArgs e)
{
    WindowInteropHelper helper = new WindowInteropHelper(Application.Current.MainWindow);
    device.SetCooperativeLevel(helper.Handle, CooperativeLevel.Normal);

    waveFormat = new Microsoft.DirectX.DirectSound.WaveFormat();
    waveFormat.SamplesPerSecond = 44100;
    waveFormat.Channels = 2;
    waveFormat.FormatTag = WaveFormatTag.Pcm;
    waveFormat.BitsPerSample = 16;
    waveFormat.BlockAlign = 4;
    waveFormat.AverageBytesPerSecond = 176400;

    bufferDesc = new BufferDescription(waveFormat);
    bufferDesc.DeferLocation = true;
    bufferDesc.BufferBytes = Convert.ToInt32(
        bufferDurationSeconds * waveFormat.AverageBytesPerSecond / waveFormat.Channels);

    // Add 3 oscillators
    Oscillator tmp;
    for (int i = 0; i < NUM_GENERATORS; i++)
    {
        tmp = new Oscillator();
        Oscs.Children.Add(tmp);
        mixer.Oscillators.Add(tmp);
    }
}
```

VB

```vb
Private Sub Window_Loaded(ByVal sender As Object, ByVal e As System.Windows.RoutedEventArgs)
    Dim helper As New WindowInteropHelper(Application.Current.MainWindow)
    device.SetCooperativeLevel(helper.Handle, CooperativeLevel.Normal)

    waveFormat = New Microsoft.DirectX.DirectSound.WaveFormat()
    waveFormat.SamplesPerSecond = 44100
    waveFormat.Channels = 2
    waveFormat.FormatTag = WaveFormatTag.Pcm
    waveFormat.BitsPerSample = 16
    waveFormat.BlockAlign = 4
    waveFormat.AverageBytesPerSecond = 176400

    bufferDesc = New BufferDescription(waveFormat)
    bufferDesc.DeferLocation = True
    bufferDesc.BufferBytes = Convert.ToInt32(
        bufferDurationSeconds * waveFormat.AverageBytesPerSecond / waveFormat.Channels)

    ' Add 3 oscillators
    Dim tmp As Oscillator
    For i As Integer = 0 To NUM_GENERATORS - 1
        tmp = New Oscillator()
        Oscs.Children.Add(tmp)
        mixer.Oscillators.Add(tmp)
    Next
End Sub
```

When you click the Play button, the application takes its collection of oscillators and passes the values of the UI controls to the Mixer (which is initialized on each click with a reference to the main form window, so it can grab the Oscillator user controls). The mixer outputs an array of shorts, which we write to a DirectSound buffer. Here is the code for the Play button's click event handler:

C#

```csharp
private void btnPlay_Click(object sender, System.Windows.RoutedEventArgs e)
{
    mixer.Initialize(Application.Current.MainWindow);
    short[] sampleData = mixer.MixToStream();

    buffer = new SecondaryBuffer(bufferDesc, device);
    buffer.Write(0, sampleData, LockFlag.EntireBuffer);
    buffer.Play(0, BufferPlayFlags.Default);

    GraphWaveform(sampleData);
}
```

VB

```vb
Private Sub btnPlay_Click(ByVal sender As Object, ByVal e As System.Windows.RoutedEventArgs)
    mixer.Initialize(Application.Current.MainWindow)
    Dim sampleData As Short() = mixer.MixToStream()

    buffer = New SecondaryBuffer(bufferDesc, device)
    buffer.Write(0, sampleData, LockFlag.EntireBuffer)
    buffer.Play(0, BufferPlayFlags.[Default])

    GraphWaveform(sampleData)
End Sub
```

Drawing Pretty Graphs

All that's left is to draw the graph of the waveform on the canvas. Below is the GraphWaveform method. This method could graph anything it wanted to, as long as it was an array of shorts (not trousers). It's reminiscent of trying to graph things using Flash back in the day, when you had to actually figure out points and lines (most likely on paper), but WPF's Polyline object makes this rather trivial.
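The point math behind that Polyline can be sketched on its own. This is illustrative Python; the canvas dimensions and the `to_point` helper are made up for the example, not pulled from the app's XAML.

```python
canvas_width, canvas_height = 360.0, 120.0  # hypothetical canvas size
max_amplitude = 32_767   # 16-bit ceiling
amplitude = 16_000       # current oscillator amplitude
observable_points = 1_800  # how many samples get plotted

# Squeeze the plotted samples into the canvas width, and scale sample
# values so a full-amplitude wave spans the canvas height.
x_scale = canvas_width / observable_points
y_scale = (canvas_height / (amplitude * 2)) * (amplitude / max_amplitude)

def to_point(i, value):
    """Map sample index i with amplitude `value` to a canvas coordinate."""
    return (i * x_scale, canvas_height / 2 - value * y_scale)

print(to_point(0, 0))  # (0.0, 60.0): a zero sample sits on the center line
```

Subtracting from `canvas_height / 2` flips the Y axis, since canvas coordinates grow downward while amplitude grows upward.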
C#

```csharp
private void GraphWaveform(short[] data)
{
    cvDrawingArea.Children.Clear();

    double canvasHeight = cvDrawingArea.Height;
    double canvasWidth = cvDrawingArea.Width;

    int observablePoints = 1800;
    double xScale = canvasWidth / observablePoints;
    double yScale = (canvasHeight / (double)(amplitude * 2)) *
        ((double)amplitude / MAX_AMPLITUDE);

    Polyline graphLine = new Polyline();
    graphLine.Stroke = Brushes.Black;
    graphLine.StrokeThickness = 1;

    for (int i = 0; i < observablePoints; i++)
    {
        graphLine.Points.Add(new Point(i * xScale,
            (canvasHeight / 2) - (data[i] * yScale)));
    }

    cvDrawingArea.Children.Add(graphLine);
}
```

VB

```vb
Private Sub GraphWaveform(ByVal data As Short())
    cvDrawingArea.Children.Clear()

    Dim canvasHeight As Double = cvDrawingArea.Height
    Dim canvasWidth As Double = cvDrawingArea.Width

    Dim observablePoints As Integer = 1800
    Dim xScale As Double = canvasWidth / observablePoints
    Dim yScale As Double = (canvasHeight / CDbl(amplitude * 2)) *
        (CDbl(amplitude) / MAX_AMPLITUDE)

    Dim graphLine As New Polyline()
    graphLine.Stroke = Brushes.Black
    graphLine.StrokeThickness = 1

    For i As Integer = 0 To observablePoints - 1
        graphLine.Points.Add(New Point(i * xScale,
            (canvasHeight / 2) - (data(i) * yScale)))
    Next

    cvDrawingArea.Children.Add(graphLine)
End Sub
```

Conclusion

This was a really fun little project that took way less time to code than it does to explain. It's a great exercise because it requires you to think about an ancillary field of science before you can sit down and code, which is really what coding for fun is all about, anyway! If you want to try this out, the download link for the source code is at the top of the article.

About The Author

Dan Waters is an Academic Evangelist at Microsoft, covering schools in the Pacific Northwest, Alaska, and Hawaii. He is based in Bellevue, WA. Dan has way too many guitars at home and tries to entice both of his young daughters to learn how to play them.
Music, technology, and music+technology are among his favorite hobbies, along with snowboarding and trying to maintain cool dad status. You can find his blog at www. or follow him on Twitter at www.twitter.com/danwaters.