Sampling audio
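As of Android 5.0 (Lollipop), the audio resamplers are based entirely on FIR filters derived from a Kaiser windowed-sinc function. The Kaiser windowed-sinc offers the following properties:
- Its design parameters (stopband ripple, transition bandwidth, cutoff frequency, filter length) are straightforward to calculate.
- It is nearly optimal for reducing stopband energy relative to overall energy.
See P.P. Vaidyanathan, Multirate Systems and Filter Banks, p. 50, for a discussion of the Kaiser window, its optimality, and its relationship to prolate spheroidal windows.
The design parameters are computed automatically from an internal quality determination and the desired sampling ratios. The windowed-sinc filter is then generated from the design parameters. For music use, the resampler for 44.1 kHz to 48 kHz (and vice versa) is generated at a higher quality than for arbitrary frequency conversion.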
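To make the idea of a Kaiser windowed-sinc filter concrete, the following is a minimal textbook-style sketch of a low-pass design. It is not the Android resampler's implementation: the class name, method names, tap count, and attenuation value are all assumptions chosen for illustration, and the beta formula is Kaiser's well-known empirical approximation.

```java
// Illustrative sketch only; names and parameter choices are hypothetical.
final class KaiserSincSketch {

    /** Zeroth-order modified Bessel function of the first kind, via its power series. */
    static double besselI0(double x) {
        double sum = 1.0;
        double term = 1.0;
        double halfX = x / 2.0;
        for (int k = 1; k < 50; k++) {
            term *= (halfX / k) * (halfX / k);   // ((x/2)^k / k!)^2
            sum += term;
            if (term < 1e-12 * sum) {
                break;
            }
        }
        return sum;
    }

    /** Kaiser's empirical mapping from stopband attenuation (dB) to the window parameter beta. */
    static double kaiserBeta(double attenuationDb) {
        if (attenuationDb > 50.0) {
            return 0.1102 * (attenuationDb - 8.7);
        }
        if (attenuationDb >= 21.0) {
            double d = attenuationDb - 21.0;
            return 0.5842 * Math.pow(d, 0.4) + 0.07886 * d;
        }
        return 0.0;
    }

    /**
     * Returns low-pass FIR taps. The cutoff is normalized to the sampling rate
     * (0 < cutoff < 0.5). Taps are scaled so that the gain at DC is 1.
     */
    static double[] design(int numTaps, double cutoff, double attenuationDb) {
        if (numTaps < 2) {
            throw new IllegalArgumentException("numTaps must be at least 2");
        }
        double beta = kaiserBeta(attenuationDb);
        double center = (numTaps - 1) / 2.0;
        double[] taps = new double[numTaps];
        double sum = 0.0;
        for (int n = 0; n < numTaps; n++) {
            double t = n - center;
            // Ideal low-pass impulse response (sinc), with the t == 0 limit handled explicitly.
            double sinc = (t == 0.0)
                    ? 2.0 * cutoff
                    : Math.sin(2.0 * Math.PI * cutoff * t) / (Math.PI * t);
            double r = t / center;
            double window = besselI0(beta * Math.sqrt(Math.max(0.0, 1.0 - r * r)))
                    / besselI0(beta);
            taps[n] = sinc * window;
            sum += taps[n];
        }
        for (int n = 0; n < numTaps; n++) {
            taps[n] /= sum;
        }
        return taps;
    }
}
```

For example, `KaiserSincSketch.design(33, 0.25, 80.0)` would produce a 33-tap symmetric filter. A real resampler would additionally split such a prototype into polyphase branches, which this sketch does not attempt.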
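The audio resamplers provide increased quality, as well as the speed to achieve that quality. However, resamplers can introduce small amounts of passband ripple and aliasing harmonic noise, and they can cause some high-frequency loss in the transition band, so avoid using them unnecessarily.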
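Best practices for sampling and resampling
This section describes some best practices to help you avoid problems with sampling rates.
Choose the sampling rate to fit the device
In general, it is best to choose the sampling rate to fit the device, typically 44.1 kHz or 48 kHz. Using a sample rate greater than 48 kHz typically results in decreased quality because a resampler must be used to play back the file.
Use simple resampling ratios (fixed versus interpolated polyphases)
The resampler operates in one of the following modes:
- Fixed polyphase mode. The filter coefficients for each polyphase are precomputed.
- Interpolated polyphase mode. The filter coefficients for each polyphase must be interpolated from the nearest two precomputed polyphases.
The resampler is fastest in fixed polyphase mode, which applies when the ratio of input rate to output rate, L/M (after dividing out the greatest common divisor), has M less than 256. For example, for 44,100 to 48,000 conversion, L = 147 and M = 160.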
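The L/M reduction described above can be checked with a few lines of arithmetic. The sketch below is only an illustration of that calculation; the class and method names are hypothetical, and the threshold of 256 is taken from the text rather than from any platform API.

```java
// Illustrative sketch only; names are hypothetical.
final class ResampleRatio {

    /** Greatest common divisor by Euclid's algorithm. */
    static int gcd(int a, int b) {
        while (b != 0) {
            int t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    /** Returns {L, M}: input rate over output rate with the common divisor removed. */
    static int[] reduce(int inputRateHz, int outputRateHz) {
        int g = gcd(inputRateHz, outputRateHz);
        return new int[] { inputRateHz / g, outputRateHz / g };
    }

    /** True when M is below the limit of 256 quoted in the text. */
    static boolean fitsFixedPolyphase(int inputRateHz, int outputRateHz) {
        return reduce(inputRateHz, outputRateHz)[1] < 256;
    }

    public static void main(String[] args) {
        int[] inputRates = { 24000, 32000, 44100, 22050 };
        for (int rate : inputRates) {
            int[] lm = reduce(rate, 48000);
            System.out.println(rate + " -> 48000: L=" + lm[0] + ", M=" + lm[1]
                    + ", fixed polyphase: " + fitsFixedPolyphase(rate, 48000));
        }
    }
}
```

Under this rule, 44,100 to 48,000 reduces to 147/160 and qualifies, while 22,050 to 48,000 reduces to 147/320 and would fall back to interpolated polyphase mode.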
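In fixed polyphase mode, the sampling rate is locked and does not change. In interpolated polyphase mode, the sampling rate is approximate. When playing on a 48 kHz device, the sampling rate drift is generally one sample over a few hours. This is not usually a concern because the approximation error is much smaller than the frequency error contributed by internal quartz oscillators, thermal drift, or jitter (typically tens of ppm).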
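To put the drift figure in perspective, it can be converted to parts per million and compared with the tens of ppm attributed to oscillator error. The helper below is a rough sketch; its name is hypothetical, and "a few hours" is assumed to mean three hours purely for the sake of the calculation.

```java
// Illustrative sketch only; the three-hour figure is an assumption.
final class DriftEstimate {

    /** Rate error in ppm when driftSamples accumulate over the given number of hours. */
    static double driftPpm(double driftSamples, double hours, double sampleRateHz) {
        double totalSamples = sampleRateHz * hours * 3600.0;
        return driftSamples / totalSamples * 1e6;
    }

    public static void main(String[] args) {
        // One sample of drift over an assumed three hours at 48 kHz.
        System.out.println(driftPpm(1.0, 3.0, 48000.0) + " ppm");
    }
}
```

With these assumed inputs the result is on the order of 0.002 ppm, several thousand times smaller than a 10 ppm clock error.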
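Choose simple-ratio sampling rates such as 24 kHz (1:2) and 32 kHz (2:3) when playing back on a 48 kHz device, even though AudioTrack may permit other sampling rates and ratios.
Use upsampling rather than downsampling to change sample rates
Sampling rates can be changed on the fly. The granularity of such a change is based on the internal buffering (typically a few hundred samples), not on a sample-by-sample basis. This can be used for effects.
Do not dynamically change sampling rates when downsampling. When the sample rate is changed after an audio track is created, a difference of around 5 to 10 percent from the original rate may trigger a filter recomputation when downsampling (to properly suppress aliasing). This can consume computing resources and may cause an audible click if the filter is replaced in real time.
Limit downsampling to no more than 6:1
Downsampling is typically triggered by hardware device requirements. When the sample rate converter is used for downsampling, try to limit the downsampling ratio to no more than 6:1 for good aliasing suppression (for example, no greater downsampling than 48,000 to 8,000). The filter lengths adjust to match the downsampling ratio, but at higher downsampling ratios you sacrifice more transition bandwidth to avoid excessively increasing the filter length. There are no similar aliasing concerns for upsampling. Note that some parts of the audio pipeline may prevent downsampling greater than 2:1.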
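An application that picks its own source rates could check a candidate against these limits before creating a track. The following sketch assumes hypothetical names and simply encodes the 6:1 guideline (or a stricter 2:1 limit) as an integer comparison; it does not query the platform.

```java
// Illustrative sketch only; names are hypothetical.
final class DownsampleCheck {

    /** True when inputRateHz : outputRateHz does not exceed maxRatio : 1. */
    static boolean withinLimit(int inputRateHz, int outputRateHz, int maxRatio) {
        // Upsampling (input below output) always passes, matching the text's
        // note that upsampling has no similar aliasing concern.
        return inputRateHz <= (long) maxRatio * outputRateHz;
    }

    public static void main(String[] args) {
        System.out.println(withinLimit(48000, 8000, 6));   // exactly 6:1
        System.out.println(withinLimit(96000, 8000, 6));   // 12:1
        System.out.println(withinLimit(48000, 16000, 2));  // 3:1 against a 2:1 limit
    }
}
```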
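If you're concerned about latency, don't resample
Resampling prevents the track from being placed in the FastMixer path, which means significantly higher latency because of the additional, larger buffer in the ordinary Mixer path. Furthermore, the filter length of the resampler adds an implicit delay, though this is typically on the order of one millisecond or less, which is small compared with the additional buffering for the ordinary Mixer path (typically 20 milliseconds).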
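The scale of the filter delay can be estimated with the standard result that a symmetric (linear-phase) FIR filter with N taps delays the signal by (N - 1) / 2 samples. The sketch below applies that formula; the class name and the 65-tap length are hypothetical, since the text does not state the resampler's actual filter lengths, and a polyphase resampler's effective delay depends on details this simplification ignores.

```java
// Illustrative sketch only; the tap count is a made-up example value.
final class FilterDelay {

    /** Group delay in milliseconds of a symmetric FIR filter running at sampleRateHz. */
    static double delayMs(int numTaps, double sampleRateHz) {
        double delaySamples = (numTaps - 1) / 2.0;
        return delaySamples / sampleRateHz * 1000.0;
    }

    public static void main(String[] args) {
        double filterMs = delayMs(65, 48000.0);
        double mixerBufferMs = 20.0;  // typical figure quoted in the text
        System.out.println("filter delay: " + filterMs + " ms, mixer buffering: "
                + mixerBufferMs + " ms");
    }
}
```

Under these assumptions the filter contributes well under a millisecond, so the buffering in the ordinary Mixer path dominates the added latency.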
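Use of floating-point audio
Using floating-point numbers to represent audio data can significantly enhance audio quality in high-performance audio applications. Floating point offers the following advantages:
- Wider dynamic range.
- Consistent accuracy across the dynamic range.
- More headroom to avoid clipping during intermediate calculations and transients.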
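The headroom advantage can be seen by mixing two loud samples and then attenuating the result. The sketch below is an assumed scenario with hypothetical names: the 16-bit path saturates its intermediate sum to the 16-bit range, as a pipeline that stores intermediates in 16 bits would, while the float path lets the intermediate exceed 1.0 before scaling.

```java
// Illustrative sketch only; names and the mixing scenario are hypothetical.
final class HeadroomDemo {

    /** Sums two 16-bit samples, saturates to 16 bits, then halves the result. */
    static short mixInt16(short a, short b) {
        int sum = a + b;
        int clipped = Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, sum));
        return (short) (clipped / 2);
    }

    /** Sums two float samples and halves the result; the intermediate may exceed 1.0. */
    static float mixFloat(float a, float b) {
        return (a + b) * 0.5f;
    }

    public static void main(String[] args) {
        short loud = 30000;
        float loudFloat = loud / 32768.0f;
        // The 16-bit path loses level because the intermediate sum was clipped.
        System.out.println("int16 mix: " + mixInt16(loud, loud));
        // The float path recovers the original level after scaling.
        System.out.println("float mix: " + mixFloat(loudFloat, loudFloat) * 32768.0f);
    }
}
```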
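While floating point can enhance audio quality, it does have certain disadvantages:
- Floating-point numbers use more memory.
- Floating-point operations have unexpected properties; for example, addition is not associative.
- Floating-point calculations can sometimes lose arithmetic precision because of rounding or numerically unstable algorithms.
- Using floating point effectively requires greater understanding to achieve accurate and reproducible results.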
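The non-associativity of addition mentioned in the list above is easy to reproduce. In this small sketch (names are hypothetical), the same three single-precision values are added in two different groupings; near 1e8 the spacing between adjacent float values is 8, so adding 1 to -1e8 is lost to rounding.

```java
// Illustrative sketch only; names are hypothetical.
final class AssociativityDemo {

    static float leftFirst(float a, float b, float c) {
        return (a + b) + c;
    }

    static float rightFirst(float a, float b, float c) {
        return a + (b + c);
    }

    public static void main(String[] args) {
        float big = 1e8f;
        System.out.println(leftFirst(big, -big, 1f));   // the large values cancel first
        System.out.println(rightFirst(big, -big, 1f));  // the 1 is absorbed before cancelling
    }
}
```

The first grouping yields 1.0 and the second yields 0.0, which is why the order of operations and explicit parentheses matter in floating-point code.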
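Formerly, floating point was notorious for being unavailable or slow. This is still true for low-end and embedded processors. However, processors on modern mobile devices now have hardware floating point with performance that is similar to (or in some cases even faster than) integer arithmetic. Modern CPUs also support SIMD (single instruction, multiple data), which can improve performance further.
Best practices for floating-point audio
The following best practices help you avoid problems with floating-point calculations:
- Use double-precision floating point for infrequent calculations, such as computing filter coefficients.
- Pay attention to the order of operations.
- Declare explicit variables for intermediate values.
- Use parentheses liberally.
- If you get a NaN or infinity result, use binary search to discover where it was introduced.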
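As an illustration of the first practice above, the sketch below accumulates the same single-precision value many times, once with a float accumulator and once with a double accumulator declared as an explicit intermediate variable. The class name and the choice of ten million additions are assumptions for demonstration.

```java
// Illustrative sketch only; names and the iteration count are hypothetical.
final class PrecisionDemo {

    /** Accumulates in single precision; rounding error grows as the sum gets large. */
    static float sumFloat(float value, int count) {
        float acc = 0f;
        for (int i = 0; i < count; i++) {
            acc += value;
        }
        return acc;
    }

    /** Accumulates the same inputs in an explicit double-precision intermediate. */
    static double sumDouble(float value, int count) {
        double acc = 0.0;
        for (int i = 0; i < count; i++) {
            acc += value;
        }
        return acc;
    }

    public static void main(String[] args) {
        int count = 10_000_000;
        System.out.println("float accumulator:  " + sumFloat(0.1f, count));
        System.out.println("double accumulator: " + sumDouble(0.1f, count));
    }
}
```

The double accumulator stays close to the ideal value of one million, while the float accumulator drifts far from it. For calculations that run rarely, such as coefficient setup, the extra cost of double precision is usually negligible.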
For floating-point audio, the audio format encoding AudioFormat.ENCODING_PCM_FLOAT is used in the same way as ENCODING_PCM_16_BIT or ENCODING_PCM_8_BIT to specify AudioTrack data formats. The corresponding overloaded method AudioTrack.write() takes a float array to deliver data.
Kotlin
fun write(
    audioData: FloatArray,
    offsetInFloats: Int,
    sizeInFloats: Int,
    writeMode: Int
): Int
Java
public int write(float[] audioData,
                 int offsetInFloats,
                 int sizeInFloats,
                 int writeMode)
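Existing 16-bit PCM data has to be converted before it can be passed to the float overload. The following is a minimal sketch of one common conversion, dividing by 32768 so that samples land in the range [-1.0, 1.0); the class and method names are hypothetical, and the Android call itself appears only in a comment because it needs a device and an AudioTrack configured with ENCODING_PCM_FLOAT.

```java
// Illustrative sketch only; names are hypothetical and no Android classes are used.
final class PcmFloatConversion {

    /** Converts signed 16-bit PCM samples to floats in the range [-1.0, 1.0). */
    static float[] toFloat(short[] pcm16) {
        float[] out = new float[pcm16.length];
        for (int i = 0; i < pcm16.length; i++) {
            out[i] = pcm16[i] / 32768.0f;
        }
        return out;
    }

    public static void main(String[] args) {
        short[] pcm16 = { 0, 16384, -32768, 32767 };
        float[] samples = toFloat(pcm16);
        for (float s : samples) {
            System.out.println(s);
        }
        // On Android, the converted buffer could then be delivered with something like:
        //   track.write(samples, 0, samples.length, AudioTrack.WRITE_BLOCKING);
        // where `track` is an AudioTrack created with AudioFormat.ENCODING_PCM_FLOAT.
    }
}
```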
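For more information
This section lists some additional resources about sampling and floating point.
Sampling
Sample rates
- Sampling (signal processing) at Wikipedia: https://en.wikipedia.org/wiki/Sampling_%28signal_processing%29
Resampling
- Sample-rate conversion at Wikipedia: https://en.wikipedia.org/wiki/Sample_rate_conversion
- Sample Rate Conversion at source.android.com: https://source.android.com/devices/audio/src.html
The high bit-depth and high kHz controversy
- D/A and A/D | Digital Show and Tell, video by Christopher "Monty" Montgomery of Xiph.Org: https://www.youtube.com/watch?v=cIQ9IXSUzuM
- The Science of Sample Rates (When Higher Is Better - And When It Isn't): http://www.trustmeimascientist.com/2013/02/04/the-science-of-sample-rates-when-higher-is-better-and-when-it-isnt/
- Audio Myths & DAW Wars: http://www.image-line.com/support/FLHelp/html/app_audio.htm
- 192kHz/24bit vs. 96kHz/24bit "debate"- Interesting revelation: http://forums.stevehoffman.tv/threads/192khz-24bit-vs-96khz-24bit-debate-interesting-revelation.317660/
Floating point
The following Wikipedia pages are helpful in understanding floating-point audio:
- Audio bit depth: https://en.wikipedia.org/wiki/Audio_bit_depth
- Floating-point arithmetic: https://en.wikipedia.org/wiki/Floating_point
- IEEE 754 floating-point: https://en.wikipedia.org/wiki/IEEE_floating_point
- Loss of significance (catastrophic cancellation): https://en.wikipedia.org/wiki/Loss_of_significance
- Numerical stability: https://en.wikipedia.org/wiki/Numerical_stability
The following article provides information on those aspects of floating point that have a direct impact on designers of computer systems:
- What every computer scientist should know about floating-point arithmetic, by David Goldberg, Xerox PARC (edited reprint): http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html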
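Content and code samples on this page are subject to the licenses described in the Content License. Java and OpenJDK are trademarks or registered trademarks of Oracle and/or its affiliates.
Last updated 2025-07-26 UTC.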