Using the Microphone in Mobile Apps
Voice recording, real-time audio analysis, speech recognition and call-quality processing — all powered by the device microphone array.
The microphone is one of the most expressive sensors on a phone. Modern devices ship two to four microphones used for beamforming, noise suppression and voice isolation. Once you have permission, you can record voice memos, transcribe speech, build instrument tuners or run real-time audio analysis — all with battery-friendly APIs that have matured significantly across iOS and Android.
Key Takeaways
- iOS uses AVAudioRecorder / AVAudioEngine; Android uses MediaRecorder / AudioRecord; Expo wraps both in expo-audio.
- Always request runtime permission; iOS needs a usage-description string and Android a manifest declaration plus a runtime request.
- Use 16 kHz mono for speech, 44.1–48 kHz stereo AAC for music and richer audio.
- Speech-to-text is built in on both platforms — no need to ship your own model for most apps.
Microphones at a Glance
What It Is & How It Works
What it is. A small electret/MEMS sensor (often several) that converts sound pressure into a digital signal. Modern phones ship signal-processed audio paths optimised for voice, with raw "unprocessed" mode available on most devices.
How it works. You configure an audio session/recorder, request permission, then either save audio to a file (MediaRecorder/AVAudioRecorder) or stream PCM frames in real time (AudioRecord/AVAudioEngine).
Units & signal. Sample rate (Hz), bit depth (16 / 24 bit), number of channels, and decibel level (dBFS) for monitoring.
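dBFS is a log scale relative to the loudest sample the bit depth can represent, so 0 dBFS is full scale and everything quieter is negative. A minimal sketch of a peak meter over 16-bit PCM samples (the `peakDbfs` helper name is ours, not an API):

```typescript
// Peak level in dBFS (decibels relative to full scale) for a frame of
// 16-bit PCM samples. 0 dBFS is the loudest representable sample;
// quieter frames give increasingly negative values.
function peakDbfs(samples: Int16Array): number {
  let peak = 0;
  for (const s of samples) {
    const abs = Math.abs(s);
    if (abs > peak) peak = abs;
  }
  if (peak === 0) return -Infinity; // digital silence
  return 20 * Math.log10(peak / 32768); // 32768 = full scale at 16 bit
}

// Half scale sits at about -6 dBFS.
console.log(peakDbfs(Int16Array.from([16384, -16384])).toFixed(1)); // -6.0
```

The same formula drives the level meters you see in recording UIs; swap peak for RMS if you want perceived loudness rather than clipping headroom.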
What You Can Build With It
Voice memos
Record and play back short audio notes with auto-trim.
Example: A note-taking app with attached voice clips.
Speech-to-text
Transcribe spoken input for search, dictation and accessibility.
Example: A messaging app with a "dictate" button.
Audio analysis
FFT-based pitch detection, decibel meters, instrument tuners.
Example: A guitar tuner that listens for the closest note.
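Once an FFT or autocorrelation step has produced a frequency estimate, the "closest note" lookup is plain equal-temperament math. A sketch in TypeScript (`nearestNote` is an illustrative helper, not a library call):

```typescript
const NOTE_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B'];

// Map a detected frequency (Hz) to the nearest equal-temperament note
// and the tuning error in cents (100 cents = one semitone).
function nearestNote(freqHz: number): { name: string; cents: number } {
  const midi = 69 + 12 * Math.log2(freqHz / 440); // 69 = MIDI number of A4
  const rounded = Math.round(midi);
  const octave = Math.floor(rounded / 12) - 1;
  const name = NOTE_NAMES[((rounded % 12) + 12) % 12] + octave;
  return { name, cents: Math.round((midi - rounded) * 100) };
}

console.log(nearestNote(110)); // A2, 0 cents off — the open A string
console.log(nearestNote(83));  // nearest note is E2 (82.41 Hz), slightly sharp
```

A tuner UI then just renders the note name and a needle driven by the cents value.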
Voice activity / loud event detection
Trigger actions when speech is detected or a loud noise occurs.
Example: A baby monitor app that pings the parent's phone.
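The simplest loud-event detector is an RMS level check per frame. Production voice-activity detectors add noise-floor tracking, smoothing and hangover time, but the core idea fits in a few lines (helper names here are illustrative):

```typescript
// RMS level of a 16-bit PCM frame in dBFS.
function rmsDbfs(frame: Int16Array): number {
  let sumSquares = 0;
  for (const s of frame) sumSquares += s * s;
  const rms = Math.sqrt(sumSquares / frame.length);
  return rms === 0 ? -Infinity : 20 * Math.log10(rms / 32768);
}

// Flag any frame louder than a fixed threshold.
function isLoud(frame: Int16Array, thresholdDbfs = -30): boolean {
  return rmsDbfs(frame) > thresholdDbfs;
}

console.log(isLoud(Int16Array.from([50, -40, 60, -30])));             // false — near silence
console.log(isLoud(Int16Array.from([20000, -18000, 21000, -19000]))); // true — loud burst
```

Feed it the PCM frames you get from AudioRecord / AVAudioEngine and debounce the trigger so a single transient doesn't fire repeated alerts.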
Permissions & Setup
Both platforms require an explicit user prompt the first time you start recording.
iOS · Info.plist
NSMicrophoneUsageDescription
Android · AndroidManifest.xml
android.permission.RECORD_AUDIO
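In practice the two declarations look like this. On iOS the string is shown verbatim in the permission prompt, so write it for the user (the wording below is an example, not a required value):

```xml
<!-- iOS: Info.plist — shown to the user in the permission dialog -->
<key>NSMicrophoneUsageDescription</key>
<string>We use the microphone to record your voice memos.</string>
```

```xml
<!-- Android: AndroidManifest.xml — declares the permission; on Android 6.0+
     you must still request it at runtime before recording -->
<uses-permission android:name="android.permission.RECORD_AUDIO" />
```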
Code Examples
Setup
- Expo: `npx expo install expo-audio`
- iOS: add NSMicrophoneUsageDescription to Info.plist
- Android: add RECORD_AUDIO to AndroidManifest.xml and request at runtime
```tsx
import { useAudioRecorder, RecordingPresets } from 'expo-audio';
import { useEffect } from 'react';
import { Button } from 'react-native';

export function VoiceMemo() {
  const recorder = useAudioRecorder(RecordingPresets.HIGH_QUALITY);

  // Stop any in-flight recording when the component unmounts.
  useEffect(() => {
    return () => {
      if (recorder.isRecording) recorder.stop();
    };
  }, [recorder]);

  return (
    <Button
      title={recorder.isRecording ? 'Stop' : 'Record'}
      onPress={async () => {
        if (recorder.isRecording) {
          await recorder.stop();
          console.log('Saved to', recorder.uri);
        } else {
          await recorder.prepareToRecordAsync();
          recorder.record();
        }
      }}
    />
  );
}
```

Tip: With Newly, you describe the feature you want and the AI agent wires up the sensor, permissions, and UI for you. Try it free.
Best Practices
Stop recording when the app backgrounds
Battery and privacy concerns make this essential; without a background-audio entitlement, iOS suspends your audio session anyway.
Pick the right preset
Use 16 kHz mono for speech-to-text; 44.1 kHz stereo AAC for music. Smaller files mean faster uploads.
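The size difference is just arithmetic: uncompressed PCM costs sample rate × bytes per sample × channels per second, and AAC then compresses well below these raw figures. A quick sketch of the raw numbers (the helper name is ours):

```typescript
// Rough storage cost of uncompressed PCM:
// bytes = sample rate × (bit depth / 8) × channels × seconds.
function pcmBytesPerMinute(sampleRateHz: number, bitDepth: number, channels: number): number {
  return sampleRateHz * (bitDepth / 8) * channels * 60;
}

// 16 kHz / 16-bit / mono (speech) vs 44.1 kHz / 16-bit / stereo (music):
console.log(pcmBytesPerMinute(16000, 16, 1));  // 1920000  — ~1.9 MB per minute
console.log(pcmBytesPerMinute(44100, 16, 2));  // 10584000 — ~10.6 MB per minute
```

Roughly a 5.5× difference before compression, which is why matching the preset to the use case matters for upload speed.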
Always show a recording indicator
Both iOS (orange dot) and Android (status bar icon) show one anyway, but a clear in-app indicator builds trust.
Handle interruptions
Phone calls, alarms and Bluetooth swaps interrupt recording; subscribe to the audio-session interruption notifications.
Common Pitfalls
Forgetting permission strings
Apps without `NSMicrophoneUsageDescription` are rejected at submission and crash on first use.
Mitigation: Add the string with a clear, plain-English explanation.
Recording while the screen is locked
iOS suspends the audio session; Android may kill background services.
Mitigation: Use background audio entitlements or foreground services for true background capture.
Choosing the wrong source
Android `VOICE_RECOGNITION` and `MIC` apply different DSP; the wrong one ruins music recordings.
Mitigation: Use `MIC` for general audio, `VOICE_RECOGNITION` for speech-to-text, `UNPROCESSED` for raw analysis.
When To Use It (And When Not To)
Good fit
- Voice memos, voice notes, podcast recording
- Speech-to-text and voice commands
- Real-time audio analysis (tuners, meters)
- Voice / video calling apps
Look elsewhere if…
- Always-on background listening (battery + privacy)
- Recording without explicit user trigger
- Tasks better served by speech APIs (just use them)
- Replacing professional studio capture
Frequently Asked Questions
What's the best library for cross-platform recording?
expo-audio is the current official recommendation; it replaces the older expo-av and supports both recording and playback.
How do I do speech-to-text?
On iOS use the Speech framework (SFSpeechRecognizer); on Android use SpeechRecognizer or Google ML Kit. Both are free and on-device for short utterances.
Can I record while the screen is off?
Yes, with background audio entitlements on iOS and a foreground service on Android. Plan for OS-level kill behaviour anyway.
What sample rate should I use?
16 kHz mono for speech recognition, 44.1 kHz mono for voice memos, 48 kHz stereo for music or content creation.
Build with the Microphone on Newly
Ship a microphone-powered feature this week
Newly turns a description like “use the microphone to record voice memos” into a real React Native app — permissions, native modules and UI included. Full source code is yours, and you can publish to the App Store and Google Play directly from the dashboard.
Want a deeper dive on the underlying APIs? See expo-audio, Apple's AVFoundation (AVAudioRecorder, AVAudioEngine) and the Android media APIs (MediaRecorder, AudioRecord).
