Using the Microphone in Mobile Apps

Voice recording, real-time audio analysis, speech recognition and call-quality processing — all powered by the device microphone array.

Timothy Lindblom

Founder, Newly

The microphone is one of the most expressive sensors on a phone. Modern devices ship two to four microphones used for beamforming, noise suppression and voice isolation. Once you have permission, you can record voice memos, transcribe speech, build instrument tuners or run real-time audio analysis — all with battery-friendly APIs that have matured significantly across iOS and Android.

Key Takeaways

  • iOS uses AVAudioRecorder / AVAudioEngine; Android uses MediaRecorder / AudioRecord; Expo wraps both in expo-audio.
  • Always request permission at runtime; iOS also requires a usage description string in Info.plist, and Android requires the permission to be declared in the manifest.
  • Use 16 kHz mono for speech recognition, 44.1 kHz mono for voice memos, and 48 kHz stereo AAC for music and richer audio.
  • Speech-to-text is built in on both platforms — no need to ship your own model for most apps.

Microphones at a Glance

  • 2-4: mics on a modern phone
  • 48 kHz: typical sample rate
  • < 50 ms: round-trip latency (low-latency mode)
  • 99%+: devices supported

What It Is & How It Works

What it is. A small electret/MEMS sensor (often several) that converts sound pressure into a digital signal. Modern phones ship signal-processed audio paths optimised for voice, with raw "unprocessed" mode available on most devices.

How it works. You configure an audio session/recorder, request permission, then either save audio to a file (MediaRecorder/AVAudioRecorder) or stream PCM frames in real time (AudioRecord/AVAudioEngine).
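
To make the streaming path more concrete, here is a minimal TypeScript sketch of what an app typically does with each PCM buffer it receives: convert 16-bit samples to floats and walk them in fixed-size frames. The helper names `int16ToFloat32` and `forEachFrame` are hypothetical and not part of any platform API, and how the buffer reaches JavaScript depends on the native module you use.

```typescript
// Hypothetical helpers for illustration; not part of expo-audio or any platform SDK.

/** Convert signed 16-bit PCM samples to floats in the range [-1, 1). */
function int16ToFloat32(pcm: Int16Array): Float32Array {
  const out = new Float32Array(pcm.length);
  for (let i = 0; i < pcm.length; i++) {
    out[i] = pcm[i] / 32768; // 32768 = 2^15, the magnitude of the most negative int16
  }
  return out;
}

/** Call `onFrame` for each complete frame; a trailing partial frame is skipped. */
function forEachFrame(
  samples: Float32Array,
  frameSize: number,
  onFrame: (frame: Float32Array, index: number) => void,
): void {
  if (!Number.isInteger(frameSize) || frameSize <= 0) {
    throw new RangeError('frameSize must be a positive integer');
  }
  const frameCount = Math.floor(samples.length / frameSize);
  for (let f = 0; f < frameCount; f++) {
    // subarray returns a view onto the same memory, so no samples are copied.
    onFrame(samples.subarray(f * frameSize, (f + 1) * frameSize), f);
  }
}
```

Working on fixed-size frames keeps per-callback work predictable, which matters when the analysis has to keep up with the incoming audio.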

Units & signal. Sample rate (Hz), bit depth (16 / 24 bit), number of channels, and decibel level (dBFS) for monitoring.
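
As an illustration of the dBFS unit, the sketch below computes RMS and peak levels for one block of float samples where full scale is 1.0. The function names are hypothetical. It uses the plain RMS convention, under which a full-scale sine reads about -3 dBFS.

```typescript
/** RMS level of a block of float samples (full scale = 1.0), in dBFS. */
function rmsDbfs(samples: Float32Array): number {
  if (samples.length === 0) return -Infinity;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  const rms = Math.sqrt(sumSquares / samples.length);
  // log10(0) is -Infinity, which is how digital silence is reported here.
  return 20 * Math.log10(rms);
}

/** Peak level in dBFS; values at or near 0 suggest the input may be clipping. */
function peakDbfs(samples: Float32Array): number {
  let peak = 0;
  for (let i = 0; i < samples.length; i++) {
    peak = Math.max(peak, Math.abs(samples[i]));
  }
  return 20 * Math.log10(peak);
}
```

A level meter would typically call `rmsDbfs` once per frame and smooth the result before drawing it.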

What You Can Build With It

Voice memos

Record and play back short audio notes with auto-trim.

Example: A note-taking app with attached voice clips.

Speech-to-text

Transcribe spoken input for search, dictation and accessibility.

Example: A messaging app with a "dictate" button.

Audio analysis

FFT-based pitch detection, decibel meters, instrument tuners.

Example: A guitar tuner that listens for the closest note.
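
The "closest note" step of such a tuner can be sketched as follows. This assumes a pitch detector has already produced a frequency in Hz, and that the tuner uses equal temperament with A4 at 440 Hz. The `nearestNote` function and `NoteEstimate` type are hypothetical names for illustration.

```typescript
const NOTE_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B'] as const;

interface NoteEstimate {
  name: string;      // note name with octave, e.g. "A4"
  frequency: number; // exact frequency of that note in Hz
  cents: number;     // offset of the input from the note; negative means flat
}

function nearestNote(frequencyHz: number, a4Hz: number = 440): NoteEstimate {
  if (!(frequencyHz > 0)) throw new RangeError('frequencyHz must be positive');
  // MIDI note numbers place A4 at 69, with 12 semitones per octave.
  const midiExact = 69 + 12 * Math.log2(frequencyHz / a4Hz);
  const midi = Math.round(midiExact);
  const pitchClass = ((midi % 12) + 12) % 12; // keep the index non-negative
  const octave = Math.floor(midi / 12) - 1;   // MIDI 60 is C4
  return {
    name: `${NOTE_NAMES[pitchClass]}${octave}`,
    frequency: a4Hz * Math.pow(2, (midi - 69) / 12),
    cents: (midiExact - midi) * 100, // one semitone is 100 cents
  };
}

const reading = nearestNote(440);
console.log(reading.name, reading.cents); // A4 0
```

The `cents` value is what drives the needle: the string is in tune when it is close to zero.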

Voice activity / loud event detection

Trigger actions when speech is detected or a loud noise occurs.

Example: A baby monitor app that pings the parent's phone.
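
One way to sketch the loud-event trigger is a small detector fed with one level reading per audio frame. It fires once when the level stays above a threshold for several consecutive frames, then waits for the signal to drop below a lower threshold before re-arming. All names and threshold values here are hypothetical and would need tuning on real devices.

```typescript
interface LoudEventOptions {
  triggerDbfs: number; // level that must be exceeded to count a frame as loud
  releaseDbfs: number; // level the signal must fall below before re-arming
  minFrames: number;   // consecutive loud frames required, to ignore single clicks
}

/** Returns a function that takes one level per frame and reports when an event fires. */
function createLoudEventDetector(opts: LoudEventOptions): (levelDbfs: number) => boolean {
  if (opts.releaseDbfs > opts.triggerDbfs) {
    throw new RangeError('releaseDbfs must not be above triggerDbfs');
  }
  let loudFrames = 0;
  let armed = true;
  return (levelDbfs: number): boolean => {
    if (!armed) {
      // Already fired: stay quiet until the noise has clearly ended.
      if (levelDbfs < opts.releaseDbfs) {
        armed = true;
        loudFrames = 0;
      }
      return false;
    }
    loudFrames = levelDbfs > opts.triggerDbfs ? loudFrames + 1 : 0;
    if (loudFrames >= opts.minFrames) {
      armed = false;
      return true;
    }
    return false;
  };
}
```

The gap between the two thresholds is a simple form of hysteresis, so a sound hovering around the trigger level does not send a burst of notifications.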

Permissions & Setup

Both platforms always require an explicit user prompt the first time you start recording.

iOS · Info.plist

  • NSMicrophoneUsageDescription

Android · AndroidManifest.xml

  • android.permission.RECORD_AUDIO
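
In an Expo project these two entries usually live in the app config rather than in hand-edited native files. The following is a minimal sketch, assuming a project that uses `app.config.ts`; the app name, slug and the wording of the usage string are placeholders.

```typescript
// app.config.ts (sketch): name, slug and the usage text are placeholders.
const config = {
  name: 'voice-memo-demo',
  slug: 'voice-memo-demo',
  ios: {
    infoPlist: {
      // Shown in the iOS permission dialog; say plainly why the app needs the mic.
      NSMicrophoneUsageDescription: 'Record voice memos that you attach to your notes.',
    },
  },
  android: {
    // Declares the permission in AndroidManifest.xml.
    // The runtime prompt still has to be requested from code.
    permissions: ['android.permission.RECORD_AUDIO'],
  },
};

export default config;
```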

Code Examples

Setup

  • Expo: `npx expo install expo-audio`
  • iOS: add NSMicrophoneUsageDescription to Info.plist
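  • Android: add RECORD_AUDIO to AndroidManifest.xml and request at runtime

import { useEffect } from 'react';
import { Alert, Button } from 'react-native';
import {
  AudioModule,
  RecordingPresets,
  setAudioModeAsync,
  useAudioRecorder,
  useAudioRecorderState,
} from 'expo-audio';

export function VoiceMemo() {
  // The hook owns the recorder and releases it when the component unmounts.
  const recorder = useAudioRecorder(RecordingPresets.HIGH_QUALITY);
  // Polls the recorder so the button label re-renders when recording starts or stops.
  const state = useAudioRecorderState(recorder);

  useEffect(() => {
    (async () => {
      // Ask for microphone access before the first recording.
      const { granted } = await AudioModule.requestRecordingPermissionsAsync();
      if (!granted) {
        Alert.alert('Microphone permission was denied');
        return;
      }
      // Put the audio session into a mode that allows recording.
      await setAudioModeAsync({ allowsRecording: true, playsInSilentMode: true });
    })();
  }, []);

  const toggle = async () => {
    if (state.isRecording) {
      await recorder.stop();
      console.log('Saved to', recorder.uri);
    } else {
      await recorder.prepareToRecordAsync();
      recorder.record();
    }
  };

  return <Button title={state.isRecording ? 'Stop' : 'Record'} onPress={toggle} />;
}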

Tip: With Newly, you describe the feature you want and the AI agent wires up the sensor, permissions, and UI for you. Try it free.

Best Practices

  • Stop recording when the app backgrounds

    Battery and privacy concerns make this essential; iOS suspends apps that lack the audio background mode anyway.

  • Pick the right preset

    Use 16 kHz mono for speech-to-text and 48 kHz stereo AAC for music. Smaller files mean faster uploads.

  • Always show a recording indicator

    Both iOS (orange dot) and Android (status bar icon) show one anyway, but a clear in-app indicator builds trust.

  • Handle interruptions

    Phone calls, alarms and Bluetooth swaps interrupt recording; subscribe to the audio-session interruption notifications.
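
To put rough numbers on how much the format choice affects size, the sketch below computes the uncompressed PCM data rate for a given format. The type and function names are hypothetical. Compressed formats such as AAC are smaller and depend on the encoder bitrate, which this does not model.

```typescript
interface PcmFormat {
  sampleRateHz: number;
  channels: number;
  bitDepth: 16 | 24 | 32;
}

/** Bytes of uncompressed PCM produced per second of audio. */
function pcmBytesPerSecond(format: PcmFormat): number {
  return format.sampleRateHz * format.channels * (format.bitDepth / 8);
}

/** Total uncompressed size in bytes for a recording of the given length. */
function pcmBytes(format: PcmFormat, seconds: number): number {
  return pcmBytesPerSecond(format) * seconds;
}

const speech: PcmFormat = { sampleRateHz: 16000, channels: 1, bitDepth: 16 };
const music: PcmFormat = { sampleRateHz: 48000, channels: 2, bitDepth: 16 };

console.log(pcmBytes(speech, 60)); // 1920000 bytes, about 1.9 MB per minute
console.log(pcmBytes(music, 60));  // 11520000 bytes, about 11.5 MB per minute
```

Before compression, the speech format is one sixth the size of the music format, which is why a speech-oriented preset pays off for uploads.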

Common Pitfalls

Forgetting permission strings

An iOS app that lacks `NSMicrophoneUsageDescription` crashes the first time it accesses the microphone and is rejected at App Store submission.

Mitigation: Add the string with a clear, plain-English explanation.

Recording while the screen is locked

iOS suspends the audio session; Android may kill background services.

Mitigation: Enable the audio background mode on iOS and run a foreground service on Android for true background capture.

Choosing the wrong source

Android `VOICE_RECOGNITION` and `MIC` apply different DSP; the wrong one ruins music recordings.

Mitigation: Use `MIC` for general audio, `VOICE_RECOGNITION` for speech-to-text, `UNPROCESSED` for raw analysis.
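
The mitigation can be written down as a small lookup that a shared TypeScript layer passes to the native recorder. The string values mirror the Android `MediaRecorder.AudioSource` constant names; the types, the function and the `unprocessedSupported` flag are hypothetical. The fallback to `VOICE_RECOGNITION` on devices without unprocessed capture is an assumption, not something the platform does for you.

```typescript
type CaptureUseCase = 'general' | 'speech-to-text' | 'raw-analysis';
type AndroidAudioSource = 'MIC' | 'VOICE_RECOGNITION' | 'UNPROCESSED';

const SOURCE_FOR_USE_CASE: Record<CaptureUseCase, AndroidAudioSource> = {
  'general': 'MIC',
  'speech-to-text': 'VOICE_RECOGNITION',
  'raw-analysis': 'UNPROCESSED',
};

/**
 * Pick an audio source name for a use case.
 * `unprocessedSupported` is assumed to come from a native capability check.
 */
function androidAudioSourceFor(
  useCase: CaptureUseCase,
  unprocessedSupported: boolean = true,
): AndroidAudioSource {
  const source = SOURCE_FOR_USE_CASE[useCase];
  if (source === 'UNPROCESSED' && !unprocessedSupported) {
    // Assumed fallback: pick the source tuned for recognition rather than calls.
    return 'VOICE_RECOGNITION';
  }
  return source;
}
```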

When To Use It (And When Not To)

Good fit

  • Voice memos, voice notes, podcast recording
  • Speech-to-text and voice commands
  • Real-time audio analysis (tuners, meters)
  • Voice / video calling apps

Look elsewhere if…

  • Always-on background listening (battery + privacy)
  • Recording without explicit user trigger
  • Tasks better served by speech APIs (just use them)
  • Replacing professional studio capture

Frequently Asked Questions

What's the best library for cross-platform recording?

expo-audio is the new official recommendation; it replaces the older expo-av and supports both record and playback.

How do I do speech-to-text?

On iOS use the Speech framework (SFSpeechRecognizer); on Android use SpeechRecognizer or Google ML Kit. Both are free to use; whether recognition runs on-device depends on the OS version, device and language.

Can I record while the screen is off?

Yes, with the audio background mode enabled on iOS and a foreground service on Android. Plan for OS-level kill behaviour anyway.

What sample rate should I use?

16 kHz mono for speech recognition, 44.1 kHz mono for voice memos, 48 kHz stereo for music or content creation.

Build with the Microphone on Newly

Ship a microphone-powered feature this week

Newly turns a description like “use the microphone to record voice memos” into a real React Native app, with permissions, native modules and UI included. Full source code is yours, and you can publish to the App Store and Google Play directly from the dashboard.

Start Building Your App

Want a deeper dive on the underlying APIs? See the expo-audio, Apple AVFoundation and Android MediaRecorder documentation.
