Set up audio drivers

We use the PulseAudio sound server by default.

Install audio drivers and dependencies

RUN apt-get -y install libportaudio2
RUN apt-get -y install alsa-base
# alsa-utils provides aplay, used below to play sound files
RUN apt-get -y install alsa-utils
RUN apt-get -y install pulseaudio

Set the PULSE_SERVER environment variable and mount the PulseAudio socket it points to. Mount the PulseAudio cookie as well; without it, ALSA fails with a permission denied error.

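echo $XDG_RUNTIME_DIR
# prints e.g. /run/user/1000

# -e sets PULSE_SERVER to the PulseAudio socket, the first -v mounts that socket,
# and the second -v mounts the cookie. Comments cannot follow a trailing backslash,
# so they are kept up here. Replace <image> with the name of your image.
docker run -it \
        -e PULSE_SERVER=unix:${XDG_RUNTIME_DIR}/pulse/native \
        -v ${XDG_RUNTIME_DIR}/pulse/native:${XDG_RUNTIME_DIR}/pulse/native \
        -v ~/.config/pulse/cookie:/root/.config/pulse/cookie \
        <image>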
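
The same flags can also be assembled programmatically, which keeps the three paths consistent. This is a minimal sketch rather than part of the original setup: the helper name `pulse_docker_args` and the image name `my-audio-image` are hypothetical, and it assumes the container user is root, as in the command above.

```python
import os
import shlex


def pulse_docker_args(runtime_dir, home, image):
    """Build a `docker run` argument list for PulseAudio passthrough.

    `image` is a placeholder; this guide does not name a specific image.
    """
    socket = f"{runtime_dir}/pulse/native"
    cookie = f"{home}/.config/pulse/cookie"
    return [
        "docker", "run", "-it",
        # Tell PulseAudio clients in the container where the server socket is.
        "-e", f"PULSE_SERVER=unix:{socket}",
        # Mount the host socket at the same path inside the container.
        "-v", f"{socket}:{socket}",
        # Mount the cookie so the client can authenticate (container user: root).
        "-v", f"{cookie}:/root/.config/pulse/cookie",
        image,
    ]


if __name__ == "__main__":
    args = pulse_docker_args(
        os.environ.get("XDG_RUNTIME_DIR", "/run/user/1000"),
        os.path.expanduser("~"),
        "my-audio-image",  # hypothetical image name
    )
    # Print a copy-pasteable shell command instead of running it.
    print(shlex.join(args))
```

Printing the command instead of executing it lets you inspect the paths before starting the container.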

Check setup (in the container)

aplay -L | head -n9

Output:

default
    Playback/recording through the PulseAudio sound server
    ...

Check if the speaker works when you play an audio file

aplay sample.wav
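
If aplay fails, it can help to confirm from inside the container that the variable and both mounts from the docker run step are actually present. The following is a sketch under assumptions: the function name `pulse_setup_problems` is hypothetical, and the cookie path assumes the container user is root.

```python
import os


def pulse_setup_problems(env, path_exists=os.path.exists,
                         cookie="/root/.config/pulse/cookie"):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    server = env.get("PULSE_SERVER", "")
    if not server.startswith("unix:"):
        problems.append("PULSE_SERVER is not set to a unix: socket")
    else:
        # Strip the "unix:" prefix to get the socket path that must be mounted.
        socket_path = server[len("unix:"):]
        if not path_exists(socket_path):
            problems.append(f"socket not mounted: {socket_path}")
    if not path_exists(cookie):
        problems.append(f"cookie not mounted: {cookie}")
    return problems


if __name__ == "__main__":
    for problem in pulse_setup_problems(os.environ):
        print("WARNING:", problem)
```

The `path_exists` parameter exists only so the check can be exercised without a real container; in normal use the default is enough.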

Record and play using the sounddevice module

In [8]:
import sounddevice as sd

Detecting the device

Make sure --device /dev/snd is passed to docker run so that the host's sound devices are visible inside the container.

In [9]:
sd.query_devices()
Out[9]:
   0 HDA Intel PCH: ALC887-VD Analog (hw:0,0), ALSA (2 in, 2 out)
   1 HDA Intel PCH: ALC887-VD Digital (hw:0,1), ALSA (0 in, 2 out)
   2 HDA Intel PCH: ALC887-VD Alt Analog (hw:0,2), ALSA (2 in, 0 out)
   3 HDA NVidia: HDMI 0 (hw:1,3), ALSA (0 in, 2 out)
   4 HDA NVidia: HDMI 1 (hw:1,7), ALSA (0 in, 8 out)
   5 HDA NVidia: HDMI 2 (hw:1,8), ALSA (0 in, 8 out)
   6 HDA NVidia: HDMI 3 (hw:1,9), ALSA (0 in, 8 out)
   7 Webcam C170: USB Audio (hw:2,0), ALSA (1 in, 0 out)
   8 sysdefault, ALSA (128 in, 128 out)
   9 front, ALSA (0 in, 2 out)
  10 surround40, ALSA (0 in, 2 out)
  11 surround51, ALSA (0 in, 2 out)
  12 surround71, ALSA (0 in, 2 out)
  13 iec958, ALSA (0 in, 2 out)
  14 spdif, ALSA (0 in, 2 out)
  15 pulse, ALSA (32 in, 32 out)
  16 dmix, ALSA (0 in, 2 out)
* 17 default, ALSA (32 in, 32 out)
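
Device indices can shift between machines, so it may be safer to look a device up by name than to hard-code a number. The sketch below assumes each entry is a dict with `name`, `max_input_channels`, and `max_output_channels` keys, which is how sounddevice reports devices; the helper name `find_device` is hypothetical, and the short `devices` list is a stand-in copied from the listing above so the example runs without audio hardware (its indices therefore differ from the real listing).

```python
def find_device(devices, name_part, need_input=True):
    """Return the index of the first device whose name contains `name_part`
    and that has at least one channel in the requested direction, else None."""
    key = "max_input_channels" if need_input else "max_output_channels"
    for index, dev in enumerate(devices):
        if name_part.lower() in dev["name"].lower() and dev[key] > 0:
            return index
    return None


# Stand-in for sd.query_devices(), using three entries from the listing above.
devices = [
    {"name": "HDA Intel PCH: ALC887-VD Analog (hw:0,0)",
     "max_input_channels": 2, "max_output_channels": 2},
    {"name": "HDA Intel PCH: ALC887-VD Digital (hw:0,1)",
     "max_input_channels": 0, "max_output_channels": 2},
    {"name": "pulse", "max_input_channels": 32, "max_output_channels": 32},
]

print(find_device(devices, "pulse"))                      # input-capable match
print(find_device(devices, "Digital"))                    # None: no input channels
print(find_device(devices, "Digital", need_input=False))  # output-capable match
```

With the real module, the result of `find_device(sd.query_devices(), "pulse")` can be passed as `device=` to `sd.rec` or `sd.play`.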
In [10]:
fs = 44100  # sample rate in Hz
duration = 5  # seconds
In [11]:
# Record duration * fs frames in stereo; pass device=... to select a specific device
myrecording = sd.rec(int(duration * fs), samplerate=fs, channels=2, dtype='float64')
sd.wait()
In [12]:
myrecording
Out[12]:
array([[0.00576782, 0.00576782],
       [0.        , 0.        ],
       [0.        , 0.        ],
       ...,
       [0.0027771 , 0.0027771 ],
       [0.0012207 , 0.0012207 ],
       [0.00167847, 0.00167847]])
In [13]:
sd.play(myrecording, fs)  # pass device=... to select a specific device
sd.wait()
In [14]:
myrecording.shape  # (frames, channels): duration * fs = 5 * 44100 = 220500 frames
Out[14]:
(220500, 2)
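
To check the recording with aplay, as done for sample.wav earlier, it first has to be written to a WAV file. The following is one possible sketch using only the standard library; the helper name `write_wav` is hypothetical, and it assumes a non-empty recording with float samples in the range [-1.0, 1.0], which it converts to 16-bit PCM.

```python
import struct
import wave


def write_wav(path, frames, fs):
    """Write float frames in [-1.0, 1.0] as a 16-bit PCM WAV file.

    `frames` is any non-empty iterable of per-frame channel sequences,
    e.g. a (220500, 2) array or a list of (left, right) tuples.
    """
    frames = list(frames)
    n_channels = len(frames[0])
    pcm = bytearray()
    for frame in frames:
        for sample in frame:
            # Clip to the valid range, then scale to signed 16-bit integers.
            clipped = max(-1.0, min(1.0, float(sample)))
            pcm += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(n_channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(fs)
        wav.writeframes(bytes(pcm))
```

In the notebook this would be called as `write_wav("recording.wav", myrecording, fs)`, followed by `aplay recording.wav` in the container. The pure-Python loop is slow for long recordings but avoids extra dependencies.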