medkit.core.audio.audio_buffer

Classes:

AudioBuffer(sample_rate, nb_samples, nb_channels)

Audio buffer base class.

FileAudioBuffer(path[, trim_start, ...])

Audio buffer giving access to audio files stored on the filesystem (to use when manipulating unmodified raw audio).

MemoryAudioBuffer(signal, sample_rate)

Audio buffer giving access to signals stored in memory (to use when reading/writing a modified audio signal).

PlaceholderAudioBuffer(sample_rate, ...)

Placeholder representing a MemoryAudioBuffer for which we have lost the actual signal.

class AudioBuffer(sample_rate, nb_samples, nb_channels)

Audio buffer base class. Gives access to raw audio samples.

Parameters
  • sample_rate (int) – Sample rate of the signal, in samples per second.

  • nb_samples (int) – Duration of the signal in samples.

  • nb_channels (int) – Number of channels in the signal.

Attributes:

duration

Duration of the signal in seconds.

Methods:

get_subclass_for_data_dict(data_dict)

Return the subclass that corresponds to the class name found in a data dict.

read([copy])

Return the signal in the audio buffer.

trim(start, end)

Return a new audio buffer pointing to a portion of the signal in the original buffer, using boundaries in samples.

trim_duration([start_time, end_time])

Return a new audio buffer pointing to a portion of the signal in the original buffer, using boundaries in seconds.

property duration: float

Duration of the signal in seconds.

Return type

float
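As a rough illustration, the duration in seconds follows from the two constructor arguments as nb_samples / sample_rate. The helper below is a standalone sketch; its name, duration_in_seconds, is hypothetical and it is not part of medkit.

```python
def duration_in_seconds(nb_samples: int, sample_rate: int) -> float:
    """Duration of a signal, assuming duration = nb_samples / sample_rate."""
    return nb_samples / sample_rate


# 48000 samples at 16000 samples per second
print(duration_in_seconds(nb_samples=48_000, sample_rate=16_000))  # 3.0
```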

abstract read(copy=False)

Return the signal in the audio buffer.

Parameters

copy (bool) – If True, the returned array will be a copy that can be safely mutated.

Return type

ndarray

Returns

np.ndarray – Raw audio samples

abstract trim(start, end)

Return a new audio buffer pointing to a portion of the signal in the original buffer, using boundaries in samples.

Parameters
  • start (Optional[int]) – Start sample of the new buffer (defaults to 0).

  • end (Optional[int]) – End sample of the new buffer, excluded (defaults to the full duration).

Return type

AudioBuffer

Returns

AudioBuffer – Trimmed audio buffer with new start and end samples, of same type as original audio buffer.
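The boundary convention described above (start included, end excluded, None meaning "from the beginning" or "to the end") matches Python's half-open slices. The sketch below applies it to a plain list of channels; trim_samples is a hypothetical helper, and unlike the real buffers it returns sliced lists rather than a new buffer object.

```python
from typing import List, Optional


def trim_samples(
    signal: List[List[float]],
    start: Optional[int] = None,
    end: Optional[int] = None,
) -> List[List[float]]:
    """Keep the samples in [start, end) for every channel."""
    nb_samples = len(signal[0])
    # None falls back to the documented defaults: 0 and the full duration
    start = 0 if start is None else start
    end = nb_samples if end is None else end
    return [channel[start:end] for channel in signal]


# Two channels of four samples each
stereo = [[0.0, 0.1, 0.2, 0.3], [1.0, 1.1, 1.2, 1.3]]
print(trim_samples(stereo, start=1, end=3))  # [[0.1, 0.2], [1.1, 1.2]]
```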

trim_duration(start_time=None, end_time=None)

Return a new audio buffer pointing to a portion of the signal in the original buffer, using boundaries in seconds. Since start_time and end_time are in seconds, the exact trim boundaries will be rounded to the nearest sample and will therefore depend on the sampling rate.

Parameters
  • start_time (Optional[float]) – Start time of the new buffer (defaults to 0.0).

  • end_time (Optional[float]) – End time of the new buffer, excluded (defaults to the full duration).

Return type

AudioBuffer

Returns

AudioBuffer – Trimmed audio buffer with new start and end samples, of same type as original audio buffer.
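To see why the trim boundaries depend on the sampling rate, the sketch below converts boundaries in seconds to sample indices. It assumes Python's built-in round() (which rounds ties to the nearest even integer); the rounding function medkit actually uses may differ, and seconds_to_sample_bounds is a hypothetical name.

```python
from typing import Optional, Tuple


def seconds_to_sample_bounds(
    sample_rate: int,
    nb_samples: int,
    start_time: Optional[float] = None,
    end_time: Optional[float] = None,
) -> Tuple[int, int]:
    """Convert boundaries in seconds to (start, end) sample indices."""
    # Assumption: nearest-sample rounding done with the built-in round()
    start = 0 if start_time is None else round(start_time * sample_rate)
    end = nb_samples if end_time is None else round(end_time * sample_rate)
    return start, end


# The same start time lands on different samples at different sample rates
print(seconds_to_sample_bounds(16_000, 16_000, start_time=0.3333))  # (5333, 16000)
print(seconds_to_sample_bounds(8_000, 8_000, start_time=0.3333))    # (2666, 8000)
```

Converted back to seconds, the two start samples correspond to slightly different instants, so the same call can select slightly different portions of audio at different sample rates.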

classmethod get_subclass_for_data_dict(data_dict)

Return the subclass that corresponds to the class name found in a data dict.

Parameters

data_dict (Dict[str, Any]) – Data dict returned by the to_dict() method of a subclass (or of the base class itself)

Return type

Optional[Type[Self]]

Returns

subclass – Subclass that generated data_dict, or None if data_dict corresponds to the base class itself.
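The lookup described above can be pictured as follows. This is a minimal sketch of the general pattern, not medkit's implementation: the classes, the "class_name" key and the ValueError raised for unknown names are all assumptions made for illustration.

```python
from typing import Any, Dict, Optional, Type

CLASS_NAME_KEY = "class_name"  # hypothetical key; medkit's own key may differ


class Buffer:
    """Hypothetical base class that records its class name in to_dict()."""

    def to_dict(self) -> Dict[str, Any]:
        return {CLASS_NAME_KEY: type(self).__name__}

    @classmethod
    def get_subclass_for_data_dict(
        cls, data_dict: Dict[str, Any]
    ) -> Optional[Type["Buffer"]]:
        class_name = data_dict[CLASS_NAME_KEY]
        if class_name == cls.__name__:
            return None  # the dict was produced by the base class itself
        # Walk direct and indirect subclasses looking for a matching name
        pending = list(cls.__subclasses__())
        while pending:
            subclass = pending.pop()
            if subclass.__name__ == class_name:
                return subclass
            pending.extend(subclass.__subclasses__())
        raise ValueError(f"No subclass named {class_name!r}")


class FileBuffer(Buffer):
    pass


print(Buffer.get_subclass_for_data_dict(FileBuffer().to_dict()))  # <class '...FileBuffer'>
print(Buffer.get_subclass_for_data_dict(Buffer().to_dict()))      # None
```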

class FileAudioBuffer(path, trim_start=None, trim_end=None, sf_info=None)

Audio buffer giving access to audio files stored on the filesystem (to use when manipulating unmodified raw audio).

Supports all file formats handled by libsndfile (http://www.mega-nerd.com/libsndfile/#Features).

Parameters
  • path (Union[str, Path]) – Path to the audio file.

  • trim_start (Optional[int]) – First sample of audio file to consider.

  • trim_end (Optional[int]) – First sample of audio file to exclude.

  • sf_info (Optional[Any]) – Optional metadata dict returned by soundfile.

Methods:

get_subclass_for_data_dict(data_dict)

Return the subclass that corresponds to the class name found in a data dict.

read([copy])

Return the signal in the audio buffer.

trim([start, end])

Return a new audio buffer pointing to a portion of the signal in the original buffer, using boundaries in samples.

trim_duration([start_time, end_time])

Return a new audio buffer pointing to a portion of the signal in the original buffer, using boundaries in seconds.

Attributes:

duration

Duration of the signal in seconds.

read(copy=False)

Return the signal in the audio buffer.

Parameters

copy (bool) – If True, the returned array will be a copy that can be safely mutated.

Return type

ndarray

Returns

np.ndarray – Raw audio samples

trim(start=None, end=None)

Return a new audio buffer pointing to a portion of the signal in the original buffer, using boundaries in samples.

Parameters
  • start (Optional[int]) – Start sample of the new buffer (defaults to 0).

  • end (Optional[int]) – End sample of the new buffer, excluded (defaults to the full duration).

Return type

AudioBuffer

Returns

AudioBuffer – Trimmed audio buffer with new start and end samples, of same type as original audio buffer.

property duration: float

Duration of the signal in seconds.

Return type

float

classmethod get_subclass_for_data_dict(data_dict)

Return the subclass that corresponds to the class name found in a data dict.

Parameters

data_dict (Dict[str, Any]) – Data dict returned by the to_dict() method of a subclass (or of the base class itself)

Return type

Optional[Type[Self]]

Returns

subclass – Subclass that generated data_dict, or None if data_dict corresponds to the base class itself.

trim_duration(start_time=None, end_time=None)

Return a new audio buffer pointing to a portion of the signal in the original buffer, using boundaries in seconds. Since start_time and end_time are in seconds, the exact trim boundaries will be rounded to the nearest sample and will therefore depend on the sampling rate.

Parameters
  • start_time (Optional[float]) – Start time of the new buffer (defaults to 0.0).

  • end_time (Optional[float]) – End time of the new buffer, excluded (defaults to the full duration).

Return type

AudioBuffer

Returns

AudioBuffer – Trimmed audio buffer with new start and end samples, of same type as original audio buffer.

class MemoryAudioBuffer(signal, sample_rate)

Audio buffer giving access to signals stored in memory (to use when reading/writing a modified audio signal).

Parameters
  • signal (ndarray) – Samples constituting the audio signal, with shape (nb_channels, nb_samples).

  • sample_rate (int) – Sample rate of the signal, in samples per second.

Methods:

get_subclass_for_data_dict(data_dict)

Return the subclass that corresponds to the class name found in a data dict.

read([copy])

Return the signal in the audio buffer.

trim([start, end])

Return a new audio buffer pointing to a portion of the signal in the original buffer, using boundaries in samples.

trim_duration([start_time, end_time])

Return a new audio buffer pointing to a portion of the signal in the original buffer, using boundaries in seconds.

Attributes:

duration

Duration of the signal in seconds.

read(copy=False)

Return the signal in the audio buffer.

Parameters

copy (bool) – If True, the returned array will be a copy that can be safely mutated.

Return type

ndarray

Returns

np.ndarray – Raw audio samples

trim(start=None, end=None)

Return a new audio buffer pointing to a portion of the signal in the original buffer, using boundaries in samples.

Parameters
  • start (Optional[int]) – Start sample of the new buffer (defaults to 0).

  • end (Optional[int]) – End sample of the new buffer, excluded (defaults to the full duration).

Return type

AudioBuffer

Returns

AudioBuffer – Trimmed audio buffer with new start and end samples, of same type as original audio buffer.

property duration: float

Duration of the signal in seconds.

Return type

float

classmethod get_subclass_for_data_dict(data_dict)

Return the subclass that corresponds to the class name found in a data dict.

Parameters

data_dict (Dict[str, Any]) – Data dict returned by the to_dict() method of a subclass (or of the base class itself)

Return type

Optional[Type[Self]]

Returns

subclass – Subclass that generated data_dict, or None if data_dict corresponds to the base class itself.

trim_duration(start_time=None, end_time=None)

Return a new audio buffer pointing to a portion of the signal in the original buffer, using boundaries in seconds. Since start_time and end_time are in seconds, the exact trim boundaries will be rounded to the nearest sample and will therefore depend on the sampling rate.

Parameters
  • start_time (Optional[float]) – Start time of the new buffer (defaults to 0.0).

  • end_time (Optional[float]) – End time of the new buffer, excluded (defaults to the full duration).

Return type

AudioBuffer

Returns

AudioBuffer – Trimmed audio buffer with new start and end samples, of same type as original audio buffer.

class PlaceholderAudioBuffer(sample_rate, nb_samples, nb_channels)

Placeholder representing a MemoryAudioBuffer for which we have lost the actual signal.

This class exists only so that MemoryAudioBuffer objects can be converted into JSON/YAML-serializable dicts and then deserialized. No further processing can be performed, since the actual signal is not saved: calling read() or trim() will raise.

Parameters
  • sample_rate (int) – Sample rate of the signal, in samples per second.

  • nb_samples (int) – Duration of the signal in samples.

  • nb_channels (int) – Number of channels in the signal.
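The placeholder idea can be sketched as a class that keeps only the metadata of a signal and refuses to return samples. SignalPlaceholder is a hypothetical stand-in, and NotImplementedError is an assumption: the text above only states that read() and trim() raise, not which exception.

```python
class SignalPlaceholder:
    """Hypothetical stand-in keeping only the metadata of a lost signal."""

    def __init__(self, sample_rate: int, nb_samples: int, nb_channels: int):
        self.sample_rate = sample_rate
        self.nb_samples = nb_samples
        self.nb_channels = nb_channels

    @property
    def duration(self) -> float:
        # Metadata is still available, so the duration can be computed
        return self.nb_samples / self.sample_rate

    def read(self, copy: bool = False):
        # Assumed exception type; the samples themselves were never saved
        raise NotImplementedError("signal was not saved, nothing to read")


placeholder = SignalPlaceholder(sample_rate=16_000, nb_samples=32_000, nb_channels=1)
print(placeholder.duration)  # 2.0
try:
    placeholder.read()
except NotImplementedError as error:
    print(f"read() failed: {error}")
```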

Methods:

get_subclass_for_data_dict(data_dict)

Return the subclass that corresponds to the class name found in a data dict.

read([copy])

Return the signal in the audio buffer.

trim(start, end)

Return a new audio buffer pointing to a portion of the signal in the original buffer, using boundaries in samples.

trim_duration([start_time, end_time])

Return a new audio buffer pointing to a portion of the signal in the original buffer, using boundaries in seconds.

Attributes:

duration

Duration of the signal in seconds.

read(copy=False)

Return the signal in the audio buffer.

Parameters

copy (bool) – If True, the returned array will be a copy that can be safely mutated.

Return type

ndarray

Returns

np.ndarray – Raw audio samples

trim(start, end)

Return a new audio buffer pointing to a portion of the signal in the original buffer, using boundaries in samples.

Parameters
  • start (Optional[int]) – Start sample of the new buffer (defaults to 0).

  • end (Optional[int]) – End sample of the new buffer, excluded (defaults to the full duration).

Return type

AudioBuffer

Returns

AudioBuffer – Trimmed audio buffer with new start and end samples, of same type as original audio buffer.

property duration: float

Duration of the signal in seconds.

Return type

float

classmethod get_subclass_for_data_dict(data_dict)

Return the subclass that corresponds to the class name found in a data dict.

Parameters

data_dict (Dict[str, Any]) – Data dict returned by the to_dict() method of a subclass (or of the base class itself)

Return type

Optional[Type[Self]]

Returns

subclass – Subclass that generated data_dict, or None if data_dict corresponds to the base class itself.

trim_duration(start_time=None, end_time=None)

Return a new audio buffer pointing to a portion of the signal in the original buffer, using boundaries in seconds. Since start_time and end_time are in seconds, the exact trim boundaries will be rounded to the nearest sample and will therefore depend on the sampling rate.

Parameters
  • start_time (Optional[float]) – Start time of the new buffer (defaults to 0.0).

  • end_time (Optional[float]) – End time of the new buffer, excluded (defaults to the full duration).

Return type

AudioBuffer

Returns

AudioBuffer – Trimmed audio buffer with new start and end samples, of same type as original audio buffer.