Spectral entropy

Prior to calculating spectral entropy, the spectrum needs to be centroided, meaning that each fragment ion should only have one corresponding peak. When focusing on fragment ion information, it may be necessary to remove the precursor ion from the spectrum before performing the spectral entropy calculation.
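As a rough illustration of the precursor-removal step mentioned above, the following pure-Python sketch drops every peak at or above a cutoff just below the precursor m/z. The helper name remove_precursor_peaks, the 1.6 Da exclusion window, and the example precursor m/z of 150.112 are all assumptions made for this example; they are not part of ms_entropy.

```python
def remove_precursor_peaks(peaks, precursor_mz, exclusion_window_da=1.6):
    """Keep only peaks whose m/z lies below precursor_mz - exclusion_window_da.

    Hypothetical helper for illustration; the window width is an assumption.
    """
    cutoff = precursor_mz - exclusion_window_da
    return [[mz, intensity] for mz, intensity in peaks if mz < cutoff]


# Example spectrum with an extra, made-up precursor peak at m/z 150.112.
peaks = [[69.071, 7.917962], [86.066, 1.021589], [86.0969, 100.0], [150.112, 55.0]]

fragments = remove_precursor_peaks(peaks, precursor_mz=150.112)
print(fragments)
```

The max_mz argument of clean_spectrum, documented in the reference section, removes peaks with m/z >= max_mz, so passing a value slightly below the precursor m/z should achieve a similar effect.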

With clean_spectrum = True (the default), the calculate_spectral_entropy function carries out the centroiding step and then computes the spectral entropy. Here’s an example:

import numpy as np
import ms_entropy as me

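# Each row is one peak: [m/z, intensity].
peaks = np.array([[69.071, 7.917962], [86.066, 1.021589], [86.0969, 100.0]], dtype=np.float32)

# Clean (centroid) the spectrum, merging peaks within 0.05 Da, then compute the entropy.
entropy = me.calculate_spectral_entropy(peaks, clean_spectrum=True, min_ms2_difference_in_da=0.05)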

print(f"Spectral entropy is {entropy}.")

If you want to separate the centroiding and entropy calculation steps, you can use the clean_spectrum and calculate_spectral_entropy functions respectively. Here’s how to do it:

import numpy as np
import ms_entropy as me

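# Each row is one peak: [m/z, intensity].
peaks = np.array([[69.071, 7.917962], [86.066, 1.021589], [86.0969, 100.0]], dtype=np.float32)

# Step 1: clean and centroid the spectrum, merging peaks within 0.05 Da.
peaks = me.clean_spectrum(peaks, min_ms2_difference_in_da=0.05)

# Step 2: compute the entropy of the already cleaned spectrum.
entropy = me.calculate_spectral_entropy(peaks, clean_spectrum=False)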

print(f"Spectral entropy is {entropy}.")

If your spectrum is already centroided, you can skip the clean_spectrum step and compute the spectral entropy directly with the entropy function from scipy.stats, which normalizes the intensities to sum to 1 and uses the natural logarithm by default. Here’s an example:

import numpy as np
import scipy.stats

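# An already centroided spectrum; each row is [m/z, intensity].
peaks = np.array([[41.04, 37.16], [69.07, 66.83], [86.1, 999.0]], dtype=np.float32)

# Only the intensity column is needed.
entropy = scipy.stats.entropy(peaks[:, 1])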

print(f"Spectral entropy is {entropy}.")
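For reference, the quantity computed above is the Shannon entropy of the normalized intensities, S = -sum(p_i * ln(p_i)), where p_i is the intensity of peak i divided by the total intensity. The sketch below spells this out in pure Python; the function name shannon_entropy is chosen for this example only and is not part of ms_entropy or scipy.

```python
import math


def shannon_entropy(intensities):
    """Shannon entropy (natural log) of intensities normalized to sum to 1."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("intensities must contain at least one positive value")
    entropy = 0.0
    for intensity in intensities:
        if intensity > 0:  # zero-intensity peaks contribute nothing to the sum
            p = intensity / total
            entropy -= p * math.log(p)
    return entropy


# Same centroided spectrum as in the scipy example; each row is [m/z, intensity].
peaks = [[41.04, 37.16], [69.07, 66.83], [86.1, 999.0]]

entropy = shannon_entropy([intensity for _, intensity in peaks])
print(f"Spectral entropy is {entropy}.")
```

Two sanity checks follow from the formula: a spectrum with a single peak has entropy 0, and a spectrum with n equally intense peaks has entropy ln(n), the maximum for n peaks.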

References

ms_entropy.clean_spectrum(peaks, min_mz: float = -1.0, max_mz: float = -1.0, noise_threshold: float = 0.01, min_ms2_difference_in_da: float = 0.05, min_ms2_difference_in_ppm: float = -1.0, max_peak_num: int = -1, normalize_intensity: bool = True, **kwargs) -> ndarray

Clean, centroid, and normalize a spectrum with the following steps:

  1. Remove empty peaks (m/z <= 0 or intensity <= 0).

  2. Remove peaks with m/z >= max_mz or m/z <= min_mz.

  3. Centroid the spectrum by merging peaks within min_ms2_difference_in_da.

  4. Remove peaks with intensity < noise_threshold * max_intensity.

  5. Keep only the top max_peak_num peaks.

  6. Normalize the intensity to sum to 1.
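The six steps above can be sketched in pure Python as follows. This is a simplified illustration and not the library's implementation: the function name clean_spectrum_sketch is hypothetical, the centroiding uses a greedy left-to-right merge with an intensity-weighted m/z (the merge strategy ms_entropy actually uses may differ), only the Da tolerance is handled, and non-positive values of min_mz, max_mz, and max_peak_num are assumed to mean "no limit".

```python
def clean_spectrum_sketch(peaks, min_mz=-1.0, max_mz=-1.0, noise_threshold=0.01,
                          min_ms2_difference_in_da=0.05, max_peak_num=-1,
                          normalize_intensity=True):
    """Simplified sketch of the six cleaning steps (not the ms_entropy code)."""
    # 1. Remove empty peaks (m/z <= 0 or intensity <= 0).
    peaks = [[mz, i] for mz, i in peaks if mz > 0 and i > 0]

    # 2. Apply the m/z window; non-positive bounds are assumed to mean "no limit".
    if min_mz > 0:
        peaks = [p for p in peaks if p[0] > min_mz]
    if max_mz > 0:
        peaks = [p for p in peaks if p[0] < max_mz]

    # 3. Centroid: sort by m/z and greedily merge peaks closer than the tolerance,
    #    summing intensities and taking the intensity-weighted mean m/z.
    peaks.sort(key=lambda p: p[0])
    merged = []
    for mz, intensity in peaks:
        if merged and mz - merged[-1][0] < min_ms2_difference_in_da:
            last_mz, last_intensity = merged[-1]
            total = last_intensity + intensity
            merged[-1] = [(last_mz * last_intensity + mz * intensity) / total, total]
        else:
            merged.append([mz, intensity])

    # 4. Remove peaks with intensity < noise_threshold * max_intensity.
    if merged and noise_threshold > 0:
        floor = noise_threshold * max(i for _, i in merged)
        merged = [p for p in merged if p[1] >= floor]

    # 5. Keep only the max_peak_num most intense peaks, then restore m/z order.
    if max_peak_num > 0 and len(merged) > max_peak_num:
        merged = sorted(merged, key=lambda p: p[1], reverse=True)[:max_peak_num]
        merged.sort(key=lambda p: p[0])

    # 6. Normalize the intensities to sum to 1.
    if normalize_intensity and merged:
        total = sum(i for _, i in merged)
        merged = [[mz, i / total] for mz, i in merged]
    return merged


peaks = [[69.071, 7.917962], [86.066, 1.021589], [86.0969, 100.0]]
cleaned = clean_spectrum_sketch(peaks)
print(cleaned)
```

With the default 0.05 Da tolerance, the two peaks near m/z 86 are about 0.03 Da apart and are merged, so the sketch returns two peaks whose intensities sum to 1.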

Parameters:
peaks : np.ndarray in shape (n_peaks, 2), dtype=np.float32 or list[list[float, float]]

A 2D array of shape (n_peaks, 2) where the first column is m/z and the second column is intensity.

min_mz : float, optional

The minimum m/z to keep. Defaults to -1.0, which skips removing peaks with m/z <= min_mz.

max_mz : float, optional

The maximum m/z to keep. Defaults to -1.0, which skips removing peaks with m/z >= max_mz.

noise_threshold : float, optional

The minimum intensity to keep, as a fraction of the maximum intensity. Defaults to 0.01, which removes peaks with intensity < 0.01 * max_intensity.

min_ms2_difference_in_da : float, optional

The minimum m/z difference, in Da, between two peaks in the resulting spectrum. Defaults to 0.05, which merges peaks within 0.05 Da. If a negative value is given, min_ms2_difference_in_ppm is used instead.

min_ms2_difference_in_ppm : float, optional

The minimum m/z difference, in ppm, between two peaks in the resulting spectrum. Defaults to -1; for this or any other negative value, min_ms2_difference_in_da is used instead. Note: either min_ms2_difference_in_da or min_ms2_difference_in_ppm must be positive. If both are positive, min_ms2_difference_in_ppm is used.

max_peak_num : int, optional

The maximum number of peaks to keep. Defaults to -1, which keeps all peaks.

normalize_intensity : bool, optional

Whether to normalize the intensity to sum to 1. Defaults to True. If False, the intensity will be kept as is.

**kwargs : optional

Any additional keyword arguments are ignored.


Returns:
np.ndarray in shape (n_peaks, 2), dtype=np.float32

The cleaned spectrum is guaranteed to be sorted by m/z in ascending order.
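The rule for choosing between min_ms2_difference_in_da and min_ms2_difference_in_ppm can be made concrete with a small sketch. The helper merge_tolerance_in_da is hypothetical and only mirrors the rule as documented above: a positive ppm value takes precedence and is converted to Da at the given m/z, otherwise the Da value is used. Raising ValueError when neither value is positive is an assumption of this sketch, not confirmed library behavior.

```python
def merge_tolerance_in_da(mz, min_ms2_difference_in_da=0.05,
                          min_ms2_difference_in_ppm=-1.0):
    """Return the merge tolerance in Da that applies at the given m/z.

    Hypothetical helper mirroring the documented precedence rule.
    """
    if min_ms2_difference_in_ppm > 0:
        # ppm is relative to m/z: 1 ppm of m/z 500 is 0.0005 Da.
        return mz * min_ms2_difference_in_ppm * 1e-6
    if min_ms2_difference_in_da > 0:
        return min_ms2_difference_in_da
    # Assumed error handling; the library's behavior here is not documented.
    raise ValueError("either the Da or the ppm tolerance must be positive")


print(merge_tolerance_in_da(500.0))                                   # Da default applies
print(merge_tolerance_in_da(500.0, min_ms2_difference_in_ppm=10.0))   # ppm takes precedence
```

Because a ppm tolerance scales with m/z, it is tighter for low-mass fragments and looser for high-mass ones, whereas a Da tolerance is constant across the spectrum.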

ms_entropy.calculate_spectral_entropy(peaks, clean_spectrum=True, **kwargs) -> float

Calculate the spectral entropy of a spectrum.

Parameters:
peaks : np.ndarray in shape (n_peaks, 2), dtype=np.float32 or list[list[float, float]]

The spectrum to calculate spectral entropy for. The first column is m/z, and the second column is intensity.

clean_spectrum : bool, optional

Whether to clean the spectrum before calculating spectral entropy. Defaults to True. If the spectrum is already cleaned, set this to False to save time.

**kwargs : optional

Keyword arguments to pass to clean_spectrum().


Returns:
float

The spectral entropy of the spectrum.