Peptidoform
Defines the Peptide class and associated utilities for handling peptidoforms.
This module provides a Peptide class for representing modified peptide sequences and their site localization probabilities. It offers methods to access and manipulate peptide information, summarize isoform probabilities, and retrieve modification sites. Additionally, it includes utility functions for parsing modified sequence strings and converting site localization probabilities to and from a standardized string format.
Classes:
Name | Description |
---|---|
Peptide | Representation of a peptide sequence identified by mass spectrometry. |
Functions:
Name | Description |
---|---|
parse_modified_sequence | Returns the plain sequence and a list of modification positions and tags. |
modify_peptide | Returns a string containing the modifications within the peptide sequence. |
make_localization_string | Generates a site localization probability string. |
read_localization_string | Converts a site localization probability string into a dictionary. |
Peptide
Peptide(
modified_sequence: str,
localization_probabilities: Optional[
dict[str, dict[int, float]]
] = None,
protein_position: Optional[int] = None,
)
Representation of a peptide sequence identified by mass spectrometry.
Methods:
Name | Description |
---|---|
make_modified_sequence | Returns a modified sequence string. |
count_modification | Returns how often a specified modification occurs. |
isoform_probability | Calculates the isoform probability for a given modification. |
get_peptide_site_probability | Returns the modification localization probability of the peptide position. |
get_protein_site_probability | Returns the modification localization probability of the protein position. |
list_modified_peptide_sites | Returns a list of peptide positions containing the specified modification. |
list_modified_protein_sites | Returns a list of protein positions containing the specified modification. |
Source code in msreport\peptidoform.py
make_modified_sequence
Returns a modified sequence string.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
include | Optional[list[str]] | Optional list of modifications to include in the modified sequence string. By default, all modifications are included. | None |
Returns:
Type | Description |
---|---|
str | A modified sequence string where modified amino acids are indicated by square brackets containing a modification tag. For example, "PEPT[phospho]IDE". |
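The effect of the `include` parameter can be illustrated with a small standalone sketch. It does not call the library; the function name `make_modified_sequence_sketch` and its `sequence` and `modifications` arguments are hypothetical stand-ins for the state a Peptide holds internally, and the position convention (a tag follows the residue at the given 1-indexed position) is inferred from the "PEPT[phospho]IDE" example.

```python
from typing import Optional


def make_modified_sequence_sketch(
    sequence: str,
    modifications: list[tuple[int, str]],
    include: Optional[list[str]] = None,
) -> str:
    """Hypothetical helper, not the library implementation."""
    # Keep only the requested modification tags; None means "keep all".
    if include is not None:
        modifications = [(pos, tag) for pos, tag in modifications if tag in include]

    # Insert tags from right to left so earlier positions stay valid.
    result = sequence
    for pos, tag in sorted(modifications, reverse=True):
        result = result[:pos] + f"[{tag}]" + result[pos:]
    return result


mods = [(4, "phospho"), (8, "ox")]
all_mods = make_modified_sequence_sketch("PEPTIDEM", mods)
only_phospho = make_modified_sequence_sketch("PEPTIDEM", mods, include=["phospho"])
```

With these inputs, `all_mods` contains both tags, while `only_phospho` omits the "ox" tag.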
count_modification
Returns how often a specified modification occurs.
isoform_probability
Calculates the isoform probability for a given modification.
Returns:
Type | Description |
---|---|
float \| None | The isoform probability for the combination of the assigned modification sites, calculated as the product of the individual modification localization probabilities. If no localization exists for the specified modification, None is returned. |
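The product rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not the library code: the function name `isoform_probability_sketch` and its two arguments (the assigned sites and a position-to-probability mapping for one modification) are hypothetical.

```python
import math
from typing import Optional


def isoform_probability_sketch(
    assigned_sites: list[int],
    site_probabilities: Optional[dict[int, float]],
) -> Optional[float]:
    """Hypothetical helper: product of the probabilities of the assigned sites."""
    # No localization information for this modification: nothing to compute.
    if site_probabilities is None:
        return None
    return math.prod(site_probabilities[site] for site in assigned_sites)


# Two phospho groups assigned to positions 3 and 5 of the peptide.
probabilities = {3: 0.9, 4: 0.1, 5: 0.5}
isoform = isoform_probability_sketch([3, 5], probabilities)
```

Here the isoform probability is 0.9 × 0.5, because only the probabilities of the assigned sites enter the product.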
get_peptide_site_probability
Return the modification localization probability of the peptide position.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
position | int | Peptide position for which the modification localization probability is returned. | required |
Returns:
Type | Description |
---|---|
float \| None | Localization probability between 0 and 1. Returns None if the specified position does not contain a modification or if no localization probability is available. |
get_protein_site_probability
Return the modification localization probability of the protein position.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
position | int | Protein position for which the modification localization probability is returned. | required |
Returns:
Type | Description |
---|---|
float \| None | Localization probability between 0 and 1. Returns None if the specified position does not contain a modification or if no localization probability is available. |
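The relationship between protein and peptide positions is not spelled out on this page. The sketch below assumes that `protein_position` is the 1-indexed position of the peptide's first residue within the protein, so that a protein site maps to a peptide site by a simple offset. That assumption, the function name `protein_site_probability_sketch`, and its arguments are all hypothetical and should be checked against the source.

```python
from typing import Optional


def protein_site_probability_sketch(
    protein_site: int,
    peptide_start: int,
    peptide_site_probabilities: dict[int, float],
) -> Optional[float]:
    """Hypothetical helper: look up a protein site via its peptide position."""
    # Assumption: peptide_start is the 1-indexed protein position of the
    # first peptide residue, so peptide position 1 equals peptide_start.
    peptide_site = protein_site - peptide_start + 1
    # dict.get returns None when no probability is stored for that site.
    return peptide_site_probabilities.get(peptide_site)


# Peptide starting at protein position 10 with probabilities at sites 3 and 4.
site_probabilities = {3: 0.2, 4: 0.8}
found = protein_site_probability_sketch(13, 10, site_probabilities)
missing = protein_site_probability_sketch(20, 10, site_probabilities)
```

Under this assumption, protein position 13 corresponds to peptide position 4, and a protein position without a stored probability yields None.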
list_modified_peptide_sites
Returns a list of peptide positions containing the specified modification.
list_modified_protein_sites
Returns a list of protein positions containing the specified modification.
parse_modified_sequence
parse_modified_sequence(
modified_sequence: str, tag_open: str, tag_close: str
) -> tuple[str, list[tuple[int, str]]]
Returns the plain sequence and a list of modification positions and tags.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
modified_sequence | str | Peptide sequence containing modifications. | required |
tag_open | str | Symbol that indicates the beginning of a modification tag, e.g. "[". | required |
tag_close | str | Symbol that indicates the end of a modification tag, e.g. "]". | required |
Returns:
Type | Description |
---|---|
tuple[str, list[tuple[int, str]]] | A tuple containing the plain sequence as a string and a sorted list of modification tuples, each containing the position and modification tag (excluding the tag_open and tag_close symbols). |
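The parsing behavior described above can be approximated with a short scanner. This is a minimal sketch rather than the library implementation: the name `parse_modified_sequence_sketch` is hypothetical, tags are assumed not to be nested, and the position of a tag is assumed to be the number of residues that precede it (so an N-terminal tag would get position 0).

```python
def parse_modified_sequence_sketch(
    modified_sequence: str, tag_open: str, tag_close: str
) -> tuple[str, list[tuple[int, str]]]:
    """Hypothetical helper: split a modified sequence into residues and tags."""
    residues: list[str] = []
    modifications: list[tuple[int, str]] = []
    i = 0
    while i < len(modified_sequence):
        if modified_sequence.startswith(tag_open, i):
            # Read the tag text up to the closing symbol (no nesting assumed).
            start = i + len(tag_open)
            end = modified_sequence.index(tag_close, start)
            # Position = number of residues seen so far.
            modifications.append((len(residues), modified_sequence[start:end]))
            i = end + len(tag_close)
        else:
            residues.append(modified_sequence[i])
            i += 1
    return "".join(residues), sorted(modifications)


plain, mods = parse_modified_sequence_sketch("PEPT[phospho]IDE", "[", "]")
```

For the documented example string, this yields the plain sequence "PEPTIDE" and a single modification at position 4.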
modify_peptide
modify_peptide(
sequence: str,
modifications: list[tuple[int, str]],
tag_open: str = "[",
tag_close: str = "]",
) -> str
Returns a string containing the modifications within the peptide sequence.
Returns:
Type | Description |
---|---|
str | Modified sequence. For example, "PEPT[phospho]IDE" for sequence = "PEPTIDE" and modifications = [(4, "phospho")]. |
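One way to reproduce the documented example is to insert tags from right to left, so that earlier insertions do not shift later positions. The sketch below is an assumption about how such a function could work, not the library code, and `modify_peptide_sketch` is a hypothetical name.

```python
def modify_peptide_sketch(
    sequence: str,
    modifications: list[tuple[int, str]],
    tag_open: str = "[",
    tag_close: str = "]",
) -> str:
    """Hypothetical helper: write modification tags into a plain sequence."""
    result = sequence
    # Process the highest position first; inserting there leaves all
    # lower positions unchanged.
    for position, tag in sorted(modifications, reverse=True):
        result = result[:position] + f"{tag_open}{tag}{tag_close}" + result[position:]
    return result


modified = modify_peptide_sketch("PEPTIDE", [(4, "phospho")])
```

The variable `modified` holds the same string as the example in the Returns table.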
make_localization_string
make_localization_string(
localization_probabilities: dict[str, dict[int, float]],
decimal_places: int = 3,
) -> str
Generates a site localization probability string.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
localization_probabilities | dict[str, dict[int, float]] | A dictionary in the form {"modification tag": {position: probability}}, where positions are integers and probabilities are floats ranging from 0 to 1. | required |
decimal_places | int | Number of decimal places used for the probabilities, default 3. | 3 |
Returns:
Type | Description |
---|---|
str | A site localization probability string according to the MsReport convention. Multiple modification entries are separated by ";". Each modification entry consists of a modification tag and site probabilities, separated by "@". The site probability entries consist of f"{position}:{probability}" strings, and multiple probability entries are separated by ",". For example, "15.9949@11:1.000;79.9663@3:0.200,4:0.800". |
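The string convention described above is compact enough to sketch directly. The following is an illustrative re-implementation under stated assumptions, not the library function: the name `make_localization_string_sketch` is hypothetical, modifications are emitted in dictionary order, and sites are sorted by position.

```python
def make_localization_string_sketch(
    localization_probabilities: dict[str, dict[int, float]],
    decimal_places: int = 3,
) -> str:
    """Hypothetical helper: serialize site probabilities to one string."""
    entries = []
    for tag, sites in localization_probabilities.items():
        # Each site becomes "position:probability" with fixed decimals.
        site_strings = [
            f"{position}:{probability:.{decimal_places}f}"
            for position, probability in sorted(sites.items())
        ]
        entries.append(f"{tag}@{','.join(site_strings)}")
    return ";".join(entries)


example = make_localization_string_sketch(
    {"15.9949": {11: 1.0}, "79.9663": {3: 0.2, 4: 0.8}}
)
```

With this input, `example` equals the example string given in the Returns table.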
read_localization_string
Converts a site localization probability string into a dictionary.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
localization_string | str | A site localization probability string according to the MsReport convention. Can contain information about multiple modifications, which are separated by ";". Each modification entry consists of a modification tag and site probabilities, separated by "@". The site probability entries consist of f"{peptide position}:{localization probability}" strings, and multiple entries are separated by ",". For example, "15.9949@11:1.000;79.9663@3:0.200,4:0.800". | required |
Returns:
Type | Description |
---|---|
dict[str, dict[int, float]] | A dictionary in the form {"modification tag": {position: probability}}, where positions are integers and probabilities are floats ranging from 0 to 1. |
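Reading the string back amounts to splitting on the three separators in turn. The sketch below is a standalone illustration, not the library function: `read_localization_string_sketch` is a hypothetical name, and returning an empty dictionary for an empty string is an assumption.

```python
def read_localization_string_sketch(
    localization_string: str,
) -> dict[str, dict[int, float]]:
    """Hypothetical helper: parse a site localization probability string."""
    result: dict[str, dict[int, float]] = {}
    if not localization_string:
        # Assumption: an empty string carries no localization information.
        return result
    for entry in localization_string.split(";"):
        # Split at the last "@" so the tag is everything before it.
        tag, site_part = entry.rsplit("@", 1)
        result[tag] = {}
        for site in site_part.split(","):
            position, probability = site.split(":")
            result[tag][int(position)] = float(probability)
    return result


parsed = read_localization_string_sketch("15.9949@11:1.000;79.9663@3:0.200,4:0.800")
```

For the documented example string, `parsed` maps "15.9949" to position 11 and "79.9663" to positions 3 and 4, with probabilities converted to floats.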