Qtable
Defines the Qtable class, the central container for quantitative proteomics data.
The Qtable class serves as the standardized data structure for msreport, storing a main table with quantitative values and associated metadata for its entries; it also maintains the name of the unique ID column for the main table. Additionally, it stores an experimental design table that links sample names to experimental conditions and replicate information.
Qtable provides convenience methods for creating subtables and accessing design-related information (e.g., samples per experiment), and instances of Qtable can be easily saved to disk and loaded back. As the central data container, the Qtable facilitates seamless integration with the high-level modules analyze, plot, and export, which all directly operate on Qtable instances.
Classes:
Name | Description |
---|---|
Qtable | Stores and provides access to quantitative proteomics data in a tabular form. |
Qtable
Qtable(data: DataFrame, design: DataFrame, id_column: str)
Stores and provides access to quantitative proteomics data in a tabular form.
Qtable contains proteomics data in a tabular form, which is stored as 'qtable.data', and an experimental design table, stored in 'qtable.design'. Columns from 'qtable.data' can directly be accessed by indexing with [], column values can be set with [], and the 'in' operator can be used to check whether a column is present in 'qtable.data', e.g. 'qtable[key]', 'qtable[key] = value', 'key in qtable'.
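The bracket and 'in' behavior described above follows Python's container protocol. The following is a minimal sketch under stated assumptions, not msreport's implementation: ColumnContainer is a hypothetical stand-in class that forwards the three operations to a pandas DataFrame, and the column names are made up for illustration.

```python
import pandas as pd


class ColumnContainer:
    """Hypothetical stand-in that mimics the column access described for Qtable."""

    def __init__(self, data: pd.DataFrame):
        self.data = data.copy()

    def __getitem__(self, key):
        # 'container[key]' returns the column from the wrapped dataframe.
        return self.data[key]

    def __setitem__(self, key, value):
        # 'container[key] = value' sets or overwrites a column.
        self.data[key] = value

    def __contains__(self, key):
        # 'key in container' checks whether a column is present.
        return key in self.data.columns


table = ColumnContainer(
    pd.DataFrame({"Protein": ["P1", "P2"], "Intensity A": [10.0, 20.0]})
)
table["Valid"] = True
has_valid = "Valid" in table
has_missing = "Missing" in table
proteins = table["Protein"].tolist()
```

With an actual Qtable instance, the same three expressions ('qtable[key]', 'qtable[key] = value', 'key in qtable') would act on 'qtable.data'.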
Attributes:
Name | Type | Description |
---|---|---|
data | DataFrame | A pandas.DataFrame containing quantitative proteomics data. |
design | DataFrame | A pandas.DataFrame describing the experimental design. |
If data does not contain a "Valid" column, this column is added and all its row values are set to True.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | DataFrame | A dataframe containing quantitative proteomics data in a wide format. The index of the dataframe must contain unique values. | required |
design | DataFrame | A dataframe describing the experimental design that must at least contain the columns "Sample" and "Experiment". The "Sample" entries should correspond to the sample names present in the quantitative columns of the data. | required |
id_column | str | The name of the column that contains the unique identifiers for the entries in the data table. | required |
Raises:
Type | Description |
---|---|
KeyError | If the specified id_column is not found in data. |
ValueError | If the specified id_column does not contain unique identifiers. |
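The documented constructor checks can be illustrated with plain pandas. This is a sketch, not msreport's source code: check_qtable_inputs is a hypothetical helper, the error messages are invented, and the order of the checks is an assumption.

```python
import pandas as pd


def check_qtable_inputs(data: pd.DataFrame, id_column: str) -> pd.DataFrame:
    """Hypothetical helper mirroring the documented checks; not msreport code."""
    if id_column not in data.columns:
        raise KeyError(f"Column '{id_column}' not found in data.")
    if data[id_column].duplicated().any():
        raise ValueError(f"Column '{id_column}' contains duplicate identifiers.")
    checked = data.copy()
    if "Valid" not in checked.columns:
        # Documented default: a missing "Valid" column is added, all rows True.
        checked["Valid"] = True
    return checked


data = pd.DataFrame({"Protein": ["P1", "P2"], "Intensity A": [1.0, 2.0]})
checked = check_qtable_inputs(data, id_column="Protein")
```

Passing a column name that is absent from the table raises KeyError, and a table with repeated identifiers raises ValueError, matching the two documented error cases.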
Methods:
Name | Description |
---|---|
add_design | Adds an experimental design table. |
get_data | Returns a copy of the data table. |
get_design | Returns a copy of the design table. |
get_samples | Returns a list of samples present in the design table. |
get_experiment | Looks up the experiment of the specified sample from the design table. |
get_experiments | Returns a list of experiments present in the design table. |
get_expression_column | Returns the expression column associated with a sample. |
make_sample_table | Returns a new dataframe with sample columns containing the 'tag'. |
make_expression_table | Returns a new dataframe containing the expression columns. |
set_expression_by_tag | Selects and sets expression columns from those that contain the 'tag'. |
set_expression_by_column | Sets expression columns by using the keys from 'columns_to_samples'. |
add_expression_features | Adds expression features as new columns to the proteomics data. |
temp_design | Context manager to temporarily modify the design table. |
save | Save qtable to disk, creating a data, design, and config file. |
load | Load a qtable from disk by reading a data, design, and config file. |
to_tsv | Writes the data table to a .tsv (tab-separated values) file. |
to_clipboard | Writes the data table to the system clipboard. |
copy | Returns a copy of this Qtable instance. |
Source code in msreport\qtable.py
add_design
add_design(design: DataFrame) -> None
Adds an experimental design table.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
design | DataFrame | A dataframe describing the experimental design that must at least contain the columns "Sample" and "Experiment". The "Sample" entries should correspond to the sample names present in the quantitative columns of the table. | required |
get_data
get_data(exclude_invalid: bool = False) -> DataFrame
Returns a copy of the data table.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
exclude_invalid | bool | If true, rows whose "Valid" value is False are excluded from the returned dataframe. | False |
Returns:
Type | Description |
---|---|
DataFrame | A copy of the qtable.data dataframe. |
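The effect of 'exclude_invalid' can be pictured with plain pandas. This is a sketch that assumes filtering by the "Valid" column means keeping the rows where "Valid" is True; the example table is made up.

```python
import pandas as pd

data = pd.DataFrame(
    {"Protein": ["P1", "P2", "P3"], "Valid": [True, False, True]}
)

# Comparable to exclude_invalid=True: keep only rows marked as valid.
valid_only = data[data["Valid"]].copy()

# Because a copy is returned, modifying it leaves the original table untouched.
valid_only["Protein"] = "changed"
```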
get_design
get_design() -> DataFrame
Returns a copy of the design table.
get_samples
Returns a list of samples present in the design table.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
experiment | Optional[str] | If specified, only samples from this experiment are returned. | None |
Returns:
Type | Description |
---|---|
list[str] | A list of sample names. |
get_experiment
Looks up the experiment of the specified sample from the design table.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample | str | A sample name. | required |
Returns:
Type | Description |
---|---|
str | An experiment name. |
get_experiments
Returns a list of experiments present in the design table.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
samples | Optional[list[str]] | If specified, only experiments from these samples are returned. | None |
Returns:
Type | Description |
---|---|
list[str] | A list of experiment names. |
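The three design lookups (get_samples, get_experiment, get_experiments) all read from the "Sample" and "Experiment" columns of the design table. The following sketch reproduces comparable lookups with plain pandas; it is not msreport's implementation, and the sample names and the "Replicate" column are illustrative assumptions.

```python
import pandas as pd

design = pd.DataFrame(
    {
        "Sample": ["Ctrl_1", "Ctrl_2", "Treat_1", "Treat_2"],
        "Experiment": ["Ctrl", "Ctrl", "Treat", "Treat"],
        "Replicate": ["1", "2", "1", "2"],  # assumed column name
    }
)

# Samples of one experiment (compare get_samples with 'experiment' set).
ctrl_samples = design.loc[design["Experiment"] == "Ctrl", "Sample"].tolist()

# Experiment of one sample (compare get_experiment).
experiment = design.loc[design["Sample"] == "Treat_2", "Experiment"].iloc[0]

# Unique experiments of selected samples (compare get_experiments with 'samples' set).
subset = design[design["Sample"].isin(["Ctrl_1", "Treat_1", "Treat_2"])]
experiments = subset["Experiment"].unique().tolist()
```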
get_expression_column
Returns the expression column associated with a sample.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample | str | A sample name. | required |
Returns:
Type | Description |
---|---|
str | The name of the expression column associated with the sample. |
make_sample_table
make_sample_table(
tag: str,
samples_as_columns: bool = False,
exclude_invalid: bool = False,
) -> DataFrame
Returns a new dataframe with sample columns containing the 'tag'.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tag | str | Substring that must be present in selected columns. | required |
samples_as_columns | bool | If true, replaces expression column names with sample names. Requires that the experimental design is set. | False |
exclude_invalid | bool | If true, rows whose "Valid" value is False are excluded from the returned dataframe. | False |
Returns:
Type | Description |
---|---|
DataFrame | A new dataframe, copied from qtable.data, that contains only the sample columns whose names include the specified 'tag'. |
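Selecting sample columns by tag can be sketched with plain pandas. The example assumes that a sample column is one whose name contains both the 'tag' and a sample name from the design; the actual matching rules in msreport may differ, and all column and sample names here are made up.

```python
import pandas as pd

data = pd.DataFrame(
    {
        "Protein": ["P1", "P2"],
        "Intensity Ctrl_1": [10.0, 0.0],
        "Intensity Treat_1": [12.0, 8.0],
        "Spectral count Ctrl_1": [3, 0],
        "Intensity Pool": [5.0, 5.0],  # not a sample listed in the design
    }
)
samples = ["Ctrl_1", "Treat_1"]  # would come from the design table
tag = "Intensity"

# Map each column that contains both the tag and a sample name to that sample.
column_to_sample = {
    column: sample
    for column in data.columns
    for sample in samples
    if tag in column and sample in column
}

# Comparable to samples_as_columns=False: original column names are kept.
sample_table = data[list(column_to_sample)].copy()

# Comparable to samples_as_columns=True: columns are renamed to sample names.
renamed_table = sample_table.rename(columns=column_to_sample)
```

Plain substring matching becomes ambiguous when one sample name is contained in another (for example "Ctrl_1" and "Ctrl_10"), so a real implementation would need a stricter rule.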
make_expression_table
make_expression_table(
samples_as_columns: bool = False,
features: Optional[list[str]] = None,
exclude_invalid: bool = False,
) -> DataFrame
Returns a new dataframe containing the expression columns.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
samples_as_columns | bool | If true, replaces expression column names with sample names. Requires that the experimental design is set. | False |
features | Optional[list[str]] | A list of additional columns from qtable.data that are added to the newly generated dataframe. | None |
exclude_invalid | bool | If true, rows whose "Valid" value is False are excluded from the returned dataframe. | False |
Returns:
Type | Description |
---|---|
DataFrame | A copy of the qtable.data dataframe that contains only the expression columns and any additionally specified columns. |
set_expression_by_tag
Selects and sets expression columns from those that contain the 'tag'.
A copy of all identified expression columns is generated and the copies are renamed to "Expression sample_name". Only columns containing a sample name that is present in qtable.design are selected as expression columns. For every sample present in qtable.design, an expression column must be present in qtable.data. When this method is called, previously generated expression columns and expression features are deleted.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tag | str | Columns that contain 'tag' as a substring are selected as potential expression columns. | required |
zerotonan | bool | If true, zeros in expression columns are replaced by NaN. | False |
log2 | bool | If true, expression column values are log2 transformed and zeros are replaced by NaN. Evaluates whether intensities are likely to be already in log-space, which prevents another log2 transformation. | False |
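The renaming and the 'zerotonan' and 'log2' options can be illustrated with pandas and NumPy. This is a simplified sketch, not msreport's implementation: the column names are made up, and the check for values that are already in log-space is deliberately left out because its logic is not described here.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(
    {"Intensity Ctrl_1": [1024.0, 0.0], "Intensity Treat_1": [2048.0, 512.0]}
)
# Mapping of source columns to sample names (sample names would come from the design).
columns_to_samples = {"Intensity Ctrl_1": "Ctrl_1", "Intensity Treat_1": "Treat_1"}

for column, sample in columns_to_samples.items():
    values = data[column].replace(0, np.nan)  # zeros become NaN
    # The transformed copy is stored as "Expression sample_name".
    data[f"Expression {sample}"] = np.log2(values)
```

The source columns stay in the table unchanged, because the expression columns are copies.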
set_expression_by_column
set_expression_by_column(
columns_to_samples: dict[str, str],
zerotonan: bool = False,
log2: bool = False,
) -> None
Sets expression columns by using the keys from 'columns_to_samples'.
Generates a copy of all specified expression columns and renames them to "Expression sample_name", according to the 'columns_to_samples' mapping. When this method is called, previously generated expression columns and expression features are deleted.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
columns_to_samples | dict[str, str] | Mapping of expression columns to sample names. The keys of the dictionary must correspond to columns of the proteomics data and are used to identify expression columns. The value associated with each key specifies the sample name and must correspond to an entry of the experimental design table. | required |
zerotonan | bool | If true, zeros in expression columns are replaced by NaN. | False |
log2 | bool | If true, expression column values are log2 transformed and zeros are replaced by NaN. Evaluates whether intensities are likely to be already in log-space, which prevents another log2 transformation. | False |
add_expression_features
add_expression_features(
expression_features: DataFrame,
) -> None
Adds expression features as new columns to the proteomics data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
expression_features | DataFrame | A dataframe or Series that will be added to qtable.data as new columns; the column names are added to the list of expression features. The number and order of rows in 'expression_features' must correspond to qtable.data. | required |
temp_design
temp_design(
design: Optional[DataFrame] = None,
exclude_experiments: Optional[Iterable[str]] = None,
keep_experiments: Optional[Iterable[str]] = None,
exclude_samples: Optional[Iterable[str]] = None,
keep_samples: Optional[Iterable[str]] = None,
) -> Generator[None, None, None]
Context manager to temporarily modify the design table.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
design | Optional[DataFrame] | A DataFrame to temporarily replace the current design table. | None |
exclude_experiments | Optional[Iterable[str]] | A list of experiments to exclude from the design. | None |
keep_experiments | Optional[Iterable[str]] | A list of experiments to keep in the design (all others are removed). | None |
exclude_samples | Optional[Iterable[str]] | A list of samples to exclude from the design. | None |
keep_samples | Optional[Iterable[str]] | A list of samples to keep in the design (all others are removed). | None |
Yields:
Type | Description |
---|---|
None | None. Restores the original design table after the context ends. |
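The swap-and-restore pattern behind such a context manager can be sketched with contextlib. This is an illustration, not msreport's code: DesignHolder is a hypothetical stand-in that stores the design as a list of row dictionaries, and only the 'exclude_samples' option is modeled.

```python
from contextlib import contextmanager


class DesignHolder:
    """Hypothetical stand-in holding a design as a list of row dictionaries."""

    def __init__(self, design):
        self.design = list(design)

    @contextmanager
    def temp_design(self, exclude_samples=None):
        original = self.design
        excluded = set(exclude_samples or [])
        self.design = [row for row in original if row["Sample"] not in excluded]
        try:
            yield
        finally:
            # Restore the original design even if the body raised an exception.
            self.design = original


holder = DesignHolder(
    [
        {"Sample": "Ctrl_1", "Experiment": "Ctrl"},
        {"Sample": "Treat_1", "Experiment": "Treat"},
    ]
)
with holder.temp_design(exclude_samples=["Treat_1"]):
    samples_inside = [row["Sample"] for row in holder.design]
samples_after = [row["Sample"] for row in holder.design]
```

Based on the signature above, the call on an actual instance would take the form 'with qtable.temp_design(exclude_samples=[...]):', with all design-dependent methods seeing the reduced design inside the block.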
save
Save qtable to disk, creating a data, design, and config file.
Saving the qtable generates three files, each starting with the specified basename, followed by an individual extension. The generated files are "basename.data.tsv", "basename.design.tsv", and "basename.config.yaml".
Parameters:
Name | Type | Description | Default |
---|---|---|---|
directory | str | The path of the directory where to save the generated files. | required |
basename | str | Basename of files that will be generated. | required |
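The naming scheme of the three files can be sketched as follows. qtable_file_paths is a hypothetical helper written for illustration only; the directory and basename values are made up, and how msreport joins the paths internally is an assumption.

```python
import os


def qtable_file_paths(directory: str, basename: str) -> dict:
    """Hypothetical helper listing the three documented file names."""
    extensions = {"data": "data.tsv", "design": "design.tsv", "config": "config.yaml"}
    return {
        kind: os.path.join(directory, f"{basename}.{extension}")
        for kind, extension in extensions.items()
    }


paths = qtable_file_paths("results", "experiment01")
```

The same directory and basename pair is what the load classmethod expects when reading the files back.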
load
classmethod
Load a qtable from disk by reading a data, design, and config file.
Loading a qtable first imports the three files generated during saving, then creates and configures a new qtable instance. Each filename starts with the specified basename, followed by an individual extension. The loaded files are "basename.data.tsv", "basename.design.tsv", and "basename.config.yaml".
Parameters:
Name | Type | Description | Default |
---|---|---|---|
directory | str | The path of the directory where saved qtable files are located. | required |
basename | str | Basename of saved files. | required |
Returns:
Type | Description |
---|---|
Self | An instance of Qtable loaded from the specified files. |
Raises:
Type | Description |
---|---|
ValueError | If the loaded config file does not contain the "Unique ID column" key. This occurs when the qtable was saved with msreport version 0.0.27 or earlier. |
to_tsv
Writes the data table to a .tsv (tab-separated values) file.
to_clipboard
to_clipboard(index: bool = False) -> None
Writes the data table to the system clipboard.
copy
copy() -> Self
Returns a copy of this Qtable instance.