tools._mapping
def bulk_mapping(frac_data,
bulk_adata,
sc_adata,
n_cell=100,
annotation_key="curated_cell_type",
bulk_layer=None,
sc_layer=None,
reorder=True,
multiprocessing=True,
cpu_num=cpu_count()-2,
dataset_name="",
out_dir=".",
normalization=True,
filter_gene=True,
cut_off_value=0.6,
save=True)
Reconstruct bulk data using single-cell data and cell type fractions.
This function maps bulk expression data to single-cell expression data using cell type fractions, with optional gene reordering, CPM normalization, and cosine-similarity gene filtering.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
bulk_adata | AnnData | An :class: | required |
sc_adata | AnnData | An :class: | required |
n_cell | int | Number of cells per bulk sample. | 100 |
annotation_key | string | Key in | 'curated_cell_type' |
bulk_layer | string | Layer in | None |
sc_layer | string | Layer in | None |
reorder | bool, optional (default: True) | Whether to reorder genes to ensure consistency between bulk and single-cell data. | True |
multiprocessing | bool, optional (default: True) | Whether to use multiprocessing for efficiency. | True |
cpu_num | int | Number of CPUs to use if multiprocessing is enabled. | cpu_count() - 2 |
dataset_name | string | Prefix for output files. | '' |
out_dir | string | Directory to store output files. | '.' |
normalization | bool, optional (default: True) | Whether to apply CPM normalization to data. | True |
filter_gene | bool, optional (default: True) | Whether to filter genes based on cosine similarity. | True |
cut_off_value | float, optional (default: 0.6) | Threshold for cosine similarity when filtering genes. | 0.6 |
save | bool, optional (default: True) | Whether to save the result files. | True |
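
The `normalization`, `filter_gene`, and `cut_off_value` parameters refer to CPM normalization and cosine-similarity filtering. The sketch below illustrates both ideas in plain Python. It is not the library's implementation: the helper names (`cpm`, `cosine_similarity`, `filter_genes`), the choice to compare per-gene profiles across samples, and the `>=` comparison against the cut-off are all assumptions made for illustration.

```python
import math


def cpm(counts):
    """Scale one sample's raw counts so that they sum to one million."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [c * 1e6 / total for c in counts]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def filter_genes(observed, reference, cut_off_value=0.6):
    """Keep genes whose observed and reference profiles reach the cut-off.

    `observed` and `reference` are hypothetical dicts mapping a gene name to
    its expression values across the same ordered set of samples.
    """
    return [
        gene
        for gene in observed
        if gene in reference
        and cosine_similarity(observed[gene], reference[gene]) >= cut_off_value
    ]
```

With this sketch, a gene whose two profiles are proportional has a similarity close to 1 and is kept, while a gene whose profiles are orthogonal has a similarity of 0 and is dropped at the default cut-off of 0.6.
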
Returns:
Name | Type | Description |
---|---|---|
bulk_adata | AnnData | The processed bulk data with mapping results. |
df | DataFrame | DataFrame containing the mapping of bulk samples to single-cell IDs. |
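
To make the role of `n_cell` concrete, the sketch below turns the cell type fractions of one bulk sample into integer cell counts that sum to `n_cell`. The function name `cells_per_type` and the largest-remainder rounding rule are assumptions for illustration; the source does not state how fractions are converted to counts.

```python
def cells_per_type(fractions, n_cell=100):
    """Convert non-negative cell type fractions into integer counts summing to n_cell."""
    total = sum(fractions.values())
    if total <= 0:
        raise ValueError("fractions must sum to a positive value")
    # Ideal (possibly fractional) number of cells for each type
    exact = {t: n_cell * f / total for t, f in fractions.items()}
    # Round down first, then hand out the remaining cells one at a time
    counts = {t: int(x) for t, x in exact.items()}
    leftover = max(n_cell - sum(counts.values()), 0)
    by_remainder = sorted(exact, key=lambda t: exact[t] - counts[t], reverse=True)
    for t in by_remainder[:leftover]:
        counts[t] += 1
    return counts
```

Largest-remainder rounding is used here only because it guarantees the counts add up to `n_cell`; plain rounding of each fraction can be off by a cell or two.
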
Source code in cytobulk\tools\_mapping.py
def st_mapping(st_adata,
sc_adata,
out_dir,
project,
annotation_key,
**kwargs)
Run spatial transcriptomics mapping with single-cell RNA-seq data.
This function maps spatial transcriptomics (ST) data to single-cell RNA-seq (scRNA-seq) data. It aligns cell type compositions and estimates spatial distributions.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
st_adata | AnnData | An :class: | required |
sc_adata | AnnData | An :class: | required |
seed | int, optional (default: 0) | Seed for random number generation to ensure reproducibility. | 0 |
annotation_key | string | Key in | required |
sc_downsample | bool, optional (default: False) | Whether to downsample scRNA-seq data to a maximum number of transcripts per cell. | False |
scRNA_max_transcripts_per_cell | int, optional (default: 1500) | Maximum number of transcripts per cell for downsampling. | 1500 |
sampling_method | string, optional (default: 'duplicates') | Method for sampling single cells based on cell type composition. | 'duplicates' |
out_dir | string | Directory to save output files. | required |
project | string | Project name for output file naming. | required |
mean_cell_numbers | int, optional (default: 8) | Average number of cells per spot used for estimation. | 8 |
save_reconstructed_st | bool, optional (default: True) | Whether to save the reconstructed spatial transcriptomics data. | True |
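
The `sc_downsample`, `scRNA_max_transcripts_per_cell`, and `seed` parameters describe capping the number of transcripts per cell. The sketch below shows one common way to do this for a single cell: draw transcripts without replacement using a seeded generator. The function name `downsample_cell` and the sampling scheme are assumptions for illustration, not the library's code.

```python
import random


def downsample_cell(counts, max_transcripts=1500, seed=0):
    """Randomly keep at most max_transcripts transcripts from one cell's gene counts."""
    total = sum(counts)
    if total <= max_transcripts:
        return list(counts)  # already under the cap, nothing to do
    rng = random.Random(seed)
    # One entry per transcript, labelled by the index of its gene
    transcripts = [gene for gene, c in enumerate(counts) for _ in range(c)]
    kept = rng.sample(transcripts, max_transcripts)
    out = [0] * len(counts)
    for gene in kept:
        out[gene] += 1
    return out
```

Because the generator is seeded, calling the function twice with the same inputs gives the same downsampled counts.
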
Returns:
Name | Type | Description |
---|---|---|
reconstructed_sc | DataFrame | DataFrame containing the mapping of single-cell IDs to spatial spot IDs. |
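
The `sampling_method` value 'duplicates' is not explained in this reference. One plausible reading, stated here purely as an assumption, is that a single cell may be drawn more than once when a cell type needs more cells than the reference contains. The sketch below illustrates that reading; `sample_cells` and its arguments are hypothetical names.

```python
import random


def sample_cells(cell_ids_by_type, n_per_type, seed=0):
    """Pick cell IDs per cell type, reusing cells only when the pool is too small."""
    rng = random.Random(seed)
    picked = []
    for cell_type, n in n_per_type.items():
        pool = cell_ids_by_type.get(cell_type, [])
        if not pool or n <= 0:
            continue
        if n <= len(pool):
            # Enough distinct cells: sample without replacement
            picked.extend(rng.sample(pool, n))
        else:
            # Too few cells: sample with replacement, so duplicates appear
            picked.extend(rng.choices(pool, k=n))
    return picked
```
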
Source code in cytobulk\tools\_mapping.py
def he_mapping(image_dir,
out_dir,
project,
lr_data = None,
sc_adata = None,
annotation_key="curated_celltype",
k_neighbor=30,
alpha=0.5,
mapping_sc=True,
**kwargs)
Run H&E-stained image cell type mapping with single-cell RNA-seq data.
This function predicts cell types from H&E-stained histology images and aligns them with single-cell RNA-seq (scRNA-seq) data using optimal transport. It computes spatial distributions and matches cell types between the image and single-cell data.
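
For orientation, the Fused Gromov-Wasserstein problem is often written in the form below. The notation is an assumption and follows the common convention in which `alpha` weights the structure term: $M$ is a feature cost matrix between image cells and single cells, $C^{(1)}$ and $C^{(2)}$ are the within-dataset structure (graph distance) matrices, and $T$ is a coupling with marginals $p$ and $q$. The exact formulation used by this function is not stated in this reference.

```latex
\min_{T \in \Pi(p, q)} \;
(1 - \alpha) \sum_{i,j} M_{ij} \, T_{ij}
\; + \;
\alpha \sum_{i,j,k,l} \left| C^{(1)}_{ik} - C^{(2)}_{jl} \right|^{2} T_{ij} \, T_{kl}
```

Under this convention, `alpha = 0` reduces to a plain feature-matching transport problem and `alpha = 1` to a pure Gromov-Wasserstein (structure-only) problem.
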
Parameters:
Name | Type | Description | Default |
---|---|---|---|
image_dir | str | Path to the directory containing H&E-stained images. | required |
out_dir | str | Directory where the output files will be saved. | required |
project | str | Name of the project, used for naming output files. | required |
lr_data | pandas.DataFrame, optional (default: None) | A DataFrame containing ligand-receptor pair data with columns 'ligand' and 'receptor'. | None |
sc_adata | anndata.AnnData, optional (default: None) | An :class: | None |
annotation_key | str, optional (default: "curated_celltype") | Key in | 'curated_celltype' |
k_neighbor | int, optional (default: 30) | Number of neighbors to consider when constructing the graph for H&E image data. | 30 |
alpha | float, optional (default: 0.5) | Trade-off parameter for the Fused Gromov-Wasserstein optimal transport, controlling the balance between graph structure and feature matching (value between 0 and 1). | 0.5 |
mapping_sc | bool, optional (default: True) | Whether to perform mapping between H&E image cell data and single-cell RNA-seq data. If False, only H&E image cell type predictions are returned. | True |
**kwargs | dict | Additional arguments (not used in this implementation). | {} |
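
The `k_neighbor` parameter controls how many neighbors each cell is connected to when the graph is built from the H&E image data. The sketch below shows a brute-force k-nearest-neighbor graph over 2D cell coordinates. The function name `knn_graph`, the Euclidean metric, and the adjacency-list output are assumptions for illustration; the library may build its graph differently.

```python
import math


def knn_graph(coords, k_neighbor=30):
    """Return, for each cell, the indices of its k nearest cells by Euclidean distance."""
    n = len(coords)
    k = min(k_neighbor, n - 1)  # a cell cannot have more neighbors than other cells
    neighbors = []
    for i, point in enumerate(coords):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(point, coords[j]),
        )
        neighbors.append(others[:k])
    return neighbors
```

This brute-force version is quadratic in the number of cells and is meant only to show what the parameter means; a spatial index would be the usual choice for whole-slide images.
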
Returns:
Name | Type | Description |
---|---|---|
cell_coordinates | DataFrame | DataFrame containing cell coordinates and their predicted cell types from H&E-stained images. |
df | DataFrame | DataFrame containing matching results between H&E image cells and single-cell data, including spatial coordinates, cell types, and matched single-cell IDs. |
filtered_adata | AnnData | A filtered :class: |
Source code in cytobulk\tools\_mapping.py