Annotation Input
Annotations map categorical terms to the labels present in your matrix. HiMaLAYAS first restricts each term's labels to the matrix label universe, then drops terms whose remaining label count falls outside the term-size limits.
Signature

    Annotations(
        term_to_labels: Dict[str, Iterable[str]],
        matrix: Matrix,
        *,
        min_term_size: int = 2,
        max_term_size: int | None = None,
    )

Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `term_to_labels` | `Dict[str, Iterable[str]]` | required | Mapping from term to labels (genes, recipes, proteins). |
| `matrix` | `Matrix` | required | Provides the label universe. |
| `min_term_size` | `int` | `2` | Minimum number of matrix-overlapping labels required for a term to be retained. |
| `max_term_size` | `int \| None` | `None` | Maximum number of matrix-overlapping labels allowed for a term to be retained. |
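
The interaction between the label universe and the two size parameters can be illustrated in plain Python. This is a minimal sketch under stated assumptions, not the HiMaLAYAS implementation: `filter_terms` is a hypothetical helper, the matrix universe is represented as a plain set of labels, and the size bounds are assumed to be inclusive.

```python
from typing import Dict, Iterable, Optional, Set


def filter_terms(
    term_to_labels: Dict[str, Iterable[str]],
    matrix_labels: Set[str],
    min_term_size: int = 2,
    max_term_size: Optional[int] = None,
) -> Dict[str, Set[str]]:
    """Hypothetical helper mirroring the documented filtering order."""
    retained: Dict[str, Set[str]] = {}
    for term, labels in term_to_labels.items():
        # Step 1: keep only labels that exist in the matrix universe.
        overlap = set(labels) & matrix_labels
        # Step 2: apply size limits to the overlap (bounds assumed inclusive).
        if len(overlap) < min_term_size:
            continue
        if max_term_size is not None and len(overlap) > max_term_size:
            continue
        retained[term] = overlap
    return retained


matrix_labels = {"GEM1", "PTC1", "ARP2", "VPS15"}
term_to_labels = {
    "mitochondrion inheritance": ["GEM1", "PTC1", "ARP2", "ADY3"],
    "vacuole inheritance": ["VPS15", "VAC17", "PEP7", "VPS3"],
}
retained = filter_terms(term_to_labels, matrix_labels)
print(sorted(retained))  # ['mitochondrion inheritance']
```

In this toy universe, "vacuole inheritance" overlaps the matrix in only one label, so it falls below the default `min_term_size` of 2 and is dropped.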
Expected Mapping Format

    term_to_labels = {
        "mitochondrion inheritance": ["GEM1", "PTC1", "ARP2", "ADY3"],
        "vacuole inheritance": ["VPS15", "VAC17", "PEP7", "VPS3"],
    }

Each key is an annotation term, and each value is a list of matrix label IDs. Labels must match matrix row labels exactly.
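
Because matching is exact, differences in case or stray whitespace are enough to make a label miss the matrix. The sketch below is one way to spot such labels before building the object. It is plain Python: `unmatched_labels` is a hypothetical helper, and the matrix row labels are assumed to be available as a set of strings.

```python
from typing import Dict, Iterable, List, Set


def unmatched_labels(
    term_to_labels: Dict[str, Iterable[str]],
    matrix_labels: Set[str],
) -> Dict[str, List[str]]:
    """Hypothetical helper: for each term, list labels absent from the matrix."""
    report: Dict[str, List[str]] = {}
    for term, labels in term_to_labels.items():
        missing = sorted(set(labels) - matrix_labels)
        if missing:
            report[term] = missing
    return report


matrix_labels = {"GEM1", "PTC1", "ARP2", "ADY3"}
term_to_labels = {
    # "ptc1" differs in case and "ARP2 " has a trailing space.
    "mitochondrion inheritance": ["GEM1", "ptc1", "ARP2 ", "ADY3"],
}
report = unmatched_labels(term_to_labels, matrix_labels)
print(report)  # {'mitochondrion inheritance': ['ARP2 ', 'ptc1']}
```

A non-empty report usually points to an identifier-format mismatch to fix in the mapping or the matrix before constructing `Annotations`.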
Common Attributes
| Attribute | Type | Description |
|---|---|---|
| `annotations.matrix_labels` | `set[str]` | Matrix label universe used to filter incoming annotation labels. |
| `annotations.term_to_labels` | `Dict[str, set[str]]` | Filtered term-to-label mapping retained on the object after dropping labels not present in the matrix. |
Common Methods
| Method | Description |
|---|---|
| `annotations.terms` | Returns retained annotation terms after overlap filtering. |
| `annotations.rebind(matrix)` | Returns a new Annotations object aligned to a new matrix label universe (useful for zoom and subset workflows). |
terms
Returns retained annotation terms after overlap filtering.
rebind
Returns a new Annotations object aligned to a new matrix label universe.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `matrix` | `Matrix` | required | New matrix whose label universe will be used to re-filter terms. |
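
To see why re-filtering matters in zoom and subset workflows, the sketch below compares which terms survive against a full label universe and against a subset of it. It is plain Python and does not call `rebind`: `retained_terms` is a hypothetical stand-in, and it assumes terms are re-filtered from the original mapping with the same `min_term_size`.

```python
from typing import Dict, Iterable, Set


def retained_terms(
    term_to_labels: Dict[str, Iterable[str]],
    universe: Set[str],
    min_term_size: int = 2,
) -> Set[str]:
    """Hypothetical stand-in: terms with enough labels inside `universe`."""
    return {
        term
        for term, labels in term_to_labels.items()
        if len(set(labels) & universe) >= min_term_size
    }


country_to_recipes = {
    "Italy": ["r_001", "r_105", "r_214"],
    "India": ["r_002", "r_003"],
}
full = {"r_001", "r_002", "r_003", "r_105", "r_214"}
subset = {"r_001", "r_105", "r_002"}  # e.g. the labels of one zoomed-in region

before = retained_terms(country_to_recipes, full)
after = retained_terms(country_to_recipes, subset)
print(sorted(before - after))  # ['India']
```

Here "India" is retained against the full universe but keeps only one label in the subset, so it no longer meets the size floor.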
Example (GO Biological Process)

    import json

    from himalayas import Annotations

    # `matrix` is an existing Matrix built from your data (not shown here).
    with open("data/go_bp_name_to_orfs.json", "r", encoding="utf-8") as fh:
        go_bp = json.load(fh)

    annotations = Annotations(go_bp, matrix)

Example (Country to Recipes)

    country_to_recipes = {
        "Italy": ["r_001", "r_105", "r_214"],
        "India": ["r_002", "r_003"],
    }
    annotations = Annotations(country_to_recipes, matrix)

Common Errors
- `Labels for term ... must be an iterable of labels` if a term maps to a string.
- `No annotation terms overlap matrix labels` if none of the labels match the matrix.
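
The first error guards against an easy mistake: a Python `str` is itself iterable, so a bare string value would otherwise be read one character at a time. The sketch below shows a pre-flight check for both conditions in plain Python. `check_mapping` is a hypothetical helper, and its messages are illustrative, not the library's own.

```python
from typing import Dict, Iterable, Set


def check_mapping(
    term_to_labels: Dict[str, Iterable[str]],
    matrix_labels: Set[str],
) -> None:
    """Hypothetical pre-flight check; messages are illustrative only."""
    for term, labels in term_to_labels.items():
        # A bare string is iterable, but it yields single characters.
        if isinstance(labels, (str, bytes)):
            raise TypeError(f"{term!r} maps to a string; wrap it in a list")
    all_labels: Set[str] = set()
    for labels in term_to_labels.values():
        all_labels.update(labels)
    # No shared label at all means no term can survive filtering.
    if not all_labels & matrix_labels:
        raise ValueError("no label in the mapping matches the matrix")


matrix_labels = {"GEM1", "PTC1"}

try:
    check_mapping({"mitochondrion inheritance": "GEM1"}, matrix_labels)
except TypeError as exc:
    print(exc)

try:
    check_mapping({"mitochondrion inheritance": ["gem1"]}, matrix_labels)
except ValueError as exc:
    print(exc)
```

Wrapping a single label in a list (`["GEM1"]`) resolves the first case; the second usually means the mapping and the matrix use different identifier formats.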
Notes
- Terms are filtered by matrix overlap and term size limits.
- Keep `min_term_size=2` as the default floor for analysis-ready runs.