Welcome to simAIRR’s documentation!
simAIRR simulates synthetic adaptive immune receptor repertoire (AIRR) datasets for benchmarking machine learning (ML) methods, while mitigating the undesirable access that ML methods could otherwise have to ground-truth signals in the training datasets. Unlike state-of-the-art approaches, simAIRR constructs antigen-experienced-like baseline repertoires and introduces signals by following the empirical relationship between the generation probability and the sharing pattern of public sequences, with that relationship calibrated on real-world experimental datasets.
Getting started
To get started:
Read a brief overview of simAIRR’s simulation approach under Tool overview
Consult the descriptions of simAIRR arguments
Follow the setup steps under Installation
Work through the Tutorials for examples of different workflows