CLI¶
The main entry point is accmix with these subcommands. Run accmix <cmd> --help for the authoritative list. Below are the parameters, defaults, and meanings.
accmix scan¶
Scan a genome FASTA with a PWM and compute inner/outer scores.
accmix scan -f <fasta> -p <pwm.txt> [-o <out_prefix>] [options]
Options
-f, --fasta PATH(required): Genome FASTA (.fa/.fa.gz).-p, --pwm PATH(required): PWM text file with columns Pos,A,C,G,T/U.-o, --out-prefix PATH: Output prefix (default:results/<pwm>.<timestamp>).-M INT(default 50): Inner half-width in bp.-N INT(default 500): Outer half-width in bp; must be >M.-A INT(default 2_000_000): Keep top-A entries per strand.-B INT(default 2_000_000): Keep bottom-B entries per strand.-g, --bg "A,C,G,T": Background base probabilities for LLR (overridden by-G).-G, --bg-from-fasta: Estimate background from the provided FASTA.
Outputs: results/<prefix>_topA.tsv.gz, results/<prefix>_botB.tsv.gz.
accmix annotate-acc¶
Compute accessibility-derived s_l from AS native/fixed and PWM top table.
accmix annotate-acc -n <ASnative> -f <ASfixed> -t <toptable> [-o <out>] [options]
Options
-n, --ASnative PATH(required): AS native file (chr, position, strand, score, depth, motif info).-f, --ASfixed PATH(required): AS fixed BED6 (chr, start, end, strand, score, depth).-t, --toptable PATH(required): PWM scan top-table TSV.gz with inner/outer scores.-o, --out PATH: Output parquet (default:results/sl.<timestamp>.parquet).-M INT(default 50): Inner half-width used in scanning.-N INT(default 500): Outer half-width used in scanning.
Output: parquet with s_l per site.
accmix annotate-tss¶
Add TSS proximity, conservation, and TPM annotations to sites.
accmix annotate-tss -i <input.parquet> [-o <out>] [options]
Options
-i, --input-parquet PATH(required): Input parquet (e.g., output ofannotate-acc).-o, --out PATH: Output parquet (default:results/annotated.<timestamp>.parquet).-r, --rna-seq-parquet PATH(defaultdata/expression/ENCFF364YCB_HeLa_RNAseq_Transcripts_count_curated.parquet): Transcript TPM parquet.-c, --phastcons-bed PATH(defaultdata/conservation_score/phastCons100way1.bed): phastCons BED for regional conservation.-p, --phastcons-parquet PATH(defaultdata/conservation_score/hg38.phastCons100way.1.parquet): Base-level phastCons scores.-y, --phylop-parquet PATH(defaultdata/conservation_score/hg38.phyloP100way.1.parquet): Base-level phyloP scores.-R, --reference TEXT(defaultGRCh38): Reference genome name.
Output: parquet with added TSS distance, conservation, and expression features.
accmix model¶
Fit Gaussian EM with logistic prior and write priors/posteriors plus model JSON.
accmix model -i <annotated.parquet> -o <out_prefix> -r <RBP> -m <MotifID> [options]
Options
-i, --input-parquet PATH(required): Annotated parquet withid,s_l, and features.-o, --out-prefix PATH: Output prefix (default:results/<rbp>_<motif>.<timestamp>).-r, --rbp-name TEXT(defaultRBP): RBP label for outputs.-m, --motif-id TEXT(defaultMotif): Motif ID label for outputs.-F, --feat-cols TEXT: Comma-separated feature names (default: built-in set).-s, --site-col TEXT(defaultid): Site identifier column.-S, --s-col TEXT(defaults_l): Accessibility score column.-M, --mode TEXT(defaultem): Modeling mode (currently onlyem).
Outputs: <prefix>.model.parquet (priors/posteriors) and <prefix>.model.json (parameters).
accmix evaluate¶
Evaluate fitted models against CLIP/PIP-seq using the model parquet directly.
accmix evaluate \
-M <model.parquet> \
-b <CLIP.bed> \
-p <PIPseq.parquet> \
-r <RBP_Name> \
-m <Motif_ID> \
-o results \
[-L <logo.png>] [-t 1.0] [-R 50]
Options
-M, --model-parquet PATH(required): Parquet produced byaccmix model(containsprior_p/posterior_r).-b, --clipseq-bed PATH(required): CLIP-seq peaks BED.-p, --pipseq-parquet PATH(required): PIP-seq parquet.-r, --rbp-name TEXT: RBP label.-m, --motif-id TEXT: Motif identifier.-o, --output-root PATH: Output directory (defaultresults).-L, --motif-logo PATH: Optional motif logo for heatmaps.-t, --score-phastcons100-threshold FLOAT: Conservation cutoff (default 1.0).-R, --motif-range INT: Half-window around site for overlaps (default 50).
Outputs: plots under results/plots/, logs and tables under results/logs/.