Reproducibility
This guide shows how to regenerate the figures and model outputs in this repository, including the batch scripts used for the full Midway3 run.
Prerequisites
- Install the package (PyPI or source). See Installation.
- Prepare data as described in Data. Paths below assume the repository layout (data/ and results/).
End-to-end workflow (single motif/RBP)
1) Scan PWM
accmix scan \
-f data/fasta/test.fa \
-p data/pwms/M00124_example.txt \
-o results/M00124_example
2) Compute s_l (accessibility)
accmix annotate-acc \
-n data/AS/ANC1C.hisat3n_table.bed6 \
-f data/AS/ANC1xC.hisat3n_table.bed6 \
-t results/M00124_example_topA.tsv.gz \
-o results/M00124_example_sl.parquet \
--M 50 --N 500
3) Annotate TSS / conservation / TPM
accmix annotate-tss \
-i results/M00124_example_sl.parquet \
-o results/M00124_example_annotated.parquet \
-r data/evaluation/RNAseq_HeLa_TPM.parquet \
-c data/evaluation/phastCons100way.bed.gz \
-p data/evaluation/phastCons100way.parquet \
-y data/evaluation/phyloP100way.parquet \
-R data/fasta/test.fa
4) Fit EM model
accmix model \
-i results/M00124_example_annotated.parquet \
-o results/RBP_Motif \
-r ExampleRBP \
-m M00124
5) Evaluate
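Run evaluation directly on the model parquet written in step 4. In the command below, XXXXXX stands for the run-specific part of the model file name: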
accmix evaluate \
-M results/RBP_Motif.XXXXXX.model.parquet \
-b data/clipseq/ELAVL1_HeLa.bed \
-p data/evaluation/PIPseq_HeLa.parquet \
-r ExampleRBP \
-m M00124 \
-o results \
-L data/logos/M00124_fwd.png \
-t 1.0 \
-R 50
Outputs include evaluation plots and tables. Example artifacts from this repo:
- Heatmap: assets/figures/ExampleRBP_M00124_heatmap_with_title_logo.png
- Distribution: assets/figures/ExampleRBP_M00124.dist.val.png
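Steps 1 to 4 above pass files from one command to the next, so it can help to derive every intermediate path from a single prefix. The following is a minimal sketch of that idea, not part of the package: the function name `single_motif_commands` and its parameters are hypothetical, and it assumes that `accmix scan -o <prefix>` writes `<prefix>_topA.tsv.gz`, as the input of step 2 suggests. It only builds and prints the commands; step 5 is left out because the model file name contains a run-specific part.

```python
import shlex


def single_motif_commands(prefix="results/M00124_example",
                          model_prefix="results/RBP_Motif",
                          rbp="ExampleRBP",
                          motif="M00124"):
    """Build the argv lists for steps 1-4, deriving intermediate paths from `prefix`.

    Hypothetical helper: flags and data paths are copied from the commands above.
    """
    top_hits = f"{prefix}_topA.tsv.gz"        # assumed output name of `accmix scan`
    sl = f"{prefix}_sl.parquet"               # output of step 2, input of step 3
    annotated = f"{prefix}_annotated.parquet" # output of step 3, input of step 4
    return [
        ["accmix", "scan",
         "-f", "data/fasta/test.fa",
         "-p", "data/pwms/M00124_example.txt",
         "-o", prefix],
        ["accmix", "annotate-acc",
         "-n", "data/AS/ANC1C.hisat3n_table.bed6",
         "-f", "data/AS/ANC1xC.hisat3n_table.bed6",
         "-t", top_hits,
         "-o", sl,
         "--M", "50", "--N", "500"],
        ["accmix", "annotate-tss",
         "-i", sl,
         "-o", annotated,
         "-r", "data/evaluation/RNAseq_HeLa_TPM.parquet",
         "-c", "data/evaluation/phastCons100way.bed.gz",
         "-p", "data/evaluation/phastCons100way.parquet",
         "-y", "data/evaluation/phyloP100way.parquet",
         "-R", "data/fasta/test.fa"],
        ["accmix", "model",
         "-i", annotated,
         "-o", model_prefix,
         "-r", rbp,
         "-m", motif],
    ]


if __name__ == "__main__":
    # Dry run: print each command as a shell-quoted line instead of executing it.
    for cmd in single_motif_commands():
        print(shlex.join(cmd))
```

To execute the steps rather than print them, each list could be passed to `subprocess.run(cmd, check=True)` from the repo root, which stops at the first failing step.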
Batch runs used for figures
The figures in this repository were generated by running two batch scripts from the repo root:
1) Fit models across Midway3 parquets
python scripts/run_accmix_models_over_midway3_results.py
- Discovers data/midway3_results/*.parquet and matching CLIP beds.
- Calls accmix model internals for each RBP/motif pair.
- Writes per-run model parquet/JSON into results/.
2) Evaluate all fitted models
python scripts/run_accmix_evaluation_over_midway3_results.py
- For each RBP/motif pair, finds the latest results/<RBP>_<Motif>*.model.parquet.
- Invokes accmix evaluate -M <model.parquet> -b <CLIP.bed> -p <pipseq.parquet> -r <RBP> -m <Motif>.
- Plots and logs are written under results/plots/ and results/logs/.
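The "find the latest model, then evaluate" step can be illustrated with a short sketch. This is not the code of the batch script: the helper names `latest_model` and `evaluate_command` are hypothetical, and "latest" is assumed here to mean the most recently modified file matching the glob pattern, which may differ from how the script actually orders runs.

```python
from pathlib import Path
from typing import List, Optional


def latest_model(results_dir: str, rbp: str, motif: str) -> Optional[Path]:
    """Return the newest <RBP>_<Motif>*.model.parquet in results_dir, or None."""
    candidates = list(Path(results_dir).glob(f"{rbp}_{motif}*.model.parquet"))
    if not candidates:
        return None
    # Assumption: "latest" means most recently modified on disk.
    return max(candidates, key=lambda p: p.stat().st_mtime)


def evaluate_command(model: Path, clip_bed: str, pipseq: str,
                     rbp: str, motif: str) -> List[str]:
    """Build the `accmix evaluate` argv with the flags listed above."""
    return ["accmix", "evaluate",
            "-M", str(model),
            "-b", clip_bed,
            "-p", pipseq,
            "-r", rbp,
            "-m", motif]
```

A pair with no matching model file returns `None`, so a caller can skip it instead of failing on a missing path.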
Running these two scripts sequentially reproduces the model fits and evaluation plots shown in the repository.
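As a convenience, the two scripts can be chained so that evaluation only starts if model fitting succeeded. The wrapper below is a sketch under that assumption; `run_batch` and its `runner` parameter are hypothetical names, and only the two script paths come from this guide.

```python
import subprocess
import sys

# Script paths as given above; run from the repo root.
BATCH_SCRIPTS = [
    "scripts/run_accmix_models_over_midway3_results.py",
    "scripts/run_accmix_evaluation_over_midway3_results.py",
]


def run_batch(scripts=BATCH_SCRIPTS, runner=subprocess.run):
    """Run each script in order with the current Python interpreter."""
    for script in scripts:
        # check=True raises CalledProcessError on a non-zero exit code,
        # so the evaluation script never runs after a failed model fit.
        runner([sys.executable, script], check=True)
```

Calling `run_batch()` from the repo root runs both scripts in order. The `runner` argument exists only so the sequencing can be exercised without launching the real jobs.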