.. _manual-main:

=====================================
Welcome to Snakemake's documentation!
=====================================

.. image:: https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg
    :target: https://bioconda.github.io/recipes/snakemake/README.html

.. image:: https://img.shields.io/pypi/pyversions/snakemake.svg
    :target: https://www.python.org

.. image:: https://img.shields.io/pypi/v/snakemake.svg
    :target: https://pypi.python.org/pypi/snakemake

.. image:: https://quay.io/repository/snakemake/snakemake/status
    :target: https://quay.io/repository/snakemake/snakemake

.. image:: https://img.shields.io/circleci/project/bitbucket/snakemake/snakemake.svg
    :target: https://circleci.com/bb/snakemake/snakemake/tree/master

.. image:: https://img.shields.io/badge/stack-overflow-orange.svg
    :target: http://stackoverflow.com/questions/tagged/snakemake

Snakemake is an MIT-licensed workflow management system that aims to reduce the complexity of creating workflows by providing a fast and comfortable execution environment, together with a clean and modern specification language in Python style.
Snakemake workflows are essentially Python scripts extended by declarative code to define **rules**.
Rules describe how to create **output files** from **input files**.


.. _manual-quick_example:

-------------
Quick Example
-------------

.. code-block:: python

    rule targets:
        input:
            "plots/dataset1.pdf",
            "plots/dataset2.pdf"

    rule plot:
        input:
            "raw/{dataset}.csv"
        output:
            "plots/{dataset}.pdf"
        shell:
            "somecommand {input} {output}"


* Similar to GNU Make, you specify targets in terms of a pseudo-rule at the top.
* For each target and intermediate file, you create rules that define how they are created from input files.
* Snakemake determines the rule dependencies by matching file names.
* Input and output files can contain multiple named wildcards.
* Rules can use shell commands, plain Python code, or external Python or R scripts to create output files from input files.
* Snakemake workflows can be executed on workstations and clusters without modification. Job scheduling can be constrained by arbitrary resources such as available CPU cores, memory or GPUs.
* Snakemake can use Amazon S3, Google Storage, Dropbox, FTP and SFTP to access input or output files. Input files can additionally be accessed via HTTP and HTTPS.
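
To make the file name matching more concrete, the following plain Python sketch shows how a requested target such as ``plots/dataset1.pdf`` could be matched against the output pattern of rule ``plot`` to infer the wildcard value and the required input file.
This is only an illustration and not Snakemake's actual implementation: the helper names ``pattern_to_regex`` and ``infer_input`` are hypothetical, and the sketch assumes that each wildcard name occurs only once in the output pattern.

.. code-block:: python

    import re


    def pattern_to_regex(pattern):
        """Compile a pattern like 'plots/{dataset}.pdf' into a regex
        with one named group per wildcard (hypothetical helper)."""
        # Splitting on a single capture group alternates between
        # literal text (even indices) and wildcard names (odd indices).
        parts = re.split(r"\{(\w+)\}", pattern)
        regex = ""
        for i, part in enumerate(parts):
            if i % 2 == 0:
                regex += re.escape(part)
            else:
                regex += "(?P<{}>.+)".format(part)
        return re.compile(regex)


    def infer_input(target, output_pattern, input_pattern):
        """Return the input file needed to create target,
        or None if the output pattern does not match."""
        match = pattern_to_regex(output_pattern).fullmatch(target)
        if match is None:
            return None
        # Reuse the inferred wildcard values to fill in the input pattern.
        return input_pattern.format(**match.groupdict())


    print(infer_input("plots/dataset1.pdf", "plots/{dataset}.pdf", "raw/{dataset}.csv"))

Here, the target ``plots/dataset1.pdf`` matches ``plots/{dataset}.pdf`` with ``dataset = dataset1``, so the sketch derives ``raw/dataset1.csv`` as the input.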

.. _main-getting-started:

---------------
Getting started
---------------

To get started, consider the :ref:`tutorial <tutorial-welcome>`, the `introductory slides <http://slides.com/johanneskoester/snakemake-tutorial-2016>`_, and the :ref:`FAQ <project_info-faq>`.

.. _main-support:

-------
Support
-------

* In case of questions, please post on `Stack Overflow <http://stackoverflow.com/questions/tagged/snakemake>`_.
* To discuss with other Snakemake users, you can use the `mailing list <https://groups.google.com/forum/#!forum/snakemake>`_.
* For bugs and feature requests, please use the `issue tracker <https://bitbucket.org/snakemake/snakemake/issues>`_.
* For contributions, visit Snakemake on `Bitbucket <https://bitbucket.org/snakemake/snakemake>`_ and read the :ref:`guidelines <project_info-contributing>`.

--------
Citation
--------

`Köster, Johannes and Rahmann, Sven. "Snakemake - A scalable bioinformatics workflow engine". Bioinformatics 2012. <http://bioinformatics.oxfordjournals.org/content/28/19/2520>`_

See :doc:`Citations <project_info/citations>` for more information.

----------------
Related Projects
----------------

`Snakemake Wrappers Repository <https://snakemake-wrappers.readthedocs.org>`_
    The Snakemake Wrappers Repository is a collection of reusable wrappers that make it quick to use popular command line tools from Snakemake rules and workflows.

`Snakemake Workflow Repository <https://bitbucket.org/snakemake/snakemake-workflows>`_
    This repository provides a collection of high-quality, modularized and reusable rules and workflows.
    The provided code should also serve as a best-practice example of how to build production-ready workflows with Snakemake.
    Everybody is invited to contribute.

`Bioconda <https://bioconda.github.io/>`_
    Bioconda can be used from Snakemake to create fully reproducible workflows by pinning the software versions used and providing binaries.


.. _project_info-publications_using:

----------------------------
Publications using Snakemake
----------------------------

The following is an incomplete list of publications that use Snakemake for their analyses.
Please consider adding your own.

* Etournay et al. 2016. `TissueMiner: a multiscale analysis toolkit to quantify how cellular processes create tissue dynamics <https://elifesciences.org/content/5/e14334>`_. eLife Sciences.
* Townsend et al. 2016. `The Public Repository of Xenografts Enables Discovery and Randomized Phase II-like Trials in Mice <http://www.cell.com/cancer-cell/abstract/S1535-6108%2816%2930090-3>`_. Cancer Cell.
* Burrows et al. 2016. `Genetic Variation, Not Cell Type of Origin, Underlies the Majority of Identifiable Regulatory Differences in iPSCs <http://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1005793>`_. PLOS Genetics.
* Ziller et al. 2015. `Coverage recommendations for methylation analysis by whole-genome bisulfite sequencing <http://www.nature.com/nmeth/journal/v12/n3/full/nmeth.3152.html>`_. Nature Methods.
* Li et al. 2015. `Quality control, modeling, and visualization of CRISPR screens with MAGeCK-VISPR <https://genomebiology.biomedcentral.com/articles/10.1186/s13059-015-0843-6>`_. Genome Biology.
* Schmied et al. 2015. `An automated workflow for parallel processing of large multiview SPIM recordings <http://bioinformatics.oxfordjournals.org/content/32/7/1112>`_. Bioinformatics.
* Chung et al. 2015. `Whole-Genome Sequencing and Integrative Genomic Analysis Approach on Two 22q11.2 Deletion Syndrome Family Trios for Genotype to Phenotype Correlations <http://onlinelibrary.wiley.com/doi/10.1002/humu.22814/full>`_. Human Mutation.
* Kim et al. 2015. `TUT7 controls the fate of precursor microRNAs by using three different uridylation mechanisms <http://emboj.embopress.org/content/34/13/1801.long>`_. The EMBO Journal.
* Park et al. 2015. `Ebola Virus Epidemiology, Transmission, and Evolution during Seven Months in Sierra Leone <http://doi.org/10.1016/j.cell.2015.06.007>`_. Cell.
* Břinda et al. 2015. `RNF: a general framework to evaluate NGS read mappers <http://bioinformatics.oxfordjournals.org/content/early/2015/09/30/bioinformatics.btv524>`_. Bioinformatics.
* Břinda et al. 2015. `Spaced seeds improve k-mer-based metagenomic classification <http://bioinformatics.oxfordjournals.org/content/early/2015/08/10/bioinformatics.btv419>`_. Bioinformatics.
* Spjuth et al. 2015. `Experiences with workflows for automating data-intensive bioinformatics <http://www.biologydirect.com/content/10/1/43>`_. Biology Direct.
* Schramm et al. 2015. `Mutational dynamics between primary and relapse neuroblastomas <http://www.nature.com/ng/journal/v47/n8/full/ng.3349.html>`_. Nature Genetics.
* Bray et al. 2015. `Near-optimal RNA-Seq quantification <http://arxiv.org/abs/1505.02710>`_. arXiv preprint.
* Berulava et al. 2015. `N6-Adenosine Methylation in MiRNAs <http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0118438>`_. PLOS ONE.
* The Genome of the Netherlands Consortium 2014. `Whole-genome sequence variation, population structure and demographic history of the Dutch population <http://www.nature.com/ng/journal/v46/n8/full/ng.3021.html>`_. Nature Genetics.
* Patterson et al. 2014. `WhatsHap: Haplotype Assembly for Future-Generation Sequencing Reads <http://online.liebertpub.com/doi/10.1089/cmb.2014.0157>`_. Journal of Computational Biology.
* Fernández et al. 2014. `H3K4me1 marks DNA regions hypomethylated during aging in human stem and differentiated cells <http://genome.cshlp.org/content/25/1/27.long>`_. Genome Research.
* Köster et al. 2014. `Massively parallel read mapping on GPUs with the q-group index and PEANUT <https://peerj.com/articles/606/>`_. PeerJ.
* Chang et al. 2014. `TAIL-seq: Genome-wide Determination of Poly(A) Tail Length and 3′ End Modifications <http://www.cell.com/molecular-cell/abstract/S1097-2765(14)00121-X>`_. Molecular Cell.
* Althoff et al. 2013. `MiR-137 functions as a tumor suppressor in neuroblastoma by downregulating KDM1A <http://onlinelibrary.wiley.com/doi/10.1002/ijc.28091/abstract;jsessionid=33613A834E2A2FDCCA49246C23DF777E.f04t02>`_. International Journal of Cancer.
* Marschall et al. 2013. `MATE-CLEVER: Mendelian-Inheritance-Aware Discovery and Genotyping of Midsize and Long Indels <http://bioinformatics.oxfordjournals.org/content/29/24/3143.long>`_. Bioinformatics.
* Rahmann et al. 2013. `Identifying transcriptional miRNA biomarkers by integrating high-throughput sequencing and real-time PCR data <http://www.sciencedirect.com/science/article/pii/S1046202312002605>`_. Methods.
* Martin et al. 2013. `Exome sequencing identifies recurrent somatic mutations in EIF1AX and SF3B1 in uveal melanoma with disomy 3 <http://www.nature.com/ng/journal/v45/n8/full/ng.2674.html>`_. Nature Genetics.
* Czeschik et al. 2013. `Clinical and mutation data in 12 patients with the clinical diagnosis of Nager syndrome <http://link.springer.com/article/10.1007%2Fs00439-013-1295-2>`_. Human Genetics.
* Marschall et al. 2012. `CLEVER: Clique-Enumerating Variant Finder <http://bioinformatics.oxfordjournals.org/content/28/22/2875.long>`_. Bioinformatics.


.. toctree::
   :caption: Installation
   :name: installation
   :hidden:
   :maxdepth: 1

   getting_started/installation
   getting_started/examples


.. toctree::
   :caption: Tutorial
   :name: tutorial
   :hidden:
   :maxdepth: 1

   tutorial/welcome
   tutorial/basics
   tutorial/advanced
   tutorial/additional_features

.. toctree::
  :caption: Executing workflows
  :name: execution
  :hidden:
  :maxdepth: 1

  executable.rst

.. toctree::
    :caption: Defining workflows
    :name: snakefiles
    :hidden:
    :maxdepth: 1

    snakefiles/writing_snakefiles
    snakefiles/rules
    snakefiles/configuration
    snakefiles/modularization
    snakefiles/remote_files
    snakefiles/utils
    snakefiles/deployment


.. toctree::
    :caption: API Reference
    :name: api-reference
    :hidden:
    :maxdepth: 1

    api_reference/snakemake
    api_reference/snakemake_utils


.. toctree::
    :caption: Project Info
    :name: project-info
    :hidden:
    :maxdepth: 1

    project_info/citations
    project_info/more_resources
    project_info/faq
    project_info/contributing
    project_info/authors
    project_info/history
    project_info/license
