# Setting up ## Bioinformatics Tools We are going to be using the following tools throughout the course: - `fastqc` - `multiqc` - `trimmomatic` - `bwa` - `samtools` (version > 1.0) - `bcftools` (version > 1.0) - `singularity` - `nextflow ` The easiest way to install all of the above is through [`conda`](https://conda.io/miniconda.html) as follows: 1. Download and install the latest version of miniconda: ``` curl -O -L https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh bash Miniconda3-latest-Linux-x86_64.sh ``` 2. Add the correct channels in the conda (so that it knows where to look for packages): ``` conda config --add channels defaults conda config --add channels bioconda conda config --add channels conda-forge ``` 3. Install the required tools: ``` conda install -c bioconda fastqc multiqc trimmomatic bwa samtools bcftools singularity nextflow ``` You might also want to download and install [`IGV`](https://software.broadinstitute.org/software/igv/download), as it may be useful in several visualizations. ## Base Tools Finally, the following tools are in all likelihood already installed in most major OSs, but you should also check for them: - `git` - `curl` - `gunzip` - `java` ## R / RStudio [R](http://www.r-project.org/) is a programming language that is especially powerful for data exploration, visualization, and statistical analysis. To interact with R, we use [RStudio](http://www.rstudio.com/). **Windows** Install R by downloading and running the [correct installer file](http://cran.r-project.org/bin/windows/base/release.htm) from [CRAN](http://cran.r-project.org/index.html). Also, please install the [RStudio IDE](http://www.rstudio.com/ide/download/desktop). Note that if you have separate user and admin accounts, you should run the installers as administrator (right-click on .exe file and select "Run as administrator" instead of double-clicking). Otherwise problems may occur later, for example when installing R packages. **Linux** You can download the binary files for your distribution from [CRAN](http://cran.r-project.org/index.html). Or you can use your package manager (e.g. for Debian/Ubuntu run sudo apt-get install r-base and for Fedora run sudo yum install R). Also, please install the [RStudio IDE](http://www.rstudio.com/ide/download/desktop). ### Install the required packages We will install some specific packages as well as all packages that might be needed / useful. ``` install.packages(c("tidyverse", "ggplot2", "rafalib", "caret", "ggpubr", "GGally", "ROCR", "Boruta", "party", "earth", "mlbench", "glmnet", "e1071", "randomForest", "neuralnet"), dependencies=c("Depends", "Suggests") ); ``` Also install bioconductor, and some bioconductor-related libraries. ``` source("https://bioconductor.org/biocLite.R") BiocManager::install() BiocManager::install("factoextra") BiocManager::install("fpc") BiocManager::install("knitr") BiocManager::install("kableExtra") BiocManager::install("TxDb.Hsapiens.UCSC.hg38.knownGene") BiocManager::install("BSgenome.Hsapiens.UCSC.hg38") BiocManager::install("DiffBind") BiocManager::install('MotifDb') BiocManager::install('methylKit') BiocManager::install('genomation') BiocManager::install('ggplot2') BiocManager::install('TxDb.Mmusculus.UCSC.mm10.knownGene') BiocManager::install("AnnotationHub") BiocManager::install("annotatr") BiocManager::install("bsseq") BiocManager::install("DSS") ``` That's it, all done! You have now all the tools in place!