Setup

Load the packages. tidyverse is a collection of packages for data science, and conflicted turns silent function-name conflicts between packages into errors.

# install.packages("pak")
# pak::pak("tidyverse")
library(conflicted) # for safety
library(tidyverse)
library(ranemone)

Set options for your data and cache directories. See the Get started page for details.

options(ranemone.directory_prefix = "~/db")
options(ranemone.cache_dir = "~/.cache/ranemone")
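To confirm that the options took effect, they can be read back with base R. This is a minimal sketch that uses only getOption(), path.expand(), and dir.exists(); the variable names prefix and cache_dir are illustrative, and the paths are the example values from above.

```r
# Same option names as above; the paths are only examples.
options(ranemone.directory_prefix = "~/db")
options(ranemone.cache_dir = "~/.cache/ranemone")

# Read the options back and expand "~" to the home directory.
prefix = path.expand(getOption("ranemone.directory_prefix"))
cache_dir = path.expand(getOption("ranemone.cache_dir"))

# Report a missing data directory early instead of failing later.
if (!dir.exists(prefix)) {
  message("Data directory not found: ", prefix)
}
```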

Reading files

Read all files with the given name, searching recursively under directory_prefix().

sample_df = ranemone::read_tsv_xz("sample.tsv.xz")
experiment_df = ranemone::read_tsv_xz("experiment.tsv.xz")
community_df = ranemone::read_tsv_xz("community_qc3nn_target.tsv.xz")
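The idea of a recursive read by file name can be sketched in base R. This is not ranemone's implementation: the directory layout, the depth column, and the row-binding step are assumptions made for illustration, and the real function may differ in caching, column types, and how it combines files.

```r
# Hypothetical layout: two run directories, each with its own sample.tsv.xz.
root = file.path(tempdir(), "db-sketch")
for (run in c("run1", "run2")) {
  dir.create(file.path(root, run), recursive = TRUE, showWarnings = FALSE)
  con = xzfile(file.path(root, run, "sample.tsv.xz"), "w")
  write.table(
    data.frame(samplename = paste0(run, "_S1"), depth = 5),
    con, sep = "\t", quote = FALSE, row.names = FALSE
  )
  close(con)
}

# Find every file with exactly this name anywhere below root.
paths = list.files(
  root, pattern = "^sample\\.tsv\\.xz$",
  recursive = TRUE, full.names = TRUE
)

# read.delim() decompresses xz transparently when given a file path.
combined = do.call(rbind, lapply(paths, read.delim))
print(combined)
```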

Using community tables

The community table has many columns, so you may want to select only those needed for your analysis.

dim(community_df)
str(community_df)

comm = community_df |>
  dplyr::select(samplename, family, genus, species, sequence, nreads, ncopiesperml)

dim(comm)
str(comm)
print(comm)

The taxonomy of the species present in the table can be extracted with a single function.

taxonomy = community_df |> ranemone::distinct_taxonomy()
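Conceptually, this kind of extraction keeps one row per unique combination of taxonomic ranks. The sketch below shows that idea in base R on a toy table with invented read counts; it assumes only the family, genus, and species columns, whereas distinct_taxonomy() may return more ranks or a different ordering.

```r
# Toy community table; the nreads values are invented for illustration.
toy = data.frame(
  family = c("Cyprinidae", "Cyprinidae", "Cyprinidae", "Salmonidae"),
  genus = c("Cyprinus", "Cyprinus", "Carassius", "Oncorhynchus"),
  species = c("Cyprinus carpio", "Cyprinus carpio",
              "Carassius auratus", "Oncorhynchus mykiss"),
  nreads = c(120, 30, 45, 80)
)

# Keep one row per unique family/genus/species combination.
taxonomy_sketch = unique(toy[, c("family", "genus", "species")])
print(taxonomy_sketch)
```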

Alpha diversity can be calculated from the ncopiesperml column after summing it for each species within each sample.

species_df = community_df |>
  ranemone::summarize_ncopies(.by = species)

dim(species_df)
str(species_df)
print(species_df)

diversity = ranemone::summarize_alpha_diversity(species_df)

You can replace species with genus, family, or other levels as needed.
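The aggregation and diversity steps above can be illustrated with base R on a toy table. This sketch assumes species richness and the Shannon index, H = -sum(p * log(p)), as the alpha diversity measures; the indices that summarize_alpha_diversity() actually reports may differ, and the data values and the helper shannon() are invented for illustration.

```r
# Toy data: species A appears as two sequences in sample S1.
toy = data.frame(
  samplename = c("S1", "S1", "S1", "S2"),
  species = c("A", "A", "B", "A"),
  ncopiesperml = c(10, 10, 20, 30)
)

# Step 1: sum copies for each species within each sample.
per_species = aggregate(ncopiesperml ~ samplename + species, data = toy, FUN = sum)

# Step 2: compute per-sample indices from the aggregated abundances.
shannon = function(x) {
  p = x[x > 0] / sum(x)
  -sum(p * log(p))
}
richness = tapply(per_species$ncopiesperml, per_species$samplename,
                  function(x) sum(x > 0))
shannon_h = tapply(per_species$ncopiesperml, per_species$samplename, shannon)

alpha = data.frame(
  samplename = names(richness),
  richness = as.vector(richness),
  shannon = as.vector(shannon_h)
)
print(alpha)
```

In this toy case S1 has two equally abundant species, so its Shannon index equals log(2), while S2 has a single species and an index of 0. Skipping step 1 would count the two sequences of species A separately and overstate the diversity of S1.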

Making community composition tables

A community composition matrix in wider format is also supported.

species_mat = community_df |>
  ranemone::pivot_wider_ncopies(.by = species)

ranemone::calc_alpha_div(species_mat)
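For intuition about the wide format, the sketch below builds a sample-by-species matrix in base R from a toy table. The orientation (samples in rows, species in columns) and the zero fill for absent species are assumptions made for illustration; check the output of pivot_wider_ncopies() for the layout it actually uses.

```r
# Toy long-format table; values are invented for illustration.
toy = data.frame(
  samplename = c("S1", "S1", "S1", "S2"),
  species = c("A", "A", "B", "A"),
  ncopiesperml = c(10, 10, 20, 30)
)

# Sum copies into a sample x species matrix.
mat = tapply(toy$ncopiesperml, list(toy$samplename, toy$species), sum)

# Combinations that never occur come back as NA; treat them as absent.
mat[is.na(mat)] = 0
print(mat)

# Species richness per sample is the number of nonzero columns in each row.
rowSums(mat > 0)
```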