List the simulated count tables by level of prevalence

Usage

simulate_by_prevalence(
  data_with_annotation,
  prev_list,
  graph_file = NULL,
  col_module_id,
  annotation_level,
  sample_size = 500,
  seed,
  verbatim = FALSE,
  data_type = "shotgun"
)

Arguments

data_with_annotation

Dataframe. The abundance table merged with the module names. Required format: modules are the rows and samples are the columns. The first column must be the module name (e.g. species), the second the module ID (e.g. msp), and each subsequent column a sample.
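The layout described above can be checked before calling the function. The following is a minimal sketch only: `check_annotated_table()` and `example_table` are hypothetical names introduced for illustration and are not part of the package.

```r
# Hypothetical helper (not part of the package): checks the layout
# described for data_with_annotation.
check_annotated_table <- function(df) {
  stopifnot(is.data.frame(df), ncol(df) >= 3)
  # First two columns: module name and module ID (text or factor)
  ids_ok <- all(vapply(df[1:2],
                       function(x) is.character(x) || is.factor(x),
                       logical(1)))
  # Remaining columns: one numeric abundance column per sample
  samples_ok <- all(vapply(df[-(1:2)], is.numeric, logical(1)))
  ids_ok && samples_ok
}

# Made-up table following the required format
example_table <- data.frame(
  species  = c("Species A", "Species B"),
  msp_name = c("msp_1", "msp_2"),
  SAMPLE1  = c(0, 1.5e-06),
  SAMPLE2  = c(2.0e-07, 0)
)

format_ok <- check_annotated_table(example_table)
```

A table that passes this check has the column order the argument description asks for; it says nothing about whether the abundance values themselves are sensible.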

prev_list

List of numeric. The prevalences to be studied, given as decimal fractions: 0.20 for a prevalence of 20%.
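To make the decimal convention concrete, here is a small sketch that converts percentages to the expected form. `as_prevalence()` is a hypothetical helper, not part of the package. The last lines compute an observed prevalence under the assumption that prevalence means the fraction of samples in which a module has non-zero abundance; the package may define it differently.

```r
# Hypothetical helper (not part of the package): convert percentages
# to the decimal form expected by prev_list.
as_prevalence <- function(percent) {
  stopifnot(is.numeric(percent), all(percent >= 0), all(percent <= 100))
  percent / 100
}

prev_list <- as_prevalence(c(20, 30))

# Assumption: prevalence = share of samples with non-zero abundance.
# Made-up abundances: rows are modules, columns are samples.
abundances <- matrix(
  c(0, 1, 0, 2,
    3, 0, 0, 0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("msp_1", "msp_2"), paste0("SAMPLE", 1:4))
)
observed_prev <- rowMeans(abundances > 0)
```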

graph_file

Dataframe. The object generated by the graph_step() function.

col_module_id

String. The name of the column holding the module IDs (e.g. msp_name) in data_with_annotation.

annotation_level

String. The name of the column with the level to be studied. Examples: species, genus, level_1

sample_size

Numeric. The sample size to consider. A value of 500 (the default) is recommended.

seed

Numeric. The seed number, ensuring reproducibility

verbatim

Boolean. Controls verbosity. Default value is FALSE.

data_type

String. The type of data: "shotgun" (default), or "16S" to enable the treatment of 16S data.

Value

List of dataframes. Each element of the list is a simulated abundance table and corresponds to one level of prevalence.
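A list of dataframes like this can be walked element by element. The sketch below uses a made-up stand-in (`mock_sims`) rather than real output, and it assumes the elements come back in the same order as `prev_list`, which this page does not state.

```r
# Stand-in for the return value: one data frame per prevalence level.
# The contents are invented purely to show the list structure.
mock_sims <- list(
  data.frame(msp_name = c("msp_1", "msp_2"), SIM1 = c(0, 2), SIM2 = c(1, 0)),
  data.frame(msp_name = c("msp_1", "msp_2"), SIM1 = c(3, 0), SIM2 = c(0, 0))
)

prev_list <- c(0.20, 0.30)

# Assumption: elements follow the order of prev_list, so they can be
# labelled with the prevalence they were simulated for.
names(mock_sims) <- paste0("prev_", prev_list)

# Rows and columns of each simulated table, one column per prevalence
dims <- vapply(mock_sims, dim, integer(2))
```

Labelling the elements up front makes later lookups such as `mock_sims[["prev_0.2"]]` less error-prone than positional indexing.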

Examples

tiny_data <- data.frame(
  species = c("One bacteria", "One bacterium L", "One bacterium G", "Two bact"),
  msp_name = c("msp_1", "msp_2", "msp_3", "msp_4"),
  SAMPLE1 = c(0, 1.328425e-06, 0, 1.527688e-07),
  SAMPLE2 = c(1.251707e-07, 1.251707e-07, 3.985320e-07, 0),
  SAMPLE3 = c(0, 0, 4.926046e-09, 5.626392e-06),
  SAMPLE4 = c(0, 0, 2.98320e-05, 0)
)

# Wrap the call directly so the example does not depend on the %>% pipe being loaded
tiny_graph <- suppressWarnings(
  graph_step(tiny_data, col_module_id = "msp_name", annotation_level = "species", seed = 20242025)
)

tiny_sims <- simulate_by_prevalence(
  tiny_data,
  prev_list = c(0.20, 0.30),
  graph_file = tiny_graph,
  col_module_id = "msp_name",
  annotation_level = "species",
  sample_size = 500,
  seed = 20242025
)