Comparing and optimizing performance of phylotypr to mothur (CC304)
Riffomonas Project

Published on Oct 7, 2024

Pat compares the performance of phylotypr to mothur and finds that mothur is faster. After revisiting his code, he switches from Rfast's colsums to rowsums, which lets phylotypr match mothur's single-processor performance when iterating with the purrr package. Then he shows how to use the furrr package to parallelize the R code. Finally, he shows how to include data files in a package and write a function to access the data in those files. This episode is part of an ongoing effort to develop an R package that implements the naive Bayesian classifier for classifying 16S rRNA gene sequences.

If you want to get a physical copy of R Packages: https://amzn.to/43pMR8L
If you want a free, online version of R packages: https://r-pkgs.org/

You can find my blog post for this episode at https://www.riffomonas.org/code_club/....

Check out the GitHub repository at the:
Beginning of the episode: https://github.com/riffomonas/phyloty...
End of the episode: https://github.com/riffomonas/phyloty...

#rstats #furrr #purrr #usethis #pkgdown #devtools #rdp #16S #classification #classifier #microbialecology #microbiome

Support Riffomonas by becoming a Patreon member!
  / riffomonas  

Want more practice on the concepts covered in Code Club? You can sign up for my weekly newsletter at https://shop.riffomonas.org/youtube to get practice problems, tips, and insights.

If you're interested in purchasing a video workshop be sure to check out https://riffomonas.org/workshops/

You can also find complete tutorials for learning R with the tidyverse using...
Microbial ecology data: https://www.riffomonas.org/minimalR/
General data: https://www.riffomonas.org/generalR/

If you want to cite this video, please consider citing https://journals.asm.org/doi/10.1128/...

0:00 Introduction
1:58 Assessing performance of mothur's classify.seqs
4:45 Classifying thousands of sequences with phylotypr
10:31 Improving phylotypr's performance with TDD
21:20 Parallelizing classification in R
29:28 Adding a data file to phylotypr
