Specialization Cheat Sheets

Big Data

Parallel Computing



sparklyr



Data mining and modeling

  • Data Mining. PDF only.
    • association rules, sequential patterns, classification & prediction, regression, clustering, outliers, time series, text mining, social networks, graph mining, spatial data, statistics, graphics, data manipulation, data access, big data, parallel computing, reports, Weka, editors, GUIs

data.table



dplyr



forcats



sjmisc



Import and Tidy up



Machine Learning

caret





estimatr



h2o



Keras



Machine Learning

  • Machine Learning. PDF only.

    • Supervised Learning;
    • Unsupervised Learning;
    • Deep Learning;
    • Machine Learning Tips and Tricks;
    • Probabilities and Statistics;
    • Linear Algebra and Calculus.
  • Big Data Machine Learning. PDF only.

    • linear regression, logistic regression, regularization (ridge, lasso), neural network, support vector machine, Bayesian network and naïve Bayes, k-nearest neighbors, decision tree, tree ensembles (bagging or random forest, boosting)
  • Machine Learning Modelling in R. PDF.



mlr



Regressions

  • Regressions. PDF only.
    • linear model, variable selection, diagnostics, graphics, tests, variable transformation, ridge, segmented, gls, glm, nls, gnls, loess, splines, robust, structural equation, simultaneous equation, pls, principal components, quantile, linear and nonlinear mixed effects, generalized additive, survival analysis, classification & regression trees, beta

Survival Analysis

  • survminer. PDF only.
    • curve, ggplot2, Cox model

NLP

quanteda



Regex



stringr



xplain



Probabilities and randomness

Probabilities



randomizr



vtree



Programming

purrr



rlang



Python

reticulate



Quandl



Time Series

lubridate



nardl



Time series

  • Time Series. PDF only.
    • input, decomposition, tests, stochastic, graphics, miscellaneous

tsbox



xts



Tidyverse



Syntax