Skip to main content

Posts

Showing posts with the label R

Random Forest for predicting diabetes based on diagnostic measures

Random forests or decision tree forests focuses only on ensembles of decision trees. This method combines the base principles of bagging with random feature selection to add additional diversity to the decision tree models. After the ensemble of trees (the forest) is generated, the model uses a vote to combine the trees’ predictions. As the ensembles uses only a small, random portion of the full feature set, random forests can handle extremely large datasets, where the so-called “curse of dimensionality” might cause other models to fail. At the same time, its error rates for most learning tasks are on par with nearly any other method. It is a all purpose model that performs well in most problems, but unlike a decision tree, the model is not easily interpretable. Can handle noisy, missing data as well as categorical or continuous features, but selects only the most important features. Here we will work with the Pima Indians Diabetes database to predict the onset of diabetes base...

Artificial Neuronal Network in R (neuralnet package)

An Artificial Neural Network (ANN) models the relationship between a set of input signals and an output signal using a model derived from our undestanding of how a biological brain responds to stimuli from sensory inputs. ANN uses a network of artificial neurons or nodes to solve learning problems. At first ANNs were used to simulate learning simple functions like the  AND  function or the logical  OR  function, but nowadays as cumputers have become more powerfull that complexity of the ANNs has increased so much that they are now frequently applied to more practical problems including speech and handwrinting recognition programs, automation of smart devices, sophisticated models of weather and climate patterns, etc… ANNs are very versatile learners that can be applied to nearly any learning task, classification, numeric prediction, and even unsuppervised pattern recognition. ANNs are best applied to problems where the input data and output data are well defin...