Model Selection & Boosting In Python

What Is Model Selection?

Model selection is the process of choosing a final machine learning model for a given problem from among candidates trained on the same dataset.

k-fold Cross Validation

Cross-validation is a statistical method for estimating how well a model's predictions generalize. A machine learning problem typically has two types of data: a training set and a testing set. Using cross-validation, data scientists can check whether a model overfits by evaluating its predictions on held-out data. Because every observation is used for both training and validation, cross-validation gives a less biased, lower-variance performance estimate than a single train/test split.

In k-fold cross-validation, the value k determines how many subsets the original dataset is split into; each subset is known as a fold. In each round, one fold is held out for validation and the remaining k-1 folds are used for training. The process repeats k times so that every fold serves as the validation set once, and the final accuracy estimate is the average over the k runs.

from numpy import array
from sklearn.model_selection import KFold
# sample data (illustrative)
data = array([0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6])
# prepare cross validation: 11 folds, shuffled, with a fixed random seed
kfold = KFold(n_splits=11, shuffle=True, random_state=1)
# enumerate splits
for train, test in kfold.split(data):
    print('train: %s, test: %s' % (data[train], data[test]))

Full Code: Click Here
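The loop above prints the raw splits; to get the averaged accuracy described earlier, scikit-learn's cross_val_score runs the whole procedure in one call. A minimal sketch, assuming a logistic regression classifier on the bundled iris data (both are illustrative choices, not from the full code):

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = load_iris(return_X_y=True)
# 5-fold cross validation, shuffled with a fixed seed
kfold = KFold(n_splits=5, shuffle=True, random_state=1)
# one accuracy score per fold, then the average over all folds
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=kfold)
print('mean accuracy: %.3f' % scores.mean())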

What Is Parameter Tuning?

A parameter is an internal value of a model that is estimated from the data during training. A model that summarizes the data with a fixed set of such parameters is called parametric, while a model whose structure is not fixed in advance is called nonparametric. Examples of learned parameters: the support vectors in a support vector machine and the coefficients in linear regression. Examples of nonparametric algorithms: k-nearest neighbours and decision trees, which do not require a fixed parameter count up front.

A hyperparameter is an external configuration value that is set before model training begins. Tuning hyperparameters well helps a model train faster and predict more accurately.
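To make the distinction concrete, here is a minimal sketch (an illustrative example, not from the original post): n_neighbors is a hyperparameter we choose before training, while a fitted linear regression's coef_ values are parameters learned from the data.

from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsClassifier

# hyperparameter: chosen by us before training ever starts
knn = KNeighborsClassifier(n_neighbors=5)

# parameters: learned from the data during fitting
reg = LinearRegression().fit([[1], [2], [3]], [2, 4, 6])
print(reg.coef_)  # learned coefficient, approximately [2.]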

Grid Search

Grid search takes the candidate values supplied in a parameter grid (a dictionary) and evaluates every combination, selecting the best-performing parameters in a systematic trial-and-error fashion.

from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# sample data for the search (illustrative; the full code may differ)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# defining parameter range
param_grid = {'C': [0.1, 1, 10, 100, 1000],
              'gamma': [1, 0.1, 0.01, 0.001, 0.0001],
              'kernel': ['rbf']}

grid = GridSearchCV(SVC(), param_grid, refit=True, verbose=3)
# fitting the model for grid search
grid.fit(X_train, y_train)
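Because refit=True retrains the best combination on the full training set, the fitted grid can be used directly for prediction. A short follow-up sketch, reusing the X_test and y_test split defined above:

from sklearn.metrics import classification_report

# predictions come from the best estimator found by the grid
grid_predictions = grid.predict(X_test)
print(grid.best_params_)
print(classification_report(y_test, grid_predictions))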

Full Code: Click Here

What Is Boosting?

Boosting is an ensemble method that improves the predictions of a given learning algorithm by combining many weak models into a stronger one.
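As a quick illustration of the idea (not part of the original post), scikit-learn's AdaBoostClassifier boosts shallow decision trees into a stronger ensemble:

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
# boost 50 weak learners (shallow decision trees by default)
booster = AdaBoostClassifier(n_estimators=50, random_state=1)
print('mean accuracy: %.3f' % cross_val_score(booster, X, y, cv=5).mean())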

XGBoost (eXtreme Gradient Boosting)

XGBoost stands for eXtreme Gradient Boosting. It builds gradient-boosted decision trees and delivers better speed and performance than classic gradient boosting machines (GBM). XGBoost also includes regularization to reduce overfitting and handles missing values natively.

# XGBRegressor model ('reg:linear' is deprecated; use 'reg:squarederror')
import xgboost as xgb
xg_reg = xgb.XGBRegressor(objective='reg:squarederror', colsample_bytree=0.3,
                          learning_rate=0.1, max_depth=5, alpha=10,
                          n_estimators=10)

Full Code: Click Here
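For a runnable end-to-end sketch, the regressor above can be trained and scored on scikit-learn's bundled diabetes data (an illustrative dataset, not necessarily the one used in the full code):

import numpy as np
import xgboost as xgb
from sklearn.datasets import load_diabetes
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# illustrative regression data
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

xg_reg = xgb.XGBRegressor(objective='reg:squarederror', colsample_bytree=0.3,
                          learning_rate=0.1, max_depth=5, alpha=10,
                          n_estimators=10)
xg_reg.fit(X_train, y_train)
preds = xg_reg.predict(X_test)
print('RMSE: %.3f' % np.sqrt(mean_squared_error(y_test, preds)))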

Final Outcome

By combining k-fold cross-validation, grid search, and XGBoost, we can build models that train quickly and predict accurately.
