Assignment
1. Using Python 3, build the following regression models:
- Decision Tree
- Gradient Boosted Tree
- Linear regression
2. Select a dataset from Kaggle (other than the example dataset given in section 3) and apply the Decision Tree and Linear regression models created above.
3. Build the following in relation to the gradient boost tree and the dataset chosen in step 2:
a) Gradient boost tree iterations (see section 6.1)
b) Gradient boost tree Max Bins (see section 7.2)
4. Build the following in relation to the decision tree and the dataset chosen in step 2:
a) Decision Tree Categorical features
b) Decision Tree Log (see section 5.4)
c) Decision Tree Max Bins (see section 7.2)
d) Decision Tree Max Depth (see section 7.1)
5. Build the following in relation to the linear regression and the dataset chosen in step 2:
a) Linear regression Cross Validation
i. Intercept (see section 6.5)
ii. Iterations (see section 6.1)
iii. Step size (see section 6.2)
iv. L1 Regularization (see section 6.4)
v. L2 Regularization (see section 6.3)
b) Linear regression Log (see section 5.4)
6. Follow the provided example of the Bike sharing dataset and the guidelines in the sections that follow this one to develop the requirements given in steps 1, 3, 4 and 5.
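Steps 4(b) and 5(b) both ask for a "Log" variant of a model. A minimal sketch of one common reading follows, assuming "Log" means training on a log-transformed target and converting predictions back before scoring. The function names and the sample values are hypothetical, and the sketch uses log(1 + y) so that zero targets are handled; section 5.4 may define the transform differently.

```python
import math


def log_transform(targets):
    """Map each non-negative target y to log(1 + y)."""
    return [math.log1p(y) for y in targets]


def inverse_log_transform(log_targets):
    """Undo log_transform: map z back to exp(z) - 1."""
    return [math.expm1(z) for z in log_targets]


def rmse(actual, predicted):
    """Root mean squared error, computed on whatever scale is passed in."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


# Hypothetical target values, not taken from any dataset.
y = [0.0, 3.0, 10.0, 250.0]
z = log_transform(y)               # a model would be trained on z instead of y
y_back = inverse_log_transform(z)  # predictions are mapped back before scoring
```

Scoring on the original scale (after the inverse transform) keeps the "Log" model comparable with the untransformed one.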
Task 1
Task 1 consists of developing:
1. Decision Tree
a) Decision Tree Categorical features
b) Decision Tree Log (see section 5.4)
c) Decision Tree Max Bins (see section 7.2)
d) Decision Tree Max Depth (see section 7.1)
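To illustrate what the Max Depth and Max Bins settings control, here is a small pure-Python regression tree. It is a sketch under stated assumptions, not the implementation the assignment expects: `build_tree`, `candidate_thresholds` and the toy data are hypothetical, "max bins" is taken to mean a cap on the number of candidate split thresholds per feature, and categorical features and the log variant are not covered.

```python
def mean(values):
    return sum(values) / len(values)


def sse(values):
    """Sum of squared deviations from the mean (the impurity being minimised)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def candidate_thresholds(column, max_bins):
    """Return at most max_bins - 1 split thresholds for one feature column."""
    distinct = sorted(set(column))
    if len(distinct) <= max_bins:
        return distinct[:-1]  # a split is "x <= t", so the largest value splits nothing
    step = len(distinct) / max_bins
    return [distinct[int(step * i) - 1] for i in range(1, max_bins)]


def build_tree(rows, targets, max_depth, max_bins, depth=0):
    """Return a nested dict: a leaf {'value': v} or a split node."""
    if depth >= max_depth or len(set(targets)) == 1:
        return {"value": mean(targets)}
    best = None  # (error, feature index, threshold)
    for j in range(len(rows[0])):
        for t in candidate_thresholds([r[j] for r in rows], max_bins):
            left = [y for r, y in zip(rows, targets) if r[j] <= t]
            right = [y for r, y in zip(rows, targets) if r[j] > t]
            err = sse(left) + sse(right)
            if best is None or err < best[0]:
                best = (err, j, t)
    if best is None:  # every feature is constant, so there is nothing to split on
        return {"value": mean(targets)}
    _, j, t = best
    left_idx = [i for i, r in enumerate(rows) if r[j] <= t]
    right_idx = [i for i, r in enumerate(rows) if r[j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": build_tree([rows[i] for i in left_idx], [targets[i] for i in left_idx],
                           max_depth, max_bins, depth + 1),
        "right": build_tree([rows[i] for i in right_idx], [targets[i] for i in right_idx],
                            max_depth, max_bins, depth + 1),
    }


def predict(tree, row):
    while "value" not in tree:
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree["value"]


def rmse(tree, rows, targets):
    return (sum((predict(tree, r) - y) ** 2 for r, y in zip(rows, targets)) / len(rows)) ** 0.5


# Toy data (hypothetical): one feature, a four-level step function as target.
rows = [[float(i)] for i in range(16)]
targets = [float(i // 4) * 10.0 for i in range(16)]

# Training error for several depth limits, with a generous bin limit.
depth_errors = {d: rmse(build_tree(rows, targets, d, 32), rows, targets) for d in (1, 2, 3)}
```

The same loop can be repeated over `max_bins` with the depth held fixed; on a real dataset the error should be measured on held-out data rather than on the training rows.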
Task 2
Task 2 consists of developing:
1. Gradient boost tree
a) Gradient boost tree iterations (see section 6.1)
b) Gradient boost tree Max Bins (see section 7.2)
c) Gradient boost tree Max Depth (see section 7.1)
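The effect of the number of boosting iterations can be sketched with a tiny gradient-boosted model for squared error, where each iteration fits a depth-1 tree (a stump) to the current residuals. This is an illustrative pure-Python sketch, not the library the assignment has in mind; `fit_stump`, `fit_gbt`, the learning rate of 0.1 and the toy data are all assumptions.

```python
def fit_stump(xs, residuals):
    """Best single-threshold split on a 1-D feature, minimising squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    return best[1:]  # (threshold, left value, right value)


def fit_gbt(xs, ys, num_iterations, learning_rate=0.1):
    """Start from the mean, then add one shrunken stump per iteration."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(num_iterations):
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lv, rv = fit_stump(xs, residuals)
        stumps.append((t, lv, rv))
        preds = [p + learning_rate * (lv if x <= t else rv) for x, p in zip(xs, preds)]
    return base, learning_rate, stumps


def predict_gbt(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(lv if x <= t else rv for t, lv, rv in stumps)


def mse(model, xs, ys):
    return sum((predict_gbt(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy data (hypothetical): y = x^2 on a small grid.
xs = [i / 10 for i in range(20)]
ys = [x ** 2 for x in xs]

# Training error as the number of iterations grows.
iteration_errors = {n: mse(fit_gbt(xs, ys, n), xs, ys) for n in (1, 10, 50)}
```

Training error keeps falling as iterations are added, so the comparison across iteration counts is only meaningful on held-out data, where too many iterations can start to overfit.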
Task 3
Task 3 consists of developing:
1. Linear regression model
a) Linear regression Cross Validation
i. Intercept (see section 6.5)
ii. Iterations (see section 6.1)
iii. Step size (see section 6.2)
iv. L1 Regularization (see section 6.4)
v. L2 Regularization (see section 6.3)
b) Linear regression Log (see section 5.4)
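The parameters listed under the linear regression cross validation (intercept, iterations, step size, L1 and L2 regularization) can be seen together in a small gradient-descent trainer. This is a hedged sketch only: it uses full-batch gradient descent rather than any particular library's optimiser, handles L1 with a plain subgradient, and the names `train_linear`, `reg_type`, `reg_param` and the toy data are hypothetical. The "cross validation" here is a simple hold-out comparison over step sizes.

```python
def sign(v):
    return (v > 0) - (v < 0)


def train_linear(rows, targets, iterations=100, step=0.1,
                 reg_type=None, reg_param=0.0, intercept=False):
    """Batch gradient descent on mean squared error / 2, with optional penalty on the weights."""
    n, d = len(rows), len(rows[0])
    w, b = [0.0] * d, 0.0
    for _ in range(iterations):
        errors = [sum(wj * xj for wj, xj in zip(w, r)) + b - y
                  for r, y in zip(rows, targets)]
        grad = [sum(e * r[j] for e, r in zip(errors, rows)) / n for j in range(d)]
        if reg_type == "l2":    # gradient of (reg_param / 2) * sum(w^2)
            grad = [g + reg_param * wj for g, wj in zip(grad, w)]
        elif reg_type == "l1":  # subgradient of reg_param * sum(|w|)
            grad = [g + reg_param * sign(wj) for g, wj in zip(grad, w)]
        w = [wj - step * g for wj, g in zip(w, grad)]
        if intercept:           # the intercept is never penalised
            b -= step * sum(errors) / n
    return w, b


def rmse(model, rows, targets):
    w, b = model
    sq = [(sum(wj * xj for wj, xj in zip(w, r)) + b - y) ** 2
          for r, y in zip(rows, targets)]
    return (sum(sq) / len(sq)) ** 0.5


# Toy data (hypothetical): y = 2x + 1 on an evenly spaced grid.
rows = [[i / 10] for i in range(11)]
targets = [2 * r[0] + 1 for r in rows]

# Hold-out split: even-indexed rows for training, odd-indexed rows for testing.
train_x, train_y = rows[0::2], targets[0::2]
test_x, test_y = rows[1::2], targets[1::2]

# Compare step sizes with the other settings held fixed.
step_errors = {
    s: rmse(train_linear(train_x, train_y, iterations=200, step=s, intercept=True),
            test_x, test_y)
    for s in (0.01, 0.1, 0.5)
}
```

The same dictionary-comprehension pattern applies to the other parameters: vary one of `iterations`, `intercept`, `reg_type` or `reg_param` while holding the rest fixed, and compare the hold-out error.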
Attachment: Big-Data Assignment.rar