Project Data

Final Data Science Project Report

· Each group will submit a final report that follows the CRISP-DM framework.

· The paper is expected to be free from grammatical errors and to use a consistent style for citations and the reference list.

· The report must be at least 10 pages, double-spaced, with 1-inch margins, in 12-point Arial or Times Roman font.

· The submitted report should also include a separate cover page with your name and the title of your paper, as well as a list of references used.

· You are required to also turn in your R code.

· Late papers will not be accepted.

· The paper will be submitted for grading via software that checks for plagiarism. Plagiarism is a violation of the Student Code of Conduct and will be handled per university policy.

 

If you are working with the BANK dataset, please review the project expectations below:

Client Requirements:

Your client would like to embark on a direct marketing campaign to increase bank revenues. To do that, they would like to understand what drives a customer to try new products/services (B_TGT), what drives the total new sales (INT_TGT), and what drives the total number of new products and services purchased by customers (CNT_TGT). You have therefore been given sample data from which you are to build, validate, and test your predictive models. To be clear, your client requires three models, one for each of the three variables they would like to predict, to help them better target their direct marketing campaign. Your client would also like to see how your model performs against the test holdout dataset and will use your model’s performance on the test dataset to inform their long-term consulting relationships and to determine whose model will be deployed.

 

Model #1: Develop a classification model for the b_tgt variable using any of the variables as predictors (except account, cnt_tgt, or int_tgt). Fit at least four candidate models using the training data and evaluate the fitted models using the validation data. Once you have selected your final model, generate predictions for the test dataset and evaluate the model's performance against the test dataset.

Model #2: Develop a prediction model for the int_tgt variable using any of the variables as predictors (except account, b_tgt, or cnt_tgt). Fit at least four candidate models using the training data and evaluate the fitted models using the validation data. Use “root mean squared error” as the evaluation criterion, use your final selected prediction model to predict int_tgt responses in the test dataset, and evaluate its performance against the test dataset.

Model #3: Develop a prediction model for the cnt_tgt variable using any of the variables as predictors (except account, b_tgt, or int_tgt). Fit one or more models using the training data and evaluate the fitted models using the validation data. Use “root mean squared error” as the evaluation criterion, use your final selected prediction model to predict cnt_tgt responses in the test dataset, and evaluate its performance against the test dataset.
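
The train/validate/test workflow described for Models #2 and #3 can be sketched in base R as follows. This is only an illustrative sketch, not a required approach. It uses synthetic data, because the BANK file and its column layout are not shown here. The names toy, x1, x2, y, the 60/20/20 split, and the three linear candidates are all assumptions made for the example.

```r
# Root mean squared error between observed and predicted values.
rmse <- function(actual, predicted) {
  sqrt(mean((actual - predicted)^2))
}

# Synthetic stand-in data (assumption); the real project would use the
# provided training, validation, and test datasets instead.
set.seed(42)
n <- 300
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- 2 + 3 * toy$x1 - toy$x2 + rnorm(n)

# Hypothetical 60/20/20 split into training, validation, and test rows.
idx <- sample(rep(c("train", "valid", "test"), times = c(180, 60, 60)))
train <- toy[idx == "train", ]
valid <- toy[idx == "valid", ]
test  <- toy[idx == "test", ]

# Candidate models are fitted on the training data only.
candidates <- list(
  intercept_only = lm(y ~ 1, data = train),
  x1_only        = lm(y ~ x1, data = train),
  x1_plus_x2     = lm(y ~ x1 + x2, data = train)
)

# Compare the candidates by RMSE on the validation data.
valid_rmse <- sapply(candidates, function(m) {
  rmse(valid$y, predict(m, newdata = valid))
})
best_name <- names(which.min(valid_rmse))

# Score only the selected model on the test data.
test_rmse <- rmse(test$y, predict(candidates[[best_name]], newdata = test))

print(valid_rmse)
cat("Selected:", best_name, "| Test RMSE:", round(test_rmse, 3), "\n")
```

The test data is touched only once, after the final model has been chosen on the validation data, so the test RMSE remains an honest estimate of how the selected model performs on unseen customers.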