BSAN2205 MACHINE LEARNING FOR BUSINESS
Project Plan
The course BSAN2205 Machine Learning for Business has three assessment items: a Project Plan, a Project Report and Presentation, and a School-based Take-home Assessment (weighted 20%, 50%, and 30%, respectively). These notes outline my expectations for the Project Plan and introduce the context for the project work. I intend the Plan (or proposal) to be a formative piece of assessment. The Plan should set the groundwork for your project and project report. I will provide feedback on your Plan that you can incorporate into your project.
Background and Context
In competitive markets, businesses face the challenge of acquiring and retaining customers. Consider subscription services: subscriptions to digital editions of newspapers and magazines, to streaming services (film and television, music, news, sport, etc.), and to cable television services (Foxtel). Other businesses face the same challenges, for example, airlines, banks, insurance companies, telecommunication companies, retailers, restaurants, and personal services businesses. One retention strategy is to deepen relationships with customers through “upselling” – convincing a customer to buy something in addition to, or more expensive than, what they have previously purchased from a business. Streaming services like Netflix and Spotify strive to build customer “engagement” – increasing the number of downloads and/or the time spent streaming.
Bank marketing provides the specific context for the project. Like many consumer businesses, banks confront the challenges of attracting new customers and retaining existing customers. Strategies for retaining customers provide the setting for the project. For banks, engagement is reflected in the number of products (active accounts) customers maintain. Often retention strategies have the goal of deepening engagement by encouraging customers to open new accounts. Consolidating accounts with one rather than many banks may offer consumers some benefits at the margin. For example, highly engaged customers may be offered lower rates on loans and access to services for which they do not have to pay (at least, not directly), and they reduce the overall burden of managing multiple banking relationships. For banks, the benefits of more highly engaged customers are larger and more stable cash flows, lower marketing expenses (according to customer relationship economics, attracting a customer costs more than retaining one), and thus potentially higher profits.
Before moving on, I would like you to appreciate that many problems in business can be solved through effective predictive models of binary outcomes. Consider the decision to purchase or not purchase shares in a company, to acquire or merge with another business, or to hire or not hire a prospective employee. All of these decisions involve binary outcomes (in some cases, they can be characterised as “go/no go” decisions). The specific focus of the project is customer acceptance of a marketing offer, but the concepts and models have much broader application.
Aims of the Proposal
The Project Plan has two broad aims. First, the Plan is a marketing document. Second, the Plan is a roadmap. As a marketing document, the Project Plan must sell the project to the stakeholder(s) and/or client. Thus, the Plan should emphasise the benefits of doing the project. As a proposal or “roadmap,” the Project Plan should outline in some detail the likely direction of the project. This might include identifying the key variables and methods of analysis.
Key Sections of the Project Plan
More specifically, you might consider including the following sections in your Plan.
1. Background statement
2. Conceptual development
3. Variable selection
4. Methods of analysis/analysis plan
5. Form of the results
6. Next steps
In the background statement (section 1), you may wish to sketch out the initial motivation for the study. This might include reference to the key stakeholder(s) and/or client. I recommend targeting the proposal at a (hypothetical) client to bring a degree of realism to the project and to help focus it (for example, you could contextualise the study with reference to an Australian bank). In this section, also make sure to sell the project. What are the likely benefits of doing the project? What new insights do you anticipate, and how will these improve decision making?
You might find value in a section 2 that outlines the conceptual framework for your project work. If you focus your project on customer engagement with banks, for example, you might give some thought to the advantages to banks and their customers from greater engagement and to the process that might drive customers to respond favourably to a bank’s marketing efforts. My preference is that you use your own common sense and logic to define the key concepts and to develop a rationale for their links. I do not expect a review of the literature, but you might find some desk (Google) research helpful in identifying past studies that have explored issues similar to yours. A boxes-and-arrows diagram might help to illustrate the core concepts and relationships.
The section on variable selection is probably the key section (section 3). Be very specific about the variables you intend to study. In the social science tradition, much emphasis is placed on explaining why the variables selected for study have been selected – the focus is explanation rather than prediction. This is less the case with the data science paradigm, with its focus on prediction – business analysts/data scientists may wish to specify an (initial) model that includes all of the possible feature variables. My minimum expectation for this section is that you describe the output and feature variables you intend to study and explain why you chose these feature variables.
Section 4 outlines the methods of analysis. Here I would like you to be specific about the models you might use to analyse the data. You may have completed the course BSAN2204 Methods of Business Analytics. A focus of that course was predicting a numeric output variable (“song hotness”) using linear regression. For this course (BSAN2205 Machine Learning for Business), our target variable is categorical: it records whether customers opened or did not open a new account in response to the Bank’s marketing efforts. My expectations for section 4 are that you can identify an appropriate statistical model (or models) for analysing the data, state something about the assumptions of the model, and perhaps list the key steps in employing the model. You could also write out the specific model you intend to estimate (write out the regression equation, for example, with reference to the y- and x-variables).
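As an illustration only of what "writing out the model" could look like, the sketch below assumes you chose logistic regression, one common model for a binary target. The symbols are placeholders rather than a prescribed specification: y_i equals 1 if customer i opened a new account and 0 otherwise, x_{i1}, ..., x_{ik} are whichever feature variables you select, and the betas are the coefficients to be estimated.

```latex
% Probability that customer i opens a new account, given the features
\[
  p_i = \Pr(y_i = 1 \mid x_{i1}, \dots, x_{ik})
      = \frac{1}{1 + \exp\{-(\beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik})\}}
\]
% Equivalent form: the log-odds are linear in the features
\[
  \log\!\left(\frac{p_i}{1 - p_i}\right)
      = \beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik}
\]
```

The second form is often the easier one to explain to a client: each coefficient describes the change in the log-odds of acceptance for a one-unit change in the corresponding feature, holding the others fixed.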
Section 5 – form of the results – should give an indication of what the outputs might look like. You could do a mock-up of the results. You could also say that you will document the results in PowerPoint format and present them verbally. The next steps section concludes the proposal. Here you might remind the client of the core benefits and indicate what you need to initiate the project (final client sign-off, for example). You could also add a timeline or perhaps a Gantt chart (timetabling the key activities, when you will do them, and identifying any critical paths). At this stage, refrain from doing any statistical analysis of the data – save the analysis for the project report. Use the Plan to develop some general knowledge of the models you intend to use and sketch out your best plan for the analysis you intend to implement.
The final section of your Plan might address next steps (section 6). You can briefly restate the main motivation for your Plan and highlight the key “next steps.” Remember the Plan is a marketing document – perhaps remind the reader of the Plan that this project is an important one and should be completed now.
The Bank Marketing Dataset
The project work for this Semester uses the Bank Marketing dataset. Several variations of the dataset exist. There is one variation available from the UCI Machine Learning Repository and another variation on Kaggle. We will use the version of the dataset available from Kaggle (with some minor variations). Owned by Google, Kaggle is an online community of business analysts and data scientists. Users can freely upload and download data to and from the site (kaggle.com). Kaggle runs competitions often sponsored by third parties. I encourage you to explore the Kaggle website and join the Kaggle community. Kaggle is a great place for those with an interest in machine learning.
I have downloaded the dataset from Kaggle, introduced some further variations, and placed the dataset on the Blackboard site. Please use this version of the dataset for your project. Appendix A provides a list of the variables in the Bank Marketing dataset, including brief descriptions. The target or output variable is customers’ responses to a recent marketing campaign run by the Bank (the Bank being a European bank, specifically, a Portuguese bank). The data is real-world data offered freely by the Bank to the data science community. The data consists of 21 variables (the target variable and 20 feature variables) and observations on approximately 40,000 customers targeted with a particular marketing campaign. The output variable is a binary categorical variable – customers responded to the marketing campaign by either opening a new account or not. The 20 feature variables include a mix of variables reflecting customers’ characteristics (age, education, etc.), the nature and status of their existing accounts with the Bank (type of accounts, accounts in debit, etc.), variables describing the campaign (number of customer contacts during the campaign), and socio-economic variables (consumer confidence, etc.). The feature variables are a mix of categorical and numeric variables.
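To make the distinction between the binary target and the mixed categorical/numeric features concrete, here is a minimal Python sketch. It assumes nothing about the real file: the rows are made-up toy values, and the column names (`age`, `education`, `contacts`, `cons_conf`, `opened_account`) are hypothetical stand-ins for the variables listed in Appendix A.

```python
# Made-up toy rows; column names are hypothetical stand-ins, not the real dataset's.
rows = [
    {"age": 34, "education": "university", "contacts": 2,
     "cons_conf": -36.4, "opened_account": "yes"},
    {"age": 51, "education": "high school", "contacts": 1,
     "cons_conf": -42.0, "opened_account": "no"},
    {"age": 45, "education": "university", "contacts": 4,
     "cons_conf": -36.4, "opened_account": "no"},
]

TARGET = "opened_account"  # hypothetical name for the output variable


def split_feature_types(records, target):
    """Return (numeric, categorical) feature names, judged from the first row's value types."""
    numeric, categorical = [], []
    for name, value in records[0].items():
        if name == target:
            continue  # the target is not a feature
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            numeric.append(name)
        else:
            categorical.append(name)
    return numeric, categorical


def encode_target(records, target, positive="yes"):
    """Code the binary outcome as 1 (opened a new account) or 0 (did not)."""
    return [1 if r[target] == positive else 0 for r in records]


numeric_features, categorical_features = split_feature_types(rows, TARGET)
y = encode_target(rows, TARGET)

print("numeric:", numeric_features)
print("categorical:", categorical_features)
print("target (0/1):", y)
```

A listing of this kind (which variables are numeric, which are categorical, and how the target is coded) is the sort of description the variable selection section of the Plan could include, without performing any statistical analysis.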
Given the output variable is a (binary) categorical variable, you should explore model forms other than linear regression. As a starting point, I recommend you fit a logistic regression model to the data and subsequently use tree-based methods. A comparison of these methods could be an important part of your overall project (logistic regression vs decision trees). Further, you might explore ensemble methods to enhance your implementation of tree-based methods. We will cover these methods in the coming weeks!
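The idea of comparing the two model families can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the analysis itself: the data are made up (one numeric feature, eight customers), the logistic regression is fitted by simple gradient descent, and the "tree" is a depth-1 decision tree (a stump). In the project you would more likely use a library such as scikit-learn (for example its `LogisticRegression` and `DecisionTreeClassifier` classes) on the real dataset.

```python
import math

# Made-up toy data: one numeric feature and a binary outcome (1 = accepted the offer).
X = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y = [0, 0, 0, 1, 0, 1, 1, 1]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=5000):
    """Fit p = sigmoid(b0 + b1 * x) by batch gradient descent on the log-loss."""
    b0, b1 = 0.0, 0.0
    n = len(X)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b0 + b1 * xi) - yi  # gradient of log-loss w.r.t. the linear predictor
            g0 += err
            g1 += err * xi
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


def fit_stump(X, y):
    """Depth-1 tree: choose the split threshold with the fewest training errors.

    The stump predicts 1 when x is above the threshold, 0 otherwise.
    """
    xs = sorted(set(X))
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]  # midpoints between neighbours
    return min(
        candidates,
        key=lambda t: sum((xi > t) != bool(yi) for xi, yi in zip(X, y)),
    )


def accuracy(preds, actual):
    return sum(p == a for p, a in zip(preds, actual)) / len(actual)


b0, b1 = fit_logistic(X, y)
logit_preds = [1 if sigmoid(b0 + b1 * xi) >= 0.5 else 0 for xi in X]

threshold = fit_stump(X, y)
stump_preds = [1 if xi > threshold else 0 for xi in X]

print("logistic regression training accuracy:", accuracy(logit_preds, y))
print("decision stump training accuracy:", accuracy(stump_preds, y))
```

Both accuracies here are measured on the same rows used for fitting, which flatters any model. A proper comparison would evaluate each model on data held out from fitting and might use metrics other than accuracy.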
Submission Guidelines
The Project Plan has a weight of 20 percent of your score for the course. Please submit your Plan in the form of a written Word document. I expect you could easily write 2,000 words. Try not to write more than 3,000. I will give your Plan a score out of 100. I will also provide you with written feedback. When marking the Project Plan, I will be looking closely at the links between the sections as much as what you write in each individual section. For example, the background statement should set up the conceptual development, which in turn should set up the variable selection, and so on. A high-scoring Plan will have a degree of novelty to it (a unique and/or compelling contextualisation, a thoughtfully specified analysis plan – including appropriate performance metrics, etc.). Finally, these notes are a guide only to preparing your Project Plan. You may find other ways to present it that are more compelling, more compact, and more complete. If in doubt, do what you think is best.
I will separately provide you with the marking criteria for the Project Plan. Note they will closely follow the criteria of the Project Plan for the course BSAN2204 Methods of Business Analytics.