Credit Risk, Classification, Churn Prediction & KNIME for Data Science Practice

Q1. What is typically the biggest challenge in assessing credit risk?

A. High-income applicants
B. People with the best credit don't need loans, and those with the worst likely won’t repay
C. Bank staff being biased
D. Low interest rates
Answer: B. People with the best credit don't need loans, and those with the worst likely won’t repay

Q2. Which of the following is considered when evaluating someone's credit history?

A. Tax returns
B. Loan type
C. Whether previous loans were paid on time
D. Age of the applicant
Answer: C. Whether previous loans were paid on time

Q3. Why are middle-segment customers typically preferred by banks?

A. They demand higher interest
B. They are more likely to need loans and repay them
C. They have government backing
D. They always default
Answer: B. They are more likely to need loans and repay them

Q4. What attribute does NOT typically influence credit risk?

A. Marital status
B. Annual income
C. Favorite color
D. Credit history
Answer: C. Favorite color

Q5. Which of these is a common loan term parameter?

A. Type of currency
B. Duration of loan repayment
C. Loan repayment office
D. Weekly groceries
Answer: B. Duration of loan repayment

Q6. Income of a loan applicant is an example of a:

A. Dependent variable
B. Noise variable
C. Independent variable
D. Categorical label
Answer: C. Independent variable

Q7. A married couple applying for a home loan provides which kind of data?

A. Loan default record
B. Personal information
C. Output variable
D. Noise factor
Answer: B. Personal information

Q8. Credit risk models are typically built using:

A. Surveys
B. Machine learning
C. Manual inspection
D. Stock market trends
Answer: B. Machine learning

Q9. Successful credit risk modeling has led to:

A. Decrease in credit card usage
B. Proliferation of mortgages and credit cards
C. Higher default rates
D. Less bank automation
Answer: B. Proliferation of mortgages and credit cards

Q10. A bank rejecting a loan because of a poor credit score is a:

A. Random process
B. Classification decision
C. Regression decision
D. Time series output
Answer: B. Classification decision

Q11. Credit risk classification is an example of:

A. Supervised learning
B. Unsupervised learning
C. Clustering
D. Reinforcement learning
Answer: A. Supervised learning

Q12. Which of the following is the class label in credit risk classification?

A. Annual income
B. Home ownership
C. Loan default (Yes/No)
D. Marital status
Answer: C. Loan default (Yes/No)

Q13. In a training set, what is meant by x?

A. Class label
B. Outcome
C. Attributes or input variables
D. Test case
Answer: C. Attributes or input variables

Q14. What type of classifier technique uses similarity to known points?

A. Naïve Bayes
B. Decision Tree
C. Nearest-Neighbor
D. Logistic Regression
Answer: C. Nearest-Neighbor
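
Illustration: the "similarity to known points" idea can be sketched as a 1-nearest-neighbor lookup. This is a minimal sketch under stated assumptions; the training points, the two features, and the function name `nearest_neighbor` are hypothetical and chosen only for demonstration.

```python
import math

# Hypothetical labeled points (assumption for illustration only):
# (annual income in $1000s, years of credit history) -> known outcome
TRAINING = [
    ((120.0, 10.0), "No default"),
    ((95.0, 7.0), "No default"),
    ((30.0, 1.0), "Default"),
    ((42.0, 2.0), "Default"),
]


def nearest_neighbor(query, training=TRAINING):
    """Return the label of the training point closest to `query`.

    Uses Euclidean distance and a single neighbor (k = 1).
    """
    _, closest_label = min(
        training, key=lambda item: math.dist(item[0], query)
    )
    return closest_label


print(nearest_neighbor((100.0, 8.0)))  # closest known point is (95.0, 7.0)
print(nearest_neighbor((35.0, 1.5)))   # closest known point is (30.0, 1.0)
```

In practice, features measured on very different scales are usually normalized first so that one feature does not dominate the distance.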

Q15. Which classifier method builds hierarchical tree-like structures?

A. Neural networks
B. K-means
C. Decision Trees
D. SVM
Answer: C. Decision Trees

Q16. Which ensemble method uses multiple trees?

A. Naïve Bayes
B. Random Forest
C. KNN
D. SVM
Answer: B. Random Forest

Q17. Why are decision trees easy to interpret?

A. They use linear equations
B. They model logic rules
C. They cluster data automatically
D. They need no training data
Answer: B. They model logic rules

Q18. What is a necessary requirement for a decision tree model?

A. No class labels
B. Image data
C. Predefined discrete classes
D. Only numeric input
Answer: C. Predefined discrete classes

Q19. Which of the following would most likely result in a loan default?

A. Income over $150K
B. Married and homeowner
C. No home, low income, single
D. Good credit history
Answer: C. No home, low income, single

Q20. A model that learns from labeled examples is called:

A. Unsupervised
B. Reinforced
C. Supervised
D. Clustering
Answer: C. Supervised

Q21. What does a decision tree split data based on?

A. Correlations
B. Random noise
C. Attribute values
D. Time
Answer: C. Attribute values

Q22. In a decision tree, leaves represent:

A. Feature sets
B. Nodes
C. Final decisions
D. Regression equations
Answer: C. Final decisions

Q23. Decision trees are useful because they:

A. Require no computation
B. Can represent rules
C. Always yield 100% accuracy
D. Are slower than neural nets
Answer: B. Can represent rules

Q24. If two trees fit the same data, this is because:

A. Data is corrupted
B. Classification is not needed
C. Multiple rules can produce the same results
D. Machine learning failed
Answer: C. Multiple rules can produce the same results

Q25. Classification assumes:

A. No training data
B. Continuous outcome variable
C. Predefined class labels
D. Supervised clustering
Answer: C. Predefined class labels

Q26. A person who owns a home and earns >80K is:

A. Likely to default
B. Likely not to default
C. Indeterminate
D. Not a loan candidate
Answer: B. Likely not to default

Q27. If a person is single, not a homeowner, and earns <80K, the decision tree would likely:

A. Approve the loan
B. Request more documents
C. Predict default
D. Ask for a co-signer
Answer: C. Predict default
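
Illustration: a decision tree of this kind can be read as nested if-then rules. The sketch below is hand-written to agree with the answers to Q26 and Q27; it is not a tree learned from data, and the function name, the branch order, and the exact use of the 80K threshold are assumptions.

```python
def predict_default(home_owner: bool, marital_status: str, annual_income: float) -> str:
    """Hand-written toy rule set (hypothetical), returning "Yes" or "No" for default."""
    # Root node: home owners are predicted not to default.
    if home_owner:
        return "No"
    # Second split: married applicants are predicted not to default.
    if marital_status == "Married":
        return "No"
    # Third split: among the rest, low income leads to a predicted default.
    if annual_income < 80_000:
        return "Yes"
    return "No"


# Q26: owns a home and earns more than 80K
print(predict_default(True, "Married", 95_000))
# Q27: single, not a homeowner, earns less than 80K
print(predict_default(False, "Single", 60_000))
```

Each path from the root to a leaf corresponds to one classification rule, which is why such models are easy to interpret.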

Q28. Marital status and home ownership are examples of:

A. Output labels
B. Class variables
C. Input attributes
D. Noise features
Answer: C. Input attributes

Q29. What is used to test a decision tree after it is trained?

A. Training data
B. Evaluation metrics
C. Test set
D. Parameters
Answer: C. Test set

Q30. What is the process of making predictions called?

A. Induction
B. Deduction
C. Regression
D. Forecasting
Answer: B. Deduction

Q31. Which of the following is NOT typically used in a decision tree?

A. Root node
B. Leaf node
C. Branch
D. Circle loop
Answer: D. Circle loop

Q32. A classification rule such as “If A and B, then Yes” is represented in:

A. Linear Regression
B. Naïve Bayes
C. Decision Tree
D. Clustering
Answer: C. Decision Tree

Q33. The process of using training data to build a model is called:

A. Pruning
B. Testing
C. Training
D. Validating
Answer: C. Training

Q34. Which classification technique is most sensitive to outliers?

A. Random Forest
B. Decision Trees
C. K-Nearest Neighbors
D. SVM
Answer: C. K-Nearest Neighbors

Q35. Which of the following is true about overfitting?

A. The model generalizes well
B. It performs poorly on training data
C. It performs well on training data but poorly on new data
D. It increases prediction accuracy
Answer: C. It performs well on training data but poorly on new data

Q36. Pruning in decision trees helps to:

A. Increase tree depth
B. Improve training accuracy
C. Reduce overfitting
D. Add more features
Answer: C. Reduce overfitting

Q37. Which metric would you use to assess classification performance?

A. R-squared
B. Mean Absolute Error
C. Accuracy
D. Median
Answer: C. Accuracy

Q38. In credit modeling, a confusion matrix helps analyze:

A. Decision speed
B. Loan tenure
C. Model performance
D. Customer feedback
Answer: C. Model performance
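
Illustration: a confusion matrix simply counts how predicted labels line up with actual labels. The following is a minimal sketch; the eight example labels and the helper name `confusion_counts` are made up for demonstration.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive="Yes"):
    """Count true/false positives and negatives for a binary label."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            counts["TP"] += 1   # default predicted, default happened
        elif a != positive and p == positive:
            counts["FP"] += 1   # default predicted, but loan was repaid
        elif a == positive and p != positive:
            counts["FN"] += 1   # default missed by the model
        else:
            counts["TN"] += 1   # repayment predicted and observed
    return counts


# Hypothetical test-set labels for "Loan default (Yes/No)"
actual = ["Yes", "No", "No", "Yes", "No", "No", "Yes", "No"]
predicted = ["Yes", "No", "Yes", "No", "No", "No", "Yes", "No"]

counts = confusion_counts(actual, predicted)
accuracy = (counts["TP"] + counts["TN"]) / len(actual)

print(dict(counts))
print(accuracy)  # 6 correct out of 8 -> 0.75
```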

Q39. A model that has high precision but low recall:

A. Returns many false positives
B. Misses many actual positives
C. Is ideal for all cases
D. Overfits always
Answer: B. Misses many actual positives
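
Illustration: precision is TP / (TP + FP) and recall is TP / (TP + FN). The sketch below uses invented counts to show a model with high precision but low recall; the numbers are assumptions, not results from any dataset.

```python
def precision(tp: int, fp: int) -> float:
    """Share of predicted positives that are actually positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Share of actual positives that the model found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical counts: the model flags few cases, and most flags are right,
# but it misses 32 of the 40 actual positives.
tp, fp, fn = 8, 2, 32

print(precision(tp, fp))  # 8 / 10 -> 0.8
print(recall(tp, fn))     # 8 / 40 -> 0.2
```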

Q40. Decision trees handle:

A. Only numeric data
B. Only categorical data
C. Both numeric and categorical data
D. Only binary output
Answer: C. Both numeric and categorical data

Q51. Customer churn refers to:

A. Customers buying more
B. Customers canceling services or subscriptions
C. Returning customers
D. Product returns
Answer: B. Customers canceling services or subscriptions

Q52. Churn is a key concern in:

A. Manufacturing
B. Telecom and banking
C. Construction
D. Government
Answer: B. Telecom and banking

Q53. What kind of machine learning is churn prediction?

A. Unsupervised learning
B. Supervised learning
C. Reinforcement learning
D. Dimensionality reduction
Answer: B. Supervised learning

Q54. In churn prediction, the class label is usually:

A. Type of service
B. Subscription duration
C. Churn (Yes/No)
D. Age group
Answer: C. Churn (Yes/No)

Q55. A churn prediction model can help businesses to:

A. Expand manufacturing
B. Improve recruitment
C. Retain customers
D. Lower taxes
Answer: C. Retain customers

Q56. Which of these features is least useful in churn prediction?

A. Customer complaints
B. Tenure with company
C. Monthly charges
D. Customer’s height
Answer: D. Customer’s height

Q57. High monthly charges may indicate:

A. Happy customer
B. Potential churn
C. No correlation
D. Data error
Answer: B. Potential churn

Q58. Churn prediction models must be:

A. Perfectly accurate
B. Interpretable and timely
C. Expensive and slow
D. Written in Java
Answer: B. Interpretable and timely

Q59. An unbalanced dataset for churn prediction may have:

A. Equal churn and no-churn
B. Mostly churn cases
C. Mostly non-churn cases
D. Only new customers
Answer: C. Mostly non-churn cases
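
Illustration: with an unbalanced dataset, accuracy alone can be misleading. In this minimal sketch (the 95/5 split is an assumed example), a "model" that always predicts no churn scores high accuracy while catching no churners.

```python
# Hypothetical unbalanced dataset: 95 customers stayed, 5 churned.
labels = ["No"] * 95 + ["Yes"] * 5

# A trivial baseline that predicts "No" (no churn) for everyone.
predictions = ["No"] * len(labels)

correct = sum(a == p for a, p in zip(labels, predictions))
accuracy = correct / len(labels)

# Recall on the churn class: how many actual churners were flagged?
churners_caught = sum(a == "Yes" and p == "Yes" for a, p in zip(labels, predictions))
churn_recall = churners_caught / labels.count("Yes")

print(accuracy)      # 95 / 100 -> 0.95
print(churn_recall)  # 0 / 5 -> 0.0
```

This is one reason recall on the churn class is often examined alongside accuracy.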

Q60. In churn prediction, recall can matter more than:

A. F1-score
B. Precision (in some cases)
C. Support
D. Accuracy
Answer: B. Precision (in some cases)

Q61. Upselling to high-risk churn customers is:

A. Profitable strategy
B. Bad business logic
C. Churn mitigation
D. Data cleansing
Answer: C. Churn mitigation

Q62. A false negative in churn prediction means:

A. Predicting churn when it’s not
B. Missing an actual churn
C. Predicting loyalty correctly
D. None of the above
Answer: B. Missing an actual churn

Q63. The goal of churn prediction is to:

A. Hire better agents
B. Know which customers to offer retention deals to
C. Reduce server load
D. Create ads
Answer: B. Know which customers to offer retention deals to

Q64. In logistic regression used for churn, the output is:

A. A class label directly
B. A probability between 0 and 1
C. A decision tree
D. A clustering group
Answer: B. A probability between 0 and 1
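
Illustration: logistic regression passes a weighted sum of the inputs through the logistic (sigmoid) function, which always yields a value strictly between 0 and 1; a threshold then turns that probability into a class label. The coefficients, feature names, and 0.5 threshold below are hypothetical, not fitted values.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients (assumptions for illustration, not a trained model).
INTERCEPT = -1.0
WEIGHT_MONTHLY_CHARGES = 0.03   # higher charges push the churn probability up
WEIGHT_TENURE_YEARS = -0.8      # longer tenure pushes the churn probability down


def churn_probability(monthly_charges: float, tenure_years: float) -> float:
    z = (INTERCEPT
         + WEIGHT_MONTHLY_CHARGES * monthly_charges
         + WEIGHT_TENURE_YEARS * tenure_years)
    return sigmoid(z)


def churn_label(probability: float, threshold: float = 0.5) -> str:
    """Convert a probability into a Yes/No label using a cut-off."""
    return "Yes" if probability >= threshold else "No"


new_customer = churn_probability(monthly_charges=100.0, tenure_years=0.5)
loyal_customer = churn_probability(monthly_charges=40.0, tenure_years=5.0)

print(new_customer, churn_label(new_customer))
print(loyal_customer, churn_label(loyal_customer))
```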

Q65. The feature “Tenure < 1 year” might signal:

A. Loyal customer
B. High risk of churn
C. No relation
D. Data error
Answer: B. High risk of churn

Q66. A high churn rate indicates:

A. High retention
B. Many new customers
C. Many lost customers
D. None of the above
Answer: C. Many lost customers

Q67. Which action can reduce churn?

A. Ignoring complaints
B. Offering long-term discounts
C. Reducing support team
D. Increasing fees
Answer: B. Offering long-term discounts

Q68. Which of the following is a retention strategy?

A. Cancelling all low-usage users
B. Sending loyalty rewards
C. Reducing data collection
D. Lowering response times
Answer: B. Sending loyalty rewards

Q69. Segmenting customers by churn risk helps:

A. Reduce ads
B. Prioritize outreach
C. Increase complaints
D. Decrease accuracy
Answer: B. Prioritize outreach

Q70. Decision tree models for churn are valued because:

A. They predict probabilities
B. They model time series
C. They provide simple if-then rules
D. They cluster customers
Answer: C. They provide simple if-then rules

Q81. KNIME is primarily used for:

A. Video editing
B. Data analysis and machine learning
C. Web development
D. Drawing diagrams
Answer: B. Data analysis and machine learning

Q82. KNIME stands for:

A. Konstanz Information Miner
B. Known Intelligent Machine Evaluator
C. Kernel in Machine Exploration
D. Knowledge Network Infrastructure for Modelling Experiments
Answer: A. Konstanz Information Miner

Q83. In KNIME, what is a “workflow”?

A. A document file
B. A series of connected nodes
C. An Excel macro
D. A Python script
Answer: B. A series of connected nodes

Q84. Which file type is commonly used to import data in KNIME?

A. .mp3
B. .xml
C. .csv
D. .exe
Answer: C. .csv

Q85. KNIME nodes are:

A. Scripts
B. Plugins
C. Visual blocks with specific tasks
D. Databases
Answer: C. Visual blocks with specific tasks

Q86. KNIME’s visual interface allows for:

A. Text-only inputs
B. Manual script writing only
C. Drag-and-drop modeling
D. GPU rendering
Answer: C. Drag-and-drop modeling

Q87. What kind of learning does KNIME support?

A. Supervised only
B. Unsupervised only
C. Both supervised and unsupervised
D. None
Answer: C. Both supervised and unsupervised

Q88. Which programming languages does KNIME integrate with?

A. Java
B. Python
C. R
D. All of the above
Answer: D. All of the above

Q89. KNIME workflows can be saved and:

A. Shared and reused
B. Only run once
C. Sent via email as MP3
D. Uploaded to Instagram
Answer: A. Shared and reused

Q90. KNIME excels in:

A. High-end graphics
B. Text editing
C. Visual programming for data science
D. Backend development
Answer: C. Visual programming for data science

Q91. To build a decision tree in KNIME, you must:

A. Write Java code
B. Connect a decision tree learner and predictor node
C. Train a deep learning model
D. Use Excel macros
Answer: B. Connect a decision tree learner and predictor node

Q92. KNIME’s output nodes are used to:

A. Hide the data
B. Export or visualize results
C. Encrypt files
D. Delete logs
Answer: B. Export or visualize results

Q93. KNIME supports integration with:

A. Azure
B. TensorFlow
C. H2O.ai
D. All of the above
Answer: D. All of the above

Q94. What makes KNIME attractive for non-programmers?

A. Free food coupons
B. Command-line only tools
C. Visual, no-code workflow design
D. Mandatory Java knowledge
Answer: C. Visual, no-code workflow design

Q95. You can automate workflows in KNIME by:

A. Repeating steps manually
B. Using workflow loops and scheduling
C. Installing more RAM
D. None of the above
Answer: B. Using workflow loops and scheduling

Q96. Which node would you use to read an Excel file in KNIME?

A. Java Reader
B. Table Reader
C. Excel Reader
D. Data Generator
Answer: C. Excel Reader

Q97. KNIME's key advantage over coding tools is:

A. Data loss
B. Visual drag-and-drop interface
C. More bugs
D. Poor compatibility
Answer: B. Visual drag-and-drop interface

Q98. KNIME Server adds functionality such as:

A. No extra features
B. Scheduled workflow execution and collaboration
C. Video editing tools
D. Animation libraries
Answer: B. Scheduled workflow execution and collaboration

Q99. KNIME offers:

A. Paid-only plans
B. A free open-source desktop version
C. Only cloud-based usage
D. Subscription-based email
Answer: B. A free open-source desktop version

Q100. KNIME is best suited for:

A. Video streaming
B. Office automation
C. Data science and predictive analytics
D. Entertainment
Answer: C. Data science and predictive analytics
