Machine Learning: Important Questions
Unit I:
- What is the difference between Supervised and Unsupervised Learning? Provide examples of algorithms used in both types.
- What is Deep Learning, and how is it different from traditional Machine Learning?
- Explain the main challenges in Machine Learning. How do challenges like overfitting, underfitting, and data scarcity affect model performance?
- What are the trade-offs in Statistical Learning? Discuss the bias-variance trade-off and its impact on model performance.
- What are the different types of Machine Learning systems? Discuss the difference between batch learning and online learning.
- What is Empirical Risk Minimization (ERM)? How is it used in training machine learning models?
Unit II:
- Explain how Decision Trees work. What are their strengths and weaknesses compared to other classification algorithms like Naïve Bayes and SVM?
- What is the difference between Linear Regression and Logistic Regression? When would you choose one over the other?
- What are Support Vector Machines (SVM)? Explain how SVM works for binary classification and describe the roles of the hyperplane and the margin.
- How do multiclass classification and structured-output prediction differ from binary classification? Explain methods like One-vs-All (also called One-vs-Rest) and One-vs-One.
- What are Generalized Linear Models (GLMs)? Explain how they extend linear regression, through a link function, to handle response variables with different types of output distributions (for example, binary or count data).
Unit III:
- What is Ensemble Learning? Explain how Bagging and Boosting improve model performance.
- What are Random Forests? How do they combine multiple decision trees to improve predictive accuracy and reduce overfitting?
- What is the difference between Linear and Nonlinear SVM classification? How does the kernel trick help SVM handle non-linear decision boundaries?
- Explain the working of the Naïve Bayes classifier. Under what assumptions does it operate, and how do they affect its performance?
- What are Voting Classifiers? How do they combine predictions from multiple models?
- What is Boosting? Discuss popular algorithms like AdaBoost and Gradient Boosting, and explain how they reduce bias and what effect they have on variance.
- What is Stacking? How does it combine multiple classifiers, and how is it different from bagging and boosting?
Unit IV:
- What is K-Means clustering? Describe its algorithm and discuss its advantages and limitations.
- What are the main approaches for Dimensionality Reduction? Discuss techniques such as PCA, LDA, and t-SNE, and their role in reducing the complexity of high-dimensional data.
- What is Principal Component Analysis (PCA)? Explain how PCA reduces dimensionality and the importance of eigenvectors and eigenvalues in this process.
- What is Randomized PCA? How does it differ from standard PCA, and why is it considered more computationally efficient for large datasets?
- What is Kernel PCA? Explain how it extends PCA to handle non-linear transformations using kernel functions.
- How can Clustering be used for preprocessing in Semi-Supervised Learning? Discuss how clustering can help extend a small set of labels to unlabeled data for model training.
Unit V:
- What are Multi-Layer Perceptrons (MLPs)? Explain their architecture and how they are trained using backpropagation.
- Explain the basics of Artificial Neural Networks (ANNs). Describe components such as input layers, hidden layers, activation functions, and output layers.
- How would you implement a simple Multi-Layer Perceptron (MLP) using Keras? Describe the key steps involved in defining, training, and evaluating the model.
- What is the role of TensorFlow in building and training deep learning models? Discuss its features and how it differs from other frameworks like PyTorch.
- What are common preprocessing techniques used in TensorFlow for preparing data before feeding it to a neural network? Discuss normalization, data augmentation, and handling missing data.