22 Machine Learning Intermediate Python Interview Questions & Answers
Below is a list of our Machine Learning Intermediate Python interview questions, with answer advice and answer examples for each.
1. Is the SVM machine learning method supervised or unsupervised?
The technical interviewer may ask a question targeted at a particular algorithm to let you demonstrate your knowledge and take some control of the conversation in terms of what you know and what interests you about the algorithm or its specific use cases.
In this question, SVM stands for support vector machines. This is an example of a supervised machine learning technique. The algorithm can be used for both regression and classification; however, it is mainly used for classification.
SVM algorithms are particularly useful for text recognition and image classification.
In an advanced interview specifically targeted at SVM algorithms, the interviewer may ask about non-linear SVM classification; however, that is covered in the more advanced machine learning course.
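A minimal sketch of supervised SVM classification with scikit-learn, using the iris data set that appears elsewhere in this guide (the split parameters are arbitrary choices for illustration):

```python
# Supervised SVM classification sketch on the iris data set
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

clf = SVC(kernel='linear')        # linear kernel; non-linear kernels (e.g. 'rbf') also exist
clf.fit(X_train, y_train)         # supervised: the labels y_train guide training
print(clf.score(X_test, y_test))  # classification accuracy on held-out data
```

Because the labels are supplied during fitting, this is supervised learning by definition.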
Written by Ryan Brown on July 6th, 2021
2. State a real life use case of a neural network.
This question shows the developer's understanding of machine learning processes.
One of the primary benefits of training and using neural networks is their versatility and widespread real-world applications.
Neural networks can be used for predictions and forecasting, self-driving cars, and natural language processing. It is entirely possible that you have used a neural network in your day-to-day life, whether that be a self-driving car or a recommendation algorithm on a social media platform.
As you advance in your career as a machine learning engineer and data scientist you will leverage neural networks to build complex prediction algorithms and systems.
Written by Ryan Brown on July 6th, 2021
3. What is a neural network?
This question shows the developer's knowledge of machine learning terminology and algorithms.
A neural network is a computational network loosely inspired by biological neural networks. It is made up of multiple layers of perceptrons.
In the field of machine learning, neural networks are at the frontier of research and innovation. The technical interviewers will aim to determine your level of understanding of this type of machine learning algorithm.
The basic operation of a neural network is as follows:
1. Takes inputs
2. Makes predictions
3. Compares the predictions with the desired outputs
4. Adjusts its internal state to predict more accurately next time
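As a toy illustration of that predict/compare/adjust loop, here is a single linear unit (no activation function, entirely made-up data) trained with gradient descent to approximate an OR-style rule; a real network stacks many such units in layers:

```python
# Toy illustration of the predict -> compare -> adjust loop for one unit
inputs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
targets = [0.0, 1.0, 1.0, 1.0]          # learn a simple OR-like rule
w1, w2, b = 0.0, 0.0, 0.0
learning_rate = 0.1

for _ in range(100):
    for (x1, x2), target in zip(inputs, targets):
        prediction = w1 * x1 + w2 * x2 + b      # 1-2. take inputs, make a prediction
        error = prediction - target             # 3. compare with the desired output
        w1 -= learning_rate * error * x1        # 4. adjust the internal state
        w2 -= learning_rate * error * x2
        b -= learning_rate * error

print(w1 * 1 + w2 * 1 + b)  # prediction for input (1, 1) drifts toward 1
```

The adjustment rule here is plain gradient descent on the squared error; backpropagation generalizes the same idea to many layers.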
Written by Ryan Brown on July 6th, 2021
4. What is a perceptron? Use the Perceptron function from the scikit-learn library to demonstrate this technique.
This question shows the developer's knowledge of machine learning terminology.
A perceptron is a building block of a neural network. The technical interview may follow up this question with a more in-depth analysis of neural networks and their use cases.
A perceptron is made up of 3 components:
1. Input values
2. Weights and Biases
3. Activation function
Below is an example of the scikit-learn function Perceptron(). Again, we are using the famous "iris" data set to demonstrate this machine learning method.
# Perceptron
from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron

# Load the iris features (X) and labels (y)
X, y = load_iris(return_X_y=True)

# Fit a perceptron and report its accuracy on the training data
perceptron = Perceptron(tol=1e-3, random_state=0)
perceptron.fit(X, y)
perceptron.score(X, y)
Written by Tiarnan Brady on May 24th, 2021
5. Where do you source/find data in order to build machine learning algorithms?
This is an informal question that a technical interviewer may ask to gauge your level of experience and your knowledge of the systems used to source and share data.
Kaggle is a very well-known site that supplies machine learning engineers with high-quality data sets for building machine learning models.
It may also be worth creating a Kaggle profile with some examples of your work in order to demonstrate that you have an active interest in the field and are involved in unique and interesting projects.
https://www.kaggle.com/
Written by Ryan Brown on July 6th, 2021
6. Give an example of a supervised machine learning method and implement the algorithm using Python. You may use scikit-learn.
A technical interviewer will ask this question in order to determine how familiar you are with machine learning methods. It is often advised to use a data set you are familiar with.
In this example, we use the "iris" data set as it is a well-known data set within machine learning and is a good basis for the implementation of both supervised and unsupervised machine learning techniques.
It is worth selecting a data set of your liking in order to familiarize yourself with its features. This will help you practice machine learning techniques as well as provide material to demonstrate in a technical interview.
Python implementation of Naive Bayes Classifier Algorithm:
We use the famous "iris" data set. This set is particularly useful for demonstrating how machine learning methods work without running into further complexities in terms of cleaning data.
# Naive Bayes Classifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Load the iris data and hold out half of it for testing
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)

# Fit a Gaussian Naive Bayes model and predict on the test set
gnb = GaussianNB()
y_pred = gnb.fit(X_train, y_train).predict(X_test)
print("Number of points incorrectly identified:", (y_test != y_pred).sum())
Written by Tiarnan Brady on May 24th, 2021
7. Is a Naive Bayes Classifier algorithm an example of a supervised or unsupervised algorithm? State a use case for the Naive Bayes Classifier.
The technical interviewer may ask these types of questions as lead questions into a more detailed examination of this family of machine learning algorithms.
Naive Bayes Classifiers are examples of supervised algorithms. They are based on Bayes Theorem with "naive" assumptions of conditional independence of variables.
Naive Bayes Classifiers can be used in real-time predictions as well as text classification. They are particularly useful in sentiment analysis. Sentiment analysis is a part of the Natural Language Processing field within machine learning.
Written by Ryan Brown on July 6th, 2021
8. What are Naive Bayes Classifiers?
This question shows the developer's knowledge of machine learning terminology.
Naive Bayes Classifiers are a family of probabilistic classifiers. Naive Bayes Classifiers share a common principle: they are based on Bayes Theorem and rely heavily on probabilistic independence assumptions.
One of the primary benefits of using a Naive Bayes classifier is that it can outperform algorithms such as logistic regression while requiring less training data to make predictions.
One of the primary drawbacks of using Naive Bayes classifiers is that they rely heavily on the assumption of independence among the variables involved. In real-world use cases, it is almost impossible to obtain truly independent variables.
Written by Ryan Brown on July 6th, 2021
9. Define Bayes Theorem.
The technical interviewer is attempting to gauge your understanding of some of the theory behind machine learning algorithms. Bayes Theorem is a common source of questions in technical interviews.
Bayes Theorem was first demonstrated in the 18th century. It is a mathematical formula for determining conditional probability: it describes the probability of an event based on prior knowledge of the conditions related to that event.
Bayes Theorem Formula:
P(A|B) = (P(B|A) x P(A)) / P(B)
Where:
P(A) = probability of event A
P(B) = probability of event B
P(A|B) = probability of A given B
P(B|A) = probability of B given A
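A worked example helps make the formula concrete. The numbers below are invented for illustration (a rare condition and an imperfect test):

```python
# Worked Bayes Theorem example with made-up numbers:
# A = "has the condition", B = "tests positive"
p_a = 0.01              # P(A): 1% of people have the condition
p_b_given_a = 0.9       # P(B|A): the test detects 90% of true cases
p_b_given_not_a = 0.05  # false-positive rate on the 99% without it

# P(B) via the law of total probability
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes Theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b
print(round(p_a_given_b, 3))  # ~0.154
```

Despite the 90% detection rate, a positive result only implies about a 15% chance of having the condition, because the condition is rare; this counter-intuitive result is exactly what Bayes Theorem captures.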
Written by Ryan Brown on July 6th, 2021
10. What is a decision tree and when might they be used?
This question examines your knowledge of machine learning theory. The technical interviewer may ask this question in order to further test your understanding of machine learning concepts and use it as an opening question for further questions.
A decision tree is a tree-like model used to determine the possible outcomes or consequences of a decision and the probabilities associated with those outcomes. A decision tree may be used, for example, to decide on the best method of manufacturing a product in terms of cost and time.
One of the advantages of decision trees is their simple, intuitive structure. Below is an example of a decision tree; run the code in a Jupyter notebook and examine the output.
Visualization of a decision tree. Ensure that you have correctly installed sklearn in order to use the following functions:
# Decision tree visualisation
from sklearn import tree
from sklearn.datasets import load_iris

# Load the iris features and labels
iris = load_iris()
X, y = iris.data, iris.target

# Fit a decision tree classifier and plot the resulting tree
clf = tree.DecisionTreeClassifier()
clf = clf.fit(X, y)
tree.plot_tree(clf)
Written by Tiarnan Brady on May 24th, 2021
11. Demonstrate how a linear SVM works and give an example of a real world use case.
This question shows the developer's knowledge of machine learning terminology and its purpose.
SVM stands for support vector machines; they are supervised machine learning algorithms. This question examines your knowledge of the applications of machine learning algorithms.
SVMs are particularly useful for classification tasks such as text classification or image classification.
The code below demonstrates and visualizes the basic theory behind SVM algorithms.
# SVM visualisation
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
from sklearn import svm

# Configuration for the synthetic data set
random_seed = 42
centers = [(0, 0), (5, 5)]
cluster_std = 1
frac_test_split = 0.2
num_features_for_samples = 2
num_samples_total = 1500

# Generate two well-separated clusters of points
inputs, targets = make_blobs(n_samples=num_samples_total, centers=centers,
                             n_features=num_features_for_samples, cluster_std=cluster_std)
x_train, x_test, y_train, y_test = train_test_split(inputs, targets,
                                                    test_size=frac_test_split, random_state=random_seed)

# Fit a linear SVM and highlight its support vectors in red
clf = svm.SVC(kernel='linear')
clf = clf.fit(x_train, y_train)
support_vectors = clf.support_vectors_
plt.scatter(x_train[:, 0], x_train[:, 1])
plt.scatter(support_vectors[:, 0], support_vectors[:, 1], color='red')
plt.show()
Written by Tiarnan Brady on May 24th, 2021
12. What is gradient descent and how is it used in Linear Regression?
This question shows the developer's knowledge of machine learning algorithms.
A technical interviewer may present you with some dummy data and ask you to write a specific machine learning algorithm.
When a technical interviewer asks such a question, they are expecting that you first explain the theory behind the algorithm either using a flow chart or pseudo code.
Linear Regression is an example of a supervised machine learning technique. It is used to predict continuous values rather than classify data into categories.
Gradient descent is an optimization algorithm used in machine learning to iteratively minimize a given function. This improves the accuracy of the predictions when using machine learning methods.
The gradient_descent function below takes 4 arguments:
1. x values
2. y values
3. Learning rate
4. Number of iterations
import matplotlib.pyplot as plt

# Plot the sample data
months = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
revenue_generated = [48, 73, 78, 91, 105, 113, 123, 127, 139, 150]
plt.plot(months, revenue_generated, "o")
plt.show()

# Linear regression - gradient descent function
def gradient_descent(x, y, learning_rate, num_iterations):
    b = 0
    m = 0
    for i in range(num_iterations):
        b, m = step_gradient(b, m, x, y, learning_rate)
    return [b, m]

The full code is below; run it in an IDE to see the results:
import matplotlib.pyplot as plt

def get_gradient_b(x, y, b, m):
    # Partial derivative of the mean squared error with respect to the intercept b
    N = len(x)
    diff = 0
    for i in range(N):
        x_val = x[i]
        y_val = y[i]
        diff += (y_val - ((m * x_val) + b))
    b_gradient = -(2 / N) * diff
    return b_gradient

def get_gradient_m(x, y, b, m):
    # Partial derivative of the mean squared error with respect to the slope m
    N = len(x)
    diff = 0
    for i in range(N):
        x_val = x[i]
        y_val = y[i]
        diff += x_val * (y_val - ((m * x_val) + b))
    m_gradient = -(2 / N) * diff
    return m_gradient

def step_gradient(b_current, m_current, x, y, learning_rate):
    # Take one gradient descent step for both parameters
    b_gradient = get_gradient_b(x, y, b_current, m_current)
    m_gradient = get_gradient_m(x, y, b_current, m_current)
    b = b_current - (learning_rate * b_gradient)
    m = m_current - (learning_rate * m_gradient)
    return [b, m]

def gradient_descent(x, y, learning_rate, num_iterations):
    # Repeatedly step toward the minimum of the error function
    b = 0
    m = 0
    for i in range(num_iterations):
        b, m = step_gradient(b, m, x, y, learning_rate)
    return [b, m]

# Fit the line and plot it against the data
months = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
revenue_generated = [48, 73, 78, 91, 105, 113, 123, 127, 139, 150]
b, m = gradient_descent(months, revenue_generated, 0.001, 10000)
y = [m * x + b for x in months]
plt.plot(months, revenue_generated, "o")
plt.plot(months, y)
plt.show()
Written by Tiarnan Brady on May 24th, 2021
13. What is a confusion matrix and how are they used in machine learning?
This question shows the developer's understanding of machine learning structures.
A confusion matrix is sometimes referred to as an "error matrix". It is a tabular layout that allows for the visualization of the performance of an algorithm.
It is an NxN matrix that can be best leveraged when analyzing a classification algorithm. These structures can help engineers calculate metrics such as the recall, precision, and accuracy of an algorithm.
Most confusion matrices are comprised of:
1. TRUE POSITIVES
2. FALSE POSITIVES
3. FALSE NEGATIVES
4. TRUE NEGATIVES
This makes them particularly useful for analyzing classification algorithms.
A technical interviewer will want to see that you are familiar with such structures and can build them using Python. Having knowledge of confusion matrices will increase your effectiveness when working with the wider engineering team.
Run the following code in a Jupyter notebook and examine the output:
# Confusion matrix visualisation
import pandas as pd
import seaborn as sn
import matplotlib.pyplot as plt

# Sample actual vs predicted labels
data = {'actual': [1, 0, 1, 1, 1, 0, 1, 0], 'predicted': [1, 1, 1, 0, 1, 1, 1, 0]}
df = pd.DataFrame(data, columns=['actual', 'predicted'])

# Cross-tabulate the labels and plot the matrix as a heatmap
confusion_matrix = pd.crosstab(df['actual'], df['predicted'], rownames=['Actual'], colnames=['Predicted'])
sn.heatmap(confusion_matrix, annot=True)
plt.show()
Written by Tiarnan Brady on May 24th, 2021
14. What is the F1 score of an algorithm? And demonstrate how it may be calculated?
This interview question concentrates on machine learning recall metrics.
The F1 score is essentially the balance between precision and recall. The technical interviewer wants to examine your ability to interpret multiple metrics together and understand their relationship in order to effectively analyze the performance of a machine learning algorithm.
F1 equation:
F1 = 2 * ((precision * recall) / (precision + recall))
These questions may provide the opportunity to elaborate and explain what the precision and recall metrics reveal about a machine learning algorithm.
Below is Python code that leverages the sklearn library to calculate the F1 score:
# Calculating the f1_score
from sklearn.metrics import f1_score

y_true = [0, 1, 2, 0, 1, 2, 5, 6, 8, 10]
y_prediction = [0, 2, 1, 0, 0, 1, 5, 3, 1, 9]

# 'macro' averages the per-class F1 scores without weighting
f1_score(y_true, y_prediction, average='macro')
Written by Tiarnan Brady on May 24th, 2021
15. Define recall and use a scikit learn library to demonstrate how you would calculate this metric.
This interview question focuses on machine learning metrics and functions.
Recall is a measure of how well the model identifies True Positives. The formula used to derive recall is:
Recall = (TRUE POSITIVES) / (TRUE POSITIVES + FALSE NEGATIVES)
In summary, recall is the metric that allows us to determine how accurate our algorithm was in identifying the relevant data.
The recall_score function from the sklearn.metrics library is a particularly useful function for calculating recall.
The y_true array represents the correct values; the y_prediction array represents a sample of predictions from the algorithm.
The recall_score function takes 3 parameters: the y_true array, the y_prediction array, and the average setting. The average setting controls how the per-class results are combined; for example, the "macro" average is an unweighted mean of the per-class recall scores.
# Simple implementation of the recall calculation
from sklearn.metrics import recall_score

y_true = [0, 1, 2, 0, 1, 2, 5, 6, 8, 10]
y_prediction = [0, 2, 1, 0, 0, 1, 5, 3, 1, 9]
recall_score(y_true, y_prediction, average='macro')
Written by Tiarnan Brady on May 24th, 2021
16. Define and calculate the precision of the algorithm. How does this differ from the accuracy of the algorithm?
This question focuses on the developer's understanding of machine learning metrics and limitations.
Precision is another performance metric used to analyze machine learning algorithms.
Precision is the ratio of correctly predicted positive observations to the total number of predicted positive observations.
Precision = (TRUE POSITIVES) / (TRUE POSITIVES + FALSE POSITIVES)
Precision is represented as a decimal number between 0 and 1. A high precision corresponds to a low false-positive rate: examining the equation above, the denominator grows as the number of false positives increases, which lowers the precision value.
Precision differs from accuracy in that accuracy is best suited to symmetric data sets where the numbers of false positives and false negatives are roughly equal. The interviewer wants to give you the opportunity to demonstrate your knowledge of multiple performance metrics and your understanding of each metric's limitations.
Multiple performance metrics are often calculated to analyze the performance of a given algorithm.
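As an illustrative sketch, precision can be calculated with scikit-learn's precision_score function; the label arrays below are made up for demonstration:

```python
# Calculating precision with scikit-learn (sample binary labels)
from sklearn.metrics import precision_score

y_true = [0, 1, 1, 0, 1, 1, 0, 0]
y_prediction = [0, 1, 1, 1, 0, 1, 1, 0]

# 3 true positives out of 5 predicted positives = 0.6
print(precision_score(y_true, y_prediction))
```

Computing precision alongside recall and F1 on the same predictions gives a much fuller picture of performance than accuracy alone.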
Written by Ryan Brown on July 6th, 2021
17. What is meant when referring to the 'accuracy' of a machine learning algorithm? And how might you calculate the accuracy of an algorithm?
The technical interviewer is examining your knowledge of algorithm performance analysis when asking this question. In machine learning, it is important to know not only how to write algorithms but how to analyze their performance to determine if their predictions are reliable.
The 'Accuracy' of an algorithm is sometimes referred to as "classification accuracy". It is simply the ratio of the number of correct predictions to the total number of input samples.
Accuracy = (Number of Correct Predictions) / (Total Number of Predictions made)
Accuracy = (TRUE POSITIVES + TRUE NEGATIVES) / (TRUE POSITIVES + FALSE POSITIVES + FALSE NEGATIVES + TRUE NEGATIVES)
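As a quick sketch, accuracy can be calculated with scikit-learn's accuracy_score function; the label arrays below are made up for demonstration:

```python
# Calculating classification accuracy with scikit-learn (sample labels)
from sklearn.metrics import accuracy_score

y_true = [0, 1, 1, 0, 1, 1, 0, 0]
y_prediction = [0, 1, 1, 1, 0, 1, 1, 0]

# 5 correct predictions out of 8 total = 0.625
print(accuracy_score(y_true, y_prediction))
```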
Accuracy is probably the simplest metric used to analyze machine learning algorithms. The interviewer may ask this question as a "starter" question in order to further examine performance metrics such as:
1. Logarithmic Loss
2. F1 scores
3. Precision
Written by Ryan Brown on July 6th, 2021
18. When might you use the k-nearest neighbor algorithm (KNN)?
This interview question concentrates on machine learning algorithms and the use of calculation.
The k-nearest neighbors algorithm is a supervised machine learning algorithm. It can be used to solve both regression and classification problems, making it a particularly versatile algorithm.
An example of a potential use case for KNN is to classify a student within a class as to whether they will pass a test. The various classes could be: "Will Pass", "Will Fail", "Will Pass in the top 10%", "Will fail in the bottom 10%".
A student's profile or data points can then undergo KNN and the student can be classified based on other members of the class or legacy data.
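The student example could be sketched with scikit-learn's KNeighborsClassifier; the features (hours studied, previous score) and labels below are entirely hypothetical:

```python
# Sketch of the student pass/fail idea with k-nearest neighbors
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical data: [hours studied, previous score] per student
X = [[2, 40], [3, 45], [5, 60], [8, 75], [9, 80], [10, 90]]
y = ["Will Fail", "Will Fail", "Will Fail", "Will Pass", "Will Pass", "Will Pass"]

# Classify a new student based on their 3 nearest neighbours
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X, y)
print(knn.predict([[7, 70]]))
```

The new student's profile is compared with existing class members, and the majority label among the k closest profiles becomes the prediction.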
Written by Ryan Brown on July 6th, 2021
19. Demonstrate the k-nearest neighbor algorithm (KNN) using the following data
This interview question concentrates on machine learning algorithms and the use of calculation.
The k-nearest neighbors (KNN) algorithm can be split into three parts:
1. Calculate the Euclidean Distance
2. Determine the nearest neighbors
3. Make a prediction
The Euclidean distance simply refers to the distance between two points in Euclidean space. It is calculated using Cartesian coordinates.
The following link contains the data that we will use in the examples below:
https://raw.githubusercontent.com/jbrownlee/Datasets/master/iris.csv
# STEP 1: Calculate the Euclidean distance
from math import sqrt

def euclidean_distance(a, b):
    # Sum the squared differences over every feature (the last column is the label)
    distance = 0.0
    for i in range(len(a) - 1):
        distance += (a[i] - b[i]) ** 2
    return sqrt(distance)

# STEP 2: Determine the nearest neighbours
def calc_neighbours(train, test_row, number_neighbours):
    # Compute the distance from the test row to every training row
    distances = list()
    for train_row in train:
        distance = euclidean_distance(test_row, train_row)
        distances.append((train_row, distance))
    # Sort by distance and keep the closest rows
    distances.sort(key=lambda tup: tup[1])
    neighbours = list()
    for i in range(number_neighbours):
        neighbours.append(distances[i][0])
    return neighbours

# STEP 3: Make a prediction
def predict(train, test_row, number_neighbours):
    # Predict the most common class among the nearest neighbours
    neighbours = calc_neighbours(train, test_row, number_neighbours)
    output_values = [row[-1] for row in neighbours]
    prediction = max(set(output_values), key=output_values.count)
    return prediction
Written by Tiarnan Brady on May 24th, 2021
20. What is the difference between linear regression and multiple linear regression?
The technical interviewer may ask this question to determine your understanding of the details of both simple linear and multiple linear regression. This question may lead to a further discussion on supervised machine learning techniques.
Simple linear regression has only one x variable and one y variable. Multiple linear regression has one y variable but two or more x variables. The core concepts of gradient descent remain the same for both techniques. They are also both supervised techniques used to predict continuous variables.
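A brief sketch of the difference using scikit-learn's LinearRegression; the data is made up, with the multiple-regression targets constructed from y = x1 + 2*x2 + 3:

```python
# Simple vs multiple linear regression with scikit-learn (made-up data)
from sklearn.linear_model import LinearRegression

# Simple: one x variable
X_simple = [[1], [2], [3], [4]]
y_simple = [3, 5, 7, 9]                  # y = 2x + 1
simple = LinearRegression().fit(X_simple, y_simple)

# Multiple: two x variables, still one y variable
X_multi = [[1, 2], [2, 1], [3, 4], [4, 3]]
y_multi = [8, 7, 14, 13]                 # y = x1 + 2*x2 + 3
multi = LinearRegression().fit(X_multi, y_multi)

print(simple.coef_)  # one coefficient
print(multi.coef_)   # one coefficient per x variable
```

The fitting interface is identical; multiple linear regression simply learns one coefficient per feature column.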
Written by Ryan Brown on July 6th, 2021
21. Outline the basic steps or stages you use when implementing a machine learning algorithm
The technical interviewer is attempting to determine your thought process and logic when approaching a problem.
It is important to clearly outline and elaborate on each point to show the interviewer that you can effectively communicate your thinking to other members of the development team. Communication skills are often overlooked when preparing for technical interviews; however, they are vital for a developer role, and the interviewer will rank you much higher if you have strong communication skills.
Below is an example of how to answer such a question. This acts as a guide but you can add or subtract any details that are specific to your process or logic:
1. Determine the question you would like to answer
2. Gather and understand the data that you have access to
3. Clean the data and carry out feature engineering
4. Choose a machine learning model that will be best suited to your requirements and provide the necessary insights
5. Evaluate the model and tune different parameters to increase performance
6. Determine the accuracy, precision, recall, or F1 of the algorithm
7. Present your findings and insights
Your method may be a variation of this process; however, the core principles remain constant.
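The steps above can be sketched end-to-end with scikit-learn. This is a compressed illustration only, using the iris data set as a stand-in and a decision tree as an arbitrary model choice:

```python
# End-to-end sketch of the workflow, with iris standing in for real data
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Step 2: gather and understand the data (iris is already clean, so step 3 is a no-op)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Steps 4-5: choose a model and tune its parameters (max_depth here)
model = DecisionTreeClassifier(max_depth=3)
model.fit(X_train, y_train)

# Steps 6-7: evaluate the model and report the findings
y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
```

In a real project, steps 1-3 (framing the question, gathering and cleaning data) usually dominate the effort; the modelling lines are the smallest part of the workflow.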
Written by Ryan Brown on July 6th, 2021
22. Implement the train/test model for splitting data using the scikit-learn Python library
This question shows the developer's understanding of model implementation.
When training machine learning models it is important to separate your data set into:
1. Train set
2. Test set
Usually, the train set accounts for 80% of the original data set and the test set accounts for 20%. The split is used so that no training data is used when testing the algorithm; otherwise the test result could be skewed because the algorithm has already "seen" that data.
Ensure that you have installed scikit-learn via the command line using the following command:
pip install scikit-learn
You can use your favorite IDE to run the code; Jupyter notebook is a common choice amongst data scientists.
The interviewer wants to determine if you understand why the train/test split is necessary and how to implement it using Python.
The sklearn.model_selection library is used to import the train_test_split() function. This allows us to separate our data set into train and test sets.
The train_test_split() function takes 4 arguments:
1. features data
2. labels data
3. test_size (the proportion of the overall data set that is held out as the test set)
4. random_state (this controls the shuffling of the data before it is split into train and test sets)
The code below outlines the train/test split using Python's scikit-learn library.
If you run this code in your IDE you will see that the features data will be randomly split 80/20 into a features_train set and a features_test set.
from sklearn.model_selection import train_test_split

# Sample features and labels
features = [1, 2, 3, 4, 5]
labels = [6, 7, 8, 9, 10]

# Hold out 20% of the data as the test set
features_train, features_test, labels_train, labels_test = train_test_split(
    features, labels, test_size=0.2, random_state=42)
print(features_train)
print(features_test)
Written by Tiarnan Brady on May 24th, 2021