I’m working on a Python exercise and need support.

Program 1

With the Pima Indian diabetes data set, check how appropriate the data is for the multiple regression technique. The scikit-learn example on multiple linear regression discussed in the lecture can be used with this data file (you have to add column labels to the data file, and it is probably easiest to save it as a CSV file in Excel with a comma as the delimiter). Check the predictions as well as the metric “root mean squared error vs. mean of the y values”, and explain your thoughts either via comments or print statements in your program.
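A rough sketch of how the Program 1 check might look, since the lecture example itself is not shown here. The function name `evaluate_multiple_regression`, the 80/20 split, and the returned dictionary keys are my own assumptions, not part of the assignment. It fits scikit-learn's `LinearRegression` on all feature columns and reports the root mean squared error next to the mean of the y values.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split


def evaluate_multiple_regression(df, target, test_size=0.2, random_state=0):
    """Fit a multiple linear regression and compare RMSE with the mean of y.

    `df` is a DataFrame with labelled columns; `target` is the label of the
    column to predict (hypothetical helper, not from the lecture).
    """
    X = df.drop(columns=[target])  # every other column is used as a feature
    y = df[target]

    # Hold out part of the rows so predictions are checked on unseen data.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )

    model = LinearRegression().fit(X_train, y_train)
    y_pred = model.predict(X_test)

    rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
    mean_y = float(y_test.mean())

    return {
        "rmse": rmse,
        "mean_y": mean_y,
        # A large ratio suggests the error is big relative to typical y values.
        "rmse_over_mean": rmse / mean_y if mean_y else float("nan"),
        "predictions": pd.DataFrame(
            {"actual": y_test.to_numpy(), "predicted": y_pred}
        ),
    }
```

To use it, something like `df = pd.read_csv("pima.csv")` followed by `evaluate_multiple_regression(df, "Outcome")` should work, where the file name and the `Outcome` label are placeholders for whatever you chose when adding the column labels. If the 0/1 diabetes column is used as y, the predictions will be continuous values rather than 0 or 1, which is worth commenting on when judging how appropriate the technique is.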

Program 2

Much like hierarchical clustering, scikit-learn has algorithms for K-means clustering. You can find an example that explains how to use K-means in detail at: https://stackabuse.com/k-means-clustering-with-scikit-learn/ . The first part of this blog explains the K-means algorithm itself, and the second part illustrates how to use scikit-learn to achieve this with a hard-coded data set.

As a first step, this assignment is to understand how this all works with scikit-learn by following the example on the web site above, downloading it and running it in a program. Since multiple plots are drawn on the same figure window, it would help to create a figure window for every plot (or create subplots, whichever is easier) to see how each plot differs from the previous one.
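One possible way to get each plot into its own panel is sketched below. This is not the blog's code: the `demo_points` array is made-up illustrative data, and the function name `cluster_and_plot` and the three-panel layout are my own assumptions. It runs scikit-learn's `KMeans` and draws the raw points, the points coloured by cluster, and the clusters with their centroids as separate subplots.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Made-up 2-D points for illustration only (not the data set from the blog).
demo_points = np.array(
    [[1.0, 2.0], [2.0, 1.0], [1.5, 1.8], [8.0, 9.0], [9.0, 8.0], [8.5, 9.5]]
)


def cluster_and_plot(points, n_clusters=2, random_state=0):
    """Run K-means and draw each stage in its own subplot (hypothetical helper)."""
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    labels = kmeans.fit_predict(points)   # cluster index for every point
    centers = kmeans.cluster_centers_     # one row per centroid

    # One row of three panels instead of drawing everything on one axes.
    fig, axes = plt.subplots(1, 3, figsize=(15, 4))

    axes[0].scatter(points[:, 0], points[:, 1])
    axes[0].set_title("Raw data")

    axes[1].scatter(points[:, 0], points[:, 1], c=labels, cmap="rainbow")
    axes[1].set_title("Points coloured by cluster")

    axes[2].scatter(points[:, 0], points[:, 1], c=labels, cmap="rainbow")
    axes[2].scatter(centers[:, 0], centers[:, 1], color="black", marker="x", s=100)
    axes[2].set_title("Clusters with centroids")

    fig.tight_layout()
    return labels, centers, fig
```

Calling `cluster_and_plot(demo_points)` and then `plt.show()` should display the three panels side by side. If separate windows are preferred over subplots, calling `plt.figure()` before each plot is the other option the assignment mentions.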

Clustering is really a mechanism where you try to find relationships in the data. Regression techniques do the same, although when we think of linear regression it’s strictly trying to determine how closely one variable follows another. The