Problem 1 – Decision Trees (20 points)
Pull the data from https://archive.ics.uci.edu/ml/datasets/Credit+Approval
Create a decision tree to determine if credit should be extended based on a test case.
Grading criteria: Demonstrate that you evaluated the data set and applied adequate preprocessing to the data
Make sure you comment your code and the cleaning process so we can follow your logic in grading
Provide a confusion matrix for your results. Text based is fine.
Provide a visualization with explanation that demonstrates logical evaluation of the model
Actual accuracy can depend on how you split the training and test data and on other random variations. If you get below 70% accuracy, there may be a problem with your model.
Spoiler Alert: If you don’t start with some exploration to
determine how to approach data cleaning, this will be more difficult
than it should be.
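The steps above can be sketched as follows. This is only a rough outline, not a reference solution: it assumes the raw file marks missing values with "?" and labels approvals with "+", and it runs on a small synthetic frame so that it works offline. The column names (age, employed, income, label) and the helper clean_credit_frame are hypothetical stand-ins, not the real attribute names.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text


def clean_credit_frame(df, label_col):
    """Return (X, y) for a decision tree. Assumes '?' marks missing values."""
    df = df.replace("?", np.nan)
    y = (df[label_col] == "+").astype(int)  # assumed encoding: '+' means approved
    X = df.drop(columns=[label_col])
    for col in X.columns:
        as_number = pd.to_numeric(X[col], errors="coerce")
        if as_number.notna().sum() == X[col].notna().sum():
            # Every non-missing entry parsed as a number: impute with the median.
            X[col] = as_number.fillna(as_number.median())
        else:
            # Otherwise treat as categorical: impute with the most common value.
            X[col] = X[col].fillna(X[col].mode().iloc[0])
    # One-hot encode the categorical columns so the tree sees only numbers.
    return pd.get_dummies(X, dtype=float), y


# Synthetic stand-in for the real data (hypothetical columns and rule).
rng = np.random.default_rng(0)
n = 200
income = rng.normal(50, 15, n).round(1)
employed = rng.choice(["t", "f"], n)
demo = pd.DataFrame({
    "age": rng.normal(35, 10, n).round(1).astype(str),
    "employed": employed,
    "income": income,
    "label": np.where((income > 45) & (employed == "t"), "+", "-"),
})
demo.loc[demo.index % 25 == 0, "age"] = "?"  # inject missing values

X, y = clean_credit_frame(demo, "label")
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

cm = confusion_matrix(y_test, tree.predict(X_test))  # rows: actual, columns: predicted
print(cm)
print(export_text(tree, feature_names=list(X.columns)))  # text view of the splits
```

The text output of export_text is one way to show that the splits are logical. A shallow max_depth keeps the tree readable, but the right depth for the real data is something to explore rather than assume.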
Put the explanation of your model here:
Problem 2 – K-means (10 points)
Use the Ecoli dataset at https://archive.ics.uci.edu/ml/datasets/Ecoli
Ignore the label and create clusters using k values between 4 and 6.
Pick the best k value and explain why you picked it
Show any calculations you used to pick the best cluster
Create two visualizations:
One colors the nodes with the cluster membership
The other colors the nodes based on the actual label
Grading criteria: Adequately describe how to pick the best cluster and successfully create the required visualizations
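One possible way to compare the k values is sketched below. It uses the silhouette score as the selection rule, which is an assumption on our part; the course may expect a different calculation, such as an elbow plot of inertia. Synthetic blobs with 7 numeric features stand in for the Ecoli measurements so the sketch runs offline.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the Ecoli features; the labels are discarded.
features, _ = make_blobs(n_samples=300, centers=5, n_features=7, random_state=0)
scaled = StandardScaler().fit_transform(features)  # put features on one scale

scores = {}
for k in (4, 5, 6):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(scaled)
    scores[k] = {
        "inertia": model.inertia_,  # within-cluster sum of squares
        "silhouette": silhouette_score(scaled, model.labels_),
    }
    print(k, scores[k])

# Assumed rule: the highest mean silhouette marks the best-separated clustering.
best_k = max(scores, key=lambda k: scores[k]["silhouette"])
print("best k by silhouette:", best_k)
```

For the two visualizations, the same 2-D projection of the points can be drawn twice, once colored by model.labels_ and once by the actual label.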
Provide an explanation of your model:
Problem 3 – Support Vector Machines (10 points)
Use the Iris training set
Explore the data to find the best two features to use. We are mostly doing this so we can visualize the results.
Split the data set into 80% training and 20% testing
Create a SVM to model the data
Create a visualization that shows the line and the margins
Create another visualization that shows the decision surface. Do not include the test data points.
Randomly select 10 test points and add them to the visualization. Color them based on their label
Are the random test points consistently on the correct side of the line?
Predict the label for ALL of the test data. Show a confusion matrix.
Calculate the F1 measure
Grading criteria: The SVM graphically appears to use a reasonable line
F1 measure is consistent with what we showed in class
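The numerical side of these steps might look like the sketch below. It makes several assumptions that are not in the assignment: the two petal measurements are used as the features (the exploration step may point elsewhere), the kernel is linear, and the F1 measure is macro-averaged over the three classes. Plotting is left out; the computed boundaries, margins, and grid can be handed to any plotting library.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix, f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

iris = load_iris()
X = iris.data[:, 2:4]  # assumed choice: petal length and petal width
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)
svm = SVC(kernel="linear", C=1.0).fit(X_train, y_train)

# With three classes SVC trains one classifier per class pair, so coef_ has
# three rows. Each boundary is w[0]*x1 + w[1]*x2 + b = 0, and its margins
# are the parallel lines where that expression equals +1 and -1.
margin_widths = [2.0 / np.linalg.norm(w) for w in svm.coef_]
for w, b, width in zip(svm.coef_, svm.intercept_, margin_widths):
    print("w =", w, "b =", b, "margin width =", width)

# Decision surface: predicted class on a grid over the training data only.
x1 = np.linspace(X_train[:, 0].min() - 0.5, X_train[:, 0].max() + 0.5, 200)
x2 = np.linspace(X_train[:, 1].min() - 0.5, X_train[:, 1].max() + 0.5, 200)
xx, yy = np.meshgrid(x1, x2)
surface = svm.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

# Ten random test points, checked against their true labels.
rng = np.random.default_rng(0)
sample_idx = rng.choice(len(X_test), size=10, replace=False)
sample_correct = svm.predict(X_test[sample_idx]) == y_test[sample_idx]
print("random test points on the correct side:", int(sample_correct.sum()), "of 10")

# Full test set: confusion matrix and F1 (macro average is an assumption).
y_pred = svm.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
f1 = f1_score(y_test, y_pred, average="macro")
print(cm)
print("macro F1:", f1)
```

If the class defined F1 for a two-class problem, restricting the data to two species would give a single line and a single F1 value instead of the three pairwise boundaries shown here.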
Explanation of your model: