- KNN = k-nearest neighbors
- It predicts the label of a new data point from the labeled data points closest to it
- Example:
- k is a constant chosen in advance: the number of nearest neighbors that get a vote
- In this example, k = 3
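
| Name | Age | Gender | Sport | Distance |
|---|---|---|---|---|
| Ajay | 32 | 0 | Football | 9 |
| Mark | 40 | 0 | Neither | 17 |
| Sara | 16 | 1 | Cricket | 7 |
| Zaira | 34 | 1 | Cricket | 66 |
| Sachin | 55 | 0 | Neither | 32 |
| Rahul | 40 | 0 | Cricket | 17 |
| Pooja | 20 | 1 | Neither | 77 |
| Smith | 15 | 0 | Cricket | 8 |
| Laxmi | 55 | 1 | Football | 8 |
| Michael | 15 | 0 | Football | 8 |

- New point to classify: Shayan, Age 23, Gender 0 (Sport unknown)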
- Calculate the Euclidean distance from Shayan to each person
- d = sqrt((x1-x2)^2 + (y1-y2)^2)
- Distance for Ajay (x = Age, y = Gender):
- d = sqrt((23-32)^2 + (0-0)^2)
- = sqrt(81) = 9
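- Find the k = 3 smallest distances:
- 7, 8, 8
- If several distances are equal, compare the ages to break the tie
- Cricket is predicted, because it is the majority sport among the three nearest neighbors.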
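
The neighbor vote described above can be sketched in a few lines of Python. This is a minimal sketch rather than a definitive implementation: the `rows` list copies the Name, Sport, and Distance columns from the table as given, the helper name `knn_vote` is hypothetical, and equal distances are broken by table order, which is a simplification of the age check mentioned in the notes.

```python
from collections import Counter

# (name, sport, distance) copied from the table in the notes.
rows = [
    ("Ajay", "Football", 9),
    ("Mark", "Neither", 17),
    ("Sara", "Cricket", 7),
    ("Zaira", "Cricket", 66),
    ("Sachin", "Neither", 32),
    ("Rahul", "Cricket", 17),
    ("Pooja", "Neither", 77),
    ("Smith", "Cricket", 8),
    ("Laxmi", "Football", 8),
    ("Michael", "Football", 8),
]


def knn_vote(rows, k):
    """Return the k nearest rows and the most common sport among them.

    Assumption: sorted() is stable, so rows with equal distance keep
    their table order. This is a simple tie-break, not the age-based
    one from the notes.
    """
    nearest = sorted(rows, key=lambda row: row[2])[:k]
    votes = Counter(sport for _, sport, _ in nearest)
    return nearest, votes.most_common(1)[0][0]


nearest, predicted = knn_vote(rows, k=3)
nearest_distances = [distance for _, _, distance in nearest]
print(nearest_distances)  # [7, 8, 8]
print(predicted)          # Cricket
```

With k = 3 the neighbors are Sara, Smith, and Laxmi: two votes for Cricket and one for Football.
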
- If another feature is added, the distance becomes three-dimensional: one more squared difference is added under the square root.
- d = sqrt((x1-x2)^2 + (y1-y2)^2 + (z1-z2)^2)
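
The distance formulas above follow one pattern: sum the squared differences over every feature, then take the square root. A minimal sketch, assuming each point is a tuple of numbers; the function name `euclidean_distance` and the three-feature points are hypothetical and only illustrate the 3D case.

```python
import math


def euclidean_distance(p, q):
    """Euclidean distance between two points with the same number of features."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of features")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Shayan (Age 23, Gender 0) vs. Ajay (Age 32, Gender 0), as in the notes.
d_ajay = euclidean_distance((23, 0), (32, 0))
print(d_ajay)  # 9.0

# Hypothetical points with a third feature to show the 3D formula.
d_3d = euclidean_distance((1, 2, 2), (3, 5, 8))
print(d_3d)  # 7.0
```

On Python 3.8 or later, the standard library's `math.dist(p, q)` computes the same quantity.
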
- # coding: utf-8
- # In[14]:
- #KNN Algorithm Implementation
- # In[15]:
- import numpy as np
- import matplotlib.pyplot as plt
- import pandas as pd
- # In[16]:
- url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
- # Column names as an ordered list; a set would not preserve the column order
- names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'Class']
- dataset = pd.read_csv(url, names=names)
- # In[17]:
- dataset.head()
- # In[19]:
- # Features: every column except the last; labels: the Class column (index 4)
- X = dataset.iloc[:, :-1].values
- y = dataset.iloc[:, 4].values
- # In[ ]:
- from sklearn.model_selection import train_test_split
- # Hold out 20% of the rows as a test set
- X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20)
- # In[ ]:
- from sklearn.preprocessing import StandardScaler
- # Fit the scaler on the training data only, then apply it to both splits
- scaler = StandardScaler()
- scaler.fit(X_train)
- X_train = scaler.transform(X_train)
- X_test = scaler.transform(X_test)
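- # In[ ]:
- from sklearn.neighbors import KNeighborsClassifier
- # The cell that creates the classifier is missing from these notes;
- # n_neighbors=5 (scikit-learn's default) is an assumption.
- classifier = KNeighborsClassifier(n_neighbors=5)
- classifier.fit(X_train, y_train)
- # In[ ]:
- y_pred = classifier.predict(X_test)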
- # In[ ]:
- from sklearn.metrics import classification_report, confusion_matrix
- print(classification_report(y_test, y_pred))
- print(confusion_matrix(y_test, y_pred))
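
To make the printed confusion matrix easier to read, here is a small pure-Python sketch of how such a matrix is counted. The labels below are made up for illustration and are not results from the Iris run; the helper name `confusion_matrix_counts` is hypothetical. Rows are true classes and columns are predicted classes, in sorted label order, which matches scikit-learn's convention.

```python
def confusion_matrix_counts(y_true, y_pred):
    """Count (true, predicted) pairs; rows = true class, columns = predicted."""
    labels = sorted(set(y_true) | set(y_pred))
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[predicted_label]] += 1
    return labels, matrix


# Hypothetical labels, chosen only to show one misclassification.
y_true_demo = ["Iris-setosa", "Iris-setosa", "Iris-versicolor",
               "Iris-versicolor", "Iris-virginica", "Iris-virginica"]
y_pred_demo = ["Iris-setosa", "Iris-setosa", "Iris-versicolor",
               "Iris-virginica", "Iris-virginica", "Iris-virginica"]

labels, matrix = confusion_matrix_counts(y_true_demo, y_pred_demo)
for label, row in zip(labels, matrix):
    print(label, row)

# Correct predictions sit on the diagonal.
correct = sum(matrix[i][i] for i in range(len(labels)))
accuracy = correct / len(y_true_demo)
print(accuracy)
```

Any off-diagonal count is a misclassification: here one Iris-versicolor sample was predicted as Iris-virginica.
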