{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"provenance":[],"authorship_tag":"ABX9TyMOceJDJDYI5M3GHowbGW9D"},"kernelspec":{"name":"python3","display_name":"Python 3"},"language_info":{"name":"python"}},"cells":[{"cell_type":"markdown","source":["# **Assignment 4 | K-Means Clustering Implementation**"],"metadata":{"id":"IjAlnsiFkXvv"}},{"cell_type":"markdown","source":["## K-Means Clustering"],"metadata":{"id":"4Az2jUE2kjFA"}},{"cell_type":"markdown","source":["K-Means clustering is an algorithm that groups objects into K clusters (partitions) based on their attributes (features). K is a positive integer that gives the number of clusters. The data are partitioned by assigning each data point to the ***centroid*** at minimum distance. The centroids are initialized either randomly or from a given ***Initial Set of Centroids***; they can also be taken from ***K objects*** in sequence."],"metadata":{"id":"OLXLbUYJmcrZ"}},{"cell_type":"markdown","source":["A ***centroid*** is the arithmetic mean of all the points in an object. 
K-Means clustering can be applied with the following step-by-step procedure:\n","\n","- Prepare the training data as vectors.\n","- Set the number of clusters K.\n","- Set the initial centroids.\n","- Compute the distance between each data point and each centroid using the ***Euclidean Distance*** formula.\n"," \n"," Distance formula:\n","\n"," > $d(p,q) = \\sqrt{\\sum_{i=1}^{n}(q_i - p_i)^2}$\n"," \n"," ```\n"," where:\n"," p,q\t =\ttwo points in Euclidean n-space\n"," qi,pi =\tthe i-th coordinates of q and p\n"," n =\tdimension of the space (number of features)\n","\n"," ```\n"," \n"," \n","- Assign each data point to the partition of its nearest centroid (minimum distance).\n","- Keep iterating while data points still move between partitions: if any object moved to another partition, update each centroid to the mean of its partition and go back to the third step.\n","- If the current grouping is the same as the previous grouping, stop iterating.\n","- The data are now partitioned according to the final centroids."],"metadata":{"id":"S6x4PUjSm3pu"}},{"cell_type":"markdown","source":["## Implementation in Python"],"metadata":{"id":"jLzKGEdHrXGf"}},{"cell_type":"markdown","source":["### Data preparation"],"metadata":{"id":"i03mCjd1reNz"}},{"cell_type":"markdown","source":["The data used is the Iris dataset, which is available [here](https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data)."],"metadata":{"id":"tJdD0y4WtTR3"}},{"cell_type":"code","execution_count":null,"metadata":{"id":"jrFm_H2aFQko","executionInfo":{"status":"ok","timestamp":1664672349949,"user_tz":-420,"elapsed":387,"user":{"displayName":"Caca Erha","userId":"13359221303846732984"}},"colab":{"base_uri":"https://localhost:8080/","height":424},"outputId":"d942ca60-169b-4454-88fa-60f47de9e46c"},"outputs":[{"output_type":"execute_result","data":{"text/plain":[" sepal-length sepal-width petal-length petal-width Class\n","0 5.1 3.5 1.4 0.2 Iris-setosa\n","1 4.9 3.0 1.4 0.2 Iris-setosa\n","2 4.7 3.2 1.3 0.2 Iris-setosa\n","3 4.6 3.1 1.5 0.2 Iris-setosa\n","4 5.0 3.6 1.4 0.2 
Iris-setosa\n",".. ... ... ... ... ...\n","145 6.7 3.0 5.2 2.3 Iris-virginica\n","146 6.3 2.5 5.0 1.9 Iris-virginica\n","147 6.5 3.0 5.2 2.0 Iris-virginica\n","148 6.2 3.4 5.4 2.3 Iris-virginica\n","149 5.9 3.0 5.1 1.8 Iris-virginica\n","\n","[150 rows x 5 columns]"],"text/html":["\n","
| | sepal-length | sepal-width | petal-length | petal-width | Class |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa |
| ... | ... | ... | ... | ... | ... |
| 145 | 6.7 | 3.0 | 5.2 | 2.3 | Iris-virginica |
| 146 | 6.3 | 2.5 | 5.0 | 1.9 | Iris-virginica |
| 147 | 6.5 | 3.0 | 5.2 | 2.0 | Iris-virginica |
| 148 | 6.2 | 3.4 | 5.4 | 2.3 | Iris-virginica |
| 149 | 5.9 | 3.0 | 5.1 | 1.8 | Iris-virginica |

150 rows × 5 columns

| | sepal-length | sepal-width | petal-length | petal-width | Class |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | 0 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | 0 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | 0 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | 0 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | 0 |
| ... | ... | ... | ... | ... | ... |
| 145 | 6.7 | 3.0 | 5.2 | 2.3 | 2 |
| 146 | 6.3 | 2.5 | 5.0 | 1.9 | 2 |
| 147 | 6.5 | 3.0 | 5.2 | 2.0 | 2 |
| 148 | 6.2 | 3.4 | 5.4 | 2.3 | 2 |
| 149 | 5.9 | 3.0 | 5.1 | 1.8 | 2 |

150 rows × 5 columns

| | sepal-length | sepal-width | petal-length | petal-width |
|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 |
| ... | ... | ... | ... | ... |
| 145 | 6.7 | 3.0 | 5.2 | 2.3 |
| 146 | 6.3 | 2.5 | 5.0 | 1.9 |
| 147 | 6.5 | 3.0 | 5.2 | 2.0 |
| 148 | 6.2 | 3.4 | 5.4 | 2.3 |
| 149 | 5.9 | 3.0 | 5.1 | 1.8 |

150 rows × 4 columns
\n","