
Quantum-Software-Development/14-DataMining_DBSCAN_and_Spectral-Clustering














Institution: Pontifical Catholic University of São Paulo (PUC-SP)
School: Faculty of Interdisciplinary Studies
Program: Humanistic AI and Data Science
Semester: 2nd Semester 2025
Professor: Prof. Dr. Daniel Rodrigues da Silva (Doctor in Mathematics)









Tip

This repository is a review of the Statistics course from the undergraduate program Humanities, AI and Data Science at PUC-SP.













Welcome to the repository guide for Data Mining: DBSCAN and Spectral Clustering. This repo is written so that anyone, even kids, can understand two powerful clustering algorithms: DBSCAN and Spectral Clustering.







Clustering is a way for computers to group things that are similar—like organizing marbles by color, or animals by species. The computer looks for natural groups in the data, so points in the same group are more like each other than points in other groups. Some points might not fit anywhere; finding them is important too!




DBSCAN stands for "Density-Based Spatial Clustering of Applications with Noise." It helps find groups in data where points are close together, based on how many neighbors each point has.



How DBSCAN Works (Step-by-Step)


  1. Pick any point not yet checked.

  2. Draw a circle around it: The radius of the circle (called epsilon, $ \varepsilon $) defines what counts as "close."

  3. Count all the neighbors inside the circle.

    • If there are enough neighbors (at least MinPts), this is a core point: start a new group!
    • If there are not enough, mark the point as "noise" for now. It may later turn out to be a border point of some group.
  4. Grow the group: Add every neighbor inside the circle to the group. For each neighbor that is itself a core point, include its neighbors too, so the group keeps growing!

  5. Repeat: Until every point is grouped or marked as noise.
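The steps above can be sketched in a few lines of plain Python. This is only a minimal illustration, not an optimized or official implementation: the function names `region_query` and `dbscan`, the sample points, and the values `eps=1.5` and `min_pts=3` are all assumptions chosen for the example. It also assumes that a point counts itself among its neighbors and that distance is ordinary Euclidean distance.

```python
from math import dist  # Euclidean distance (Python 3.8+)

def region_query(points, i, eps):
    """Indices of all points inside the circle of radius eps around points[i].
    Assumption: the point itself is counted as one of its neighbors."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, 2, ... for groups, -1 for noise."""
    NOISE = -1
    labels = [None] * len(points)        # None means "not yet checked"
    cluster = -1
    for i in range(len(points)):         # Step 1: pick an unchecked point
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)   # Steps 2-3: circle + count
        if len(neighbors) < min_pts:
            labels[i] = NOISE            # provisional: may become a border point
            continue
        cluster += 1                     # core point: start a new group
        labels[i] = cluster
        queue = list(neighbors)
        while queue:                     # Step 4: grow the group
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster      # former noise becomes a border point
            if labels[j] is not None:
                continue                 # already assigned to a group
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point too: keep growing
    return labels                        # Step 5: every point grouped or noise

# Two tight squares of points and one far-away outlier (made-up sample data).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The two squares become groups 0 and 1, and the lonely point at (50, 50) is labeled -1 (noise). This sketch checks every pair of points, so it is fine for small examples but slow for large datasets.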




  • Core point: Has lots of friends (enough neighbors within $ \varepsilon $).

  • Border point: Doesn't have enough direct neighbors, but is within $ \varepsilon $ of a core point.

  • Noise: Too far from any busy area. Not in a group at all!
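To make the three kinds of points concrete, here is a small sketch that labels each point as "core", "border", or "noise". The helper name `classify_points`, the sample data, and the values `eps=1.5` and `min_pts=4` are assumptions made for illustration. As a further assumption, a point counts itself as a neighbor and distance is Euclidean.

```python
from math import dist  # Euclidean distance (Python 3.8+)

def classify_points(points, eps, min_pts):
    """Label every point as 'core', 'border', or 'noise'."""
    n = len(points)
    # For each point, list the indices of all points within eps (itself included).
    neighbors = [[j for j in range(n) if dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    is_core = [len(nb) >= min_pts for nb in neighbors]
    kinds = []
    for i in range(n):
        if is_core[i]:
            kinds.append("core")      # enough neighbors within eps
        elif any(is_core[j] for j in neighbors[i]):
            kinds.append("border")    # too few neighbors, but near a core point
        else:
            kinds.append("noise")     # not near any core point
    return kinds

# A square of four points, one point just beside it, and one far away
# (made-up sample data).
sample = [(0, 0), (0, 1), (1, 0), (1, 1), (2.2, 1), (9, 9)]
kinds = classify_points(sample, eps=1.5, min_pts=4)
print(kinds)  # ['core', 'core', 'core', 'core', 'border', 'noise']
```

The point (2.2, 1) has only one other point inside its circle, so it is not a core point, but that neighbor (1, 1) is a core point, which makes it a border point. The point (9, 9) is alone, so it is noise.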



































Copyright 2025 Quantum Software Development. Code released under the MIT License.
