Fuzzy matching at scale, by Josh Taylor (TDS Archive, Jul 1, 2019). From 3.7 hours to 0.2 seconds. How to perform intelligent string matching in a way that can scale to even the biggest data sets.
Fuzzy matching people names, by Vadim Markovtsev (TDS Archive, Feb 25, 2021). I’ve recently had to solve an interesting problem: given two unordered lists with real people names, match identities in between.
Surprisingly Effective Way To Name Matching In Python, by Mala Deep (TDS Archive, Jun 30, 2020). Data Matching, Fuzzy Matching, Data Deduplication.
The Optimization of Fuzzy String Matching Using TF-IDF and KNN, by Audhi Aprilliant (Feb 13, 2021). How to accelerate the computation time of fuzzy string matching from hours to seconds.
Locality Sensitive Hashing: How to Find Similar Items in a Large Set, with Precision, by Jonathan Kernes (TDS Archive, Feb 4, 2021). We offer a guide to the art of locality sensitive hashing, with applications to document comparison and vector similarity.
Locality Sensitive Fuzzy Hashing, by Lukas Rist (Dec 12, 2018). Using hashes maximized for collision probability (in Golang).
Hybrid Fuzzy Name Matching, by Aviad Atlas (TDS Archive, Nov 13, 2018). How can I match between two different names in a DB that are actually the same person?