Data matching

Data matching (also known as record matching, data deduplication, entity resolution, object identification, or field matching) is the process of identifying, comparing, and merging records that represent the same entity, whether they come from a single data source or from several. It is a crucial component of data integration, data quality assurance, and data warehousing.

Data matching is typically performed to consolidate multiple collections of data into an integrated whole. This unified view can then be used to detect trends, target customers, identify duplicate records, and improve the accuracy of decision-making.

Data matching is usually done with a comparison algorithm that compares corresponding fields, such as a person's name or a customer account number, of two records that are thought to represent the same entity. The algorithm then generates a score based on the degree of similarity between the two records. Depending on the required accuracy, a score may be accepted as it stands, or the candidate match may be refined further and verified manually.
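The comparison and scoring step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the field names, the weights, and the two thresholds are hypothetical, and the standard library's difflib.SequenceMatcher stands in for whatever string similarity measure a real system would use.

```python
from difflib import SequenceMatcher

# Hypothetical field weights (they sum to 1.0); a real system would tune these.
WEIGHTS = {"name": 0.5, "account": 0.3, "city": 0.2}


def field_similarity(a: str, b: str) -> float:
    """Return a similarity in [0, 1], ignoring case and surrounding spaces."""
    a, b = a.strip().lower(), b.strip().lower()
    if not a or not b:
        return 0.0  # a missing value gives no evidence of a match
    return SequenceMatcher(None, a, b).ratio()


def match_score(rec_a: dict, rec_b: dict) -> float:
    """Combine per-field similarities into one weighted score in [0, 1]."""
    return sum(
        weight * field_similarity(rec_a.get(field, ""), rec_b.get(field, ""))
        for field, weight in WEIGHTS.items()
    )


def classify(score: float, accept: float = 0.85, reject: float = 0.5) -> str:
    """Accept high scores, reject low ones, and send the rest to manual review."""
    if score >= accept:
        return "match"
    if score < reject:
        return "non-match"
    return "review"


if __name__ == "__main__":
    # Illustrative records only; the values are made up.
    first = {"name": "John Smith", "account": "A-1001", "city": "Leeds"}
    second = {"name": "Jon Smith", "account": "A-1001", "city": "Leeds"}
    score = match_score(first, second)
    print(round(score, 3), classify(score))
```

Using two thresholds rather than one reflects the choice mentioned above: clear cases are decided automatically, while borderline scores are set aside for manual verification.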

Data matching is not limited to records about individuals. It can also be used to identify movements of assets and goods, anomalies in financial transactions, and changes in geographical features (e.g. roads).

Data matching is becoming increasingly automated through machine learning algorithms that are suited to matching tasks. Compared with manual matching, this can shorten processing times and improve accuracy.

Data matching is essential for a variety of modern applications and services, such as maintaining the integrity of databases, optimizing supply chain performance, detecting fraud and money laundering, and developing personalized services. It is also used in healthcare, where it helps healthcare professionals access patient information while patient privacy is protected.

