The first step to quality data is a Data Quality Assessment (DQA). This typically involves monitoring, analysing and reporting on the quality, type and quantity of information in your organisation. We measure how suitable your data and customer information are for meeting your business requirements.
We identify areas that can be improved or enhanced where data is incorrect, incomplete, inconsistent, duplicated or redundant.
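As a rough illustration of what such an assessment measures, the sketch below counts blank required fields and likely duplicates in a small list of records. The function name `assess_quality`, the field names and the sample customers are hypothetical and chosen only for this example; a real assessment would cover many more checks.

```python
from collections import Counter


def assess_quality(records, required_fields):
    """Return simple completeness and duplication counts for a list of dicts."""
    # Count records where a required field is missing or blank.
    missing = {
        field: sum(1 for r in records if not str(r.get(field) or "").strip())
        for field in required_fields
    }
    # Treat records as duplicates when all required fields match
    # after trimming spaces and lower-casing.
    keys = Counter(
        tuple(str(r.get(f) or "").strip().lower() for f in required_fields)
        for r in records
    )
    duplicates = sum(count - 1 for count in keys.values() if count > 1)
    return {"total": len(records), "missing": missing, "duplicates": duplicates}


# Hypothetical sample data.
customers = [
    {"name": "Anna Botha", "postcode": "7530"},
    {"name": "anna botha ", "postcode": "7530"},
    {"name": "Sipho Nkosi", "postcode": ""},
]
report = assess_quality(customers, ["name", "postcode"])
print(report)
```

The resulting counts give a simple baseline that can be compared before and after cleansing.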
Data cleansing is the transformation of your company data from its current state to a pre-defined, standardised format, using off-the-shelf software as well as customised queries and functions. Our aim is to remove or correct the “dirty data” (i.e. incomplete, inconsistent, out-of-date or incorrectly formatted data) on your database systems.
The main purpose of data cleansing is not just to clean up the data, but also to bring consistency and add value to the multiple sets of data in your organisation.
Typical Data Cleansing tasks we perform include the following:
Correcting spelling mistakes and abbreviations
Identifying and dealing with duplicate records
Identifying erroneous records
Fixing or deleting erroneous/foreign characters
Filling in missing information (e.g. postcodes, titles, suburbs) where possible
Checking postcode and boxcode ranges
Standardising data
Removing unnecessary spaces
Moving data into the correct field (data shuffling)
Changing the case of records to sentence case where appropriate
Translating between English and Afrikaans
Quality checks on telephone and cellphone numbers (correct length, dial codes, cellphone prefixes, etc.)
We can also assign household and personal income estimates based on the suburb
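A few of the tasks listed above can be sketched in a handful of lines. The helper names below are hypothetical, and the 10-digit phone length is an assumption made for this example rather than a rule taken from the text; real checks on dial codes and cellphone prefixes would need reference data.

```python
import re


def clean_whitespace(value):
    """Collapse runs of whitespace into one space and trim both ends."""
    return re.sub(r"\s+", " ", value).strip()


def to_sentence_case(value):
    """Upper-case the first character and lower-case the rest."""
    return clean_whitespace(value).capitalize()


def normalise_phone(value, expected_length=10):
    """Strip non-digits; return the digits if the length matches, else None.

    expected_length=10 is an assumption for this sketch.
    """
    digits = re.sub(r"\D", "", value)
    return digits if len(digits) == expected_length else None


print(clean_whitespace("  12   Main    Road "))
print(to_sentence_case("BELLVILLE  south"))
print(normalise_phone("082 555-1234"))
print(normalise_phone("555-1234"))
```

Keeping each rule in its own small function makes it easier to apply the rules selectively, since not every field should be re-cased or reformatted.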
Relational database systems provide a Data Manipulation Language (DML) for selecting, inserting, deleting and updating records. These operations are used in the process of cleansing data and enhancing customer information.
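To show the four operations in a cleansing context, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The `customers` table, its columns and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table creation is data definition (DDL), not DML; it is only here for setup.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, suburb TEXT)")

# INSERT: load raw records, including one with stray spaces and one blank name.
conn.executemany(
    "INSERT INTO customers (name, suburb) VALUES (?, ?)",
    [("  Anna Botha ", "Bellville"), ("Sipho Nkosi", "durbanville"), ("", "Parow")],
)

# UPDATE: trim leading and trailing spaces from names.
conn.execute("UPDATE customers SET name = TRIM(name)")

# DELETE: remove records that have no name.
conn.execute("DELETE FROM customers WHERE name = ''")

# SELECT: read back the cleansed records.
rows = conn.execute("SELECT name, suburb FROM customers ORDER BY id").fetchall()
print(rows)
```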
Data matching is the computerised comparison of two or more sets of records relating to the same objects, e.g. addresses or personal information. Sometimes old legacy systems or valuable customer information held in desktop databases (e.g. MS Access) need to be incorporated into a master database or a data warehouse. In such cases we carry out a matching or comparison exercise to find out which records need to be transferred, amended, updated or deleted.
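The comparison exercise might look something like the sketch below, which classifies legacy records against a master set. Matching on name plus postcode and comparing only the phone number are simplifying assumptions for this example, and all names and sample records are hypothetical.

```python
def match_key(record):
    """Build a comparison key from name and postcode, ignoring case and spacing."""
    return (" ".join(record["name"].lower().split()), record["postcode"].strip())


def compare(master, legacy):
    """Classify each legacy record as new, changed or unchanged."""
    master_by_key = {match_key(r): r for r in master}
    to_transfer, to_update, unchanged = [], [], []
    for record in legacy:
        existing = master_by_key.get(match_key(record))
        if existing is None:
            to_transfer.append(record)  # not in the master set yet
        elif existing.get("phone") != record.get("phone"):
            to_update.append(record)  # same customer, different details
        else:
            unchanged.append(record)
    return to_transfer, to_update, unchanged


# Hypothetical sample data.
master = [
    {"name": "Anna Botha", "postcode": "7530", "phone": "0215551234"},
    {"name": "Sipho Nkosi", "postcode": "7550", "phone": "0825551234"},
]
legacy = [
    {"name": "anna  botha", "postcode": "7530", "phone": "0215559999"},
    {"name": "Sipho Nkosi", "postcode": "7550", "phone": "0825551234"},
    {"name": "Thandi Mokoena", "postcode": "7500", "phone": "0835551234"},
]
to_transfer, to_update, unchanged = compare(master, legacy)
print(len(to_transfer), len(to_update), len(unchanged))
```

Exact-key matching like this misses misspelt names; handling those would require fuzzy matching, which is outside this sketch.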
In many organisations, customer and other valuable company data is scattered across the organisation. The formats and systems in which it is stored also vary and can include mainframe applications, CRM applications, flat files, spreadsheets and smaller desktop databases like MS Access.
The true potential of this information can only be realised once it forms part of a master database or is stored in, or linked to, a data warehouse. Only then can the overall “big picture” view of your customers be seen for improved business intelligence.
ETL (Extract, Transform, Load) is the process of reading data from a source, cleaning it and reformatting it to a specified, uniform standard, and then writing it to the target destination.
In the transformation part of ETL, data is not just reformatted or converted, but also cleaned to remove duplication and enforce consistency.
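The three stages can be sketched end to end with the Python standard library. Here the source is a small CSV string and the target is an in-memory SQLite table; the column names, sample rows and function names are all hypothetical, and `str.title()` is a deliberately simple stand-in for real name formatting rules.

```python
import csv
import io
import sqlite3

# Hypothetical source data with stray spaces, mixed case and a duplicate.
RAW_CSV = """name,suburb
 Anna Botha ,bellville
Sipho Nkosi,Durbanville
anna botha,Bellville
"""


def extract(text):
    """Extract: read the source rows into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: standardise spacing and case, then drop duplicates."""
    seen, cleaned = set(), []
    for row in rows:
        name = " ".join(row["name"].split()).title()
        suburb = row["suburb"].strip().title()
        key = (name.lower(), suburb.lower())
        if key not in seen:
            seen.add(key)
            cleaned.append((name, suburb))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows to the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS customers (name TEXT, suburb TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
result = conn.execute("SELECT name, suburb FROM customers ORDER BY rowid").fetchall()
print(result)
```

Keeping the stages as separate functions means the cleaning rules can be tested without touching either the source or the target system.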
Emtech Data Solutions can help you integrate your systems and transform your data to be a valuable asset to your company.
We can create customised small to medium-sized databases specific to the requirements of your company or organisation. We can also create custom front-ends to help you capture information more easily and more accurately.
Do you have many smaller databases and spreadsheets in your company or organisation that contain similar customer information? Let us create a duplicate-free, error-free and cleansed Master database for you from all the disparate databases and spreadsheets.
PAMSS stands for Postal Address Management Service Suppliers. It is an address grading and checking service provided by SAPO (the South African Post Office) or by service suppliers, whereby a certificate is issued to clients with an address quality of around 98%. These clients then qualify for rebates at the Post Office when sending out bulk mail.
Emtech Data Solutions will assist you in achieving this quality in data so that you can qualify for rebates at SAPO. This could save you thousands if not millions of rands!
Data scrubbing, also known as data cleansing, is the process of finding, deleting and/or correcting a database's dirty data (incorrect, inconsistent, incomplete, redundant or out-of-date).
Data scrubbing should not only clean the data, but also enhance it and add value to it, as well as bring consistency to different sets of data that have been amalgamated from a variety of data sources across an organisation.