DataMagic Technical Column Vol.20
Data Masking
Introduction
A data mart (purpose-specific database) extracts data tailored to a specific purpose from a data warehouse, allowing each user department to retain only the data appropriate for that purpose. For example, in the insurance and credit card industries, it is common to link input data from a contract management system when building a contract inquiry system. However, in contract inquiry systems, personal information such as phone numbers and addresses in the input data are often masked to prevent disclosure to users of the contract inquiry system. Furthermore, contract data is used in the testing phase of system construction to prepare data as close as possible to the production data, and personal information is often masked before testing. When performing this type of data masking, while there are often rules such as "which digits of an address should be masked," the masking method is often manually implemented, making it difficult to systematize and standardize.
By using DataMagic, it is possible to automatically convert personal information in the target data into specified masking characters in accordance with the data cleansing required when building a data mart, thereby reducing the amount of work required for data masking.
To use this tip, you must have DataMagic installed on your computer. For information on where to get DataMagic and how to install it, please refer to the separate article "Installing DataMagic."
How to operate
Step 1 - Prepare the source data
First, prepare the CSV data to be converted. Once the data is ready, save it in the following folder on the PC where DataMagic is installed.
C:\work\SAMPLE20\ (saved with the file name "in")
»Download the source data (Note: The sample file is in zip format. Please unzip it before use.)
Step 2 - Download and configure DataMagic script file
Download the script file below and save it in the folder where DataMagic is installed.
C:\work\SAMPLE20\ (saved as "sample20.igen")
Launch DataMagic management screen and import the downloaded script file into DataMagic. To import, click the [Import Management Information] icon under [Tools] on the start screen. Specify the downloaded file on the Import Management Information screen and click the Import button. The ID SAMPLE20 will then be registered on the Data Processing Information List screen.
»Download the script file (Note: The sample file is in zip format. Please unzip it before use.)
Step 3 - Run DataMagic
Double-click the data processing information ID SAMPLE20 registered in step 2 and check that the file names for the input and output settings are set correctly. Open the data processing information ID SAMPLE20 from the screen and click the "Execute" button at the top of the screen.
Step 4 - Check the execution results
When execution is complete in step 3, a file called "out" will be created in C:\work\SAMPLE20\, which is specified in the output settings. Check this file using an editor such as Notepad.
lastly
This data masking is achieved using a string replacement function called REPLACE_REG. This function can use regular expressions to search for characters and allows you to set very detailed replacement conditions, making it possible to search for and replace a wide variety of strings. It can be used for a wide range of purposes, not just data masking.
Please download DataMagic trial version and try out the technical column explanations.
- The trial version is free to use for 60 days.
- After you sign up for the trial version, you will receive 90 days of free technical support.
DataMagic Column List
- DataMagic Manager Installation
- "DataMagic" is recommended for its "fast, cheap, and easy" features.
- Vol.1 Character code conversion: EBCDIC⇔Shift_JIS
- Vol.2 Character Code Conversion: External Character Codes
- Vol.3 Character Code Conversion: Unicode (UTF-8)
- Vol.4 Data Conversion: Pack⇔Zone Edition
- Vol.5 (Fixed length ⇒ CSV data conversion)
- Vol.6 (Special Edition) HUB+ DataMagic Collaboration Case Study
- Vol.7 <Distribution BMS ⇒ CSV Conversion>
- Vol.8 <Matching Processing>
- Vol.9 (Full-width⇔Half-width Conversion)
- Vol.10 (DB table data extraction)
- Vol.11 Data Correction
- Vol.12 〈String Replacement〉
- Vol.13 (Special Edition) DataMagic Implementation Case Studies
- Vol.14 (Data sorting and file division)
- Vol.15 Data Collection
- Vol.16 File Merging
- Vol.17 HULFT Definition Bulk Registration
- Vol.18 (Special Edition) HULFT-WebFiletransfer + DataMagic Integration Case Study
- Vol.19 (Introduction Case Studies) Our company's use cases
- Vol.20 Data Masking
- Vol.21 (DB Migration Edition)
- Vol.22 <Common Components: EBCDIC Code Replacement>
- Vol.22 Appendix <Common Component Import and Execution Procedures>
- Vol.23 <Common Components: Date Conversion Function Extension>
- Vol.24 〈Resolving customer requests!: Common components that can process data even if there is invalid data〉