The Data Engineering and Machine Learning Center (hereinafter the DEML Center) was established in July 2019 through a collaboration agreement between Shiga University and Teikoku Databank Ltd. It conducts practical, interdisciplinary research using open data and other sources, with a focus on data management tasks such as data polishing and processing, which are essential for improving the accuracy of data analysis. However, polishing and processing data by writing programs is inefficient, and handover becomes a problem when staff leave upon graduation or at the end of a project. To solve these problems, the Center chose DataSpider, which enables data polishing and processing without programming.
Customer Issues
Developing programs for data polishing and processing is a high hurdle for research support staff (students), and the low readability of the resulting code makes handover difficult when staff change every year.
Benefits of implementation
More efficient programming for data polishing and processing
Smooth handover through flow visualization
Easy horizontal deployment and secondary use of research results
Background: Efficiency and readability of programming for data polishing and processing were issues
Professor, School of Data Science
Tomoyuki Sugimoto
One of Shiga University's key industry-academia collaborative research projects is the DEML Center, established in July 2019 through a collaboration agreement with Teikoku Databank Ltd. Since its establishment, the DEML Center has used big data released by companies and government agencies to achieve numerous research results, including a model for predicting the number of bankruptcies during the COVID-19 pandemic and the automated optimization of delivery routes and vehicle allocation. Professor Tomoyuki Sugimoto of the School of Data Science, however, described the challenges the Center faced in its early days: "In university courses, students are taught analytical techniques and approaches based on clean, prepared data. However, in the real world, data is disparate and scattered everywhere. It typically takes more time to polish and process the collected data than to analyze it. While we also teach programming to automate data polishing and processing in university courses, there are limits to what we can cover because of time constraints. The aim of the DEML Center is to bridge this gap between academia and the real world and to develop data scientists with more practical skills."
In addition to its 12 researchers, the DEML Center employs nearly 20 students as research assistants. These students have taken intensive courses in data polishing and acquired a certain level of know-how. However, this staff changes significantly each year, so work must be handed over every time. New staff must start by deciphering their predecessors' programs, which is extremely unproductive. For a center that needs to produce results while cultivating excellent data scientists in a short period of time, there were two problems: programming development was inefficient, and the programs were extremely hard to read because each reflected its author's individual style.
Decided to use DataSpider to streamline and visualize programming work
Meanwhile, Takaya Osato, a specially appointed lecturer at the Data Science Education and Research Center on secondment from Teikoku Databank, explains the aim of opening the DEML Center: "We wanted to accumulate analytical know-how on the vast amount of data available in the world, in addition to the corporate data that Teikoku Databank specializes in, so we worked toward establishing this center, which would allow us to connect with a wide range of companies and local governments." However, this real-world data comes in a variety of formats, and polishing and processing it as preprocessing is unavoidable before analysis can begin. That made it necessary to solve the issues described above: the inefficiency of manual programming development and the difficulty of handover caused by low readability.
Specially Appointed Lecturer, Data Science Education and Research Center
Takaya Osato
To solve this problem, the DEML Center chose DataSpider. "We learned about DataSpider through a data engineering project with Teikoku Databank. We were surprised that data polishing and processing work that previously took two months could be completed in two days. Also, since anyone can easily build a data polishing and processing system without programming, and the GUI-based icons are highly readable, we wanted to introduce it to the DEML Center as a way to promote 'visualization of flows'," says Osato.
Following a proposal from Teikoku Databank, Saison Technology Co., Ltd. joined the DEML Center in December 2019. Through its support for the use of DataSpider, the company will receive practical data polishing and processing technology and know-how.
Effect: Building flows in DataSpider makes algorithms easy to modify
Since its opening, the DEML Center has held online DataSpider training sessions for students every year with the cooperation of Saison Technology. Tools like DataSpider, which are not taught in lectures, are a fresh experience for students, and many have responded positively. "In particular, students who have participated in statistical data analysis competitions and the like know the importance of working together as a team, and they commented that using DataSpider would allow for smoother information sharing within the team and improve work efficiency," says Sugimoto.
DataSpider was also used for data polishing and processing as preprocessing in a joint research project with Nose Steel Co., Ltd. (headquartered in Osaka), the results of which were announced in October 2020. An optimization algorithm was developed for the company's 15 trucks, which deliver to an average of 250 destinations per day, and it was delivered to the company as optimization software. Regarding this result, Sugimoto said, "We hope to be able to provide the technology we have cultivated here not only to the transportation industry, but also to any company that handles the delivery of its own products. In doing so, we will need to adjust the data polishing and algorithms to suit each company's requirements, but by creating a flow for the logic portion with DataSpider, it will be relatively easy to adapt it to each company's needs."
Professor and Dean, School of Data Science
Akimichi Takemura
This spring, the DEML Center saw its first major staff turnover as the first cohort of students from the School of Data Science graduated and entered the workforce. Dean Akimichi Takemura spoke about the DEML Center's efforts to date and their significance: "In the real world, data polishing and processing as preprocessing for analysis plays a significant role. By working as research assistants on DEML Center projects, students will be able to understand the importance and difficulty of this process, and I believe it will be a valuable experience for them as they work as data scientists after graduation."
Shiga University, National University Corporation
Founded in 1949 with the history and culture of Omi as its foundation, the university has two liberal arts faculties, the Faculty of Education and the Faculty of Economics, and in April 2017 it became the first university in Japan to establish a Faculty of Data Science. The university aims to be a "university that combines mathematics and data science in the era of Society 5.0" and seeks to cultivate talent with mathematical thinking, data analysis and utilization skills, and the ability to create value from data. In 2019, the Graduate School of Data Science was added. The university also focuses on industry-academia-government collaboration with various companies and local governments, such as Teikoku Databank Ltd., and on collaboration with other universities, promoting the use of data in a wide range of fields beyond the boundaries of academic research.
- The content of this case study is current as of the time of the interview and may change without notice.


