IIS Data Science
Data Science Training Focuses on Stats and ML
IBM Information Server is the key to Actionable Data Science
Why make this bold statement? Most people think IBM Information Server (IIS), IBM’s flagship data integration and governance product, stays in the back room building enterprise data warehouses (EDWs) for the largest organizations on earth. Let’s look at four critical points:
- 80% of data science time is spent on data integration and cleansing (DI/C)
- Most data science training favors statistics and machine learning over DI/C techniques
- Most open source tooling focuses on statistical techniques and process rather than DI/C
- IIS users tend to focus on infrastructure like EDWs rather than exploration like data science. IIS teams and data science teams are usually separate groups under separate management.
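To make the DI/C work in the first point concrete, here is a minimal sketch of the kind of hand-written integration and cleansing code a data scientist might type into a notebook. It uses only the Python standard library. The sample data, column names, and the `clean_customers` and `integrate` helpers are hypothetical, invented for illustration, and are not taken from IIS or any real dataset.

```python
import csv
import io

# Hypothetical sample data; column names and values are invented for illustration.
CUSTOMERS_CSV = """customer_id,name,country
1, Alice Smith ,us
2,Bob Jones,US
2,Bob Jones,US
3,,de
"""

ORDERS_CSV = """order_id,customer_id,amount
100,1,25.50
101,2,n/a
102,3,40
103,9,12.00
"""


def clean_customers(text):
    """Trim whitespace, normalize country codes, drop duplicates and nameless rows."""
    cleaned = {}
    for row in csv.DictReader(io.StringIO(text)):
        name = row["name"].strip()
        if not name:
            continue  # cleansing: skip records with no name
        # Keying by customer_id collapses duplicate records into one.
        cleaned[row["customer_id"]] = {
            "name": name,
            "country": row["country"].strip().upper(),
        }
    return cleaned


def integrate(customers, orders_text):
    """Join orders to cleaned customers, skipping orphans and unparseable amounts."""
    joined = []
    for row in csv.DictReader(io.StringIO(orders_text)):
        customer = customers.get(row["customer_id"])
        if customer is None:
            continue  # integration: order has no matching customer
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # cleansing: amount is not numeric
        joined.append({"order_id": row["order_id"], "amount": amount, **customer})
    return joined


customers = clean_customers(CUSTOMERS_CSV)
orders = integrate(customers, ORDERS_CSV)
```

Even this toy case needs separate rules for whitespace, casing, duplicates, missing values, bad numbers, and unmatched keys, and each new source tends to add more of them.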
We don’t want our data scientists to type line after line of data integration code into Jupyter notebooks when they can drag, drop, copy, share, and deploy transformations in a fraction of the time:
Is this robust enough for your data?
Let’s look at six big opportunities for organizations to bring data science into their mainstream:
- Many organizations already use IIS for data integration, provisioning, and governance of the data you need.
- IIS installations work mostly at night, leaving capacity for data science during the day.
- Management wants a fast prototype-to-production process, which is where IIS excels.
- Integration Expert (IE) will teach data scientists the parts of IIS they need to integrate and cleanse data.
- Management wants the results of data science to be mainstreamed quickly: IE will help.
- If you don’t have IIS, talk to Integration Expert about cost-effective options.