AI Researcher Warns Data Science Could Face a Reproducibility Crisis

Long-time Slashdot reader theodp shared this warning from a long-time AI researcher arguing that data science “is due” for a reckoning over whether results can be reproduced. “Few technological revolutions came with such a low barrier of entry as Machine Learning…”
Unlike Machine Learning, Data Science is not an academic discipline, with its own set of algorithms and methods… There is an immense diversity, but also disparities in skill, expertise, and knowledge among Data Scientists… In practice, depending on their backgrounds, data scientists may have large knowledge gaps in computer science, software engineering, theory of computation, and even statistics in the context of machine learning, despite those topics being fundamental to any ML project. But it’s ok, because you can just call the API, and Python is easy to learn. Right…?

Building products using Machine Learning and data is still difficult. The tooling infrastructure is still very immature, and the non-standard combination of data and software creates unforeseen challenges for engineering teams. But in my view, a lot of the failures come from this explosive cocktail of ritualistic Machine Learning:

– Weak software engineering knowledge and practices compounded by the tools themselves;
– Knowledge gaps in mathematical, statistical, and computational methods, encouraged by black-box APIs;
– Ill-defined range of competence for the role of data scientist, reinforced by a pool of candidates with an unusually wide range of backgrounds;
– A tendency to follow the hype rather than the science.

What can you do?

– Hold your data scientists accountable using Science.
– At a minimum, any AI/ML project should include an Exploratory Data Analysis, whose results directly support the design choices for feature engineering and model selection.
– Data scientists should be encouraged to think outside the box of ML, which is a very small box.
– Data scientists should be trained to use eXplainable AI methods to provide context about the algorithm’s performance beyond the traditional performance metrics like accuracy, FPR, or FNR.
– Data scientists should be held to standards similar to those of other software engineering specialties, with code review, code documentation, and architectural designs.
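
To illustrate why the list above asks for context beyond metrics like accuracy, FPR, or FNR, here is a minimal Python sketch. It is not taken from the article: the function names and the toy labels are hypothetical, and it assumes a binary problem where 1 is the positive class. It shows a model that never predicts the positive class scoring 95% accuracy on imbalanced data while missing every positive case.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Count true/false positives and negatives for 0/1 labels (hypothetical helper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            raise ValueError("labels must be 0 or 1")
    return tp, fp, tn, fn


def summary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, false positive rate, and false negative rate."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / total if total else nan,
        # FPR: share of actual negatives that were flagged as positive.
        "fpr": fp / (fp + tn) if (fp + tn) else nan,
        # FNR: share of actual positives that were missed.
        "fnr": fn / (fn + tp) if (fn + tp) else nan,
    }


if __name__ == "__main__":
    # Toy data (an assumption for illustration): 95 negatives, 5 positives,
    # and a "model" that always predicts the negative class.
    y_true = [0] * 95 + [1] * 5
    y_pred = [0] * 100
    for name, value in summary_metrics(y_true, y_pred).items():
        print(f"{name}: {value:.2f}")
```

In this toy case the accuracy looks strong, yet the false negative rate is 1.0, which is the kind of gap that an Exploratory Data Analysis of class balance would be expected to surface before model selection.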

The article concludes, “Until such practices are established as the norm, I’ll remain skeptical of Data Science.”

Source : https://slashdot.org/story/24/06/16/0131202/ai-researcher-warns-data-science-could-face-a-reproducibility-crisis?utm_source=rss1.0mainlinkanon&utm_medium=feed
