Programs focus more on math, statistics, and computer science
The big idea
Undergraduate training for data scientists, a role Harvard Business Review called the sexiest job of the 21st century, falls short in preparing students to use data science ethically, according to our new study. Data science applies statistics and computer science to a specific field such as astronomy, linguistics, medicine, psychology or sociology. The idea behind this data crunching is to use big data to address otherwise unsolvable problems, such as how healthcare providers can create personalized medicine based on a patient’s genes, and how companies can make purchase predictions based on customer behavior.
The US Bureau of Labor Statistics projects 15% growth in data science jobs from 2019 to 2029, which corresponds with increased demand for data science education. Universities and colleges have responded by creating new programs or revamping existing ones. The number of undergraduate data science programs in the US rose from 13 in 2014 to at least 50 as of September 2020. As data science educators and practitioners, we were prompted by this growth to examine what is and is not covered in undergraduate data science training.
In our study, we compared undergraduate data science curricula with the expectations for undergraduate data science training set out by the National Academies of Sciences, Engineering and Medicine. Ethics training is one of these expectations. We found that most programs dedicated considerable coursework to mathematics, statistics and computer science, but offered little training in ethical considerations such as data protection and systemic biases. Only 50% of the degree programs we examined required an ethics course.
Why it matters
As with any powerful tool, applying data science responsibly requires training in how to use it and in understanding its implications. Our results are consistent with previous work that found that ethics received little attention in data science programs. This suggests that undergraduate data science programs may produce a workforce without the training and judgment to apply data science methods responsibly.
It’s not difficult to find examples of irresponsible use of data science. For example, policing models with built-in data bias can lead to a heavier police presence in historically over-policed neighborhoods. In another example, algorithms used by the US health care system are biased in a way that causes Black patients to receive less care than white patients with similar needs. We believe that explicit training in ethical practices would better prepare a socially responsible data science workforce.
What still isn’t known
While data science is a relatively new field that is still being defined as a discipline, guidelines exist for training undergraduate students in data science. These guidelines raise the question: How much training can we expect in an undergraduate degree? The National Academies recommend training in 10 areas, including ethical problem solving, communication, and data management. Our work focused on undergraduate data science degrees at schools classified as R1, meaning they engage in high levels of research activity. Further research could examine the amount of training and preparation in various aspects of data science at the master’s and doctoral levels, and the nature of undergraduate data science training at schools with different levels of research activity.
Given that many data science programs are new, there is considerable opportunity to compare the training that students receive with the expectations of employers.
What’s next
We plan to expand on our findings by investigating the pressures that might be driving curriculum development for degrees in other disciplines that are seeing similar job market growth.