By Gregory Piatetsky, KDnuggets.
As in the past, we bring you a roundup of predictions and analysis from experts.
We asked: What were the main developments in Data Science and Analytics in 2018, and what are the key trends for 2019?
AI was the dominant topic in most answers. Key themes touched on by these experts include AI advances, both real and hype; the democratization of Data Science and Analytics, including self-service; automation of everything, including Data Science; GDPR; AI risks; real-time analytics; and more.
Fig. 1: Word Cloud from the answers to "Main developments in Data Science and Analytics in 2018 and key trends in 2019"
Meta Brown, @metabrown312, is the author of Data Mining for Dummies and President of A4A Brown, Inc., cultivating effective communication between management and technical people.
The hot analytics topic for 2018 has been artificial intelligence.
Artificial intelligence may have generated more talk than any other application of analytics in recent memory. Sadly, much of that talk makes little sense.
Computing pioneer Alan Turing envisioned computers with capabilities that would rival human intelligence. Artificial intelligence would make it impossible to distinguish computer generated conversation from human conversation.
Think about interactions with today's artificial intelligence applications. Personal assistants, such as Siri or Alexa, may be useful, but interacting with them is hardly indistinguishable from interacting with a human. Chatbots that power online help applications are just as disappointing. Ask one about a real-life problem and you'll soon know there's no real brain behind it.
By Turing's definition, artificial intelligence doesn't exist yet. Gary Marcus, Professor of Psychology and Neural Science at New York University, says that the biggest misconception about artificial intelligence is "that people think we're close to it."
We do have useful real-life applications of computer-driven logic. They don't think just like people, but they are fast and consistent, and those are valuable characteristics. These applications enable machines to do practical work like flagging potentially fraudulent transactions and operating cars.
Despite the obvious limitations of the technology, the public, and even the tech community, is awash in unrealistic claims and expectations about artificial intelligence. Hyperbole is sparking fear among many. It's also beginning to disappoint others, with a lot more disappointment on the horizon.
Tom Davenport, @tdav, is the President's Distinguished Professor of Information Technology and Management at Babson College, the co-founder of the International Institute for Analytics, a Fellow of the MIT Initiative on the Digital Economy, and a Senior Advisor to Deloitte Analytics.
We predict annual trends at the International Institute for Analytics, and here are a couple that stand out for me:
- Organizations are paying increasing attention to model deployment rates - According to the Rexer Data Science Survey, only 10-15% of companies "almost always" deploy results and another 50% only deploy "often." That leaves 35-40% of companies that only occasionally or rarely successfully deploy analytical models. I have encountered some organizations that say their successful deployment rates are less than 10%. Of course, there is no economic value to an analytical model that isn't deployed. Companies will need to measure and improve their deployment rates in 2019.
- Citizen data scientists and business analysts are here to stay. The rise of graphical and search-based analytics, as well as increasingly automated machine learning on the data science front, means that we will see increasing amounts of analytical work done by amateurs. Fighting the trend is a losing battle, so focus instead on enabling it and putting guardrails around it. It also means that quantitative professionals will need to move either toward highly complex and difficult modeling tasks, or toward understanding business problems and addressing organizational change.
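The deployment shares quoted from the Rexer survey above can be checked with a little arithmetic. The following is a minimal sketch in Python, assuming the three groups ("almost always", "often", and "occasionally or rarely") account for all respondents; the function and variable names are illustrative and do not come from the survey.

```python
# Shares of companies by how often they deploy models, in percent,
# as quoted in the text. Assumption: the groups add up to 100%.
ALMOST_ALWAYS = (10, 15)  # low and high ends of the quoted range
OFTEN = 50


def remaining_share(almost_always, often):
    """Return the (low, high) percent of companies left over,
    i.e. those that deploy only occasionally or rarely."""
    low_aa, high_aa = almost_always
    # A larger "almost always" share leaves a smaller remainder,
    # so the high end of the input gives the low end of the output.
    return (100 - often - high_aa, 100 - often - low_aa)


low, high = remaining_share(ALMOST_ALWAYS, OFTEN)
print(f"Occasionally or rarely deploy: {low}%-{high}%")
```

Under that assumption, the remainder matches the 35-40% range given in the text.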
Carla Gentry, @data_nerd, is a consulting Data Scientist and Owner of Analytical-Solution.
2018 was a stellar year for analytics and data science, but we also saw the explosion of AI, neural nets, and machine learning, with and without the talent and/or experience to back up claims. We saw an increase in the use of AI in the medical field and policing, again, with and without dangers of bias, talent, and/or experience. I think some have forgotten that data equals lives in these instances, and with wearables and IoT (Google Home, Alexa, etc.), expect that to continue.
2019 will bring more of the same buzzwords, and companies will start to realize that it takes a neural net thousands or millions of examples to learn from. What's worse, each time you want a neural net to recognize a new type of item, you have to start from the beginning (time-consuming, to say the least). Talent is another issue: unless you are Geoffrey Hinton, Yejin Choi, or Yann LeCun, you really aren't an expert in neural nets, so don't expect a big talent pool to hire from.
Data Science is about gleaning insight from data, and in some cases it's not correct to expect us to be experts at AI, machine learning, or neural nets, so the differences will have to be more carefully explored, and novice users will have to reskill to compete in this new future of tech. My fear is that a lack of true understanding of how machines learn and how artificial intelligence can be used without harm will continue to expose weaknesses with some companies/algorithms/firms.
Let's cheerfully move forward with all these technologies, but understand that there are consequences if you get it wrong!
Bob E. Hayes, @bobehayes, is a researcher, writer and consultant, publisher of Business over Broadway and holds a PhD in industrial-organizational psychology.
The field of data science and analytics saw continued interest in all things machine learning, including reinforcement learning, chatbots, and machine learning's impact on society.
In 2019, I expect to see a growing focus on ethics in AI, including privacy and security issues. There will be a growing emphasis on trying to understand how algorithms lead to particular decisions; we need to know not only that machine learning works to help us make decisions, but how it works (how did it arrive at the decision it did). Also, US companies will focus efforts on how they use consumers' personal data. California adopted the California Consumer Privacy Act (which goes into effect in January 2020), and I expect (hope) that other states will follow its lead.
I fear a growth in the use of AI / machine learning in the creation and dissemination of fake news. Deep fakes have shown the ease with which people can manufacture video content showing people saying things they haven't said or acting in ways that they didn't act. As Max Tegmark says, being cognizant of how AI can be bad is not fear mongering, it's simply "safety engineering."
While there are many ways to learn about data science for data professionals through bootcamps, MOOCs and universities, I expect to see growth in attempts to educate the non-data professionals (e.g., managers and frontline employees) in the ways of analytics.
Cassie Kozyrkov, @quaesita, is Chief Decision Engineer, Google Cloud. Loves Stats, AI, data, puns, art, sci-fi, theatre, decision science.
One of the major developments for 2018 is the democratization of data science. Examples range from cloud technologies, which allow people to give resource-intensive big data and AI applications a whirl without having to build a data center first, to tools like Kubeflow, which bring scalable data science to folks without infrastructure expertise. This trend towards tools that make data science accessible to everyone will accelerate even more in 2019.
Doug Laney, @Doug_Laney, is a VP, Distinguished Analyst, and Chief Data Officer Research at Gartner and author of "Infonomics".
Gartner's 2019 data & analytics strategy predictions were just published. Among them, we're seeing an increase in corporate strategies explicitly mentioning that information is a critical enterprise asset and that analytics is an essential competency. These are not just IT strategies mentioning this, but corporate strategies and plans.
Also, we're expecting that data literacy programs will become commonplace to help business people and data & analytics professionals communicate better, especially as analytic needs become more complex. As the principles and practices of infonomics become increasingly adopted, we're expecting chief data officers to partner more often with their CFOs to formally value the organization's information assets. Doing so has been shown to yield significant information management and business benefits for a number of our clients. But analytic and digital ethics continue to be an issue, and we believe organizations will begin introducing professional codes of conduct for their data science teams.
Also, we expect over the next 3-5 years that most new business systems will incorporate continuous intelligence that uses real-time context data; quantum computing proof-of-concept projects will dramatically outperform existing analytics techniques; augmented and automated insights will replace the vast majority of prebuilt reports; the use of location analysis will increase nearly 10x; and machine learning will ease the scramble to find data scientists.
Gregory Piatetsky, @kdnuggets, is the President of KDnuggets, Data Scientist, co-founder of KDD conferences and SIGKDD, and no. 1 on LinkedIn 2018 Top Voices in Data Science and Analytics.
Main Developments in 2018
- GDPR, which took effect in May 2018, was a significant milestone not only in Europe, but also in the US and other areas, with many companies updating their privacy policies. However, it remains to be seen whether there will be actual improvement in consumer privacy or it will be business as usual under the cover of new privacy pages.
- Democratization of Data Science continued, with many more tools giving wider access to Data Science insights. I note the major new tools announced at AWS re:Invent.
- AI Risks: The first fatality from a self-driving car occurred when the car was confused by a pedestrian walking with a bicycle. This increased the spotlight on the inevitable risks of AI. At the same time, self-driving cars (and automated AI) should not be held to an impossible zero-errors standard, but compared to current risks. For example, human driving is extremely dangerous, with 37,000 traffic deaths in the US alone in 2017.
Key Trends for 2019
- Data Science Automation will continue at an accelerating pace, but Data Scientist jobs are safe from full automation at least for the next few years.
- AI progress and Hype: while AI progress is real, AI Hype will grow even faster.
- China has become a major player in AI, with many Chinese firms doing their own innovations and not just copying from the US.
- Reinforcement Learning will play an increasingly central role in AI progress. See, for example, the amazing progress of RL in solving the Atari game Montezuma's Revenge, reaching level 100 and exceeding by far all previous records, computer or human, in this game.
Bill Schmarzo, @schmarzo, is CTO, IoT & Analytics, at Hitachi Vantara.
Main developments in Big Data, Data Science or Analytics in 2018
- Dramatic increase in awareness by business stakeholders of the business-changing potential of machine learning and deep learning. That was driven by a growing plethora of published use cases.
- Data lakes continue to be a mis-cast asset. Too many organizations look at the data lake as a way to drive out expensive data warehouse and ETL costs, but don't fully comprehend the data lake as the collaborative value creation platform around which the business stakeholders and the data science teams can derive and drive business value.
Key trends in 2019
- For leading organizations, big data and data science initiatives will be driven by the business, not IT. The business leaders will own identifying, validating, vetting, valuing and prioritizing the areas of the business where big data, IoT and data science (machine learning, deep learning, artificial intelligence) can drive business outcomes.
- More than just using Data Science to optimize key business and operational processes (still a great place to start with compelling ROIs), leading organizations will realize that the customer, product, and operational insights buried in the data are the drivers of new monetization opportunities.
Kate Strachnyi, @StorybyData, is a Data Visualization Specialist, author of The Disruptors: Data Science Leaders and Journey to Data Scientist, and host of the Humans of Data Science video podcast.
Main Developments in Data Science and Analytics in 2018
- General Data Protection Regulation (GDPR): The EU regulation that took effect in May 2018 provides a set of rules designed to give EU citizens more control over their personal data. This encouraged similar standards to be set in other locations. For example, California passed its own digital privacy law, which allows consumers to know what information organizations are collecting about them, why they are collecting that data, and who they are sharing it with.
- Self-Serve Business Intelligence (BI) Tools: BI tools are becoming even more commonplace amongst data analysts and business analysts. However, it's not clear whether the users of the tools are keeping up with the analytics that's taking place behind the scenes. There appears to be a gap between the speed at which users are learning to drag and drop fields into the tool and create charts, and the actual understanding of what is happening on the back-end.
Key Trends in 2019
- Data Ethics & Privacy: There will be an increased focus on considering the ethics and privacy of working with data at every step of the data science process. Those working with data need to understand that they hold significant power and need to consider the implications of their work. As our world becomes increasingly digital, this is a growing concern for individuals, companies, and governments.
- Process Automation: Companies will continue to automate processes to reduce costs and become more efficient. This automation may also result in job loss amongst the individuals responsible for carrying out the processes that are being automated. People need to focus on learning new skills that are growing in demand to stay current in this quickly-changing environment.
Ronald van Loon, @Ronald_vanLoon, is a Director of Adversitement, helping data-driven companies generate success, and a Top 10 Big Data, Data Science, IoT, and AI influencer.
In 2018 end-to-end data management grew as companies are using all data sources to gain trustworthy insights and support an infrastructure and business model that's aligned with the digital economy while moving upwards in analytics maturity. Machine Learning was widely embraced as all software vendors built it into their applications with domain-specific solutions.
In 2019, there will be more integrated hardware and software frameworks for a sophisticated approach to supporting next-level Deep Learning applications that will further increase innovation. Deep Learning applications need fully optimized hardware and software stacks to promote a new, modern AI architecture. We'll see the rise of this full-stack approach by vendors across all domains in response to the accelerating demand for optimal Deep Learning performance and capabilities.
Real-time Edge Analytics will exponentially grow along with the growth of IoT devices, making real-time analytics easier and facilitating immediate responses based on real-time insights.
Favio Vazquez, @FavioVaz, is a Data Scientist, physicist and computational engineer, and Founder of Ciencia y Datos.
2018 was an amazing year for Data Science (DS), with great advances both in theory and practice. Several methodologies for DS were proposed, which may help transform the field into an actual science. I've been talking about it for more than a year, and saw more people discussing it recently. Regarding Machine Learning (ML), AutoML was huge, and that includes auto Deep Learning too.
Key trends for 2019:
- AutoX: We will see more companies developing technologies and libraries for automatic Machine and Deep Learning and including them in their stacks. The X here means that these auto-tools will be extended to data ingestion, data integration, data cleansing, exploration and deployment. Automation is here to stay.
- Semantic technologies: One of the most interesting discoveries for me this year was the connection between DS and semantics. It's not a new field in the data world, but I see more people taking an interest in semantics, ontologies, and knowledge graphs, and in their connection to DS and ML.
- Programming less: This is a hard thing to say, but with automation in almost every step of the DS process we will program less and less every day. We will have tools that create code, understand what we want through NLP, and then transform that into queries, sentences and full programs. I think [programming] is still a very important thing to learn, but it will soon be easier.
- Digital education: This is something growing each year, but next year we will see more people than ever before going into MOOCs, digital classes, online courses and more. Some might call this the "democratization of education", and I think in large part this is true, but I have a message for all people doing this: Be careful about what you see and how you learn, and investigate before spending time and money on any course. The good ones will change your life for good, but the others are very dangerous.
Jen Underwood, @idigdata, is a Senior Director at DataRobot and founder of Impact Analytix, LLC.
AI hype and transformational impact were everywhere in 2018. Several years ago, big data was all the rage, then cloud, and now machine learning dominates the stage. Apps, bots, and business intelligence solutions far and wide have cooked in AI. Today even beer is AI-driven.
This year we also saw a surge in automation market momentum. Many machine learning solutions today are touting everything from human-guided, automated data analysis to deeper automated machine learning (AutoML) across the entire project life-cycle. From simple drag-and-drop, button-click wizards that create basic models to sophisticated feature engineering, model search, hyperparameter tuning, deployment, model management and monitoring, the capabilities of AutoML vary widely, and so does the quality of results.
In 2019, concerns about governing citizen data science, privacy, bias, ethics, and the rise of deep fakes will test our faith in AI. Innovative technologies such as blockchain will start to change how we store, share, and track data. I also expect to see far more emphasis on fair, transparent, and accountable AI that non-data scientists can understand, explain, and trust. A massive gap currently exists in translating data scientist lingo into common language for everyone else. As organizations adopt AI in our imperfect world while concurrently scaling development to citizen data scientists, many more people need to become data literate soon to avoid the pain of AI gone wrong.