Companies and other business entities invest in data products and applied research every year. Academia has always played a role in creating new methods, solutions, and algorithms in the fields of machine learning and artificial intelligence.
However, there is doubt about how powerful and effective such research efforts really are.
Is studying AI in academia a waste of time?
Our Sponsors
Explore the Complex World of Regulations. Compliance can be overwhelming. Multiple frameworks. Overlapping requirements. Let Arctic Wolf be your guide.
Check it out at https://arcticwolf.com/datascience
Amethix works to create and maximize the impact of the world’s leading corporations and startups, so they can create a better future for everyone they serve. We provide solutions in AI/ML, Fintech, Healthcare/RWE, and Predictive maintenance.
Transcript
1
00:00:04,270 –> 00:00:09,214
And here we are again with the season four of the Data Science at Home podcast.
2
00:00:09,322 –> 00:00:19,326
This time we have something for you: if you want to help us shape the data science leaders of the future, we have created the Data Science at Home Ambassador Program.
3
00:00:19,508 –> 00:00:28,558
Ambassadors are volunteers who are passionate about data science and want to give back to our growing community of data science professionals and enthusiasts.
4
00:00:28,714 –> 00:00:37,400
You will be instrumental in helping us achieve our goal of raising awareness about the critical role of data science in cutting edge technology.
5
00:00:37,850 –> 00:00:45,920
If you want to learn more about this program, visit the Ambassadors page on our website at datascienceathome.com.
6
00:00:46,490 –> 00:00:49,474
Welcome back to another episode of the Data Science at Home podcast.
7
00:00:49,522 –> 00:00:51,190
I’m Francesco, your host.
8
00:00:51,250 –> 00:00:58,890
For the next 20 minutes or so, I’m podcasting from the regular office of Amethix Technologies, based in Belgium.
9
00:00:59,270 –> 00:01:24,570
In this episode I would like to report some statements from a very respectable individual, Jeremy Howard, creator of fast.ai and former President of Kaggle, who said something very recently in an interview that is going to, let’s say, make some people angry, or just disappointed, or just pissed.
10
00:01:25,070 –> 00:01:31,758
Well, he basically said that research in the deep learning world is a total waste of time.
11
00:01:31,904 –> 00:01:39,450
And this statement is, first of all, the statement of a person who knows what he’s talking about.
12
00:01:39,560 –> 00:01:57,666
I respect Jeremy very much, and for those who don’t know him, please google his story and you will understand immediately that this is not a charlatan or a person who just blabs about things to get noticed on the web.
13
00:01:57,728 –> 00:02:02,194
And as a matter of fact, he has a point that is definitely undeniable.
14
00:02:02,302 –> 00:02:17,758
There is a lot of research in the deep learning world that is, let’s say, useless to say the least, or that is definitely leading nowhere with respect to the big picture of artificial intelligence.
15
00:02:17,914 –> 00:02:24,750
It’s worth spending some minutes on this statement and also sharing my opinion.
16
00:02:25,190 –> 00:02:38,850
I’ve been a researcher myself, so I think I can contribute to this discussion, if there is a way to contribute, and give you my personal thoughts.
17
00:02:39,470 –> 00:02:43,926
Well, there are two things that essentially Mr.
18
00:02:43,988 –> 00:02:58,220
Jeremy Howard is reporting. He thinks that there is, in fact, a lack of research, or a lack of interest, in two fields in particular: transfer learning and active learning.
19
00:02:58,550 –> 00:03:09,620
In his opinion, these are two very important fields, maybe not the most important, but very important ones, that paradoxically nobody is spending time on. And on this,
20
00:03:10,250 –> 00:03:14,766
I’m 100% on board, 100% on the same page here.
21
00:03:14,948 –> 00:03:40,530
I do believe transfer learning and active learning are extremely important, and they would definitely raise the bar and facilitate or improve a lot of things happening in the deep learning world if more effort, more time, and more resources were spent on them.
22
00:03:40,580 –> 00:03:43,280
And that’s true in the academic world.
23
00:03:43,610 –> 00:03:49,640
Not so many people, not so many institutions, are putting enough effort into these two things.
24
00:03:50,690 –> 00:03:58,460
Now, what is transfer learning? Of course, we have covered transfer learning a long time ago on this show.
25
00:03:58,910 –> 00:04:10,530
But it’s essentially a way, a methodology, to save some training costs, for example, and to move across domains.
26
00:04:11,030 –> 00:04:22,650
And so it’s about having models that can be tuned and be kind of ready for domains they were not originally designed for.
27
00:04:22,760 –> 00:04:34,734
For example, you can have an image classifier that has been trained on general purpose images, and then you can do transfer learning to adapt that model, or that neural network, to,
28
00:04:34,772 –> 00:04:35,202
for example,
29
00:04:35,276 –> 00:04:43,702
medical images, or images in a more narrow sector. You would not retrain everything:
30
00:04:43,786 –> 00:05:00,570
you would transfer the initial or first layers of the network, and then you would tune, or completely retrain, the last layers. That could save you a lot of time, a lot of cost, or a lot of energy when you need to train,
31
00:05:00,620 –> 00:05:00,990
for example,
32
00:05:01,040 –> 00:05:06,810
a massive neural network of several million parameters, or even billions of parameters.
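To make the idea concrete, here is a minimal sketch of that freeze-and-retrain recipe, assuming PyTorch and torchvision are available; the choice of resnet18 and the five-class medical-style head are hypothetical, purely for illustration.

import torch
import torch.nn as nn
from torchvision import models

# Load a network pretrained on general-purpose images (ImageNet).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the early layers: their low-level filters (edges, textures)
# tend to transfer across domains.
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer with a new head for the narrow domain,
# e.g. a hypothetical medical task with 5 classes.
model.fc = nn.Linear(model.fc.in_features, 5)

# Only the new head is trained, which is far cheaper than training
# the whole multi-million-parameter network from scratch.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)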
33
00:05:07,490 –> 00:05:15,910
Not only that: you save energy, you save costs, you save time when it comes to retraining a network.
34
00:05:16,030 –> 00:05:17,938
But transfer learning also helps
35
00:05:17,974 –> 00:05:30,166
you deal with lack of data. In the medical imaging example, general public images are definitely much more available, or more easily available, than medical images.
36
00:05:30,298 –> 00:05:47,682
Think about medical images of a particular or rare disease or disorder, right? In that case, training a network from scratch would definitely lead you nowhere in terms of accuracy or robustness of the model.
37
00:05:47,816 –> 00:05:58,290
What would help, however, is utilizing, taking advantage of, the initial layers of networks trained on the images that are most available out there.
38
00:05:58,340 –> 00:06:05,442
And that way you’re dealing with lack of data in a smart way, which is indeed what transfer learning offers.
39
00:06:05,576 –> 00:06:09,570
Another important benefit of transfer learning is definitely generalization.
40
00:06:09,890 –> 00:06:30,250
If you can transfer what a neural network learned from one sector to another, it means that a lot of the internal parameters can be reutilized across sectors, across domains, and that is something that leads you towards better generalization.
41
00:06:30,370 –> 00:06:52,674
The network can still perform, maybe losing a bit of accuracy in percentage points, in a relatively different scenario, a relatively different domain. Energy, lack of data, moving from one domain to another, generalization, costs, time: these are all benefits.
42
00:06:52,772 –> 00:07:11,060
So the statement of Jeremy Howard is: if all these things are there, if we get all these things for free once we put effort into transfer learning, why is nobody putting effort into it, right? Which makes sense.
43
00:07:11,930 –> 00:07:22,602
The other field that Jeremy Howard thinks, or believes, is not getting the attention it deserves is active learning.
44
00:07:22,676 –> 00:07:26,802
And that’s also something that almost nobody is working on.
45
00:07:26,816 –> 00:07:27,846
And that’s true.
46
00:07:28,028 –> 00:11:09,920
Active learning is very important because it allows a neural network, or a machine learning model, to train on data that, first of all, is not there yet, but can be, or will be, annotated by human beings. So there is kind of a human in the loop while we train such models (a small sketch of this idea follows after this passage). For some domains, for some particular use cases, having a human in the loop is extremely important, and I refer to the medical sector one more time, because we want humans to be in the loop during training, especially at the beginning, when we don’t really know if that particular machine learning model is going to perform, and when it is going to be ready to replace, almost entirely, for example, a medical doctor. It’s a lot of responsibility that we are putting in the hands, so to speak, of an artificial intelligence, so we had better keep the human in the loop during training, which is one of the most critical aspects of building a machine learning model. The training part is, in my opinion, much more important than designing, for example, the topology, because in the training there is a loss involved, there is the quality of the data, and there are the unpredictable results that the network can give you, especially at the beginning of the training process; then, all of a sudden or smoothly, the network starts changing behavior according to the data we feed it. This statement, of course, is kind of provocative. Reading between the lines, I think it’s a provocation, because Jeremy Howard is a very intelligent person, and in my opinion he didn’t throw it out just for the sake of getting in the news; he was definitely being provocative with that statement. And I think, or at least that’s how I’ve interpreted this interview, there is a problem: between academic research and industry there is not enough of a bond, unfortunately. That’s something we know is the case, and it’s not something that has appeared only in the last few years; it has always been like that, more or less, in every country, on every continent. So in my opinion, academic research and industry are not necessarily bound, and I would add unfortunately, but sometimes it’s quite impossible to bind these two worlds, or to fill the gap between two very diverse worlds that have very different objectives. In fact, the reason there’s no active research in transfer learning and active learning is probably that academia doesn’t feel the need that is felt, for example, by the industrial world. In industry there is a need for transfer learning and active learning. Don’t forget that many companies out there kind of reinvent the wheel all the time for their particular use case, and we have seen this over and over again in different sectors: they take a neural network from an online repository and they tweak it, tune it, change the topology, and tailor it to their needs, to the needs of the business use case.
47
00:11:10,730 –> 00:11:12,346
Be it fintech.
48
00:11:12,418 –> 00:11:13,774
Be it healthcare.
49
00:11:13,882 –> 00:11:15,270
Pharmaceuticals.
50
00:11:16,610 –> 00:11:17,206
Insurtech,
51
00:11:17,278 –> 00:11:18,910
or automotive.
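Coming back to active learning for a moment: here is a minimal uncertainty-sampling sketch of that human-in-the-loop idea, assuming Python with NumPy and scikit-learn. The synthetic data, the logistic regression model, and the batch of 10 queries are hypothetical placeholders, not a prescribed recipe.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical data: a tiny labeled seed set and a large unlabeled pool.
X_seed = rng.normal(size=(20, 5))
y_seed = rng.integers(0, 2, size=20)
X_pool = rng.normal(size=(500, 5))

# Train an initial model on the few labels we have.
model = LogisticRegression().fit(X_seed, y_seed)

# Uncertainty sampling: find the pool samples the model is least sure about.
probs = model.predict_proba(X_pool)
confidence = probs.max(axis=1)            # top predicted probability per sample
query_idx = np.argsort(confidence)[:10]   # the 10 most uncertain samples

# In a real loop, these samples would go to a human annotator (the human in
# the loop), the new labels would be added to the seed set, and the model
# would be retrained, round after round.
print(query_idx)

The point of the design is that each round of human annotation is spent where the model is most confused, instead of on randomly chosen samples.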
52
00:11:19,030 –> 00:11:27,838
So in my opinion, and that’s kind of a consequence of what Jeremy Howard is saying, nobody is bleeding for that problem in academia.
53
00:11:27,874 –> 00:11:31,074
Nobody is bleeding for lack of transfer learning and active learning.
54
00:11:31,112 –> 00:11:38,840
And that, in my opinion, is the reason why these two fields are not really trendy in the academic world.
55
00:11:39,350 –> 00:11:42,226
Also, I have to say something about academia.
56
00:11:42,298 –> 00:11:58,446
I’ve been a researcher myself; I also published in internationally peer-reviewed venues back in the day, and probably my point of view is a bit outdated now, though I don’t believe that.
57
00:11:58,508 –> 00:12:13,100
But academia produces something that has to be publishable, which usually means novel, improving on some state of the art, even if not necessarily in a significant way.
58
00:12:13,790 –> 00:12:17,060
We have seen micro improvements on everything.
59
00:12:17,810 –> 00:12:26,600
We have research groups that publish something and claim it improved zero point something percent over the state of the art.
60
00:12:28,950 –> 00:12:34,586
That’s the currency that academia accepts and considers for publications.
61
00:12:34,658 –> 00:12:41,230
At least that was the currency in my day, which is several years ago, not decades.
62
00:12:41,970 –> 00:12:55,058
But more importantly, academia needs something that is scientifically sound, right, and can be explained with the tools that are pertinent to academics.
63
00:12:55,214 –> 00:13:24,010
They try to explain something that can be represented with a formula, a method that is sound, that can be proved the way you prove a theorem, right? That’s the currency, that’s the language academia wants to speak, which is not necessarily the same language, in fact it’s never the same language, that the industrial world wants to speak.
64
00:13:24,180 –> 00:13:39,218
Now, there have been several attempts to explain deep learning with more consolidated methods or in a mathematically rigorous fashion and they all failed.
65
00:13:39,314 –> 00:13:40,870
To the best of my knowledge.
66
00:13:41,370 –> 00:13:48,698
We have been trying to, for example, compare or explain deep learning with thermodynamics or theoretical physics.
67
00:13:48,794 –> 00:13:56,326
Then some people even tried with equilibria from game theory; maybe I’m going to make an episode about all these things.
68
00:13:56,388 –> 00:14:06,250
But a lot of theoretical concepts borrowed from other disciplines, usually physics, have been used to explain deep learning.
69
00:14:06,420 –> 00:14:22,810
And the problem of deep learning is that many deep learning concepts are not fully mathematically rigorous, in the way one can say, for example, of physics or of abstract, pure mathematics.
70
00:14:23,850 –> 00:14:34,030
There’s no formula that tells you: if the data has this shape and the topology of the network has this other shape, then you’re going to get this result.
71
00:14:34,410 –> 00:14:37,582
And that’s a problem because that’s what academia wants.
72
00:14:37,656 –> 00:14:51,010
And deep learning is definitely not that. Deep learning can offer everything but a way to be formalized, regardless of the attempts that have, of course, been made by the community.
73
00:14:51,120 –> 00:15:10,346
And we have to be very grateful to those research groups that have tried their best to formalize the concepts behind deep learning. Function optimization, for example, has been seen as an energy minimization problem in physics.
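As a toy illustration of that analogy, and assuming PyTorch, here is gradient descent treating a quadratic loss as an "energy" that a parameter relaxes to minimize; the landscape is purely illustrative, not any specific published formalization.

import torch

# Toy "energy" landscape: E(w) = (w - 3)^2. Treating the loss as an energy,
# gradient descent relaxes the parameter toward the minimum-energy state w = 3.
w = torch.tensor(0.0, requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

for _ in range(100):
    optimizer.zero_grad()
    energy = (w - 3.0) ** 2
    energy.backward()
    optimizer.step()

print(w.item())  # approximately 3.0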
74
00:15:10,418 –> 00:15:28,826
So we can name many scenarios in which we take some mature, very consolidated theory from academia and try to utilize it to explain deep learning and neural networks.
75
00:15:28,898 –> 00:15:29,866
That’s a matter of fact.
76
00:15:29,928 –> 00:15:40,930
Now, on the statement of Jeremy Howard, of course, I’m not really in a position to say who’s right and who’s wrong.
77
00:15:41,040 –> 00:15:42,314
It’s definitely provocative.
78
00:15:42,362 –> 00:16:03,550
It’s definitely something that gives you food for thought, and a debate should be opened about this, if you will, but it doesn’t have to raise eyebrows, in my opinion, because we have been doing this for several decades now.
79
00:16:03,600 –> 00:16:07,262
I see an analogy with, for example, computer programming.
80
00:16:07,406 –> 00:16:15,120
Now, academia taught me the basic constructs, through the data structures and algorithms courses that I had.
81
00:16:15,510 –> 00:16:25,378
Another one on the theory of compilers, the pumping lemma, or whatever other academic concept you might think of.
82
00:16:25,524 –> 00:16:33,058
But at the end of the day, the actual programming language constructs were left to, let’s say, my passion and my own time.
83
00:16:33,204 –> 00:16:37,474
It was not something that was taught in academia and it should stay like that.
84
00:16:37,512 –> 00:16:43,906
So I see an analogy there with deep learning and academic research.
85
00:16:43,968 –> 00:16:44,880
In deep learning,
86
00:16:45,870 –> 00:17:01,020
I agree with Jeremy that there is indeed some sort of incompatibility between the objective of the academic world and what the objective of deep learning models would be.
87
00:17:02,190 –> 00:17:06,900
But I would not say it’s a total waste of time, that’s for sure.
88
00:17:07,470 –> 00:17:13,980
If you ask me, is researching deep learning a waste of time? I would say not always.
89
00:17:14,490 –> 00:17:19,680
I would not say a flat no, because sometimes it is, but not always.
90
00:17:20,250 –> 00:17:26,650
Many of the concepts related to, for example, function optimization are mostly coming from academic effort.
91
00:17:27,270 –> 00:17:42,540
Then there are the micro optimizations on network topology: I’ve seen dozens and dozens of papers where they literally make a slight change to the topology of a network and, boom, they have a new paper.
92
00:17:42,930 –> 00:17:56,890
Well, these are purely experimental, and probably, yes, doing academic research around such a narrow concept would be a waste of time, in my opinion, though not entirely.
93
00:17:58,710 –> 00:18:05,090
I would definitely keep in mind the intrinsic gap, let’s say, between academia and industry.
94
00:18:05,150 –> 00:18:08,460
That, in my opinion, is definitely here to stay.
95
00:18:08,850 –> 00:18:09,946
Well, that’s it for today.
96
00:18:10,008 –> 00:18:11,402
Thank you very much for listening.
97
00:18:11,486 –> 00:18:13,140
I’ll speak with you next time.
98
00:18:13,830 –> 00:18:16,862
You’ve been listening to the Data Science at Home podcast.
99
00:18:16,946 –> 00:18:21,554
Be sure to subscribe on iTunes, Stitcher, or Podbean to get new, fresh episodes.
100
00:18:21,602 –> 00:18:27,640
For more, please follow us on Instagram, Twitter and Facebook, or visit our website at datascienceathome.com