The foundation of every ML model is the data that it is trained on. In many cases you will be working with tabular or unstructured information, but there is a growing trend toward networked, or graph, data sets. Benedek Rozemberczki has focused his research and career on graph machine learning applications.
In this episode he discusses the common sources of networked data and the challenges of working with graph data in machine learning projects, and describes the libraries that he has created to help him in his work. If you are dealing with connected data, then this interview will provide a wealth of context and resources to improve your graph data applications.
Interview
- Introductions
- How did you get introduced to Python?
- Can you start by giving an overview of when you might want to do machine learning on networked/graph data?
- How do networked data sets change the way that you approach machine learning tasks?
- Can you describe the current state of the ecosystem for machine learning on graphs?
- You have created a number of libraries to address different aspects of machine learning on graph networks. Can you list them and share some of the stories behind their creation?
- How do the different tools relate to each other?
- Can you talk through some of the structural and user experience design principles that you lean on when building these libraries?
- When you are working with networked data sets, what is your current workflow from idea to completion?
- What are the most difficult aspects of working with networked data sets for machine learning applications?
- What are the most interesting, innovative, or unexpected ways that you have seen graph ML used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on graph ML problems?
- What are some examples of when you would choose not to use some or all of your own libraries?
- What do you have planned for the future of your libraries, and what new libraries do you anticipate needing to build?